Tufte on Visual Statistics and Reasoning

When Tufte talks about visual statistics, he uses two examples: the cholera epidemic in London and NASA’s space shuttle. I had heard a short version of the John Snow cholera story when I was a kid. Snow used several kinds of graphs and analyzed them, and through that analysis he found the cause and solved the problem. As a statistics major, when I use graphs or data I just use them to solve the homework; I don’t even think about how a good presentation of data can solve problems in real life. I did not read the space shuttle part thoroughly. I think it is just another example to make the point that good design and clear, precise visual representation can solve problems more easily.

[Image: “diminishing return” chart comparing tuition and earnings]

A good visual statistic should be objective and shouldn’t try to fool people with dramatic differences in the graph when comparing data. The example here is trying to convince people that tuition isn’t worth it. If you look closer, you can see the problem: it compares four years of tuition against a single year of earnings, and education isn’t something that can be quantified by the amount of money made anyway. I think the people who made this graph probably did not get enough education to understand why.
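To make the apples-to-oranges problem concrete, here is a quick back-of-the-envelope sketch in Python. The numbers are made up for illustration, not taken from the graphic; the point is just that a fair comparison puts both quantities on the same time scale.

```python
# Hypothetical numbers for illustration only -- not from the actual graphic.
four_year_tuition = 60_000   # assumed total tuition for a four-year degree
annual_earnings   = 45_000   # assumed earnings in a single year

# The graphic's framing: total tuition vs. one year of earnings (misleading).
print("as framed:  ", four_year_tuition, "vs", annual_earnings)

# Fairer framings: per-year vs. per-year, or four years vs. four years.
print("per year:   ", four_year_tuition / 4, "vs", annual_earnings)
print("four years: ", four_year_tuition, "vs", annual_earnings * 4)
```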

Question: to build a set of statistical analyses like Snow’s, you have to be well trained so that you know what you are doing will be useful. But for people like us, say in the case study, we probably aren’t certain whether our representation of the data is good enough to solve a problem or lead to a conclusion. Do you have any advice for dummies?


You Know Nothing, Ed Tufte

In today’s reading Tufte compares two examples of data representation through charts. Both were made to help prevent deaths, but only one was effective in that regard. John Snow investigated a mysterious outbreak of cholera near Broad Street. He not only created charts of how many people died, but also a map with those deaths plotted on top. This allowed him to locate the source of the disease. He also examined oddities, such as two locations that didn’t have any deaths yet were located near the contaminated water pump. He found that these places had explanations for why they had no deaths, which allowed him to further nail down the pump as the culprit. Through his use of easily readable maps and charts, Snow helped get the pump shut down and prevented hundreds of deaths. The other example is the space shuttle Challenger’s O-ring failure and how it could have been prevented if NASA had listened to Morton Thiokol. The day before the launch, Thiokol’s engineers produced a set of thirteen charts in the hope that NASA would delay the launch. They had found that the O-rings would likely fail due to the extremely low temperatures forecast for the next morning. Although they understood why this was a problem, their charts failed to show that fact in an easy and clear way. If they had created simple charts and graphs that logically linked the effect temperature has on O-rings to the risk, then they too could have saved lives.

I thought this was interesting because even though both Snow and the Thiokol engineers knew what the danger was and how to prevent it, the main difference was that Snow got his point across through simple and obvious visualizations of data, while the engineers didn’t, due to their complicated and flawed methods. This is a good lesson to learn, and Tufte seemed super frustrated about the term “info-graphics” throughout the reading.

Here is a chart showing the largest bankrupt companies in history sorted by year. How would Tufte react to this?

A Response to Tufte

In the article “Visual Explanations: Images and Quantities, Evidence and Narrative,” Tufte focuses on how text in a visual data set plays its part in the whole and how it contributes to the data being presented. He outlines a method for having the text portion appropriately highlight what the data set is saying. He first discusses the importance of assessing cause and effect and how you might display data for this purpose. He suggests focusing on the take-away message of the cause and effect, so that when presenting the data you concentrate on what impacts that message. Then he describes the purpose of making quantitative comparisons. Statistics have always been an effective way to clearly illustrate ideas, but comparing the data to other data helps solidify its applicability for the audience. The next piece of advice he gives is to mention alternative explanations and other cases. This increases the credibility of your data, since conflicting evidence is clearly not being avoided. It shows that the topic has been researched extensively and you still hold the same conclusion, solidifying the ethos of your data. The last piece of advice goes along with the third: openly assess possible errors in the data’s results. This, again, shows that the data has been extensively researched and strengthens its ethos.

I found this article particularly interesting because I feel like it has talked about text itself more than anything else we have read this semester. Not only does it discuss methods for the text alongside the data, but he also elaborates on the unification between text and images. Also, the context is mostly statistical data, so Tufte even covers a more scientific approach to text and its visual graphic. On that note, I thought the redirection of the article toward the specific example of the rocket accident was not very related and did not involve as much application of rhetoric to visual graphics as I would have hoped.

This caption is a good example of text being to the point and elaborating on only the necessary trend.

My question is: in what ways could using any of the elements he’s outlined actually work against the graphic’s purpose?

Tufte- 10/30

For today’s readings, I believe the author used the examples of cholera and the Challenger disaster to show that there are right and wrong ways to present visual data and evidence. The example of John Snow and cholera was a successful one because he had a good idea and a good method for presenting his data. Snow was able to successfully show that contaminated water, food, and sewage were the cause of cholera. On the other hand, the example of the Challenger disaster was not a successful presentation of data. The administration did not believe that the engineers had sufficient evidence to prove that the O-rings could cause any sort of disaster. Overall, I thought this article was pretty dense and rather boring, but the main point I took from it is that it is really important to have an effective way of presenting data visually. However, I still think the Challenger example was more of a political move than just a matter of the administration not believing the data showed sufficient evidence.

I believe this reading was meant to show us that it is important to present visual data that is clear to understand and that presents information in a way that is correct and valid. I think the picture below is a good visual example of that. This graphic says that information must make sense, not just look good, so it is a good real-life example for this class. Although the graphic isn’t directly linked to the main point of the article, I think it is useful for us as students to remember going forward, because you don’t want to focus so much on the visual appeal of your graphic that you forget valid information is just as important.

[Image: “Bad Data: Size Matters in Surveys” infographic]

Question for the class: When presenting statistics, how much effort should be put into visual appeal? In other words, how much does visual appeal matter if the most important part is the data?

Visual Explanations (feat. Tufte)

Tufte brings some interesting real-world examples to bear on the importance of accurate and clearly displayed visualizations and graphics. He discusses and analyzes two situations in which the clarity and accuracy of these graphics was literally the difference between life and death. The first occurred during a cholera outbreak in London in 1854. John Snow set out to find the source of the outbreak in hopes of stopping the death and destruction among the people of London. He used data obtained from the General Register Office and plotted it out using different maps and bar graphs. Through his successful use of these graphics, he was able to identify that the cholera was coming from a contaminated water pump, and that those who drank from that pump were getting sick and often dying. Because of the graphics Snow created, he was able to discover the source of the illness, and the epidemic ended after officials removed the pump’s handle. The second instance of life-and-death infographics is not nearly as successful. The launch of the space shuttle Challenger in 1986 ended in a catastrophic explosion that killed all seven crew members aboard. This was due to a part of the rocket called an O-ring that could not function properly in the freezing temperatures of a cold January day. Engineers saw this problem and tried to communicate it as an issue to NASA, but the charts and data they presented were not successful, and the shuttle launched anyway. The outcome confirmed the engineers’ fears, and seven lives were lost in a fiery explosion.

Tufte uses these two examples to bring forth an interesting and important point about ensuring accuracy and clarity in the graphics we create, because it can mean the difference between life and death.

[Image: fire escape plan (escape-plan-large2)]

One example I have come up with for an important infographic that can mean life or death is a fire escape plan.

In what other instances might the accuracy and clarity of an infographic mean the difference between life and death?

Tufte and Visual Explanations

Tufte’s article sets up two examples of real-life situations where data visualization is used to come to a conclusion. It was interesting to see how many different ways these data sets could be compiled to show different conclusions, especially in the initial example with the water pump and John Snow.

The first example, from the mid-1800s, follows the story of a man named John Snow in the middle of a cholera outbreak in London. By plotting deaths from cholera on a map, he was able to draw a correlation between the outbreak and a water pump on Broad Street. Obviously this was the 1800s and his methods weren’t exactly up to today’s scientific standards, but it was interesting to see that, in a time of less scientific advancement, he also tested the water itself, to no conclusion. Initially I thought this was pretty interesting (he worked with what he had), but Tufte gave examples that put a bit of a spin on Snow’s data visualization. Tufte showed different aggregations of the same data using different geographical subdivisions of the map. Depending on how the map was divided up, the data could be displayed in a way that showed no correlation between the water pump’s location and the outbreak. He also makes an important note about dot maps like this: they can end up just showing population. That was not the case in this example, but it is possible for an area to have more cases simply because it is more highly populated. It reminded me of this XKCD comic, linked below.

https://xkcd.com/1138/
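To see how much the choice of boundaries can matter, here is a minimal sketch in Python using simulated data (not Snow’s actual records; the pump location and case counts are made up). The same set of case locations looks like an obvious cluster when one district is centered on the pump, and looks much less remarkable when a district boundary runs right through it.

```python
# A minimal sketch with simulated data (not Snow's records): how the choice of
# district boundaries can hide or reveal the same cluster of cholera cases.
import random

random.seed(0)

PUMP_X = 5.0  # hypothetical pump location along a 0-10 stretch of street

# 160 cases clustered near the pump, plus 40 scattered background cases.
cases = [random.gauss(PUMP_X, 0.8) for _ in range(160)] + \
        [random.uniform(0, 10) for _ in range(40)]

def count_by_district(points, edges):
    """Count how many points fall in each district defined by boundary edges."""
    counts = [0] * (len(edges) - 1)
    for p in points:
        for i in range(len(counts)):
            if edges[i] <= p < edges[i + 1]:
                counts[i] += 1
                break
    return counts

# Subdivision A: one district is centered on the pump -> the cluster jumps out.
print("district centered on pump:", count_by_district(cases, [0, 4, 6, 10]))

# Subdivision B: a boundary runs straight through the pump -> the same cases
# are split across two wide districts and the pattern looks far less dramatic.
print("boundary through the pump:", count_by_district(cases, [0, 5, 10]))
```

The cases never change between the two printouts; only the lines drawn around them do, which is exactly the kind of choice Tufte says can make or break the argument.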

All of this really hits home the message of right and wrong ways to display data. It’s easy to work backwards from today, knowing the cause of the outbreak, to look at his data map and say, “Of course, it’s so obvious.” But at the time it was probably counter-intuitive for Snow to find no impurities in the water, only to keep looking at it as a possible cause. I found myself saying this a lot in the second part of the article about the Challenger accident. Looking at the graphs as a layperson, it seems obvious to say, “Look, there are more incidents the colder it gets,” but I’m sure the scientists, who knew better than me and had millions of dollars and multiple lives on the line, were more scientific in their analysis. That said, it looks like there is a clear correlation between temperature and O-ring failure even in the small data set shown. I think a lot of it had to do with NASA not wanting to reschedule the launch. It seemed like they put convenience and reputation over clear evidence.

Going back to Tufte’s examples of different ways to display data in reference to the cholera outbreak dot map, how do we know when we’re displaying data correctly? Like I said before, it’s easy to make the correlative link now because we know the pump and the outbreak are connected, but at the time, what if Snow had aggregated the data wrong or displayed it in a different way?

Death from Cholera and Chartjunk

This week’s reading from Tufte was an interesting one. Tufte examines two historical events that were either helped or hurt by good or bad data visualization techniques. The first example he talks about is a cholera outbreak in London in the mid-1800s. The outbreak was traced back to a contaminated water pump on a specific street by a man named John Snow. He did a bit of detective work and linked the people who died to where they lived, worked, or spent the vast majority of their time during the day. He plotted all of this information as a dot map of the general area of the outbreak, was able to draw a direct correlation to a specific street, Broad Street, and deduced from there that the water pump was contaminated. He is credited with stopping the outbreak, though Tufte isn’t entirely certain about that because the outbreak seems to have been dying down by the time Snow had charted his data and made his findings known; still, the number of cases after Snow’s work was significantly lower, so there was probably some relation. The other event Tufte analyzes is the explosion of the space shuttle Challenger. The company that made the rocket boosters strongly believed that the temperature at which the Challenger would be launching would cause a specific part, the O-rings, to malfunction, and lo and behold, that is exactly what happened. The manufacturer conferred with NASA the night before the launch to voice their concerns and sent NASA thirteen different charts to try to convince them. Tufte says these charts lacked the elements needed to make them convincing and actually provided excuses for NASA to disregard the manufacturer’s warnings. He mentions that the charts lacked the names of the people doing the research, that some of them didn’t relate the data directly to temperature, and that others were poorly worded or laid out. Even the investigation into the explosion after the fact produced charts and graphs that were sub-par by Tufte’s standards, loaded with chartjunk and unnecessary facts and figures.

Overall, this reading is a good reinforcement of the principles Tufte was championing in the reading of his we had earlier in the semester. You need to put a lot of effort into your charts and graphs to make sure they contain only the most necessary information, and you need to do the required research to adequately tie it to the points you are trying to make; otherwise the infographics serve no purpose.

The example below is chartjunk, which Tufte has talked about at length; it shows what you should not do if you actually want your audience to read and engage with your stats and information.

My question is this: do you believe that poor infographic practices actually have as strong an impact as Tufte tries to convince us of in his reading?