Creating the field of evidence based data analysis - do people know what a p-value looks like?

16 Oct 2014

In the medical sciences, there is a discipline called “evidence based medicine”. The basic idea is to study the actual practice of medicine using experimental techniques. The reason is that while we may have good experimental evidence about specific medicines or practices, the global behavior and execution of medical practice may also matter. There have been some success stories from this approach and also backlash from physicians who don’t like to be told how to practice medicine. However, on the whole it is a valuable and interesting scientific exercise.

Roger introduced the idea of evidence based data analysis in a previous post. The basic idea is to study the actual practice of data analysis to identify how analysts behave. There is a strong history of this type of research within the data visualization community, starting with Bill Cleveland and extending forward to work by Diane Cook, Jeffrey Heer, and others.

Today we published a large-scale evidence based data analysis randomized trial. Two of the most common data analysis tasks (for better or worse) are exploratory analysis and the identification of statistically significant results. Di Cook’s group calls this idea “graphical inference” or “visual significance” and they have studied humans’ ability to detect significance.

We performed a randomized study to determine if data analysts with basic training could identify statistically significant relationships. Or as the first author put it in a tweet:

First paper just dropped! Can you tell the difference between these two plots? https://t.co/Lng0FWI0XY pic.twitter.com/zFCwwcxaAX

— Aaron Fisher (@PrfFarnsworth) October 16, 2014

What we found was that people were pretty bad at detecting statistically significant results, but that over multiple trials they could improve. This is a tentative first step toward understanding how the general practice of data analysis works. If you want to play around and see how good you are at seeing p-values we also built this interactive Shiny app. If you don’t see the app you can also go to the Shiny app page here.
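To get a rough feel for the task outside the app, here is a minimal Python sketch. It is not the code or the method from the study: the data generator (`simulate_scatter`), the slopes, the sample size, and the use of a permutation test for the correlation are all assumptions chosen for illustration. It simulates scatterplot data with a known slope and estimates a p-value for the relationship, so you can compare what the numbers say with what your eyes would guess.

```python
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_scatter(n, slope, rng):
    """Hypothetical data generator: y = slope * x + standard normal noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [slope * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


def permutation_p_value(xs, ys, rng, n_perm=2000):
    """Two-sided permutation p-value for the correlation between xs and ys."""
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        # Shuffling y breaks any real link to x, simulating the null hypothesis.
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(2014)  # fixed seed so reruns give the same data
    for slope in (0.0, 0.2, 0.5):
        xs, ys = simulate_scatter(100, slope, rng)
        p = permutation_p_value(xs, ys, rng)
        print(f"slope={slope:.1f}  r={pearson_r(xs, ys):+.3f}  p={p:.4f}")
```

Plotting each simulated `xs` against `ys` before looking at the printed p-value is a simple way to test your own visual judgment.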

Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek
