Building the Team That Built Watson

08 Jan 2012

Make us a part of your day - add Simply Statistics to your RSS feed

08 Jan 2012

You can add us to your RSS feed through feedburner.

P-values and hypothesis testing get a bad rap - but we sometimes find them useful.

06 Jan 2012

This post written by Jeff Leek and Rafa Irizarry.

The p-value is the most widely known statistic. P-values are reported in a large majority of scientific publications that measure and report data. R.A. Fisher is widely credited with inventing the p-value. If he were cited every time a p-value was reported, his paper would have, at the very least, 3 million citations* - making it the most highly cited paper of all time.

However, the p-value has a large number of very vocal critics. The criticisms of p-values, and hypothesis testing more generally, range from philosophical to practical. There are even entire websites dedicated to “debunking” p-values! Statisticians raise several issues with p-values: they are easily misinterpreted, they are not calibrated by sample size, they ignore existing information or knowledge about the parameter in question, and very significant (small) p-values may result even when the value of the parameter of interest is scientifically uninteresting.

We agree with all these criticisms. Yet, in practice, we find p-values useful and, if used correctly, a powerful tool for the advancement of science. The fact that many misinterpret the p-value is not the p-value’s fault. If the statement “under the null the chance of observing something this convincing is 0.65” is correct, then why not use it? Why not explain to our collaborator that the observation they thought was so convincing can easily happen by chance in a setting that is uninteresting? In cases where p-values are small enough, the substantive experts can help decide whether the parameter of interest is scientifically interesting. In general, we find the p-value to be superior to our collaborators’ intuition about which patterns are statistically interesting and which are not.

We also find that p-values provide a simple way to construct decision algorithms. For example, a government agency can define general rules based on p-values that are applied equally to products needing a specific seal of approval. If the rule proves to be too lenient or too restrictive, we change the p-value cut-off accordingly. In this situation we view the p-value as part of a practical protocol, not a tool for statistical inference.

Moreover the p-value has the following useful properties for applied statisticians:

p-values are easy to calculate, even for complicated statistics. Many statistics do not lend themselves to easy analytic calculation, but with permutation and bootstrap procedures p-values can be calculated even for very complicated statistics.
p-values are relatively easy to understand. The statistical interpretation of the p-value remains roughly the same no matter how complicated the underlying statistic, and p-values are always bounded between 0 and 1. This also means that p-values are easy to misinterpret - they are not posterior probabilities. But this is a difficulty with education, not a difficulty with the statistic itself.
p-values have simple, universal properties. Correct p-values are uniformly distributed under the null, regardless of how complicated the underlying statistic is.
p-values are calibrated to error rates scientists care about. Regardless of the underlying statistic, calling all p-values less than 0.05 significant leads to, on average, about 5% false positives even if the null hypothesis is always true. If this property is ignored, things like publication bias can result, but again this is a problem with education and the scientific process, not with p-values.
p-values are useful for multiple testing correction. The advent of new measurement technology has shifted much of science from hypothesis driven to discovery driven, making the existing multiple testing machinery useful. Using the simple, universal properties of p-values, it is possible to easily calculate estimates of quantities like the false discovery rate - the rate at which discovered associations are false.
p-values are reproducible. All statistics are reproducible with enough information. Given the simplicity of calculating p-values, it is relatively easy to communicate sufficient information to reproduce them.
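
To make a few of these properties concrete, here is a minimal Python sketch, not code from this post. It computes a permutation p-value for a difference in group means, simulates experiments in which the null is true to check that roughly 5% of p-values fall below 0.05, and applies a simple false discovery rate rule. The function names, group sizes, number of permutations, and the choice of the Benjamini-Hochberg procedure are all assumptions made for illustration; the post does not name a specific method.

```python
import random


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Hypothetical helper for illustration; the test statistic is an assumption.
    """
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_perm):
        # Shuffling the pooled data mimics the null of exchangeable group labels.
        rng.shuffle(pooled)
        perm_x, perm_y = pooled[:n_x], pooled[n_x:]
        stat = abs(sum(perm_x) / n_x - sum(perm_y) / len(perm_y))
        if stat >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly 0.
    return (extreme + 1) / (n_perm + 1)


def benjamini_hochberg(p_values, q=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up rule at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])


# Simulate many experiments in which the null is true:
# both groups are drawn from the same normal distribution.
rng = random.Random(1)
null_p = []
for _ in range(400):
    a = [rng.gauss(0, 1) for _ in range(10)]
    b = [rng.gauss(0, 1) for _ in range(10)]
    null_p.append(permutation_p_value(a, b, n_perm=200, seed=rng.randrange(10**9)))

false_positive_rate = sum(p < 0.05 for p in null_p) / len(null_p)
print(f"fraction of null p-values below 0.05: {false_positive_rate:.3f}")
print(f"BH rejections among null p-values: {len(benjamini_hochberg(null_p))}")
```

Under these assumptions the first printed fraction should land near 0.05, which is the calibration property described above. The same p-values feed directly into the false discovery rate rule without any knowledge of how the underlying statistic was computed.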

We agree there are flaws with p-values, just like there are with any statistic one might choose to calculate. In particular, we do think that confidence intervals should be reported with p-values when possible. But we believe that any other decision-making statistic would lead to other problems. One thing we are sure about is that p-values beat scientists’ intuition about chance any day. So before bashing p-values too much we should be careful because, like democracy to government, p-values may be the worst form of statistical significance calculation except all those other forms that have been tried from time to time.

————————————————————————————————————

* Calculated using Google Scholar using the formula:

Number of P-value Citations = # of papers with exact phrase “P < 0.05” + (# of papers with exact phrase “P < 0.01” and not exact phrase “P < 0.05”) + (# of papers with exact phrase “P < 0.001” and not exact phrase “P < 0.05” or “P < 0.01”)

= 1,320,000 + 1,030,000 + 662,500 = 3,012,500

This is obviously an extremely conservative estimate.

Why all #academics should have professional @twitter accounts

05 Jan 2012

I started my professional Twitter account @leekgroup about a year and a half ago at the suggestion of a colleague of mine, John Storey (@storeylab). I started using the account to post updates on papers/software my group was publishing. Basically, everything I used to report on my webpage as “News”.

I started to give talks where the title slide included my Twitter name, rather than my webpage. It frequently drew the biggest laugh in the talk, and I would get comments like, “Do you really think people care what you are thinking every moment of every day?” That is what some people use Twitter for, and no, I’m not really interested in making that kind of update.

So I started describing why I think Twitter is useful for academics at the beginning of talks:

You can integrate it directly into your website (like so), using Twitter widgets. If you have a Twitter account you just go here, get the widget for your website, and add the code to your homepage. Now you don’t have to edit HTML to make news updates; you just log in to Twitter and type the update in the box.
You can quickly gain a much broader audience for your software/papers. In the past, I had to rely on people actually coming to my website to find my papers or seeing them in journals. Now, when I announce a paper, my followers see it and if they like it, they pass it on to their followers, etc. I have noticed that my papers are being downloaded more and by a broader audience since I joined.
I can keep up on what other people are doing. Many statisticians have Twitter accounts that they use professionally. I follow many of them and when they publish new papers, I see them pop up, rather than having to go to all their websites. It’s like an RSS feed of papers from people I want to follow.
You can connect with people outside academia. Particularly in my area, I’d like the statistical tools I’m developing to be used by folks in industry who work on genomics. It’s hard to get the word out about my methods through traditional channels, but a lot of those folks are on Twitter.

The best part is that there is an amplification effect to this medium. As more and more academics join and follow each other, it becomes easier and easier for us all to keep up with what is happening in the field. If you are intimidated by using any social media, you can get started with some really easy how-tos like this one.

Alright, enough advertising for Twitter, I’m going back to work.

Will Amazon Offer Analytics as a Service?

05 Jan 2012

Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek
