Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

Peter Norvig on the "Unreasonable Effectiveness of Data"

“The Unreasonable Effectiveness of Data”, a talk by Peter Norvig of Google. Sometimes, more data is more better. (Thanks to John C. for the link.)

A proposal for a really fast statistics journal

I know we need a new journal like we need a good poke in the eye. But I got fired up by the recent discussion of open science (by Paul Krugman and others) and the seriously misguided Research Works Act, which aimed to make it illegal to deposit published papers funded by the government in PubMed Central or other open-access databases.

I also realized that I spend a huge amount of time/effort on the following things: (1) waiting for reviews (typically months), (2) addressing reviewer comments that are unrelated to the accuracy of my work - just adding citations to referees' papers or doing additional simulations, and (3) resubmitting rejected papers to new journals - a huge time suck since I have to reformat, etc. Furthermore, if I want my papers to be published open access, I have to pay at minimum $1,000 per paper.

So I thought up my criteria for an ideal statistics journal. It would be accurate, have fast review times, and not discriminate based on how interesting an idea is. I have found that my most interesting ideas are the hardest ones to get published.  This journal would:

  • Be open-access and free to publish your papers there. You own the copyright on your work. 
  • The criteria for publication would be: (1) the paper has to do with statistics, computation, or data analysis, and (2) the work is technically correct. 
  • We would accept manuals, reports of new statistical software, and full length research articles. 
  • There would be no page limits/figure limits. 
  • The journal would be published exclusively online. 
  • We would guarantee reviews within 1 week and publication immediately upon review if criteria (1) and (2) are satisfied. 
  • Papers would receive a star rating from the editor - 0 to 5 stars. There would also be a place for readers to review articles. 
  • All articles would be published with a tweet/like button so they can be easily distributed. 
To achieve such a fast review time, here is how it would work. We would have a large group of Associate Editors (hopefully 30 or more). When a paper was received, it would be assigned to an AE. The AEs would agree to referee papers within 2 days. They would use a form like this:
  • Review of: Jeff’s Paper
  • Technically Correct: Yes
  • About statistics/computation/data analysis: Yes
  • Number of Stars: 3 stars

  • 3 Strengths of Paper (1 required): 
      • This paper revolutionizes statistics. 
  • 3 Weaknesses of Paper (1 required): 
      • The proof that this paper revolutionizes statistics is pretty weak because he only includes one example.
That’s it, super quick, super simple, so it wouldn’t be hard to referee. As long as the answers to the first two questions were yes, it would be published. 
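As a rough sketch (not part of the proposal itself, and with made-up field names), the form and the publish decision could be represented in code something like this:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Review:
    paper: str
    about_stats_comp_data: bool   # criterion (1): about statistics/computation/data analysis
    technically_correct: bool     # criterion (2): the work is technically correct
    stars: int                    # editor rating, 0-5 (informational only)
    strengths: List[str]          # at least one required
    weaknesses: List[str]         # at least one required

def publish(review: Review) -> bool:
    # Publication hinges only on the two yes/no questions; the star rating does not gate it.
    return review.about_stats_comp_data and review.technically_correct

example = Review(
    paper="Jeff's Paper",
    about_stats_comp_data=True,
    technically_correct=True,
    stars=3,
    strengths=["This paper revolutionizes statistics"],
    weaknesses=["The proof is weak because only one example is included"],
)
print(publish(example))  # True -> publish immediately after review
```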
So now here are my questions: 
  1. Would you ever consider submitting a paper to such a journal?
  2. Would you be willing to be one of the AEs for such a journal? 
  3. Is there anything you would change? 

Frighteningly Ambitious Startup Ideas


Sunday Data/Statistics Link Roundup (3/11)

  1. This is the big one: ESPN has opened up access to their API! It looks like the general public may only have access to some of the data, though; does anyone know more? 
  2. Looks like ESPN isn't the only sports-related organization in the API mood; Nike plans to open up an API too. It would be great if they offered better access to individual, downloadable data. 
  3. Via Leonid K.: a highly influential psychology study failed to replicate in a study published in PLoS One. The author of the original study went off on the authors of the paper, on PLoS One, and on the reporter who broke the story (including personal attacks!). To me, it looks like the authors of the PLoS One paper actually did a more careful study than the original authors. The authors of the PLoS One paper, the reporter, and the editor of PLoS One all replied in a much more reasonable way. See this excellent summary for all the details. Here are a few choice quotes from the comments: 

1. But there’s a long tradition in social psychology of experiments as parables,

2. I’d love to write a really long response, but let’s just say: priming methods like these fail to replicate all the time (frequently in my own studies), and the news that one of Bargh’s studies failed to replicate is not surprising to me at all.

3. This distinction between direct and conceptual replication helps to explain why a psychologist isn’t particularly concerned whether Bargh’s finding replicates or not.

  4. Reproducible != Replicable in scientific research. But Roger's perspective on reproducible research still seems appropriate here. 

Answers in Medicine Sometimes Lie in Luck
