Simply Statistics: A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

Sunday data/statistics link roundup (9/29/13)

The links are back! Read on.

  1. Susan Murphy - a statistician - wins a MacArthur Award. Great for the field of statistics (via Dan S. and Simina B., among others).
  2. Related: an interview with David Donoho about the Shaw Prize. Statisticians are blowing up! (via Rafa)
  3. Hope that the award winners don’t lose momentum! (via Andrew J.)
  4. Hopkins grad students take to the Baltimore Sun to report yet more ongoing negative effects of sequestration. Particularly appropriate in light of the current mayhem around keeping the government open. (via Rafa)
  5. Great BBC piece featuring David Spiegelhalter on the science of chance. I rarely watch YouTube videos that long all the way through, but I made it to the end of this one.
  6. Love how Yahoo Finance has recognized the agonized cries of statisticians and is converting pie charts to bar charts. (via Rafa - who has actually given up on the issue).
  7. Don’t use Hadoop - your data aren’t that big.
  8. Don’t forget to sign up for the future of statistics unconference on October 30th, noon-1pm Eastern. We have an awesome lineup of speakers, and over 500 people have RSVP’d on Google Plus alone. It’s going to be a thing.

Announcing Statistics with Interactive R Learning Software Environment

Editor's note: This post was written by Nick Carchedi, a Master's degree student in the Department of Biostatistics at Johns Hopkins. He is working with us to develop software for interactive learning of R and statistics. 

Inspired by the relative lack of computer-based platforms for learning statistics and the R programming language, we at Johns Hopkins Biostatistics have created a new R package designed to teach both topics simultaneously and interactively. Accordingly, we’ve named the package swirl, which stands for “Statistics with Interactive R Learning”. We sought to model swirl after other highly successful interactive learning platforms such as Codecademy, Code School, and Khan Academy, but with a specific focus on teaching statistics and R. Additionally, we wanted users to learn these topics within the same environment in which they would be applying them, namely the R console.

If you’re reading this article, then you probably already have an appreciation for the R language and there’s no need to beat that drum any further. Staying true to the R culture, the swirl package is totally open-source and free for anyone to use, modify, or improve. Furthermore, anyone with something to teach can use the platform to create their own interactive content for the world to use.

A typical swirl session has the user load the package from the R console, choose the course he or she would like to take from a menu of options, and then work through 10-15 minute interactive modules, each covering a particular topic. A module generally alternates between instructional text output to the user and prompts for the user to answer questions. One question may ask for the result of a simple numerical calculation, while another requires the user to enter an actual R command (which is parsed and executed, if correct) to perform a requested task. Multiple choice, text-based, and approximate numerical answers are also fair game. Whenever the user answers a question incorrectly, immediate feedback is given in the form of a hint before the user is prompted to try again. Finally, plots, figures, and even videos may be incorporated into a module for the sake of reinforcing the methods or concepts being taught.
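To give a sense of what this looks like in practice, here is a minimal sketch of how a session might begin at the R console. This is illustrative only: it assumes the package has been installed from the swirl website and that swirl() is the function that launches the course menu, and the course names and prompts shown in the comments are made up for the example.

    # Minimal sketch of starting a swirl session (illustrative; assumes the
    # package is installed and that swirl() launches the course menu)
    library(swirl)   # load the package
    swirl()          # start the interactive menu in the console

    # From here swirl drives the session with console prompts, roughly:
    # | Please choose a course (hypothetical course names):
    # |   1: Introductory statistics    2: Basic R commands
    # | Selection: 1
    # | ...instructional text...
    # | Question: What is the mean of the vector c(2, 4, 6)?
    # > mean(c(2, 4, 6))   # the user's R command is parsed and executed
    # | Correct! Moving on...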

We believe that this form of interactive learning, or learning by doing, is essential for true mastery of topics as challenging and complex as statistics and statistical computing. While we are aware of a handful of other platforms for learning R interactively, our goal was to focus on the teaching of R and statistics simultaneously. As far as we know, swirl is the only platform of its kind and almost certainly the only one that takes place within the R console.

When we developed the swirl package, we wanted from the start to allow other people to extend and customize it to their particular needs. The beauty of the swirl platform is that anyone can create their own content and have it included in the package for all users to access. We have designed pre-formatted templates (color-coded spreadsheets) that instructors can fill out with their own content according to a fairly simple set of instructions. Once instructors send us the completed templates, we then load the content into the package so that anyone with the most recent version of swirl on their computer can access the content. We’ve tried to make the process of content creation as simple and painless as possible so that the statistics and computing communities are encouraged to share their knowledge with the world through our platform.

The package currently includes only a few sample modules that we’ve created in-house, primarily serving as demonstrations of how the platform works and how a typical module may appear to users. In the future, we envision a vibrant and dynamic collection of full courses and short modules that users can vote up or down based on the quality of their experience with each. In such a scenario, the very best courses would naturally float to the top and the less effective courses would fall out of favor and perhaps be recommended for revision.

In addition to making more content available to future users, we hope to one day transition swirl from being an interactive learning environment to one that is truly adaptive to the individual needs of each user. Perhaps this future version of our software would support a more intricate web of content, intelligently navigating users among topics based on a dynamic, data-driven interpretation of their strengths, weaknesses, competencies, and knowledge gaps. With the right people on board, this could become a reality.

We’ve created this package with the hope that the statistics and computing communities find it to be a valuable educational tool. We’ve got the basic infrastructure in place, but we recognize that there is a great deal of room for improvement. The swirl package is still very much in development and we are actively seeking feedback on how we can make it better. Please visit the swirl website to download the package or for more information on the project. We’d love for you to give it a try and let us know what you think.

Go to the swirl website: http://swirlstats.com

How could code review discourage code disclosure? Reviewers with motivation.

An article appeared a couple of days ago in Nature describing Mozilla’s efforts to implement code review for scientific papers. As anyone who follows our blog knows, we are in favor of reproducible research, in favor of disclosing code, and in favor of open science.

So people were surprised when they saw this quote from Roger at the end of the Nature piece:

“One worry I have is that, with reviews like this, scientists will be even more discouraged from publishing their code. We need to get more code out there, not improve how it looks.”

Not surprisingly, a bunch of reproducible research/open science people were quick to jump on this quote.

Now, Roger’s quote was actually a little more nuanced, and it was posted after a pretty in-depth discussion on Twitter.

But I think the real source of confusion was best summed up by Titus B.

The key issue is that people are surprised sharing code would be anything but an obvious thing to do. To people who share code all the time, this is an obvious no-brainer. My bias is clearly in that camp as well. I require reproducibility of my students’ analyses, I discuss reproducible research when I teach, I take my own medicine by making my analyses reproducible, and I frequently state in reviews that papers are only acceptable after the code is available.

So what’s the big deal?

In an incredibly interesting coincidence, I had a paper come out the same week in Biostatistics that has been, uh, a little controversial.

In this case, our paper was published with discussion. For people outside of statistics, a discussant and a reviewer are different things. The paper first goes through peer review in the usual way. Then, once it is accepted for publication, it is sent out to discussants to read and comment on.

A couple of discussants were very, very motivated to discredit our approach. Despite this, because we believe in open science, stating our assumptions, and being reproducible, we made all of the code we used and data we collected available for the discussants (and for everyone else). In an awesome win for open science, many of the discussants used/evaluated our code in their discussions.

One of the very motivated discussants identified an actual bug in the code. This bug caused the journal names to be scrambled in Figures 3 and 4. The bug (thank goodness!) did not substantively alter the methods, the results, or the conclusions of our paper. On top of that, the cool thing about having our code on GitHub was that we could carefully look it over, fix the bug, and push the changes to the repository (and update the paper), so the discussant could see the revised version as soon as we pushed it.

We were happy that the discussant didn’t find any more substantial bugs (because we knew they were motivated to review our code for errors as carefully as possible). We were also happy to make the changes, admit our mistake and move on.

An interesting thing happened though. The motivated discussant wanted to discredit our approach. So they included in the supplement how they noticed the bug (totally fair game, it was a bug). But they also included their email exchange with the editor about the bug and this quote:

As all seasoned methodologists know, minor coding errors causing total havoc is quite common (I have seen it happen in my own work).  I think that it is ironic that a paper that claims to prove the reliability of the literature had completely messed up the two main figures that represent the core of all its data and its main results.

A couple of points here: (1) the minor bug didn’t wreak havoc with our results, didn’t change any conclusions, and didn’t affect our statistics; and (2) the statement is clearly designed for the sole purpose of embarrassing us (the authors) and discrediting our work.

The problem here is that the code reviewer deeply cares about us being wrong. This incident highlights one reason for Roger’s concerns. I feel we acted in pretty good faith here to try to be honest about our assumptions and open with our code. We also responded quickly and thoroughly to the report of a bug. But the discussant used the fact that we had a bug at all to try to discredit our whole analysis with sarcasm. This sort of thing could absolutely discourage a person from releasing code.

One thing the discussant is absolutely right about is that most code will have minor bugs. Personally, I’m very grateful to the discussant for catching the bug before the work was published and I’m happy that we made the code available and corrected our mistake.

But the key risk here is that some people will demand reproducible code only so they can try to embarrass analysts and discredit science they don’t like.

If we want people to make code available, be willing to admit mistakes, and continuously update their code, then we don’t just need code review. We need a policy and a commitment from the community not to use reproducible research merely as a vehicle for embarrassing and discrediting each other. Specifically, we need a policy that:

  1. Doesn’t discourage people from putting code up before papers are published for fear of embarrassment.
  2. Acknowledges minor bugs happen and doesn’t penalize people for admitting them/fixing them.
  3. Prevents people from publishing when they have major typos, but doesn’t humiliate them.
  4. Defines specific, positive ways that code sharing can benefit the community (collaboration) rather than only reporting errors that are discovered when code is made available.
  5. Recognizes that most scientists are not professional software developers and focuses review on the scientific correctness/reproducibility of code, rather than technical software development skills.

One way I think we could address a lot of these issues is not to think of it as code review, but as code evaluation and update. That is one thing I really like about Mozilla’s approach - they report their findings to the authors and let them respond. The only thing that would be better is if Mozilla actually created patches/bug fixes for the code and issued pull requests that the authors could incorporate. 

Ultimately, I hope we can focus on a way to make scientific software correct, not just point out how it is wrong.

Is most science false? The titans weigh in.

Some of you may recall that a few months ago my colleague and I posted a paper to the arXiv on estimating the rate of false discoveries in the scientific literature. The paper was picked up by the Tech Review and led to posts on Andrew G.’s blog, on Discover blogs, and on our blog. One other interesting feature of our paper was that we put all the code/data we collected on GitHub.

At the time this whole thing blew up, our paper still wasn’t published. After the explosion of interest, we submitted the paper to Biostatistics. They liked the paper and actually solicited formal discussion of our approach by other statisticians. We were then allowed to respond to the discussions.

Overall, it was an awesome experience at Biostatistics - they did a great job of doing a thorough but timely review. They got some amazing discussants. Finally, they made our paper open-access. So much goodness. (Conflict of interest disclaimer: I am an associate editor for Biostatistics.)

A number of papers came out as part of the discussion, and I think they are all worth reading.

I’m very proud of our paper and the rejoinder. The discussants were very passionate and added a huge amount of value, particularly in the collection/analysis of our data and additional data they collected.

I think it is 100% worth reading all of the papers over at Biostatistics, but for the TL;DR crowd, here are some take-home messages from the experience and from the discussion above:

  1. Posting to the arXiv can be a huge advantage for a paper like ours, but be ready for the heat.
  2. Biostatistics (the journal) is awesome. Great job of reviewing/editing in a timely way and great job of organizing the discussion!
  3. When talking about the science-wise false discovery rate you have to bring data.
  4. We proposed the first formal framework for evaluating the science-wise false discovery rate, which lots of people care about (and there are a ton of ideas in the discussion about ways to estimate it better). A toy sketch of the underlying p-value mixture idea appears right after this list.
  5. I think based on our paper and the discussion that it is pretty unlikely that most published research is false. But that probably varies by your definition of false/what you mean by most/the journal type/the field you are considering/the analysis type/etc.
  6. This is a question people care about. A lot.
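For readers who want a concrete feel for what a framework like this involves, here is a toy R sketch of the general idea. To be clear, this is a simplified illustration, not the exact model or estimation procedure from our paper: reported p-values below 0.05 are treated as a mixture of a uniform component (coming from true nulls) and a component concentrated near zero (coming from true alternatives), and the estimated weight on the uniform component plays the role of the science-wise false discovery rate.

    # Toy sketch of a science-wise false discovery rate estimate.
    # This is NOT the model from our paper -- just the general p-value
    # mixture idea: p-values below 0.05 are a mix of a uniform component
    # (true nulls) and a right-skewed component (true alternatives), and we
    # estimate the null weight pi0 by maximum likelihood.

    set.seed(1)
    alpha <- 0.05

    # Simulated "reported" p-values: 15% from nulls, 85% from alternatives
    n <- 2000
    is_null <- runif(n) < 0.15
    p <- ifelse(is_null,
                runif(n, 0, alpha),                          # nulls: uniform on (0, 0.05)
                alpha * rbeta(n, shape1 = 0.3, shape2 = 1))  # alternatives: piled up near 0

    # Negative log-likelihood of the two-component mixture on (0, alpha)
    neg_loglik <- function(par) {
      pi0 <- plogis(par[1])                 # mixing weight, constrained to (0, 1)
      a   <- exp(par[2])                    # beta shape for the alternative component
      f0  <- 1 / alpha                                          # uniform density
      f1  <- dbeta(p / alpha, shape1 = a, shape2 = 1) / alpha   # scaled beta density
      -sum(log(pi0 * f0 + (1 - pi0) * f1))
    }

    fit <- optim(c(0, log(0.5)), neg_loglik)
    plogis(fit$par[1])   # estimated false discovery proportion (truth here is 0.15)

The curve fitting is the easy part; as the discussion makes clear, the real work is in collecting the p-values and deciding what population of results they represent.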

Finally, I think this is the most important quote from our rejoinder:

We are encouraged, however, that several of the discussants collected additional data to evaluate the impact of the above decisions on the SWFDR estimates. The discussion illustrates the powerful way that data collection can be used to move the theoretical and philosophical discussion on to a more concrete, scientific footing—discussing the specific strengths and weaknesses of a particular empirical approach. Moreover, the interesting additional data collected by the discussants on study types, journals, and endpoints demonstrate that data beget data and lead to a stronger and more directed conversation.

How I view an academic talk: like a sports game

I know this is a little random/non-statisticsy but I have been thinking about it a lot lately. Over the last couple of weeks I have been giving a bunch of talks and guest lectures here locally around the Baltimore/DC area. Each one of them was to a slightly different audience.

As I was preparing/giving all of these talks I realized I have a few habits that I have developed in the way I view the talks and in the way that I give them. I 100% agree with Hilary M. that a talk should entertain more than it should teach. I also try to give talks that I would like to see myself.

Another thing I realized is that I view talks in a very specific way. I see them as a sports game. From the time I was a kid until the end of graduate school I was on sports teams. I love playing/watching all kinds of sports and I definitely miss playing competitively.

Unfortunately, being a faculty member doesn’t leave much time for sports. So now, the only chance I have to get up and play is during a talk. Here are the ways that I see the two activities as being similar:

  1. They both require practice. I played a lot of sports with this guy who liked the quote, “Practice doesn’t make perfect, perfect practice makes perfect”. I feel the same way.
  2. They are both a way to entertain. I rarely played in front of crowds as big as the groups I speak to these days, but whenever there was an audience I would always get way more pumped up.
  3. There is some competition to both. In terms of talks, there is always at least one audience member who wants to challenge your ideas. I see this exchange as a game, rather than something I dread. Sometimes I win (my answers cover all the questions) and sometimes I lose (I missed something important). Usually, winning is associated with better practice.
  4. I get a rush from both playing in games and giving talks. Part of that is self-fueled. I like to listen to pump-up music right before I give a talk or play a game.

One thing I wish is that more talks were joint talks. One thing I love about sports is playing on a team. The preparation of a talk is always done with a team - usually the students/postdocs/collaborators working on the project. But I wish the presentations themselves were more often a team activity. When I give a talk with someone else, it is more fun to celebrate if the talk went well and less painful if I flub. Plus it is fun to cheer on your teammate.

Does anyone else think of talks this way? Or do you have another way of thinking about talks?