Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

Should the Cox Proportional Hazards model get the Nobel Prize in Medicine?

I’m not the first one to suggest that biostatistics has been undervalued in the scientific community, and some of the shortcomings of epidemiology and biostatistics have been noted elsewhere. But this previous work focuses primarily on the contributions of statistics/biostatistics at the purely scientific level.

The Cox Proportional Hazards model is one of the most widely used statistical models in the analysis of data from clinical trials and other medical studies. The corresponding paper has been cited over 32,000 times, and even that dramatically underestimates how often the model has been used. It is one of “those methods” that doesn’t even require a reference to the original methods paper anymore.

Many of the most influential medical studies, including major studies like the Women’s Health Initiative, have used these methods to answer some of our most pressing medical questions. Despite the incredible impact of this statistical technique on the world of medicine and public health, it has not received the Nobel Prize. This isn’t an aberration: statistical methods are not traditionally considered for Nobel Prizes in Medicine. The prizes primarily recognize biochemical, genetic, or public health discoveries.

In contrast, many economics Nobel Prizes have been awarded primarily for the discovery of a new statistical or mathematical concept. One example is the ARCH model. The Nobel Prize in Economics in 2003 was awarded to Robert Engle, the person who proposed the original ARCH model. The model has gone on to have a major impact on financial analysis, much like the Cox model has had a major impact on medicine.

So why aren’t Nobel Prizes in medicine awarded to statisticians more often? Other methods such as ANOVA, P-values, etc. have also had an incredibly large impact on the way we measure and evaluate medical procedures. Maybe as medicine becomes increasingly driven by data, we will start to see more statisticians recognized for their incredible discoveries and the huge contributions they make to medical research and practice.

 

Sunday data/statistics link roundup (12/16/12)

  1. A directory of open access journals. Very cool idea to aggregate them. Here is a blog post from one of my favorite statistics bloggers about why open-access journals are so cool. Just like in a lot of other areas, open access journals can be thought of as an open data initiative.
  2. Here is a website that displays data on the relative wealth of neighborhoods, broken down by census tract. It’s pretty fascinating to take a look and see what the income changes are, even in regions pretty close to each other.
  3. More citizen science goodness. Zooniverse has a new project where you can look through a bunch of pictures in the Serengeti and see if you can find animals.
  4. Nate Silver talking about his new book with Hal Varian. I have skimmed the book and found that the parts about baseball/politics are awesome and the other parts seem a little light. But maybe that’s just my pre-conceived bias? I’d love to hear what other people thought…

Computing for Data Analysis Returns

I’m happy to announce that my course Computing for Data Analysis will return to Coursera on January 2nd, 2013. While I had previously announced that the course would be presented again right here, it made more sense to do it again on Coursera where it is (still) free and the platform there is much richer. For those of you who missed it the last time around, this is your chance to take it and learn a little R.

I’ve gotten a number of emails from people who were interested in watching the videos for the course. If you just want to sit around and watch videos of me talking, I’ve created a set of four YouTube playlists based on the four weeks of the course:

The content in the YouTube playlists reflects the first iteration of the course and will not reflect any new material I add to the second iteration (at least not for a little while).

I encourage everyone who is interested to enroll in the course on Coursera because there you’ll have the benefit of in-video quizzes and other forms of assessment and will be able to interact with all of the great students who are also enrolled in the class. Also, if you’re interested in signing up for Jeff Leek’s Data Analysis course (starts on January 22, 2013) and are not very familiar with R, I encourage you to check out Computing for Data Analysis first to get yourself up to speed.

I look forward to seeing you there!

Joe Blitzstein's free online stat course helps put a critical satellite in orbit

As loyal readers know, we are very enthusiastic about MOOCs. One of the main reasons for this is the potential of teaching Statistics to students from all over the world, in particular those who can’t afford or don’t have access to college. However, it turns out that rocket scientists can also benefit. Check out the feedback Joe Blitzstein, professor of one of the most popular online stat courses, received from one of his students:

As an “old bubba” aerospace engineer I watched your Stat 110 class and enjoyed it very much. It sure blew out a lot of cobwebs that had collected over the past 35 years working as an aerospace engineer. As you might guess, we deal with a lot of probability. Just recently I was involved in a study to see what a blocked Reaction Control System (RCS) might do to a satellite… I am a Spacecraft Attitude Control systems engineer and it was my job to simulate what would happen if a certain RCS engine was plugged. It was a weird problem and it inspired me to watch your class… Fortunately, the statistics showed that the RCS nozzles that could get plugged would have a low probability and would not affect our ability to adjust the vehicle’s orbit. And we launched it this past summer and everything went perfect! So I just wanted to tell you that when you teach your “kiddos” tell them that Stat 110 has real life implications. This satellite is a critical national defense asset that saves the lives of our soldiers on the ground.

I doubt “Old Bubba” has time to go back to school to refresh his stats knowledge… but thanks to Joe’s online class, he no longer needs to. This is yet another advantage MOOCs offer: giving busy professionals a practical way to learn new skills or brush up on specific topics.

Sunday data/statistics link roundup (12/9/12)

  1. Some interesting data/data visualizations about working conditions in the apparel industry. Here is the full report. Whenever I see reports like this, I wish the raw data were more clearly linked. I want to be able to get in, play with the data, and see if I notice something that doesn’t appear in the infographics. 
  2. This is an awesome plain-language discussion of how a bunch of methods (CS and Stats) with fancy names relate to each other. It shows that CS/Machine Learning/Stats are converging in many ways and there isn’t much new under the sun. On the other hand, I think the really exciting thing here is to use these methods on new questions, once people drop the stick.
  3. If you are a reader of this blog and somehow do not read anything else on the internet, you will have missed Hadley Wickham’s Rcpp tutorial. In my mind, this pretty much seals it: Julia isn’t going to overtake R anytime soon. In other news, Hadley is coming to visit JHSPH Biostats this week! I’m psyched to meet him.
  4. For those of us who live in Baltimore, this interesting set of data visualizations lets you in on the crime hotspots. This is a much fancier/more thorough analysis than Rafa and I did way back when.
  5. Check out the new easy stats tool from the Census (via Hilary M.) and read our interview with Tom Louis who is heading over there to the Census to do cool things. 
  6. Watch out, some TEDx talks may be pseudoscience! More later this week on the politicization/glamourization of science, so stay tuned.