Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

Name 5 statisticians, now name 5 young statisticians

I have been thinking for a while about how hard it is to find statisticians to interview for the blog. When I started the interview series, it was targeted at interviewing statisticians at the early stages of their careers. It is relatively easy, if you work in academic statistics, to name 5 famous statisticians. If you asked me to do that, I’d probably say something like: Efron, Tibshirani, Irizarry, Prentice, and Storey. I could also name 5 famous statisticians in industry with relative ease: Mason, Volinsky, Heineike, Patil, Conway.

Most of that is because of where I went to school (Storey/Prentice), the area I work in (Tibshirani/Irizarry/Storey), my advisor (Storey), the bootstrap (Efron), and the people I see on Twitter (all the industry folks). I could, of course, name a lot of other famous statisticians, but almost all of those choices are biased by my education or the books I read.

But almost surely I will miss people who work outside my area or didn’t go to school where I did. This is particularly true in applied statistics, where people might not even spend most of their time in statistics departments. It is doubly true of people who are young and just getting started, as I haven’t had a chance to hear about them.

So if you have a few minutes, name five statisticians you admire in the comments. Then name five junior statisticians you think will be awesome. They don’t have to be famous (in fact, it is better if they are good but not famous so I can learn something). Plus, it will be interesting to see the responses.

Yes, Clinical Trials Work

This Saturday the New York Times published an opinion piece wondering “do clinical trials work?”. The answer, of course, is: absolutely. For those that don’t know the history, randomized controlled trials (RCTs) are one of the reasons why life spans skyrocketed in the 20th century. Before RCTs, wishful thinking and arrogance led numerous well-meaning scientists and doctors to incorrectly believe their treatments worked. RCTs are so successful that they have been adopted with much fanfare in far-flung arenas like poverty alleviation (see, e.g., this discussion by Esther Duflo), where wishful thinking also led many to incorrectly believe their interventions helped.

The first chapter of this book contains several examples and is a really nice introduction to clinical studies. A very common problem was that the developers of a treatment would create treatment groups that were healthier to start with; randomization takes care of this. To understand the importance of controls, I quote the opinion piece to demonstrate a common mistake we humans make: “Some patients did do better on the drug, and indeed, doctors and patients insist that some who take Avastin significantly beat the average.” The problem is that because Avastin did not do better on average, the exact same statement can be made about the control group! It also means that some patients did worse than average too. The use of a control points to the possibility that Avastin has nothing to do with the observed improvements.
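
To make this concrete, here is a minimal R simulation of our own (made-up outcomes, not anything from the Avastin trials) in which the drug has no effect at all, yet plenty of patients in both arms “beat the average”:

```r
# Made-up outcomes with NO treatment effect at all: both arms are draws from
# the same distribution, yet many patients in each arm beat the average.
set.seed(1)
n <- 200
treated <- rnorm(n)                 # outcomes for patients on the "drug"
control <- rnorm(n)                 # outcomes for patients on the control
overall_avg <- mean(c(treated, control))

mean(treated > overall_avg)  # about half the treated patients "beat the average"
mean(control > overall_avg)  # ...and so do about half the controls
```

Without a control arm it is tempting to credit the above-average responders in the treated group to the drug; with one, you can see that the same pattern appears where no drug was given.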

The opinion piece is very critical of current clinical trials work and complains about the “dismal success rate for drug development”. But what is the author comparing to? Dismal compared to what? We are talking about developing complicated compounds that must be both safe and efficacious in often critically ill populations. It would be surprising if our success rate were incredibly high. Or is the author comparing the current state of affairs to the pre-clinical-trials days when procedures such as bloodletting were popular?

A better question might be, “how can we make clinical trials more efficient?” There is a lively and ongoing research area devoted to answering this question. In some cases trials can definitely be improved by adapting to new developments such as biomarkers and the advent of personalized medicine. This is why there are dozens of statisticians working in this area.

The article says that

“[p]art of the novelty lies in a statistical technique called Bayesian analysis that lets doctors quickly glean information about which therapies are working best.”

As Jeff pointed out, this is a pretty major oversimplification of all of the hard work that it takes to maintain scientific integrity and patient safety when studying new compounds. The fact that the analysis is Bayesian is ancillary to other issues like adaptive trials (as Julian pointed out in the comments), dynamic treatment regimes, or even more established ideas like group sequential trials. The basic principle underlying these ideas is the same: _can we run a trial more efficiently while achieving reasonable estimates of effect sizes and uncertainties?_ You could imagine doing this by focusing on subpopulations with specific biomarkers for which the treatment seems to work well, or by stopping trials early if drugs are strongly (in)effective, or by picking optimal paths through multiple treatments. Whether the statistical methodology is Bayesian or Frequentist has little to do with the ways that clinical trials are adapting to be more efficient.
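
To give a flavor of the early-stopping idea, here is a rough R sketch of a two-stage trial with one interim look. This is our own illustration: the stopping boundaries are numbers we picked for this post, not a validated group sequential design.

```r
# Sketch of a two-stage trial with one interim look. The cutoffs (2.8 at the
# interim, 1.98 at the end) are illustrative only.
set.seed(2)
simulate_trial <- function(n_per_arm = 100, effect = 0.5,
                           interim_cut = 2.8, final_cut = 1.98) {
  half <- n_per_arm / 2
  trt <- rnorm(n_per_arm, mean = effect)  # outcomes on the new treatment
  ctl <- rnorm(n_per_arm, mean = 0)       # outcomes on control

  # Interim analysis on the first half of each arm
  s1 <- unname(t.test(trt[1:half], ctl[1:half])$statistic)
  if (abs(s1) > interim_cut) {
    return(list(stopped_early = TRUE, statistic = s1))
  }

  # Otherwise continue to the full sample and test at the end
  s2 <- unname(t.test(trt, ctl)$statistic)
  list(stopped_early = FALSE, statistic = s2, reject = abs(s2) > final_cut)
}

stopped <- replicate(1000, simulate_trial()$stopped_early)
mean(stopped)  # fraction of simulated trials that ended at the interim look
```

Real designs choose the interim and final boundaries to control the overall error rate; the sketch just shows the basic trade-off, namely that some trials can end at half the sample size in exchange for a slightly stricter final test.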

This is a wide open area and deserves a much more informed conversation. I’m providing here a list of resources that would be a good place to start:

  1. An introduction to clinical trials
  2. Michael Rosenblum’s adaptive trial design page. 
  3. Clinicaltrials.gov - registry of clinical trials
  4. Test, learn adapt - a white paper on using clinical trials for public policy
  5. Alltrials - an initiative to make all clinical trial data public
  6. ASCO clinical trials resources - on clinical trials ethics and standards
  7. Don Berry’s paper on adaptive design.
  8. Fundamentals of clinical trials - a good general book (via David H.)
  9. Clinical trials, a methodological perspective - a more regulatory take (via David H.)

This post is by Rafa and Jeff. 

Sunday data/statistics link roundup (7/14/2013)

  1. Question: Do clinical trials work? Answer: Yes. Clinical trials are one of the defining success stories in the process of scientific inquiry. Do they work as fast/efficiently as a pharma company with potentially billions on the line would like? That is definitely much more up for debate. Most of the article is a good summary of how drug development works, although I think the statistics reporting is a little prone to hyperbole. I also think this sentence is misleading, wrong, and way over the top: “Part of the novelty lies in a statistical technique called Bayesian analysis that lets doctors quickly glean information about which therapies are working best. There’s no certainty in the assessment, but doctors get to learn during the process and then incorporate that knowledge into the ongoing trial.”
  2. The fun begins in the grim world of patenting genes. Two companies are being sued by Myriad even though they just lost the case on their main patent. Myriad is claiming violation of one of their 500 or so other patents. Can someone with legal expertise give me an idea - is Myriad now a patent troll?
  3. R spells for data wizards from Thomas Levine. I also like the pink on grey look.
  4. Larry W. takes on non-informative priors. Worth the read, particularly the discussion of how non-informative priors can be informative in different parameterizations (see the short sketch after this list). The problem Larry points out here is one I think is critical: in big data applications where the goal is often discovery, we rarely have enough prior information to make reasonable informative priors either. Not to say some regularization can’t be helpful, but I think there is danger in putting an even weakly informative prior on a poorly understood, high-dimensional space and then claiming victory when we discover something.
  5. Statistics and actuarial science are jumping into a politically fraught situation by raising insurance rates for schools that allow teachers to carry guns. Fiscally, this is clearly going to be the right move. I wonder what the political fallout will be for the insurance company and for the governments that passed these laws (via Rafa via Marginal Revolution).
  6. Timmy!! Tim Lincecum throws his first no-hitter. I know this isn’t strictly data/stats, but he went to UW like me!
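
As promised in item 4, here is a quick R sketch of the parameterization point. It is our own illustration, not something from Larry’s post: a uniform prior on a probability p is anything but flat once you look at the same draws on the log-odds scale.

```r
# A "non-informative" uniform prior on a probability p is not flat on the
# log-odds scale: flatness depends on the parameterization you choose.
set.seed(3)
p <- runif(1e5)               # draws from a flat prior on p in (0, 1)
log_odds <- log(p / (1 - p))  # the same draws, re-expressed as log-odds

par(mfrow = c(1, 2))
hist(p, main = "Flat on the probability scale", xlab = "p")
hist(log_odds, main = "Peaked near 0 on the log-odds scale", xlab = "log-odds")
```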

What are the iconic data graphs of the past 10 years?

This article in the New York Times about the supposed death of photography got me thinking about statistics. Apparently, the death of photography has been around the corner for some time now:

For years, photographers have been bracing for this moment, warned that the last rites will be read for photography when video technology becomes good enough for anyone to record. But as this Fourth of July showed me, I think the reports of the death of photography have been greatly exaggerated.

Yet, photography has not died and, says Robin Kelsey, a professor of photography at Harvard,

The fact that we can commit a single image to memory in a way that we cannot with video is a big reason photography is still used so much today.

This got me thinking about data graphics. One long-time gripe about data graphics in R has been its horrible lack of support for dynamic or interactive graphics. This was an indisputable fact, especially in the early years. Nowadays there are quite a few extensions and packages that allow R to create dynamic graphics, but it still doesn’t feel like part of the “core”. I still feel like when I talk to people about R, the first criticism they jump to is the poor support for dynamic/interactive graphics.
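
For the record, here is the kind of minimal interactive graphic those extensions make possible. I’m using the shiny package as one example, and the toy app below is purely an illustration of ours, not something from the article:

```r
# Minimal interactive graphic in R using shiny: a slider controls the sample
# size of a histogram that redraws as you move it.
library(shiny)

ui <- fluidPage(
  sliderInput("n", "Sample size", min = 10, max = 1000, value = 100),
  plotOutput("hist")
)

server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = paste("n =", input$n))
  })
}

shinyApp(ui, server)  # launches the app in a browser
```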

But personally, I’ve never thought it was a big deal. Why? Because I don’t really find such graphics useful for truly thinking about data. I’ve definitely enjoyed viewing some of them (especially some of the D3 stuff), and it’s often fun to move sliders around and see how things change (perhaps my favorite is the Baby Name Voyager or maybe this one showing rapper wealth).

But in the end, what are you supposed to walk away with? As a creator of such a graphic, how are you supposed to communicate the evidence in the data? The key element of dynamic/interactive graphics is that they allow the viewer to explore the data in their own way, not in some prescribed static way that you’ve explicitly set out. Ultimately, I think that aspect makes dynamic graphics useful for presenting data, but not that useful for presenting evidence. If you want to present evidence, you have to tell a story with the data; you can’t just let the viewer tell their own story.

This got me thinking about what are the iconic data “photos” of the past 10 years (or so). The NYT article mentions the famous “Raising the Flag on Iwo Jima” by AP photographer Joe Rosenthal as an image that many would recognize (and perhaps remember). What are the data graphics that are burned in your memory?

I’ll give one example. I remember seeing Richard Peto give a talk here about the benefits of smoking cessation and its effect on life expectancy. He found that, according to large population surveys, people who quit smoking by the age of 40 or so had more or less the same life expectancy as those who never smoked at all. The graph he showed was very similar to Figure 3 from this article. Although I already knew that smoking was bad for you, this picture really crystallized it for me in a specific way.

Of course, sometimes data graphics are memorable for other reasons, but I’d like to try and stay positive here. Which data graphics have made a big impression on you?

Repost: Preventing Errors Through Reproducibility

Checklist mania has hit clinical medicine thanks to people like Peter Pronovost and many others. The basic idea is that simple and short checklists along with changes to clinical culture can prevent major errors from occurring in medical practice. One particular success story is Pronovost’s central line checklist which dramatically reduced bloodstream infections in hospital intensive care units.

There are three important points about the checklist. First, it neatly summarizes information, bringing the latest evidence directly to clinical practice. It is easy to follow because it is short. Second, it serves to slow you down from whatever you’re doing. Before you cut someone open for surgery, you stop for a second and run the checklist. Third, it is a kind of equalizer that subtly changes the culture: everyone has to follow the checklist, no exceptions. A number of studies have now shown that when clinical units follow checklists, infection rates go down and hospital stays are shorter compared to units using standard procedures.

Here’s a question: What would it take to convince you that an article’s results were reproducible, short of going in and reproducing the results yourself? I recently raised this question in a talk I gave at the Applied Mathematics Perspectives conference. At the time I didn’t get any responses, but I’ve had some time to think about it since then.

I think most people are thinking of this issue along the lines of “The only way I can confirm that an analysis is reproducible is to reproduce it myself”. In order for that to work, everyone needs to have the data and code available to them so that they can do their own independent reproduction. Such a scenario would be sufficient (and perhaps ideal) to claim reproducibility, but is it strictly necessary? For example, if I reproduced a published analysis, would that satisfy you that the work was reproducible, or would you have to independently reproduce the results for yourself? If you had to choose someone to reproduce an analysis for you (not including yourself), who would it be?

This idea is embedded in the reproducible research policy at _Biostatistics_, but of course we make the data and code available too. There, a (hopefully) trusted third party (the Associate Editor for Reproducibility) reproduces the analysis and confirms that the code was runnable (at least at that moment in time).

It’s important to point out that reproducible research is not only about correctness and prevention of errors. It’s also about making research results available to others so that they may more easily build on the work. However, preventing errors is an important part, and the question then is: what is the best way to do that? Can we generate a reproducibility checklist?