04 May 2012
The National Academy of Sciences elected new members a few days ago. Among them are statistician Robert Tibshirani and sociologist Stephen Raudenbush. Obviously well-deserved!
(Thanks to Karl Broman.)
04 May 2012
[youtube http://www.youtube.com/watch?v=k6aBITJuSQA?wmode=transparent&autohide=1&egm=0&hd=1&iv_load_policy=3&modestbranding=1&rel=0&showinfo=0&showsearch=0&w=500&h=375]
Hammer on the importance of statistics (or, as I used to know him, MC Hammer). The overlay of the video for “Can’t Touch This” really helps me understand what he’s talking about. (Thanks to Chris V. for the link.)
02 May 2012
Bad news, comrades. Dongle communism is under attack. Check out how this poor dongle has been subjugated. This is in our lab meeting room. To add insult to injury, this happened on May 1st!

01 May 2012
If you have analyzed enough high-throughput data, you have seen it before: a male sample that is really female, a liver that is really a kidney, etc. As the datasets I analyze get bigger, I see more and more sample mix-ups. When I find a couple of samples for which sex is incorrectly annotated (one can easily see this by examining data from the X and Y chromosomes), I can’t help but wonder if there are more that are undetectable (e.g., swapped samples of the same sex). Datasets that include two types of measurements, for example genotypes and gene expression, make it possible to detect sample swaps more generally. I recently attended a talk by Karl Broman on this topic (one of the best talks I’ve seen; check out the slides here). Karl reports an example in which it looks as if whoever was pipetting skipped a sample and kept on going, introducing an off-by-one error for over 50 samples. As I sat through the talk, I wondered how many of the large GWAS studies have mix-ups like this.
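To make the X/Y check concrete, here is a minimal sketch in Python. It assumes log-scale expression values for one X-linked gene that is high in females (XIST) and one Y-linked gene that is expressed only in males (RPS4Y1). The choice of genes, the dictionary keys, the threshold, and the toy numbers are my assumptions for illustration, not something taken from Karl's talk.

```python
def flag_sex_mismatches(samples, threshold=0.0):
    """Return IDs whose annotated sex disagrees with the sex inferred from expression.

    Each sample is a dict with hypothetical keys:
      'id', 'annotated_sex' ('M' or 'F'),
      'xist'   (log expression of XIST, high in females),
      'rps4y1' (log expression of RPS4Y1, a Y-linked gene).
    """
    flagged = []
    for s in samples:
        # Positive score: the Y-linked gene dominates, so the sample looks male.
        score = s["rps4y1"] - s["xist"]
        inferred = "M" if score > threshold else "F"
        if inferred != s["annotated_sex"]:
            flagged.append(s["id"])
    return flagged


# Toy data (made up): s3 is annotated male but looks female.
samples = [
    {"id": "s1", "annotated_sex": "M", "xist": 2.1, "rps4y1": 9.8},
    {"id": "s2", "annotated_sex": "F", "xist": 10.3, "rps4y1": 2.0},
    {"id": "s3", "annotated_sex": "M", "xist": 9.9, "rps4y1": 2.2},
]
print(flag_sex_mismatches(samples))  # ['s3']
```

A check like this only catches swaps between a male and a female sample, which is exactly why same-sex swaps need a second data type to be detected.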
A recent paper (gated) published by Lude Franke and colleagues describes MixupMapper: a method for detecting and correcting mix-ups. They examined several public datasets and discovered mix-ups in all of them. The worst-performing study, published in PLoS Genetics, was reported to have 23% of the samples swapped. I was surprised that the MixupMapper paper was not published in a higher impact journal. Turns out PLoS Genetics rejected the paper. I think this was a big mistake on their part: the paper is clear and well written, reports a problem with a PLoS Genetics paper, and describes a solution to a problem that should have us all quite worried. I think it’s important that everybody learn about this problem so I was happy to see that, eight months later, Nature Genetics published a paper reporting mix-ups (gated)… but they didn’t cite the MixupMapper paper! Sorry Lude, welcome to the reverse scooped club.
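For intuition about how two data types expose a swap, here is a toy sketch in Python. To be clear, this is not MixupMapper and not Karl's method, just the general idea: if each individual has an expression profile predicted from genotype and an observed expression profile, every observed profile should correlate best with the prediction carrying the same label. The function names, the noise-free toy profiles, and the skipped-sample setup are all assumptions made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_matches(observed, predicted):
    """For each observed profile, the index of the most correlated predicted profile."""
    return [
        max(range(len(predicted)), key=lambda j: pearson(obs, predicted[j]))
        for obs in observed
    ]


# Hypothetical genotype-predicted profiles for individuals 0..3.
predicted = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 2.0, 0.0],
    [2.0, 2.0, 0.0, 1.0],
    [0.0, 2.0, 1.0, 1.0],
]
# Individual 1 was skipped at the bench, so tube i holds individual i + 1 from there on.
observed = [predicted[0], predicted[2], predicted[3]]

matches = best_matches(observed, predicted)
offsets = [m - i for i, m in enumerate(matches)]
print(offsets)  # [0, 1, 1]
```

A run of identical nonzero offsets is the signature of a skipped sample, as opposed to a random pairwise swap. Real data are noisy, so in practice one would look at the whole similarity matrix rather than only the single best match.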