Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

Are banks being sidelined by retailers' data collection?

Characteristics of my favorite statistics talks

I’ve been going to and giving statistics talks for a few years now. I think everyone in our field has an opinion on the best structure, content, and delivery of a talk. I am one of those people who has a pretty specific idea of what makes an amazing talk. Here are a few of the things I think are key. I try to do them myself, and I learned many of them from other speakers I’ve seen. I’d love to hear what other people think.

Structure

  1. I don’t like outline slides. I think they take up space but don’t add to most talks. Instead, I love it when talks start with a specific, concrete, unsolved problem. In my favorite talks, this problem is usually scientific/applied, although I have also seen great theoretical talks where a person starts with a key unsolved theoretical problem. 
  2. I like it when the statistical model is defined at the beginning as a way to solve the problem, so it is easy to see the connection between the model and its purpose. 
  3. I love it when talks end by showing how they solved the problem they described at the very beginning of the talk. 

Content

  1. I like it when people assume I’m pretty ignorant about their problem (I usually am) and explain everything in very simple language. I think some people worry about their research looking too trivial. I have almost never come away from a talk thinking that, but I frequently leave talks confused because the background material wasn’t clear. 
  2. I like it when talks cover enough technical detail so I can follow the basic algorithm, but not so much that I get lost in notation. I also struggle when talks go off on tangents, describing too many subproblems, rather than focusing on the main problem in the talk and just mentioning subproblems succinctly. 
  3. I like it when proposed methods are compared to the obvious straw man and one legitimate competitor (if it exists) on a realistic simulation/data set where the answer is known. 
  4. I love it when people give talks on work that isn’t totally finished. This type of talk is scary for two reasons: (1) you can be scooped and (2) you might not have all the answers. But I find that unfinished work leads to way more discussion/ideas than a talk about work that has been published and is “complete”. 

Delivery

  1. I like it when a talk runs short. I have never been disappointed when a talk ended 10-15 minutes early. On the other hand, when a talk runs long, I almost always lose focus and don’t follow the last part. I’d love it if we moved to 30-minute seminars with more questions. 
  2. I like it when speakers have prepared their slides, so the talk has a clear flow and doesn’t get bogged down in transitions. For this reason, I don’t mind it when people give the same talk in a bunch of places; I usually find that the talk is very polished.

DealBook: Roche Extends Deadline for Illumina Offer

Sunday data/statistics link roundup (3/4)

  1. A cool article on Github by the folks at Wired. I’m starting to think the fact that I’m not on Github is a serious dent in my nerd cred. 
  2. Datawrapper - a less intensive, but less flexible open source data visualization creator. I have seen a few of these types of services starting to pop up. I think that some statistics training should be mandatory before people use them. 
  3. An interesting blog post with the provocative title “Why bother publishing in a journal”. The story he describes works best if you have a lot of people who are interested in reading what you put on the internet. 
  4. A post on stackexchange comparing the machine learning and statistics cultures. 
  5. Stackoverflow is a great place to look for R answers. It is the R mailing list, minus the flames…
  6. Roger’s posts on Beijing air pollution are worth another read if you missed them. Particularly this one, where he computes the cigarette equivalent of the air pollution levels. 

True Innovation
