Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

APIs!

Application programming interfaces (APIs) are tools that companies, governments, and other organizations build so that software engineers can interact with their websites programmatically. One of their main uses is to let engineers build apps on top of services such as Facebook and Twitter. Many APIs are really helpful for statisticians and data scientists as well: with them, it is generally very easy to collect large amounts of interesting data. Here are some examples (you may need to sign up for an account to get access to some of them). They vary in how easy it is to obtain data from them and in how useful that data is. If people know of other good ones, I’d love to see them in the comments.

Web 2.0

  1. Twitter and associated R package
  2. Google Analytics
  3. Blogger
  4. Indeed
  5. Groupon

Publishing

  1. New York Times
  2. arXiv
  3. PubMed
  4. PLoS
  5. Mendeley
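
To give a flavor of what collecting data from one of these looks like, here is a minimal Python sketch against the arXiv API. It assumes the public query endpoint at `http://export.arxiv.org/api/query`, which returns an Atom XML feed and does not require an account. The function names (`build_query_url`, `parse_titles`, `fetch_titles`) are made up for this illustration and are not part of any library; only the Python standard library is used.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint for the public arXiv query API (returns an Atom feed).
ARXIV_ENDPOINT = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(search_query, max_results=5):
    """Build the request URL for a search such as 'all:statistics'."""
    params = {
        "search_query": search_query,
        "start": 0,
        "max_results": max_results,
    }
    return ARXIV_ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_titles(atom_xml):
    """Pull the paper titles out of an Atom feed (str or bytes)."""
    root = ET.fromstring(atom_xml)
    titles = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        # Titles can be wrapped across lines in the feed; collapse whitespace.
        titles.append(" ".join(title.split()))
    return titles


def fetch_titles(search_query, max_results=5):
    """Hypothetical helper: query the API over the network and return titles."""
    url = build_query_url(search_query, max_results)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_titles(response.read())
```

With a network connection, `fetch_titles("all:statistics")` should return a short list of paper titles. Most of the other APIs listed here follow the same pattern (build a URL with your query parameters, request it, parse the response), though many return JSON rather than XML and require an API key.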

Government

  1. FedSpending
  2. Department of Education
  3. CDC