Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

Web-scraping

The internet is the greatest source of publicly available data. One of the key skills to being able to obtain data from the web is “web-scraping”, where you use a piece of software to run through a website and collect information. 

This technique can be used for collecting data from databases or to collect data that is scattered across a website. Here is a very cool little exercise in web-scraping that can be used as an example of the things that are possible. 

Related Posts: Jeff on APIs, Data Sources, Regex, and The Open Data Movement.