1: Process Review

Screen Scraping Process

As you'll remember from the previous tutorial, extracting information from websites using screen-scraper typically involves four main steps:

  1. Use the proxy server to determine the exact files that need to be requested in order to get the information you're after.
  2. Create a scraping session with scrapeable files that define the sequence of pages screen-scraper will request.
  3. Generate extractor patterns to define the exact information you need screen-scraper to grab from each page.
  4. Write small scripts or programming code to invoke screen-scraper and/or work with the data it extracts.
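
To make steps 2 through 4 concrete, here is a rough, tool-agnostic sketch in Python. It does not use screen-scraper's own API. The page content, the `fetch` helper, the `SESSION` list, and the `PATTERN` regular expression are all hypothetical stand-ins: a canned page replaces the real HTTP request, a plain list plays the role of the scraping session, and a regular expression plays the role of an extractor pattern.

```python
import re

# Hypothetical canned response standing in for a real HTTP request.
# In practice, step 1 (watching traffic through the proxy) tells you
# which URLs actually need to be requested.
PAGES = {
    "http://example.com/search?q=widgets": (
        '<a href="/item/1">Blue widget</a> <span class="price">$4.50</span>\n'
        '<a href="/item/2">Red widget</a> <span class="price">$5.25</span>'
    ),
}


def fetch(url):
    """Return the page body for a URL (assumption: served from PAGES)."""
    return PAGES[url]


# Step 2: the "scraping session" is the ordered list of pages to request.
SESSION = ["http://example.com/search?q=widgets"]

# Step 3: the "extractor pattern" names the pieces of data to pull out.
PATTERN = re.compile(
    r'<a href="(?P<url>[^"]+)">(?P<title>[^<]+)</a>\s*'
    r'<span class="price">\$(?P<price>[\d.]+)</span>'
)


def run_session(urls, fetch_page=fetch):
    """Request each page in order and collect every pattern match."""
    records = []
    for url in urls:
        html = fetch_page(url)
        for match in PATTERN.finditer(html):
            records.append(match.groupdict())
    return records


# Step 4: a small script invokes the session and works with the results.
records = run_session(SESSION)
total = sum(float(record["price"]) for record in records)
```

Each entry in `records` is a dictionary keyed by the named groups in the pattern (`url`, `title`, `price`), which loosely mirrors how extracted values are handed to a script for further processing.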