Some of the best scraping session examples are available from our main site. We keep these scraping sessions up to date, so they should work if you download and import them into your own screen-scraper instance. You can get the scraping sessions by visiting each of these pages and clicking the Download Scrape button:
Used with Tutorial 1: Hello World!
Attachment | Size |
---|---|
Hello World (Scraping Session).sss | 2.27 KB |
Used with Tutorial 2: Shopping Site
Attachment | Size |
---|---|
dvds.txt | 897 bytes |
Shopping Site (Scraping Session).sss | 11.36 KB |
Used with Tutorial 3: Extending Hello World
Attachment | Size |
---|---|
dvds.txt | 897 bytes |
Shopping Site (Scraping Session).sss | 11.36 KB |
Used with Tutorial 4: Scraping a Shopping Site from an External Program
Attachment | Size |
---|---|
Shopping Site (Scraping Session).sss | 11.63 KB |
Used with Tutorial 5: Saving Scraped Data to a Database
Attachment | Size |
---|---|
Shopping Site (Scraping Session).sss | 13.18 KB |
Used with Tutorial 6: Generating an RSS/Atom Feed from a Product Search
Attachment | Size |
---|---|
Shopping Site (Scraping Session).sss | 12.37 KB |
Used with Tutorial 7: Scraping a Site Multiple Times Based on Search Terms
Attachment | Size |
---|---|
Shopping Site (Scraping Session).sss | 13.06 KB |
Example implementation of the RunnableScrapingSession Class.
Import both scraping sessions.
Run the "RunnableScrapingSession Example Starter" scraping session. It sets a variable named "Var1" and spawns the "RunnableScrapingSession Example" scraping session, which references the value of "Var1".
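The starter-and-spawn pattern described above can be illustrated outside screen-scraper with a minimal Python sketch. This is not the screen-scraper API: `SketchSession`, `set_variable`, `scrape`, and `run_starter` are hypothetical names, used only to show a parent setting "Var1" and a spawned session reading it.

```python
import threading


class SketchSession:
    """Hypothetical stand-in for a scraping session: a name plus a variable map."""

    def __init__(self, name):
        self.name = name
        self.variables = {}
        self.log = []

    def set_variable(self, key, value):
        self.variables[key] = value

    def scrape(self):
        # The spawned session references the value it was handed.
        self.log.append(f"{self.name} sees Var1 = {self.variables.get('Var1')}")


def run_starter(value):
    """Plays the role of the starter session: set Var1, spawn the child, wait."""
    starter = SketchSession("RunnableScrapingSession Example Starter")
    starter.set_variable("Var1", value)

    child = SketchSession("RunnableScrapingSession Example")
    # Hand the starter's value to the child before it runs.
    child.set_variable("Var1", starter.variables["Var1"])

    worker = threading.Thread(target=child.scrape)
    worker.start()
    worker.join()
    return child.log
```

Calling `run_starter("hello")` returns the child's log, which shows that the value set by the starter was visible inside the spawned session.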
This scraping session takes the session variable CAPTCHA_URL, opens a user input window, then saves what the user enters to CAPTCHA_TEXT.
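The flow described above (read CAPTCHA_URL, ask the user, store CAPTCHA_TEXT) might look roughly like the following Python sketch. It assumes a plain dictionary as a stand-in for screen-scraper session variables and a console prompt instead of a window; `solve_captcha` and its `ask` parameter are hypothetical names.

```python
def solve_captcha(session_vars, ask=input):
    """Prompt a person for the CAPTCHA text and store it in the variable map.

    session_vars: dict standing in for session variables (an assumption).
    ask: callable that shows a prompt and returns the typed answer.
    """
    url = session_vars["CAPTCHA_URL"]
    answer = ask(f"Enter the text shown in the image at {url}: ")
    # Store the trimmed answer under the name the later steps expect.
    session_vars["CAPTCHA_TEXT"] = answer.strip()
    return session_vars
```

Passing the prompt function in as `ask` keeps the sketch usable from a console by default while letting a graphical dialog (or a test) supply the answer instead.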
This scraping session downloads a CAPTCHA image from Google's recaptcha.com, passes the image to the decaptcher.com service, and receives the response as text.
Within screen-scraper you can call outside programs directly from your scripts. The following example scraping session uses Tesseract OCR and ImageMagick to take an image from the internet and attempt to read the text in it.
As is, the scraping session is intended to run on Linux. However, it is possible to run both dependent programs under Windows either directly or using Cygwin.
To use:
Download and import the following scraping session.
Attachment | Size |
---|---|
ocr (Scraping Session).sss | 5.96 KB |
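For readers who want to see the general idea of calling ImageMagick and Tesseract as outside programs, here is a minimal Python sketch, separate from the scraping session itself. It assumes the `convert` and `tesseract` executables are on the PATH (newer ImageMagick releases may expose `magick` instead); the preprocessing options, file names, and function names are illustrative choices, not what the downloadable session necessarily does.

```python
import subprocess
from pathlib import Path


def build_commands(image_path, work_dir):
    """Build the two command lines; returns (convert_cmd, tesseract_cmd, text_file)."""
    work_dir = Path(work_dir)
    cleaned = work_dir / "cleaned.tif"      # hypothetical intermediate file
    out_base = work_dir / "ocr_result"      # tesseract appends .txt to this base
    # Grayscale and enlarge the image, which often helps OCR on small text.
    convert_cmd = ["convert", str(image_path),
                   "-colorspace", "Gray", "-resize", "200%", str(cleaned)]
    tesseract_cmd = ["tesseract", str(cleaned), str(out_base)]
    return convert_cmd, tesseract_cmd, out_base.with_suffix(".txt")


def read_image_text(image_path, work_dir):
    """Run both programs in order and return the recognized text."""
    convert_cmd, tesseract_cmd, text_file = build_commands(image_path, work_dir)
    subprocess.run(convert_cmd, check=True)    # raises if ImageMagick fails
    subprocess.run(tesseract_cmd, check=True)  # raises if Tesseract fails
    return text_file.read_text().strip()
```

Keeping command construction separate from execution makes it easy to inspect or adjust the exact arguments before anything is run, for example when adapting the commands for Windows or Cygwin.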