Script Triggers

Overview

You designate a script to be executed by associating it with some event. For example, if you click on a scraping session, you'll notice that you can designate scripts to be invoked either before a scraping session begins or after it completes. Other events that can be used to invoke scripts relate to scrapeable files and extractor patterns.

Available associations (based on object location) are listed with a brief description of how they can be useful.

  • Scraping Session
    • Before scraping session begins - Scripts that initialize or debug work well here.
    • After scraping session ends - This association is good for closing any open processes or finishing data processing.
    • Always at the end - Forces scripts to run at the end of a scraping session, even if the scraping session is stopped prematurely.
  • Scrapeable File
    • Before file is scraped - Helpful for files used with iterators to get product lists and such.
    • After file is scraped - Good for processing the information scraped in the file.
  • Extractor Pattern
    • Before pattern is applied - Good for giving variables default values, in case the pattern doesn't match.
    • After pattern is applied - Good if you want to work with the data set as a whole and its methods.
    • Once if pattern matches - Simplifies the issue of matching the same link multiple times but only wanting to follow it once.
    • Once if no matches - Helpful in catching and reporting possible errors.
    • After each pattern match - Gives access to data records and their associated methods.
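
The extractor pattern associations above differ mainly in how many times they fire for a given number of matches. The following is a minimal Python sketch of those conditions, not screen-scraper's actual implementation. The function name is hypothetical, and both the order of the returned list and the assumption that "After pattern is applied" fires even with zero matches are illustrative only.

```python
def extractor_pattern_triggers(match_count):
    """Return the extractor-pattern triggers that would fire for a pattern
    that matched `match_count` times (illustrative order, see lead-in)."""
    fired = ["Before pattern is applied"]
    # Fires once per match, so it sees each data record individually.
    fired += ["After each pattern match"] * match_count
    # Exactly one of the two "Once if ..." triggers fires.
    if match_count > 0:
        fired.append("Once if pattern matches")
    else:
        fired.append("Once if no matches")
    fired.append("After pattern is applied")
    return fired


fired_none = extractor_pattern_triggers(0)
fired_three = extractor_pattern_triggers(3)
```

With three matches, "After each pattern match" appears three times while "Once if pattern matches" appears only once, which is why the latter suits following a link a single time.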

Managing Associations

Adding

All objects that can have scripts associated with them, with the exception of scripts themselves, have buttons to add the script association. To create an association between scripts, use the executeScript method of the session object.
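
As a rough illustration of one script invoking another, here is a minimal Python sketch. Inside screen-scraper the session object is supplied to your script, so the StandInSession class below is a hypothetical stand-in that exists only to make the sketch runnable on its own. It assumes executeScript takes the name of the script to run; the script names are invented for the example.

```python
class StandInSession:
    """Hypothetical stand-in for the session object screen-scraper provides."""

    def __init__(self, scripts):
        self._scripts = scripts  # maps script name -> callable
        self.log = []            # records what ran, in order

    def executeScript(self, name):
        # Assumed behavior: look the script up by name and run it.
        self.log.append(name)
        self._scripts[name](self)


def write_to_file(session):
    session.log.append("  (writing data)")


def process_record(session):
    # One script calling another by name instead of through an association.
    session.executeScript("Write to file")


session = StandInSession({
    "Write to file": write_to_file,
    "Process record": process_record,
})
session.executeScript("Process record")
```

In a real script, only the session.executeScript call would be needed.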

Locations to specify script associations are listed below.

Removing

  • Press the Delete key when the association is selected.
  • Right-click the association and select Delete.

Ordering

Script associations are ordered automatically in a natural order based on their relation to the object they are connected to. For example, scripts called after the file is scraped cannot be ordered before associations that are called before the file is scraped. Beyond this natural ordering, you can specify the order of the scripts using the Sequence number.
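
The two-level ordering can be pictured as a sort on a pair: the natural position of the trigger first, then the Sequence number. This is a sketch under stated assumptions rather than screen-scraper's own code; the Association class, the PHASE_RANK table, and the script names are all hypothetical.

```python
from dataclasses import dataclass

# Natural order for a scrapeable file: "before" triggers precede "after" ones.
PHASE_RANK = {"Before file is scraped": 0, "After file is scraped": 1}


@dataclass
class Association:
    script: str    # name of the associated script
    when: str      # trigger the script is attached to
    sequence: int  # user-assigned Sequence number


def execution_order(associations):
    """Sort by natural trigger order first, then by Sequence number."""
    return sorted(associations, key=lambda a: (PHASE_RANK[a.when], a.sequence))


associations = [
    Association("Save results", "After file is scraped", 1),
    Association("Log in", "Before file is scraped", 2),
    Association("Set defaults", "Before file is scraped", 1),
]
ordered = [a.script for a in execution_order(associations)]
print(ordered)
```

Note that "Save results" runs last despite having Sequence 1, because the Sequence number only orders scripts that share the same trigger.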

Enable/Disable

You can selectively enable and disable scripts using the Enabled checkbox in the rightmost column. It's often a good practice to create scripts used for debugging that you'll disable once you run scraping sessions in a production environment.
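
Conceptually, the Enabled checkbox acts as a filter over the associations. A minimal sketch, assuming each association is represented as a dictionary with hypothetical "script" and "enabled" keys:

```python
associations = [
    {"script": "Dump variables to log", "enabled": False},  # debugging aid, off in production
    {"script": "Write record to database", "enabled": True},
]


def scripts_to_run(associations):
    """Keep only the associations whose Enabled box is checked."""
    return [a["script"] for a in associations if a["enabled"]]


to_run = scripts_to_run(associations)
print(to_run)
```

The debugging script stays attached to its trigger, so re-enabling it later requires no rewiring.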