Web Interface: Schedule Scraping Session

Overview

It is often helpful to have scraping sessions run automatically or on an ongoing basis. The web interface makes this simple, allowing you to schedule and manage multiple scrapes in a single location.

Managing Scheduled Scrapes

Scheduling Run

Editing Scheduled Run

  • You can alter the settings for an already scheduled scraping session by clicking on the Edit button on the scheduled tab.

Removing Scheduled Run

  • You can remove an already scheduled scraping session by clicking on the Remove button on the scheduled tab.

Schedule Scraping Session: General tab

General Tab

  • Scraping Session: The name of the scheduled scraping session.
  • Timeout: The number of minutes the scraping session is allowed to run before a request to stop it is issued.

    If this value is blank, 0, or negative, the scraping session will not time out.

  • Session Variables: This is a list of session variables that will be passed to the scraping session when it is run.
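
The Timeout rule above (blank, 0, or negative means no timeout) can be sketched in a few lines of Python. This is only an illustration of the rule as described; the function name and the use of None to mean "never time out" are assumptions, not part of the web interface.

```python
from typing import Optional


def effective_timeout_minutes(raw_value: Optional[str]) -> Optional[int]:
    """Interpret the Timeout box (hypothetical helper).

    Returns the timeout in minutes, or None when the scraping
    session should never time out.
    """
    # A blank box means no timeout.
    if raw_value is None or raw_value.strip() == "":
        return None
    minutes = int(raw_value)
    # Zero and negative values also mean no timeout.
    return minutes if minutes > 0 else None
```

For instance, effective_timeout_minutes("30") yields 30, while "", "0", and "-5" all yield None.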

Schedule Scraping Session: Schedule tab

Schedule Tab

  • Date: The calendar date when the scraping session is to run next. Click the box to bring up a graphical calendar from which you can select the desired date.
  • Time: The time of day when the scraping session is to run next. This should be a 24-hour (military) time.
  • Repeat Every: Use this to set the frequency with which the scraping session is to run. For example, if you enter 2 into the Hours box, the scraping session will run when it is scheduled, then be re-scheduled to run once again two hours from the time it started.

    If these boxes are left blank, the scraping session will run once and not be re-scheduled.
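
As a rough illustration of the Repeat Every behavior, the following Python sketch computes the next run time from the time the previous run started. The function name is hypothetical, and only the Hours box is modeled; how the web interface handles its other boxes is not shown here.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_run(started_at: datetime, repeat_hours: Optional[int] = None) -> Optional[datetime]:
    """Return when the session should run again (hypothetical helper).

    The interval is counted from the time the previous run started.
    Returns None when the Hours box was left blank, meaning the
    session runs once and is not re-scheduled.
    """
    if not repeat_hours:
        return None
    return started_at + timedelta(hours=repeat_hours)
```

With 2 entered in the Hours box, a run that started at 14:00 would be re-scheduled for 16:00 the same day, regardless of how long the run itself took.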

Schedule Scraping Session: Thresholds tab

Thresholds Tab

  • Time: The percentage by which the run times of two runs of a scraping session may differ without being flagged as a possible error.
  • Record Count: The percentage by which the number of records scraped in two runs of a scraping session may differ without being flagged as a possible error.

Flagged scrapes are highlighted in red in the run/running tab.
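
One way to picture a threshold check is sketched below in Python. It assumes the difference is measured relative to the previous run; the exact formula the web interface uses is not documented here, and the function name is hypothetical.

```python
def exceeds_threshold(previous: float, current: float, threshold_percent: float) -> bool:
    """Return True when two runs differ by more than the allowed percentage.

    Assumption: the change is expressed as a percentage of the
    previous run's value (run time or record count).
    """
    if previous == 0:
        # No baseline to compare against; treat any change as a difference.
        return current != 0
    change_percent = abs(current - previous) * 100 / previous
    return change_percent > threshold_percent
```

Under this assumption, with a Record Count threshold of 20, a run that scrapes 700 records after a run of 1000 (a 30 percent drop) would be flagged, while a run of 1100 records (a 10 percent rise) would not.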