Any samples around for error handling using wasErrorOnRequest or noExtractorPatternsMatched?

They are both part of scrapeableFile, but I want to check them from the script that calls the scrape in the first place.

Basically, we have a script we run, which creates a session and starts it scraping. But along the way some pages are causing problems, and I want to run them in a loop until I get valid data. Help!

Hi,

If the site occasionally (but not always) returns HTTP error status codes, the simplest way to deal with that is to increase the "Max requests per file" value found under the "Advanced" tab for your scraping session. If the server sends anything other than a 200 OK HTTP response, screen-scraper will try the request again.

If you'd like to have more fine-grained control over how this works, you might use a script like the following, which more or less replicates the functionality of setting the "Max requests per file" value (with the addition of the noExtractorPatternsMatched check):

// Define the maximum number of requests we'd like to try.
MAX_REQUESTS = 5;

// Get the number of previous requests that have been made.
// If the variable hasn't yet been set we initialize it to 0.
if( session.getVariable( "NUM_REQUESTS" ) == null )
{
    numRequests = 0;
    // We save it in a session variable so that we can reference it the next
    // time around.
    session.setVariable( "NUM_REQUESTS", String.valueOf( numRequests ) );
}
else
{
    numRequests = Integer.parseInt( session.getVariable( "NUM_REQUESTS" ) );
}

// If there was an error and we haven't yet hit the maximum
// number of requests...
if( ( scrapeableFile.wasErrorOnRequest() || scrapeableFile.noExtractorPatternsMatched() ) && numRequests < MAX_REQUESTS )
{
    // Increment the number of requests made, then scrape the file again.
    // The increment comes first because this script runs again as part of
    // the scrapeFile call, and that run needs to see the updated count.
    session.addToVariable( "NUM_REQUESTS", 1 );
    session.scrapeFile( "My File" );
}
else
{
    // If we successfully requested the file (or gave up after the
    // maximum number of requests) we can reset this value back to 0.
    session.setVariable( "NUM_REQUESTS", "0" );
}
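
If you'd rather drive the retries from the calling side as a plain loop, the same idea can be sketched in standalone Java. This is only an illustration of the bounded-retry logic, not screen-scraper API: the Attempt class, the Supplier standing in for "scrape the page once", and scrapeWithRetries are all hypothetical names, and how the outcome of each scrape gets passed back to the calling script (for example, through a session variable) is left as an assumption.

```java
import java.util.function.Supplier;

public class RetryUntilValid {

    /** Hypothetical stand-in for the outcome of one scrape attempt. */
    static final class Attempt {
        final boolean requestError;       // mirrors wasErrorOnRequest()
        final boolean noPatternsMatched;  // mirrors noExtractorPatternsMatched()

        Attempt(boolean requestError, boolean noPatternsMatched) {
            this.requestError = requestError;
            this.noPatternsMatched = noPatternsMatched;
        }

        boolean isValid() {
            return !requestError && !noPatternsMatched;
        }
    }

    /**
     * Calls scrape until it yields a valid attempt or maxRequests is reached.
     * Returns the number of the attempt that succeeded, or -1 if none did.
     */
    static int scrapeWithRetries(Supplier<Attempt> scrape, int maxRequests) {
        for (int attempt = 1; attempt <= maxRequests; attempt++) {
            if (scrape.get().isValid()) {
                return attempt;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated page that returns an error twice and then succeeds.
        Supplier<Attempt> flaky = () -> {
            calls[0]++;
            return new Attempt(calls[0] < 3, false);
        };
        System.out.println(scrapeWithRetries(flaky, 5)); // prints 3
    }
}
```

Capping the loop at a maximum number of requests matters here: a page that never returns valid data would otherwise keep the session retrying forever.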

Kind regards,

Todd Wilson