session

Overview

This object refers to the scraping session that is currently running. To make its methods easier to find, they are grouped by related functionality, and each group is named for what its methods do.

Anonymization

Overview

The following methods help you set up an anonymous scraping session. If you are using your own proxy server pool, use these methods to let screen-scraper interact with and manage that pool. If you are using automatic anonymization, the only method you will need is currentProxyServerIsBad, because screen-scraper manages the servers using the anonymization settings from your setup.

See an example of Anonymization via Manual Proxy Pools.

currentProxyServerIsBad

void session.currentProxyServerIsBad ( ) (professional and enterprise editions only)

Description

Remove the current proxy server from the proxy pool. This is only used with anonymization; it indicates that the current server is bad and should be removed.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

If you are using automatic anonymization or manual proxy pools, a new proxy server will be created as a result of the method call.

When checking whether a request you have made is invalid, it is best not to rely on the HTTP status code (e.g. 404) alone, as status codes are not always accurate. It is recommended that you also scrape a known string (e.g. "Not found") from the response HTML to validate the status code.

Examples

Flag Proxy Server

 // Indicates that the current proxy server is bad.
 session.currentProxyServerIsBad();
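
The advice above about not trusting the status code alone can be sketched in plain Java. This is only an illustration: the class name ResponseCheck, the method isConfirmedNotFound, and its parameters are hypothetical and are not part of the screen-scraper API. The check treats a response as invalid only when the status code and a known marker string in the HTML agree.

```java
// Hypothetical helper; the class and method names are illustrative and are
// not part of the screen-scraper API.
final class ResponseCheck {

    // Treat a response as "not found" only when the status code and a known
    // marker string in the HTML agree with each other.
    static boolean isConfirmedNotFound(int statusCode, String html, String marker) {
        if (html == null || marker == null || marker.isEmpty()) {
            return false; // nothing to confirm the status code against
        }
        boolean statusSaysNotFound = (statusCode == 404);
        boolean bodySaysNotFound = html.toLowerCase().contains(marker.toLowerCase());
        return statusSaysNotFound && bodySaysNotFound;
    }

    public static void main(String[] args) {
        System.out.println(isConfirmedNotFound(404, "<h1>Not found</h1>", "Not found")); // true
        System.out.println(isConfirmedNotFound(404, "<h1>Welcome</h1>", "Not found"));   // false
    }
}
```

In a scraping script, a check along these lines could decide whether calling session.currentProxyServerIsBad() is warranted, rather than flagging a server on the status code alone.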

getCurrentProxyServerFromPool

ProxyServer session.getCurrentProxyServerFromPool ( )

Description

Get the current proxy server from the proxy server pool.

Parameters

This method does not receive any parameters.

Return Values

Returns the current proxy server being used.

Change Log

Version Description
4.5 Available for all editions.

Examples

Write Proxy Server Description to Log

 // Get Proxy Server
 proxyServer = session.getCurrentProxyServerFromPool();

 // Log Server Description
 session.log( "Proxy Server: " + proxyServer.getDescription() );

getProxyServerPool

void session.getProxyServerPool ()

Description

Holds the proxy server pool object that allows proxies to be cycled through.

Parameters

This method does not receive any parameters.

Return Values

Returns true if there is an available proxy server pool.

Change Log

Version Description
4.5 Available for all editions.

Examples

Check Whether a ProxyServerPool Object Exists

 // If ProxyServerPool does not exist
 // Create a new ProxyServerPool object.
 if ( !session.getProxyServerPool() )
 {
  // The ProxyServerPool object will
  // control how screen-scraper interacts with proxy servers.
 
  proxyServerPool = new ProxyServerPool();
 
  // We give the current scraping session a reference to
  // the proxy pool. This step should ideally be done right
  // after the object is created (as in the previous step).

  session.setProxyServerPool( proxyServerPool );
 }
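
To make the idea of "cycling through" proxies concrete, here is a minimal round-robin sketch in plain Java. It is not screen-scraper's ProxyServerPool and says nothing about how that class is implemented; the class RoundRobinPool and its methods are hypothetical names chosen for illustration. It shows servers being handed out in turn and the most recent one being dropped when it turns out to be bad.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT screen-scraper's ProxyServerPool.
final class RoundRobinPool {
    private final List<String> servers = new ArrayList<>();
    private int current = -1;          // index of the server handed out most recently
    private boolean hasCurrent = false; // false until next() is called, and after a removal

    void add(String hostAndPort) {
        servers.add(hostAndPort);
    }

    // Advance to the next server, wrapping around at the end of the list.
    String next() {
        if (servers.isEmpty()) {
            throw new IllegalStateException("The pool is empty.");
        }
        current = (current + 1) % servers.size();
        hasCurrent = true;
        return servers.get(current);
    }

    // Drop the server returned by the last call to next(), if there is one.
    void removeCurrent() {
        if (!hasCurrent) {
            return;
        }
        servers.remove(current);
        current--; // the following next() then moves on to the server after the removed one
        hasCurrent = false;
    }

    int size() {
        return servers.size();
    }

    public static void main(String[] args) {
        RoundRobinPool pool = new RoundRobinPool();
        pool.add("a:80");
        pool.add("b:80");
        pool.add("c:80");
        System.out.println(pool.next()); // a:80
        System.out.println(pool.next()); // b:80
        pool.removeCurrent();            // b:80 is treated as bad and removed
        System.out.println(pool.next()); // c:80
    }
}
```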

getTerminateProxiesOnCompletion

boolean session.getTerminateProxiesOnCompletion ( )

Description

Determine whether proxies are set to be terminated when the scrape ends.

Parameters

This method does not receive any parameters.

Return Values

Returns true if a proxy will be terminated; otherwise, it returns false.

Change Log

Version Description
5.0 Available for all editions.

Examples

Check Termination Setting

// Log whether proxies are being terminated or not
if ( session.getTerminateProxiesOnCompletion() )
{
    session.log( "Anonymous Proxies are set to be terminated with the scrape." );
}
else
{
    session.log( "Anonymous Proxies are set to continue running after the scrape is finished." );
}

getUseProxyFromPool

boolean session.getUseProxyFromPool ( )

Description

Determine whether proxies are being used from the proxy pool.

Parameters

This method does not receive any parameters.

Return Values

Returns true if a proxy pool is being used; otherwise, it returns false.

Change Log

Version Description
4.5 Available for all editions.

Examples

Turn On Proxy Pool Usage If Not Running

 // Are proxies being used from a pool
 if ( !session.getUseProxyFromPool() )
 {
     session.setUseProxyFromPool( true );
 }

See Also

  • setUseProxyFromPool() [session] - Sets whether a proxy from the proxy pool should be used when making a request

setProxyServerPool

void session.setProxyServerPool ( ProxyServerPool proxyServerPool )

Description

Associate a proxy pool with a scraping session.

Parameters

  • proxyServerPool A ProxyServerPool object.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Associate Proxy Pool with Scraping Session

 // Create a new ProxyServerPool object. This object will
 // control how screen-scraper interacts with proxy servers.

 proxyServerPool = new ProxyServerPool();

 // We give the current scraping session a reference to
 // the proxy pool. This step should ideally be done right
 // after the object is created (as in the previous step).

 session.setProxyServerPool( proxyServerPool );

setTerminateProxiesOnCompletion

void session.setTerminateProxiesOnCompletion ( boolean terminateProxies )

Description

Manually set whether proxies are terminated when the scrape ends.

Parameters

  • terminateProxies Whether proxies should be terminated at the end of the session or not, as a boolean.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for all editions.

Examples

Make Sure Proxies are Deleted on Scrape Completion

// Check whether proxies are already set to be terminated
if ( session.getTerminateProxiesOnCompletion() )
{
    session.log( "Anonymous Proxies are set to be terminated with the scrape." );
}
else
{
    // Set proxies to be terminated with the scrape
    session.setTerminateProxiesOnCompletion( true );
    session.log( "Anonymous Proxies updated to be terminated with the scrape." );
}

setUseProxyFromPool

void session.setUseProxyFromPool ( boolean useProxyFromPool )

Description

Set whether proxies from the proxyServerPool should be used when making scrapeable file requests.

Parameters

  • useProxyFromPool Whether proxies in the proxyServerPool should be used, as a boolean.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Anonymize Scrapeable Files

 // Create a new ProxyServerPool object. This object will
 // control how screen-scraper interacts with proxy servers.

 proxyServerPool = new ProxyServerPool();

 // We give the current scraping session a reference to
 // the proxy pool. This step should ideally be done right
 // after the object is created (as in the previous step).

 session.setProxyServerPool( proxyServerPool );

 // ... Proxy Server Pool Setup ...

 // This is the switch that tells the scraping session to make
 // use of the proxy servers. Note that this can be turned on
 // and off during the course of the scrape. You may want to
 // anonymize some pages, but not others.
 session.setUseProxyFromPool( true );

See Also

  • getUseProxyFromPool() [session] - Returns whether or not a proxy from the proxy pool will be used upon making a request

External Proxy Settings

Overview

If you are already going through a proxy server, screen-scraper must be told the credentials in order to get out to the internet. These methods are all provided to manually tell screen-scraper how to get through your external proxy.

If you always go through the same external proxy you would probably want to set the credentials in screen-scraper's proxy settings so that you don't have to specify them in all of your scrapes.

getExternalNTProxyDomain

String session.getExternalNTProxyDomain ( )

Description

Retrieve the external NT proxy domain.

Parameters

This method does not receive any parameters.

Return Values

Returns the external NT domain, as a string.

Change Log

Version Description
5.0 Added for all editions.

Examples

Log External NT Proxy Settings

// Log External Proxy Settings
session.log( "Username: " + session.getExternalNTProxyUsername( ) );
session.log( "Password: " + session.getExternalNTProxyPassword( ) );
session.log( "Domain: " + session.getExternalNTProxyDomain( ) );
session.log( "Host: " + session.getExternalNTProxyHost( ) );

getExternalNTProxyHost

String session.getExternalNTProxyHost ( )

Description

Retrieve the external NT proxy host.

Parameters

This method does not receive any parameters.

Return Values

Returns the external NT host, as a string.

Change Log

Version Description
5.0 Added for all editions.

Examples

Log External NT Proxy Settings

// Log External Proxy Settings
session.log( "Username: " + session.getExternalNTProxyUsername( ) );
session.log( "Password: " + session.getExternalNTProxyPassword( ) );
session.log( "Domain: " + session.getExternalNTProxyDomain( ) );
session.log( "Host: " + session.getExternalNTProxyHost( ) );

getExternalNTProxyPassword

String session.getExternalNTProxyPassword ( )

Description

Retrieve the external NT proxy password.

Parameters

This method does not receive any parameters.

Return Values

Returns the external NT password, as a string.

Change Log

Version Description
5.0 Added for all editions.

Examples

Log External NT Proxy Settings

// Log External Proxy Settings
session.log( "Username: " + session.getExternalNTProxyUsername( ) );
session.log( "Password: " + session.getExternalNTProxyPassword( ) );
session.log( "Domain: " + session.getExternalNTProxyDomain( ) );
session.log( "Host: " + session.getExternalNTProxyHost( ) );

getExternalNTProxyUsername

String session.getExternalNTProxyUsername ( )

Description

Retrieve the external NT proxy username.

Parameters

This method does not receive any parameters.

Return Values

Returns the external NT username, as a string.

Change Log

Version Description
5.0 Added for all editions.

Examples

Log External NT Proxy Settings

// Log External Proxy Settings
session.log( "Username: " + session.getExternalNTProxyUsername( ) );
session.log( "Password: " + session.getExternalNTProxyPassword( ) );
session.log( "Domain: " + session.getExternalNTProxyDomain( ) );
session.log( "Host: " + session.getExternalNTProxyHost( ) );

getExternalProxyHost

String session.getExternalProxyHost ( )

Description

Retrieve the external proxy host.

Parameters

This method does not receive any parameters.

Return Values

Returns the external host, as a string.

Change Log

Version Description
5.0 Available for all editions.

Examples

Log External Proxy Settings

// Log External Proxy Settings
session.log( "Username: " + session.getExternalProxyUsername( ) );
session.log( "Password: " + session.getExternalProxyPassword( ) );
session.log( "Host: " + session.getExternalProxyHost( ) );
session.log( "Port: " + session.getExternalProxyPort( ) );

getExternalProxyPassword

String session.getExternalProxyPassword ( )

Description

Retrieve the external proxy password.

Parameters

This method does not receive any parameters.

Return Values

Returns the external password, as a string.

Change Log

Version Description
5.0 Available for all editions.

Examples

Log External Proxy Settings

// Log External Proxy Settings
session.log( "Username: " + session.getExternalProxyUsername( ) );
session.log( "Password: " + session.getExternalProxyPassword( ) );
session.log( "Host: " + session.getExternalProxyHost( ) );
session.log( "Port: " + session.getExternalProxyPort( ) );

getExternalProxyPort

String session.getExternalProxyPort ( )

Description

Retrieve the external proxy port.

Parameters

This method does not receive any parameters.

Return Values

Returns the external port, as a string.

Change Log

Version Description
5.0 Available for all editions.

Examples

Log External Proxy Settings

// Log External Proxy Settings
session.log( "Username: " + session.getExternalProxyUsername( ) );
session.log( "Password: " + session.getExternalProxyPassword( ) );
session.log( "Host: " + session.getExternalProxyHost( ) );
session.log( "Port: " + session.getExternalProxyPort( ) );

getExternalProxyUsername

String session.getExternalProxyUsername ( )

Description

Retrieve the external proxy username.

Parameters

This method does not receive any parameters.

Return Values

Returns the external username, as a string.

Change Log

Version Description
5.0 Available for all editions.

Examples

Log External Proxy Settings

// Log External Proxy Settings
session.log( "Username: " + session.getExternalProxyUsername( ) );
session.log( "Password: " + session.getExternalProxyPassword( ) );
session.log( "Host: " + session.getExternalProxyHost( ) );
session.log( "Port: " + session.getExternalProxyPort( ) );

setExternalNTProxyDomain

void session.setExternalNTProxyDomain ( String domain )

Description

Manually set external NT proxy domain.

Parameters

  • domain Domain for the external NT proxy, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

If you are using this method on all of your scripts you might want to set it in screen-scraper's external NT proxy settings.

If you are using NTLM (Windows NT) authentication you'll need to designate settings for both the standard external proxy as well as the external NT proxy.

Examples

Manually Setup External NT Proxy

 // Setup External Proxy
 session.setExternalNTProxyUsername( "guest" );
 session.setExternalNTProxyPassword( "guestPassword" );
 session.setExternalNTProxyDomain( "Group" );
 session.setExternalNTProxyHost( "proxy.domain.com" );

setExternalNTProxyHost

void session.setExternalNTProxyHost ( String host )

Description

Manually set external NT proxy host/domain.

Parameters

  • host Host/domain for the external NT proxy, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

If you are using this method on all of your scripts you might want to set it in screen-scraper's external NT proxy settings.

If you are using NTLM (Windows NT) authentication you'll need to designate settings for both the standard external proxy as well as the external NT proxy.

Examples

Manually Setup External NT Proxy

 // Setup External Proxy
 session.setExternalNTProxyUsername( "guest" );
 session.setExternalNTProxyPassword( "guestPassword" );
 session.setExternalNTProxyDomain( "Group" );
 session.setExternalNTProxyHost( "proxy.domain.com" );

setExternalNTProxyPassword

void session.setExternalNTProxyPassword ( String password )

Description

Manually set external NT proxy password.

Parameters

  • password Password for the external NT proxy, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

If you are using this method on all of your scripts you might want to set it in screen-scraper's external NT proxy settings.

If you are using NTLM (Windows NT) authentication you'll need to designate settings for both the standard external proxy as well as the external NT proxy.

Examples

Manually Setup External NT Proxy

 // Setup External Proxy
 session.setExternalNTProxyUsername( "guest" );
 session.setExternalNTProxyPassword( "guestPassword" );
 session.setExternalNTProxyDomain( "Group" );
 session.setExternalNTProxyHost( "proxy.domain.com" );

setExternalNTProxyUsername

void session.setExternalNTProxyUsername ( String username )

Description

Manually set external NT proxy username.

Parameters

  • username Username for the external NT proxy, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

If you are using this method on all of your scripts you might want to set it in screen-scraper's external NT proxy settings.

If you are using NTLM (Windows NT) authentication you'll need to designate settings for both the standard external proxy as well as the external NT proxy.

Examples

Manually Setup External NT Proxy

 // Setup External Proxy
 session.setExternalNTProxyUsername( "guest" );
 session.setExternalNTProxyPassword( "guestPassword" );
 session.setExternalNTProxyDomain( "Group" );
 session.setExternalNTProxyHost( "proxy.domain.com" );

setExternalProxyHost

void session.setExternalProxyHost ( String host )

Description

Manually set external proxy host/domain.

Parameters

  • host Host/domain for the external proxy, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

If you are using this method on all of your scripts you might want to set it in screen-scraper's external proxy settings.

Examples

Manually Setup External Proxy

 // Setup External Proxy
 session.setExternalProxyUsername( "guest" );
 session.setExternalProxyPassword( "guestPassword" );
 session.setExternalProxyHost( "proxy.domain.com" );
 session.setExternalProxyPort( "80" );

setExternalProxyPassword

void session.setExternalProxyPassword ( String password )

Description

Manually set external proxy password.

Parameters

  • password Password for the external proxy, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

If you are using this method on all of your scripts you might want to set it in screen-scraper's external proxy settings.

Examples

Manually Setup External Proxy

 // Setup External Proxy
 session.setExternalProxyUsername( "guest" );
 session.setExternalProxyPassword( "guestPassword" );
 session.setExternalProxyHost( "proxy.domain.com" );
 session.setExternalProxyPort( "80" );

setExternalProxyPort

void session.setExternalProxyPort ( String port )

Description

Manually set external proxy port.

Parameters

  • port Port for the external proxy, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

If you are using this method on all of your scripts you might want to set it in screen-scraper's external proxy settings.

Examples

Manually Setup External Proxy

 // Setup External Proxy
 session.setExternalProxyUsername( "guest" );
 session.setExternalProxyPassword( "guestPassword" );
 session.setExternalProxyHost( "proxy.domain.com" );
 session.setExternalProxyPort( "80" );

setExternalProxyUsername

void session.setExternalProxyUsername ( String username )

Description

Manually set external proxy username.

Parameters

  • username Username for the external proxy, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

If you are using this method on all of your scripts you might want to set it in screen-scraper's external proxy settings.

Examples

Manually Setup External Proxy

 // Setup External Proxy
 session.setExternalProxyUsername( "guest" );
 session.setExternalProxyPassword( "guestPassword" );
 session.setExternalProxyHost( "proxy.domain.com" );
 session.setExternalProxyPort( "80" );

Logging

Overview

Logging is a valuable tool for confirming that your scrapes are working correctly and for troubleshooting problems that arise. Logging large amounts of information may slow down a scrape, but the best way around this is not to remove the logging calls; instead, reduce the verbosity of the logging when running the scrape in a production environment. Be aware that reducing the verbosity makes some problems harder to troubleshoot should they arise.

Several logging methods are provided so that you can log information according to its importance.
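
The reason logging calls can stay in a script while the verbosity changes can be sketched as level-based filtering. The following is a minimal plain-Java illustration, not screen-scraper's logger: the class LeveledLog, its Level enum, and the threshold parameter are hypothetical. Messages below the configured threshold are simply dropped, so the calls themselves never need to be removed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of level-based filtering; not screen-scraper's logger.
final class LeveledLog {
    // Ordered from most verbose to least verbose.
    enum Level { DEBUG, INFO, WARN, ERROR }

    private final Level threshold;
    private final List<String> written = new ArrayList<>();

    LeveledLog(Level threshold) {
        this.threshold = threshold;
    }

    // Keep the message only if its level is at or above the threshold.
    void log(Level level, Object message) {
        if (level.compareTo(threshold) >= 0) {
            written.add(level + ": " + String.valueOf(message));
        }
    }

    List<String> written() {
        return written;
    }

    public static void main(String[] args) {
        // A "production" logger that only keeps warnings and errors.
        LeveledLog production = new LeveledLog(Level.WARN);
        production.log(Level.DEBUG, "Index: 3");
        production.log(Level.ERROR, "Error parsing date");
        System.out.println(production.written()); // [ERROR: Error parsing date]
    }
}
```

With a threshold of DEBUG the same two calls would both be kept, which is the trade-off the overview describes: less output in production, but less detail when troubleshooting.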

See Also

  • debug() [log] - Sends a message to the log as a debug message
  • info() [log] - Sends a message to the log as an info message
  • warn() [log] - Sends a message to the log as a warning message
  • error() [log] - Sends a message to the log as an error message

getLogFileName

String session.getLogFileName ( ) (professional and enterprise editions only)

Description

Get the name of the current log file.

Parameters

This method does not receive any parameters.

Return Values

Returns the name of the log file, as a string.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

This method can be very helpful when screen-scraper is running in server mode and you need to track which log file contains the scrape of a particular record, or to locate errors in larger scrapes.

Examples

Get Log's File Name

 // Get the name of the log file and write it to the session log.
 logName = session.getLogFileName();
 session.log( "Log file: " + logName );

log

void session.log ( Object message )

Description

Write message to the log.

Parameters

  • message Message to be written to the log after being converted to a String using String.valueOf( message ).

Return Values

Returns void.

Change Log

Version Description
5.5 Now accepts any Object as a message
4.5 Available for all editions.

When the workbench is running, this will be found under the log tab for the scraping session. When screen-scraper is running in server mode, the message will get sent to the corresponding .log file found in screen-scraper's log folder. When screen-scraper is invoked from the command line, the message will get sent to standard out.

Examples

Write to Log

 // Sends the message to the log.
 session.log( "Inserting extracted data into the database." );
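
Because the message is converted with String.valueOf( message ), any object can be passed. The plain-Java sketch below shows what that conversion produces for a few common values; the class name MessageRendering and the method render are hypothetical and exist only for illustration.

```java
import java.util.Arrays;

// Illustration of the String.valueOf( message ) conversion using plain Java;
// the class and method names are hypothetical.
final class MessageRendering {

    // Mirrors the documented conversion of a log message to a String.
    static String render(Object message) {
        return String.valueOf(message);
    }

    public static void main(String[] args) {
        System.out.println(render(42));                      // 42
        System.out.println(render(Arrays.asList("a", "b"))); // [a, b]
        System.out.println(render(null));                    // null
    }
}
```

Note that a null message is rendered as the text "null" rather than causing an error in this conversion.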

See Also

  • logDebug() [session] - Sends a message to the log as a debugging message
  • logInfo() [session] - Sends a message to the log as an informative message
  • logWarn() [session] - Sends a message to the log as a warning
  • logError() [session] - Sends a message to the log as an error message
  • log() [log] - Write message to the log

logCurrentDateAndTime

void session.logCurrentDateAndTime ( ) (professional and enterprise editions only)

Description

Write the current date and time to the log (at the most verbose level). The output is formatted to be human readable.

Parameters

This method does not receive any parameters.

Return Values

Returns void. If an error occurs, an error will be thrown.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

Examples

Log Date and Time

 // Output the current date and time to the log.
 session.logCurrentDateAndTime();

logCurrentTime

void session.logCurrentTime ( ) (professional and enterprise editions only)

Description

Write the current time to the log (at the most verbose level). The time is formatted to be human readable.

Parameters

This method does not receive any parameters.

Return Values

Returns void. If an error occurs, an error will be thrown.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

Examples

Log Formatted Time

 // Output the current time to the log.
 session.logCurrentTime();

logDebug

void session.logDebug ( Object message ) (professional and enterprise editions only)

Description

Write message to the log, at the debug level (most verbose).

Parameters

  • message Message to be written to the log after being converted to a String using String.valueOf( message ).

Return Values

Returns void.

Change Log

Version Description
5.5 Now accepts any Object as a message
4.5 Available for professional and enterprise editions.

Examples

Write to Log at Debug level

 // Sends the message to the lowest level of logging.
 session.logDebug( "Index: " + session.getVariable( "INDEX" ) );

See Also

  • log() [session] - Sends a message to the log as a debugging message
  • logInfo() [session] - Sends a message to the log as an informative message
  • logWarn() [session] - Sends a message to the log as a warning
  • logError() [session] - Sends a message to the log as an error message
  • debug() [log] - Sends a message to the log as a debug message

logElapsedRunningTime

void session.logElapsedRunningTime ( ) (professional and enterprise editions only)

Description

Write the scrape's elapsed running time to the log (at the most verbose level). It is formatted to be human readable, broken into days, hours, minutes, and seconds.

Parameters

This method does not receive any parameters.

Return Values

Returns void. If an error occurs, an error will be thrown.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

Examples

Log Time the Scrape has been Running

 // Output the running time to the log.
 session.logElapsedRunningTime();
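
The breakdown into days, hours, minutes, and seconds can be sketched with simple integer arithmetic. This is an illustration only: the class ElapsedTime, the method format, and the "1d 2h 3m 4s" output style are assumptions, and the exact text that logElapsedRunningTime writes may differ.

```java
// Hypothetical sketch of splitting an elapsed time into readable parts; the
// output format is an assumption, not screen-scraper's actual log format.
final class ElapsedTime {

    static String format(long elapsedMillis) {
        long totalSeconds = elapsedMillis / 1000;
        long days = totalSeconds / 86_400;             // 86,400 seconds per day
        long hours = (totalSeconds % 86_400) / 3_600;  // hours left after whole days
        long minutes = (totalSeconds % 3_600) / 60;    // minutes left after whole hours
        long seconds = totalSeconds % 60;              // seconds left after whole minutes
        return days + "d " + hours + "h " + minutes + "m " + seconds + "s";
    }

    public static void main(String[] args) {
        // 93,784 seconds is 1 day, 2 hours, 3 minutes, and 4 seconds.
        System.out.println(format(93_784_000L)); // 1d 2h 3m 4s
    }
}
```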

logError

void session.logError ( Object message ) (professional and enterprise editions only)

Description

Write message to the log, at the error level (least verbose).

Parameters

  • message Message to be written to the log after being converted to a String using String.valueOf( message ).

Return Values

Returns void. If an error occurs, an error will be thrown.

Change Log

Version Description
5.5 Now accepts any Object as a message
4.5 Available for professional and enterprise editions.

Examples

Write to Log at Error level

 // Sends the message to the highest level of logging.
 session.logError( "Error parsing date: " + session.getVariable( "DATE" ) );

See Also

  • log() [session] - Sends a message to the log as a debugging message
  • logDebug() [session] - Sends a message to the log as a debugging message
  • logInfo() [session] - Sends a message to the log as an informative message
  • logWarn() [session] - Sends a message to the log as a warning
  • error() [log] - Sends a message to the log as an error message

logInfo

void session.logInfo ( Object message ) (professional and enterprise editions only)

Description

Write message to the log, at the info level (second most verbose).

Parameters

  • message Message to be written to the log after being converted to a String using String.valueOf( message ).

Return Values

Returns void. If an error occurs, an error will be thrown.

Change Log

Version Description
5.5 Now accepts any Object as a message
4.5 Available for professional and enterprise editions.

Examples

Write to Log at Info level

 // Sends the message to the second lowest level of logging.
 session.logInfo( "Traversing search results pages..." );

See Also

  • log() [session] - Sends a message to the log as a debugging message
  • logDebug() [session] - Sends a message to the log as a debugging message
  • logWarn() [session] - Sends a message to the log as a warning
  • logError() [session] - Sends a message to the log as an error message
  • info() [log] - Sends a message to the log as an info message

logVariables

void session.logVariables ( ) (professional and enterprise editions only)

Description

Write all session variables to log.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

Examples

Log All Session Variables

 // Write Variables to Log
 session.logVariables();

See Also

  • breakpoint [dataSet] - Pause scrape and display breakpoint window.

logWarn

void session.logWarn ( Object message ) (professional and enterprise editions only)

Description

Write message to the log, at the warn level (third most verbose).

Parameters

  • message Message to be written to the log after being converted to a String using String.valueOf( message ).

Return Values

Returns void. If an error occurs, an error will be thrown.

Change Log

Version Description
5.5 Now accepts any Object as a message
4.5 Available for professional and enterprise editions.

Examples

Write to Log at Warn level

 // Sends the message to the third level of logging.
 session.logWarn( "Warning! Received a 404 response."  );

See Also

  • log() [session] - Sends a message to the log as a debugging message
  • logDebug() [session] - Sends a message to the log as a debugging message
  • logInfo() [session] - Sends a message to the log as an informative message
  • logError() [session] - Sends a message to the log as an error message
  • warn() [log] - Sends a message to the log as a warning message

Web Interface Interactions

Overview

These methods are used in connection with the web interface of screen-scraper. Their use will provide the interface with more detailed information regarding the state of a running scrape. If you are not running the scrapes using the web interface then these methods are not particularly helpful to you.

As the web interface is an enterprise edition feature, these methods are only available to enterprise edition users.

addToNumDuplicateRecordsScraped

void session.addToNumDuplicateRecordsScraped ( Object value ) (enterprise edition only)

Description

Add to the value of duplicate records scraped. (As opposed to new or error records.)

Parameters

  • value Value to be added to the count. Usually an integer, but if it is given a string (e.g. "10") it will try to transform it into an integer before adding.

Return Values

Returns void.

Change Log

Version Description
7.0 Available for enterprise edition.

Examples

Record Duplicate Records Scraped

 // Adds 10 to the value of duplicate records scraped.
 session.addToNumDuplicateRecordsScraped(10);

Have session record each time a scraped record is already in the database

// In script called "After each pattern match"
import java.sql.PreparedStatement;
import java.sql.ResultSet;

dm = session.getv("_DM");
con = dm.getConnection();

try
{
        String sql = "SELECT id FROM table WHERE did = ?";
        PreparedStatement pstmt = con.prepareStatement(sql);
        pstmt.setString(1, dataRecord.get("ID"));
        ResultSet rs = pstmt.executeQuery();
        if (rs.next())
        {
                log.log("---Already in DB");
                session.addToNumDuplicateRecordsScraped(1);
        }
        else
        {
                session.scrapeFile("Results");
        }
}
catch (Exception e)
{
        log.logError(e);
        session.setFatalErrorOccurred(true);
        session.setErrorMessage(e);    
}
finally
{
        con.close();   
}
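
The value parameter accepts either an integer or a string such as "10". How screen-scraper performs that conversion internally is not documented here, so the following plain-Java sketch is only an assumption about what such a coercion could look like; the class CountCoercion and the method toInt are hypothetical names.

```java
// Hypothetical sketch of coercing a count given as a number or a string; this
// is an assumption for illustration, not screen-scraper's implementation.
final class CountCoercion {

    static int toInt(Object value) {
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        // Throws NumberFormatException when the text is not a valid integer.
        return Integer.parseInt(String.valueOf(value).trim());
    }

    public static void main(String[] args) {
        System.out.println(toInt(10));   // 10
        System.out.println(toInt("10")); // 10
    }
}
```

In this sketch a value that cannot be parsed, such as "ten", raises a NumberFormatException, so passing a real integer remains the safer choice.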

addToNumErrorRecordsScraped

void session.addToNumErrorRecordsScraped ( Object value ) (enterprise edition only)

Description

Add to the value of error records scraped. (As opposed to duplicate or new records.)

Parameters

  • value Value to be added to the count. Usually an integer, but if it is given a string (e.g. "10") it will try to transform it into an integer before adding.

Return Values

Returns void.

Change Log

Version Description
7.0 Available for enterprise edition.

Examples

Record Error Records Scraped

// Adds 10 to the value of error records scraped.
session.addToNumErrorRecordsScraped(10);

Have session record each time a dataRecord is missing a vital datum

// In script called "After each pattern match"
if (sutil.isNullOrEmptyString(dataRecord.get("VITAL_DATUM")))
{
    log.logError("Missing VITAL_DATUM");
    session.addToNumErrorRecordsScraped(1);
}

addToNumNewRecordsScraped

void session.addToNumNewRecordsScraped ( Object value ) (enterprise edition only)

Description

Add to the value of new records scraped. (As opposed to duplicate or error records.)

Parameters

  • value Value to be added to the count. Usually an integer, but if it is given a string (e.g. "10") it will try to transform it into an integer before adding.

Return Values

Returns void.

Change Log

Version Description
7.0 Available for enterprise edition.

Examples

Record New Records Scraped

 // Adds 10 to the value of new records scraped.
 session.addToNumNewRecordsScraped(10);

Have session record each time a new record is saved to the database

// In script called "After each pattern match"
dm = session.getv("_DM");
dm.addData("db_table", dataRecord);
dm.commit("db_table");
if (dm.flush())
{
        session.addToNumNewRecordsScraped(1);
}

addToNumRecordsScraped

void session.addToNumRecordsScraped ( Object value ) (enterprise edition only)

Description

Add to the value of number of records scraped.

Parameters

  • value Value to be added to the count. Usually an integer, but if it is given a string (e.g. "10") it will try to convert it to an integer before adding.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Record Number of Records Scraped

 // Adds 10 to the value of the number of records scraped.
 session.addToNumRecordsScraped( 10 );

Have session record each time a DataRecord exists

 // In script called "After file is scraped"

 // Adds number of DataRecords in DataSet
 // to the value of the number of records scraped.

 session.addToNumRecordsScraped( dataSet.getNumDataRecords() );
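
The value parameter of the addToNum...RecordsScraped methods is described as accepting either an integer or a numeric string. The standalone sketch below mimics that coercion with a hypothetical RecordCounter class. It is not screen-scraper's implementation, and rejecting non-numeric input with an exception is an assumption made for illustration.

```java
// Hypothetical stand-in for the session's record counter; not the real implementation.
final class RecordCounter {
    private int count;

    // Accepts a Number or a numeric String such as "10".
    void add(Object value) {
        count += toInt(value);
    }

    int get() {
        return count;
    }

    void reset() {
        count = 0;
    }

    // Assumption: anything that is neither a Number nor a parsable String is rejected.
    static int toInt(Object value) {
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value instanceof String) {
            return Integer.parseInt(((String) value).trim());
        }
        throw new IllegalArgumentException("Not an integer value: " + value);
    }
}
```

In this sketch, add(10) followed by add("5") leaves the counter at 15, and reset() returns it to 0.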

appendErrorMessage

void session.appendErrorMessage ( String errorMessage ) (enterprise edition only)

Description

Append an error message to any existing error messages.

Parameters

  • errorMessage Error message that should be added, as a string.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

User Specified Error

 // First set the flag indicating that an error occurred.
 session.setFatalErrorOccurred( true );

 // Append an error message.
 session.appendErrorMessage( "An error occurred in the scraping session." );

getErrorMessage

String session.getErrorMessage ( ) (enterprise edition only)

Description

Get the current error message.

Parameters

This method does not receive any parameters.

Return Values

Returns current error message, as a string.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Write Error Message to the Log

 // Output the current error message to the log.
 session.log( "Error message: " + session.getErrorMessage() );

getFatalErrorOccurred

boolean session.getFatalErrorOccurred ( ) (enterprise edition only)

Description

Determine the fatal error status of the scrape.

Parameters

This method does not receive any parameters.

Return Values

Returns whether a fatal error has occurred, as a boolean.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Write Fatal Error State to Log

 // Output the "fatal error" state to the log.
 session.log( "Fatal error occurred: " + session.getFatalErrorOccurred() );

getNumRecordsScraped

int session.getNumRecordsScraped ( ) (enterprise edition only)

Description

Get the number of records that have been scraped.

Parameters

This method does not receive any parameters.

Return Values

Returns the number of records scraped, as an integer.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Write Number of Records Scraped to Log

 // Outputs the number of records that have been scraped to the log.
 session.log( "Num records scraped so far: " + session.getNumRecordsScraped() );

resetNumRecordsScraped

void session.resetNumRecordsScraped ( ) (enterprise editions only)

Description

Reset the count on the number of scraped records.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for all editions.

Examples

Reset Count

// Clear number of records scraped
session.resetNumRecordsScraped();

setErrorMessage

void session.setErrorMessage ( String errorMessage ) (enterprise edition only)

Description

Set the current error message.

Parameters

  • errorMessage Desired error message, as a string.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Specify an Error Message

 // First set the flag indicating that an error occurred.
 session.setFatalErrorOccurred( true );

 // Set the error message.
 session.setErrorMessage( "An error occurred in the scraping session." );

Web Interface Feedback

 // Append a message without flagging it as an error.
 // This repurposes the error message as a status message.
 // Don't do this if a fatal error has occurred.

 if ( !session.getFatalErrorOccurred() )
 {
     session.appendErrorMessage( "Scraping Page: " + session.getv( "PAGE" ) );
 }
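
The relationship between the fatal error flag, setErrorMessage and appendErrorMessage can be sketched with a small standalone model. The ErrorState class below is hypothetical and only mirrors the method names documented here. In particular, the newline separator used when appending is an assumption, not documented behavior.

```java
// Hypothetical model of the session's error state; not screen-scraper's implementation.
final class ErrorState {
    private boolean fatalErrorOccurred;
    private String errorMessage = "";

    void setFatalErrorOccurred(boolean value) {
        fatalErrorOccurred = value;
    }

    boolean getFatalErrorOccurred() {
        return fatalErrorOccurred;
    }

    // Replaces whatever message was stored before.
    void setErrorMessage(String message) {
        errorMessage = message;
    }

    // Adds to the stored message. Assumption: entries are separated by a newline.
    void appendErrorMessage(String message) {
        errorMessage = errorMessage.isEmpty() ? message : errorMessage + "\n" + message;
    }

    String getErrorMessage() {
        return errorMessage;
    }

    // Status updates only overwrite the message while no fatal error is flagged.
    void reportStatus(String status) {
        if (!fatalErrorOccurred) {
            setErrorMessage(status);
        }
    }
}
```

Using setErrorMessage for status updates keeps only the latest status, whereas appendErrorMessage would accumulate every status line. Either way, checking the fatal error flag first keeps a real error message from being overwritten.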

setFatalErrorOccurred

void session.setFatalErrorOccurred ( boolean fatalErrorOccurred ) (enterprise edition only)

Description

Set the fatal error status of the scrape.

Parameters

  • fatalErrorOccurred Desired fatal error status to set, as a boolean.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Set Fatal Error Flag

 // Set the flag indicating that an error occurred.
 session.setFatalErrorOccurred( true );

setNumRecordsScraped

void session.setNumRecordsScraped ( Object value ) (enterprise edition only)

Description

Set the number of records that have been scraped.

Parameters

  • value Value to set the count of the number of records scraped.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Set the Number of Records Scraped

 // Sets the value of the number of records scraped to 10.
 session.setNumRecordsScraped( 10 );

addEventCallback

void session.addEventCallback ( EventFireTime eventTime, EventHandler callback ) (professional and enterprise editions only)
void session.addEventCallbackWithPriority ( EventFireTime eventTime, EventHandler callback, int priority ) (professional and enterprise editions only)

Description

Add a runnable that will be executed at the given time.

Note: Callbacks added with session.addEventCallback are given a priority of 0.

Parameters

  • eventTime The time to execute a callback.
  • callback The callback to execute.
  • priority The priority for this callback. Lower numbers are higher priority.

Return Values

Returns void.

Change Log

Version Description
6.0.55a Introduced for pro and enterprise editions.

Examples

Sets a handler to do something after the scripts set to run at the end of the session have run.

   // using the default callback with the priority being 0.
   session.addEventCallback(SessionEventFireTime.AfterEndScripts, handler);
   
   // if we need to set the priority to be something else (or variable) use the second option
   // in this case the priority could still be set to 0 if you wanted to.
   session.addEventCallbackWithPriority(SessionEventFireTime.AfterEndScripts, handler, 3);
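
To illustrate the rule that lower numbers are higher priority, here is a standalone sketch of a priority-ordered callback list. The CallbackRegistry class and its methods are hypothetical and only model the ordering described above. Running callbacks of equal priority in registration order is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of priority-ordered callbacks; not screen-scraper's internal code.
final class CallbackRegistry {
    private static final class Entry {
        final String name;
        final int priority;

        Entry(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Mirrors addEventCallback: registers at the default priority of 0.
    void add(String name) {
        addWithPriority(name, 0);
    }

    // Mirrors addEventCallbackWithPriority.
    void addWithPriority(String name, int priority) {
        entries.add(new Entry(name, priority));
    }

    // Lower numbers run first. List.sort is stable, so equal priorities keep
    // their registration order (an assumption about the real behavior).
    List<String> executionOrder() {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingInt(e -> e.priority));
        List<String> names = new ArrayList<>();
        for (Entry e : sorted) {
            names.add(e.name);
        }
        return names;
    }
}
```

In this model, a callback registered with priority 3 runs after every callback registered through the default method, regardless of the order in which they were added.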

EventFireTime

EventFireTime is an interface that defines the methods a fire time must have, which allows the addEventCallback method to accept different types of fire times.

A number of different types of classes based on this interface have been defined for you which call out the various parts of a scrape that you can add event handlers to. Those are defined below.

ExtractorPatternEventFireTime

Enum

  • BeforeExtractorPattern Before an extractor is applied (including before any scripts on it run). The returned value should be a boolean and indicates whether the extractor should be run or not. Any non-boolean result is the same as true. Also note that regardless of whether the extractor will be run or not, the event for after extractor pattern will still be fired.
  • AfterExtractorPatternAppliedButBeforeScripts After an extractor is applied (but before any scripts on it run, including the after apparent match scripts).
  • AfterEachExtractorMatch After each match of an extractor. This will be applied before any of the "After each pattern match" scripts are applied.
  • AfterExtractorPattern After an extractor is applied (including after any scripts on it run).
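
The BeforeExtractorPattern rule that any non-boolean result is the same as true can be expressed as a one-line check. The helper below is a hypothetical, standalone illustration of that rule and is not part of the screen-scraper API. Treating null as a non-boolean result is an assumption.

```java
// Hypothetical helper modeling how a BeforeExtractorPattern result is interpreted.
final class ExtractorGate {
    // Only an explicit Boolean false skips the extractor.
    // null or any other type counts as true.
    static boolean shouldRunExtractor(Object handlerResult) {
        return !Boolean.FALSE.equals(handlerResult);
    }
}
```

A handler that returns the string "false" would therefore still let the extractor run in this model. Only the boolean value false prevents it.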

Change Log

Version Description
6.0.55a Introduced for pro and enterprise editions.

Examples

How to use the EventFireTime with the session.addEventCallback method.

    session.addEventCallback(ExtractorPatternEventFireTime.AfterEachExtractorMatch, handler);

ScrapeableFileEventFireTime

Enum

  • BeforeScrapeableFile Before a scrapeable file is launched (including before any scripts on it run).
  • BeforeHttpRequest Fired right before the HTTP request (after any "before scrapeable file" scripts, and will fire each time the request is retried). If it returns a non-null String, that will be used as the response instead of issuing a request. This response will still get passed into the AfterHttpRequest event, but it will not pass through any tidying.
  • AfterHttpRequest Fired right after the HTTP response and running tidy, if set, but before anything else happens. Returns the data that should be used as the response data.
  • AfterScrapeableFile After a scrapeable file is completed (including after any scripts on it run).
  • OnHttpRedirect* Called when a redirect will occur, and returns true if a redirect should occur or false if it should not (any non-boolean result causes no change).

*Note: When using the Async HTTP client you will have access to the request builder from ScrapeableFileEventData.getRedirectRequestBuilder() which can be used to modify and adjust the request before it is sent. If you use the Apache HTTP client the getRedirectRequestBuilder() method will always return null.
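
The return-value rules for BeforeHttpRequest and OnHttpRedirect can be sketched as two small functions. The class and method names below are hypothetical and do not appear in the screen-scraper API. The sketch also assumes that a non-null result that is not a String is ignored by BeforeHttpRequest.

```java
import java.util.function.Supplier;

// Hypothetical helpers modeling how the handler results described above might be interpreted.
final class ScrapeableFileEventRules {
    // BeforeHttpRequest: a non-null String replaces the HTTP response;
    // otherwise the request is issued as usual.
    static String resolveResponse(Object handlerResult, Supplier<String> issueRequest) {
        if (handlerResult instanceof String) {
            return (String) handlerResult;
        }
        return issueRequest.get();
    }

    // OnHttpRedirect: a Boolean decides; any other result leaves the default unchanged.
    static boolean shouldRedirect(Object handlerResult, boolean defaultDecision) {
        if (handlerResult instanceof Boolean) {
            return (Boolean) handlerResult;
        }
        return defaultDecision;
    }
}
```

One possible use of the first rule is serving a cached page from a BeforeHttpRequest handler, so that no request is sent when the content is already on hand.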

Change Log

Version Description
6.0.55a Introduced for pro and enterprise editions.

Examples

How to use the EventFireTime with the session.addEventCallback method.

    session.addEventCallback(ScrapeableFileEventFireTime.BeforeScrapeableFile, handler);

getRedirectToURL

String scrapeableFileEventData.getRedirectToURL ( )

Description

Returns the RedirectToURL value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the RedirectToURL value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the redirect URL

    public Object handleEvent(EventFireTime fireTime, ScrapeableFileEventData data) {
        String url = data.getRedirectToURL();
       
        // do something
    }

ScriptEventFireTime

Enum

  • AfterScript After a script is executed
  • BeforeScript Before a script is executed
  • OnScriptEnd Run when the script finishes executing. The difference between AfterScript and this is that AfterScript fires after the script is done running, and this runs after all the developer code has run but the script engine is still active. The return value is an injected string to execute, or null (or the empty string) to do nothing aside from execute the script code.
  • OnScriptError Executes when an error occurs in a script.
  • OnScriptStart Run when the script begins to execute. The difference between BeforeScript and this is that BeforeScript fires as preparation is made to launch a script, and this runs after all the default pre-script code is executed by the script engine, but before the developer code in the script. The return value is an injected string to execute, or null (or the empty string) to do nothing aside from execute the script code.

Change Log

Version Description
6.0.55a Introduced for pro and enterprise editions.

Examples

How to use the EventFireTime with the session.addEventCallback method.

    session.addEventCallback(ScriptEventFireTime.OnScriptEnd, handler);

SessionEventFireTime

Enum

  • AfterEndScripts After the scrape finishes and all scripts set to run at the end of the session have run.
  • NumRecordsSavedModified When the ScrapingSession.addToNumRecordsScraped(Object) is called, this will also be called. The returned value will be the actual value to add.
  • StopScrapingCalled When the session is stopped, either by calling the stopScraping method or clicking the stop scraping button in the workbench.
  • SessionVariableSet* Called whenever a session variable is set. This is called before the value is actually set. The variable value passed in will be the new value to be set, and the return value of the handler will be the actual value set.
  • SessionVariableRetrieved* Called whenever a session variable is retrieved. This is called after the value is retrieved. The variable value passed in will be the current value, and the return value of the handler will be the actual value returned.

*Note: Calling a setVariable or getVariable method in here WILL trigger the events for those again. Avoid infinite recursion please!
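
One way to avoid the recursion described in the note is a re-entrancy guard inside the handler. The standalone sketch below uses hypothetical stand-in types (VariableEvents, SetHandler, GuardedSetHandler) rather than the real session and EventHandler classes. It only demonstrates the guard pattern.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variable store that calls a handler before each set, as SessionVariableSet does.
final class VariableEvents {
    interface SetHandler {
        Object onSet(VariableEvents vars, String name, Object value);
    }

    // Handler that sets another variable from inside the event without recursing forever.
    static final class GuardedSetHandler implements SetHandler {
        private boolean running;
        int guardedCalls;

        @Override
        public Object onSet(VariableEvents vars, String name, Object value) {
            if (running) {
                return value; // Re-entrant call: pass the value through untouched.
            }
            running = true;
            try {
                guardedCalls++;
                // This call fires the event again; the guard above stops the loop.
                vars.setVariable("LAST_VARIABLE_SET", name);
                return value instanceof String ? ((String) value).trim() : value;
            } finally {
                running = false;
            }
        }
    }

    private final Map<String, Object> values = new HashMap<>();
    private final SetHandler handler;

    VariableEvents(SetHandler handler) {
        this.handler = handler;
    }

    // The handler's return value is what actually gets stored.
    void setVariable(String name, Object value) {
        values.put(name, handler.onSet(this, name, value));
    }

    // Reads a value without firing any event.
    Object peek(String name) {
        return values.get(name);
    }
}
```

The flag is reset in a finally block so that an exception inside the handler does not leave the guard permanently engaged. A single boolean is enough for one thread; a handler shared across threads would need a thread-local flag instead.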

Change Log

Version Description
6.0.55a Introduced for pro and enterprise editions.

Examples

How to use the EventFireTime with the session.addEventCallback method.

    session.addEventCallback(SessionEventFireTime.AfterEndScripts, handler);

StringOperationEventFireTime

Enum

  • HttpParameterEncodeKey Called when an http parameter key (GET or POST) is encoded. The input string will be the value that is already encoded, and the return value should be the value to actually use.
  • HttpParameterEncodeValue Called when an http parameter value (GET or POST) is encoded. The input string will be the value that is already encoded, and the return value should be the value to actually use.
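
Because the handler receives the string after it has been encoded, any adjustment must operate on the encoded form. As a hedged example, the hypothetical helper below rewrites form-style "+" spaces as "%20", which some servers expect. The class and method names are illustrative only, and the sketch assumes the input uses form encoding, where a literal plus sign arrives as "%2B".

```java
// Hypothetical transformation a HttpParameterEncodeValue handler might return.
final class EncodedValueTweaks {
    // In form encoding a literal '+' is sent as %2B, so every remaining '+'
    // stands for a space and can safely be rewritten as %20.
    static String spacesAsPercent20(String alreadyEncoded) {
        return alreadyEncoded.replace("+", "%20");
    }
}
```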

Change Log

Version Description
6.0.55a Introduced for pro and enterprise editions.

Examples

How to use the EventFireTime with the session.addEventCallback method.

    session.addEventCallback(StringOperationEventFireTime.HttpParameterEncodeKey, handler);

EventHandler

EventHandler EventHandler ( ) (professional and enterprise editions only)

Description

Creates an EventHandler callback object which will be called when the event triggers.

Change Log

Version Description
6.0.55a Introduced for pro and enterprise editions.

Examples

Define a handler for the session.addEventCallback to use.

    // Create an EventHandler object which will be called when the event triggers
    EventHandler handler = new EventHandler()
    {
        /**
        * Returns the name of the handler.  This method doesn't need to be implemented
        * but helps with debugging (on error executing the callback it will output this)
        */

        public String getHandlerName()
        {
            return "A test event handler";
        }

        /**
        * Processes the event, and potentially returns a useful value modifying something
        * in the internal code
        *
        * @param fireTime The fire time of the event. This helps when using the same handler
        * for multiple event times, to determine which was called
        * @param data The actual data from the event. Based on the event time this
        * will be a different type. It could be SessionEventData, ScrapeableFileEventData,
        * ScriptEventData, StringEventData, etc...  It will match the fire time class name
        *
        * @return A value indicating how to proceed (or sometimes the value is ignored)
        */

        public Object handleEvent(EventFireTime fireTime, AbstractEventData data)
        {
            // While you can specifically grab any data from the data object,
            // if this is a method that has a return value that matters,
            // it's best to get it as the last return value, so that multiple
            // events can be chained together.  The input data object
            // will always have the original values for all the other getters
            Object returnValue = data.getLastReturnValue();

            // Do stuff...

            // The EventFireTime values describe in the documentation what the return
            // value will do, or says nothing about it if the value is ignored
            // If you don't intend to modify the return, always return data.getLastReturnValue();
            return returnValue;
        }
    };

getHandlerName

String getHandlerName ( )

Description

Returns the name of the handler. This method doesn't need to be implemented but helps with debugging.

Parameters

This method does not receive any parameters.

Return Values

Returns the name of the handler. This method doesn't need to be implemented but helps with debugging.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

    // Create an EventHandler object which will be called when the event triggers
    EventHandler handler = new EventHandler()
    {
        /**
         * Returns the name of the handler.  This method doesn't need to be implemented
         * but helps with debugging (on error executing the callback it will output this)
         */

        public String getHandlerName()
        {
            return "A test event handler";
        }

        public Object handleEvent(EventFireTime fireTime, AbstractEventData data)
        {
            // do something
        }
    };

handleEvent

Object handleEvent ( EventFireTime fireTime, AbstractEventData data )

Description

Processes the event, and potentially returns a useful value modifying something in the internal code as defined by the EventFireTime used to launch this event.

Parameters

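  • fireTime The fire time of the event. This helps determine which event was fired when the same handler is used for multiple event times.
  • data The data from the event. Its type depends on the fire time (SessionEventData, ScrapeableFileEventData, ScriptEventData, StringEventData, etc.) and will match the fire time class name.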

Return Values

Returns a value based on which AbstractEventData class is used.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

    EventHandler handler = new EventHandler()
    {  
        public String getHandlerName()
        {
            return "A test event handler";
        }

        /**
         * Processes the event, and potentially returns a useful value modifying something
         * in the internal code
         *
         * @param fireTime The fire time of the event. This helps when using the same handler
         * for multiple event times, to determine which was called
         * @param data The actual data from the event. Based on the event time this
         * will be a different type. It could be SessionEventData, ScrapeableFileEventData,
         * ScriptEventData, StringEventData, etc...  It will match the fire time class name
         *
         * @return A value indicating how to proceed (or sometimes the value is ignored)
         */

        public Object handleEvent(EventFireTime fireTime, AbstractEventData data)
        {
            // While you can specifically grab any data from the data object,
            // if this is a method that has a return value that matters,
            // it's best to get it as the last return value, so that multiple
            // events can be chained together.  The input data object
            // will always have the original values for all the other getters
            Object returnValue = data.getLastReturnValue();

            // Do stuff...

            // The EventFireTime values describe in the documentation what the return
            // value will do, or says nothing about it if the value is ignored
            // If you don't intend to modify the return, always return data.getLastReturnValue();
            return returnValue;
        }
    };

AbstractEventData

AbstractEventData is an abstract class that provides access to various data values within screen-scraper. It is extended by the classes documented below (ExtractorPatternEventData, ScrapeableFileEventData, ScriptEventData, SessionEventData and StringEventData), and it is those classes that should be used in place of AbstractEventData.

getLastReturnValue

Object getLastReturnValue ( )

Description

Returns the LastReturnValue for the object. This is the value previously returned by another callback. This can be null, if no callbacks have been fired yet for this event. A null value is also the default return value for the given event.

Parameters

This method does not receive any parameters.

Return Values

Returns the LastReturnValue for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the Last Return Value

    // In practice AbstractEventData is just the abstract class.
    // You must actually use one of the classes that extend it.
    public Object handleEvent(EventFireTime fireTime, AbstractEventData data) {
        // While you can specifically grab any data from the data object,
        // if this is a method that has a return value that matters,
        // it's best to get it as the last return value, so that multiple
        // events can be chained together.  The input data object
        // will always have the original values for all the other getters
        Object returnValue = data.getLastReturnValue();

        // do something

        // The EventFireTime values describe in the documentation what the return
        // value will do, or says nothing about it if the value is ignored
        // If you don't intend to modify the return, always return data.getLastReturnValue();
        return data.getLastReturnValue();
    }
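
The chaining behavior described for getLastReturnValue can be modeled in a few lines. The sketch below is standalone and uses hypothetical stand-ins (EventData, Handler, fire) for the real classes. It assumes each callback's return value is stored as the last return value before the next callback runs, which is how the description above reads.

```java
import java.util.List;

// Hypothetical model of return-value chaining between callbacks on one event.
final class ReturnValueChain {
    // Stand-in for AbstractEventData that holds only the chained value.
    static final class EventData {
        private Object lastReturnValue; // null until a callback has returned something

        Object getLastReturnValue() {
            return lastReturnValue;
        }

        void setLastReturnValue(Object value) {
            lastReturnValue = value;
        }
    }

    // Stand-in for EventHandler.handleEvent, without the fire time argument.
    interface Handler {
        Object handleEvent(EventData data);
    }

    // Runs the handlers in order, feeding each one the previous handler's return value.
    static Object fire(List<Handler> handlers) {
        EventData data = new EventData();
        for (Handler handler : handlers) {
            data.setLastReturnValue(handler.handleEvent(data));
        }
        return data.getLastReturnValue();
    }
}
```

A handler that simply returns data.getLastReturnValue() passes the chained value through unchanged, which is why that is the recommended default when a handler does not intend to modify the result.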

setLastReturnValue

void setLastReturnValue ( Object lastReturnValue )

Description

Sets the LastReturnValue for the object.

Parameters

  • lastReturnValue The new value for the LastReturnValue

Return Values

Returns void.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

    // In practice AbstractEventData is just the abstract class.
    // You must actually use one of the classes that extend it.
    public Object handleEvent(EventFireTime fireTime, AbstractEventData data) {

        // Placeholder: replace null with the value later callbacks should see.
        Object foo = null;
        data.setLastReturnValue(foo);

        // do something

        // The EventFireTime values describe in the documentation what the return
        // value will do, or says nothing about it if the value is ignored
        // If you don't intend to modify the return, always return data.getLastReturnValue();
        return data.getLastReturnValue();
    }

ExtractorPatternEventData

ExtractorPatternEventData extends AbstractEventData

This contains the data for various extractor pattern operations

Inherits the following methods from AbstractEventData

extractorPatternTimedOut

boolean extractorPatternEventData.extractorPatternTimedOut ( )

Description

Returns the status of the extractor pattern timeout. Returns true if and only if the extractor pattern was applied and timed out while doing so. Otherwise it will return false.

Parameters

This method does not receive any parameters.

Return Values

Returns a boolean value representing the status of the extractor pattern timeout.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Determine if an extractor pattern has timed out.

    public Object handleEvent(EventFireTime fireTime, ExtractorPatternEventData data) {
        if (data.extractorPatternTimedOut()) {
            // do something
        }
    }

getDataRecord

DataRecord extractorPatternEventData.getDataRecord ( )

Description

Returns the DataRecord value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the DataRecord value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current DataRecord.

    public Object handleEvent(EventFireTime fireTime, ExtractorPatternEventData data) {
        DataRecord dr = data.getDataRecord();
       
        // do something
    }

getDataSet

DataSet extractorPatternEventData.getDataSet ( )

Description

Returns the DataSet value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the DataSet value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current DataSet.

    public Object handleEvent(EventFireTime fireTime, ExtractorPatternEventData data) {
        DataSet ds = data.getDataSet();
       
        // do something
    }

getExtractorPattern

ExtractorPattern extractorPatternEventData.getExtractorPattern ( )

Description

Returns the ExtractorPattern value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the ExtractorPattern value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current ExtractorPattern.

    public Object handleEvent(EventFireTime fireTime, ExtractorPatternEventData data) {
        ExtractorPattern pattern = data.getExtractorPattern();
       
        // do something
    }

getScrapeableFile

ScrapeableFile extractorPatternEventData.getScrapeableFile ( )

Description

Returns the ScrapeableFile value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the ScrapeableFile value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current ScrapeableFile.

    public Object handleEvent(EventFireTime fireTime, ExtractorPatternEventData data) {
        ScrapeableFile sf = data.getScrapeableFile();
       
        // do something
    }

getSession

ScrapingSession extractorPatternEventData.getSession ( )

Description

Returns the Session value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the Session value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current Session.

    public Object handleEvent(EventFireTime fireTime, ExtractorPatternEventData data) {
        ScrapingSession _session = data.getSession();
       
        // do something
    }

ScrapeableFileEventData

ScrapeableFileEventData extends AbstractEventData

This contains the data for various scrapeable file operations

Inherits the following methods from AbstractEventData

getHttpResponseData

String scrapeableFileEventData.getHttpResponseData ( )

Description

Returns the HttpResponseData for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the HttpResponseData for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the HttpResponseData

    public Object handleEvent(EventFireTime fireTime, ScrapeableFileEventData data) {
        String responseData = data.getHttpResponseData();
       
        // do something
    }

getRedirectRequestBuilder

ScrapingRequest.Builder scrapeableFileEventData.getRedirectRequestBuilder ( )

Description

Returns the RedirectRequestBuilder for the object. Use this to add headers and make other changes to the redirect request. It can be null, depending on the HTTP client being used and whether it supports modifying the redirect manually.

Parameters

This method does not receive any parameters.

Return Values

Returns the RedirectRequestBuilder for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the Request Builder in order to modify it.

    public Object handleEvent(EventFireTime fireTime, ScrapeableFileEventData data) {
        ScrapingRequest.Builder builder = data.getRedirectRequestBuilder();
       
        // do something
    }

getScrapeableFile

ScrapeableFile scrapeableFileEventData.getScrapeableFile ( )

Description

Returns the ScrapeableFile value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the ScrapeableFile value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current ScrapeableFile.

    public Object handleEvent(EventFireTime fireTime, ScrapeableFileEventData data) {
        ScrapeableFile sf = data.getScrapeableFile();
       
        // do something
    }

getSession

ScrapingSession scrapeableFileEventData.getSession ( )

Description

Returns the Session value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the Session value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current Session.

    public Object handleEvent(EventFireTime fireTime, ScrapeableFileEventData data) {
        ScrapingSession _session = data.getSession();
       
        // do something
    }

ScriptEventData

ScriptEventData extends AbstractEventData

This contains the data for various script operations

Inherits the following methods from AbstractEventData

getDataRecord

DataRecord scriptEventData.getDataRecord ( )

Description

Returns the DataRecord value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the DataRecord value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current DataRecord.

    public Object handleEvent(EventFireTime fireTime, ScriptEventData data) {
        DataRecord dr = data.getDataRecord();
       
        // do something
    }

getDataSet

DataSet scriptEventData.getDataSet ( )

Description

Returns the DataSet value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the DataSet value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current DataSet.

    public Object handleEvent(EventFireTime fireTime, ScriptEventData data) {
        DataSet ds = data.getDataSet();
       
        // do something
    }

getScrapeableFile

ScrapeableFile scriptEventData.getScrapeableFile ( )

Description

Returns the ScrapeableFile value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the ScrapeableFile value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current ScrapeableFile.

    public Object handleEvent(EventFireTime fireTime, ScriptEventData data) {
        ScrapeableFile sf = data.getScrapeableFile();
       
        // do something
    }

getScriptException

java.lang.Exception scriptEventData.getScriptException ( )

Description

Returns the ScriptException for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the ScriptException for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the script exception

    public Object handleEvent(EventFireTime fireTime, ScriptEventData data) {
        java.lang.Exception e = data.getScriptException();
       
        // do something
    }

getScriptName

String scriptEventData.getScriptName ( )

Description

Returns the ScriptName value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the ScriptName value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the script name

    public Object handleEvent(EventFireTime fireTime, ScriptEventData data) {
         String name = data.getScriptName();
       
        // do something
    }

getSession

ScrapingSession scriptEventData.getSession ( )

Description

Returns the Session value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the Session value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current Session.

    public Object handleEvent(EventFireTime fireTime, ScriptEventData data) {
        ScrapingSession _session = data.getSession();
       
        // do something
    }

SessionEventData

SessionEventData extends AbstractEventData

This contains the data for various session operations

Inherits the following methods from AbstractEventData

getIncrementRecordsAmount

Object sessionEventData.getIncrementRecordsAmount ( )

Description

Returns the IncrementRecordsAmount value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the IncrementRecordsAmount value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current increment records amount.

    public Object handleEvent(EventFireTime fireTime, SessionEventData data) {
        Object recordsAmt = data.getIncrementRecordsAmount();
       
        // do something
    }

getSession

ScrapingSession sessionEventData.getSession ( )

Description

Returns the Session value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the Session value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the current Session.

    public Object handleEvent(EventFireTime fireTime, SessionEventData data) {
        ScrapingSession _session = data.getSession();
       
        // do something
    }

getVariableName

String sessionEventData.getVariableName ( )

Description

Returns the VariableName value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the VariableName value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the variable name.

    public Object handleEvent(EventFireTime fireTime, SessionEventData data) {
        String name = data.getVariableName();
       
        // do something
    }

getVariableValue

Object sessionEventData.getVariableValue ( )

Description

Returns the VariableValue value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the VariableValue value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the variable value.

    public Object handleEvent(EventFireTime fireTime, SessionEventData data) {
        Object value = data.getVariableValue();
       
        // do something
    }

StringEventData

StringEventData extends AbstractEventData

This contains the data for various string operations.

Inherits the following methods from AbstractEventData

getInput

String stringEventData.getInput ( )

Description

Returns the Input value for the object.

Parameters

This method does not receive any parameters.

Return Values

Returns the Input value for the object.

Change Log

Version Description
6.0.55a Available for all editions.

Examples

Get the input string

    public Object handleEvent(EventFireTime fireTime, StringEventData data) {
        String str = data.getInput();
       
        // do something
    }

addToVariable

void session.addToVariable ( String variable, int value ) (professional and enterprise editions only)

Description

Add to the value of a session variable.

Parameters

  • variable Key of the variable, as a string.
  • value Value to be added to the variable, as an integer.

Return Values

Returns void. If the variable doesn't exist, or is not a string or integer, a message will be added to the log. If it cannot add to the variable for any other reason it will write an error to the log.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

Examples

Increment Variable

 // Increments the session variable "PAGE_NUM" by one.
 session.addToVariable( "PAGE_NUM", 1 );

See Also

  • getVariable() [session] - Returns the value of a session variable
  • getv() [session] - Returns the value of a session variable (alias of getVariable)
  • setVariable() [session] - Sets the value of a session variable
  • setv() [session] - Sets the value of a session variable (alias of setVariable)

breakpoint

void session.breakpoint ( ) (professional and enterprise editions only)

Description

Pauses the scrape and displays the breakpoint window. If the scrape is running in server mode, logVariables is called in place of breakpoint so that the scrape does not pause.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

Examples

Open BreakPoint Window

 // Causes the breakpoint window to be displayed.
 session.breakpoint();

clearAllSessionVariables

void session.clearAllSessionVariables ( )

Description

Remove all session variables.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Clear Session Variables

 // Clear all session variables.
 session.clearAllSessionVariables();

See Also

  • setVariable() [session] - Sets the value of a session variable
  • setv() [session] - Sets the value of a session variable (alias of setVariable)

clearCookies

void session.clearCookies ( ) (enterprise edition only)

Description

Clear stored cookies.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Clear Cookies

 // Clear all current cookies.
 session.clearCookies();

See Also

  • getCookies() [session] - Gets all the cookies currently stored by this scraping session
  • setCookie() [session] - Sets the value of a cookie

clearVariables

void session.clearVariables ( Map variables ) (professional and enterprise editions only)
void session.clearVariables ( Collection variables ) (professional and enterprise editions only)

Description

Clears the value of every session variable whose key appears in the given Map or Collection. A key of DATARECORD is ignored.

This method accepts a Map or Collection, rather than a List or Set, so that it works more easily with the setSessionVariables method.

Parameters

  • Map The map to use when clearing the session variables.
  • Collection The collection to use when clearing the session variables.

Return Value

This method returns void.

Change Log

Version Description
5.5.29a Available in all editions.
5.5.43a Changed from session.removeSessionVariablesInMap to session.clearVariables.

Examples

Clear the ASPX values for a .NET site after scraping the next page

 DataRecord aspx = scrapeableFile.getASPXValues();
 
 session.setSessionVariables(aspx);
 session.scrapeFile("Next Results");
 session.clearVariables(aspx);

convertHTMLEntitiesInVariable

void session.convertHTMLEntitiesInVariable ( String variable )

Description

Decode HTML Entities on a session variable.

Parameters

  • variable Session variable whose HTML Entities will be converted to characters.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

Examples

Decode HTML Entities In Variable

// Set variable
session.setv( "LOCATION", "Angela&#39;s Room" );

// Convert HTML entities
session.convertHTMLEntitiesInVariable( "LOCATION" );

// Write to Log
session.log( session.getv( "LOCATION" ) ); //logs Angela's Room
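
To illustrate what decoding HTML entities means, here is a minimal standalone Java sketch that converts numeric entities and a handful of named ones into characters. It is not screen-scraper's implementation: the class name EntityDecodeSketch, the decode method, and the small table of named entities are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the screen-scraper API.
final class EntityDecodeSketch {
    // Matches decimal (&#39;), hexadecimal (&#x27;) and named (&amp;) entities.
    private static final Pattern ENTITY =
        Pattern.compile("&(#[0-9]{1,6}|#[xX][0-9a-fA-F]{1,5}|[a-zA-Z]+);");

    // A deliberately small table of named entities (assumption: a real decoder knows many more).
    private static final Map<String, String> NAMED = Map.of(
        "amp", "&", "lt", "<", "gt", ">", "quot", "\"", "apos", "'");

    static String decode(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                // Hexadecimal character reference.
                replacement = new String(Character.toChars(Integer.parseInt(body.substring(2), 16)));
            } else if (body.startsWith("#")) {
                // Decimal character reference.
                replacement = new String(Character.toChars(Integer.parseInt(body.substring(1))));
            } else {
                // Named entity; unknown names are left untouched.
                replacement = NAMED.getOrDefault(body, m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("Angela&#39;s Room &amp; Board")); // prints: Angela's Room & Board
    }
}
```

The sketch makes a single pass over the input, so a string such as "&amp;lt;" decodes to "&lt;" rather than being decoded twice.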

downloadFile

boolean session.downloadFile ( String url, String fileName ) (professional and enterprise editions only)
boolean session.downloadFile ( String url, String fileName, int maxNumAttempts ) (professional and enterprise editions only)
boolean session.downloadFile ( String url, String fileName, int maxNumAttempts, boolean doLazy ) (enterprise edition only)

Description

Downloads the file to the local file system.

Parameters

  • url URL reference to the desired file, as a string.
  • fileName Local file path where the file should be saved, as a string.
  • maxNumAttempts (optional) Maximum number of times the file will be requested if earlier attempts fail, as an integer. Defaults to 3.
  • doLazy (optional) Whether the file should be downloaded in a separate thread, as a boolean. Defaults to false.

Return Values

Returns true if the file was downloaded successfully; otherwise it returns false.

Change Log

Version Description
4.5 Available for professional and enterprise editions. Lazy scrape only available for enterprise edition.

If downloading the file requires sending POST data, use saveFileOnRequest with a scrapeable file instead.

Using this method in a script takes the place of requesting the target URL as a scrapeable file.

Examples

Download File in a Separate Thread

 // Downloads the image pointed to by the URL to the local C: drive.
 // A maximum number of 5 attempts will be made to download the file,
 // and the file will be downloaded in its own thread.

 session.downloadFile( "http://www.foo.com/imgs/puppy_image.gif", "C:/images/puppy.gif", 5, true );
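
The meaning of a bounded number of attempts, combined with a boolean result, can be sketched in plain Java as follows. This is only an illustration of the pattern and not how screen-scraper implements downloadFile: the class RetrySketch and the method tryUpTo are hypothetical, and the sketch assumes that attempts stop at the first success.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; not part of the screen-scraper API.
final class RetrySketch {
    // Runs the attempt up to maxNumAttempts times.
    // Returns true as soon as one attempt succeeds, false if all of them fail.
    static boolean tryUpTo(int maxNumAttempts, BooleanSupplier attempt) {
        for (int i = 0; i < maxNumAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated download that only succeeds on the third attempt.
        boolean ok = tryUpTo(5, () -> ++calls[0] == 3);
        System.out.println(ok + " after " + calls[0] + " attempts"); // prints: true after 3 attempts
    }
}
```

A script can branch on the boolean result in the same way, for example by logging a warning when the download did not succeed.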

executeScript

void session.executeScript ( String scriptName ) (professional and enterprise editions only)

Description

Manually start the execution of a script.

Parameters

  • scriptName Name of the script to execute, as a string. The script has to be on the same instance of screen-scraper as the scraping session.

Return Values

Returns void. If the script doesn't exist a message will be written to the log. If the called script has an error in it a warning will be written to the log.

Change Log

Version Description
5.0 Scripts called using this method are now exported with the scraping session.
4.5 Available for professional and enterprise editions.

Examples

Execute Script

 // Executes the script "My Script".
 session.executeScript( "My Script" );

executeScriptWithContext

void session.executeScriptWithContext ( String scriptName ) (professional and enterprise editions only)

Description

Executes the named script while preserving the current context (dataRecord, scrapeableFile, etc.).

Parameters

  • scriptName The name of the script to execute.

Return Value

This method returns void.

Change Log

Version Description
5.5.29a Available in professional and enterprise editions.

Examples

Execute a script, but preserve the context

 // Execute the 'Do more stuff' script, but give it access to the scrapeableFile this script has access to.
 session.executeScriptWithContext("Do more stuff");

getCharacterSet

String session.getCharacterSet ( )

Description

Get the general character set being used in page response renderings.

Parameters

This method does not receive any parameters.

Return Values

Returns the character set applied to the scraping session's files, as a string. If a character set has not been specified, it defaults to the character set specified in the settings dialog box.

Change Log

Version Description
4.5 Available for all editions.

If you are having trouble with characters displaying incorrectly, we encourage you to read about how to go about finding a solution using one of our FAQs.

Examples

Get Character Set

 // Get the character set of the scraping session
 charSetValue = session.getCharacterSet();

See Also

  • setCharacterSet() [session] - Set the character set used to render all responses.
  • getCharacterSet() [scrapeableFile] - Get the character set used to render responses to a specific scrapeable file.
  • setCharacterSet() [scrapeableFile] - Set the character set used to render responses to a specific scrapeable file.

getConnectionTimeout

int session.getConnectionTimeout ( )

Description

Retrieve the timeout value for scrapeable files in the session.

Parameters

This method does not receive any parameters.

Return Values

Returns the timeout value in milliseconds, as an integer.

Change Log

Version Description
5.0.1a Introduced for all editions.

Examples

Retrieve Connection Timeout

 // set variable to connection timeout
 timeout = session.getConnectionTimeout( );

getCookies

Cookie[] session.getCookies ( )

Description

Get the current cookies.

Parameters

This method does not receive any parameters.

Return Values

Returns an array of the cookies in the session.

Change Log

Version Description
5.0 Available for all editions.

Examples

Add Cookie If Missing

// Get cookies
cookies = session.getCookies();

// Cookie Information
cookieDomain = "mydomain.com";
cookieName = "cookie_test";
cookieValue = "please_accept_for_session";

// Exists Flag
cookieExists = false;

// Loop through cookies
for (i = 0; i < cookies.length; i++) {
    cookie = cookies[i];

    // Check if this is the cookie
    if (cookie.getName().equals(cookieName) && cookie.getValue().equals(cookieValue) && cookie.getDomain().equals(cookieDomain)) {
        //if the cookie matches then it exists
        cookieExists = true;
        // Log search status
        session.log( "+++Cookie Exists" );
        // Stop searching
        break;
    }
}

// Add cookie, if it doesn't exist
if ( !cookieExists ) {
    session.log( "+++Cookie Does NOT Exist: Setting Cookie" );
    session.setCookie( cookieDomain, cookieName, cookieValue);
}

Write Cookies to Log

// Get cookies
cookies = session.getCookies();

// Loop through Cookies
for (i = 0; i < cookies.length; i++) {
    cookie = cookies[i];

    // Write Cookie information to the Log
    session.log( "COOKIE #" + i );
    session.log( "Name: " + cookie.getName() );
    session.log( "Value: " + cookie.getValue() );
    session.log( "Path: " + cookie.getPath() );
    session.log( "Domain: " + cookie.getDomain() );
    // Only log expiration if it is set
    if (cookie.getExpiryDate() != null) {
        session.log( "Expiration: " + cookie.getExpiryDate().toString() );
    }
}

See Also

  • clearCookies() [session] - Clears all the cookies from this scraping session
  • setCookie() [session] - Sets the value of a cookie

getDebugMode

boolean session.getDebugMode ( )

Description

Checks whether the scraping session is currently set to run in debug mode. This is useful when developing scrapes: enabling debug mode logs a warning message, which makes it easier to notice a scrape that still uses hard-coded development values. A warning is also written to the web interface or log each time monitored variables are logged using the logMonitoredValues or webMessage methods.

Parameters

This method takes no parameters.

Return Value

True if debug mode is enabled, false otherwise.

Change Log

Version Description
5.5.29a Available in all editions.

Examples

Set some hardcoded values to use when the scrape is being developed

 // Comment out the line below for production
 session.setDebugMode(true);
 
 if(session.getDebugMode())
 {
   session.setVariable("SEARCH_TERM", "DVDs");
   session.setVariable("USERNAME", "some user");
   session.setVariable("PASSWORD", "the password");
 }

getDefaultRetryPolicy

RetryPolicy session.getDefaultRetryPolicy ( ) (professional and enterprise editions only)

Description

Gets the default retry policy to be used by each scrapeable file when one wasn't set for it.

Parameters

This method takes no parameters.

Return Value

The default retry policy, or null if there isn't one.

Change Log

Version Description
5.5.29a Available in professional and enterprise editions.

Examples

Check for a default RetryPolicy

 if(session.getDefaultRetryPolicy() == null)
 {
   session.logWarn("No default retry policy specified");
 }

getElapsedRunningTime

long session.getElapsedRunningTime ( ) (professional and enterprise editions only)

Description

Get how long the current session has been running.

Parameters

This method does not receive any parameters.

Return Values

Returns number of milliseconds the scrape has been running, as a long (8-byte integer).

Change Log

Version Description
4.5 Available for professional and enterprise editions.

If you would like to log the running time of the scraping session you should use logElapsedRunningTime.

Examples

Generic Scrape Timeout

 // On pagination iterator

 // Setup length to run
 timeout = 1000*60*60*24; // 1 day

 // Check how long scrape has been running
 if (session.getElapsedRunningTime() >= timeout )
 {
     session.stopScraping();
 }
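
Because the elapsed time is returned in milliseconds, it is often easier to read once converted to hours, minutes and seconds. The following standalone Java sketch shows one way to do that. The class ElapsedFormatSketch and its format method are hypothetical helpers, not part of the screen-scraper API. In a script, the argument would be the value returned by session.getElapsedRunningTime().

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not part of the screen-scraper API.
final class ElapsedFormatSketch {
    // Converts a duration in milliseconds to a string such as "1h 02m 03s".
    static String format(long elapsedMillis) {
        long hours = TimeUnit.MILLISECONDS.toHours(elapsedMillis);
        long minutes = TimeUnit.MILLISECONDS.toMinutes(elapsedMillis) % 60;
        long seconds = TimeUnit.MILLISECONDS.toSeconds(elapsedMillis) % 60;
        return String.format("%dh %02dm %02ds", hours, minutes, seconds);
    }

    public static void main(String[] args) {
        // One day, expressed in milliseconds.
        long oneDay = TimeUnit.DAYS.toMillis(1);
        System.out.println(format(oneDay)); // prints: 24h 00m 00s
    }
}
```

Using TimeUnit for the conversions avoids hand-written constants such as 1000*60*60*24.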

getLoggingLevel

int session.getLoggingLevel ( )

Description

Get the logging level of the scrape.

Parameters

This method does not receive any parameters.

Return Values

Returns the logging level, as an integer. Currently there are four levels: 1 = Debug, 2 = Info, 3 = Warn, 4 = Error.

Change Log

Version Description
5.0.1a Introduced for all editions.

Examples

Set Logging Level If Low

// get logging level
logLevel = session.getLoggingLevel();

if (logLevel < Notifiable.LEVEL_WARN )
{
    session.setLoggingLevel( Notifiable.LEVEL_WARN );
}
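
When writing the logging level to a log it can help to translate the integer into its name. The standalone Java sketch below maps the four values listed under Return Values to their names. The class LogLevelSketch and its name method are hypothetical and not part of the screen-scraper API. In a script, the argument would be the value returned by session.getLoggingLevel().

```java
// Hypothetical illustration only; not part of the screen-scraper API.
final class LogLevelSketch {
    // Maps the documented levels (1 = Debug, 2 = Info, 3 = Warn, 4 = Error) to their names.
    static String name(int level) {
        switch (level) {
            case 1: return "Debug";
            case 2: return "Info";
            case 3: return "Warn";
            case 4: return "Error";
            default: return "Unknown (" + level + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println("Logging level: " + name(3)); // prints: Logging level: Warn
    }
}
```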

getMaxConcurrentFileDownloads

int session.getMaxConcurrentFileDownloads ( ) (professional and enterprise editions only)

Description

Retrieve the maximum number of concurrent file downloads being allowed.

Parameters

This method does not receive any parameters.

Return Values

Returns the max number of concurrent file downloads allowed, as an integer.

Change Log

Version Description
5.0 Added for professional and enterprise editions.

Examples

Check Max Concurrent File Downloads

 // How many concurrent downloads are permitted
 maxConcurrentDownloads = session.getMaxConcurrentFileDownloads();

getMaxHTTPRequests

int session.getMaxHTTPRequests ( ) (professional and enterprise editions only)

Description

Retrieve the number of attempts that scrapeable files should make to get the requested page.

Parameters

This method does not receive any parameters.

Return Values

Returns the number of attempts that will be made, as an integer.

Change Log

Version Description
5.0 Available for all editions.

Examples

Retrieve the Retry Value

// Write retries to log
session.log( "Retries per file: " + session.getMaxHTTPRequests() );

See Also

  • setMaxHTTPRequests() [session] - Sets the number of attempts a scrapeable file will make to get the requested page

getMaxScriptsOnStack

int session.getMaxScriptsOnStack ( )

Description

Get the total number of scripts allowed on the stack before the scraping session is forcibly stopped.

Parameters

This method does not receive any parameters.

Return Values

Returns max number of scripts that can be running at a time, as an integer.

Change Log

Version Description
5.0 Added for all editions.

Examples

Check If More Scripts Can Be Run

 import java.math.*;

 // Get Number of Scripts (running and max)
 BigDecimal numRunningScripts = new BigDecimal(session.getNumScriptsOnStack());
 BigDecimal maxAllowedScripts = new BigDecimal(session.getMaxScriptsOnStack());

 // Calculate percentage used (0 to 100), rounded to two decimal places
 BigDecimal percentageUsedBD = numRunningScripts.multiply(new BigDecimal(100)).divide(maxAllowedScripts, 2, BigDecimal.ROUND_HALF_UP);

 double percentageUsed = percentageUsedBD.doubleValue();

 if (percentageUsed < 90)
 {
     session.log(percentageUsedBD.toString() + "% of max scripts used");
 }
 else
 {
     session.logWarn("90% max scripts threshold has been reached.");
 }

getName

String session.getName ( )

Description

Get the name of the current scraping session.

Parameters

This method does not receive any parameters.

Return Values

Returns the name of the scraping session, as a string.

Change Log

Version Description
4.5 Available for all editions.

Examples

Write Scraping Session Name to Log

 // Outputs the name of the scraping session to the log.
 session.log( "Current scraping session: " + session.getName() );

getNumScriptsOnStack

int session.getNumScriptsOnStack ( )

Description

Get the number of scripts currently running.

Parameters

This method does not receive any parameters.

Return Values

Returns number of running scripts, as an integer.

Change Log

Version Description
5.0 Added for all editions.

Examples

Check If More Scripts Can Be Run

 import java.math.*;

 // Get Number of Scripts (running and max)
 BigDecimal numRunningScripts = new BigDecimal(session.getNumScriptsOnStack());
 BigDecimal maxAllowedScripts = new BigDecimal(session.getMaxScriptsOnStack());

 // Calculate percentage used (0 to 100), rounded to two decimal places
 BigDecimal percentageUsedBD = numRunningScripts.multiply(new BigDecimal(100)).divide(maxAllowedScripts, 2, BigDecimal.ROUND_HALF_UP);

 double percentageUsed = percentageUsedBD.doubleValue();

 if (percentageUsed < 90)
 {
     session.log(percentageUsedBD.toString() + "% of max scripts used");
 }
 else
 {
     session.logWarn("90% max scripts threshold has been reached.");
 }

getRetainNonTidiedHTML

boolean session.getRetainNonTidiedHTML ( ) (enterprise edition only)

Description

Determine whether or not non-tidied HTML is to be retained for all scrapeable files in this scraping session.

Parameters

This method does not receive any parameters.

Return Values

Returns whether non-tidied HTML is to be retained for all scrapeable files or not, as a boolean.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Determine if Non-tidied HTML is Being Retained

 // Logs whether or not non-tidied HTML is being retained
 // for the scrapeable files in this scraping session.

 if (session.getRetainNonTidiedHTML())
 {
     session.log( "All scrapeable files will retain non-tidied HTML" );
 }
 else
 {
     session.log( "Non-tidied HTML will not be retained." );
 }

getScrapeableSessionID

int session.getScrapeableSessionID ( ) (enterprise edition only)

Description

Get the unique identifier for the scraping session.

Parameters

This method does not receive any parameters.

Return Values

Returns unique session id for the scraping session, as an integer.

Change Log

Version Description
5.0 Added for enterprise edition.

Examples

Retrieve Unique ID

 // Get Unique ID
 int i = session.getScrapeableSessionID();

getStartTime

long session.getStartTime ( )

Description

Retrieve the time at which the scrape started.

Parameters

This method does not receive any parameters.

Return Values

Returns the start time of the scrape in milliseconds, as a long.

Change Log

Version Description
4.5 Available for all editions.

Examples

Get Session Start Time

// Retrieves the start time and places it
// in the variable "start".

start = session.getStartTime();
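
The start time is easier to read in a log when converted to a timestamp. The standalone Java sketch below does this under the assumption that the value is expressed in milliseconds since the Unix epoch, as with System.currentTimeMillis(); this page does not state the reference point, so verify that before relying on it. The class StartTimeSketch and its toUtcString method are hypothetical and not part of the screen-scraper API.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not part of the screen-scraper API.
final class StartTimeSketch {
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    // Assumption: startMillis counts milliseconds since 1970-01-01T00:00:00Z.
    static String toUtcString(long startMillis) {
        return FORMAT.format(Instant.ofEpochMilli(startMillis));
    }

    public static void main(String[] args) {
        // In a script, the argument would be the value returned by session.getStartTime().
        System.out.println(toUtcString(86_400_000L)); // prints: 1970-01-02 00:00:00
    }
}
```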

getTimeZone

TimeZone session.getTimeZone ( )

Description

Gets the current time zone of the scraping session.

Parameters

This method takes no parameters.

Return Value

The time zone this scrape is set to.

Change Log

Version Description
5.5.29a Available in all editions.

Examples

Get the current Time Zone in use

 TimeZone currentTimeZone = session.getTimeZone();

getVariable

Object session.getVariable ( String identifier )

Description

Retrieve the value of a saved session variable.

Parameters

  • identifier The name of the variable whose value is to be retrieved, as a string.

Return Values

Returns the value of the session variable. This will be a string unless you have used setVariable to place something other than a string into a session variable.

Change Log

Version Description
4.5 Available for all editions.

Examples

Retrieve Session Variable

 // Places the session variable "CITY_CODE" in the local
 // variable "cityCode".

 cityCode = session.getVariable( "CITY_CODE" );

See Also

  • addToVariable() [session] - Adds an integer to the value of a session variable.
  • getv() [session] - Retrieve the value of a saved session variable (alias of getVariable).
  • setv() [session] - Set the value of a session variable (alias of setVariable).
  • setVariable() [session] - Set the value of a session variable.

getv

Object session.getv ( String identifier )

Description

Retrieve the value of a saved session variable (alias of getVariable).

Parameters

  • identifier The name of the variable whose value is to be retrieved, as a string.

Return Values

Returns the value of the session variable. This will be a string unless you have used setVariable to place something other than a string into a session variable.

Change Log

Version Description
4.5 Added for all editions.

Examples

Retrieve Session Variable

 // Places the session variable "CITY_CODE" in the local
 // variable "cityCode".

 cityCode = session.getv( "CITY_CODE" );

See Also

  • addToVariable() [session] - Adds an integer to the value of a session variable.
  • getVariable() [session] - Retrieve the value of a saved session variable.
  • setv() [session] - Set the value of a session variable (alias of setVariable).
  • setVariable() [session] - Set the value of a session variable.

isRunningFromCommandLine

boolean session.isRunningFromCommandLine ( )

Description

Returns whether or not the scrape is currently running from the command line. This is a convenience method for scripts that need to behave differently when run from the command line as opposed to other modes.

Parameters

This method does not receive any parameters.

Return Values

Returns true if and only if the scrape is currently running in the command line.

Change Log

Version Description
6.0.37a Introduced for all editions.

Examples

Check If Running From the Command Line

 if (session.isRunningFromCommandLine()) {
    // do something only done in the command line
 }

isRunningInServer

boolean session.isRunningInServer ( )

Description

Returns whether or not the scrape is currently running in the server. This is a convenience method for scripts that need to behave differently when run in the server as opposed to other modes.

Parameters

This method does not receive any parameters.

Return Values

Returns true if and only if the scrape is currently running in the server.

Change Log

Version Description
6.0.37a Introduced for all editions.

Examples

Check If Running in the Server

 if (session.isRunningInServer()) {
    // do something only done in the server
 }

isRunningInWorkbench

boolean session.isRunningInWorkbench ( )

Description

Returns whether or not the scrape is currently running in the workbench. This is a convenience method for scripts that need to behave differently when run in the workbench as opposed to other modes.

Parameters

This method does not receive any parameters.

Return Values

Returns true if and only if the scrape is currently running in the workbench.

Change Log

Version Description
6.0.37a Introduced for all editions.

Examples

Check If Running in the Workbench

 if (session.isRunningInWorkbench()) {
    // do something only done in workbench
 }

loadStateFromString

boolean session.loadStateFromString ( String stateXML ) (professional and enterprise editions only)

Description

Loads the state that would have been previously saved by invoking the session.saveStateToString method.

Parameters

  • stateXML A string representing session state.

Return Values

None

Change Log

Version Description
5.5.30a Available in Professional and Enterprise editions.

Examples

Load state in from a file

import org.apache.commons.io.FileUtils;

File f = new File( "session_state.xml" );
sessionState = FileUtils.readFileToString( f, session.getCharacterSet() );

session.loadStateFromString( sessionState );

loadVariables

void session.loadVariables ( String fileToReadFrom ) (enterprise edition only)

Description

Load session variables from a file.

Parameters

  • fileToReadFrom File path of the file that contains the session variables, as a string.

Return Values

Returns void. If there is a problem retrieving the file contents an I/O error will be written to the log.

Change Log

Version Description
4.5 Available for enterprise edition.

See also: saveVariables.

If you want to create your own file of session variables, put one NAME=VALUE pair on each line. Both the key and the value should be URL-encoded.

Examples

Load Session Variables from File

 // Reads in variables from the file located at "C:\myvars.txt".
 // Note that a forward slash is used instead of a back slash
 // as a folder delimiter. If back slashes were used, they
 // would need to be doubled so that they're properly escaped
 // out for the script interpreter.

 session.loadVariables( "C:/myvars.txt" );

Sample Variables File

BIRTHDAY=12%2F25
NAME=Santa
AGE=Unknown
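
A file in this format can also be generated programmatically. The standalone Java sketch below builds the file contents from a map, URL-encoding each key and value, and parses such contents back into a map. The class VariablesFileSketch and its methods are hypothetical and not part of the screen-scraper API; the use of UTF-8 for the URL encoding and of "\n" as the line separator are assumptions.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the screen-scraper API.
final class VariablesFileSketch {
    // Builds one URL-encoded NAME=VALUE pair per line.
    static String toFileContents(Map<String, String> vars) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : vars.entrySet()) {
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
              .append('\n');
        }
        return sb.toString();
    }

    // Parses contents in the same format back into a map, skipping lines without '='.
    static Map<String, String> parse(String contents) {
        Map<String, String> vars = new LinkedHashMap<>();
        for (String line : contents.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue;
            }
            vars.put(URLDecoder.decode(line.substring(0, eq), StandardCharsets.UTF_8),
                     URLDecoder.decode(line.substring(eq + 1), StandardCharsets.UTF_8));
        }
        return vars;
    }

    public static void main(String[] args) {
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("BIRTHDAY", "12/25");
        vars.put("NAME", "Santa");
        System.out.print(toFileContents(vars)); // first line printed: BIRTHDAY=12%2F25
    }
}
```

Encoding both sides of each pair keeps characters such as "=", "/" and line breaks inside a value from being confused with the file's own delimiters.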

saveStateToString

String session.saveStateToString ( boolean saveCookies, boolean saveVariables ) (professional and enterprise editions only)

Description

Saves the current state of the scraping session to a string. An example use case for this method would be a scraping session that logs in to a site, extracts some information, and then is stopped, saving its state out to a file. A second scraping session could then be run, loading the state back in from the file, which would keep the session logged in so that other information could be obtained without logging in once again. By default the scraping session will save out information such as the URL to use as a referer. More information can be saved using the boolean flags described below.

Parameters

  • saveCookies Whether or not cookies should be saved.
  • saveVariables Whether or not session variables should be saved.

Return Values

Returns the saved session state, as a string.

Change Log

Version Description
5.5.30a Available in Professional and Enterprise editions.

Examples

Save out state to a file

// Put the current state in a local variable.
sessionState = session.saveStateToString( true, true );

// Write the state out to a file.
sutil.writeValueToFile( sessionState, "session_state.xml", session.getCharacterSet() );
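
When state is written to a file by one scraping session and read back by another, both sides need to use the same character set, otherwise non-ASCII characters in the state may be corrupted. The standalone Java sketch below demonstrates such a round trip using only the Java standard library. The class StateFileSketch and its methods are hypothetical and not part of the screen-scraper API, and the string passed in merely stands in for the value returned by saveStateToString.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not part of the screen-scraper API.
final class StateFileSketch {
    // Writes the state string to the file using the named character set.
    static void writeState(Path file, String state, String charsetName) {
        try {
            Files.write(file, state.getBytes(Charset.forName(charsetName)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the state string back using the same character set.
    static String readState(Path file, String charsetName) {
        try {
            return new String(Files.readAllBytes(file), Charset.forName(charsetName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the state to a temporary file, reads it back, and removes the file.
    static String roundTrip(String state, String charsetName) {
        try {
            Path file = Files.createTempFile("session_state", ".xml");
            try {
                writeState(file, state, charsetName);
                return readState(file, charsetName);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String state = "<state>caf\u00e9</state>"; // stand-in for a saved state string
        System.out.println(state.equals(roundTrip(state, "UTF-8"))); // prints: true
    }
}
```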

saveVariables

void session.saveVariables ( String fileToSaveTo ) (enterprise edition only)

Description

Saves all current string and integer variables to a file.

Parameters

  • fileToSaveTo File path where the file should be saved, as a string.

Return Values

Returns void. If there is a problem writing the file an I/O error will be written to the log.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Save Session Variables to File System

 // Saves the current session variables out to C:\myvars.txt.
 // Note that a forward slash is used instead of a back slash
 // as a folder delimiter. If back slashes were used, they
 // would need to be doubled so that they're properly escaped
 // out for the script interpreter.

 session.saveVariables( "C:/myvars.txt" );

scrapeFile

void session.scrapeFile ( String scrapeableFileIdentifier )

Description

Manually scrape a scrapeable file.

Parameters

  • scrapeableFileIdentifier Name of the scrapeable file, as a string.

Return Values

Returns void. If there is a problem accessing the scrapeable file a message will be written to the log.

Change Log

Version Description
4.5 Available for all editions.

Examples

Scrape File Manually

 // Causes the scrapeable file "Login" to be requested.
 session.scrapeFile( "Login" );

scrapeString

boolean session.scrapeString ( String scrapeableFileName, String content ) (professional and enterprise editions only)

Description

Invokes a scrapeable file using a string of content instead of a web page or local file.

Parameters

  • scrapeableFileName The scrapeable file to be invoked.
  • content The content to load.

Return Values

None

Change Log

Version Description
5.5.13a Available in all editions.

Examples

Invoke a scrapeable file using a string

content = session.getv( "PARTIAL_PAGE_CONTENT" );
session.scrapeString( "My Scrapeable File", content );

sendDataToClient

void session.sendDataToClient ( String key, Object value ) (enterprise edition only)

Description

Send data to the external script that initiated the scrape. This isn't currently supported by all drivers (e.g., remote scraping session); check the documentation for the language of the external script for more information.

Parameters

  • key Name of the information being sent, as a string.
  • value Data to be processed by external script, supported types are Strings, Integers, DataRecords, and DataSets.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for enterprise edition.

Examples

Send dataRecord to Client

 // Causes the current DataRecord object to be sent to the client
 // for processing.

 session.sendDataToClient( "MyDataRecord", dataRecord );

setCharacterSet

void session.setCharacterSet ( String characterSet )

Description

Set the general character set used in page response renderings. This can be particularly helpful when the pages render characters incorrectly.

Parameters

  • characterSet Java recognized character set, as a string. Java provides a list of supported character sets in its documentation.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

This method must be invoked before the session starts.

If you are having trouble with characters displaying incorrectly, we encourage you to read about how to go about finding a solution using one of our FAQs.

Examples

Set Character Set of All Scrapeable Files

 // In script called "Before scraping session begins"

 // Sets the character set to be applied to the last responses
 // of all scrapeable files in session.

 session.setCharacterSet( "ISO-8859-1" );
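Because the characterSet parameter must be a name that Java recognizes, it can help to check a name against the running JVM before using it. The following is a minimal standalone sketch that uses only java.nio.charset.Charset. The class CharsetCheck and the method isUsable are hypothetical names for illustration and are not part of screen-scraper.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper for illustration; not part of the screen-scraper API.
final class CharsetCheck {

    // Returns true if the running JVM supports the named character set.
    static boolean isUsable(String name) {
        if (name == null) {
            return false;
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("ISO-8859-1 usable: " + isUsable("ISO-8859-1"));
        System.out.println("Charsets known to this JVM: " + Charset.availableCharsets().size());
    }
}
```

A name that passes this check should be safe to hand to session.setCharacterSet, assuming screen-scraper runs on the same JVM.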

See Also

  • getCharacterSet() [session] - Gets the character set used to render all responses.
  • getCharacterSet() [scrapeableFile] - Get the character set used to render responses for a specific scrapeable file.
  • setCharacterSet() [scrapeableFile] - Set the character set used to render responses for a specific scrapeable file.

setConnectionTimeout

void session.setConnectionTimeout ( int timeout )

Description

Set the timeout value for scrapeable files in the session.

Parameters

  • timeout The length of the timeout in seconds, as an integer.

Return Values

Returns void.

Change Log

Version Description
5.0.1a Introduced for all editions.

Examples

Set Connection Timeout

 // set connection timeout to 15 seconds
 session.setConnectionTimeout( 15 );

setCookie

void session.setCookie ( String domain, String key, String value ) (professional and enterprise editions only)

Description

Manually set a cookie in the current session state.

Parameters

  • domain The domain to which the cookie pertains, as a string.
  • key The name of the cookie, as a string.
  • value The value of the cookie, as a string.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

This method should rarely be used, as screen-scraper automatically manages cookies. In cases where cookies are set via JavaScript, this method might be necessary.

Examples

Manually Set Cookie

 // Sets a cookie associated with "mydomain.com", using the
 // key "user" and the value "John Smith".

 session.setCookie( "mydomain.com", "user", "John Smith" );

See Also

  • clearCookies() [session] - Clear all cookies from this scraping session
  • getCookies() [session] - Gets all the cookies currently stored by this scraping session

setDebugMode

void session.setDebugMode ( boolean debugMode )

Description

Sets the debug state for the scrape. When enabled, debug mode simply outputs a warning periodically while running, to help prevent running a production scrape in debug mode.

Parameters

  • debugMode True to enable debug mode, false to disable it.

Return Value

This method returns void.

Change Log

Version Description
5.5.29a Available in all editions.

Examples

Set some hardcoded values to use when the scrape is being developed

 // Comment out the line below for production
 session.setDebugMode(true);

 if(session.getDebugMode())
 {
   session.setVariable("SEARCH_TERM", "DVDs");
   session.setVariable("USERNAME", "some user");
   session.setVariable("PASSWORD", "the password");
 }

setDefaultRetryPolicy

void session.setDefaultRetryPolicy ( RetryPolicy retryPolicy ) (professional and enterprise editions only)

Description

Sets a retry policy that will affect all files in the scrape. This policy will be used by all scrapeable files that do not have a retry policy set for them. If a retry policy was manually set for them, this one will not be used.

Parameters

  • retryPolicy The retry policy to use by default, if no other retry policy is set.

Return Value

This method returns void.

Change Log

Version Description
5.5.29a Available in professional and enterprise editions.

Examples

Create a default RetryPolicy

 import com.screenscraper.util.retry.RetryPolicyFactory;

 // Use a retry policy that will rotate the proxy if there was an error on request
 session.setDefaultRetryPolicy(RetryPolicyFactory.getBasicPolicy(5, "Get new proxy"));

setKeyStoreFilePath

void session.setKeyStoreFilePath ( String filePath ) (professional and enterprise editions only)

Description

Sets the path to the keystore file. Some web sites require a special type of authentication that requires the use of a keystore file. See our blog entry on Using Client Certificates for more detail. Calling this method is the equivalent of setting the corresponding value under the "Advanced" tab for the scraping session in the workbench.

Parameters

  • filePath The path to the keystore file.

Return Values

None

Change Log

Version Description
5.5.10a Available in all editions.

Examples

Set the path to the keystore file

// Set the path.
session.setKeyStoreFilePath( "~/key_files/my_key.crt" );

// Output the current path.
session.log( "Keystore file path is: " + session.getKeyStoreFilePath() );

setKeyStorePassword

void session.setKeyStorePassword ( String password ) (professional and enterprise editions only)

Description

Sets the password for the keystore file. Some web sites require a special type of authentication that requires the use of a keystore file. See our blog entry on Using Client Certificates for more detail. Calling this method is the equivalent of setting the corresponding value under the "Advanced" tab for the scraping session in the workbench.

Parameters

  • password The password for the keystore file.

Return Values

None

Change Log

Version Description
5.5.10a Available in all editions.

Examples

Set the password for the keystore file

// Set the password.
session.setKeyStorePassword( "My_password" );

// Output the current password.
session.log( "Keystore password is: " + session.getKeyStorePassword() );

setLoggingLevel

void session.setLoggingLevel ( int loggingLevel )

Description

Set the logging level of the scrape.

Parameters

  • loggingLevel Level of logging that should be used, as an integer. It is best to use the constants defined in the Notifiable interface (e.g., Notifiable.LEVEL_WARN) rather than literal numbers, in case the level values ever change.

Return Values

Returns void.

Change Log

Version Description
5.0.1a Introduced for all editions.

Examples

Set Logging Level

// get logging level
logLevel = session.getLoggingLevel();

if (logLevel < Notifiable.LEVEL_WARN )
{
    session.setLoggingLevel( Notifiable.LEVEL_WARN );
}

setMaxConcurrentFileDownloads

void session.setMaxConcurrentFileDownloads ( int maxConcurrentFileDownloads ) (professional and enterprise editions only)

Description

Set the maximum number of concurrent file downloads to allow.

Parameters

  • maxConcurrentFileDownloads The maximum number of downloads to allow, as an integer.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for professional and enterprise editions.

Examples

Set Max for Concurrent File Downloads

 // Limit the number of concurrent file downloads to 10
 session.setMaxConcurrentFileDownloads( 10 );

setMaxHTTPRequests

void session.setMaxHTTPRequests ( int maxAttempts ) (professional and enterprise editions only)

Description

Set the number of attempts that scrapeable files should make to get the requested page.

Parameters

  • maxAttempts The number of attempts that will be made, as an integer.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for all editions.

Examples

Set the Retry Value

// Set retries for files
session.setMaxHTTPRequests( 3 );
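To make the idea of a maximum number of attempts concrete, the following standalone sketch shows a generic bounded-attempt loop. It is an illustration only and makes no claim about how screen-scraper implements its requests. The class BoundedAttempts and the method run are hypothetical.

```java
import java.util.function.BooleanSupplier;

// Illustration only: a generic bounded-attempt loop, not screen-scraper's implementation.
final class BoundedAttempts {

    // Calls the request until it reports success or maxAttempts is reached.
    // Returns the number of attempts actually made.
    static int run(BooleanSupplier request, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            if (request.getAsBoolean()) {
                break; // success, no further attempts needed
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated request that fails once and then succeeds.
        int used = run(() -> ++calls[0] >= 2, 3);
        System.out.println("Attempts used: " + used);
    }
}
```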

See Also

  • getMaxHTTPRequests() [session] - Returns the maximum number of attempts a scrapeable file will make to retrieve the file

setMaxScriptsOnStack

void session.setMaxScriptsOnStack ( int maxScriptsOnStack ) (enterprise edition only)

Description

Set the total number of scripts that can be running concurrently. Default value for maxScriptsOnStack is 50.

Parameters

  • maxScriptsOnStack Number of scripts to be allowed to run concurrently, as an integer.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for enterprise edition.

Before increasing the number of scripts that can be on the stack, make sure that your scrape is not using more than it should. One thing to check is whether the scrape uses recursion where iteration would work. This is discussed in more detail on our blog or in the Tips, Tricks, and Samples section of this site.

Examples

Allocate More Resources to Scrape

 // Allow for 100 scripts (instead of 50)
 session.setMaxScriptsOnStack(100);
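The difference between recursion and iteration mentioned in the note above can be modeled outside screen-scraper. In this standalone sketch, each "handler" stands in for a script on the stack. The class PageWalk and its methods are hypothetical, and the model is a simplification rather than a description of screen-scraper internals.

```java
// Illustration only: models how nesting depth grows with recursion
// but stays flat with iteration. Not screen-scraper code.
final class PageWalk {
    private int active;   // handlers currently running
    private int deepest;  // highest value of active seen so far

    private void enter() {
        active++;
        deepest = Math.max(deepest, active);
    }

    private void exit() {
        active--;
    }

    // Recursive style: the handler for one page starts the next page before it returns.
    void recursive(int pagesLeft) {
        if (pagesLeft == 0) {
            return;
        }
        enter();
        recursive(pagesLeft - 1);
        exit();
    }

    // Iterative style: each handler finishes before the next one starts.
    void iterative(int pages) {
        for (int i = 0; i < pages; i++) {
            enter();
            exit();
        }
    }

    int deepest() {
        return deepest;
    }

    public static void main(String[] args) {
        PageWalk r = new PageWalk();
        r.recursive(60);
        PageWalk it = new PageWalk();
        it.iterative(60);
        System.out.println("recursive depth: " + r.deepest() + ", iterative depth: " + it.deepest());
    }
}
```

In this model the recursive walk over 60 pages nests 60 handlers deep, which would exceed a limit of 50, while the iterative walk never has more than one handler active.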

setRandomizeUserAgent

void session.setRandomizeUserAgent ( boolean randomizeUserAgent ) (professional and enterprise editions only)

Description

Causes the "User-Agent" header sent by screen-scraper to be randomized. The user agent strings from which screen-scraper will select are found in the "resource\conf\user_agents.txt" file.

Parameters

  • randomizeUserAgent true or false

Return Values

None

Change Log

Version Description
5.5.34a Available in Professional and Enterprise editions.

Examples

Randomize the user-agent header

session.setRandomizeUserAgent( true );

// You can also access the current value like so:
session.log( "Randomize user agent: " + session.getRandomizeUserAgent() );

setRetainNonTidiedHTML

void session.setRetainNonTidiedHTML ( boolean retainNonTidiedHTML ) (enterprise edition only)

Description

Set whether or not non-tidied HTML is to be retained for all scrapeable files.

Parameters

  • retainNonTidiedHTML Whether the non-tidied HTML should be retained, as a boolean. The default is false.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for enterprise edition.

If, after the file is scraped, you want to be able to use getNonTidiedHTML this method has to be called before a file is scraped.

Examples

Retain Non-tidied HTML

 // Tell screen-scraper to retain non-tidied HTML for all
 // scrapeable files.

 session.setRetainNonTidiedHTML( true );

setSessionVariables

void session.setSessionVariables ( Map variables ) (professional and enterprise editions only)
void session.setSessionVariables ( Map variables, boolean ignoreLowerCaseKeys ) (professional and enterprise editions only)

Description

Sets the value of all session variables that match the keys in the Map to the values in the Map. This will ignore a key of DATARECORD.

Parameters

  • variables The map to use when setting the session variables.
  • ignoreLowerCaseKeys True if keys containing lowercase characters should be ignored. For example, the key A_KEy would be ignored.

Return Value

This method returns void.

Change Log

Version Description
5.5.29a Available in all editions.
5.5.43a Changed from session.setSessionVariablesFromMap to session.setSessionVariables.

Examples

Set the ASPX values for a .NET site before scraping the next page

 DataRecord aspx = scrapeableFile.getASPXValues();
 
 session.setSessionVariables(aspx);
 session.scrapeFile("Next Results");
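The key-filtering rules described for this method (the DATARECORD key is always skipped, and keys containing lowercase characters are skipped when ignoreLowerCaseKeys is true) can be sketched in standalone Java. This is an illustration of the documented behavior, not the library's code. The class KeyFilter and its methods are hypothetical, and skipping null keys is an assumption made here for safety.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of the documented key-filtering rules; not screen-scraper's implementation.
final class KeyFilter {

    // True if the key contains at least one lowercase character.
    static boolean hasLowerCase(String key) {
        for (int i = 0; i < key.length(); i++) {
            if (Character.isLowerCase(key.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Returns the entries that would be copied under the rules described above.
    static Map<String, Object> accepted(Map<String, Object> source, boolean ignoreLowerCaseKeys) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            String key = entry.getKey();
            // Assumption: null keys are skipped. DATARECORD is skipped per the description.
            if (key == null || "DATARECORD".equals(key)) {
                continue;
            }
            if (ignoreLowerCaseKeys && hasLowerCase(key)) {
                continue;
            }
            result.put(key, entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> values = new LinkedHashMap<>();
        values.put("A_KEY", "kept");
        values.put("A_KEy", "contains a lowercase character");
        values.put("DATARECORD", "always skipped");
        System.out.println(accepted(values, true).keySet());
    }
}
```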

setStatusMessage

void session.setStatusMessage ( String message ) (enterprise edition only)

Description

Sets a status message to be displayed in the web interface.

Parameters

  • message The message to be set.

Return Values

None

Change Log

Version Description
5.5.32a Available in Enterprise edition.

Examples

Set a status message

if( scrapeableFile.getMaxRequestAttemptsReached() )
{
        session.setStatusMessage( "Maximum requests reached for scrapeable file: " + scrapeableFile.getName() );
       
        // Output the current status message.
        session.log( "Current status message: " + session.getStatusMessage() );
}

setStopScrapingOnExtractorPatternTimeout

void session.setStopScrapingOnExtractorPatternTimeout ( boolean stopScrapingOnExtractorPatternTimeout ) (professional and enterprise editions only)

Description

If this method is passed the value of true, it will cause screen-scraper to stop the current scraping session if an extractor pattern timeout occurs.

Parameters

  • stopScrapingOnExtractorPatternTimeout true or false

Return Values

None

Change Log

Version Description
5.5.36a Available in Professional and Enterprise editions.

Examples

Indicate that the scraping session should be stopped when an extractor pattern timeout occurs

session.setStopScrapingOnExtractorPatternTimeout( true );

// You can also access the current value like so:
session.log( "Stop scraping on extractor pattern timeout: " + session.getStopScrapingOnExtractorPatternTimeout() );

setStopScrapingOnMaxRequestAttemptsReached

void session.setStopScrapingOnMaxRequestAttemptsReached ( boolean stopScrapingOnMaxRequestAttemptsReached ) (professional and enterprise editions only)

Description

If this method is passed the value of true, it will cause screen-scraper to stop the current scraping session if the maximum attempts to request a file is reached.

Parameters

  • stopScrapingOnMaxRequestAttemptsReached true or false

Return Values

None

Change Log

Version Description
5.5.36a Available in Professional and Enterprise editions.

Examples

Indicate that the scraping session should be stopped if the maximum attempts to request a file is reached

session.setStopScrapingOnMaxRequestAttemptsReached( true );

// You can also access the current value like so:
session.log( "Stop scraping on max attempts reached: " + session.getStopScrapingOnMaxRequestAttemptsReached() );

setStopScrapingOnScriptError

void session.setStopScrapingOnScriptError ( boolean stopScrapingOnScriptError ) (professional and enterprise editions only)

Description

If this method is passed the value of true, it will cause screen-scraper to stop the current scraping session if a script error occurs.

Parameters

  • stopScrapingOnScriptError true or false

Return Values

None

Change Log

Version Description
5.5.36a Available in Professional and Enterprise editions.

Examples

Indicate that the scraping session should be stopped if a script error occurs

session.setStopScrapingOnScriptError( true );

// You can also access the current value like so:
session.log( "Stop scraping on script error: " + session.getStopScrapingOnScriptError() );

setTimeZone

void session.setTimeZone ( String timeZone )
void session.setTimeZone ( TimeZone timeZone )

Description

Sets the time zone that will be used when using a method that returns a time formatted as a string.

Parameters

  • timeZone The new timezone to use. If null is given, the local timezone will be used.

Return Value

This method returns void.

Change Log

Version Description
5.5.29a Available in all editions.

Examples

Set the time zone

 session.setTimeZone("America/Denver");
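When building a java.util.TimeZone for the second form of this method, note that TimeZone.getTimeZone falls back to GMT for an ID it does not recognize instead of reporting an error. The standalone sketch below checks the ID first. The class ZoneLookup and the method find are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.TimeZone;

// Hypothetical helper for illustration; not part of the screen-scraper API.
final class ZoneLookup {

    // Returns the TimeZone for a recognized ID, or null if the ID is unknown.
    // TimeZone.getTimeZone falls back to GMT for unknown IDs, so the ID is
    // checked against the JVM's list of available IDs first.
    static TimeZone find(String id) {
        if (id == null || !Arrays.asList(TimeZone.getAvailableIDs()).contains(id)) {
            return null;
        }
        return TimeZone.getTimeZone(id);
    }

    public static void main(String[] args) {
        TimeZone zone = find("America/Denver");
        System.out.println(zone == null ? "unknown zone" : zone.getID());
        // Inside a screen-scraper script the result could then be passed
        // to session.setTimeZone( TimeZone timeZone ).
    }
}
```

Returning null for an unknown ID fits the parameter description above, since a null time zone causes the local time zone to be used.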

setUseServerCharacterSet

void session.setUseServerCharacterSet ( boolean useServerCharacterSet ) (professional and enterprise editions only)

Description

If this method is passed the value of true, it will cause screen-scraper to utilize whatever character set is specified by the server in its "Content-Type" response header. If no such header exists, screen-scraper will default to either the character set indicated for the scraping session or the global character set (in that order).

Parameters

  • useServerCharacterSet true or false

Return Values

None

Change Log

Version Description
5.5.11a Available in all editions.

Examples

Indicate that the server character set should be used

session.setUseServerCharacterSet( true );

// You can also access the current value like so:
session.log( "Use server character set: " + session.getUseServerCharacterSet() );
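The fallback order described above (server "Content-Type" header first, then the scraping session's character set, then the global one) can be sketched as follows. This is a standalone illustration of that order, not screen-scraper's implementation. The class CharsetFallback and its methods are hypothetical, and the header parsing is deliberately simple.

```java
// Illustration of the documented fallback order; not screen-scraper's implementation.
final class CharsetFallback {

    // Extracts the charset parameter from a Content-Type header value,
    // or returns null if the header is missing or has no charset parameter.
    static String fromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for the "charset=" prefix.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = p.substring(8).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Server charset first, then the session charset, then the global charset.
    static String resolve(String contentType, String sessionCharset, String globalCharset) {
        String server = fromContentType(contentType);
        if (server != null) {
            return server;
        }
        return sessionCharset != null ? sessionCharset : globalCharset;
    }

    public static void main(String[] args) {
        System.out.println(resolve("text/html; charset=UTF-8", "ISO-8859-1", "windows-1252"));
        System.out.println(resolve("text/html", "ISO-8859-1", "windows-1252"));
    }
}
```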

setUserAgent

void session.setUserAgent ( String userAgent ) (professional and enterprise editions only)

Description

Sets the user agent to be used for all requests.

Parameters

  • userAgent The user agent to send with requests, as a string.

Return Values

None

Change Log

Version Description
5.5.23a Available in Professional and Enterprise editions.

Examples

Set the user agent

session.setUserAgent( "Opera/9.64(Windows NT 5.1; U; en) Presto/2.1.1" );

// You can also access the current value like so:
session.log( "Session user agent: " + session.getUserAgent() );

setVariable

void session.setVariable ( String identifier, Object value )

Description

Set the value of a session variable.

Parameters

  • identifier Name of the session variable, as a string.
  • value Value of the session variable. This can be any Java object, including (but not limited to) a String, DataSet, or DataRecord.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Set Session Variable

 // Sets the session variable "CITY_CODE" with the value found
 // in the first dataRecord (at index 0) pointed to by the
 // identifier "CITY_CODE".

 session.setVariable( "CITY_CODE", dataSet.get( 0, "CITY_CODE" ) );

See Also

  • addToVariable() [session] - Adds an integer to the value of a session variable.
  • getv() [session] - Retrieve the value of a saved session variable (alias of getVariable).
  • getVariable() [session] - Retrieve the value of a saved session variable.
  • setv() [session] - Set the value of a session variable (alias of setVariable).

setv

void session.setv ( String identifier, Object value )

Description

Set the value of a session variable (alias of setVariable).

Parameters

  • identifier Name of the session variable, as a string.
  • value Value of the session variable. This can be any Java object, including (but not limited to) a String, DataSet, or DataRecord.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

Examples

Set Session Variable

 // Sets the session variable "CITY_CODE" with the value found
 // in the first dataRecord (at index 0) pointed to by the
 // identifier "CITY_CODE".

 session.setv( "CITY_CODE", dataSet.get( 0, "CITY_CODE" ) );

See Also

  • addToVariable() [session] - Adds an integer to the value of a session variable.
  • getv() [session] - Retrieve the value of a saved session variable (alias of getVariable).
  • getVariable() [session] - Retrieve the value of a saved session variable.
  • setVariable() [session] - Set the value of a session variable.

shouldStopScraping

boolean session.shouldStopScraping ( )

Description

Determine if the scrape has been stopped. A scrape can be stopped using the stop button in the workbench or the stop scraping button in the web interface (for enterprise users).

Parameters

This method does not receive any parameters.

Return Values

Returns true if the scrape has been requested to stop; otherwise, it returns false.

Change Log

Version Description
5.0 Added for enterprise edition.

Examples

Stop Iterator if Scrape is Stopped

 for (int i = 0; i < dataSet.getNumDataRecords(); ++i)
 {
     // Check during every iteration to see if we should exit early.
     // Without this check, the iteration will continue even
     // if the stop scraping button were to be pressed.
     if ( session.shouldStopScraping() )
     {
         break;
     }

     session.setVariable( "URL", dataSet.get( i, "NEXT_PAGE_URL" ) );
     session.scrapeFile( "NEXT_PAGE" );
 }
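The loop above follows a cooperative stop pattern: the loop itself checks a flag on each pass, because nothing interrupts it from outside. The standalone sketch below shows the same pattern with a plain AtomicBoolean standing in for session.shouldStopScraping(). The class StopFlagDemo, the method process, and the stopAfter parameter are hypothetical and exist only to simulate a stop request.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration of a cooperative stop check; not screen-scraper code.
final class StopFlagDemo {

    // Processes up to total items, checking the flag before each one.
    // stopAfter simulates the stop button being pressed once that many
    // items have been processed (use -1 to never request a stop).
    // Returns the number of items processed.
    static int process(int total, AtomicBoolean stopRequested, int stopAfter) {
        int processed = 0;
        for (int i = 0; i < total; i++) {
            if (stopRequested.get()) {
                break; // exit early, as in the example above
            }
            processed++;
            if (processed == stopAfter) {
                stopRequested.set(true); // simulated stop request
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println("Processed: " + process(10, new AtomicBoolean(false), 3));
    }
}
```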

stopScraping

void session.stopScraping ( )

Description

Stop the current scraping session.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Stop Scrape on Scrapeable File Request Error

 // Stops scraping if an error response was received
 // from the server.
 if( scrapeableFile.wasErrorOnRequest() )
 {
     session.stopScraping();
 }

waitForFileDownloadsToComplete

void session.waitForFileDownloadsToComplete() (enterprise edition only)

Description

Waits for any file downloads to complete before returning. This should be used in tandem with the session.downloadFile method call that takes the "doLazy" parameter.

Parameters

None

Return Values

None

Change Log

Version Description
5.5.43a Available in Enterprise edition.

Examples

Wait for lazy file downloads to complete

// Download five image files concurrently.
for( i = 0; i < 5; i++ )
{
        session.downloadFile( "http://www.mysite.com/images/image" + i + ".jpg", "output/image" + i + ".jpg", 5, true );
}

// Wait for all of the images to finish downloading before continuing.
session.waitForFileDownloadsToComplete();
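The start-then-wait pattern in the example above resembles submitting tasks to a thread pool and then blocking until every task has finished. The standalone sketch below shows that general pattern with java.util.concurrent. It is an analogy under stated assumptions, not a description of how screen-scraper schedules downloads. The class WaitForAll and the method runAndWait are hypothetical, and the short sleep stands in for a download.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration of "start tasks, then wait for all of them"; not screen-scraper code.
final class WaitForAll {

    // Starts taskCount simulated downloads, at most maxConcurrent at a time,
    // and returns only after all of them have finished.
    static int runAndWait(int taskCount, int maxConcurrent) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                pending.add(pool.submit(() -> {
                    Thread.sleep(10); // stands in for the download itself
                    return id;
                }));
            }
            int completed = 0;
            for (Future<Integer> task : pending) {
                try {
                    task.get(); // blocks until this task has finished
                    completed++;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("Task failed", e);
                }
            }
            return completed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Completed: " + runAndWait(5, 2));
    }
}
```

The maxConcurrent argument plays a role similar to the limit set with session.setMaxConcurrentFileDownloads, in that it caps how many tasks run at once.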