Anonymization

Overview

The following methods help you set up an anonymous scraping session. If you are using your own proxy server pool, these methods let screen-scraper interact with and manage that pool. If you are using automatic anonymization, the only method you need is currentProxyServerIsBad, because screen-scraper manages the servers using the anonymization settings from your setup.

See an example of Anonymization via Manual Proxy Pools.

currentProxyServerIsBad

void session.currentProxyServerIsBad ( ) (professional and enterprise editions only)

Description

Remove the current proxy server from the proxy pool. This is only used with anonymization and indicates that the current server is bad and should be removed.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

If you are using automatic anonymization or manual proxy pools, a new proxy server will be created as a result of the method call.

When checking whether a request has failed, do not rely on the HTTP status code (e.g. 404) alone, because status codes are not always accurate. It is recommended that you also scrape a known string (e.g. "Not found") from the response HTML to confirm what the status code reports.

Examples

Flag Proxy Server

 // Indicates that the current proxy server is bad.
 session.currentProxyServerIsBad();
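
The advice above about confirming a status code against the response HTML can be sketched as a small helper. This is a minimal illustration, not part of the screen-scraper API: the class ResponseCheck, the interface ProxySession, and the methods looksBad and flagIfBad are hypothetical names, and the way you obtain the status code and HTML in your own script is left open. Inside a screen-scraper script you would call session.currentProxyServerIsBad() directly instead of going through the stand-in interface.

```java
public class ResponseCheck {

    /** Hypothetical stand-in for the single session method used in this sketch. */
    public interface ProxySession {
        void currentProxyServerIsBad();
    }

    /**
     * Treats a response as bad only when the status code reports an error
     * AND the known error text is present in the response HTML.
     */
    public static boolean looksBad(int statusCode, String html, String knownErrorText) {
        boolean statusSaysError = statusCode >= 400;
        boolean bodySaysError = html != null && html.contains(knownErrorText);
        return statusSaysError && bodySaysError;
    }

    /**
     * Flags the current proxy server as bad when both checks agree.
     * Returns true if the server was flagged.
     */
    public static boolean flagIfBad(ProxySession session, int statusCode,
                                    String html, String knownErrorText) {
        if (looksBad(statusCode, html, knownErrorText)) {
            session.currentProxyServerIsBad();
            return true;
        }
        return false;
    }
}
```

Requiring both signals keeps a misleading status code from removing a working server from the pool. The error text to look for depends on the site being scraped.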

getCurrentProxyServerFromPool

ProxyServer session.getCurrentProxyServerFromPool ( )

Description

Get the current proxy server from the proxy server pool.

Parameters

This method does not receive any parameters.

Return Values

Returns the current proxy server being used.

Change Log

Version Description
4.5 Available for all editions.

Examples

Write Proxy Server Description to Log

 // Get Proxy Server
 proxyServer = session.getCurrentProxyServerFromPool();

 // Log Server Description
 session.log( "Proxy Server: " + proxyServer.getDescription() );

getProxyServerPool

void session.getProxyServerPool ( )

Description

Get the proxy server pool object that allows proxies to be cycled through.

Parameters

This method does not receive any parameters.

Return Values

Returns true if there is an available proxy server pool.

Change Log

Version Description
4.5 Available for all editions.

Examples

Check If ProxyServerPool Object Exists

 // If ProxyServerPool does not exist
 // Create a new ProxyServerPool object.
 if ( !session.getProxyServerPool() )
 {
  // The ProxyServerPool object will
  // control how screen-scraper interacts with proxy servers.
 
  proxyServerPool = new ProxyServerPool();
 
  // We give the current scraping session a reference to
  // the proxy pool. This step should ideally be done right
  // after the object is created (as in the previous step).

  session.setProxyServerPool( proxyServerPool );
 }

getTerminateProxiesOnCompletion

boolean session.getTerminateProxiesOnCompletion ( )

Description

Determine whether proxies are set to be terminated when the scrape ends.

Parameters

This method does not receive any parameters.

Return Values

Returns true if proxies will be terminated when the scrape ends; otherwise, it returns false.

Change Log

Version Description
5.0 Available for all editions.

Examples

Check Termination Setting

// Log whether proxies are being terminated or not
if ( session.getTerminateProxiesOnCompletion() )
{
    session.log( "Anonymous Proxies are set to be terminated with the scrape." );
}
else
{
    session.log( "Anonymous Proxies are set to continue running after the scrape is finished." );
}

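See Also

  • setTerminateProxiesOnCompletion() [session] - Sets whether proxies should be terminated when the scrape ends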

getUseProxyFromPool

boolean session.getUseProxyFromPool ( )

Description

Determine whether proxies are being used from the proxy pool.

Parameters

This method does not receive any parameters.

Return Values

Returns true if a proxy pool is being used; otherwise, it returns false.

Change Log

Version Description
4.5 Available for all editions.

Examples

Turn On Proxy Pool Usage If Not Running

 // If proxies are not being used from the pool, turn pool usage on
 if ( !session.getUseProxyFromPool() )
 {
     session.setUseProxyFromPool( true );
 }

See Also

  • setUseProxyFromPool() [session] - Sets whether a proxy from the proxy pool should be used when making a request

setProxyServerPool

void session.setProxyServerPool ( ProxyServerPool proxyServerPool )

Description

Associate a proxy pool with a scraping session.

Parameters

  • proxyServerPool A ProxyServerPool object.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Associate Proxy Pool with Scraping Session

 // Create a new ProxyServerPool object. This object will
 // control how screen-scraper interacts with proxy servers.

 proxyServerPool = new ProxyServerPool();

 // We give the current scraping session a reference to
 // the proxy pool. This step should ideally be done right
 // after the object is created (as in the previous step).

 session.setProxyServerPool( proxyServerPool );

setTerminateProxiesOnCompletion

void session.setTerminateProxiesOnCompletion ( boolean terminateProxies )

Description

Manually set whether proxies are terminated when the scrape ends.

Parameters

  • terminateProxies Whether proxies should be terminated at the end of the session or not, as a boolean.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for all editions.

Examples

Make Sure Proxies are Deleted on Scrape Completion

// Check whether proxies are already set to be terminated
if ( session.getTerminateProxiesOnCompletion() )
{
    session.log( "Anonymous Proxies are set to be terminated with the scrape." );
}
else
{
    // Set proxies to be terminated with the scrape
    session.setTerminateProxiesOnCompletion( true );
    session.log( "Anonymous Proxies updated to be terminated with the scrape." );
}

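See Also

  • getTerminateProxiesOnCompletion() [session] - Returns whether proxies are set to be terminated when the scrape ends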

setUseProxyFromPool

void session.setUseProxyFromPool ( boolean useProxyFromPool )

Description

Set whether proxies from the proxyServerPool are used when making scrapeable file requests.

Parameters

  • useProxyFromPool Whether proxies in the proxyServerPool should be used, as a boolean.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Anonymize Scrapeable Files

 // Create a new ProxyServerPool object. This object will
 // control how screen-scraper interacts with proxy servers.

 proxyServerPool = new ProxyServerPool();

 // We give the current scraping session a reference to
 // the proxy pool. This step should ideally be done right
 // after the object is created (as in the previous step).

 session.setProxyServerPool( proxyServerPool );

 // ... Proxy Server Pool Setup ...

 // This is the switch that tells the scraping session to make
 // use of the proxy servers. Note that this can be turned on
 // and off during the course of the scrape. You may want to
 // anonymize some pages, but not others.
 session.setUseProxyFromPool( true );
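
Because the switch can be turned on and off during a scrape, one way to anonymize only some pages is to decide per URL before each request. The following is a rough sketch under stated assumptions: the class SelectiveAnonymizer, the interface PoolSwitch, and the host-based rule are all hypothetical and exist only for illustration. PoolSwitch mirrors the two session methods documented here so the logic can run outside screen-scraper; in a real script you would call them on session.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

public class SelectiveAnonymizer {

    /** Hypothetical stand-in mirroring the two session methods used below. */
    public interface PoolSwitch {
        boolean getUseProxyFromPool();
        void setUseProxyFromPool(boolean useProxyFromPool);
    }

    /**
     * Returns true when the URL's host is in the set of hosts to anonymize.
     * Hosts in the set are expected in lower case. URI.create throws
     * IllegalArgumentException if the URL is malformed.
     */
    public static boolean shouldAnonymize(String url, Set<String> anonymizedHosts) {
        String host = URI.create(url).getHost();
        return host != null && anonymizedHosts.contains(host.toLowerCase(Locale.ROOT));
    }

    /** Flips the switch only when the current setting differs from what the URL needs. */
    public static void applyFor(PoolSwitch session, String url, Set<String> anonymizedHosts) {
        boolean wanted = shouldAnonymize(url, anonymizedHosts);
        if (session.getUseProxyFromPool() != wanted) {
            session.setUseProxyFromPool(wanted);
        }
    }
}
```

Calling applyFor before each scrapeable file is requested keeps the decision in one place. The host list is just one possible rule; any condition that yields a boolean would work the same way.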

See Also

  • getUseProxyFromPool() [session] - Returns whether or not a proxy from the proxy pool will be used upon making a request