Anonymization
Overview
The following methods help you set up an anonymous scraping session. If you are using your own proxy server pool, use these methods to let screen-scraper interact with and manage the pool. If you are using automatic anonymization, the only method you need is currentProxyServerIsBad, because screen-scraper manages the servers using the anonymization settings from your setup.
See an example of Anonymization via Manual Proxy Pools.
currentProxyServerIsBad
void session.currentProxyServerIsBad ( ) (professional and enterprise editions only)
Description
Remove the current proxy server from the proxy pool. This method is only used with anonymization; it indicates that the current server in the pool is bad and should be removed.
Parameters
This method does not receive any parameters.
Return Values
Returns void.
Change Log
Version | Description
4.5 | Available for professional and enterprise editions.
If you are using automatic anonymization or manual proxy pools, a new proxy server will be created as a result of the method call.
When checking whether a request has failed, do not rely on the HTTP status code (e.g., 404) alone, as status codes are not always accurate. It is recommended that you also scrape a known string (e.g., "Not found") from the response HTML to confirm what the status code reports.
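The check described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not part of the screen-scraper API: the class name ResponseCheck, the method isConfirmedBad, and the way the status code and HTML reach it are all hypothetical. It treats a response as bad only when an error status is backed up by the known string.

```java
class ResponseCheck {
    // Hypothetical helper (not a screen-scraper method): a response counts as
    // confirmed bad only when the status code signals an error AND the known
    // error string appears in the response HTML.
    static boolean isConfirmedBad(int statusCode, String html, String knownErrorText) {
        boolean errorStatus = statusCode >= 400;
        boolean textFound = html != null && html.contains(knownErrorText);
        return errorStatus && textFound;
    }

    public static void main(String[] args) {
        boolean bad = isConfirmedBad(404, "<h1>Not found</h1>", "Not found");
        // In a screen-scraper script, a true result here is the point at which
        // session.currentProxyServerIsBad() would be called.
        System.out.println("Confirmed bad: " + bad);
    }
}
```

Depending on the site, you may prefer to treat the known string as sufficient on its own, since some servers return an error page with a 200 status.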
Examples
Flag Proxy Server
// Indicates that the current proxy server is bad.
session.currentProxyServerIsBad();
getCurrentProxyServerFromPool
ProxyServer session.getCurrentProxyServerFromPool ( )
Description
Get the current proxy server from the proxy server pool.
Parameters
This method does not receive any parameters.
Return Values
Returns the current proxy server being used.
Change Log
Version | Description
4.5 | Available for all editions.
Examples
Write Proxy Server Description to Log
// Get Proxy Server
proxyServer = session.getCurrentProxyServerFromPool();
// Log Server Description
session.log( "Proxy Server: " + proxyServer.getDescription() );
getProxyServerPool
void session.getProxyServerPool ()
Description
Get the proxy server pool object that allows proxies to be cycled through.
Parameters
This method does not receive any parameters.
Return Values
Returns true if there is an available proxy server pool.
Change Log
Version | Description
4.5 | Available for all editions.
Examples
Check if ProxyServerPool Object Exists
// If a ProxyServerPool does not exist yet,
// create a new ProxyServerPool object.
if ( !session.getProxyServerPool() )
{
    // The ProxyServerPool object will
    // control how screen-scraper interacts with proxy servers.
    proxyServerPool = new ProxyServerPool();
    // We give the current scraping session a reference to
    // the proxy pool. This step should ideally be done right
    // after the object is created (as in the previous step).
    session.setProxyServerPool( proxyServerPool );
}
getTerminateProxiesOnCompletion
boolean session.getTerminateProxiesOnCompletion ( )
Description
Determine whether proxies are set to be terminated when the scrape ends.
Parameters
This method does not receive any parameters.
Return Values
Returns true if proxies will be terminated when the scrape ends; otherwise, it returns false.
Change Log
Version | Description
5.0 | Available for all editions.
Examples
Check Termination Setting
// Log whether proxies are being terminated or not
if ( session.getTerminateProxiesOnCompletion() )
{
    session.log( "Anonymous Proxies are set to be terminated with the scrape." );
}
else
{
    session.log( "Anonymous Proxies are set to continue running after the scrape is finished." );
}
getUseProxyFromPool
boolean session.getUseProxyFromPool ( )
Description
Determine whether proxies are being used from the proxy pool.
Parameters
This method does not receive any parameters.
Return Values
Returns true if a proxy pool is being used; otherwise, it returns false.
Change Log
Version | Description
4.5 | Available for all editions.
Examples
Turn On Proxy Pool Usage If Not Running
// If proxies are not being used from a pool, turn pool usage on
if ( !session.getUseProxyFromPool() )
{
    session.setUseProxyFromPool( true );
}
See Also
- setUseProxyFromPool() [session] - Sets whether a proxy from the proxy pool should be used when making a request
setProxyServerPool
void session.setProxyServerPool ( ProxyServerPool proxyServerPool )
Description
Associate a proxy pool with a scraping session.
Parameters
- proxyServerPool A ProxyServerPool object.
Return Values
Returns void.
Change Log
Version | Description
4.5 | Available for all editions.
Examples
Associate Proxy Pool with Scraping Session
// Create a new ProxyServerPool object. This object will
// control how screen-scraper interacts with proxy servers.
proxyServerPool = new ProxyServerPool();
// We give the current scraping session a reference to
// the proxy pool. This step should ideally be done right
// after the object is created (as in the previous step).
session.setProxyServerPool( proxyServerPool );
setTerminateProxiesOnCompletion
void session.setTerminateProxiesOnCompletion ( boolean terminateProxies )
Description
Set whether proxies are terminated when the scrape ends.
Parameters
- terminateProxies Whether proxies should be terminated at the end of the session or not, as a boolean.
Return Values
Returns void.
Change Log
Version | Description
5.0 | Available for all editions.
Examples
Make Sure Proxies are Deleted on Scrape Completion
// Check whether proxies are already set to be terminated
if ( session.getTerminateProxiesOnCompletion() )
{
    session.log( "Anonymous Proxies are set to be terminated with the scrape." );
}
else
{
    // Set proxies to be terminated with the scrape
    session.setTerminateProxiesOnCompletion( true );
    session.log( "Anonymous Proxies updated to be terminated with the scrape." );
}
setUseProxyFromPool
void session.setUseProxyFromPool ( boolean useProxyFromPool )
Description
Set whether proxies from a proxyServerPool should be used when making scrapeable file requests.
Parameters
- useProxyFromPool Whether proxies in the proxyServerPool should be used, as a boolean.
Return Values
Returns void.
Change Log
Version | Description
4.5 | Available for all editions.
Examples
Anonymize Scrapeable Files
// Create a new ProxyServerPool object. This object will
// control how screen-scraper interacts with proxy servers.
proxyServerPool = new ProxyServerPool();
// We give the current scraping session a reference to
// the proxy pool. This step should ideally be done right
// after the object is created (as in the previous step).
session.setProxyServerPool( proxyServerPool );
// ... Proxy Server Pool Setup ...
// This is the switch that tells the scraping session to make
// use of the proxy servers. Note that this can be turned on
// and off during the course of the scrape. You may want to
// anonymize some pages, but not others.
session.setUseProxyFromPool( true );
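Since the switch can be turned on and off during a scrape, one way to decide its value for each page is a small host check. The following is only a sketch in plain Java: the class AnonymizeDecision, the method shouldAnonymize, and the host list are hypothetical names introduced for illustration and are not part of screen-scraper.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class AnonymizeDecision {
    // Hypothetical helper: returns true when the URL's host is on the list of
    // hosts that should be requested through the proxy pool.
    static boolean shouldAnonymize(String url, List<String> hostsToAnonymize) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // malformed URL: nothing to match against
        }
        return host != null && hostsToAnonymize.contains(host.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("www.example.com");
        boolean useProxy = shouldAnonymize("https://www.example.com/search", hosts);
        // In a screen-scraper script, this value would be passed to
        // session.setUseProxyFromPool( ... ) before the scrapeable file is requested.
        System.out.println("Use proxy from pool: " + useProxy);
    }
}
```

Keeping the decision in one helper means the rule for which pages are anonymized lives in a single place rather than being repeated in each script.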
See Also
- getUseProxyFromPool() [session] - Returns whether or not a proxy from the proxy pool will be used upon making a request