Repeatable scraping session
If you need your scraping session to run multiple times in succession, consider this script. It repeats until the clock reaches the "quitTime" specified (24-hour clock) or the number of runs reaches "maxRuns", whichever comes first. To quickly (if crudely) disable the "maxRuns" limit, set it to 0 or a negative number. To disable the time constraint in "quitTime", give it an hour of 24 or greater (for example, "24:00" or "99:00"). The time check is a plain string comparison, so keep the value in "HH:mm" form: a longer string such as "123412:43" sorts before "12:00" and would stop the loop at noon.
// Interpreted Java
String toRun = "first scrapeableFile name";
String quitTime = "14:43";
int maxRuns = 5;
/* No need to edit below here ----------------------------------- */
import java.text.SimpleDateFormat;
import java.util.Calendar;
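// Zero-pad times such as "9:45" so the string comparison below works.
if (quitTime.length() == 4)
    quitTime = "0" + quitTime;
// Repeat while the current "HH:mm" time sorts before quitTime.
for (int i = 1; (new SimpleDateFormat("HH:mm")).format(Calendar.getInstance().getTime()).compareTo(quitTime) < 0; i++)
{
    session.scrapeFile(toRun);
    // Stop after maxRuns runs (never true when maxRuns is 0 or negative).
    if (i == maxRuns)
        break;
}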
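The time check in the script is easier to reason about when pulled out on its own. The following is a minimal sketch in plain Java, not part of the original script: the class name `QuitTimeCheck` and the methods `normalize` and `isBeforeQuitTime` are hypothetical names chosen for illustration. It shows why the zero-padding matters and why "24:00" never stops the loop.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

class QuitTimeCheck {
    // Pads a four-character time such as "9:45" to "09:45" so it lines up
    // with the "HH:mm" clock format (same rule as the script).
    static String normalize(String quitTime) {
        return quitTime.length() == 4 ? "0" + quitTime : quitTime;
    }

    // True while the clock string sorts before the quit time.
    // This is a plain string comparison, exactly as in the script.
    static boolean isBeforeQuitTime(String now, String quitTime) {
        return now.compareTo(normalize(quitTime)) < 0;
    }

    public static void main(String[] args) {
        String now = new SimpleDateFormat("HH:mm").format(new Date());
        System.out.println(now + " before 14:43? " + isBeforeQuitTime(now, "14:43"));
        // Every clock value from "00:00" to "23:59" sorts before "24:00",
        // so this check is always true and the time limit is disabled.
        System.out.println(now + " before 24:00? " + isBeforeQuitTime(now, "24:00"));
    }
}
```

Without the padding, "10:00" would sort before "9:45" because the character '1' comes before '9', and the loop would keep running past the intended quit time.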
scraper on 07/16/2010 at 5:00 pm
Comments
This script has errors!
A problem with this is that it will ONLY run from within a scraping session; it cannot be run on its own to make a session run over and over.
Here is a snippet that WILL run a session over and over and can be run independently:
for( int i = 0; i < 50; i++ )
{
    runnableScrapingSession = new com.screenscraper.scraper.RunnableScrapingSession( "MyScrapingSession" );
    runnableScrapingSession.scrape();
}
Hope someone finds this useful!
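For anyone who wants the "quitTime" and "maxRuns" limits from the original post around a loop like this one, here is a rough sketch in plain Java. It is only an illustration under stated assumptions: `BoundedRepeater`, `runRepeatedly`, and the `clock` parameter are hypothetical names, and a plain `Runnable` stands in for whatever actually starts the scraping session.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.function.Supplier;

class BoundedRepeater {
    // Runs the job until the clock reaches quitTime or maxRuns runs have
    // completed, whichever comes first. Returns the number of completed runs.
    // A maxRuns of 0 or less never matches, so only the time limit applies.
    static int runRepeatedly(Runnable job, int maxRuns, String quitTime, Supplier<String> clock) {
        // Zero-pad "9:45" to "09:45" so the string comparison works.
        String limit = quitTime.length() == 4 ? "0" + quitTime : quitTime;
        int runs = 0;
        while (clock.get().compareTo(limit) < 0) {
            job.run();
            runs++;
            if (runs == maxRuns) {
                break;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Wall clock formatted as "HH:mm", as in the original script.
        Supplier<String> wallClock = () -> new SimpleDateFormat("HH:mm").format(new Date());
        // Placeholder job (assumption): the real body would start the session.
        Runnable job = () -> System.out.println("one scraping run would happen here");
        // "24:00" disables the time limit, so this stops after three runs.
        int runs = runRepeatedly(job, 3, "24:00", wallClock);
        System.out.println("completed runs: " + runs);
    }
}
```

Passing the clock in as a parameter is just a convenience for trying the limits out with fixed times; the wall-clock supplier in `main` behaves like the original script.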
This code didn't work for me
This code didn't work for me, and I'm not sure what I am doing wrong.