Programmable URL Tokens
I'm not sure how best to explain this, but it would be nice to initialize a URL token with a starting number (say "1") and an ending number (say "100"), so that a single script can take care of gathering all of the details links.
For example:
http://link.url/cat/page~#PAGE#~/index.php
Will run through all the numbers...
/page1/
/page2/
/page3/
...
/pageN/
If you are on PAGE 47, the "next page" URL is PAGE 48, simply incremented by one. There is no need to run a "next page" script again and add it to the "stack."
But we would need to know how far we want to go.
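For illustration only, here is a minimal sketch in plain Java of what that token expansion amounts to: substituting each number in a range for ~#PAGE#~ in the URL template. The class and method names (PageUrls, expand) are made up for this example and are not part of screen-scraper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of screen-scraper.
class PageUrls {
    // Returns one URL per page number from first to last, inclusive.
    static List<String> expand(String template, int first, int last) {
        List<String> urls = new ArrayList<>();
        for (int page = first; page <= last; page++) {
            urls.add(template.replace("~#PAGE#~", String.valueOf(page)));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Print the URLs for pages 1 through 3 of the example template.
        for (String url : expand("http://link.url/cat/page~#PAGE#~/index.php", 1, 3)) {
            System.out.println(url);
        }
    }
}
```

Because the ending number is passed in up front, nothing has to discover the "next page" link on each results page.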
Is there anything that could work like this:
session.setVariable("PAGE", "1", "30");
Where the last argument is the ending number?
Or a FOR loop? I don't know Java, but could we have one script do something like this?
// $i starts at 1; the loop ends after page 30 has been scraped
for ($i = 1; $i <= 30; $i++) {
    // Set the PAGE variable to $i
    session.setVariable("PAGE", $i);
    // Scrape the results page
    session.scrapeFile("Search Results");
    // The "Search Results" page then runs the details scraper, which then runs the export script, etc.
}
How can you break out of this script?
Hi,
I tried out the script listed here, but when I tried to break out of it, it kept loading the sessions until it reached the ending page. Is there something I can add to the script to make it stop if I need to break out of it?
session.setVariable( "CATEGORY", "all" );
/*
USE THE TWO INTEGERS BELOW TO SET THE PAGE NUMBERS TO SCRAPE
n = ENDING page
i = STARTING page
*/
//set starting page #
int i=1;
//set ending page #
int n=46;
//set session var to i
session.setVariable("PAGE", i);
//LOOP through
while (i < n) {
    //Send info to LOG
    session.log("+++Scraping page #" + i);
    //Scrape the results page
    session.scrapeFile("Search Results");
    //add 1 to i
    i++;
    //set session variable to new i
    session.setVariable("PAGE", i);
}
session.log("+++Scraping page #" + i);
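One general way to make a loop like this stoppable is to check a stop flag at the top of every pass and break out when it is set. The sketch below shows that pattern in plain Java. StoppableLoop, run, stopRequested and scrapePage are hypothetical names used only for this example. In a screen-scraper script the flag would have to come from the session; I believe session.shouldStopScraping() is meant for this, but treat that as an assumption and confirm it against the API documentation for your version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.IntConsumer;

// Hypothetical stand-in for the scraping loop; not screen-scraper code.
class StoppableLoop {
    // Calls scrapePage for each page from start to end (inclusive), but checks
    // stopRequested before every page and exits early once it returns true.
    // Returns the number of pages actually scraped.
    static int run(int start, int end, BooleanSupplier stopRequested, IntConsumer scrapePage) {
        int scraped = 0;
        for (int page = start; page <= end; page++) {
            if (stopRequested.getAsBoolean()) {
                break; // leave the loop as soon as a stop is requested
            }
            scrapePage.accept(page);
            scraped++;
        }
        return scraped;
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        // Simulate a stop request that arrives once three pages have been scraped.
        int count = run(1, 46, () -> seen.size() >= 3, page -> seen.add(page));
        System.out.println("Scraped " + count + " pages: " + seen);
    }
}
```

The check only takes effect between pages, so a page that is already being scraped will still finish before the loop exits.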
I would do the same thing
I would do the same thing like this:
while (session.getVariable("PAGE") <= 100)
{
    session.addToVariable("PAGE", 1);
    session.log("Starting page " + session.getVariable("PAGE"));
    session.scrapeFile("Next page");
}
I found a way
I found a way that worked for me. I was having trouble with the loop at first: it kept doing the last page number twice, whether there were 5 pages to scrape or 50.
So I came up with this.
session.setVariable( "CATEGORY", "all" );
/*
USE THE TWO INTEGERS BELOW TO SET THE PAGE NUMBERS TO SCRAPE
n = ENDING page
i = STARTING page
*/
//set starting page #
int i=1;
//set ending page #
int n=46;
//set session var to i
session.setVariable("PAGE", i);
//LOOP through
while (i < n) {
    //Send info to LOG
    session.log("+++Scraping page #" + i);
    //Scrape the results page
    session.scrapeFile("Search Results");
    //add 1 to i
    i++;
    //set session variable to new i
    session.setVariable("PAGE", i);
}
session.log("+++Scraping page #" + i);
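To check loop bounds like these without touching the site, it can help to trace which pages the loop body would scrape and what PAGE is left at afterwards. The plain-Java sketch below mirrors the while (i < n) loop above. LoopTrace and its methods are hypothetical names for this example only. It shows that the loop body scrapes pages i through n-1 and leaves PAGE set to n, so the last page is presumably scraped by whatever runs after this script finishes (that is an assumption about how the script is attached, not something stated in this thread).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical trace helper; it mirrors the loop above without scraping anything.
class LoopTrace {
    // Pages for which the loop body would call scrapeFile, given while (i < n).
    static List<Integer> pagesScrapedInLoop(int start, int end) {
        List<Integer> pages = new ArrayList<>();
        int i = start;
        while (i < end) {
            pages.add(i); // stands in for session.scrapeFile("Search Results")
            i++;
        }
        return pages;
    }

    // Value left in the PAGE variable once the loop has finished.
    static int pageAfterLoop(int start, int end) {
        return Math.max(start, end);
    }

    public static void main(String[] args) {
        System.out.println("Scraped inside the loop: " + pagesScrapedInLoop(1, 5));
        System.out.println("PAGE after the loop: " + pageAfterLoop(1, 5));
    }
}
```

With a condition of i <= n instead, the loop body itself would also scrape page n, which would account for a last page being handled twice if it is scraped again after the script ends.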