Programmable URL Tokens

I'm not sure how best to explain this, but it would be nice to initialize a URL token with a starting number (say 1) and an ending number (say 100), so that a single script can take care of gathering all of the details links.

For example:
http://link.url/cat/page~#PAGE#~/index.php

This would run through all the page numbers:

/page1/
/page2/
/page3/
...
/pageN/

If you are on page 47, the "next page" URL is simply page 48, the same URL with the number incremented. There is no need to run a "next page" script again and add it to the "stack."

But we would need to know how far we want to go.
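To make the idea concrete, here is a minimal sketch in plain Java of what expanding the token over a range would produce. It is not screen-scraper API: the class name PageUrlExpander and the method expand() are hypothetical, and only the ~#PAGE#~ token syntax and the example URL come from this post.

```java
// Sketch only: PageUrlExpander and expand() are hypothetical names,
// not part of screen-scraper.
class PageUrlExpander {
    // Token syntax taken from the example URL in this post.
    static final String PAGE_TOKEN = "~#PAGE#~";

    // Returns one URL per page number from start to end, inclusive.
    static String[] expand(String template, int start, int end) {
        if (end < start) {
            return new String[0];
        }
        String[] urls = new String[end - start + 1];
        for (int page = start; page <= end; page++) {
            urls[page - start] = template.replace(PAGE_TOKEN, String.valueOf(page));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Prints the URLs for pages 1 through 3.
        for (String url : expand("http://link.url/cat/page~#PAGE#~/index.php", 1, 3)) {
            System.out.println(url);
        }
    }
}
```

The ending number is passed in explicitly, which is the "how far do we want to go" part.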

Is there anything that could work like this:
session.setVariable("PAGE", "1", "30");
where the last argument is the ending number?

Or a for loop? I don't know Java, but could a single script do something like this?

// i starts at 1; the loop ends after page 30 has been scraped
for (int i = 1; i <= 30; i++) {
    // Set the PAGE variable to the current page number
    session.setVariable("PAGE", i);
    // Scrape the results page
    session.scrapeFile("Search Results");

    // The "Search Results" page then runs the details scraper, which then runs the export script, etc.
}

How can you break out of this script?

Hi,

I tried out the script listed here, but when I tried to break out of it, it continued to load the sessions until it reached the ending page. Is there something I can add to the script to make it stop if I need to break out of it?

// Initialize the session variables.
session.setVariable( "CATEGORY", "all" );

/*
        USE THE TWO INTEGERS BELOW TO SET THE PAGE NUMBERS TO SCRAPE
                        n = ENDING page
                        i = STARTING page
*/
//set starting page #
int i=1;
//set ending page #
int n=46;


//set session var to i
session.setVariable("PAGE", i);


//LOOP through
while (i < n) {
       
        //Send info to LOG
        session.log("+++Scraping page #" + i);
        //Scrape the results page
        session.scrapeFile("Search Results");
        //add 1 to i
        i++;
        //set session variable to new i
        session.setVariable("PAGE", i);
}
session.log("+++Scraping page #" + i);
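
One possible approach is to check a stop flag at the top of each pass through the loop and break when it is set. As far as I know, screen-scraper's session object offers a shouldStopScraping() method for this, but please confirm that against the API documentation for your version. The sketch below is plain Java so it can run on its own: the ScrapeSession interface, PagedScrape and FakeSession are hypothetical stand-ins, not screen-scraper classes.

```java
// Hypothetical stand-in for the scripting "session" object, so the
// sketch compiles outside screen-scraper.
interface ScrapeSession {
    void setVariable(String name, Object value);
    void scrapeFile(String name);
    void log(String message);
    boolean shouldStopScraping(); // assumed to reflect a stop request
}

class PagedScrape {
    // Scrapes pages start..end inclusive unless a stop is requested.
    // Returns the number of pages actually scraped.
    static int run(ScrapeSession session, int start, int end) {
        int scraped = 0;
        for (int page = start; page <= end; page++) {
            // Check the stop flag before doing any work for this page.
            if (session.shouldStopScraping()) {
                session.log("+++Stop requested before page #" + page);
                break;
            }
            session.setVariable("PAGE", page);
            session.log("+++Scraping page #" + page);
            session.scrapeFile("Search Results");
            scraped++;
        }
        return scraped;
    }
}

// Fake session for demonstration: reports a stop request once a fixed
// number of scrapes has happened.
class FakeSession implements ScrapeSession {
    private final int stopAfter;
    int scrapes = 0;
    Object lastPage;

    FakeSession(int stopAfter) { this.stopAfter = stopAfter; }

    public void setVariable(String name, Object value) {
        if ("PAGE".equals(name)) lastPage = value;
    }
    public void scrapeFile(String name) { scrapes++; }
    public void log(String message) { System.out.println(message); }
    public boolean shouldStopScraping() { return scrapes >= stopAfter; }
}
```

In the posted script, the equivalent change would be a single check as the first statement inside the while loop, something like: if (session.shouldStopScraping()) break;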

I would do the same thing

I would do the same thing like this:

session.setVariable("PAGE", 1);

while (session.getVariable("PAGE")<=100)
{
   session.addToVariable("PAGE", 1);
   session.log("Starting page " + session.getVariable("PAGE"));
   session.scrapeFile("Next page");
}

I found a way

I found a way that worked for me. At first I had trouble with the loop: it kept doing the last page number twice, whether there were 5 pages to scrape or 50.

So I came up with this.

// Initialize the session variables.
session.setVariable( "CATEGORY", "all" );

/*
        USE THE TWO INTEGERS BELOW TO SET THE PAGE NUMBERS TO SCRAPE
                        n = ENDING page
                        i = STARTING page
*/

//set starting page #
int i=1;
//set ending page #
int n=46;


//set session var to i
session.setVariable("PAGE", i);


//LOOP through
while (i < n) {
       
        //Send info to LOG
        session.log("+++Scraping page #" + i);
        //Scrape the results page
        session.scrapeFile("Search Results");
        //add 1 to i
        i++;
        //set session variable to new i
        session.setVariable("PAGE", i);
}
session.log("+++Scraping page #" + i);
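
For anyone adapting this, it may help to see how the loop bound affects which pages the loop body actually runs for. The sketch below is plain Java with hypothetical names (LoopBounds, countExclusive, countInclusive). It only counts loop passes and makes no assumption about what the "Search Results" scrapeable file does on its own.

```java
// Sketch only: LoopBounds and its methods are hypothetical names.
class LoopBounds {
    // Mirrors "while (i < n)": the body never runs for page n itself.
    static int countExclusive(int start, int end) {
        int passes = 0;
        int i = start;
        while (i < end) {
            passes++;
            i++;
        }
        return passes;
    }

    // Mirrors "while (i <= n)": the body also runs for page n.
    static int countInclusive(int start, int end) {
        int passes = 0;
        int i = start;
        while (i <= end) {
            passes++;
            i++;
        }
        return passes;
    }

    public static void main(String[] args) {
        System.out.println("exclusive: " + countExclusive(1, 46));
        System.out.println("inclusive: " + countInclusive(1, 46));
    }
}
```

With i = 1 and n = 46, the exclusive form runs the body 45 times (pages 1 to 45) and the inclusive form runs it 46 times. Which one is right depends on whether something else in the session already requests the final page, which would be consistent with the "last page twice" symptom described above.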