scrape page without URL parameters

Hi everyone,

I'm trying to scan a site where the same pattern I want to extract occurs on different subpages, but those subpages can't be accessed via a parameter. For example, the pattern occurs on pages

www.mysite.com/a
www.mysite.com/b
www.mysite.com/c
...

So whereas normally you'd access the "a", "b" and "c" via a parameter (mysite.com/search?itemid=a), this would involve changing the end of the URL string following the forward slash after the top level domain.

Is there any way to do that in screen-scraper? I can't see one, since the scraping process seems to require a variable parameter, and I can't seem to set part of the URL string as a parameter manually.

Failing that, is there some way to 'parameterise' URLs such as the one above so I get to page 'a' by calling an alternate URL with the usual ? or & parameter operator (e.g. mysite.com/&pageid=a)? I'm guessing not but there might be some convention I'm not aware of.

Any help is greatly appreciated.

Thanks

Dan

I want to do the same thing but do it dynamically

My scrape returns a Search Page that contains the URL of each detail page I would like to scrape next. The problem is that I can't seem to pass those dynamic results on to the script so that it scrapes each Details Page. I want the logic to work as follows:

Scrape Search Page
Receive Back URLs for each Detail Page
Iterate Through Each Detail Page and get needed data

Any help you could provide would be greatly appreciated!!

Thanks
JC
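
To make the three steps above concrete, here is a rough stand-alone Java sketch of the logic only. It does not use the screen-scraper API. The class name, the sample HTML and the "/detail/" link format are all made up for illustration, so the pattern would need adjusting to whatever the real Search Page looks like.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: pull detail-page links out of a search page, then visit each one.
class DetailLinkSketch {
    // Assumed link format: <a href="/detail/...">. Adjust to the real markup.
    static final Pattern DETAIL_LINK = Pattern.compile("<a\\s+href=\"(/detail/[^\"]+)\"");

    // Returns the absolute URL of every detail link found in the search page HTML.
    static List<String> extractDetailUrls(String searchHtml, String baseUrl) {
        List<String> urls = new ArrayList<>();
        Matcher m = DETAIL_LINK.matcher(searchHtml);
        while (m.find()) {
            urls.add(baseUrl + m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up search results page with two detail links.
        String searchHtml = "<ul>"
            + "<li><a href=\"/detail/101\">First</a></li>"
            + "<li><a href=\"/detail/102\">Second</a></li>"
            + "</ul>";

        // Iterate over the detail URLs; a real scrape would request each page here.
        for (String url : extractDetailUrls(searchHtml, "http://www.mysite.com")) {
            System.out.println("Would scrape: " + url);
        }
    }
}
```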

Sorry to Hijack but

Hi all, and sorry to hijack this thread,

Is it then possible to use the INDEX in your example as part of your extractor pattern text?

Regards

Shaun

You cannot use a variable in

You cannot use a variable in an extractor pattern. The pattern is applied to the HTML of the last response, and there is no way to substitute your value into it.

On your scrapeable file, just

On your scrapeable file, just set the URL to:

http://www.mysite.com/~#INDEX#~

And set the INDEX in a script or extractor. An example:
// Create an array containing the alphabet.
String[] alphabet = { "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z" };

// Basic for loop to get started. Use < rather than <= so the index
// never runs past the last element of the array.
for (int i=0; i<alphabet.length; i++)
{
        session.log("***Current letter is " + alphabet[i]);
        session.setVariable("INDEX", alphabet[i]);
        session.scrapeFile("File name");
}
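
If it helps to see what the token does to the URL, here is a small stand-alone Java sketch. It is only an illustration of the substitution idea, not screen-scraper's own code: the class and the fillToken and buildUrls helpers are made-up names, and it uses the lowercase "a", "b", "c" pages from the original question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how a ~#NAME#~ token in a URL gets filled in per iteration.
class TokenUrlSketch {
    // Replaces the ~#name#~ token in the template with the given value.
    static String fillToken(String template, String name, String value) {
        return template.replace("~#" + name + "#~", value);
    }

    // Builds one URL per value, mirroring one scrapeable-file request per loop pass.
    static List<String> buildUrls(String template, String name, String[] values) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            urls.add(fillToken(template, name, values[i]));
        }
        return urls;
    }

    public static void main(String[] args) {
        String[] pages = { "a", "b", "c" };
        for (String url : buildUrls("http://www.mysite.com/~#INDEX#~", "INDEX", pages)) {
            System.out.println(url);
        }
    }
}
```

The same idea should carry over to JC's case: the values do not have to come from a fixed array, they can be whatever was extracted from the previous page.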

thanks

Thanks Jason, I didn't know I could add a parameter like that; it works now. Cheers