Scraping the javascript link

Hi Talents,

I am new to screen scraping. I have gone through the tutorials and found them very useful.

I need to get product details from [url]http://www.futurebazaar.com[/url]

While trying this, I got stuck when I saw that the product link is in JavaScript.

Looking closely at the tidied HTML, the product URL they are using is:
[b]

[/b]

Then, when creating the [b]extractor pattern[/b], I replaced it with:

[b]

[/b]

But I can't get a [b]dataset[/b].

How can I get data from futurebazaar?

How can I get going now?

Did I make a mistake there?

Please help.

by,
Alex

Thank you Scott

Thank you Scott,

I saw your example. It gives me a lot of confidence.

Then I tried it and found where I had made a mistake.

Thank you for your support.

Scraping the javascript link

alex,

I set up a little test session with what I had suggested and it works for me. Please download the session here and load it in screen-scraper to compare with what you have.

http://projects.screen-scraper.com/misc/Test-extractor-patterns_Scraping-Session.zip

You'll need to drop the test.html file right off your c: drive as you'll see in the session.

Hope this helps.

-Scott

Need Further Assistance

Thank you Scott.

Thank you for your support. :)

But I forgot to mention this in my original post:
I had tried with single quotes too, and I edited the token as well. But I didn't change the regex for single quotes. Thanks for the information.

But [b]even after changing the regular expressions, I can't get the dataset[/b] :(

Please guide me further.

Scraping the javascript link

alex,

It's very important to use regular expressions when you can. Here's how I would modify your extractor pattern.

<td valign="middle" width="40%" wrap="wrap"><a class="anc2" href="javascript:seeSingleItem('~@productkey@~')">~@producttitle@~</a> <br />
</td>

Note how the single quotes sit in the pattern itself, directly around the productkey token. Now, for the regular expressions.

productkey:

[^']*
(match zero or more characters that are not single quotes, i.e. everything between the two single quotes)

producttitle:

[^<>]*
(match zero or more characters that are not < or >, i.e. the text between the opening and closing HTML tags)
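
If you want to check the two expressions outside screen-scraper first, here is a minimal Python sketch. It is only an approximation of what the extractor pattern does, not screen-scraper's own engine: the sample row, the product key 12345, and the names SAMPLE_HTML, PRODUCT_LINK and extract_products are all made up for illustration, and the real page markup may differ.

```python
import re

# Hypothetical sample row modeled on the extractor pattern above; the real
# page markup may differ.
SAMPLE_HTML = (
    '<td valign="middle" width="40%" wrap="wrap">'
    '<a class="anc2" href="javascript:seeSingleItem(\'12345\')">'
    'Sample Product</a> <br />\n'
    '</td>'
)

# The two tokens become named groups that use the same character classes:
#   productkey   -> [^']*   (anything up to the closing single quote)
#   producttitle -> [^<>]*  (anything up to the next tag)
PRODUCT_LINK = re.compile(
    r"""href="javascript:seeSingleItem\('(?P<productkey>[^']*)'\)">"""
    r"""(?P<producttitle>[^<>]*)</a>"""
)


def extract_products(html):
    """Return one dict per matching product link (illustrative helper)."""
    return [match.groupdict() for match in PRODUCT_LINK.finditer(html)]


if __name__ == "__main__":
    for row in extract_products(SAMPLE_HTML):
        print(row["productkey"], row["producttitle"])
```

If this matches your sample HTML but the extractor pattern still returns no dataset, the difference is likely in the surrounding markup (whitespace, attributes) rather than in the two regular expressions.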

Give this a try and let us know how it goes.

-Scott