Pausing SS to wait for results

I am scraping a site that first shows a temporary screen with the word "Processing..." flashing on it and then loads the results I want. Right now screen-scraper is scraping the processing screen and, of course, failing to match the extractor patterns written for the final results screen.

My question is: how can I tell screen-scraper to wait for the final page to load before scraping? Is there a way to specify a pause?

Thanks,

Joel

jgardner on 02/22/2008 at 5:19 pm

screen-scraper public support

Joel,

Not to ask too much of your time, but could you share a brief synopsis of what you had to do with the JavaScript AJAX calls to get screen-scraper to work with the site? Did you have to work out what the JavaScript was doing because the HTTP transactions recorded by the proxy didn't reveal enough?

This is the future, I'm afraid.

-Scott

swilsonmc on 03/05/2008 at 3:26 pm

Scott,

In this case, unfortunately, the final data is being retrieved with an AJAX call. I fiddled with replicating that call as another scrapeable file, but then found another way into the data I was looking for that usually bypasses the loading delay.

This should work for now.

Thanks for the response,

Joel

jgardner on 03/05/2008 at 12:07 am

Joel,

One correction: screen-scraper will not follow a redirect made via a meta tag in the HTML header (as opposed to the HTTP header). Suppose the browser is being redirected by code that looks like this:

<meta http-equiv="refresh" content="0;url=/newWebPage.html" />

You would need to extract out the "newWebPage.html" portion and use it in the URL of a subsequent scrapeable file.
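As a rough illustration of that extraction step outside of screen-scraper, here is a minimal Python sketch that pulls the target out of a meta refresh tag and resolves it against the page's address. The function name `meta_refresh_url` and the example.com base URL are made up for this example; inside screen-scraper you would do the same job with an extractor pattern.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Remembers the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta ... />) are routed here as well.
        if tag != "meta" or self.target is not None:
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0;url=/newWebPage.html".
        match = re.search(r"url\s*=\s*['\"]?([^'\";]+)",
                          attr.get("content") or "", re.IGNORECASE)
        if match:
            self.target = match.group(1).strip()


def meta_refresh_url(html, base_url):
    """Return the absolute redirect URL, or None if there is no meta refresh.

    Hypothetical helper: base_url is the address of the page that
    contained the tag, used to resolve relative targets.
    """
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    return urljoin(base_url, parser.target)
```

The returned URL is what you would then request as the next page, in the same way a subsequent scrapeable file would.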

-Scott

swilsonmc on 02/29/2008 at 4:37 pm

Joel,

If you're running the professional or enterprise edition, you have the session.pause() method available. However, I doubt pausing will give you what you want.

When a website pauses like this, it delivers the final content in one of a few possible ways.

1. It redirects the browser using a meta tag redirect. screen-scraper should follow this.
2. It redirects using an HTTP Header 302 "object has moved". Again, screen-scraper should follow.
3. It does not redirect; rather, it pulls in the new content using AJAX. This is when client-side JavaScript makes an HTTP request to the server from within the already-loaded page. screen-scraper does not always follow the request the way you would want it to (something we're working on). You need to go through the HTTP transactions you recorded with the screen-scraper proxy server, and read through the site's JavaScript, to find out what the server needs you to send in order for it to give you back what you want.
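For the third case, once you know which request actually returns the results, one option is to repeat that request until the "Processing..." text is gone. The Python sketch below only illustrates the idea and is not screen-scraper code: `wait_for_results` and its parameters are hypothetical, and `fetch` stands in for whatever function re-issues the request you identified in the proxy log. It only helps if the server eventually returns the results at that same request.

```python
import time


def wait_for_results(fetch, marker="Processing...", attempts=5,
                     delay_seconds=2.0, sleep=time.sleep):
    """Call fetch() until the returned page no longer contains marker.

    fetch is a caller-supplied function (an assumption of this sketch)
    that performs one request and returns the response body as a string.
    """
    for attempt in range(attempts):
        page = fetch()
        if marker not in page:
            return page  # the placeholder is gone, so this is the real page
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give the server time before retrying
    raise TimeoutError(
        "still seeing %r after %d attempts" % (marker, attempts))
```

Passing `sleep` in as a parameter keeps the waiting behavior replaceable, which is handy when trying the loop out against canned responses.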

I hope this helps.

-Scott

swilsonmc on 02/29/2008 at 1:24 pm
