can't extract multiple elements from XML page
I'm new to screen-scraper, and I'm trying to extract multiple elements from an XML page using the same pattern. In other words, the same XML tag is repeated on the page, but the values change. When I apply the pattern (Apply Pattern to Last Scraped Data), I see all the data in the small pop-up table. But when I run a script to write the data out, not all of it ends up in the text file; I don't get all my sequences. I'm only using the basic extract script given in the tutorial, so:
- Does the extract script need modifying?
- Or can screen-scraper not do this?
Many thanks
There shouldn't be anything preventing your scrape from doing its job... Two suggestions:
1) In Basic edition, you can't use breakpoints, but if you alter your scrape so that it ends once it reaches the desired page, you should be able to click the 'Last Response' tab and then click the button to view that response in your browser. Compare that to the actual page in your browser. They should be the same, but they may not be.
2) If the pages above aren't the same, try using a script to print out some of your key variables just before the page request happens. For instance, if you're using a page number or offset variable in your request, make sure its value matches what your browser sends. The same goes for cookies: screen-scraper will track those for you, but you might have to hit the site's home page first in order to get the cookie in the first place.
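To give a rough idea of what suggestion 2 might look like, here is a minimal sketch in plain Java so that it runs on its own. The class name, the helper method, and the variable names PAGE and OFFSET are made up for illustration; they are not from your scrape. Inside a real screen-scraper script you would read each value with session.getVariable(...) and write the result with session.log(...) instead of System.out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestVariableDump {

    // Builds one line per variable so that missing (null) values stand out.
    static String describe(Map<String, Object> variables) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> entry : variables.entrySet()) {
            Object value = entry.getValue();
            sb.append(entry.getKey())
              .append(" = ")
              .append(value == null ? "<not set>" : value.toString())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical variable names. In a screen-scraper script the values
        // would come from session.getVariable("PAGE") and so on.
        Map<String, Object> variables = new LinkedHashMap<>();
        variables.put("PAGE", 2);
        variables.put("OFFSET", null);

        // In a screen-scraper script this would be session.log(...) instead.
        System.out.print(describe(variables));
    }
}
```

If a variable shows up as not set, or holds a different value from the one your browser sends, that is probably where the request goes wrong.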
If you're really stuck on this, you might want to download a trial of one of the higher editions of screen-scraper. You'll be able to use breakpoints (session.breakpoint()) in your scripts, and you can update to alpha versions via the main settings in screen-scraper. One of our recent additions is a feature where you can go to a scrapeableFile's "Last Request" tab and click a button that lets you compare everything with a page in your proxy list of recorded pages. It'll highlight any significant differences between your "good" proxy version and your scraping session's version. If things are missing, you'll certainly be able to tell using that feature. Without it, I'm afraid all I can do is guess at the problem.
Hope that helps,
Tim