screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

XmlWriter vs. JDOM

Hi

I have been looking at the XmlWriter in screen-scraper. It is rather limited compared to JDOM. Isn't it possible to import JDOM into screen-scraper? Does XmlWriter have advantages I have overlooked?

Hans

Saving a web page in order to render the page later as a PDF with graphics

I need to print the web page being scraped to a PDF with graphics. My preferred approach would be to perform a "Save As" similar to IE or Firefox, placing the supporting files in a sub-folder and referencing them from there.

Thank you for your assistance.

Help with scraping white text

Hey all, great little program...

I am currently dealing with the below bit of scraped HTML manually, but if anyone has an idea to make life easier, that would be great. I receive around 25 results each week for the area I am interested in, and then type these property prices manually into a database against the advertised property (scraped from another website).

Scrape request:

GET /propertydata/vic/BORONIA/index.html HTTP/1.1
Cookie: PASSPORT=c3bbb7fb199ef30a319c8e0ef139002c
Host: realestateview.com.au

Snip of two records:

resolved url and 404 error

I'm new to screen-scraper and programming, but trying to make this work. I'm trying to apply Tutorial 2 to a different website, but I'm having trouble because the URL is the same for the first and second results pages, and I can't seem to find a variable that starts my extraction pattern.
The URL is this: http://www.swoopo.com/brw/vouchers_58.html?pge=10&ast=3

And this is the error code:

running scripts in the workbench in v4.5

I was previously able to run scripts by right-clicking on one and choosing "Run Script". I can't seem to find a similar option anywhere - was it removed?

thx
Joshua

Analytics

When you scrape a site with screen-scraper, will it cause a spike in the analytics for the site you are scraping? For example, I know that if you scan a site using Xenu to get all the pages on it, this does not show up in analytics (Google Analytics). Likewise, when other spiders come through a site, they are usually not recorded by analytics either. So how does screen-scraper behave? When I scan a site, will it cause a huge spike in recorded traffic?

Thanks.

first scrape - odd results

I'm just starting to use screen-scraper, so bear with me.

I have generated my scrapeable file using the HTTP transaction below:

http://www.bet24.com/bet24NetWeb/games.jsp?rl=1&&s=Football&t=g0101&t=g0...

When I run a scraping session, part of the log reads:

Unable to match dollar sign in regular expressions

I'm running into a problem where I need to be able to extract a dollar sign in an extractor pattern. As far as I can tell it should be as simple as setting the regular expression to "\$". It just doesn't seem to be working for me...

I've tried matching a dollar sign in about the simplest example I could think of:

 This: $ is a dollar sign

With the extractor pattern of:

 This: ~@DOLLARSIGN@~ is a dollar sign
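
As a sanity check outside screen-scraper, the sketch below uses plain java.util.regex to confirm that an escaped dollar sign matches the sample text literally. The class and method names are made up for illustration, and the capture group only stands in for the ~@DOLLARSIGN@~ token; how screen-scraper itself compiles an extractor pattern may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DollarSignMatch {
    // Returns the text captured where the token would sit, or null if nothing matches.
    // (Hypothetical helper; the group stands in for the ~@DOLLARSIGN@~ token.)
    static String extractDollarSign(String input) {
        // "\\$" in Java source is the regex \$, which matches a literal dollar sign.
        // An unescaped $ would instead be treated as an end-of-input anchor.
        Pattern pattern = Pattern.compile("This: (\\$) is a dollar sign");
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractDollarSign("This: $ is a dollar sign"));
    }
}
```

If this matches but the extractor pattern does not, the difference probably lies in how the regular expression is entered for the token rather than in the expression itself.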

POST data exceeded maximum length and was truncated

http://www1.dhcr.state.ny.us/BuildingSearch/Default.aspx is the site I want to scrape. The search has to be made using a zip code.
The zip code search page has two dropdown lists: one for county and one for zip code. Each county has its own list of zip codes. I want the results for all of them.

Using your New Post for: Next Page - Memory Conscious, I got this error

An error occurred while processing the script: Next Page
The error message was: Encountered "( \"OFFSET\" , ( currentPage - 1 ) * offsetStep + initialOffset ;" at line 18, column 24.

In the code:
session.setVariable("OFFSET", (currentPage - 1) * offsetStep + initialOffset;

Where does OFFSET come from and where is it used...

Thanks,
Clarence
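
The parse error above is consistent with the quoted line being short one closing parenthesis: the call to session.setVariable( is opened but never closed before the semicolon. A minimal sketch of the balanced call follows. The Session class here is only a hypothetical stand-in so the example runs outside screen-scraper, and the page and step values are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class NextPageOffset {
    // Hypothetical stand-in for screen-scraper's session object, so this runs on its own.
    static class Session {
        private final Map<String, Object> variables = new HashMap<>();
        void setVariable(String name, Object value) { variables.put(name, value); }
        Object getVariable(String name) { return variables.get(name); }
    }

    // Same arithmetic as the quoted line: the offset of the first record on a given page.
    static int offsetFor(int currentPage, int offsetStep, int initialOffset) {
        return (currentPage - 1) * offsetStep + initialOffset;
    }

    public static void main(String[] args) {
        Session session = new Session();
        int currentPage = 3;    // assumed example value
        int offsetStep = 20;    // assumed number of results per page
        int initialOffset = 0;  // assumed offset of the first page

        // Note the closing parenthesis before the semicolon, which the quoted line lacks.
        session.setVariable("OFFSET", offsetFor(currentPage, offsetStep, initialOffset));
        System.out.println(session.getVariable("OFFSET")); // prints 40
    }
}
```

As for where OFFSET comes from: in this line it is simply a session variable created by the script itself. Presumably it is then referenced by whichever request parameter controls paging on the next-page scrapeable file.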