screen-scraper public support

Mapping product numbers

Hi

I have been looking at screen-scraper for just a few days and am planning to use it for scraping e-commerce sites. I have a bit of experience with Java, and after trying the tutorials I think programming the scraper won't be a problem for me.

As one of the blog posts says, this is just one part of it. Another is finding the right pages and mapping the data together. I think these two problems will take most of my time. I thought of one way to do it.

H_Lambert on 03/20/2009 at 10:03 am

ssv45 on linux

testing JVM in /usr ...
Warning: Cannot convert string "-b&h-luxi sans-medium-r-normal--*-140-*-*-p-*-iso8859-1" to type FontStruct
Warning: Cannot convert string "-arphic-ar pl shanheisun uni-medium-r-normal--*-*-*-*-p-*-iso10646-1" to type FontStruct
Warning: Cannot convert string "-arphic-ar pl uming uni-medium-r-normal--*-*-*-*-p-*-iso10646-1" to type FontStruct
Warning: Cannot convert string "-sazanami-gothic-medium-r-normal--*-140-*-*-c-*-jisx0208.1983-0" to type FontStruct

max on 03/19/2009 at 8:47 am

remote scraping sessions called from beanshell

I seem to be able to call a remote session ok and pass session variables back and forth as long as it's not in lazy mode.

shadders on 03/19/2009 at 7:06 am

couple of questions

My session stores each search result in its own .htm file. It also creates an Overview.htm with each search result listed and linked.

I need the date (dd-mm-yyyy-hh:mm) in the title of the Overview.htm, so I create Date() in the Initialize Session script.
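A minimal sketch of the formatting step, assuming plain `java.text.SimpleDateFormat` is available to the script. The class name `OverviewTitle`, the method `buildTitle`, and the "Overview " prefix are hypothetical, chosen only for illustration. Note that the pattern letters are case-sensitive, so the format written above as dd-mm-yyyy-hh:mm has to be spelled `dd-MM-yyyy-HH:mm`.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class OverviewTitle {
    // MM = month, mm = minutes, HH = hour on a 24-hour clock.
    // Lowercase "mm" in the month position would print minutes instead.
    static final String PATTERN = "dd-MM-yyyy-HH:mm";

    // Builds a title string such as "Overview 18-03-2009-05:14".
    // The "Overview " prefix is an assumption, not taken from the post.
    static String buildTitle(Date when) {
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, Locale.ROOT);
        return "Overview " + fmt.format(when);
    }

    public static void main(String[] args) {
        System.out.println(buildTitle(new Date()));
    }
}
```

In an Initialize Session script the resulting string would presumably be kept in a session variable (for example via `session.setVariable` under a name of your choosing) so that the script writing Overview.htm can read it later. A colon is fine in an HTML title but not in a Windows file name, so a different separator would be needed if the same string ends up in a file name.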
Noelle showed us this:

r0tzl0effel on 03/18/2009 at 5:14 am

session's import question in v4.5

hi guys,

When I import a session by copying it into another ss installation's import directory and run it from the command line, I sometimes find that ss does not completely parse the scripts included in the session.

I haven't met a similar problem in v4.0, where the import parsing works very well.

Do I need to change some properties in v4.5, or can sessions only be imported through the GUI?

Thanks in advance

will

will on 03/18/2009 at 3:00 am

7 comments

multi-threading questions

I'm writing a project to scrape some very large forums.

I have one scraping session which collects all the config data and pretty much sets everything up for the main scraping exercise which is the posts.

I want to scrape the posts in several threads at once. The number of threads will be set by a session var in the init script (along with lots of other parameters) and basically I'm just planning to use an iterative loop to check if each thread is finished then spawn another one if it is...
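As a rough sketch of the plan described above, a fixed-size thread pool from `java.util.concurrent` does the "check whether a thread is finished, then spawn another" bookkeeping itself, so the polling loop is not strictly needed. Everything named here (`BoundedScrapePool`, `scrapeAll`, `scrapeJob`, the job strings) is hypothetical, and how a job actually launches a scraping session inside screen-scraper is deliberately left as a placeholder.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedScrapePool {

    // Runs one job per entry, never more than maxThreads at the same time.
    // Returns how many jobs had finished when the method returned.
    static int scrapeAll(List<String> jobs, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger finished = new AtomicInteger();

        for (String job : jobs) {
            // Jobs beyond maxThreads wait in the pool's queue until a
            // worker thread becomes free.
            pool.execute(() -> {
                scrapeJob(job);
                finished.incrementAndGet();
            });
        }

        pool.shutdown(); // accept no new jobs, let queued ones run
        try {
            // The one-hour limit is an arbitrary assumption.
            pool.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return finished.get();
    }

    // Hypothetical placeholder: the real version would scrape one
    // batch of posts.
    static void scrapeJob(String job) {
        // intentionally empty in this sketch
    }
}
```

The `maxThreads` argument corresponds to the thread count that the init script would hold in a session variable.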

shadders on 03/17/2009 at 1:58 am

godaddy - am I dreaming?

I realise my chances of being able to make this happen aren't real good but I'm trying it anyway.

I'm trying to install on a godaddy Linux deluxe shared hosting account... I can run the install successfully in an ssh session.

If I try to run it using the link I get the following:

shadders on 03/16/2009 at 2:49 am

'font' gets replaced with 'span'

I'm getting some strange results when I examine my last scraped data.

I am scraping two pages from a site, which are identical apart from being for two different products. One gives:

<h2><b>Price:</b> <span style="color: 990000">Â£1349.00</span></h2>

The other gives:

<h2><b>Price: </b><font color="990000">£239.00</font></h2>

burgesst on 03/13/2009 at 10:51 am

The response exceeded the maximum length and was truncated. If you'd like to view the full response........

hi guys

"The response exceeded the maximum length and was truncated. If you'd like to view the full response, click the "Display Response in Browser" button, then view the source in your web browser."

What's going on with version ssv45? I got that message in "Last Response". Is this a bug or ...?

//Max

max on 03/12/2009 at 7:51 am

6 comments

Another Next Page Question - How do you make it go to the next page when there is no next button...

Thanks... You put me on the right track...

http://www.accesscb.net/results.php?page=2&&keywords=fax&location=&categ...

You are currently on results page 2
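One way to page without a next button, sketched here under assumptions, is to read the current page number from the "You are currently on results page N" text and request the same URL with `page=N+1`. The class and method names (`NextPage`, `nextPageNumber`, `withPage`) are hypothetical, and the sketch assumes the marker text and the `page=` parameter look exactly as quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NextPage {
    private static final Pattern CURRENT_PAGE =
            Pattern.compile("You are currently on results page (\\d+)");

    // Returns the number of the page after the current one,
    // or -1 if the marker text is not present in the HTML.
    static int nextPageNumber(String html) {
        Matcher m = CURRENT_PAGE.matcher(html);
        return m.find() ? Integer.parseInt(m.group(1)) + 1 : -1;
    }

    // Replaces the value of the page= query parameter and leaves
    // the rest of the URL untouched.
    static String withPage(String url, int page) {
        return url.replaceFirst("(?<=[?&]page=)\\d+", String.valueOf(page));
    }
}
```

The loop still needs a stopping rule, for instance ending when a page returns no results or when the marker text is missing; which of these the site supports is not something the excerpt shows.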

[email protected] on 03/10/2009 at 4:36 pm
