screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

Automated screen scraping

Hi talents,

We have used screen-scraper Basic Edition to scrape two sites successfully and loaded that data into our portal.

It's all working well.

Now I need to automate my screen-scraping session so that it scrapes those sites every 24 hours and stores the new data in our portal.

Is that possible?

Do I need to buy another edition of screen-scraper?
Will it support this?

Guide me.

Thanks in advance,

Alex
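
For the "every 24 hours" part, here is a minimal sketch of a fixed-rate scheduler in plain Java. It does not answer the edition question and it is not screen-scraper code: `DailyScrapeScheduler` and the `scrapeJob` Runnable are hypothetical names, and the sketch assumes that whatever starts the scraping session and loads the portal can be wrapped in a Runnable.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of screen-scraper.
final class DailyScrapeScheduler {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    /** Runs the job once immediately, then again after every period. */
    ScheduledFuture<?> start(Runnable scrapeJob, long period, TimeUnit unit) {
        Runnable guarded = () -> {
            try {
                scrapeJob.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs,
                // so report the failure and keep the schedule alive.
                System.err.println("Scrape failed: " + e.getMessage());
            }
        };
        return executor.scheduleAtFixedRate(guarded, 0, period, unit);
    }

    /** Stops the schedule and interrupts a run that is in progress. */
    void stop() {
        executor.shutdownNow();
    }

    public static void main(String[] args) {
        DailyScrapeScheduler scheduler = new DailyScrapeScheduler();
        // Placeholder job: replace with the code that runs the scrape
        // and stores the new data in the portal.
        scheduler.start(
                () -> System.out.println("scrape sites, load portal"),
                24, TimeUnit.HOURS);
    }
}
```

The process has to stay running for this to work. An operating-system scheduler (cron, Windows Task Scheduler) is the usual alternative when the job can be started from a command line.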

Are hidden form fields updated in each session?

Hello,

Thanks for the great product!!

I'm scraping a website that uses a few hidden fields on a form to pass user information (website member ID, etc.). When I scrape the form I can see my information. My question is: in production, when another user's login is used and she runs the scraping session, will my user information be passed or hers?

The worst-case scenario would be that mine is passed, which would mean I have to scrape every user's information from this site, store it in a DB, and swap it in for each user.

Scrape files with 2 / 3 parameters in URL

Hi,
I have problems scraping files from URLs containing two or three different parameters.
URL examples:

test.com/test.php?page=1&cat=hotels&country=IP.html
test.com/test.php?page=2&cat=hotels&country=HS.html
test.com/test.php?page=1&cat=hotels&country=DF.html

- One is the page number, and that part is fine: I use the ~#PAGE#~ variable in a loop script with a scrape URL like

test.com/test.php?page=~#PAGE#~&cat=hotels&country=DF.html with loop code
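
One way to cover the remaining parameters is a nested loop that fills in every page and country combination. The sketch below is plain Java, not a screen-scraper script: the `~#COUNTRY#~` token is a hypothetical addition modelled on `~#PAGE#~`, the country codes are taken from the example URLs above, and `maxPage` is an assumed upper bound.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class ScrapeUrlBuilder {
    // Same shape as the URLs in the post; ~#COUNTRY#~ is an assumed token.
    static final String TEMPLATE =
            "test.com/test.php?page=~#PAGE#~&cat=hotels&country=~#COUNTRY#~.html";

    /** Builds one URL per country and page, pages running from 1 to maxPage. */
    static List<String> buildUrls(List<String> countries, int maxPage) {
        List<String> urls = new ArrayList<>();
        for (String country : countries) {
            for (int page = 1; page <= maxPage; page++) {
                urls.add(TEMPLATE
                        .replace("~#PAGE#~", Integer.toString(page))
                        .replace("~#COUNTRY#~", country));
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        for (String url : buildUrls(Arrays.asList("IP", "HS", "DF"), 2)) {
            System.out.println(url);
        }
    }
}
```

Inside a scraping session, the same nested loop would presumably set one variable per parameter before each scrape instead of building the strings by hand.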

Tutorial 1

Sorry to be a pain, but I have followed the instructions step by step and still have a problem when trying to run Tutorial 1 (I'm at the end, where it writes the data "Hello World!" out to a file).

I get this error message in my log.

Referencing results in extractor patterns

I'm trying to scrape one page, extract the data, then use the result to search a second page for data next to the result from the first.

e.g.

Another Instance of SS is already running?

Hi, I just installed ver 4 of ss on an Ubuntu 8.04 machine.

After finally setting the permissions for the conf and db files, I now get an error saying "Another instance of screen-scraper is already running. The application will now exit."

Details...
It was installed using sudo. (sudo $file ./setup_ss_basic.sh)

I set permissions on the conf and db files using sudo as well. (sudo chmod a+w )

Now, all I get is the error saying it's already running. How can that be? I've even tried rebooting and it still doesn't make a difference.

Redirect is Ruining Me

I am trying to scrape a store locator, but after the URL is resolved I am redirected to a different page that doesn't contain the data that I need. Any insight into why this is happening would be greatly appreciated!

Log output:

Strip HTML

Hi,

Some of the content I'm looking to scrape has some legacy HTML. Does anyone know of a bit of Java that I can add to a script to strip this out?

Some of the HTML is basic stuff, but some looks to be old MS Word tags, and there are occasionally some font tags.

rgds/alex
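
A minimal sketch of the kind of Java that could do this, using only `java.util.regex`. `HtmlStripper` is a hypothetical name and the approach is a swapped-in generic technique, not a screen-scraper feature. It assumes the markup is simple enough for regular expressions (no `>` characters inside attribute values) and it handles only the `&nbsp;` entity.

```java
import java.util.regex.Pattern;

// Hypothetical helper; a regex-based approximation, not a full HTML parser.
final class HtmlStripper {
    // Comments first, so MS Word conditional blocks disappear with their contents.
    private static final Pattern COMMENTS = Pattern.compile("(?s)<!--.*?-->");
    // Any remaining tag, including namespaced ones such as <o:p> and <font ...>.
    private static final Pattern TAGS = Pattern.compile("(?s)<[^>]*>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Returns the text with comments and tags removed and whitespace collapsed. */
    static String strip(String html) {
        if (html == null) {
            return "";
        }
        String text = COMMENTS.matcher(html).replaceAll(" ");
        text = TAGS.matcher(text).replaceAll(" ");
        text = text.replace("&nbsp;", " ");
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<p class=\"MsoNormal\"><font face=\"Arial\">Hello</font>"
                + " <b>world</b><o:p></o:p></p>";
        System.out.println(strip(html));
    }
}
```

Each tag is replaced with a space so that text in adjacent block elements does not run together. The trade-off is that a tag in the middle of a word also introduces a space.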

IF ELSE help

Hi,

I know this is a simple question but my script does not seem to be working.

I basically want to write out a variable if it has some data; if not, I want to write out some others.
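
A minimal sketch of that branch in plain Java, under the assumption that "has some data" means neither null nor blank. `FallbackWriter`, `hasData` and `chooseOutput` are hypothetical names. A frequent cause of this kind of if/else not working in Java is comparing strings with `==` instead of checking for null and emptiness.

```java
// Hypothetical helper for illustration only.
final class FallbackWriter {
    /** True when the value is neither null nor whitespace-only. */
    static boolean hasData(String value) {
        return value != null && !value.trim().isEmpty();
    }

    /**
     * Returns the primary value when it has data; otherwise joins
     * the fallback values that have data, separated by ", ".
     */
    static String chooseOutput(String primary, String... fallbacks) {
        if (hasData(primary)) {
            return primary.trim();
        } else {
            StringBuilder joined = new StringBuilder();
            for (String fallback : fallbacks) {
                if (hasData(fallback)) {
                    if (joined.length() > 0) {
                        joined.append(", ");
                    }
                    joined.append(fallback.trim());
                }
            }
            return joined.toString();
        }
    }

    public static void main(String[] args) {
        // Primary is blank, so the non-empty fallbacks are written instead.
        System.out.println(chooseOutput("  ", "first fallback", null, "second fallback"));
    }
}
```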

Problem with URL being changed (Variables are removed)

I'm having some problems with scraping Yahoo Hot Jobs. The URL isn't following the standard format when you search (where the search variables follow a '?').

For example, my search result is yielding this as the URL:

 http://hotjobs.yahoo.com/job-search-l-Pomona-CA-k-pomona%20valley%20hospital%20medical%20center-c-Healthcare-m-0-d-FT-d-PT-j-PERM-j-CONT-n-Pomona%20Valley%20Hospital%20Medical%20Center-h-pomona%20valley%20hospital%20medical%20center;_ylt=AjnfZ0O0ZNJXhTMeW8c.NTb6Q6IX