screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

Warning! Received a status code of: 500

Hi,

I tried to scrape the cars.com site. I successfully scraped the search list page, but while scraping the details page I passed a value for one of the fields, named Criteria, and it has a value like the one below

delay for website load time

Hi,

I am working on a very slow connection and am worried that the page load times are affecting my scrape data quality.

Is it possible to somehow insert a delay between when a page is requested and when a scraping / pattern matching script is run to allow the page to load fully? If so, it'd be great if someone can provide the snippet of Java to do this (and explain when and how to invoke it) - as I'm not a seasoned programmer.

Thank you
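
A minimal sketch of the kind of pause asked about, in plain Java. `Thread.sleep` is standard Java; the names `PauseExample` and `pause` are made up for illustration, and where such a call would go in a screen-scraper script (for example, in a script that runs before the pattern-matching script) is an assumption rather than something confirmed in this thread.

```java
public class PauseExample {

    /**
     * Blocks the current thread for roughly the given number of
     * milliseconds and returns how long it actually waited.
     */
    public static long pause(long millis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still see it.
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Hypothetical delay of two seconds before the next step runs.
        long waited = pause(2000);
        System.out.println("Waited about " + waited + " ms");
    }
}
```

Note that a pause like this only delays whatever runs next; by itself it does not change what the server has already sent back.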

Error When Writing to MySQL Database

I am getting the following error message when writing to a MySQL Database:

The error message was: class bsh.EvalError (line 25): .replace ( "'" , "\\'" ) -- Attempt to invoke method replace on null value

This is the script that is running after each pattern match:

//Import the entire java.sql package
import java.sql.*;
import java.util.Date;
import java.text.DateFormat;
import java.text.SimpleDateFormat;

//Set up a connection and a drivermanager.
Class.forName("com.mysql.jdbc.Driver").newInstance();
Connection conn;
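
The error above says `replace` was invoked on a null value, so presumably one of the values being escaped was never set for that pattern match. A minimal sketch of a null-safe version of the same escaping step, in plain Java; the names `SqlText` and `escapeQuotes` are hypothetical, and treating a missing value as an empty string is an assumption about what the script should do.

```java
public class SqlText {

    /**
     * Escapes single quotes the same way the script does,
     * but returns an empty string instead of failing on null.
     */
    public static String escapeQuotes(String value) {
        if (value == null) {
            return "";
        }
        // Literal replacement: each ' becomes \'
        return value.replace("'", "\\'");
    }

    public static void main(String[] args) {
        System.out.println(escapeQuotes("O'Brien"));
        System.out.println("[" + escapeQuotes(null) + "]");
    }
}
```

Using a `java.sql.PreparedStatement` with `?` placeholders instead of building the SQL string by hand would avoid the manual escaping altogether.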

Problem logging on to site - I don't see any POST data

Hi,

I've run through the tutorials and everything worked just fine. This product seems to be perfect for what I want to achieve, but now that I'm trying to scrape a 'real' site I'm running into problems logging in.

The site in question is http://www.lambaplc.com/ and I'm able to log in with my credentials whilst Proxy is running, but I don't see any POST data. I've also run the HTTP headers add-on for Firefox but couldn't see anything useful.

Next Link extractor pattern

Hi friends

I am using the Basic edition of screen-scraper.

I am following the steps in the E-Commerce Site tutorial

while scraping a job portal website. The error occurred when creating the Next Link extractor pattern.

It shows only one DataSet record when I click the Test Pattern button.

The URL is

Next >>

I extracted a portion of it, i.e.,

Iteration problem with research scrape

I'm using the example script of the "memory conscious next page" to scrape case law from a website for quantitative research. For some reason the VOLGPAGE variable does not increase from its initial value and the page sequence never gets going. To my knowledge I have not changed anything that should interfere with the logic in the example below. The variable HAS_NEXT_PAGE is saved to a session variable, and I am calling the script "after each pattern match" on the Search result page and the following Next search result page.

pagination code missing

Hi friends,

While creating the "Next Link" extractor pattern, I am unable to find the pagination code in my last response tab.

When I created the scrapeable file I found the code, but after initializing the script the error occurred.

multiple web sites scraping

Hi friends,

I am scraping a job portal website. It lists job details, but each job's details come from a different website, i.e., the portal only provides a link for each job. When I click a link it goes to another website, so each link gives me a different kind of detail page.

How can I scrape all of these details from this kind of website?

Please provide the required information and the procedure.

addToVariable (basic edition)

The organization that I volunteer for added some laptops last year, and at that time I upgraded to basic 5.5. But I've just come to run some scripts for them, and addToVariable is no longer in scope:

"The error message was: Exception (line 75): RevScript12: The "addToVariable" method is not available in this edition of screen-scraper.-- Method Invocation session.addToVariable"

That makes things rather difficult for us. I see this change occurred in 4.5 (?).
Is there any alternative increment function in basic? I can't see one :-(

thanks
J.
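
A minimal sketch of incrementing a stored value by hand: read it, parse it, add to it, and write it back. To keep the example runnable on its own, a plain `Map` stands in for the session's variables; in a screen-scraper script the read and write would presumably go through `session.getVariable` and `session.setVariable`, but whether those cover this case in the basic edition is an assumption. The names `ManualIncrement` and `addTo` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ManualIncrement {

    /**
     * Adds 'amount' to the numeric value stored under 'name' and
     * stores the result back as a string. A missing value counts as 0.
     */
    public static int addTo(Map<String, Object> vars, String name, int amount) {
        Object current = vars.get(name);
        int value = (current == null)
                ? 0
                : Integer.parseInt(current.toString().trim());
        int updated = value + amount;
        vars.put(name, String.valueOf(updated));
        return updated;
    }

    public static void main(String[] args) {
        // Stand-in for session variables (hypothetical variable name).
        Map<String, Object> vars = new HashMap<>();
        vars.put("PAGE", "1");
        addTo(vars, "PAGE", 1);
        System.out.println("PAGE is now " + vars.get("PAGE"));
    }
}
```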

Scrape only new content

Hi, is there a way to scrape only new content?

I need to scrape some URLs daily, but each URL has a table with a lot of rows, so it takes a while. Only a few rows are added daily, so if I could scrape just the new content it would be very quick.

Is it possible? How?
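
One possible approach, sketched in plain Java: remember a key for every row already processed and skip rows whose key has been seen before. The names `NewRowFilter` and `keepNew` are hypothetical, and it is assumed that each row has some stable identifying value (an ID or a date plus title, say). The page itself would still have to be requested; this only avoids reprocessing old rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NewRowFilter {

    /**
     * Returns only the keys not yet present in 'seen', in their
     * original order, and records them in 'seen' as a side effect.
     */
    public static List<String> keepNew(List<String> rowKeys, Set<String> seen) {
        List<String> fresh = new ArrayList<>();
        for (String key : rowKeys) {
            // Set.add returns true only when the key was not already there.
            if (seen.add(key)) {
                fresh.add(key);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        // Keys remembered from an earlier run (hypothetical values).
        Set<String> seen = new HashSet<>(Arrays.asList("row-1", "row-2"));
        List<String> today = Arrays.asList("row-1", "row-2", "row-3");
        System.out.println(keepNew(today, seen));
    }
}
```

For this to work across daily runs, the set of seen keys would need to be saved somewhere between runs, such as a file or a database table.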