screen-scraper public support
Warning! Received a status code of: 500
Hi,
I tried to scrape the cars.com site. I successfully scraped the search list page, but while scraping the details page I passed the value for one of the fields, named Criteria, and it has a value like below
delay for website load time
Hi,
I am working on a very slow connection and am worried that the page load times are affecting my scrape data quality.
Is it possible to somehow insert a delay between when a page is requested and when a scraping / pattern matching script is run, to allow the page to load fully? If so, it'd be great if someone could provide the snippet of Java to do this (and explain when and how to invoke it), as I'm not a seasoned programmer.
Thank you
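One way to sketch this: screen-scraper's session object provides a pause method (session.pause(millis), invoked from a script run "Before file is scraped"), assuming your edition exposes it. The standalone sketch below uses Thread.sleep so it runs outside screen-scraper; the helper name is mine.

```java
// Minimal sketch of a fixed delay before extractor patterns run.
// In a screen-scraper script you would typically call session.pause(millis)
// from a script run "Before file is scraped"; Thread.sleep is used here
// only so the sketch runs standalone.
public class PauseSketch {
    // waits for roughly the given number of milliseconds
    static void pauseMillis(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        pauseMillis(200); // e.g. 2000 for a two-second delay on a slow connection
        System.out.println("paused ~" + (System.currentTimeMillis() - start) + " ms");
    }
}
```

Note that screen-scraper downloads the full HTTP response before extractors run, so a pause mainly helps with server-side throttling rather than browser-style rendering time.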
Error When Writing to MySQL Database
I am getting the following error message when writing to a MySQL Database:
The error message was: class bsh.EvalError (line 25): .replace ( "'" , "\\'" ) -- Attempt to invoke method replace on null value
This is the script that is running after each pattern match:
import java.sql.*;
import java.util.Date;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
//Set up a connection and a drivermanager.
Class.forName("com.mysql.jdbc.Driver").newInstance();
Connection conn;
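The EvalError above means the value being escaped is null when .replace is called, i.e. the session variable on line 25 of the script was never set for that record. A null guard fixes the immediate error; the helper below is a self-contained sketch (names are mine, not from the original script).

```java
// Null-safe escaping: the "Attempt to invoke method replace on null value"
// error comes from calling .replace() on a null session variable.
// Guard against null before escaping.
public class EscapeSketch {
    // returns an SQL-quote-escaped copy of s, or "" when s is null
    static String escapeQuotes(String s) {
        return (s == null) ? "" : s.replace("'", "\\'");
    }

    public static void main(String[] args) {
        System.out.println(escapeQuotes(null));      // empty string, no error
        System.out.println(escapeQuotes("O'Brien")); // O\'Brien
    }
}
```

A more robust fix is to use a java.sql.PreparedStatement with ? placeholders, which sidesteps manual quote escaping entirely.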
Problem logging on to site - I don't see any POST data
Hi,
I've run through the tutorials and everything worked just fine. This product seems to be perfect for what I want to achieve but now I'm trying to scrape a 'real' site I'm running into problems logging in.
The site in question is http://www.lambaplc.com/ and I'm able to log in with my credentials whilst the proxy is running, but I don't see any POST data. I've also run the HTTP headers add-on for Firefox but couldn't see anything useful.
Next Link extractor pattern
Hi friends
I'm using the screen-scraper Basic edition and am following the steps in the e-commerce site tutorial.
While scraping a job portal web site, the error occurred when creating the Next Link extractor pattern:
it shows only one DataSet record when I click the Test Pattern button.
The Url is
I had extracted the some portion i.e.,
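For reference, a next-link extractor pattern in screen-scraper marks the text to capture with a ~@TOKEN@~ token and anchors on the HTML around it; for example (the markup here is hypothetical, not from your site):

```
<a href="~@NEXT_PAGE@~">Next</a>
```

If the Test Pattern button matches only one record, the anchoring HTML around the token is usually either too specific or not actually repeated on the page.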
Iteration problem with research scrape
I'm using the example script of the "memory conscious next page" to scrape case law from a website for quantitative research. For some reason the VOLGPAGE variable does not increase from its initial value and the page sequence never gets going. To my knowledge I have not changed anything that should interfere with the logic in the example below. The variable HAS_NEXT_PAGE is saved to a session variable, and I am calling the script "after each pattern match" on the Search results page and the following Next search results page.
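A common cause of a page variable that never advances is reading it without writing the incremented value back. The sketch below shows only that increment step; a HashMap stands in for the screen-scraper session so it runs standalone (in a real script you would use session.getVariable / session.setVariable).

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the next-page increment step. A HashMap stands in for the
// screen-scraper session here; in a real script you would call
// session.getVariable("VOLGPAGE") and session.setVariable("VOLGPAGE", ...).
public class NextPageSketch {
    // reads VOLGPAGE, increments it, and writes it back
    static int bumpPage(Map<String, Object> session) {
        Object raw = session.get("VOLGPAGE");
        int page = (raw == null) ? 1 : Integer.parseInt(raw.toString());
        int next = page + 1;
        // without this write-back, the variable keeps its initial value forever
        session.put("VOLGPAGE", String.valueOf(next));
        return next;
    }

    public static void main(String[] args) {
        Map<String, Object> session = new HashMap<>();
        session.put("VOLGPAGE", "1");
        System.out.println(bumpPage(session)); // 2
        System.out.println(bumpPage(session)); // 3
    }
}
```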
pagination code missing
Hi friends,
While creating the "Next Link" extractor pattern, I'm unable to find the pagination code in my Last Response tab.
When I created the scrapeable file I found the code, but after initializing the script the error occurred.
multiple web sites scraping
hi friends,
I'm scraping a job portal web site. It lists job details, but each job detail comes from a different web site, i.e., they provide a link for each job. When I click any link it goes to another web site, and each link gives a different kind of detail page. How can I scrape all such details?
Please provide me the required info and the procedure.
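One common approach, sketched with stand-ins: extract each detail-page URL from the listing page, then request each URL as its own scrapeable file. In a screen-scraper script that usually means looping over the listing DataSet, calling session.setVariable with the link, and invoking session.scrapeFile for a details file whose URL references that variable. The code below only demonstrates that control flow; the method and URLs are hypothetical, not part of the screen-scraper API.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of iterating over extracted job links, one external site per link.
// In screen-scraper you would call session.setVariable("JOB_URL", url) and
// then session.scrapeFile("Job details") inside this loop; scrapeDetails
// below is only a stand-in that records the request.
public class MultiSiteSketch {
    // stand-in for requesting and scraping one details page
    static String scrapeDetails(String url) {
        return "scraped: " + url;
    }

    public static void main(String[] args) {
        List<String> jobLinks = Arrays.asList(
            "http://example-employer-a.com/job/123",
            "http://example-employer-b.com/vacancy/456");
        for (String url : jobLinks) {
            System.out.println(scrapeDetails(url));
        }
    }
}
```

Because each target site's markup differs, you would likely need a separate extractor pattern (or scrapeable file) per site layout.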
addToVariable (basic edition)
The organization that I volunteer for added some laptops last year, and at that time I upgraded to basic 5.5. But I've just come to run some scripts for them, and addToVariable is no longer in scope:
"The error message was: Exception (line 75): RevScript12: The "addToVariable" method is not available in this edition of screen-scraper.-- Method Invocation session.addToVariable"
That kind of makes things difficult for us; I see this change occurred in 4.5 (?).
Is there any alternative incrementation function in basic? I can't see one :-(
thanks
J.
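For what it's worth, the usual workaround where session.addToVariable isn't available is a manual get/parse/increment/set. The helper below is the pure increment step, runnable standalone; in a script you would wrap it as session.setVariable("PAGE", increment((String) session.getVariable("PAGE"), 1)) — the variable name is illustrative.

```java
// Manual increment replacing session.addToVariable: read the current
// value, parse it, add the step, and return the new value as a string
// (screen-scraper session variables are commonly stored as strings).
public class IncrementSketch {
    // returns current + step, treating a null/empty current value as 0
    static String increment(String current, int step) {
        int n = (current == null || current.isEmpty())
                ? 0
                : Integer.parseInt(current.trim());
        return String.valueOf(n + step);
    }

    public static void main(String[] args) {
        System.out.println(increment(null, 1));  // 1
        System.out.println(increment("41", 1));  // 42
    }
}
```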
Scrape only new content
Hi, is there a way to scrape only new content?
I need to scrape some urls daily, but each url has a table with a lot of rows, so it takes a while. But only a few rows are added daily, so if I could only scrape the new content it would be very quick.
Is it possible? How?
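If each table row has a stable identifier, one approach is to remember the identifiers you have already scraped and skip matches on later runs. The sketch below keeps the seen-set in memory; in practice you would load and save it from a file or your database between daily runs (row IDs here are hypothetical).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of incremental scraping: remember row identifiers already seen
// and process only the new ones. The seen-set is in memory here; persist
// it between runs (file or database) for daily scrapes.
public class NewRowsSketch {
    private final Set<String> seen = new HashSet<>();

    // returns true if the row is new, and records it as seen
    boolean isNew(String rowId) {
        return seen.add(rowId);
    }

    public static void main(String[] args) {
        NewRowsSketch tracker = new NewRowsSketch();
        List<String> todaysRows = Arrays.asList("row-1", "row-2", "row-1");
        for (String id : todaysRows) {
            if (tracker.isNew(id)) {
                System.out.println("scrape " + id);
            }
        }
    }
}
```

If the site lists newest rows first, you can also stop paginating as soon as you hit a row you have already seen, which keeps each daily run short.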