screen-scraper public support
Connection Refused Problem
I'm trying to write a simple screen-scraper but I'm having trouble with the URL I'm using. I keep receiving a "Connection Refused" error whenever I try to bring the site up through screen-scraper (through a scraper or through a proxy session).
When I go to this URL outside of s-s (IE or FF) it works fine, but once I try anything in s-s I receive a "Connection Refused" error.
See log below. Anything I'm doing wrong?
Thanks!
exceptions on running tutorial
Hi,
Attempting to eval this product, but ran across this after my initial session (I saved my data from a straightforward tutorial 1 run-through, and then got this exception trace).
Any clues what might be happening? I reproduced this after a reinstall (running under WinXP SP2, if that matters). Obviously this is JDBC-related, but I'm not sure of a workaround.
(From the log file)
An error occurred while executing the SQL statement. The error was Table not found in statement [SELECT * FROM scrapingsession].
Extractor Patterns, Data Records and PHP Scripts?
I have an extractor pattern that checks for instances of author/author ID. If there are multiple instances, I understand that screen-scraper stores them in records.
Some related questions ...
1. Am I correct in my understanding that to access these records, I would need to write a script that uses the getDataRecord() or getAllDataRecords() APIs [and that the script must be added to the extractor pattern]?
2. PHP is not an available scripting language in screen-scraper.
Script not found error on headless machine
I am trying to run screen-scraper on a headless (no GUI) version of Gentoo Linux with Java 1.5 (although I guess that doesn't matter since ss already has a JRE). I am using the Hello World example from the downloaded hello_world1.zip file. I changed the name of the interpreted java file from Hello World (Scraping Session).xml to Hello World.xml. I have also tried renaming it to HelloWorld.xml (no spaces) and changed the command line appropriately.
For the command line I use the following.
Extractor pattern to "end of line"
Question:
Say the HTML is this:
foo |
And you needed to get "foo" -- your subextraction would start with that leading ">" but you don't have anything to end on. Such as: >~@MYVALUE@~ .. ? Assume you're locked into that pattern -- is there a way within an extraction pattern to say "and everything else left in this chunk"? If you just leave it blank it does not seem to pick it up.
Answer:
fnirt on 11/06/2006 at 2:05 pm
Where is "data_set_example.php"?
On the "Invoking screen-scraper from PHP" webpage, there is the following entry: "getVariable( $var_name ). Gets the value of a session variable that was set during the course of the scraping session. If the object identified by $var_name is a data record an associative array will be returned. If the object identified by $var_name is a data set a two-dimensional ordinal array of associative arrays will be returned (see the "data_set_example.php" file for an illustration of this). Note that currently only Strings, DataRecords, and DataSets can be accessed by this method."
Problem with Shopping.asp in Tutorial 4
I am working on Tutorial 4, and I'm having a problem with the Shopping.asp file that I downloaded. When I run the original file, I get the following error
Error Type:
This is line 82
I've tried different variations of this line to no avail. I get various error messages. Any help would be appreciated.
Salty Dog on 10/31/2006 at 8:28 pm
Backing up (and printing out) screen-scraper files?
I am using screen-scraper professional 2.7.2 with Windows XP. Where does screen-scraper store the configurations for Scraping Sessions (e.g., extractor patterns, etc.) and Scripts? I would like to add them to my backup. Is there a way to print the configurations for Scraping Sessions? It would be nice to have documentation to use as reference. Best regards,
PeterWest on 10/31/2006 at 12:07 pm
Problem: Redirect requested but followRedirects is disabled
Dear Todd, I get a blank response when I try to scrape the following sites in "HttpClient mode". If I scrape a Google URL like "http://www.google.com/search?q=xxxx" the problem does not occur. The problem also does not occur in "Internet Explorer (Windows only)" mode. When I built the error.log in debug mode, I saw the following line: "Redirect requested but followRedirects is disabled". Could you please help me resolve this problem?
dogant on 10/29/2006 at 5:40 pm
Single page / Next page / Previous page problem
I am scraping from a yellow pages type site. In my start script I provide values for location and category variables. When my script retrieves the first page, I scrape the details from the list of companies. I want to move to the next page if there is one. In the spot where I can retrieve the data I need as a session variable for the next page one of 2 things can happen:
shockeymoe on 10/26/2006 at 9:58 pm