screen-scraper support for licensed users
Regex to get an extractor pattern to stop at a </td>
Hi,
I was wondering if you know the regex to stop an extractor pattern at a </td>
rather than at a > or a ".
I don't know how many HTML elements are in the td, but I do know it ends with a </td>.
Any help would be really appreciated.
Regards,
Seamus McMahon
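
One common way to do this with plain regular expressions is a reluctant (non-greedy) match that is followed by a literal </td>. The sketch below uses java.util.regex only; the class and method names are made up for illustration, and whether the same `.*?` expression can be dropped into an extractor token unchanged is an assumption, not something confirmed here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TdCellRegex {
    // Reluctant match of any characters (DOTALL lets '.' cross line breaks)
    // up to, but not including, the first closing </td>.
    private static final Pattern CELL = Pattern.compile(
        "<td[^>]*>(.*?)</td>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns the inner HTML of the first td cell, or null if none is found.
    static String firstCell(String html) {
        Matcher m = CELL.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<tr><td class=\"a\"><b>Name</b> <a href=\"x\">link</a></td>"
                    + "<td>next</td></tr>";
        // prints: <b>Name</b> <a href="x">link</a>
        System.out.println(firstCell(html));
    }
}
```

Because the match is reluctant, it stops at the first </td> it meets, so a cell that contains a nested table would be cut short.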
Get and remove project items
Is there an API to get a list of all items in the screen-scraper project tree? I'm using the professional version on several servers and would like to keep track of the current items in a database. In addition to reading the list, is it possible to delete items from the tree?
I understand that the enterprise version's web API supports this; however, the current server/software structure is based on the professional version using the .NET API. If this requires an enhancement, an update that lets a screen-scraper script read and delete items would be great, and I'll have the .NET side pass the request off.
NullPointerException in a FOR loop?
A scraper that has been running smoothly every day has lately started giving me an occasional error. The error is:
NullPointerException (line 39): for ( i = 0 ; -- Null Pointer in Method Invocation
The full line of code that seems to be generating this occasional error is:
for( i = 0; i < postsdataSet.getNumDataRecords(); i++ )
Any idea what could be happening?
Thank you,
Boga
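
One possible explanation, offered as a guess rather than a diagnosis, is that postsdataSet is occasionally null (for example when nothing was extracted on that run), so the call to getNumDataRecords() in the loop condition fails. A minimal sketch of a null guard follows; RecordSet is a hypothetical stand-in for the real data set type, and only getNumDataRecords() is taken from the post.

```java
public class NullSafeLoop {
    // Hypothetical stand-in for the data set type used in the post.
    interface RecordSet {
        int getNumDataRecords();
    }

    // Counts the records visited, treating a missing data set as empty.
    static int countRecords(RecordSet postsdataSet) {
        if (postsdataSet == null) {
            // Nothing was extracted on this run, so skip the loop entirely.
            return 0;
        }
        int processed = 0;
        for (int i = 0; i < postsdataSet.getNumDataRecords(); i++) {
            processed++; // real per-record work would go here
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(countRecords(null));     // prints 0
        System.out.println(countRecords(() -> 3));  // prints 3
    }
}
```

Logging a message in the null branch would also show how often, and on which pages, the data set comes back empty.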
Screen Scraper Taking forever to shutdown or start up
Over the past couple of days, sometimes (not every time) when I start screen-scraper or close it down, it literally takes a couple of hours to start up or shut down completely.
I probably have 175+ scrapes.
Have I maxed something out?
Is there something I can clean up?
Thanks for your help
Bart
CsvWriter encoding issue
I am scraping product names from a website and one of the products as listed on the website is "Convoy™ 2 u660"
With JTidy as the default setting the product name was being scraped as "Convoyª 2 u660"
I disabled tidy HTML for that page and the product name is showing correctly on the console as "Convoy™ 2 u660"
However, when I write this to a file using the CsvWriter, it is written as "Convoyª 2 u660"
I have the character set for the scraping session set to UTF-8.
Is there an encoding bug in the CsvWriter?
- Vivek
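
One way to narrow this down is to write the same string with an explicitly chosen charset and read it back, which shows whether the text survives when the platform default encoding is not involved. The sketch below uses only the Java standard library and is not the CsvWriter API; the class and method names are illustrative. Whether CsvWriter falls back to the platform default charset is an assumption, not something confirmed here.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8CsvSketch {
    // Writes one comma-separated line, always encoded as UTF-8.
    static void writeLine(Path file, String... fields) throws IOException {
        try (Writer out = new BufferedWriter(new OutputStreamWriter(
                Files.newOutputStream(file), StandardCharsets.UTF_8))) {
            out.write(String.join(",", fields));
            out.write("\n");
        }
    }

    // Writes the value to a temporary file and decodes it again as UTF-8.
    static String roundTrip(String value) {
        try {
            Path tmp = Files.createTempFile("utf8-csv-sketch", ".csv");
            try {
                writeLine(tmp, value);
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // \u2122 is the trademark sign from the product name in the post.
        String name = "Convoy\u2122 2 u660";
        System.out.println(roundTrip(name).equals(name + "\n")); // prints true
    }
}
```

If the round trip is clean, it is also worth checking how the CSV file is being opened, since a viewer that assumes a different encoding will show the wrong character even when the bytes on disk are correct.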
Remote scraping is throwing an error when scrape session contains a script
Hi,
I have a scraper that works fine in the workbench but not when invoked remotely.
To simplify debugging I created a simple scraper that loads a page (can be any page) and invokes a script after that is complete.
The script is very simple and is as follows:
// Import the necessary classes.
import com.screenscraper.scraper.*;
import com.screenscraper.csv.CsvWriter;
import java.util.*;
import java.text.SimpleDateFormat;
session.setVariable("DeviceList",new HashMap());
........
Site delays in screen-scraper but not browser
http://www.browncountyclerkofcourts.org/Search/srchmain.shtml
If you search on that site for a name that doesn't have any results, like "JONES, AMETHYST" it comes back almost instantly with a "no records" page/message.
In screen-scraper, the same call takes a long time (30+ seconds) to come back, and when it does, the server has sometimes responded with a timeout from its database.
java.lang.NullPointerException - Fresh install of Screen Scraper 6.0.19a
Just installed Java 1.7 into a blank directory.
java -version
java version "1.7.0_09-icedtea"
OpenJDK Runtime Environment (rhel-2.3.4.1.el6_3-x86_64)
OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)
Running on CentOS release 6.3 (Final)
Reinstalled Screen Scraper and ran the updater.
Version=6.0.19a
Cleared the log files and restarted Screen Scraper in server mode.
cat error.log
High Memory Use (100.0%)
Memory Profiling error
java.lang.NullPointerException
at com.screenscraper.profiling.ObjectMemoryProfiler.buildProfilerFromFiles(ObjectMemoryProfiler.java:234)
Exception in thread "Thread-62" java.lang.IllegalArgumentException: Host name may not be null
I have a session that locks up while trying to validate a proxy.
Error is: Exception in thread "Thread-62" java.lang.IllegalArgumentException: Host name may not be null
This same session works fine on local systems as well as on our other server.
I've reinstalled Screen Scraper and updated to the latest beta.
Standard server install script, no modifications.
Stderror.log shows this:
Exception in thread "Thread-62" java.lang.IllegalArgumentException: Host name may not be null
at org.apache.commons.httpclient.HttpHost.
file access error on "ss.script"
Hi,
I routinely run several screen-scraper sessions (4-5) in parallel on a Windows XP machine. Each session is started from a DOS batch file which in turn is started as a Windows "scheduled task", e.g.:
jre\bin\java -jar screen-scraper.jar -s "WILLHABEN" -p "PARAMETER=HEADER:1/AREA:900/LND:WIE/WAS:haus-kaufen/ANGEBOT:haus-angebote/WAS_NR:3;1;4;110;111;100;101;102;114;112;16;20;18/ZEILEN:30" >willhaben-wie-hk.log
Normally this works fine, but from time to time a session fails on start-up, and from that point on so does every other session.