screen-scraper support for licensed users
Altering extracted data in the DATARECORD
Is there a way to alter the data in the DATARECORD and then write the altered data back into it after each pattern match of an extractor pattern? In other words, can I make alterations to a DATARECORD and have those changes written out to a file after each pattern match, without using a session variable?
Strange license error when using scrapeableFile.setUserAgent
I am trying to change the browser user agent, so I have this script, which runs "Before file is scraped":
int randomnumber = randomGenerator.nextInt(32);
switch(randomnumber){
case 0 : scrapeableFile.setUserAgent( "Mozilla/5.0 (SunOS 5.8 sun4u; U) Opera 5.0 [en]" ); break;
case 1 : scrapeableFile.setUserAgent( "Mozilla/5.0 (Windows; U; Win98; en-US; rv:0.9.2) Gecko/20010726 Netscape6/6.1" ); break;
case 2 : scrapeableFile.setUserAgent( "Mozilla/5.0 (Windows; U; Win98; en-US; rv:x.xx) Gecko/20030423 Firebird Browser/0.6" ); break;
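One way to avoid a long switch is to keep the user-agent strings in a list and pick an index at random. The sketch below is plain Java and only illustrates the selection step; the class name UserAgentPicker and the pick helper are hypothetical, and the list holds only the three strings quoted in the post. In a screen-scraper script the chosen value would presumably be passed to scrapeableFile.setUserAgent, the call used in the post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical helper class, not part of screen-scraper.
class UserAgentPicker {
    // User-agent strings copied from the post; add more entries as needed.
    static final List<String> USER_AGENTS = Arrays.asList(
        "Mozilla/5.0 (SunOS 5.8 sun4u; U) Opera 5.0 [en]",
        "Mozilla/5.0 (Windows; U; Win98; en-US; rv:0.9.2) Gecko/20010726 Netscape6/6.1",
        "Mozilla/5.0 (Windows; U; Win98; en-US; rv:x.xx) Gecko/20030423 Firebird Browser/0.6"
    );

    // Returns one entry chosen uniformly at random from the list.
    static String pick(Random random) {
        return USER_AGENTS.get(random.nextInt(USER_AGENTS.size()));
    }

    public static void main(String[] args) {
        String userAgent = pick(new Random());
        // Assumption: inside a script, this is where
        // scrapeableFile.setUserAgent(userAgent) would be called.
        System.out.println(userAgent);
    }
}
```

Because the index is bounded by the list size, adding or removing a string does not require touching the selection logic.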
java.lang.RuntimeException: Could not generate DH keypair
Dear screen scraper....
We have looked in the forum(s) at both of these postings, http://community.screen-scraper.com/node/2337 and http://community.screen-scraper.com/node/2284, neither of which seems to offer a solution to our problem, which started occurring on Thursday, 1st of October. We have upgraded to the latest version of Java; the version we currently have is Version 8 Update 60 (build 1.8.0_60-b27).
When we try to scrape content from www.dustinhome.se, we get the following errors.
Remove response header
Is it possible to remove or disable the server headers in the response? I have a unique project that needs to collect all numbers from a page, and the date, server version, etc. are causing a problem.
--REMOVE THIS--
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Date: Tue, 29 Sep 2015 17:31:24 GMT
Vary: Accept-Encoding
Connection: keep-alive
Vary: User-Agent
--REMOVE THIS--
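If the headers cannot be suppressed at the source, one workaround is to strip them from the text before collecting numbers. The following is a minimal sketch in plain Java, assuming the response is available as a single string that begins with the status line; the class name HeaderStripper and the method stripHeaders are hypothetical and not part of screen-scraper.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of screen-scraper.
class HeaderStripper {
    // Removes a leading HTTP status line and the "Name: value" header lines
    // that follow it. Text that does not start with "HTTP/" is returned as-is.
    static String stripHeaders(String response) {
        String[] lines = response.split("\r?\n", -1);
        if (lines.length == 0 || !lines[0].startsWith("HTTP/")) {
            return response;
        }
        int i = 1; // skip the status line
        while (i < lines.length && lines[i].matches("[A-Za-z0-9-]+:.*")) {
            i++; // skip each header line
        }
        if (i < lines.length && lines[i].isEmpty()) {
            i++; // skip the blank line separating headers from the body
        }
        return String.join("\n", Arrays.copyOfRange(lines, i, lines.length));
    }
}
```

For example, passing in the status line and headers shown above followed by a blank line and a page body would return only the page body.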
Script throws an error - The message was java.net.ConnectException: https://xxx:443
We are trying to scrape an HTTPS site, and it is throwing the following error:
Encountered a connection error for domain "webapp.halton.gov.uk". Message was "https://webapp.halton.gov.uk:443". Trying different protocols...
Landing Page: An input/output error occurred while connecting to 'https://webapp.halton.gov.uk/PlanningApps/index.asp'. The message was java.net.ConnectException: https://webapp.halton.gov.uk:443
We are using version 6.0.65a of screen-scraper.
We have selected Async HTTP Client as the HTTP client, but it still throws the error.
Problem running screen-scraper from the OS X command line...
Hi,
I am trying to run a scrape from my OS X terminal window. In the workbench everything works, but for some reason I get this error when I run anything from the OS X command line:
/usr/bin/java -jar screen-scraper.jar -s "test"
Exception in thread "main" java.lang.NoClassDefFoundError: java/lang/AutoCloseable
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:637)
at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
Replacing spaces with hyphens in token
Hi
Not sure about this one, but I am struggling.
I need to replace all spaces with a hyphen in a token to form a URL to take me to a product page.
http://www.groundcaretrader.com/#!tractors/c1exz has the components of the URL but needs a - to find the 'escaped fragment' page which I can scrape.
For example: http://www.groundcaretrader.com/?_escaped_fragment_=Kubota-F2560-Outfront-Rotary-Mower/c1se7
I can grab the end code in the URL, but the make-model-category appears with no hyphens in the results page.
This is where I need to replace the spaces.
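A minimal sketch of the replacement in plain Java, assuming the extracted token holds a space-separated name such as "Kubota F2560 Outfront Rotary Mower" and the end code has been extracted separately. The class name SlugBuilder and both method names are hypothetical; only String.trim and String.replaceAll are standard Java.

```java
// Hypothetical helper class, not part of screen-scraper.
class SlugBuilder {
    // Trims the token and replaces each run of whitespace with one hyphen.
    static String hyphenate(String token) {
        return token.trim().replaceAll("\\s+", "-");
    }

    // Builds the escaped-fragment URL from the product name and the end code.
    static String productUrl(String name, String code) {
        return "http://www.groundcaretrader.com/?_escaped_fragment_="
            + hyphenate(name) + "/" + code;
    }

    public static void main(String[] args) {
        System.out.println(productUrl("Kubota F2560 Outfront Rotary Mower", "c1se7"));
    }
}
```

Using "\\s+" rather than a single space means that double spaces or tabs in the scraped text still produce a single hyphen.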
Problem with objects tree
Hi,
I have a problem with my screen-scraper (version 6.0.58a). Sometimes (a few times per day) I can't see any objects in the objects tree on the left side. When I click on the empty area, I can see only some of my objects.
Has anyone else had the same problem?
Thanks,
Mariusz
Is it possible to control (pause/restart) one scraping session from another?
Hello,
I have 2 scraping sessions:
Session A scrapes website A and runs every night from 1 AM to approx. 7:30 AM.
Session B scrapes website B, runs once per week, and takes approx. 3 days to complete, running non-stop day and night.
Prevent Get parameter escape
Is there a method that will prevent a GET parameter from having its characters escaped on request? For example, I'm making a request to a site: www.somesite.com/profile?id=T05/VT4gViy+SVKle2A9. Notice the / and + in the parameter. When I use screen-scraper, it requests www.somesite.com/profile?id=T05%2FVT4gViy%2BSVKle2A9, replacing / with %2F and + with %2B. I know this is the correct method; however, the site I'm making the request to doesn't properly handle the escaped characters and returns an error. I tried it in Chrome (which does not convert the characters) and everything works.
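The difference between the two requests can be reproduced in plain Java, which may help when checking what a site receives. This sketch uses only the standard java.net.URLEncoder; the class name RawParameterDemo and both helper methods are hypothetical. Whether screen-scraper will send a URL with the value embedded unescaped is an assumption that is not confirmed here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, not part of screen-scraper.
class RawParameterDemo {
    // Builds the URL with the value percent-encoded, as in the failing request.
    static String encoded(String base, String name, String value) {
        try {
            return base + "?" + name + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Builds the URL with the value appended exactly as given, as Chrome sent it.
    static String raw(String base, String name, String value) {
        return base + "?" + name + "=" + value;
    }

    public static void main(String[] args) {
        String value = "T05/VT4gViy+SVKle2A9";
        System.out.println(encoded("www.somesite.com/profile", "id", value));
        System.out.println(raw("www.somesite.com/profile", "id", value));
    }
}
```

The raw form only makes sense when the value is known not to contain characters such as & or # that would change how the query string is parsed.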