screen-scraper support for licensed users

Questions and answers regarding the use of screen-scraper. Only licensed Professional and Enterprise Edition users can post; anyone can read. Licensed users, please contact support with your registered email address for access. This forum is monitored closely by screen-scraper staff. Posts generally receive a response within one business day.

A New Challenge to Iterate Pages

Hi
I've not come across this one before:

I am trying to scrape this page: http://ukcabins.com/cabins/search/

It all goes well until I want to move on to the next page. The 'next' URL looks like some kind of command, and it does not work when requested from screen-scraper:
http://ukcabins.com/incs/actions/next_page.php?start=0

In the browser it takes you to the next page of data; in screen-scraper it does not.

I have tried saving a PHPSESSID, but I'm a little clueless here. There seems to be little, if any, POST data when I run through the proxy server.
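One possible explanation, offered here only as an assumption, is that the site keeps the paging position in the server-side session, so the 'next' URL only behaves as it does in the browser when the request carries the same PHPSESSID cookie that the search page set. The general idea can be sketched outside screen-scraper in Python with a shared cookie jar. The function names are hypothetical, and this is not screen-scraper's API:

```python
import http.cookiejar
import urllib.request

# URLs quoted in the post above.
SEARCH_URL = "http://ukcabins.com/cabins/search/"
NEXT_URL = "http://ukcabins.com/incs/actions/next_page.php?start=0"


def build_session_opener():
    """Return an opener plus the cookie jar it stores cookies in."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_first_two_pages(opener):
    """Load the search page, then the 'next' URL, reusing the same cookies."""
    first = opener.open(SEARCH_URL, timeout=30).read()
    # Any cookie set by the first response (for example PHPSESSID)
    # is sent automatically with this second request.
    second = opener.open(NEXT_URL, timeout=30).read()
    return first, second
```

If the second response still shows the first page, the session cookie is probably not the missing piece, and the requests recorded by the proxy would need a closer look for other headers or parameters.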

jas777 on 03/24/2015 at 5:11 am

"The message was peer not authenticated" problem again

I have upgraded to 6.0.58a, I have an up-to-date version of Java, and I have ticked the "Use only SSL version 3" checkbox, but I am still getting the following error:
LoginNewton: Resolved URL: http://www.newtontrailers.com
LoginNewton: Sending request.
LoginNewton: Redirecting to: https://www.newtontrailers.com/
LoginNewton: An input/output error occurred while connecting to 'http://www.newtontrailers.com'. The message was peer not authenticated.
LoginNewton: It's possible that checking the "Use only SSL version 3" checkbox under the "Advanced" tab will fix this.

jas777 on 03/20/2015 at 12:01 pm

Session variables in an extractor pattern

Is it possible to "place" a session variable in an extractor pattern? I have a JSON object, as in the example below, for a map that appears on the page. In addition to the location for the page that I'm viewing, it also lists other locations. I currently extract each item, then set up a loop to compare the id of each item to the id of the page that I opened.

www.somesite.com/viewpage.aspx?id=~#PAGE_ID#~

{"id":1,"lat":46.3629,"longt":7.3810},{"id":3,"lat":47.78954,"longt":6.753,...}

If this isn't possible it would be nice to have a feature in extractor patterns that would use a session variable:
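As a rough illustration of the idea, outside screen-scraper, the sketch below builds a regular expression with the page id substituted in, so that only the matching object is extracted and no comparison loop is needed. The function name is hypothetical, the sample text uses only the two complete objects quoted above, and this is not a statement about how screen-scraper's extractor patterns work:

```python
import re


def extract_coordinates(page_text, page_id):
    """Return (lat, longt) for the object whose id equals page_id, or None."""
    pattern = re.compile(
        r'\{"id":' + re.escape(str(page_id)) + r',"lat":(-?[\d.]+),"longt":(-?[\d.]+)'
    )
    match = pattern.search(page_text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Only the two complete objects from the post; the remainder is elided there.
sample = '{"id":1,"lat":46.3629,"longt":7.3810},{"id":3,"lat":47.78954,"longt":6.753}'
coordinates = extract_coordinates(sample, 3)
```

The comma after the id in the pattern keeps an id of 1 from also matching an object whose id is, say, 11.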

exdap on 03/18/2015 at 12:14 pm

Is there a REST command that implements the removeCompletedScrapeableSessions function?

I'm looking for a way to automate removing all interrupted and completed scraping sessions.
There seems to be an internal server API call when one clicks the "Remove Completed Scraper Sessions" button.
POST parameter: "7|0|4|http://extractor1:8801/|9E4DB75A1455A717F46BFE9DCFFB0394|com.screenscraper.services.ScrapingSessionService|removeCompletedScrapeableSessions|1|2|3|4|0|"

Adding this POST parameter with a call to scrapeableFile.setRequestEntity does not seem to do the trick.
Using wget fails as well with an internal server error.

cyberstar on 03/02/2015 at 12:47 pm

Unable to select next page from ajax site

I am trying to scrape http://www.sjhallplant.com/stock-list and have used a proxy session to obtain the POST data (it needs a key and a page number). When I run it in screen-scraper with the page parameter included, I get a status of either 400 or 404, depending on the variant I try.

Otherwise I just get the first page of data. Selecting the pattern for the 'next page' is easy; applying it is the tricky bit...

I am not sure whether it is token or cookie related, but I have a feeling the problem is somewhere in this area.

Any help would be greatly appreciated.

Thanks

jas777 on 02/24/2015 at 5:58 am

Import Error for Scripts written in stable 6.0

You can ignore the part of this post about not being able to import a script. I figured out the problem. The default character set was UTF-16 in my version of screen-scraper. The crash log is still relevant, but feel free to delete this post as I don't see it being very useful to other users. Thank you.

GenericPause on 02/19/2015 at 3:19 pm

HTTP Redirects

Is there a way to temporarily suspend redirects? We have a site that gets caught in a redirect loop because of the way they do things.
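For comparison, this is how "do not follow redirects" looks in plain Python with the standard library. It is only a sketch of the general technique, not screen-scraper's API, and the class and function names are hypothetical:

```python
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that declines to follow any 3xx response."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means no follow-up request is built, so the
        # 3xx response surfaces to the caller as an HTTPError.
        return None


def fetch_without_redirects(url):
    """Return (status, Location header) without following redirects."""
    opener = urllib.request.build_opener(NoRedirectHandler)
    try:
        with opener.open(url, timeout=30) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")
```

With redirects suppressed, each hop's status code and Location header can be inspected one request at a time, which usually makes it clear which cookie or parameter the site expects before it stops redirecting.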

lwallace on 01/23/2015 at 11:36 am

javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated

Hi Team,

Why is the proxy not giving me anything for this site? Please help. In the log I am getting this:

An input/output error occurred while connecting to 'https://www.selfmgmt.com/en/login.aspx'. The message was peer not authenticated
javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
at com.sun.net.ssl.internal.ssl.SSLSessionImpl.getPeerCertificates(Unknown Source)
at org.apache.http.conn.ssl.AbstractVerifier.verify(AbstractVerifier.java:128)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:390)

biru on 12/17/2014 at 7:05 am

Screen-Scraper Freezing Up

Hello. I am trying to scrape the web address: http://www.escocorp.com/EN/operations/Pages/dealer-locator.aspx

Running the scrape gets a response from the server, but trying to view the last response tab makes screen-scraper lock up. The GUI becomes unresponsive to any mouse movements or clicks. If I wait a few minutes, the program eventually responds to a mouse click, but the interface remains too slow for any meaningful navigation. After I delete the scrape from screen-scraper, the GUI returns to normal and responds quickly to mouse clicks.

GenericPause on 11/12/2014 at 1:59 pm

Not able to scrape a site which uses JavaScript

Hi screen-scraper Team,

We are having trouble scraping a page that uses JavaScript after page load. It says "JavaScript is not enable". We analyzed the script and found that the page throwing the error contains JavaScript that modifies the URL after the page has loaded. We also tried requesting the URL that the JavaScript produces, but with no luck. Below is the URL for your reference:
http://forms.ceredigion.gov.uk/ufs/ufsmain?formid=DESH_PLANNING_APPS&ebd=0&ebp=10&ebz=1_1414647399403

Awaiting your response.

Regards,
Barnali.

barnali on 10/30/2014 at 4:19 am
