screen-scraper support for licensed users
A New Challenge to Iterate Pages
Hi
I've not come across this one before:
I am trying to scrape this page: http://ukcabins.com/cabins/search/
It all goes well until I try to move to the next page. The 'next' URL looks like a server-side command rather than a page, and it will not run under screen-scraper:
http://ukcabins.com/incs/actions/next_page.php?start=0
In the browser it takes you to the next page of data - in Screen Scraper it does not.
I have tried saving the PHPSESSID, but I'm a little clueless here. There seems to be little or no POST data when I run the site through the proxy server.
The message was peer not authenticated Problem again
I have upgraded to 6.0.58a, I have an up-to-date version of Java, and I have ticked the "Use only SSL version 3" checkbox, but I am still getting the following error:
LoginNewton: Resolved URL: http://www.newtontrailers.com
LoginNewton: Sending request.
LoginNewton: Redirecting to: https://www.newtontrailers.com/
LoginNewton: An input/output error occurred while connecting to 'http://www.newtontrailers.com'. The message was peer not authenticated.
LoginNewton: It's possible that checking the "Use only SSL version 3" checkbox under the "Advanced" tab will fix this.
Session variables in an extractor pattern
Is it possible to "place" a session variable in an extractor pattern? I have a JSON object, as in the example below, for a map that appears on the page. In addition to the location for the page that I'm viewing, it also lists other locations. I currently extract each item, then set up a loop to compare the id of each item to the id of the page that I opened.
www.somesite.com/viewpage.aspx?id=~#PAGE_ID#~
{"id":1,"lat":46.3629,"longt":7.3810},{"id":3,"lat":47.78954,"longt":6.753,...}
If this isn't possible, it would be nice to have a feature in extractor patterns that would use a session variable.
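The loop-and-compare step described above can be avoided by substituting the page id into the pattern before matching. The following is only a minimal sketch in plain Python, not screen-scraper's extractor-pattern syntax; the function name `find_location` is hypothetical, and the sample string is the data from the post with its elided tail left out.

```python
import re

def find_location(json_text, page_id):
    """Return (lat, longt) for the object whose id equals page_id, or None."""
    # Substitute the id into the pattern, much as ~#PAGE_ID#~ is substituted
    # into the URL. The trailing comma after the id keeps id 1 from matching id 10.
    pattern = r'\{"id":%d,"lat":(-?[\d.]+),"longt":(-?[\d.]+)' % int(page_id)
    match = re.search(pattern, json_text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

# Sample data from the post (assumption: the truncated remainder is omitted).
sample = '{"id":1,"lat":46.3629,"longt":7.3810},{"id":3,"lat":47.78954,"longt":6.753}'
print(find_location(sample, 3))
```

In a scraping session the same idea would mean building the pattern text from the stored page id in a script, rather than extracting every item and filtering afterwards.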
Is there a REST command that implements the removeCompletedScrapeableSessions function?
I'm looking for a way to automate removing all Interrupted and completed Scraping Sessions.
There seems to be an internal server API call when one clicks on the "Remove Completed Scraper Sessions" button.
POST parameter: "7|0|4|http://extractor1:8801/|9E4DB75A1455A717F46BFE9DCFFB0394|com.screenscraper.services.ScrapingSessionService|removeCompletedScrapeableSessions|1|2|3|4|0|"
Adding this POST parameter with a call to scrapeableFile.setRequestEntity does not seem to do the trick.
Using wget fails as well with an internal server error.
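The pipe-delimited POST body quoted above has the shape of a GWT-RPC request: a version, a flags field, a string-table size, the strings themselves, and then indexes into that table. The sketch below, in plain Python, splits the payload into those parts and prepares (but does not send) a request with the `text/x-gwt-rpc` content type that GWT-RPC requests normally carry. Whether a missing header is what causes the internal server error is only a guess; `SERVICE_URL` is a hypothetical placeholder, and the real endpoint and any further headers (such as `X-GWT-Permutation`) would have to be copied from a captured browser request.

```python
import urllib.request

def decode_gwt_rpc(payload):
    """Split a GWT-RPC style payload into header fields, string table and call data."""
    fields = payload.split("|")
    version, flags, count = int(fields[0]), int(fields[1]), int(fields[2])
    strings = fields[3:3 + count]
    # The payload ends with "|", so drop the empty trailing field.
    data = [f for f in fields[3 + count:] if f != ""]
    return {"version": version, "flags": flags, "strings": strings, "data": data}

payload = ("7|0|4|http://extractor1:8801/|9E4DB75A1455A717F46BFE9DCFFB0394|"
           "com.screenscraper.services.ScrapingSessionService|"
           "removeCompletedScrapeableSessions|1|2|3|4|0|")
decoded = decode_gwt_rpc(payload)
print(decoded["strings"])

# HYPOTHETICAL endpoint: the post does not say which path the button calls.
SERVICE_URL = "http://extractor1:8801/service-path-from-proxy-capture"
request = urllib.request.Request(
    SERVICE_URL,
    data=payload.encode("utf-8"),
    headers={"Content-Type": "text/x-gwt-rpc; charset=utf-8"},
    method="POST",
)
# urllib.request.urlopen(request) would send it; it is deliberately not called here.
```

Decoding the payload this way shows that the last string in the table is the method name and that the final `0` in the data is consistent with a call that takes no parameters.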
Unable to select next page from ajax site
Hi
I am trying to scrape this site, http://www.sjhallplant.com/stock-list, and have used a proxy session to obtain the POST data (it needs a key and a page number). When I run it in screen-scraper and include the page parameter, I get a status of either 400 or 404, depending on the variant I try.
Alternatively I just get the first page of data - selecting the pattern for the 'next page' is easy - applying it is the tricky bit...
I am not sure if it is token or cookie related - I have a feeling it is somewhere in this area?
Any help would be greatly appreciated.
Thanks
Import Error for Scripts written in stable 6.0
You can ignore the part of this post about not being able to import a script. I figured out the problem. The default character set was UTF-16 in my version of screen-scraper. The crash log is still relevant, but feel free to delete this post as I don't see it being very useful to other users. Thank you.
HTTP ReDirects
Is there a way to temporarily suspend redirects? We have a site that gets caught in a redirect loop because of the way it handles them.
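One general way to deal with a redirect loop is to turn off automatic redirects and follow them by hand, stopping when a URL repeats. The sketch below shows that idea in plain Python and is not a screen-scraper setting or API. The `fetch` callable is an assumption: it stands for any function that requests a URL without following redirects and returns the status code and the `Location` header. The `fake_site` dictionary is made-up test data.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)

def follow_redirects(fetch, url, max_hops=5):
    """Follow redirects manually; stop on a repeated URL or after max_hops.

    fetch(url) must return (status_code, location_or_None).
    Returns (last_url, visited_urls, outcome).
    """
    seen = [url]
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return url, seen, "done"
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
        if url in seen:
            return url, seen, "loop"
        seen.append(url)
    return url, seen, "too many redirects"

# Made-up site where /a and /b redirect to each other.
fake_site = {
    "http://example.com/a": (302, "/b"),
    "http://example.com/b": (302, "/a"),
}
final_url, visited, outcome = follow_redirects(fake_site.get, "http://example.com/a")
print(outcome, visited)
```

Keeping the list of visited URLs makes it easy to see in a log exactly where the site starts going in circles.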
javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
Hi Team,
The proxy is not giving me anything for this site. Why? Please help. In the log I am getting this:
An input/output error occurred while connecting to 'https://www.selfmgmt.com/en/login.aspx'. The message was peer not authenticated
javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
at com.sun.net.ssl.internal.ssl.SSLSessionImpl.getPeerCertificates(Unknown Source)
at org.apache.http.conn.ssl.AbstractVerifier.verify(AbstractVerifier.java:128)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:390)
Screen-Scraper Freezing Up
Hello. I am trying to scrape the web address: http://www.escocorp.com/EN/operations/Pages/dealer-locator.aspx
Running the scrape gets a response from the server, but trying to view the last response tab results in screen-scraper locking up. The GUI becomes unresponsive to any mouse movements or clicks. If I wait for a few minutes, the program eventually responds to a mouse click, but the interface is too slow for any meaningful navigation. After deleting the scrape from screen-scraper, the GUI returns to normal and responds quickly to mouse clicks.
Not able to scrape a site which uses JavaScript
Hi Screen-Scraper Team,
We are having trouble scraping a page which uses JavaScript after page load. It says "JavaScript is not enable". We have analyzed the script and found that the page throwing the error contains JavaScript which modifies the URL after the page is loaded. We have also tried requesting the URL that the JavaScript produces, but with no luck. Below is the URL for your reference:
http://forms.ceredigion.gov.uk/ufs/ufsmain?formid=DESH_PLANNING_APPS&ebd=0&ebp=10&ebz=1_1414647399403
Awaiting your response.
Regards,
Barnali.