How do I upgrade from version 6.x to 7.0?

screen-scraper 7.0 requires a newer JRE than the previous stable release, so upgrading involves some additional steps.

If you have not already exported all your scrapes, or just want to preserve your current configuration, first upgrade your current screen-scraper installation to the latest alpha version, 6.0.64a (instructions). Once that is done, back up the contents of the screen-scraper/resource/db directory.

Linux/OSX

The new installer does not include the JRE
You need to have JRE 1.8 installed
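
To confirm which Java runtime is active before running the installer, a small check like the one below may help. It is only a sketch: the class and method names (JavaVersionCheck, majorVersion, isAtLeast) are made up for illustration, and treating 8 as a minimum is an assumption, since the text above only says that JRE 1.8 must be installed and does not say whether later releases are acceptable.

```java
// Hypothetical helper, not part of screen-scraper.
class JavaVersionCheck {

    // Extracts the major version from a java.version string.
    // Legacy strings look like "1.8.0_292" (major 8); newer ones
    // look like "11.0.2" or "17" (major 11 and 17).
    static int majorVersion(String version) {
        String[] parts = version.split("[^0-9]+");
        if (parts.length == 0 || parts[0].isEmpty()) {
            throw new IllegalArgumentException("Unrecognized version: " + version);
        }
        int first = Integer.parseInt(parts[0]);
        // The legacy scheme puts the major version in the second field.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // Assumption: any release at or above the given major version is fine.
    static boolean isAtLeast(String version, int requiredMajor) {
        return majorVersion(version) >= requiredMajor;
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.version");
        System.out.println("Running Java " + running
                + " (major " + majorVersion(running) + ")");
        System.out.println("Meets assumed minimum of 8: " + isAtLeast(running, 8));
    }
}
```

Run it with the same java executable that will launch screen-scraper, since several runtimes can be installed side by side.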

Why do I get "HTML Truncated" on the Last Response tab?

Some web pages are large enough to make the "Last Response" tab unresponsive. To prevent performance issues, screen-scraper truncates the HTML. You can still see it, however, if you:

  1. Click to "display response in a browser"
  2. Right-click and view the source of that page

You may edit the screen-scraper.properties file to allow more to be displayed, but in doing so you may run afoul of the aforementioned performance issues. To do so, either edit or add a line:
 

SSL connection issues/peer not authenticated

Notes on the various HTTPS issues are posted on the blog.

How do I resolve connection issues when trying to scrape a site that uses SSL?

SSL issues can manifest as a number of errors, including but not limited to:

SSLHandshakeException

ssl_error_rx_record_too_long

An input/output error occurred while connecting to https:// … The message was peer not authenticated.

javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated

If you make a request and get one of these errors, the best steps to rectify it are:

Make sure you are using the newest version of screen-scraper

Can I put a session variable in an extractor pattern to limit the results?

Extractor patterns can't accept session variables. An extractor pattern works on the HTML of the last response and has no means to snip out part of that HTML and replace it with a token.

If you were to try this, the extractor pattern might look like:

name="ProductID" value="~@SKU@~">~#NAME#~<

The hope would be to get only the match whose name equals the session variable ~#NAME#~.

The correct way to do this is to extract the name with a token instead:

name="ProductID" value="~@SKU@~">~@NAME@~<

You would then invoke a script that compares the name you scraped with the one you want:
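
As a rough sketch of that comparison, the plain Java below models each pattern match as a map holding the SKU and NAME tokens and keeps only the matches whose NAME equals the wanted value. Everything here is illustrative: NameFilter, matchesWanted, keepMatches and record are hypothetical names, the map stands in for whatever object carries the extracted tokens, and comparing case-insensitively after trimming is an assumption rather than documented screen-scraper behavior. In an actual script the two values would come from the current match and from the session rather than from hand-built maps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a script that runs after each pattern match.
class NameFilter {

    // True when the scraped NAME token equals the wanted name.
    // Assumption: surrounding whitespace and letter case are ignored.
    static boolean matchesWanted(Map<String, String> record, String wantedName) {
        String scraped = record.get("NAME");
        if (scraped == null || wantedName == null) {
            return false;
        }
        return scraped.trim().equalsIgnoreCase(wantedName.trim());
    }

    // Keeps only the records whose NAME matches the wanted name.
    static List<Map<String, String>> keepMatches(
            List<Map<String, String>> records, String wantedName) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> record : records) {
            if (matchesWanted(record, wantedName)) {
                kept.add(record);
            }
        }
        return kept;
    }

    // Builds a record with the two tokens used by the pattern above.
    static Map<String, String> record(String sku, String name) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("SKU", sku);
        r.put("NAME", name);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = Arrays.asList(
                record("A1", "Widget"),
                record("B2", "Gadget"));
        // The wanted name plays the role of the session variable's value.
        System.out.println(keepMatches(records, "Widget"));
    }
}
```

Filtering after extraction keeps the extractor pattern itself free of variables, which is the constraint described above.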
 

My update was cancelled prematurely. How do I restore screen-scraper?

If you agreed to download an update when prompted in the workbench, and the download stopped prematurely for some reason, screen-scraper may fail to start the next time you try to run it. Instead, an error window will appear containing a java.lang.ClassNotFoundException error.

I've just upgraded to an alpha version of screen-scraper, but it has bugs. How do I downgrade back to a more stable version?

We try our best to make even the alpha versions usable, but inevitably bugs will slip by us once in a while. If you'd like to downgrade to a previous version of screen-scraper, follow these steps: