General Technical

Questions regarding how screen-scraper works or how to get it to do something.

Does screen-scraper follow redirects?

screen-scraper automatically follows certain redirects, so whether a given redirect is followed depends on the type the web site uses. There are three types of redirects that are typically used on the web:

Can screen-scraper work with sites that use HTTPS?

screen-scraper supports HTTPS on all supported platforms except certain early versions of Mac OS X. If you're using the screen-scraper proxy server to access a site that uses HTTPS, follow the directions under "Viewing encrypted transactions" on this documentation page: Using the Proxy Server. When setting up scrapeable files to access pages that use HTTPS, you don't need to treat them any differently than those that use HTTP.

The web site I'd like to scrape uses cookies. Can screen-scraper handle this?

Absolutely. screen-scraper handles cookies (and BASIC authentication tokens) transparently behind the scenes. When setting up screen-scraper to scrape information from your site, you rarely need to think about cookies. In certain cases, though, a site will set cookies in JavaScript, which screen-scraper does not pick up on its own. In such cases, you can set them yourself within a screen-scraper script via the session.setCookie method.
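To make the idea concrete, here is a minimal sketch in Python (not screen-scraper's own scripting API) of a cookie store that attaches cookies to matching requests, plus a cookie added by hand the way a script would add one the site normally sets in JavaScript. The helper names make_cookie and cookie_header_for, the domain example.com, and the cookie js_token are all made up for illustration; the manual step is only analogous in spirit to session.setCookie, whose exact signature is not shown here.

```python
import http.cookiejar
import urllib.request


def make_cookie(domain, name, value):
    """Build a session cookie by hand (hypothetical helper for this sketch)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def cookie_header_for(jar, url):
    """Return the Cookie header the jar would send for url, or None."""
    request = urllib.request.Request(url)
    jar.add_cookie_header(request)  # the jar decides which cookies match
    return request.get_header("Cookie")


# The jar plays the role of the scraping session's cookie store.
jar = http.cookiejar.CookieJar()

# Manually add a cookie that the site would normally set in JavaScript.
jar.set_cookie(make_cookie("example.com", "js_token", "abc123"))

if __name__ == "__main__":
    print(cookie_header_for(jar, "http://example.com/page"))
    print(cookie_header_for(jar, "http://other.test/page"))
```

Once the cookie is in the store, it is sent with every later request to the matching domain and withheld from other domains, which is the same division of labor the answer above describes: tracking is automatic, and only the JavaScript-set cookie needs a manual step.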

I'd like to scrape information from a web site that requires me to log in first. Can screen-scraper handle this?

Yes. This is a common situation, and generally just requires that you create a scrapeable file to handle logging in. This scrapeable file should be run first in the scraping session, allowing the web site to set cookies, which screen-scraper will then track for you.
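The login-first pattern can be sketched outside screen-scraper as well. The Python example below is only an illustration under stated assumptions: it starts a throwaway local server standing in for the real site, and the paths /login and /account, the credentials, and the cookie sid are all hypothetical. The point it shows is the ordering: the login request runs first, the cookie store captures the session cookie, and the second request succeeds because that cookie is sent back automatically.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoSite(BaseHTTPRequestHandler):
    """Stand-in for a site with a login form (hypothetical paths and credentials)."""

    def _reply(self, status, body, extra_headers=()):
        payload = body.encode("utf-8")
        self.send_response(status)
        for name, value in extra_headers:
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode("utf-8"))
        ok = form.get("user") == ["demo"] and form.get("password") == ["secret"]
        if self.path == "/login" and ok:
            self._reply(200, "logged in", [("Set-Cookie", "sid=abc123; Path=/")])
        else:
            self._reply(403, "bad credentials")

    def do_GET(self):
        # Every page other than the login form requires the session cookie.
        if "sid=abc123" in self.headers.get("Cookie", ""):
            self._reply(200, "account data")
        else:
            self._reply(401, "please log in")

    def log_message(self, *args):
        pass  # keep the demo quiet


def scrape_with_login(base_url):
    """Log in first, then fetch a protected page with the same cookie store."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # bypass any system proxy for the local demo
        urllib.request.HTTPCookieProcessor(jar),
    )
    form = urllib.parse.urlencode({"user": "demo", "password": "secret"}).encode("ascii")
    with opener.open(base_url + "/login", data=form) as response:  # step 1: log in
        response.read()
    with opener.open(base_url + "/account") as response:  # step 2: cookie sent automatically
        return response.read().decode("utf-8")


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoSite)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return scrape_with_login("http://127.0.0.1:%d" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

In screen-scraper terms, the POST to the login path corresponds to the login scrapeable file that runs first in the scraping session, and the later GET corresponds to any scrapeable file that follows it.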