Scraping Session: Advanced tab

Advanced tab

  • Max retries per file (professional and enterprise editions only): The number of times screen-scraper should attempt to request a page if a request fails. Some web sites are not completely reliable, so a given page may need to be requested more than once.
  • Cookie policy (professional and enterprise editions only): The way screen-scraper works with cookies. In most cases you won't need to modify this setting.

    There may be instances where you find yourself unable to log in to a web site or advance through pages as you're expecting. If you've checked other settings, such as POST and GET parameters, you may need to adjust the cookie policy. Some web sites issue cookies in uncommon ways, and adjusting this setting will allow screen-scraper to work correctly with them.

  • Character set (professional and enterprise editions only): Set the character set for the scraping session.

    If pages are rendering with strange characters, the character set is likely wrong. You should also try turning off tidying if international characters aren't being rendered properly.
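
    As a rough illustration of why the wrong character set produces strange characters, the sketch below decodes the same bytes two ways in plain Java. It does not use screen-scraper's own API; the class name CharsetDemo and the sample text are assumptions made up for this example.

    ```java
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;

    // Hypothetical helper class, not part of screen-scraper.
    class CharsetDemo {
        // Decode raw response bytes using the named character set.
        static String decode(byte[] body, String charsetName) {
            return new String(body, Charset.forName(charsetName));
        }

        public static void main(String[] args) {
            // "caf\u00e9" is "cafe" with an acute accent on the e, encoded as UTF-8.
            byte[] body = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

            // Decoded with the matching character set, the text round-trips.
            System.out.println(decode(body, "UTF-8"));

            // Decoded as ISO-8859-1, the two UTF-8 bytes of the accented
            // letter become two unrelated characters.
            System.out.println(decode(body, "ISO-8859-1"));
        }
    }
    ```

    If scraped text shows pairs of odd characters where a single accented letter is expected, a mismatch like this one is a plausible cause.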

  • Key store file path: The path to a JKS file that contains the certificates required for this scrape.
  • Key store password: The password used when generating the JKS file.

    Some web sites require you to supply a client certificate, which you would have been given previously, in order to access them. This feature allows you to access this type of site while using screen-scraper.
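
    To check that a JKS file and its password match before entering them here, a minimal sketch using the standard java.security.KeyStore class might look like the following. The class name KeyStoreDemo, the helper methods, and the password "changeit" are assumptions for illustration; the example builds an empty key store in memory so that it runs without a real file.

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.security.GeneralSecurityException;
    import java.security.KeyStore;

    // Hypothetical helper class, not part of screen-scraper.
    class KeyStoreDemo {
        // Build an empty JKS key store in memory, protected by the password.
        static byte[] emptyJks(char[] password) {
            try {
                KeyStore ks = KeyStore.getInstance("JKS");
                ks.load(null, null); // initialize an empty store
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                ks.store(out, password);
                return out.toByteArray();
            } catch (GeneralSecurityException | IOException e) {
                throw new IllegalStateException(e);
            }
        }

        // Load JKS bytes with the password and return the number of entries.
        // Loading fails with an exception if the password does not match.
        static int countEntries(byte[] jksBytes, char[] password) {
            try {
                KeyStore ks = KeyStore.getInstance("JKS");
                ks.load(new ByteArrayInputStream(jksBytes), password);
                return ks.size();
            } catch (GeneralSecurityException | IOException e) {
                throw new IllegalStateException(e);
            }
        }

        public static void main(String[] args) {
            char[] password = "changeit".toCharArray();
            System.out.println("entries: " + countEntries(emptyJks(password), password));
        }
    }
    ```

    For a real file, the bytes would come from the path entered in the key store file path field rather than from emptyJks.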

  • External proxy authentication: These text boxes are used in cases where you need to connect to the Internet via an external proxy server.
    • Username: Your username on the proxy server.
    • Password: Your password on the proxy server.
    • Host: The host/domain of the proxy server.
    • Port: The port that you use on the proxy server.
  • External NT proxy authentication: These text boxes are used in cases where you need to connect to the Internet via an external NT proxy server.

    If you are using NTLM (Windows NT) authentication you'll need to designate settings for both the standard proxy as well as the NTLM one.

    • Username: Your username on the NT proxy server.
    • Password: Your password on the NT proxy server.
    • Domain: The domain/group name that the NT proxy server uses.
    • Host: The host of the proxy server.
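
For comparison, the four standard proxy fields above (username, password, host, port) correspond roughly to what a plain Java program would supply through java.net.Proxy and java.net.Authenticator. This is only a sketch of the general idea, not how screen-scraper applies the settings internally; the class name ProxyDemo, the host proxy.example.com, and the credentials are placeholders. NTLM authentication is not covered here.

```java
import java.net.Authenticator;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;

// Hypothetical helper class, not part of screen-scraper.
class ProxyDemo {
    // Describe an HTTP proxy by host and port.
    static Proxy httpProxy(String host, int port) {
        // createUnresolved avoids a DNS lookup for the placeholder host.
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Supply the proxy username and password when the JVM asks for them.
    static Authenticator credentials(final String username, final String password) {
        return new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication(username, password.toCharArray());
            }
        };
    }

    public static void main(String[] args) {
        Proxy proxy = httpProxy("proxy.example.com", 8080);
        Authenticator.setDefault(credentials("user", "secret"));
        System.out.println(proxy);
    }
}
```

A connection opened with url.openConnection(proxy) would then be routed through that proxy.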