Confusing HTTPS situation
So I have this site:
https://pa_allegheny.uslandrecords.com/palr/controller
And it's HTTPS, no problem. It works in a browser. However, when I run it through the proxy (version 5.5.4a) I get an error:
Error 107 (net::ERR_SSL_PROTOCOL_ERROR): SSL protocol error.
OK, fine, the proxy can be strange sometimes (though I've never had this issue before), so I'll do it by hand. I used Tamper with Firefox to capture the HTTP request/response flow and started plugging URLs into screen-scraper, the first being the simple:
https://pa_allegheny.uslandrecords.com/palr/controller
But from screen-scraper now I'm getting this:
Scraping file: "New Scrapeable File"
New Scrapeable File: Resolved URL: https://pa_allegheny.uslandrecords.com/palr/controller
Setting referer to: http://pa_allegheny.uslandrecords.com/palr/controller
New Scrapeable File: Sending request.
New Scrapeable File: An input/output error occurred while connecting to 'https://pa_allegheny.uslandrecords.com/palr/controller'. The message was URI does not specify a valid host name: https://pa_allegheny.uslandrecords.com/palr/controller.
Processing scripts after scraping session has ended.
Which confuses me more, as that's complaining about the host name, not about security.
=====
That said, I'm now having trouble getting the site to work in Firefox at all, even without the proxy, so I'm guessing the problem is with the site itself. Still, this felt like a situation where screen-scraper would be able to plow through and fake it enough to get where it needed to go.
I checked too, and currently I can't get the URL
https://pa_allegheny.uslandrecords.com/palr/controller
to come up in Chrome, IE9, or Firefox 4.
Robert,
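My guess is it's the underscore in the subdomain. Underscores can appear in DNS names, but the stricter hostname rules (letters, digits, and hyphens only) don't allow them, so client support varies. Out of the box, HTTPClient apparently does not support underscores in domain names, which would explain why the error complains about the host name rather than about SSL.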
I'll submit a bug for this and look for a fix in an upcoming alpha release.
Thanks,
Scott
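To illustrate the guess above, here is a minimal Java sketch, assuming the HTTP client resolves the host through `java.net.URI` (that is an assumption about the client's internals, not something confirmed in this thread). The class name `UnderscoreHostDemo`, its helper methods, and the `example.com` control URL are made up for the illustration. `java.net.URI` does not reject a host containing an underscore outright; it falls back to treating the authority as registry-based, so `getHost()` comes back as `null` while `getAuthority()` still holds the full name.

```java
import java.net.URI;

class UnderscoreHostDemo {
    // Host as parsed by java.net.URI; null when the authority is not a valid server-based host.
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    // Raw authority component, which is kept even when the host cannot be parsed.
    static String authorityOf(String url) {
        return URI.create(url).getAuthority();
    }

    public static void main(String[] args) {
        String underscoreUrl = "https://pa_allegheny.uslandrecords.com/palr/controller";
        String controlUrl = "https://example.com/palr/controller"; // hypothetical control URL

        System.out.println("underscore host:      " + hostOf(underscoreUrl));
        System.out.println("underscore authority: " + authorityOf(underscoreUrl));
        System.out.println("control host:         " + hostOf(controlUrl));
    }
}
```

A client that only looks at `getHost()` would see no host at all for the underscore URL, which is consistent with a "does not specify a valid host name" message. Reading the authority instead is one possible workaround, though whether that matches the actual fix is not stated here.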
I'm working on our scrape timelines and was curious whether this will land in an alpha "soon" or "at some point". We just need to know whether to put off scraping this site for a while or simply hang tight for a bit.
Thanks for the help, by the way! We love to find strange things for you to scratch your head over. :)
Robert,
Thank you for your patience. This has been fixed in 5.5.7a.
-Scott
Do you rock? Yes. Yes you do.
Wow. That was fast! Thanks!