Sorry, tidying HTML failed. Returning the original HTML
Hello,
I want to scrape myspace.com, but when I try to access any page on MySpace I get this redirect:
Redirecting to: http://www.myspace.com/help/browserunsupported
and the screen scraper log says:
Sorry, tidying HTML failed. Returning the original HTML
Is there anything you can do here, please?
Thanks in advance
nebben,
What page are you visiting prior to the page that redirects you?
I tested myspace.com and made a few scrapeable files from the home page, a profile page and an all friends page. I did not experience the redirect you're talking about.
-Scott
Cool. I think it's any page of MySpace. I start with
http://www.myspace.com
I do nothing but scrape, and I'm redirected to the "browser unsupported" page.
How did you do that?
Can you send me your .sss file? (But don't forget to leave out your credentials.)
[email protected]
It's on its way to you via email...
The HTML tidy isn't a big deal. It's a tool to make the source HTML look nicer. On the advanced tab of the scrapeable file you can just change it to try Jerico, and if that's no good you can turn off tidy; it will still work.
Thanks,
OK, now I don't get the tidy message anymore, but it's still:
Redirecting to: http://de.myspace.com/help/browserunsupported
What version of screen-scraper are you on? Older versions defaulted to the IE 6 user agent. If you use setUserAgent to set something newer, it should help. Try:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Or
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11
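To illustrate why the user agent matters, here is a rough sketch of the same idea outside screen-scraper, using only Python's standard library. It is not screen-scraper's API; the names `build_request`, `fetch`, and `FIREFOX_UA` are made up for this example, and it assumes the site picks the "browser unsupported" redirect based on the User-Agent header alone.

```python
import urllib.request

# User-agent string copied from the list above (Firefox 10 on Windows 7).
FIREFOX_UA = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) "
    "Gecko/20100101 Firefox/10.0.2"
)


def build_request(url, user_agent=FIREFOX_UA):
    """Return a Request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=FIREFOX_UA, timeout=10):
    """Fetch the page body as bytes; raises urllib.error.URLError on failure."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    req = build_request("http://www.myspace.com")
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Calling `fetch("http://www.myspace.com")` would then send the request with the newer browser string instead of the library default, which is the same effect setUserAgent is meant to have inside a scraping session.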
thanks