URL with colon in it.
For some reason, the following URL doesn't return the same HTML in a scrape as it does in a browser:
http://www.dlcmgmt.com/property/output/center/detail/id:1439
I checked whether cookies or something similar were involved, but you can paste that URL into any browser and it works. Run it in a scrape and it doesn't. My only thought is that the extra colon is causing it. I tried using %3A instead, but to no avail. Thanks in advance.
Chris,
I had Mike look at it, and he found it's not the colon. The problem is in JTidy. If you change the tidier to Jericho or turn off tidying, it works.
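For anyone who wants to rule out the colon on their own, here is a rough sketch in Python (not part of the scraping tool itself; the variable names are just for illustration). It parses the URL both ways and compares the decoded paths. It only shows that `:` and `%3A` spell the same path, so swapping one for the other shouldn't be expected to change anything. It says nothing about how the server or JTidy treats the response.

```python
from urllib.parse import urlsplit, unquote

# The URL from the original post, and the same URL with the colon escaped.
raw = "http://www.dlcmgmt.com/property/output/center/detail/id:1439"
escaped = "http://www.dlcmgmt.com/property/output/center/detail/id%3A1439"

# urlsplit leaves the path as written; unquote decodes %3A back to ":".
raw_path = urlsplit(raw).path
escaped_path = unquote(urlsplit(escaped).path)

# Both spellings refer to the same path once decoded.
same_path = raw_path == escaped_path
print(raw_path)
print(escaped_path)
print(same_path)
```

If the two paths match, the colon is being parsed as an ordinary path character, which fits with the problem being in the tidying step rather than in the request.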
We just had a similar thing with a semicolon. I think it was HTTP Client adhering strictly to the spec, and we had to write in an allowance.
I confirmed this is an issue, and we're looking at it.