Transfer-Encoding: chunked - causing ERROR to be displayed
I'm downloading files using the session.downloadFile method and noticing an error displayed during the scraping session. The response header has Transfer-Encoding: chunked. The files appear to have downloaded properly and completely, but the ERROR is disconcerting and may be causing asynchronous issues within the Java environment the session is called from. The scraping session is set up to use HttpClient. Any thoughts on why this would appear as an error?
ERROR: Failed to save the file: C:\Temp\test.pdf. The error message was: chunked stream ended unexpectedly.
Any help would be greatly appreciated.
rhelsen49,
Can you send an example URL that you're trying to download and the line of code you're using?
-Scott
scottw[at]our-domain
script code
Not sure if this is enough or if you need the entire session file.
The starting point is:
http://www.centraltransportint.com/confirm/trace.aspx
Submit 55524821653 as the PRO Number, then select Trace.
The script below builds a URL like the following and then calls session.downloadFile:
http://www.centraltransportint.com/confirm/DocumentOut.aspx?pro=121-5683066-8&type=pod
// Values extracted earlier in the scraping session.
String docType = dataRecord.get("DOCTYPE");
String realPRO = dataRecord.get("PRONUM");
String baseURL = "";
String slash = System.getProperty("file.separator");
baseURL += "http://www.centraltransportint.com/confirm/DocumentOut.aspx?";

// Default to the system temp directory unless DOWNLOAD_DIR is set.
String downloadDir = System.getProperty("java.io.tmpdir");
if (session.getVariable("DOWNLOAD_DIR") != null) {
    downloadDir = session.getVariable("DOWNLOAD_DIR");
}
session.log("downloadDir = " + downloadDir);

// Build the document URL from the session variables.
String docurl;
String filename = "CTII_" + docType + "_" + realPRO + ".pdf";
docurl = baseURL + session.getVariable("PRO") + realPRO + "&" + session.getVariable("DOC_TYPE") + docType;
// Decode any HTML-escaped ampersands picked up from the page source.
docurl = docurl.replaceAll("&amp;", "&");
session.log("docurl = " + docurl);
String downloadURL = scrapeableFile.resolveRelativeURL(docurl);
session.log("resolved URL = " + downloadURL);

// Save the file and record where it went.
session.setVariable("ORIGINAL_FILENAME", filename);
filename = downloadDir + slash + filename;
session.log("filename = " + filename);
if (session.downloadFile(downloadURL, filename)) {
    session.log(filename + " downloaded");
}
else {
    session.log(filename + " download unsuccessful");
}
session.setVariable("DOWNLOADED_FILE", filename);
rhelsen49,
I was able to replicate the problem you're having. I think you may have revealed a bug in HttpClient. We'll do more checking, but if it is a bug in HttpClient (and not in screen-scraper itself) then we may not have a fix for it right away. I'll get back to you about any changes you can expect with HttpClient and screen-scraper.
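For context on the message itself: with chunked transfer encoding the body arrives as a series of chunks, each prefixed by its size in hex, and the body ends with a zero-length chunk. A decoder will typically report an error like the one above when the connection closes before that final zero-length chunk arrives, even if every data chunk was received. That would be consistent with the files looking complete, though it is only one possible explanation. The sketch below is not screen-scraper or HttpClient code; the class ChunkedSketch and its decode method are hypothetical names used purely to illustrate the idea.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the decoder screen-scraper or HttpClient uses.
class ChunkedSketch {

    // Reads up to the next '\n'; returns null if the stream ends first.
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                return sb.toString().trim(); // trim drops the '\r'
            }
            sb.append((char) b);
        }
        return null;
    }

    // Decodes a chunked body from 'in', writing payload bytes to 'out' as they arrive.
    public static void decode(InputStream in, OutputStream out) throws IOException {
        while (true) {
            String sizeLine = readLine(in);
            if (sizeLine == null) {
                // The stream ended where a chunk-size line was expected.
                throw new IOException("chunked stream ended unexpectedly");
            }
            int semi = sizeLine.indexOf(';'); // ignore any chunk extensions
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16);
            if (size == 0) {
                return; // terminating chunk; trailers are ignored in this sketch
            }
            for (int i = 0; i < size; i++) {
                int b = in.read();
                if (b == -1) {
                    throw new IOException("chunk data truncated");
                }
                out.write(b);
            }
            readLine(in); // consume the CRLF that follows the chunk data
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "5\r\nhello\r\n0\r\n\r\n", // well-formed: ends with a zero-length chunk
            "5\r\nhello\r\n"           // all data present, terminator missing
        };
        for (String sample : samples) {
            ByteArrayOutputStream saved = new ByteArrayOutputStream();
            try {
                byte[] raw = sample.getBytes(StandardCharsets.US_ASCII);
                decode(new ByteArrayInputStream(raw), saved);
                System.out.println("OK, saved " + saved.size() + " bytes");
            } catch (IOException e) {
                System.out.println("ERROR: " + e.getMessage()
                        + " (saved " + saved.size() + " bytes)");
            }
        }
    }
}
```

In the second sample the full payload has already been written by the time the exception is thrown, which is how a file can be complete on disk while the download is still reported as failed.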
-Scott