Uploading Data Files - CONNECT www.xxxx.com:443 HTTP/1.
I am having problems with the proxy session: it shows an error status each time I attempt to upload data files to a website. The website offers a web page where you select the data file on your local hard drive, choose a couple of optional attributes, and then click Submit to upload the file. The proxy session shows a CONNECT string under the request line, but the status is Error.
CONNECT home.netscape.com:443 HTTP/1.0
I have been able to scrape screens and download data files but I seem to run into difficulties when I try to automate uploads. Any hints on how to accomplish this?
Thanks.
Uploading Data Files - CONNECT www.xxxx.com:443 HTTP/1.
Hi,
It's going to be pretty tricky to help troubleshoot this without seeing your scraping session. Is there a chance you could send it to me so that I can take a closer look? By the way, I'm guessing there may be some sensitive information involved here (e.g., usernames and passwords), which we deal with quite frequently. If you'd like, we could put an NDA in place before we help you. If this is a viable route, you can email me directly. My email address is my first name at screen-scraper.com.
Kind regards,
Todd
More Help on Uploads
I am still having trouble getting file uploads to work. Below is one upload form I have spent countless hours on:
<form action='edi.php' method='post' enctype='multipart/form-data'><table align="center" cellspacing="0" cellpadding="2" width="450"><tr><td bgcolor="#C4DF9A" class="leftgreennav">> FLAT FILE</td><input type='hidden' name='submitted' value='TRUE' id='1175971826'><input type='hidden' name='MAX_FILE_SIZE' value='1048576'><td bgcolor="#c4df9a" class="leftgreennav"><input type='file' name='file[]'></td></tr><tr><td bgcolor="#c4df9a" class="leftgreennav" colspan="2" align="center"><input type='submit' value='Upload'></td></tr></form>
I set up a scrape with the following parameters (in the sequence below):
key: filename value: c:\temp\feed.txt type: FILE
key: submitted value: TRUE type: POST
key: MAX_FILE_SIZE value: 1048576 type: POST
key: submit value: Upload type: POST
I keep receiving the following errors in the response from the website:
function checkform(){
// make sure start date is before the end date
if (Date.parse(document.tracking.date2.value) < Date.parse(document.tracking.date1.value)) {
alert("Invalid Date Range!\nStart Date cannot be after End Date!");
return false;
}
}
Here is the raw request I was able to capture from the proxy session:
Cookie: PHPSESSID=04a5c30daa076cbc20363245d31c5285
Keep-Alive: 300
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070219 Firefox/2.0.0.2
Content-Length: 671
Accept-Encoding: gzip,deflate
Accept-Language: en-us,en;q=0.5
Content-Disposition: form-data; name="file[]"; filename="edi_feed.txt"
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Referer: https://xxxxxxx.xxxxxxx/edi.php
Content-Type: text/plain
Content-Disposition: form-data; name="submitted"
Connection: keep-alive
Host: xxxxxxx.xxxxxxxx.com
Content-Disposition: form-data; name="MAX_FILE_SIZE"
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Content-Type: multipart/form-data; boundary=---------------------------13142435111958
-----------------------------13142435111958--
Thanks in advance for your time and any hints you can provide.
Uploading Data Files - CONNECT www.xxxx.com:443 HTTP/1.
Nope. That's generally only used for CSS styles.
Todd
Uploading Data Files - CONNECT www.xxxx.com:443 HTTP/1.
Is the "ID" attribute important?
Uploading Data Files - CONNECT www.xxxx.com:443 HTTP/1.
Hi,
In this example the key would be "submitted" and the value would be "TRUE" (omit the double quotes when entering them into screen-scraper).
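As a rough illustration of that mapping outside of screen-scraper, here is a minimal Python sketch that reads the name and value attributes off an input tag and treats them as the key and value. The class and function names (InputCollector, form_parameters) are made up for this example and are not part of screen-scraper; the id attribute is skipped because it is not sent with the form.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collects (key, value) pairs from <input> tags that have a name."""

    def __init__(self):
        super().__init__()
        self.params = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            # Only name and value are kept; id is ignored on purpose.
            self.params.append((name, attributes.get("value") or ""))


def form_parameters(html):
    """Hypothetical helper: return the key/value pairs found in the HTML."""
    collector = InputCollector()
    collector.feed(html)
    collector.close()
    return collector.params


tag = "<input type='hidden' name='submitted' value='TRUE' id='1175092081'>"
print(form_parameters(tag))  # [('submitted', 'TRUE')]
```

This sketch does not filter by input type, so unchecked radio buttons and file inputs would also be listed; those need to be handled by hand.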
Just let me know if I can clarify anything else.
Kind regards,
Todd
Post Parameter Definition / Uploading Files
Hi Todd. Thanks for your help. I am having trouble determining what the post parameters should be for the input tags I am seeing.
For example:
<input type='hidden' name='submitted' value='TRUE' id='1175092081'>
If this should be a post parameter, what would be the key and value translation?
I am using screen scraper pro version.
THANKS.
Uploading Data Files - CONNECT www.xxxx.com:443 HTTP/1.
Hi,
We do have plans to add that to the proxy server, but it may be a while. There are quite a few other features competing for priority.
You're correct that any other parameters would be designated as type POST.
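To show how the file and the ordinary fields travel together in one request, here is a minimal Python sketch of a multipart/form-data body. It is only an illustration of the wire format, not how screen-scraper builds the request internally. The helper name build_multipart, the sample file name, and the sample file contents are assumptions; the field names come from the form quoted in this thread.

```python
import uuid


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Hypothetical helper: return (Content-Type header value, body bytes)."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Ordinary form fields (what screen-scraper calls type POST).
    for name, value in fields:
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The file part (what screen-scraper calls type FILE).
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: text/plain\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    # Closing boundary marks the end of the body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


# Sample values are assumptions; "Y" selects overwrite and test mode.
content_type, body = build_multipart(
    fields=[("local_filepath", ""), ("OVERWRITE", "Y"), ("TEST", "Y")],
    file_field="Submitted_ImportFile",
    filename="feed.txt",
    file_bytes=b"sample,data\n",
)
print(content_type)
print(len(body), "bytes")
```

The radio button options end up as plain name/value parts, exactly like hidden fields; only the file input gets a filename and its own Content-Type.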
Just let me know if I can clarify further.
Kind regards,
Todd
RE: Uploading Data Files
Thanks for your help Todd. Are there any plans to add this proxy server functionality in the future?
I tried to follow the documentation and set up a parameter with the FILE type and the file path. I am not sure what to do with the miscellaneous options that you choose for the upload file in this particular form. Would these options typically be submitted as POST-type parameters in addition to the FILE-type parameter?
For example -
<td width="200" align="right" valign="top"> <b>File Name:</b></td>
<td align="left">
<input type="file" size=40 name="Submitted_ImportFile" value=""> (max 10MB)
<input type="hidden" name="local_filepath" value="">
<br>
</td>
</tr>
<tr bgcolor="#EEEEEE">
<td width="200" align="right" valign="top"><nobr><b>Overwrite Existing Data?</b></nobr></td>
<td align="left"> <input type="radio" name="OVERWRITE" value="" checked>
No, do not change any records already in my database.<br> <input type="radio" name="OVERWRITE" value="Y" >
Yes, replace any existing data with my new updated data.<br> <input type="radio" name="OVERWRITE" value="T" >
Clear entire table (delete all records), and replace with this import
file<br>
</td>
</tr>
<tr bgcolor="#EEEEEE">
<td width="200" align="right" valign="top"> <b>Test Mode?</b></td>
<td align="left"> <input type="radio" name="TEST" value="" checked>
Go ahead and import my data.<br> <input type="radio" name="TEST" value="Y" >
Test my file; do not import it yet.<br>
</td>
</tr>
Uploading Data Files - CONNECT www.xxxx.com:443 HTTP/1.
Hi,
I'm sorry to say this is actually a shortcoming in the proxy server: it won't currently handle file uploads. You can do file uploads in a scraping session, however. Documentation on that can be found under the "Parameters tab" section on this page:
http://www.screen-scraper.com/support/docs/using_scrapeable_files.php
Kind regards,
Todd Wilson