Different results on server vs workbench
I've attached a very simplified version of a scrape we've created. When I run this scrape on my local computer in the screen-scraper workbench, it returns a normal result, but as soon as I upload it to a server interface and run it there, it gets a 400 Bad Request error. The screen-scraper versions are basically the same (recent alphas), so why would it behave differently?
Thanks,
Chris
I tested in 5.0.22a, and it worked in both the workbench and the server. I have a theory, though: version 5.0.20a had a bug that caused scrapes to fail on import. I suspect your server is on that version and is not importing the scrape as expected. Updating to 5.0.22a should fix it.
I upgraded the server to 5.0.22a (it was 5.0.13a), but it still gave me the error. Here is the output of the log file:
Test: Resolved URL: http://news.bonton.com/apps/storelocator/searchresults_popup.asp?d=130&state=IL
Test: Sending request.
Test: Warning! Received a status code of: 400.
Test: Applying extractor pattern: Untitled Extractor Pattern
Test: Extracting data for pattern "Untitled Extractor Pattern"
Test: The following data elements were found:
Untitled Extractor Pattern--DataRecord 0:
ALL=xmlns="http://www.w3.org/1999/xhtml"><head><meta name="generator" content="HTML Tidy, see www.w3.org" /><title>Invalid URL</title></head><body><h1>Invalid URL</h1>The requested URL "/apps/storelocator/searchresults_popup.asp?d=130&state=IL", is invalid.<p>Reference #9.812f0660.1287445857.b2db48a</p></body>
Processing scripts after scraping session has ended.
Scraping session "Test Scrape" finished.
If you want to run it on this server, you can do so here:
http://ec2-184-72-175-139.compute-1.amazonaws.com:8779/ScreenScraperWeb.html
Is it possible that your EC2 address is blocked? No one here can reproduce it. If you launch server mode on your development machine, does the scrape run there?
I think I see the problem: the ampersand in the GET string is HTML encoded. If you use the parameters tab to set the parameters (make 2 separate entries there), that won't happen. But now I'm confused as to 1) why it would happen in the server and not the workbench, and 2) why it's not happening for me.
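To illustrate why an HTML-encoded ampersand in a GET string matters, here is a minimal Python sketch (not screen-scraper code). The helper `query_params` is hypothetical, and whether the server actually sends the encoded form is only an assumption; the sketch just shows how a receiver would split each form of the URL from the log.

```python
from html import unescape
from urllib.parse import urlsplit


def query_params(url):
    """Split a URL's query string on '&' into a name -> value dict."""
    query = urlsplit(url).query
    return dict(pair.split("=", 1) for pair in query.split("&") if "=" in pair)


# The URL as it would look if the ampersand were left HTML encoded.
encoded_url = (
    "http://news.bonton.com/apps/storelocator/"
    "searchresults_popup.asp?d=130&amp;state=IL"
)

# Encoded form: the second parameter is named "amp;state", not "state".
print(query_params(encoded_url))

# Decoding the HTML entity first restores the intended "state" parameter.
print(query_params(unescape(encoded_url)))
```

With the encoded form, the receiving server sees a parameter called `amp;state` and no `state` parameter at all, which is one way a request that looks right on screen could still be rejected.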
I will keep looking.
I saw that, too, where the & was encoded out, so I tried it both ways and it was still the same. You'll notice the resolved URL has the correct & on it. I think it has the &amp; there just so it shows up in the error page with the correct-looking URL.
This EC2 instance was just created yesterday, for the sole purpose of testing this problem (I have to shut it down, by the way; it's causing some problems with my controller). It's not the first EC2 instance I've tried it on, either.