The filename, directory name, or volume label syntax is incorrect

Hello Support,

I would imagine that my problem appears simple but I've been working on this one for about six hours with no solution. I'm saving a website URL as a session variable...

Processing script: "04 Realtor Scrape Agent Website"
 UNFIXED: AGENT_WEBSITE = href="http://www.kevincrozier.com""

Removed unwanted text variables...

FIXED AGENT_WEBSITE = "http://www.kevincrozier.com"

Double checked to see if the URL is correct...

The agent's website is: "http://www.kevincrozier.com"

In the Properties tab I inserted the website URL token...

~#AGENT_WEBSITE#~/

The remaining log is as follows:

Scraping file: "05 Realtor Agent Website"
 05 Realtor Agent Website: Processing scripts before a file is scraped.
 05 Realtor Agent Website: Preliminary URL: ~#AGENT_WEBSITE#~/
 Scraping local file: "http://www.kevincrozier.com"/

But I'm receiving the following error:

An error occurred while reading in the file: "http://www.kevincrozier.com"/. The error was "http:\www.kevincrozier.com" (The filename, directory name, or volume label syntax is incorrect). The scrapeable file was: 05 Realtor Agent Website, and the scraping session was: Agent Search.

As an alternative, I tried the Parameters tab with a blank key, a value of ~#AGENT_WEBSITE#~, and a type of GET, leaving the URL in the Properties tab blank. However, I received an error as well. The log is as follows...

Processing script: "04 Realtor Scrape Agent Website"
 UNFIXED: AGENT_WEBSITE = href="http://www.alaskaviewproperties.com"
 FIXED AGENT_WEBSITE = "http://www.alaskaviewproperties.com"
 The agent's website is: "http://www.alaskaviewproperties.com"
 Scraping file: "05 Realtor Agent Website"
 05 Realtor Agent Website: Processing scripts before a file is scraped.
 05 Realtor Agent Website: Preliminary URL: 
 Scraping local file: 
 An error occurred while reading in the file: . The error was . The scrapeable file was: 05 Realtor Agent Website, and the scraping session was: Agent Search.
 05 Realtor Agent Website: Processing scripts after a file is scraped.

Thanks for your help.

Adrianjay on 03/06/2010 at 11:09 pm

Are you using version 4.5, or a more recent alpha? I seem to remember that we saw this on some systems with 4.5, but an upgrade would fix it.

http://community.screen-scraper.com/faq#80n867

jason on 03/08/2010 at 9:41 am

Version 4.5

Hi Jason,

I have version 4.5.

Regards,

Adrian

Adrianjay on 03/08/2010 at 12:38 pm

Do try the newest version (4.5.36a as of now), and I think that will correct it.

jason on 03/08/2010 at 1:40 pm

Edit the "Version" and Same Error

How do I edit the "Version" property of the "resource/conf/screen-scraper.properties" file so that it reflects the new version? I have Windows XP Professional and I was unable to complete this step.

I think the new version is working because the layout of screen-scraper is different. However, I'm still getting the same error message as before. I noticed that in the error message...

(The error was "http:\www.kevincrozier.com" (The filename, directory name, or volume label syntax is incorrect). The scrapeable file was: 05 Realtor Agent Website, and the scraping session was: Agent Search.)

the "//" is displayed as "\". Does that indicate anything? Any suggestions on how to fix this error?

Adrianjay on 03/09/2010 at 2:05 am

You shouldn't need to edit that if you're upgrading through the GUI.

That slash in the URL is definitely a problem, but I have no idea where it's coming from.
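
For what it's worth, one possible explanation (an assumption, not something confirmed for screen-scraper itself) is that the value is being handled as a local file path rather than as a web address, which would also fit the "Scraping local file" line in the log. Java's java.io.File normalizes path strings, so a doubled slash does not survive, and on Windows the separator becomes a backslash. The sketch below is plain Java; the class and method names are made up for illustration.

```java
import java.io.File;

public class FilePathDemo {
    // Returns the path string that java.io.File keeps for the given input.
    static String asFilePath(String value) {
        return new File(value).getPath();
    }

    public static void main(String[] args) {
        String url = "http://www.kevincrozier.com";
        // File normalizes separators for the current platform, so the
        // "//" after "http:" is collapsed; on Windows it should come out
        // with a backslash, much like the text in the error message.
        System.out.println(asFilePath(url));
    }
}
```

If that is what is happening, the backslash is a symptom of the URL being read as a file name, not a separate bug in the URL itself.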

jason on 03/09/2010 at 10:06 am

Removing Double Quotes and GUI

Hi Jason,

Thanks to Scott, I think the problem is that I did not remove the quotes from the variable. For example, I'm using the following:

value = value.replaceAll(",", "");

to remove all commas from the variable but I don't know how to alter the above code to remove the double quotes.

Also, when I select "About screen-scraper" from the Help menu, it states that the version is 4.5, even though the version I'm using appears to be newer.

Thanks Scott and Jason !!!! I really appreciate your help.

Adrianjay on 03/09/2010 at 1:37 pm

Adrian,

The following will remove the double-quotes.

value = value.replaceAll("\"", "");
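
As a side note, replaceAll treats its first argument as a regular expression. When the text to remove is a fixed string, String.replace does the same job with literal matching. Here is a minimal standalone sketch in plain Java; the class and method names are only for illustration.

```java
public class QuoteStripDemo {
    // Removes every double-quote character from the value.
    // String.replace matches its arguments literally, not as a regex.
    static String stripQuotes(String value) {
        return value.replace("\"", "");
    }

    public static void main(String[] args) {
        String quoted = "\"http://www.kevincrozier.com\"";
        System.out.println(stripQuotes(quoted));
    }
}
```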

To verify that you're fully updated to the latest alpha, follow these instructions.

-Scott

swilsonmc on 03/09/2010 at 2:06 pm

Same Error Message...

Hi Scott,

I updated Screen-Scraper to the most current version and removed the double quotes but I'm still receiving the same error message.

The script (04 Realtor Scrape Agent Website) to remove the unwanted text from the variable URL is:

//Remove unwanted text based values
String[] variables = {"AGENT_WEBSITE"};

i = 0;

// Iterate through each variable in the array above
while (i < variables.length){

//Get the variables to be fixed
value = session.getVariable(variables[i]);

//Log the UNFIXED values
session.log("UNFIXED: " + variables[i] + " = " + value);

if(value != null){
//Remove commas, double quotes, and the href= prefix
value = value.replaceAll(",", "");
value = value.replaceAll("\"","");
value = value.replaceAll("href=", "");

// Set variables with new values
dataRecord.put(variables[i], value);
session.setVariable(variables[i], value);

//Log the FIXED values
session.log("FIXED " + variables[i] + " = " + session.getVariable(variables[i]));
}
i++;
}

// Make sure it is correct.
session.log( "The agent's website is: " + session.getVariable("AGENT_WEBSITE") );

session.scrapeFile( "05 Realtor Agent Website" );

The log is as follows:

Processing script: "04 Realtor Scrape Agent Website"
 UNFIXED: AGENT_WEBSITE = href="http://www.debbiehigbee.com"
 FIXED AGENT_WEBSITE = http://www.debbiehigbee.com
 The agent's website is: http://www.debbiehigbee.com
 Scraping file: "05 Realtor Agent Website"
 05 Realtor Agent Website: Processing scripts before a file is scraped.
 05 Realtor Agent Website: Preliminary URL: ~#AGENT_WEBSITE#~
 Scraping local file: http://www.debbiehigbee.com
 An error occurred while reading in the file: http://www.debbiehigbee.com. The error was http:\www.debbiehigbee.com (The filename, directory name, or volume label syntax is incorrect). The scrapeable file was: 05 Realtor Agent Website, and the scraping session was: Agent Search.
 05 Realtor Agent Website: Processing scripts after a file is scraped.

I find it very strange that the error message changes the "//" in the URL to "\". I hope I have included enough information to solve the problem.

I'm so thankful for your help that I'm considering legally changing my son's name to Scott.

Adrianjay on 03/09/2010 at 3:05 pm

Adrian,

A couple of things here. It's very rare that you'll ever need to modify a URL that you extract from a page. Very rare. So, be very certain you want to modify the URL. I realize that "href=" does not belong, but could you modify your extractor pattern so that it does not include the "href="?
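
To illustrate the idea of capturing only the URL in the first place, here is a rough sketch in plain Java using java.util.regex. It is not screen-scraper's extractor pattern syntax (in screen-scraper you would adjust the extractor pattern itself), and the class name, method name, and sample HTML are assumptions made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefExtractDemo {
    // The href=" prefix and the closing quote sit outside the capture
    // group, so only the URL between the quotes is returned.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    // Returns the first href value found, or null if there is none.
    static String extractHref(String html) {
        Matcher m = HREF.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.debbiehigbee.com\">Agent site</a>";
        System.out.println(extractHref(html));
    }
}
```

With the surrounding text kept out of the captured value, there is nothing left to strip afterwards.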

If you absolutely need to modify the URL (or any string for that matter) to remove certain things, here is the approach you could take.

String fixString (String value) {
    if (value != null) {
        value = value.replaceAll(",", "");
        value = value.replaceAll("\"", "");
        value = value.replaceAll("href=", "");
        value = value.trim();
    }
    return value;
}

Now, elsewhere in your script you'll apply the above function to the string by doing this.

//write to the log the value of the session variable
session.log("MY_STRING: " + session.getVariable("MY_STRING"));

//set the value of your session variable to a local variable
myString = session.getVariable("MY_STRING");
session.log("before fix myString: " + myString);

//apply the fix function to the string
myString = fixString(myString);
session.log("after fix myString: " + myString);

//set the resulting variable back to the session variable
session.setVariable("MY_STRING", myString);
session.log("MY_STRING: " + session.getVariable("MY_STRING"));
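
If it helps with debugging, here is a small standalone sketch in plain Java for checking that the cleaned value really begins with "http://" or "https://". It rests on an assumption that is not confirmed in this thread: that a stray quote or leading space in front of the URL is what keeps it from being recognized as a web address. The class and method names are hypothetical.

```java
public class UrlCheckDemo {
    // True only if the value starts with an http or https scheme.
    // The value is deliberately not trimmed here, so stray leading
    // characters show up as a failed check.
    static boolean startsWithHttp(String value) {
        if (value == null) {
            return false;
        }
        String lower = value.toLowerCase();
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    // Wraps the value in brackets so leading or trailing whitespace
    // is visible when the result is written to a log.
    static String bracketed(String value) {
        return "[" + value + "]";
    }

    public static void main(String[] args) {
        String raw = " \"http://www.debbiehigbee.com\"";
        String cleaned = raw.replace("\"", "").trim();
        System.out.println(bracketed(raw) + " -> " + startsWithHttp(raw));
        System.out.println(bracketed(cleaned) + " -> " + startsWithHttp(cleaned));
    }
}
```

Logging the value between brackets makes invisible characters easy to spot, which is hard to do from the plain log line alone.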

I hope this helps.
-Scott

swilsonmc on 03/09/2010 at 4:47 pm
