Unable to scrape website with French characters (again)!!
I'm sorry to bother you again, but I still have my problem with French accents when I try to invoke screen-scraper from Java. It doesn't seem to be a hard problem, but I have had no response.
I would like to buy Screen Scraper Professional, but I want to evaluate it first.
Here is the URL: http://www.novaplanet.com/bons-plans/?ville=1&ddj=2009-08-16
When I define ISO-8859-1 as the character set, the characters are OK in the extractor pattern and also when I write them to a file.
But when I try to invoke my screen-scraper project from Java (with RemoteScrapingSession), I get bad characters like "?". I have tried a lot of things but nothing works. Maybe it has something to do with RemoteScrapingSession and the character encoding?
Hope you can help me!
Cheers,
Gils
I am on a Windows XP machine, and sometimes the OS can make a difference, so if this doesn't help, let me know your OS.
class="titre">~@TEST@~
There is a RegEx in the TEST token for non-HTML.
That works for me, and I get the correct characters.
Did you try from Java?
As I said in my last mail, when I define "ISO-8859-1" as the default character set, it works fine in screen-scraper (extractor pattern) and when I write the session to a file (see the script "WriteNOVADATA" in the mail I sent you).
But when I invoke my scraping session from Java (with RemoteScrapingSession), I get bad characters like "?".
Did you try with a Java client? (I can send you my full project again.)
I tried on Mac and PC: same problem ☹
Regards
Gilles
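For anyone reading along with the same symptom, here is a minimal sketch of one common way a literal "?" can appear in Java. It uses only the standard library, not screen-scraper's API, and the class and method names are hypothetical. Whether this is what happens between RemoteScrapingSession and the client in this thread is an assumption, not something confirmed here. The idea: when text is encoded with a charset that cannot represent "é" (US-ASCII in this sketch), Java silently substitutes "?".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how an unmappable character becomes "?".
final class QuestionMarkDemo {

    // Encode the text to bytes with the given charset, then decode it back
    // with the same charset. Characters the charset cannot represent are
    // replaced by the charset's replacement byte, which is '?' for US-ASCII.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String sample = "th\u00e9matique"; // "thématique", escaped to be source-encoding safe

        // ISO-8859-1 can represent é, so the text survives the round trip.
        System.out.println(roundTrip(sample, StandardCharsets.ISO_8859_1));

        // US-ASCII cannot represent é, so it is replaced during encoding.
        System.out.println(roundTrip(sample, StandardCharsets.US_ASCII));
    }
}
```

If the client shows "?" while the file written by screen-scraper is correct, it may be worth checking which charset each side uses when the text crosses the socket or is printed to the console.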
That part works for my test too. My script is just:
// Define new session
myScrapingSession = new com.screenscraper.scraper.RunnableScrapingSession("Nova");
// Run
myScrapingSession.scrape();
I suspect you may be passing the "session" when you define the new session. If so, I don't think you need it.
I'm using RemoteScrapingSession in my Java client, not RunnableScrapingSession! So screen-scraper is running as a server.
Maybe my problem is similar to this one:
http://community.screen-scraper.com/node/1254
I can send you my project and my Java client if you need them!
Thanks for your help
Gilles
Giles,
If you update to the latest version of screen-scraper, there is an updated driver you can use.
Instructions to do so:
http://community.screen-scraper.com/FAQ/NoUpdates
And then make sure you're using the drivers in this version.
Still a problem with French characters from Java
Hi,
I upgraded screen-scraper to version 4.5.14a.
But nothing changed! I still get bad characters in my Java client, like:
Le festival de cin� en plein aime les grandes plaines et le prouve avec sa th�matique
I feel desperate!
I can send you my whole project if that helps.
Thanks
Giles
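The "�" in the sample above is the Unicode replacement character (U+FFFD), which typically shows up when bytes are decoded with the wrong charset. As a hedged illustration only, using the standard library and hypothetical names (nothing here is screen-scraper's API, and it is an assumption that this is the cause in this thread): bytes written as ISO-8859-1 but read as UTF-8 produce exactly this kind of output.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how a decoding mismatch produces U+FFFD.
final class ReplacementCharDemo {

    // Encode the text as ISO-8859-1, then (wrongly) decode the bytes as UTF-8.
    // The single byte 0xE9 for "é" is not a valid UTF-8 sequence on its own,
    // so the decoder substitutes the replacement character U+FFFD.
    static String misdecode(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    // Decode the same bytes with the charset they were written in.
    static String decodeCorrectly(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String sample = "th\u00e9matique"; // "thématique"
        System.out.println(misdecode(sample));       // é is lost
        System.out.println(decodeCorrectly(sample)); // é is preserved
    }
}
```

If the received text contains U+FFFD, the original character is already gone by that point, so the fix has to happen where the bytes are first turned into a String, not afterwards.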
Okay, do you want to email it over to me?
[email protected]