FileWriter Output Encoding problems

Hi,

Before I get into my problems I've got to say this is a really impressive program. Really appreciate the flexibility in the basic edition.

I'm scraping a Chinese web site and can't get the final output to contain the non-ASCII characters; I get question marks instead. I've done the following:

- Checked the FAQ, set UTF-8 as my default, and I'm using Arial Unicode for the font
- Checked the forums, learned that HTML Tidy can't be disabled in basic (wish that was in the FAQ!)
- Upgraded to the professional Trial, disabled HTML tidy
- I've tried forcing UTF-8 in my text editor in case there wasn't a BOM

When I apply the extractor patterns to the last scraped data I see the proper non-ASCII data within the screen-scraper UI. So everything up to that point is fine, and I think that covers all the different points in the forums/FAQs. However, when I actually write this out using FileWriter and out.write, I lose the non-ASCII characters.

Note I'm using OS X. I'm wondering if my problem is that OS X actually defaults to using MacRoman with Java output:
http://developer.apple.com/DOCUMENTATION/Java/Conceptual/Java14Developme...

If that's the case, what's the proper way to get my script to force the output to UTF-8? If that's not the problem here what else could be the cause?

Here's the basic script I'm using, copied from the tutorials:

FileWriter out = null;

session.log( "Writing data to a file." );

// Open up the file to be appended to.
out = new FileWriter( "Cpod_Tome.txt", true );

// Write out the data to the file.
out.write( " " + session.getVariable( "TITLE" ) + "\n" );

// Close up the file.
out.close();

EDIT:
I found a partial solution. If I replace this line:
out = new FileWriter( "Cpod_Tome.txt", true );

With either this:
OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream("cpodtome.txt"),"UTF-8");
OR:
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("cpodtome.txt"),"UTF8"));

Then I can get it to output UTF-8 and the output looks good.

However, OutputStreamWriter works a bit differently from FileWriter. Instead of appending to the file like FileWriter does, it just replaces the existing file. I haven't been able to find any code examples that show how to get OutputStreamWriter to append. Would appreciate any help, thanks!

Found the cause, which is more or less what I expected, but it looks like it affects all platforms: FileWriter always uses the default system encoding, and apparently that can't be changed on the FileWriter itself.
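For anyone who wants to see the effect in isolation, here is a minimal standalone sketch (plain Java 7+, not a screen-scraper script). The class name EncodingCheck and the helper roundTrip are made up for illustration. It encodes a short Chinese string with a given charset and decodes it again; a charset that can't represent the characters replaces them with question marks, which matches the symptom described above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class EncodingCheck {
    // Encode the text with the given charset, then decode it with the same one.
    // Characters the charset cannot represent come back as replacement characters.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file's
        // own encoding does not matter.
        String title = "\u4e2d\u6587";

        // This is the charset FileWriter falls back to.
        Charset platform = Charset.defaultCharset();
        System.out.println("Default charset: " + platform);
        System.out.println("Survives default charset: "
                + roundTrip(title, platform).equals(title));

        // UTF-8 can represent the text; US-ASCII cannot.
        System.out.println("Survives UTF-8: "
                + roundTrip(title, StandardCharsets.UTF_8).equals(title));
        System.out.println("Through US-ASCII: "
                + roundTrip(title, StandardCharsets.US_ASCII));
    }
}
```

If "Survives default charset" prints false on your machine, FileWriter will mangle the same text when it writes it out.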

I just followed the other recent thread on encoding problems and changed this line:
out = new FileWriter( "Cpod_Tome.txt", true );

To this:
OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream("cpodtome.txt"),"UTF-8");

Works like a charm, but perhaps the FAQ should be updated?

Yes, that would be a good idea. Thanks for doing that research :)

The first fix I posted wasn't quite right, as it doesn't append to the output file like the tutorials do. Here is the proper code:

Replace (from the tutorial):
out = new FileWriter( "filename.txt", true );

With:
OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream("filename.txt", true),"UTF-8");
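As a fuller sketch of the same idea outside screen-scraper, the class below wraps the append-in-UTF-8 writer in a small method and closes it automatically. It assumes a standalone Java 7+ program; the class name Utf8Append and the method appendLine are invented for the example, and the try-with-resources syntax may not be accepted by the script interpreter inside screen-scraper, where the one-line replacement above is probably the safer choice.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class Utf8Append {
    // Append one line to the file, always encoding it as UTF-8.
    static void appendLine(String fileName, String line) throws IOException {
        // The second FileOutputStream argument (true) means "append" rather
        // than "overwrite"; the charset is set on the OutputStreamWriter.
        try (Writer out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(fileName, true), StandardCharsets.UTF_8))) {
            out.write(line + "\n");
        } // the writer is flushed and closed here, even if write() throws
    }

    public static void main(String[] args) throws IOException {
        // Running this twice leaves four lines in the file, not two.
        appendLine("filename.txt", "\u4e2d\u6587");
        appendLine("filename.txt", "second line");
    }
}
```

The append flag belongs to FileOutputStream and the encoding belongs to OutputStreamWriter, which is why the two settings end up in different constructors.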