New NekoHTML Tidy Engine

I just downloaded the new alpha version, and I wanted to mention that the new default Tidy engine, NekoHTML, is kind of annoying. It doesn't clean up the HTML very well (it adds far too much whitespace), it previews poorly in the browser, or sometimes not at all, and extractor patterns copied from my old scrapes don't work. I don't mind having it as an option, but it would be nice if we could change the default so I don't have to switch it every time I create a new scrapeableFile.

Hi Chris,

This is a good suggestion. NekoHTML is new to us as well, so we're still working out some of the kinks. It's more robust than JTidy (i.e., it's more likely to be able to tidy a very malformed HTML document), but you're right that it doesn't do the greatest job of tidying things up. I've added a task to our to-do list to let the user choose which tidier is used by default. Watch for that in an upcoming alpha version.

Todd