General Observations/Some Specific Questions

First off, let me say... this is a great product, and it filled a need I'd been desperate to fill!

Now then... some of the problems I have been experiencing, some work-arounds I have found, and other suggestions/questions. Note: I am using the .NET interface, as I could not get COM working properly with VB.NET 2005 Express:

The main one, we'll call it #1 :-) :

When using the server, if the ss.data file gets to be about 240+ megs or so (the ss.script file seems to 'refresh' itself; I work the scraper pretty hard), the server will choke and shut itself down, leaving it in an unstopped/unstartable state with a lock file still in place. It takes about 2.5 hours to get to this point. My eventual workaround: every 2 hours I have the system stop the server, copy in a pristine /db file/folders, and then start the server back up. This has stopped the crashes, but it is slowing me down :-)
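In case it helps anyone, here's roughly what my 2-hourly reset job does (just a sketch in Java; the service name and the paths are placeholders for whatever your install actually uses):

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Sketch of my two-hourly reset. "ScreenScraperServer" and the paths
// below are placeholders -- substitute whatever stops and starts the
// server on your machine.
public class ResetServer {
    public static void main(String[] args) throws Exception {
        run("net", "stop", "ScreenScraperServer");   // hypothetical service name

        File db = new File("C:/screen-scraper/db");
        deleteTree(db);                                   // toss the bloated db folder
        copyTree(new File("C:/backups/db-pristine"), db); // restore a clean copy

        run("net", "start", "ScreenScraperServer");  // hypothetical service name
    }

    static void run(String... cmd) throws Exception {
        new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }

    static void deleteTree(File f) {
        File[] kids = f.listFiles();
        if (kids != null) for (File k : kids) deleteTree(k);
        f.delete();
    }

    static void copyTree(File src, File dst) throws IOException {
        if (src.isDirectory()) {
            dst.mkdirs();
            String[] names = src.list();
            if (names != null)
                for (String n : names) copyTree(new File(src, n), new File(dst, n));
        } else {
            Files.copy(src.toPath(), dst.toPath(), StandardCopyOption.REPLACE_EXISTING);
        }
    }
}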
Question: Is there something I am neglecting to do (or dispose of?) to keep the ss.data file from getting so large? The longest I need to keep ANY scraped info is a couple of seconds... anything I need gets put into a DB. Is the memory option under 'Options' tied to this?

#2 The GUI will let me exit without asking if I want to save my work. It is also picky about focus: you have to click somewhere else on the same screen before switching screens, otherwise the changes you just made are not recorded.

#3 I tried upgrading to the latest version (2 days ago) to see if my issue in #1 had been addressed, and I noticed a slowdown in the scraping, AND the server would call it quits after only 30 minutes or so. I didn't spend any more time with it to gather more info, as I quickly reinstalled the version I downloaded for my trial 1.5 months ago (I can't tell you version numbers, as I am writing this away from the 'work' computer).

#4 I think I am not utilizing the scripting capabilities as well as I could/should when extracting data... more examples would help :-)

Thank you! Any suggestions re: issue #1 would be wonderful!

-Les

General Observations/Some Specific Questions

spillage,

These are very good questions.

#1 When doing large scrapes, memory is a big deal; we are running into that issue ourselves in some of the work we are doing these days. Here are some recommendations.

Save data to an external source, such as a database or file, often. When you do, set the session dataSet (or whatever session variable you are storing the data in) to null. Once no references to the data remain, the Java garbage collector will reclaim the memory for you.

If you do not need something, don't save it in the session. For example, in most cases variables will not need to be saved both to the session and to a data set (the "Automatically save the data set..." and "Save in session variable?" options). A sketch of both ideas follows below.
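As a rough illustration, a script run after each pattern match might look something like this (a sketch in interpreted Java; the JDBC URL, table, and "TITLE"/"PRICE" token names are made up, and I'm assuming the usual session and dataRecord script variables):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Write each extracted record straight to the database instead of
// letting it pile up in the session. The connection details and the
// "TITLE"/"PRICE" tokens are invented for the example.
Connection conn = DriverManager.getConnection(
    "jdbc:mysql://localhost/scrapes", "user", "password");
PreparedStatement ps = conn.prepareStatement(
    "INSERT INTO results (title, price) VALUES (?, ?)");
ps.setString(1, (String) dataRecord.get("TITLE"));
ps.setString(2, (String) dataRecord.get("PRICE"));
ps.executeUpdate();
ps.close();
conn.close();

// Nothing else needs the data now, so drop the references and let
// the garbage collector reclaim the memory.
session.setVariable("TITLE", null);
session.setVariable("PRICE", null);

The key point is that once the insert succeeds, nothing in the session holds on to the record any more.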

#2 We are aware of this. Currently it is low on our priority list, since the workaround is simple. It is on our list, but there are features that just cannot wait if screen-scraper is to become even greater.

#3 This is worth looking into. We will add it to the list of bugs.

#4 We have considered putting more examples on the website. Given this feedback, we will make it a higher priority. I can give you no guarantees as to when it will happen, but I know how much a good example or two can help a programmer in their work.

Thanks for the feedback. Also, please continue to ask questions.

Brent
[email protected]

General Observations/Some Specific Questions

i'm with spillage, this is a great product (only 3rd day of trial, but i'm convinced), but better documentation and examples would go a long way toward making it even better.

specifically:
- more examples of how/where to use scripts
- more examples of complex extractor patterns and sub-patterns
- more examples of access via asp, php, etc.
- an easy-to-find list and reference of all commands and syntax
- gui/workbench interface when running as a service

good stuff

e-kiwi