Any recommendations on how to handle projects that involve large numbers of scraping sessions?
When you're dealing with a large number of scraping sessions, it becomes too cumbersome to retain them all in the workbench. Even if you organize them neatly into folders, there will likely still be too many to work with viably. Rather than keeping every scraping session in the workbench at once, we generally find it useful to export them all to a central directory, ideally one kept under version control using something like Subversion or CVS. When you need to work with a particular scraping session, you simply import it from the repository, and every once in a while you export it back to the central directory. Ideally that directory also gets backed up periodically so that you don't lose any work.

On a project with a large number of scraping sessions, you'll also often have a series of "general" scripts that get used by most, if not all, of your scraping sessions. For example, you might have one script, invoked by every scraping session, that is in charge of opening a database connection or initializing a file to which extracted data will be written (a sketch of such a script appears below). We typically handle these "general" scripts by storing them in a separate folder alongside the scraping sessions; that directory should be versioned and backed up as well. The difference with the "general" scripts is that it's usually a good idea to keep them all in the workbench in a folder of their own. There usually aren't very many of them, and they get used often enough that you'll generally want to keep them loaded in the screen-scraper workbench.
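For illustration, here is a minimal sketch of what such a "general" initialization script might look like as an Interpreted Java script in screen-scraper, set to run at the beginning of each scraping session. It assumes a session object is in scope (as in screen-scraper scripts) and that session.setVariable and session.log are available; the JDBC driver, URL, credentials, and the DB_CONNECTION variable name are placeholders you'd adjust for your own project.

    // "Open database connection" -- a general script invoked at the
    // beginning of every scraping session (sketch only).
    import java.sql.Connection;
    import java.sql.DriverManager;

    // Load the JDBC driver and open the connection (placeholder values).
    Class.forName("com.mysql.jdbc.Driver");
    Connection conn = DriverManager.getConnection(
        "jdbc:mysql://localhost/scraped_data", "db_user", "db_password");

    // Stash the connection in a session variable so extractor patterns
    // and other scripts in this scraping session can reuse it.
    session.setVariable("DB_CONNECTION", conn);
    session.log("Database connection opened and stored in DB_CONNECTION.");

A matching "close database connection" script run at the end of the scraping session could then retrieve the connection with session.getVariable("DB_CONNECTION") and close it.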