a few feature requests

Hello Screen Scraper,
again thanks for a great product!

I have been using SS for a few weeks now and I'm quite impressed with it. I would, however, like to offer some ideas on where you might improve it.
I have listed them in order of importance.

To be able to store HTML in the "last response" tab manually, so we could just copy/paste the HTML source code, and start testing the extractor patterns immediately (without the proxy).

When making the extractor pattern, it would be easier to make it match the correct tags / places if we were able to place regex directly in the pattern:

from this:
a href="[email protected]@~/[email protected]@~.aspx"[email protected]@~title="[email protected]@~">[email protected]@~<

I would like it better if one could write something like this:
a href="[/.*]*?(\w+).aspx".*?title=".*?">([a-zA-Z0-9 ]*)<

and then later use \1 and \2 and store them as named variables in the dataSet, maybe with a double-click like we have now with the "store in session" options etc.

That way I wouldn't have to sort the result later on, and less testing/validation in the scripts would have to be done.
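To make the request concrete, here is a minimal sketch of the same idea in plain `java.util.regex`, outside of SS. The class name `LinkExtractor`, the group names `id` and `title`, and the sample HTML are made up for illustration; none of this is part of the SS API. Named groups stand in for the named variables I would like to see stored in the dataSet.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of screen-scraper.
class LinkExtractor {
    // (?<id>...) and (?<title>...) are named capture groups; each match
    // becomes one row, much like a data record in a data set.
    private static final Pattern LINK = Pattern.compile(
        "a href=\"[^\"]*?(?<id>\\w+)\\.aspx\"[^>]*?title=\"[^\"]*\">(?<title>[a-zA-Z0-9 ]*)<");

    static List<Map<String, String>> extract(String html) {
        List<Map<String, String>> rows = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            Map<String, String> row = new LinkedHashMap<>();
            row.put("id", m.group("id"));
            row.put("title", m.group("title"));
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample markup.
        String html = "<a href=\"/shop/cat/software.aspx\" class=\"x\" title=\"Software\">Software 2</a>";
        System.out.println(extract(html));
    }
}
```

Because the character classes already constrain what each group may contain, most of the validation I currently do in scripts would happen in the pattern itself.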

A way to further refine extractions after an initial extraction without the use of session.executeScript();

The ability to reload .jar files without a restart of SS would be really nice.

A tool in the extractor to easily convert text to doubles would also be appreciated. For example, in Sweden they separate decimals with : (on prices), in Denmark we use , and in the US they use .
When using the same scraper to target similar pages internationally, where the only difference is the number format (199,- one place, 199:- another, and 199.99 in yet another) it would make the job easier.
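As a rough sketch of the conversion I have in mind, in plain Java: the class name `PriceParser` is made up, and it only covers the three formats mentioned above (no thousands separators, no currency symbols), so treat it as an illustration rather than a finished tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of screen-scraper.
class PriceParser {
    // Whole part, then optionally a separator (',', ':' or '.') followed by
    // one or two decimal digits or a '-' meaning "no fraction" (as in 199,-).
    private static final Pattern PRICE =
        Pattern.compile("(\\d+)(?:[,:.](\\d{1,2}|-))?");

    static double parse(String text) {
        Matcher m = PRICE.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a recognised price: " + text);
        }
        String fraction = m.group(2);
        if (fraction == null || fraction.equals("-")) {
            return Double.parseDouble(m.group(1));
        }
        // Normalise whatever separator was used to the '.' Java expects.
        return Double.parseDouble(m.group(1) + "." + fraction);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"199,-", "199:-", "199.99"}) {
            System.out.println(s + " -> " + parse(s));
        }
    }
}
```

With something like this available in the extractor, the same scraper could handle the Danish, Swedish and US pages without a per-country script.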

That would probably make SS near perfect for me
(until I think of something else anyway) :-)

Best regards
Gustav Palsson

Thanks much, Gustav. That clarifies it.

feature # 3

I have this kind of HTML structure for menus:





      Apple løsdele
    • etc. etc. etc...

      I use this script to extract the menus, and I use

      DataSet scdataset = scrapeableFile.extractData( mcdata, "subcat extract" )

      to call extractors on subsets of the extractions.
      It is not possible to do this with [email protected]@~ because I need to sort the categories in the correct order, so my script looks like this:

      if( dataSet.getNumDataRecords() > 0 ){
      //looping datarecord
      for(int i = 0; i < dataSet.getNumDataRecords(); i++){
      DataRecord r = dataSet.getDataRecord(i);

      //defensive copy
      CompetitorCategoryWriter cw = session.getVariable("cw").clone();
      CompetitorCategory c = new CompetitorCategory();

      //extracting subcats
      String mcdata = r.get("DATARECORD");
      DataSet scdataset = scrapeableFile.extractData( mcdata, "subcat extract" );

      //looping subcats
      for(int l = 0; l < scdataset.getNumDataRecords(); l++){
      DataRecord re = scdataset.getDataRecord(l);

      String s = re.get("SCID");

      if (s.toLowerCase().indexOf("software") != -1)

      //another defensive copy
      CompetitorCategoryWriter cws = session.getVariable("cw").clone();
      CompetitorCategory cs = new CompetitorCategory();

      String scdata = re.get("DATARECORD")+"

DataSet sscdataset = scrapeableFile.extractData( scdata, "subcat extract" );

//looping sub sub cats
for(int k = 0; k < sscdataset.getNumDataRecords(); k++){
DataRecord rec = sscdataset.getDataRecord(k);

String s = rec.get("SSCID");

if (s.toLowerCase().indexOf("software") != -1)

CompetitorCategoryWriter cwss = session.getVariable("cw").clone();
CompetitorCategory css = new CompetitorCategory();

session.scrapeFile( "product" );

session.setVariable("cw", cwss);
session.setVariable("cw", cws);
session.setVariable("cw", cw);

I know it is a bit clumsy, but I couldn't figure out a more elegant way to do this. Luckily there was no need for recursion, or it would have been even messier :-)

What I would really like is to have the option to refine the results (for each row) of the result dataset that comes with an extraction.
Like the current "mapping" feature, a tab for conversion (with regex etc.), and one for refining the resultset would be great.
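To show what I mean by refining each row, here is a small sketch in plain Java regex rather than SS extractor patterns. The class name `NestedExtraction`, the two patterns and the HTML they expect are all made up for illustration. The first pattern finds each top-level block, and the second pattern is applied only to the text captured for that block, so the sub-results stay attached to their parent and in page order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of screen-scraper.
class NestedExtraction {
    // First pass: one match per top-level category block (made-up markup).
    private static final Pattern CATEGORY = Pattern.compile(
        "<div class=\"cat\"><h2>(.*?)</h2>(.*?)</div>", Pattern.DOTALL);
    // Second pass: run only against the body captured by the first pass.
    private static final Pattern SUBCATEGORY = Pattern.compile("<li>(.*?)</li>");

    static Map<String, List<String>> extract(String html) {
        // LinkedHashMap keeps the categories in the order they appear on the page.
        Map<String, List<String>> menu = new LinkedHashMap<>();
        Matcher cat = CATEGORY.matcher(html);
        while (cat.find()) {
            List<String> subs = new ArrayList<>();
            Matcher sub = SUBCATEGORY.matcher(cat.group(2));
            while (sub.find()) {
                subs.add(sub.group(1).trim());
            }
            menu.put(cat.group(1).trim(), subs);
        }
        return menu;
    }

    public static void main(String[] args) {
        String html = "<div class=\"cat\"><h2>Apple</h2><ul><li>Software</li><li>Parts</li></ul></div>";
        System.out.println(extract(html));
    }
}
```

A "refine" tab on the extractor pattern could do this second pass per row, which would replace the nested loops in my script.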

I will email you a picture of how it might look (I can't upload anything but .txt and .csv files here).

Best regards
Gustav Palsson

suggestion # 4

Hey Todd,
I'm talking about the .jar files in the ext directory in suggestion # 4.
Often when developing (in the beginning anyway, before I get a reasonable library) I need to change the .jar files that the screenscraper scripts import during scrapes.
At the moment that means that I need to close SS, export the package from Eclipse to the ext directory, and then reopen SS to continue working.

That being said, it is only a minor annoyance.

It would greatly shorten the turnaround time if there was a "reload external files" option in the menu somewhere, but seeing as SS is probably written in Java, I don't know how that would interact with SS.
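For what it's worth, in plain Java this kind of reload is usually done by throwing away the old class loader and building a new one over the same directory. The sketch below is only my guess at the general technique: the class name `ExtReloader` is made up, and I have no idea how SS actually loads the jars in ext.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of screen-scraper.
class ExtReloader {
    private final File extDir;
    private URLClassLoader current;

    ExtReloader(File extDir) {
        this.extDir = extDir;
    }

    // Builds a fresh class loader over every .jar currently in the directory
    // and closes the previous one. Classes must then be looked up through the
    // returned loader; objects created from the old loader keep their old code.
    synchronized ClassLoader reload() {
        try {
            List<URL> urls = new ArrayList<>();
            File[] jars = extDir.listFiles((dir, name) -> name.endsWith(".jar"));
            if (jars != null) {
                for (File jar : jars) {
                    urls.add(jar.toURI().toURL());
                }
            }
            URLClassLoader fresh = new URLClassLoader(
                urls.toArray(new URL[0]), ExtReloader.class.getClassLoader());
            if (current != null) {
                current.close(); // release file handles on the old jars
            }
            current = fresh;
            return fresh;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ExtReloader reloader = new ExtReloader(new File(args.length > 0 ? args[0] : "ext"));
        System.out.println(reloader.reload());
    }
}
```

The catch is that anything already instantiated from the old jars would still run the old code, so a menu option would probably also have to reset the running session.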

Thanks, that helps. Also, could you clarify a bit on #3? Again, an example might be helpful.


Hi Gustav,

These are excellent suggestions. I'll add them to our to-do list so that we don't lose track of them. Please also feel free to send along any other thoughts you might have.

Also, could you clarify just a bit what you mean on #4? Perhaps an example would help.

Kind regards,

Todd Wilson