Repeat sub-extractor pattern?

I would like to scrape a page where "header information" precedes a variable-length table, and I want to scrape each row of that table. The output should contain one set of data per table row, with the "header information" repeated in each set. For example:

Header1
A B C
D E F
Header2
G H I

- should give

Header1 A B C
Header1 D E F
Header2 G H I
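
To make the desired result concrete, here is a minimal sketch in plain Python (not screen-scraper code) of the "carry the last header forward" behaviour described above. The function name `repeat_headers` and the `is_header` test are hypothetical, chosen only for illustration.

```python
def repeat_headers(lines, is_header):
    """Pair every data row with the most recent header line seen above it."""
    current_header = None
    records = []
    for line in lines:
        if is_header(line):
            # Remember the header so it can be repeated on the rows that follow.
            current_header = line
        elif line.strip() and current_header is not None:
            # Rows that appear before any header are ignored.
            records.append((current_header, *line.split()))
    return records


page = ["Header1", "A B C", "D E F", "Header2", "G H I"]
records = repeat_headers(page, lambda s: s.startswith("Header"))
for rec in records:
    print(" ".join(rec))
```

Each printed line carries its header followed by the row's columns, which is the shape of output being asked for.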

It's no problem to tell the header rows from the data rows. I have tried with sub-extractor patterns, making the data rows a DATARECORD, but the problem is that the sub-extractor pattern is only applied once, giving

Header1 A B C
Header2 G H I

Is there any way to apply the sub-extractor pattern to all matches in the DATARECORD, like extractor patterns do on the scrapeable file?

I was thinking I might make a script that calls a separate extractor pattern many times to get each row, but I don't know whether I could pass the DATARECORD to become a scrapeable file for the extractor pattern.
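
The idea of re-applying a row pattern to the text captured as the data record can be sketched outside screen-scraper with ordinary regular expressions. This is only an illustration in plain Python: the sample `PAGE`, the `OUTER` and `INNER` patterns, and `extract_rows` are all assumptions, and a real page would be HTML rather than bare lines.

```python
import re

# Hypothetical page text standing in for the real HTML.
PAGE = """Header1
A B C
D E F
Header2
G H I
"""

# Outer pattern: one header plus everything up to the next header
# (the captured block plays the role of the DATARECORD).
OUTER = re.compile(r"^(Header\d+)\n(.*?)(?=^Header\d+$|\Z)", re.M | re.S)
# Inner pattern: a single three-column row, applied repeatedly to the block.
INNER = re.compile(r"^(\S+) (\S+) (\S+)$", re.M)


def extract_rows(text):
    rows = []
    for outer in OUTER.finditer(text):
        header, block = outer.group(1), outer.group(2)
        # Apply the row pattern to every match inside the captured block.
        for inner in INNER.finditer(block):
            rows.append((header, *inner.groups()))
    return rows


rows = extract_rows(PAGE)
```

The outer loop finds each header with its block once; the inner loop is what a single sub-extractor match does not give you, namely one result per row.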

Any ideas?

A solution

  1. For the extractor pattern, start with a pattern that is common to the start of the header and the start of each row.
  2. Then put ~@DATARECORD@~.
  3. For the end of your extractor pattern, use the end of a row.
  4. Make a sub-extractor pattern that grabs the header.
  5. Make a sub-extractor pattern that grabs one row.
  6. When a sub-extractor pattern doesn't match, it is simply skipped, so the value in its token doesn't change. And you are done! You will get:
    1. Header1: row1
    2. The same as before: row2
    3. The same as before: row3
    4. Header2: row1
    5. ...
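
The steps above can be emulated in plain Python to see why they produce a repeated header. This is not screen-scraper code; it only assumes, as the post says, that a token keeps its previous value when its sub-pattern does not match. The names `RECORD`, `HEADER_SUB`, `ROW_SUB`, `scrape`, and the token names are hypothetical.

```python
import re

# One pattern that matches header lines and row lines alike.
RECORD = re.compile(r"^(?P<DATARECORD>.+)$", re.M)
# Two sub-patterns applied to each record; at most one of them matches.
HEADER_SUB = re.compile(r"^(?P<HEADER>Header\d+)$")
ROW_SUB = re.compile(r"^(?P<COL1>\S+) (?P<COL2>\S+) (?P<COL3>\S+)$")


def scrape(text):
    tokens = {}    # stands in for token values that persist between records
    results = []
    for record in RECORD.finditer(text):
        data = record.group("DATARECORD")
        header = HEADER_SUB.search(data)
        if header:
            # Header matched: overwrite the HEADER token.
            tokens.update(header.groupdict())
        row = ROW_SUB.search(data)
        if row and "HEADER" in tokens:
            # Row matched: HEADER still holds the last header seen.
            tokens.update(row.groupdict())
            results.append(dict(tokens))
    return results


results = scrape("Header1\nA B C\nD E F\nHeader2\nG H I\n")
```

Because the HEADER token is only overwritten when a header line is matched, every row record is saved together with the header that came before it.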

Sorry for my awful English.
Does it help you?

Same issue

I have the same issue, but the link is broken. Does anybody know what he was trying to link to?

Kevin

Repeat sub-extractor pattern?

There is only one. 8)

Repeat sub-extractor pattern?

Well, I'm no MacGyver :D

Thanks anyway.

Repeat sub-extractor pattern?

manscher,

Boy, I'm not sure what a viable workaround would be. That would be a challenge for a programming MacGyver trapped in a compound surrounded by heavily-armed thugs when all he has is a stolen tape recorder, access to a PA system, a ceiling fan, a spatula, and the basic edition of screen-scraper.

Air Raid!!

I wish you luck,

Scott

Repeat sub-extractor pattern?

Yes, this looks like it would work, but scrapeableFile.extractData( String text, String name ) is marked "professional and enterprise editions only". I'm not ready to buy the professional edition yet - do you have other suggestions/hints?

Repeat sub-extractor pattern?

Martin,

This is a situation where what seems like a simple problem actually requires a more complicated solution. Some day we'll have a more user-friendly solution. In order to avoid reinventing the wheel, please have a look at this forum posting...

http://community.screen-scraper.com/node/456

Let us know if it makes sense for your situation.

Thanks,
Scott

Broken Link

Does anyone know which thread this is supposed to point to? The link is broken.

http://community.screen-scraper.com/node/456

I am also trying to resolve the same issue.

I've updated the link in both

I've updated the link in both your post and Scott's post, so it should be pointing to the right thread now.

Ooo.. not sure. that was an

Ooo... not sure. That was an old link from an older version of our website. I'll try to dig it up, though.