an extractor token with zero or more characters?
Hi
I thought extractor tokens would be able to use a "zero or more characters" mask, but I can't get it to work. I found that most of my errors are because of this.
Here is a simple example.
I want a single pattern to match both texts:
text1
-------------
</td>
<td valign="top">
<div style="float:left;"><a class="lbb" href="
text2
-------------
</td>
<td class="odd" valign="top">
<div style="float:left;"><a class="lbb" href="
----
They are identical except for class="odd" on the second line.
So I thought I could put an extractor token in its place, like this:
-----
</td>
<td ~@junk@~valign="top">
<div style="float:left;"><a class="lbb" href="
----------
However, screen-scraper only matches the second text example. So I thought it must be because my extractor token needs a regex that matches 'zero or more' times, but I can't get it working.
Any tips would be greatly appreciated.
Cheers,
Ben
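Your extractor should work. In the junk token, I would put the RegEx "[^<>]*" so it can't run past the end of the tag, and then it should be fine. Because * allows zero characters, the token also matches when there is nothing between "<td " and "valign".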
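
If it helps to see the idea outside screen-scraper, here is a rough sketch in plain Python. It is only an analogy: the junk token is replaced by a named group using the same "[^<>]*" regex, and the line breaks are matched with \s*. The names PATTERN and junk_value are made up for this illustration, and screen-scraper's own matching may behave differently in the details.

```python
import re

# Stand-in for the extractor pattern: the ~@junk@~ token becomes a named
# group that uses the suggested regex [^<>]* (zero or more characters that
# are not angle brackets, so the match stays inside the <td ...> tag).
PATTERN = re.compile(
    r'</td>\s*<td (?P<junk>[^<>]*)valign="top">\s*'
    r'<div style="float:left;"><a class="lbb" href="'
)

# The two sample texts from the question.
text1 = '</td>\n<td valign="top">\n<div style="float:left;"><a class="lbb" href="'
text2 = ('</td>\n<td class="odd" valign="top">\n'
         '<div style="float:left;"><a class="lbb" href="')


def junk_value(text):
    """Return what the junk group captured, or None if the pattern fails."""
    match = PATTERN.search(text)
    return None if match is None else match.group("junk")


print(repr(junk_value(text1)))  # junk is empty for the first text
print(repr(junk_value(text2)))  # junk holds the class attribute for the second
```

Swapping the * for a + in this sketch makes the first text stop matching, which is the same symptom described in the question.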