sub extractor not working the same as normal pattern
I'm trying to extract the following snippet from a phpBB forum. I've seen this in a few different places on the site...
I've created a main extractor pattern:
<select name="g">~@DATARECORD@~</select>
then the following subextractor:
<option value="~@GROUP_ID@~">~@GROUP_NAME@~</option>
It only grabs the first line of each datarecord. Yet if I use the subextractor pattern above as the main pattern it finds them all (along with some other stuff I don't want).
I've tried shortening the beginning and end of the sub-extractor pattern in case linefeeds were getting in the way, but it never finds more than the first line. Help!
Actually, your observations are 100% accurate. Sub-extractor patterns only match once. The intended effect is not "only make it match once", but rather, "look until you find this data, then save it". We would like to implement something to fix this issue, but for now, the solution is something like this:
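import com.screenscraper.common.*;
// Text captured by the DATARECORD token of the main extractor pattern
text = dataRecord.get("DATARECORD");
// Apply a second extractor pattern to that text; it can match as many times as it occurs
DataSet myDataSet = scrapeableFile.extractData(text, "Give a new pattern name here");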
Now you should be grabbing that info, but you'll have to process it in a script. See the notes in the API documentation for the scrapeableFile.extractData method; they show how to loop through the resulting DataSet.
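To make the difference between "match once" and "match every occurrence" concrete, here is a minimal plain-Java sketch of the same two-stage idea. It does not use the screen-scraper API at all: it assumes java.util.regex as a stand-in for the extractor patterns, and the class name, method name, and sample HTML are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of screen-scraper.
class SubExtractorSketch {
    // Stand-in for the main pattern: group 1 plays the role of ~@DATARECORD@~.
    private static final Pattern SELECT =
        Pattern.compile("<select name=\"g\">(.*?)</select>", Pattern.DOTALL);
    // Stand-in for the former sub-extractor: groups 1 and 2 play the roles
    // of ~@GROUP_ID@~ and ~@GROUP_NAME@~.
    private static final Pattern OPTION =
        Pattern.compile("<option value=\"(.*?)\">(.*?)</option>", Pattern.DOTALL);

    // Returns one {groupId, groupName} pair per option found inside each select.
    static List<String[]> extractGroups(String html) {
        List<String[]> groups = new ArrayList<>();
        Matcher select = SELECT.matcher(html);
        while (select.find()) {
            String dataRecord = select.group(1);
            Matcher option = OPTION.matcher(dataRecord);
            // Calling find() once would mimic the sub-extractor (first match only);
            // looping until find() fails collects every option in the record.
            while (option.find()) {
                groups.add(new String[] { option.group(1), option.group(2) });
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up sample input resembling a phpBB group selector.
        String html = "<select name=\"g\">\n"
            + "<option value=\"1\">Admins</option>\n"
            + "<option value=\"2\">Moderators</option>\n"
            + "</select>";
        for (String[] group : extractGroups(html)) {
            System.out.println(group[0] + " = " + group[1]);
        }
    }
}
```

Because the inner pattern only ever sees the text captured between the select tags, option tags elsewhere on the page are ignored, which is the "other stuff I don't want" problem from the original post.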
This has become a common dilemma with a few site structures, and we'd like to implement something to make it easier.
Hope that helps.
Tim
still a problem
Thanks, Tim; that solution is a good start until a fix is implemented. There is, however, one rather large problem: if the new "main" extractor pattern (which contains the former sub-extractor pattern) is set to run manually, it won't execute scripts after each application. That means the extracted values cannot be used further and we're back to square one. (The usability flaw in http://community.screen-scraper.com/node/691 appears to be related.) Is there a solution? Perhaps a different way of running those scripts?
Thanks!
When I am faced with a similar situation, I generally use this:
http://community.screen-scraper.com/API/extractData
Brilliant!
That worked perfectly. Thanks!