Extractor Pattern with Vars
I am trying to scrape a site whose pages are very long. I already know the small area I want to scrape, so I am trying something like this in my extractor pattern:
<tr id="Row~#ACCOUNT#~-~#ROWID#~-L"~@DATARECORD@~
</tr>
</tr>
I wrote it this way because I already know the account number and the row ID.
I then have a sub-extractor pattern which gets me exactly what I need from the pattern. If I hard-code the vars, it works.
I read this node, which says that this syntax is invalid:
http://community.screen-scraper.com/node/837
Is there another way I can achieve this?
An alternative...
Yes, you are right that this variable-referencing syntax is no longer supported, even though it would certainly prove useful to you at a time like this!
An alternative would be to attempt something like the following:
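// Code is in interpreted Java.
// Change "DATARECORD" to the name of the variable that you used in the extractor pattern.
String tableBody = dataRecord.get("DATARECORD");
// Opening tag of the row to find. The value originally posted here was lost,
// so this is an assumed reconstruction built from the ACCOUNT and ROWID session variables.
String lookFor = "<tr id=\"Row" + session.getVariable("ACCOUNT") + "-" + session.getVariable("ROWID") + "-L\"";
// Assumed: the original definition of endMark is missing from the post.
String endMark = "</tr>";
String tableRow = "";
if (tableBody.contains(lookFor)) {
    tableRow = tableBody.substring(tableBody.indexOf(lookFor), tableBody.indexOf(endMark, tableBody.indexOf(lookFor)));
}
session.setVariable("ROW_TO_EXAMINE", tableRow);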
Make sense? Sorry for my everything-in-one-line approach. That assignment to tableRow is "get the substring", with two arguments: "begin where the found TR starts" and "end at the index of the end mark, searching from where the lookFor string was found".
Now you've got a session variable called "ROW_TO_EXAMINE" (or whatever you choose to call it) that contains the row that you were after.
If you're not concerned about the row having a possibility of not existing in the table, you could remove the defensive IF statement, and just make the assignment to tableRow.
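For anyone who finds the one-liner hard to read, here is the same idea broken into steps as plain Java. This is only a sketch: the class name RowExtractor, the method name extractRow, and the sample HTML are made up for illustration, and it runs outside screen-scraper, so there is no dataRecord or session here. It also returns an empty string when the end mark is missing, a case where the one-line version would throw an exception from substring.

```java
public class RowExtractor {
    // Returns the text from the first occurrence of lookFor up to (but not
    // including) the next endMark, or "" if either marker cannot be found.
    static String extractRow(String tableBody, String lookFor, String endMark) {
        int start = tableBody.indexOf(lookFor);
        if (start < 0) {
            return ""; // the row is not in the table
        }
        // Search for the end mark starting from where the row begins.
        int end = tableBody.indexOf(endMark, start);
        if (end < 0) {
            return ""; // the row was never closed
        }
        return tableBody.substring(start, end);
    }

    public static void main(String[] args) {
        // Hypothetical table body with two rows.
        String tableBody = "<tr id=\"Row1-1-L\"><td>a</td></tr>"
                         + "<tr id=\"Row1-2-L\"><td>b</td></tr>";
        // Prints: <tr id="Row1-2-L"><td>b</td>
        System.out.println(extractRow(tableBody, "<tr id=\"Row1-2-L\"", "</tr>"));
    }
}
```

Inside a screen-scraper script you would pass in the value from dataRecord.get(...) and store the result with session.setVariable(...), as in the snippet above.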
Does that help to answer your question?
Thanks.
That should do it.
error
I found the error. There needs to be one more )...
oops!
Sorry for the typo. I corrected it in my code example above.
Hope that solves your problem!
Tim
I understand what you are trying to do, but the snippet you gave me throws an error.
The error message was: Encountered "( tableBody . indexOf ( lookFor ) , tableBody . indexOf ( endMark , tableBody . indexOf ( lookFor ) ) ;" at line 7, column 31.
Here is the trimmed-down version of what I have:
String tableBody = dataRecord.get("ACCOUNTROW");
String lookFor = "
";
String tableRow = "";
tableRow = tableBody.substring(tableBody.indexOf(lookFor), tableBody.indexOf(endMark, tableBody.indexOf(lookFor));
session.setVariable("ROW_TO_EXAMINE", tableRow);
Can you help me through the syntax?