Regex to get an extractor pattern to stop at a </td>

Hi,

I was wondering if you know the regex to stop an extractor pattern at a </td> rather than at a > or a ".
I don't know how many HTML elements are inside the td, but I know it ends with a </td>.
Any help would be really appreciated.

Regards,
Seamus McMahon

seamus1982 on 03/28/2013 at 2:14 pm


Would your HTML look

Would your HTML look something like this:

<td>
Content<br />
More
</td>

And you want to get all of the content of the TD, but can't account for other HTML therein?

You would use an extractor pattern like this:

<td>
~@TOKEN@~
</td>

The token's RegEx can be left blank.
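
For anyone who wants to test the same idea outside screen-scraper, here is a rough sketch in plain Python. It is not screen-scraper's API, only the equivalent regular expression: a non-greedy capture that stops at the first </td>, however many tags come before it. The function name extract_td_contents is made up for illustration, and the sample HTML is the snippet from above.

```python
import re

# Sample HTML taken from the example earlier in this thread.
html = """<td>
Content<br />
More
</td>"""

# (.*?) is non-greedy, so the match ends at the first </td> it reaches.
# re.DOTALL lets "." cross line breaks; re.IGNORECASE also accepts <TD>.
TD_PATTERN = re.compile(r"<td\b[^>]*>(.*?)</td>", re.DOTALL | re.IGNORECASE)


def extract_td_contents(page):
    """Return the inner content of each td, with outer whitespace trimmed."""
    return [cell.strip() for cell in TD_PATTERN.findall(page)]


print(extract_td_contents(html))
```

Any markup inside the cell, such as the <br />, comes back as part of the captured text. A td nested inside another td would cut the match short, so this only suits simple, non-nested cells.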

jason on 03/29/2013 at 9:03 am

I'm trying to understand what you're doing

Hi Seamus,

I read your post but I'm not sure I follow what the RegEx would do. What I think is the case is that the HTML you're scraping looks like this...

...and you just want to extract the value of rows in the table, right? If this is the case, you might be better off using a sub-extractor pattern to get at these values. Otherwise, let us know and we'll try to help.

Take care,
Justin

Justin_S on 03/29/2013 at 7:36 am
