Capturing subsection data
I'm trying to capture physician data from the following URL: http://www.scanhealthplan.com/article/discoverscan/findascandoctor/provi...
Any given physician can have multiple offices, and there may be multiple addresses listed for a given office (see example below). Is there a way to get a single record for each name/office/address combination?
Bugliosi Jr, Vincent J MD #8551
INTERNAL MEDICINE
Gender: Male
(909) 793-3311
Beaver Medical Group - Highland #852 - Open to Existing Patients [Office #1]
33758 Yucaipa BlvdGet Map
Yucaipa, CA, 92399
(909) 795-9747
6109 W Ramsey StGet Map
Banning, CA, 92220
(951) 845-0313
7000 Boulder AveGet Map
Highland, CA, 92346
(909) 862-1191
- Redlands Community Hospital
Beaver Medical Group - Redlands #853 - Open [Office #2]
2 W Fern AveGet Map
Redlands, CA, 92373
(909) 793-3311
- Redlands Community Hospital
That can be done, but it would be a little tricky.
You would need to use the scrapeableFile.extractData() method, which is available only in the professional and enterprise editions.
This thread should talk you through the same process:
http://community.screen-scraper.com/node/1341
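To make the idea concrete, here is a rough sketch of the two-level pass in plain Java, outside screen-scraper. It does not use extractData() at all; ordinary regular expressions stand in for the extractor patterns. The class name OfficeFlattener, both patterns, and the assumption that every address is exactly three lines (a street line ending in "Get Map", a city line, a phone line) are mine, based only on the sample in the question, so treat it as an illustration rather than a tested scrape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns one physician's listing into one row
// per physician/office/address combination.
class OfficeFlattener {
    // Assumed from the sample: an office header ends in "[Office #n]".
    private static final Pattern OFFICE =
        Pattern.compile("^(.*\\S)\\s*\\[Office #\\d+\\]$");
    // Assumed from the sample: a street line ends with the "Get Map" link text.
    private static final Pattern STREET =
        Pattern.compile("^(.*?)\\s*Get Map$");

    // Each row is {physician, office, street, city/state/zip, phone}.
    static List<String[]> flatten(String physician, List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        String office = null; // most recent office header seen
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            Matcher officeMatch = OFFICE.matcher(line);
            if (officeMatch.matches()) {
                office = officeMatch.group(1);
                continue;
            }
            Matcher streetMatch = STREET.matcher(line);
            // An address needs a current office and two following lines.
            if (office != null && streetMatch.matches() && i + 2 < lines.size()) {
                rows.add(new String[] {
                    physician,
                    office,
                    streetMatch.group(1),
                    lines.get(i + 1).trim(),
                    lines.get(i + 2).trim()
                });
                i += 2; // skip the city and phone lines just consumed
            }
        }
        return rows;
    }
}
```

Inside screen-scraper the outer pattern would capture each office block and the inner pattern would be applied to that block's text, but the bookkeeping is the same: remember the current office, and emit a row every time an address is found under it.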
How to export the data
OK, I see how that method could be used. Usually, though, I write the data to a file using a script which runs after each pattern application, but when the extractor pattern is applied via the method, it doesn't seem to register as an event that would trigger the execution of the script. Is there an alternative method I should use?
Thanks.
I do something like this in the script that extracts the data:
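import com.screenscraper.common.*;

// productDescriptionText holds the text to search; "PRODUCT" is the name of the extractor pattern to apply.
myDataSet = scrapeableFile.extractData( productDescriptionText, "PRODUCT" );
for (i = 0; i < myDataSet.getNumDataRecords(); i++)
{
    myDataRecord = myDataSet.getDataRecord(i);
    // Copy each field into a session variable so the called script can read it.
    session.setVariable("PRICE", myDataRecord.get("PRICE"));
    session.setVariable("PRODUCT_ID", myDataRecord.get("PRODUCT_ID"));
    // Run the output script once per extracted record.
    session.executeScript("Write output");
}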
Note that when you call a script from within a script like this, the dataRecord isn't in scope in the called script. Either copy the values you need into session variables, or save the whole dataRecord as a session variable and read it back from there.
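If it helps, the second option (passing the whole record through the session) looks roughly like this. This is plain Java with a simple map standing in for screen-scraper's session variables, so the class SessionScopeSketch, the sessionVariables map, and the CURRENT_RECORD key are all made up for illustration. In a real script the put and get would presumably be session.setVariable and session.getVariable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in showing how a called script can reach a record
// that was only a local variable in the calling script.
class SessionScopeSketch {
    // Plays the role of the session's variable store (assumption, not the real API).
    static final Map<String, Object> sessionVariables = new HashMap<>();

    // Plays the role of the "Write output" script: it cannot see the caller's
    // local record, so it reads the record back from the shared store.
    @SuppressWarnings("unchecked")
    static String writeOutput() {
        Map<String, String> record =
            (Map<String, String>) sessionVariables.get("CURRENT_RECORD");
        return record.get("PRODUCT_ID") + "\t" + record.get("PRICE");
    }

    // Plays the role of the extracting script: store the whole record once,
    // then hand control to the output step.
    static String processRecord(Map<String, String> record) {
        sessionVariables.put("CURRENT_RECORD", record);
        return writeOutput();
    }
}
```

The advantage over setting one session variable per field is that the output script keeps working when you add a field to the extractor pattern; the trade-off is that the output script now depends on the record's key names.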