csv problem : extract > loop > save

Hi all,
this program is really great. I was searching for something like it for a long, long time...

I have to extract data from a local directory of files whose URL contains a variable, and save the results to a CSV.

Everything works almost perfectly, except for a "little" problem in the final CSV: it writes all the ID values (with the other columns empty), then all the NAME values, then all the ADDRESS values, and so on. The CSV looks like:

"ID","NAME","ADDRESS"
1,,,,,
2,,,,,
3,,,,,
4,,,,,
5,,,,,
,"NAME01",,,,
,"NAME02",,,,
,"NAME03",,,,
,"NAME04",,,,
,"NAME05",,,,
,,"ADDRESS01",,,
,,"ADDRESS02",,,
,,"ADDRESS03",,,
,,"ADDRESS04",,,
,,"ADDRESS05",,,

The looping code below is placed on the root session, set to run "Before scraping session begins":

// LOOP
for( int i = 1; i < 20; i++ )
{
   // Add the next value as a session variable.
   session.setVariable( "PAG", String.valueOf(i) );

   // Run the scrapeable file.
   session.scrapeFile( "pa" );
}

The saving code is placed inside every single extractor pattern, set to run "After each pattern application". The code is:

// Fix format issues.
String fixString(String value)
{
   if (value != null)
   {
      value = value.replaceAll("\"", "\'");
      value = value.replaceAll("&amp;", "&");
      value = value.replaceAll("

csv problem : extract > loop > save

ok, now I understand...
Only one extractor pattern... and the loop through the 10 matches per page comes automatically from applying the script "After each pattern application"... and the loop through the pages I make with a script. Good, good! Yes!

I will try it now...
thanks Scott ;-)

csv problem : extract > loop > save

webmark,

Typically, you would use sub-extractor patterns, but you may not need to. If you can make one extractor pattern that includes the three items you need, then you only need the loop to get through the pages. When you apply your pattern that matches all 10 listings and call a script "After each pattern application", that effectively becomes another loop that writes out each listing one-by-one.
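As a rough illustration of that inner loop, here is a plain-Java sketch (not screen-scraper API): each map stands in for one dataRecord that the "After each pattern application" script would receive, and each iteration writes one CSV line. All names here are illustrative.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ListingLoop {
    // Write out each matched listing one-by-one, as the
    // "After each pattern application" script effectively does.
    static String writeListings(List<Map<String, String>> listings) {
        StringBuilder csv = new StringBuilder();
        for (Map<String, String> record : listings) {
            csv.append("\"").append(record.get("ID")).append("\",")
               .append("\"").append(record.get("NAME")).append("\",")
               .append("\"").append(record.get("ADDRESS")).append("\"\n");
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        // Simulate the 10 listings matched on one page (3 shown here).
        List<Map<String, String>> listings = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            Map<String, String> record = new LinkedHashMap<>();
            record.put("ID", String.valueOf(i));
            record.put("NAME", "NAME0" + i);
            record.put("ADDRESS", "ADDRESS0" + i);
            listings.add(record);
        }
        System.out.print(writeListings(listings));
    }
}
```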

If you're not able to fit all three variables into one extractor pattern, then your only other alternative is a manual extraction approach, making use of the extractData() method (Professional and Enterprise editions only).

http://community.screen-scraper.com/API/extractData
http://community.screen-scraper.com/FAQ/SimilarTables
http://community.screen-scraper.com/script_repository/manual-extraction-example
http://community.screen-scraper.com/website_varies

-Scott

csv problem : extract > loop > save

Thanks Scott.

Yes, I have the other 2 extractor patterns...
With your help, and by working with the program, I think I found where the problem is.

With a structure and script similar to the one posted before (just changing where testsave is called, like you mentioned), I manage to save my data correctly from a "detail page" that contains only one block of data...

So I think I just need [b]two different loop scripts[/b]: one that loops through every block of data (ID-NAME-ADDRESS, which repeats 10 times on the page) and one that loops through the pages...
Now I have to find out how to construct the 1st loop...

I know that maybe I could do it with a sub-extractor pattern, but I'm a little reluctant to fill in these sub-extractors...

Is that correct?
Thanks again Scott.
I will post the results here...

csv problem : extract > loop > save

webmark,

I only see one extractor pattern where you're extracting a value for one of your variables ("ID"). I'm assuming there are two others for NAME and ADDRESS. If that's true, then you'll want to call your testsave script after the file is scraped, NOT from the extractor pattern(s).

The reason is that each time you call your testsave script to write to the file, you always write a hard return:

out.write( "\n" );

So, call your script once you have a value for each of the variables, write to your file (along with the hard return), null out your variables as I described earlier, and on the next loop do the same thing.
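To see why the timing matters, here is a minimal, self-contained sketch in plain Java (outside screen-scraper; the method names are illustrative, not its API) of writing one complete row only when all three values are present. In the real scrape, the values would come from session.getVariable(...) and the row would go to out.write(...).

```java
// Build one CSV row only when every column has a value.
public class CsvRow {
    // Quote a single field; null becomes an empty field.
    static String quote(String value) {
        return value == null ? "" : "\"" + value + "\"";
    }

    // Join ID, NAME and ADDRESS into one line ending with a hard return.
    static String buildRow(String id, String name, String address) {
        return quote(id) + "," + quote(name) + "," + quote(address) + "\n";
    }

    // True only when the row is complete and ready to be written.
    static boolean isComplete(String id, String name, String address) {
        return id != null && name != null && address != null;
    }

    public static void main(String[] args) {
        // Writing after every pattern application: two of the three
        // values are still null, which produces rows like 1,,,,,
        System.out.print(buildRow("1", null, null));
        // Writing once per scraped listing, when all values are set:
        if (isComplete("1", "NAME01", "ADDRESS01")) {
            System.out.print(buildRow("1", "NAME01", "ADDRESS01"));
        }
    }
}
```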

Also, I keep seeing people using the ~@IGNORE@~ tag. Can you tell me where you found it? It's been deprecated (no longer used) because it's too unpredictable. As an alternative please use something like ~@JUNK@~ and choose an appropriate regular expression from the drop-down.

Hope that helps,

Scott

csv problem : extract > loop > save

ok, yes... it's maybe a little hard to understand without examples, and with my quite bad English... ;)

I will try to explain better what's happening :

[b]1) here the stucture of my session[/b]
[img]http://img139.imagevenue.com/loc1171/th_35427_session_122_1171lo.jpg[/img]

[b]2) here the code i'm using :[/b]
testloop script

for( int i = 1; i < 30; i++ )
{
   session.setVariable( "PAGE", String.valueOf(i) );

   // Scrape the scrapeable file (change the name to fit yours).
   session.scrapeFile( "testfile" );
}

testsave script

// Fix format issues.
String fixString(String value)
{
   if (value != null)
   {
      value = value.replaceAll("\"", "\'");
      value = value.replaceAll("&amp;", "&");
      value = value.replaceAll("

csv problem : extract > loop > save

Webmark,

Could you check one thing first? For the scrapeable file where your URL has the "page=~#PAGE#~" querystring GET parameter, could you look under its Parameters tab and make sure there isn't another reference to the page parameter there?

Also, could you give more examples of what your code and your output look like? I'm having a hard time understanding your description of what's going on.

Thanks,
Scott

csv problem : extract > loop > save

I tried your suggestion, Todd, but I still have problems writing to the CSV.
I know the problem is with the looping.

I have 26 pages to extract data from, so in the scrapeable file address I put a variable "page=~#PAGE#~", and the loop through the pages works well.

The problem is that I can't manage to loop through all the 10 blocks of data to extract them from a single page.

If, for example, I put 30 as the limit of the loop, I get a CSV with 26 well-ordered rows (id, name, address, etc.) + 4 duplicates of the last one... and at the same time, in the log I see that (correctly) it cannot find page=27, page=28, etc.

I think I need two different loops, one for the pages and one for the data in a single page... but I don't know how to do it.

Can someone help me?
Thanks

csv problem : extract > loop > save

thanks Scott,
I will try your solution and post the results...

Thanks again, I hope it works...
WM

csv problem : extract > loop > save

webmark,

I believe the problem will be solved by calling your write-to-file script after the scrapeable file has finished, rather than after each extractor pattern.

When doing it this way, you'll also need to change all of your dataRecord.get calls to session.getVariable. AND be sure to null out all of your session variables at the end of your write-to-file script, like so:

session.setVariable("myVar", null);

This way they don't come back around on the next loop.

-Scott