Loop through array
Hi,
I need to call a scrape file for each day of the year so I've constructed a list of dates but I need a little help in constructing (and understanding the syntax) something to store these in (an array?) and writing a loop to use this information.
Something like
For each (date) in (array)
call scrapefile
rgds/alex
You're on the right track!
Hi there-- sorry for the delayed response. (We've been busy getting the new support site up!)
You'll likely need to control your entire project from a main script. That is, set every scrapeableFile in the scraping session to "This scrapeable file will be invoked manually from a script", then add your master script to the scraping session and set it to run "Before scraping session begins".
As for running a separate scrapeableFile for each listed date, you'll likely have to do the following things inside of your main script...
1) For storing the dates, I'd try to use a Hashtable instead of an Array. Each date is a String that serves as a key in the Hashtable, and the value it maps to is also a String: the name of the scrapeableFile that you would like to execute on that date.
Example (in Java):
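import java.util.Hashtable;
// Keys are dates; values are the names of the scrapeableFiles to run
Hashtable scrapeLookup = new Hashtable();
scrapeLookup.put("September 11", "sep11--WorldWideBrands.com");
scrapeLookup.put("February 29", "leapyear!--the craigslist.com scrape");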
etc, etc. If you've already put all your dates into a really big array, you'll have to shift that implementation over to this Hashtable one. The keys of the Hashtable in the example above are "September 11" and "February 29", and the Strings they point to are "sep11--WorldWideBrands.com" and "leapyear!--the craigslist.com scrape", respectively. The value stored under each key is the name of the scrapeableFile you'd like to run.
2) Use a Date object (as in Java) to get the current date, and format it the same way as the keys in the above Hashtable. We'll assume that the result is put into a variable named "date_today", and that you've had it put into a "full-month-name day" format (to match your Hashtable keys above).
3) Check to see if the Hashtable has a key matching today's date (ie, a date that you want to run a scrape on). Example, continuing with the Hashtable name as given above:
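// Only scrape when today's date is one of the Hashtable's keys
if (scrapeLookup.containsKey(date_today.toString()))
{
// get() returns an Object, so cast it back to the file name String
session.scrapeFile((String) scrapeLookup.get(date_today.toString()));
}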
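So, to guard against dates that are not in your Hashtable, we test to see if such a date (in the form of a Hashtable key) is present. The ".toString()" call is there because your Hashtable keys are Strings, so the lookup has to be done with a String as well. Every Java object has a ".toString()" method, but be aware that calling it directly on a raw java.util.Date returns the full default form (day of week, abbreviated month, day, time, time zone, year), which will not match a key like "September 11". So "date_today" needs to hold the already-formatted value from step 2.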
If the key was found, the script scrapes the file that the key points to. You could alternatively replace the "session.scrapeFile" part with "session.executeScript", so that this master script executes some other script rather than jumping directly into a scrapeable file. The concept is the same: the name of the scrapeableFile or script to execute comes from a variable. Your choice.
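Steps 1 and 3 can be pulled together into a minimal sketch that runs outside of screen-scraper. The assumptions here: the class name MasterScriptSketch, the FileRunner interface, and the runForDate method are all made up for illustration, with FileRunner standing in for the "session.scrapeFile" call. The generic type parameters are plain Java and may need to be dropped if your script interpreter rejects them.

```java
import java.util.Hashtable;

class MasterScriptSketch {
    // Hypothetical stand-in for the session object; only mimics session.scrapeFile(name).
    interface FileRunner {
        void scrapeFile(String name);
    }

    // Step 1: map each date key to the name of the scrapeableFile to run on that date.
    static Hashtable<String, String> buildLookup() {
        Hashtable<String, String> scrapeLookup = new Hashtable<String, String>();
        scrapeLookup.put("September 11", "sep11--WorldWideBrands.com");
        scrapeLookup.put("February 29", "leapyear!--the craigslist.com scrape");
        return scrapeLookup;
    }

    // Step 3: run the file registered for dateKey, if any.
    // Returns false (and does nothing) when the date is not in the table.
    static boolean runForDate(Hashtable<String, String> lookup, String dateKey, FileRunner runner) {
        if (lookup.containsKey(dateKey)) {
            runner.scrapeFile(lookup.get(dateKey));
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // This runner only prints the name; in a real script the call would go to the session.
        FileRunner printer = new FileRunner() {
            public void scrapeFile(String name) {
                System.out.println("Would scrape: " + name);
            }
        };
        Hashtable<String, String> lookup = buildLookup();
        runForDate(lookup, "September 11", printer); // key present, runner is called
        runForDate(lookup, "March 3", printer);      // key absent, nothing happens
    }
}
```

In an actual master script the body of runForDate would simply call session.scrapeFile (or session.executeScript) with the looked-up name.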
One last note, though... the Date class can be really annoying to work with, but check the Java API for its methods, particularly those for formatting a Date into the form that you'd like (in this example, my format was just a simple full month name followed by the day). A simple parameter-less call to create a new instance of a Date will not give you this same format. You'll have to look that one up... I can't recall exactly how to format Date objects... not well enough to post a definitive answer about the syntax, at least :)
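For what it's worth, here is a minimal sketch of one way to get the "full-month-name day" form, assuming the standard java.text.SimpleDateFormat class with the pattern "MMMM d" (full month name, then day of month without a leading zero) and an English locale. The class name DateKeySketch and the method toKey are made up for illustration; treat this as a starting point rather than a definitive answer.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class DateKeySketch {
    // Formats a Date as "<full month name> <day>", matching the Hashtable keys above.
    // Locale.ENGLISH keeps the month names in English regardless of the machine's settings.
    static String toKey(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("MMMM d", Locale.ENGLISH);
        return format.format(date);
    }

    public static void main(String[] args) {
        // new Date() is "right now"; the result is a String usable as a lookup key.
        String date_today = toKey(new Date());
        System.out.println(date_today);
    }
}
```

Since the result is already a String, it can be passed to containsKey and get directly, without a further ".toString()" call.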
Hope that helps! I think this is more effective than using a For loop to go through the whole list of dates. If you're only interested in executing a certain scrape per date, this avoids the time spent in the For loop. Test if it exists, and if so, then run it. If not, then nothing happens.
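For comparison, if you do want the For loop from the original question (one scrape call for every day of the year), a rough sketch could look like the following. It assumes the same "MMMM d" date format as the Hashtable keys; the class name DateLoopSketch and the method datesInYear are made up for illustration, and the loop body only prints each date where a real script would make its scrape call.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.List;
import java.util.Locale;

class DateLoopSketch {
    // Builds one "<full month name> <day>" String for every day of the given year.
    static List<String> datesInYear(int year) {
        SimpleDateFormat format = new SimpleDateFormat("MMMM d", Locale.ENGLISH);
        // Start at noon on January 1 so daylight-saving shifts cannot move the date.
        Calendar cal = new GregorianCalendar(year, Calendar.JANUARY, 1, 12, 0);
        List<String> dates = new ArrayList<String>();
        while (cal.get(Calendar.YEAR) == year) {
            dates.add(format.format(cal.getTime()));
            cal.add(Calendar.DAY_OF_MONTH, 1); // step forward one day
        }
        return dates;
    }

    public static void main(String[] args) {
        // "For each (date) in (array)": visit every date in order.
        for (String date : datesInYear(2008)) {
            // In a master script, this is where the scrape call for this date would go,
            // e.g. a session.scrapeFile(...) call with the file name for that date.
            System.out.println(date);
        }
    }
}
```

The loop visits every date in the list, while the Hashtable check only ever looks up the one date you care about, so which one fits depends on whether you want all dates in a single run or just today's.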