Is there no solution to this problem?
Hi!
I have a problem that I can't find a solution for.
This is more or less a sample of the problem. I have an HTML "index" page with the following structure:
Genre1
Film1 Link
Film2 Link
Film3 Link
Genre2
Film4 Link
Film5 Link
Genre3
Film6 Link
Film7 Link
Film8 Link
Film9 Link
Film10 Link
Film11 Link
Genre4
Film12 Link
Film13 Link
Film14 Link
Film15 Link
Each film page contains:
Title
Author
Argument
Price
I want to generate a CSV file with this info:
Genre1 Title1 Author1 Argument1 Price1
Genre1 Title2 Author2 Argument2 Price2
Genre1 Title3 Author3 Argument3 Price3
Genre2 Title4 Author4 Argument4 Price4
Genre2 Title5 Author5 Argument5 Price5
Genre3 Title6 Author6 Argument6 Price6
.
.
.
With your application it is very easy to scrape the film links and the contents of each film page and to generate a CSV file, but I don't know how to associate the genre with the rest of the info.
None of my tests has given satisfactory results.
Is there a good solution for this?
Thanks, and sorry for my pathetic English :)
Regards
Hi,
Sounds like you're getting closer. Based on the code you've given, I'm guessing your script will look something like this:
import java.io.FileWriter;

// This assumes the session variable "BLOCK" holds the
// block of <option> lines from which the genres are to be extracted.
// This is probably a simple matter of checking the "Save in session variable"
// checkbox for the "BLOCK" extractor pattern token.
// Also, assuming this script is being run from the scrapeable file
// containing your extractor pattern, you can probably just use this:
// "GENRE_PATTERN"
// as the name of the extractor pattern instead of this:
// "AllFilms:AllFilms-Genre:GENRE_PATTERN"
GenreVar = scrapeableFile.extractData( session.getVariable( "BLOCK" ), "AllFilms:AllFilms-Genre:GENRE_PATTERN" );

FileWriter out = null;
try
{
    // Open up the file to be appended to.
    out = new FileWriter( "dvds.txt", true );
    for( i = 0; i < GenreVar.getNumDataRecords(); i++ )
    {
        session.log( "Writing a genre to a file." );
        genreDataRecord = GenreVar.getDataRecord( i );
        // I'm not sure what you called them, but these "GENRE"
        // and "GENRE_ID" values should be the names of the extractor
        // pattern tokens in your "GENRE_PATTERN" extractor pattern.
        out.write( genreDataRecord.get( "GENRE" ) + "\t" );
        out.write( genreDataRecord.get( "GENRE_ID" ) );
        out.write( "\n" );
    }
    // Close up the file.
    out.close();
}
catch( Exception e )
{
    session.log( "An error occurred while writing the data to a file: " + e.getMessage() );
}
Your "executeScript" method call looks fine, so you probably just need to upgrade your version of screen-scraper. If you don't mind helping us test, I'd recommend upgrading to the latest alpha version: [url]http://blog.screen-scraper.com/2006/09/13/version-27214a-of-screen-scraper-available/[/url]. Just be sure to back up your work before doing that :) Feel free to post a reply if upgrading doesn't seem to do the trick.
Kind regards,
Todd
OK, I think I understand you.
Now I have a block pattern with a session variable "BLOCK" that holds the HTML for one genre together with its title links. This pattern calls the script "Films-Block" after each pattern application.
I have another pattern that extracts only the genre ("GENRE_PATTERN").
The Films-Block script:
import com.screenscraper.common.*;
GenreVar = scrapeableFile.extractData( "BLOCK" , "AllFilms:AllFilms-Genre:GENRE_PATTERN" );
This code doesn't produce an error, but I don't know whether GenreVar has the correct value.
I have another script ("AllFilms-End") that saves all the data to a text file.
I suppose my question is very simple, but I'm not a coder and this is my first time with Java.
I want to write the value of GenreVar to the text file. How do I do that? Can you show a code example?
First I want to write the genre to a text file using the extractData method; once that works, I'll move on to the film links.
Another question: is this a syntax error?
Code: session.executeScript( "Films-End" );
The error message was: Error in method invocation: Method executeScript( java.lang.String ) not found in class'com.screenscraper.scraper.ScrapingSession'
Thanks a lot!
Hi,
These types of situations can be a bit tricky to work with, but they're the very reason we created this method: [url]http://www.screen-scraper.com/support/docs/api_documentation.php#extractData[/url].
The basic idea is that you'll create an extractor pattern that will pull an entire block of genre plus film data. For example, your extractor pattern would pull this entire block:
Genre3
Film6 Link
Film7 Link
Film8 Link
Film9 Link
Film10 Link
Film11 Link
You'd have one extractor pattern token to pull the genre, and one extractor pattern token to pull all of the film links. You'd then create another extractor pattern that would pull only a single film link. Within a script, you'd then apply the pattern that pulls a film link to the text extracted by the extractor pattern token that extracts all of the film links. You'd use the scrapeableFile.extractData method to do this.
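To make the two-level idea concrete, here is a rough sketch in plain Java. It does not use the screen-scraper API at all; it only imitates the approach with java.util.regex, so treat it as an illustration rather than something to paste into a script. The <h2>/<ul> markup, the class name GenreBlocks and the method extractRows are all made up for the example; your real page and pattern names will differ. The outer pattern plays the role of the block extractor pattern (one token for the genre, one for all of the film links), and the inner pattern plays the role of the single-film-link pattern applied to the text the outer one extracted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GenreBlocks {
    // Outer pattern: one match per genre block. Group 1 is the genre name,
    // group 2 is the raw text holding all of that genre's film links.
    // The <h2>/<ul> markup is hypothetical, not taken from the real page.
    private static final Pattern BLOCK = Pattern.compile(
        "<h2>(.*?)</h2>\\s*<ul>(.*?)</ul>", Pattern.DOTALL);

    // Inner pattern: one match per film link inside a block.
    private static final Pattern LINK = Pattern.compile(
        "<a href=\"(.*?)\">(.*?)</a>");

    // Returns one {genre, url, linkText} row per film link, so the genre
    // travels with every film found inside its block.
    public static List<String[]> extractRows(String html) {
        List<String[]> rows = new ArrayList<>();
        Matcher block = BLOCK.matcher(html);
        while (block.find()) {
            String genre = block.group(1).trim();
            // Apply the inner pattern only to the text the outer one extracted.
            Matcher link = LINK.matcher(block.group(2));
            while (link.find()) {
                rows.add(new String[] { genre, link.group(1), link.group(2).trim() });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String html =
            "<h2>Genre1</h2><ul>"
          + "<li><a href=\"film1.html\">Film1</a></li>"
          + "<li><a href=\"film2.html\">Film2</a></li></ul>"
          + "<h2>Genre2</h2><ul>"
          + "<li><a href=\"film4.html\">Film4</a></li></ul>";
        // Print each row as tab-separated genre, URL and link text.
        for (String[] row : extractRows(html)) {
            System.out.println(String.join("\t", row));
        }
    }
}
```

Because each row already carries its genre, the title, author and so on scraped from the film page can later be written out on the same line as that genre.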
Hopefully that does the trick. Feel free to let me know if I can clarify.
Kind regards,
Todd Wilson