screen-scraper public support
Increasing sub-extractor patterns = decreasing reliability?
Hi there,
In one of my scrapes, I pass data via a DATARECORD to a set of sub-extractor patterns. Because the site I'm scraping a) may provide data in 2 languages (English/French) and b) scrambles the information I'm trying to extract among differing arrangements/configurations, I've set up an increasing number of sub-extractor patterns to handle each arrangement/configuration. (I'm at 14 patterns and counting...) What I've noticed is that as I add more sub-extractor patterns, SS becomes less stable.
More specifically, when I add a new pattern, SS usually does one of the following:
1 issue: possible bug?
I've used the Shopping .sss as a base to learn and create my own .sss.
When I try to create the pattern, the generated '~@DataRecord@~' token doesn't cover all of the text I've selected; some of the selected text is left outside the expression.
Example:
Hi, I'm Bob and I would like to be friends.
Becomes:
~@DataRecord@~ o be friends.
I thought this was due to the length of the expression, but after creating a significantly long one and running it, I don't think this is the case.
Edit:
Second issue was a data type issue. I resolved it.
XMLWriter is not working
Hi there,
I have to explain a lot, because I don't know why my scripts don't work.
My Initialize Script:
import com.screenscraper.xml.XmlWriter;
import java.util.*;
import java.text.*;
DateFormat df1 = DateFormat.getDateInstance(DateFormat.SHORT);
Date now = new Date();
String s1 = df1.format(now);
XmlWriter xmlWriter = new XmlWriter("C:\\Users\\Basti\\Documents\\Scrapes\\soul\\" + s1 + ".xml", "soul");
session.setVariable( "XML_WRITER", xmlWriter );
session.setVariable("PAGE_NUMBER", "1");
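One thing that may be worth checking in the script above, offered as a possibility rather than a confirmed diagnosis: depending on the JVM's locale, DateFormat.SHORT can produce a date containing slashes (the US locale does), and slashes are not valid in a Windows file name. The sketch below builds the file name from a fixed pattern instead. The class and method names (XmlFileNames, datedXmlPath) are hypothetical and not part of screen-scraper.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class XmlFileNames {
    // Hypothetical helper: returns "<directory><yyyy-MM-dd>.xml".
    // The fixed pattern avoids locale-dependent separators such as "/".
    static String datedXmlPath(String directory, Date when) {
        SimpleDateFormat stamp = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        return directory + stamp.format(when) + ".xml";
    }
}
```

In the initialize script, the result of datedXmlPath could be passed as the first argument to the XmlWriter constructor in place of the concatenated string.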
Call another script
Hi, I'm wondering if there's any way to call a script using the basic version. This is the only feature I'd need from professional and it doesn't seem worth it to pay $550 for it.
Stop current scrape and return
Hi,
Currently I have to manually stop a scrape on a paginated website because the next button on the site won't disappear when there are no more results. I would like to search the site for multiple terms as in tutorial 7. However, if I call session.stopScraping() when there are no more results, it completely stops the program even if there are more terms I want to search on. Is there any way to return to the loop and continue scraping?
Edit: Got this figured out.
Need help saving dataRecordNumber to CSV
Hello,
I'm still unable to save the dataRecordNumber that screen scraper generates as it scrapes to CSV. I'm hoping someone can provide me with the code to do this.
Here is my csv writer:
try
{
session.log( "Writing data to a file." );
// Open up the file to be appended to.
out = new FileWriter( "csvfile.txt", true );
// Write out the data to the file.
out.write( session.getVariable( "PAGE" ) + "\t" );
out.write( dataRecord.get( "VARIABLE1" ) + "\t" );
out.write( dataRecord.get( "VARIABLE2" ) + "\t" );
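The excerpt above is cut off, so the following is only a sketch of one possible approach: keep your own running counter and write it as the first column of each tab-delimited row. It does not rely on any screen-scraper call for the record number; the class and method names (NumberedRowWriter, nextRow, append) are hypothetical.

```java
import java.io.FileWriter;
import java.io.IOException;

class NumberedRowWriter {
    // Our own counter; an assumption standing in for screen-scraper's record number.
    private int recordNumber = 0;

    // Formats one tab-delimited line, prefixed with the next record number.
    String nextRow(String... fields) {
        recordNumber++;
        return recordNumber + "\t" + String.join("\t", fields) + "\n";
    }

    // Appends the next numbered row to the given file, closing it afterwards.
    void append(String path, String... fields) throws IOException {
        try (FileWriter out = new FileWriter(path, true)) {
            out.write(nextRow(fields));
        }
    }
}
```

Inside a scrape, the counter would have to survive between script invocations; storing it with session.setVariable and reading it back with session.getVariable is one way that might work.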
MySQL Driver
Hi,
I'm having an issue loading the MySQL driver. I'm following this tutorial: http://community.screen-scraper.com/writing_extracted_data_to_a_database and I'm having problems with step 1, "To start, download the appropriate JDBC Driver connector Jar file for your particular database and place it in the lib/ext folder where screen-scraper is installed."
I've downloaded the file, but I'm not sure which part to put in the ext folder. I've tried it multiple ways. Could someone let me know exactly what I'm supposed to put there? Thanks.
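For what it's worth, the MySQL Connector/J download is normally an archive holding documentation and source alongside a single .jar file, and it is that .jar (not the whole archive or an unpacked folder) that a lib/ext style folder expects. A quick way to check whether the driver was picked up is to try loading its class. The sketch below assumes the driver class is com.mysql.jdbc.Driver (older Connector/J) or com.mysql.cj.jdbc.Driver (Connector/J 8); the helper name DriverCheck.isOnClasspath is hypothetical.

```java
class DriverCheck {
    // Returns true if the named class can be loaded from the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but unable to initialize.
            return false;
        }
    }
}
```

Calling DriverCheck.isOnClasspath("com.mysql.jdbc.Driver") from a script after restarting screen-scraper should show whether the jar ended up in the right place.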
Ending execution
Hi,
I'm scraping thefind.com, but if you hit next and there aren't any products, it doesn't hide the next button. Is there any way I can force it to quit if there aren't any results?
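When the next button never disappears, one common pattern is to stop paging as soon as a page yields no products, with a page cap as a safety net. The sketch below shows only that control flow in plain Java; fetchPage is a hypothetical stand-in for however your scrape requests a page and collects its products, and Pager.collectUntilEmpty is not a screen-scraper call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

class Pager {
    // Collects items page by page, stopping at the first empty page
    // or after maxPages pages, whichever comes first.
    static List<String> collectUntilEmpty(IntFunction<List<String>> fetchPage, int maxPages) {
        List<String> all = new ArrayList<>();
        for (int page = 1; page <= maxPages; page++) {
            List<String> items = fetchPage.apply(page);
            if (items == null || items.isEmpty()) {
                break; // no products on this page: leave the paging loop only
            }
            all.addAll(items);
        }
        return all;
    }
}
```

Because the loop exits with break rather than ending the whole run, an outer loop over several search terms can simply carry on with the next term.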
An error occurred while writing the data to a file: null
Hi, I'm trying to do the second tutorial. Currently I'm at the end where I need to save the data, but I'm getting the error "An error occurred while writing the data to a file: null." It's because the dataRecord variable isn't being set. Where should it be set? This is my code, for reference:
try
{
session.log( "Writing data to a file." );
// Open up the file to be appended to.
out = new FileWriter( "dvds.txt", true );
// Write out the data to the file.
out.write( dataRecord.get( "TITLE" ) + "\t" );
Read from Database, then set SessionVariable
At last I got my scrape running and dumping the results into my database!! Hurrah! But now as part of my two part scrape I need to read records from my database.
I have done the initial scrape, which was to retrieve a list of brands (Brand) and the URL of each brand's page (BrandURL).
My plan is to read out the next BrandURL to scrape, set a session variable with it, then use that session variable to remove the Brand record from the table. I'm pretty sure a DELETE will work much the same as an INSERT, but how do I set a session variable from a SELECT query? (mysqlstring="SELECT Brand FROM `Brands` LIMIT 0 , 1";)