dataSet

Overview

The dataSet object holds all data records extracted by an extractor pattern after it has been applied as many times as possible to the HTML retrieved by a scrapeable file. A data set is analogous to a result or record set that would be returned from a database query. A data set contains any number of data records, which are analogous to rows in a database.

The dataSet object provides methods for retrieving the information that has been gathered.

See example usage: Iterate over DataSets & DataRecords.

DataSet

DataSet DataSet ( void )
DataSet DataSet ( ArrayList dataRecords )

Description

Manually create a DataSet.

Parameters

  • dataRecords (optional) Java ArrayList of DataRecord elements.

Return Values

Returns DataSet object.

Change Log

Version Description
4.5 Available for all editions.

Class Location

com.screenscraper.common.DataSet

Examples

Manually Create DataSet

 // Create DataSet
 myDataSet = new DataSet();

 // Create DataRecord
 myDataRecord = new DataRecord();
 myDataRecord.put( "STATE", "AZ");

 // Add DataRecord to DataSet
 myDataSet.addDataRecord( myDataRecord );

Create DataSet from Array List

 // Create Array List
 ArrayList dataRecords = new ArrayList();

 // Create DataRecord
 myDataRecord = new DataRecord();
 myDataRecord.put( "STATE", "AZ");

 // Add DataRecord to the "dataRecords" ArrayList
 dataRecords.add( myDataRecord );

 // Create DataSet From ArrayList.
 myDataSet = new DataSet( dataRecords );

See additional example usage: Iterate over DataSets & DataRecords.

addDataRecord

void dataSet.addDataRecord ( DataRecord dataRecord )

Description

Add a DataRecord to a DataSet.

Parameters

  • dataRecord The DataRecord to add to the DataSet.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Add Data Record to DataSet

 // Create DataSet
 myDataSet = new DataSet();

 // Create DataRecord
 myDataRecord = new DataRecord();
 myDataRecord.put( "STATE", "AZ");

 // Add DataRecord to DataSet
 myDataSet.addDataRecord( myDataRecord );

See Also

See additional example usage: Iterate over DataSets & DataRecords.

clearDataRecords

void dataSet.clearDataRecords ( )

Description

Remove all DataRecord objects from the DataSet.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Remove DataRecords from DataSet

 // Removes all DataRecord objects from the dataSet object.
 dataSet.clearDataRecords();

See Also

See additional example usage: Iterate over DataSets & DataRecords.

deleteDataRecord

void dataSet.deleteDataRecord ( int dataRecordNumber )

Description

Remove a DataRecord from the DataSet.

Parameters

  • dataRecordNumber Index of the DataRecord in the DataSet, as an integer. Indexes are zero-based, so the first DataRecord is at index 0.

Return Values

Returns void.

Change Log

Version Description
4.5 Available for all editions.

Examples

Remove One DataRecord from DataSet

 // Deletes the third data record in the set. Remember that data sets
 // are zero-based.

 dataSet.deleteDataRecord( 2 );
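
Because indexes are zero-based, deleting several records in a forward loop can skip entries if the remaining records shift down by one after each deletion. The sketch below illustrates the usual workaround, walking from the last index to the first. It uses a plain java.util.ArrayList as a stand-in for the DataSet: that DataSet renumbers its records the same way is an assumption, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: a plain list plays the role of the DataSet's records.
class BackwardDeleteSketch {

    // Removes every entry equal to `unwanted`, walking from the end so that
    // removing an element never changes the index of one not yet visited.
    static List<String> removeAll(List<String> records, String unwanted) {
        List<String> copy = new ArrayList<String>(records);
        for (int i = copy.size() - 1; i >= 0; i--) {
            if (copy.get(i).equals(unwanted)) {
                copy.remove(i); // analogous to dataSet.deleteDataRecord( i )
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> states = Arrays.asList("AZ", "UT", "UT", "NV");
        System.out.println(removeAll(states, "UT")); // prints [AZ, NV]
    }
}
```

In a script, the same backward loop would call dataSet.deleteDataRecord( i ) in place of copy.remove( i ).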

findValue

Object dataSet.findValue ( String valueToFind, String columnToMatch, String columnToReturn )

Description

Retrieve a field's value in a data set based on another field.

Parameters

  • valueToFind Value being looked for, as a string.
  • columnToMatch Column/token name where the value is being searched for, as a string.
  • columnToReturn Column/token name whose value should be returned, as a string.

Return Values

Returns the value in the returned column, usually a string (unless records have been manually added). If no match is found, null is returned.

Change Log

Version Description
5.0 Added for all editions.

Examples

Get Value of Token based on Another Token

 // Create new DataSet
 DataSet myDataSet = new DataSet();

 // Create DataRecords
 DataRecord john = new DataRecord();
 john.put("FIRST_NAME", "John");
 john.put("LAST_NAME", "Doe");

 DataRecord jill = new DataRecord();
 jill.put("FIRST_NAME", "Jill");
 jill.put("LAST_NAME", "Smith");

 // Add dataRecords to dataSet
 myDataSet.addDataRecord(john);
 myDataSet.addDataRecord(jill);

 // Search dataSet for "John" in the "FIRST_NAME"
 // field. Return the value of the "LAST_NAME" in
 // the same record
 String result = (String) myDataSet.findValue("John", "FIRST_NAME", "LAST_NAME");

 // Write result to log
 session.log(result); // Logs "Doe"
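
To make the lookup semantics concrete, here is a minimal sketch of how such a search could behave, written against plain Java maps rather than the real DataSet and DataRecord classes. It assumes the first matching record wins and that values are compared by their string form. Both are assumptions, not confirmed behavior, and the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the lookup; not screen-scraper's implementation.
class FindValueSketch {

    // Returns the columnToReturn value of the first record whose
    // columnToMatch value equals valueToFind, or null when nothing matches.
    static Object findValue(List<Map<String, Object>> records,
                            String valueToFind,
                            String columnToMatch,
                            String columnToReturn) {
        for (Map<String, Object> record : records) {
            Object candidate = record.get(columnToMatch);
            if (candidate != null && candidate.toString().equals(valueToFind)) {
                return record.get(columnToReturn);
            }
        }
        return null;
    }

    // Builds a two-field record, mirroring the DataRecord.put calls above.
    static Map<String, Object> person(String first, String last) {
        Map<String, Object> record = new HashMap<String, Object>();
        record.put("FIRST_NAME", first);
        record.put("LAST_NAME", last);
        return record;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = new ArrayList<Map<String, Object>>();
        records.add(person("John", "Doe"));
        records.add(person("Jill", "Smith"));

        System.out.println(findValue(records, "Jill", "FIRST_NAME", "LAST_NAME")); // prints Smith
        System.out.println(findValue(records, "Jane", "FIRST_NAME", "LAST_NAME")); // prints null
    }
}
```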

See Also

  • get() [dataSet] - Get a single piece of data held by a DataRecord in the DataSet.

get

Object dataSet.get ( int dataRecordNumber, String identifier )

Description

Get a single piece of data held by a DataRecord in the DataSet.

Parameters

  • dataRecordNumber Index of the DataRecord in the DataSet, as an integer. Indexes are zero-based, so the first DataRecord is at index 0.
  • identifier The name of the element to retrieve from the DataRecord, as a string.

Return Values

Returns the value associated with the DataRecord identifier. It will be a string unless you have added non-string values to the DataRecord.

Change Log

Version Description
4.5 Available for all editions.

Examples

Get Token Value From DataRecord

 // Gets the value "CITY_CODE" from the first data record in the
 // data set.

 firstCityCode = dataSet.get( 0, "CITY_CODE" );
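
Since get returns an Object that is only usually a string, a script that needs text may want to normalize the result first. The helper below is a hypothetical sketch of one way to do that in plain Java. It is not part of the screen-scraper API, and whether get can return null for a missing identifier is an assumption the helper merely guards against.

```java
// Hypothetical helper; the name asString is illustrative only.
class ValueAsStringSketch {

    // Converts any value to its string form, passing null through unchanged
    // so callers can still tell "no value" apart from an empty string.
    static String asString(Object value) {
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        System.out.println(asString("PHX"));                  // prints PHX
        System.out.println(asString(Integer.valueOf(85001))); // prints 85001
    }
}
```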

getAllDataRecords

ArrayList dataSet.getAllDataRecords ( )

Description

Get all DataRecords in the DataSet.

Parameters

This method does not receive any parameters.

Return Values

Returns an ArrayList of DataRecord objects.

Change Log

Version Description
4.5 Available for all editions.

This method is provided as a convenience. The recommended way to iterate over data records in a data set is to use getNumDataRecords and getDataRecord.

Examples

Loop Through DataRecords

 // Stores all of the data records in the variable allData.
 allData = dataSet.getAllDataRecords();

 // Loop through each of the data records.
 for( i = 0; i < allData.size(); i++ )
 {
     // Store the current data record in the variable myDataRecord.
     myDataRecord = allData.get( i );

     // Output the "PRODUCT_NAME" value from the data record to the log.
     session.log( "Product name: " + myDataRecord.get( "PRODUCT_NAME" ) );
 }

getCharacterSet

String dataSet.getCharacterSet ( )

Description

Get the character set being applied to the scraped data.

Parameters

This method does not receive any parameters.

Return Values

Returns the character set applied to the scraped data, as a string. If a character set has not been specified, it defaults to the character set specified in the settings dialog box.

Change Log

Version Description
5.0 Added for all editions.

Examples

Get Character Set

 // Get the character set of the dataSet
 charSetValue = dataSet.getCharacterSet();

getDataRecord

DataRecord dataSet.getDataRecord ( int dataRecordNumber )

Description

Get one DataRecord in the DataSet.

Parameters

  • dataRecordNumber Index of the DataRecord in the DataSet, as an integer. Indexes are zero-based, so the first DataRecord is at index 0.

Return Values

Returns a DataRecord (Hashtable object). If there is no DataRecord at the specified index, an error is thrown.

Change Log

Version Description
4.5 Available for all editions.

Examples

Get DataRecords in a Loop

 // Loop through each of the data records.
 for( i = 0; i < dataSet.getNumDataRecords(); i++ )
 {
     // Store the current data record in the variable myDataRecord.
     myDataRecord = dataSet.getDataRecord( i );

     // Output the "PRODUCT_NAME" value from the data record to the log.
     session.log( "Product name: " + myDataRecord.get( "PRODUCT_NAME" ) );
 }

getFirstValueForKey

Object dataSet.getFirstValueForKey ( String key )

Description

Get the first non-null value, in a data set, for a given token.

Parameters

  • key Name of the column whose value is returned, as a string.

Return Values

Returns the first non-null value in the column, usually a string (unless records have been manually added). If none is found, null is returned.

Change Log

Version Description
5.0 Added for all editions.

Examples

Get First Non-null Token Value

 // Gets the value of the first "CITY_CODE" in the
 // data set.

 fieldValue = dataSet.getFirstValueForKey("CITY_CODE");
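
The "first non-null value" rule matters when some records lack the token. The following sketch models that rule with plain Java maps. It is a hypothetical illustration rather than the actual implementation, and it assumes records are scanned in index order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model; records are plain maps rather than DataRecord objects.
class FirstValueSketch {

    // Scans the records in order and returns the first non-null value
    // stored under `key`, or null if no record has one.
    static Object firstValueForKey(List<Map<String, Object>> records, String key) {
        for (Map<String, Object> record : records) {
            Object value = record.get(key); // null when the key is absent
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = new ArrayList<Map<String, Object>>();

        Map<String, Object> noCode = new HashMap<String, Object>();
        noCode.put("CITY_NAME", "Phoenix");
        records.add(noCode);

        Map<String, Object> withCode = new HashMap<String, Object>();
        withCode.put("CITY_CODE", "PHX");
        records.add(withCode);

        System.out.println(firstValueForKey(records, "CITY_CODE")); // prints PHX
        System.out.println(firstValueForKey(records, "ZIP"));       // prints null
    }
}
```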

See Also

  • get() [dataSet] - Get a single piece of data held by a DataRecord in the DataSet.
  • findValue() [dataSet] - Retrieve a field's value in a data set based on another field.

getNumDataRecords

int dataSet.getNumDataRecords ( )

Description

Get the number of DataRecords in the DataSet.

Parameters

This method does not receive any parameters.

Return Values

Returns the number of DataRecords in the DataSet, as an integer.

Change Log

Version Description
4.5 Available for all editions.

Examples

Get the Number of DataRecords in the DataSet

 // Loop through each of the data records.
 for( i = 0; i < dataSet.getNumDataRecords(); i++ )
 {
     // Store the current data record in the variable myDataRecord.
     myDataRecord = dataSet.getDataRecord( i );

     // Output the "PRODUCT_NAME" value from the data record to the log.
     session.log( "Product name: " + myDataRecord.get( "PRODUCT_NAME" ) );
 }

See Also

  • size() [dataSet] - Return the number of dataRecords in the dataSet.

join

void dataSet.join ( DataSet dataSet )

Description

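Merge the data records of another data set into this one. As the log output in the example shows, records are combined by position: record 0 with record 0, record 1 with record 1, and so on.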

Parameters

  • dataSet Data set whose records are to be merged.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

Examples

Merge DataRecords from DataSets

 // Create dataSet
 DataSet dataSet = new DataSet();

 // Load dataSet with information
 for (i = 0; i < 3; ++i)
 {
     DataRecord record = new DataRecord();
     record.put("DATA_SET_ONE", i);
     dataSet.addDataRecord(record);
 }

 // Create another dataSet
 DataSet anotherDataSet = new DataSet();

 // Load dataSet with information
 for (i = 0; i < 2; ++i)
 {
     DataRecord record = new DataRecord();
     record.put("DATA_SET_TWO", i);
     anotherDataSet.addDataRecord(record);
 }

 // Join DataSets
 dataSet.join(anotherDataSet);

 // Write each DataRecord of the merged DataSet to the log
 for (i = 0; i < dataSet.getNumDataRecords(); ++i)
 {
     DataRecord record = dataSet.getDataRecord(i);
     session.log("DataRecord " + i + ": " + record.toString());
 }

 // Log Output:
 // DataRecord 0: {DATA_SET_TWO=0, DATA_SET_ONE=0}
 // DataRecord 1: {DATA_SET_TWO=1, DATA_SET_ONE=1}
 // DataRecord 2: {DATA_SET_ONE=2}
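
The log output suggests a position-by-position merge. A minimal sketch of that behavior with plain Java collections follows. Two points are assumptions not confirmed by this page: that surplus records in the second set would be appended, and that on a key clash the incoming value wins. The class and helper names are hypothetical, and the key order in the printed maps differs from the Hashtable order shown above because the sketch uses LinkedHashMap.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a positional merge; not the DataSet implementation.
class JoinSketch {

    // Copies the fields of other[i] into target[i] for every shared index.
    static void join(List<Map<String, Object>> target, List<Map<String, Object>> other) {
        for (int i = 0; i < other.size(); i++) {
            if (i < target.size()) {
                target.get(i).putAll(other.get(i)); // incoming value wins on a clash (assumption)
            } else {
                // Assumption: records with no counterpart are appended.
                target.add(new LinkedHashMap<String, Object>(other.get(i)));
            }
        }
    }

    // Builds `count` records, each holding its own index under `key`.
    static List<Map<String, Object>> numbered(String key, int count) {
        List<Map<String, Object>> records = new ArrayList<Map<String, Object>>();
        for (int i = 0; i < count; i++) {
            Map<String, Object> record = new LinkedHashMap<String, Object>();
            record.put(key, i);
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> one = numbered("DATA_SET_ONE", 3);
        List<Map<String, Object>> two = numbered("DATA_SET_TWO", 2);
        join(one, two);
        for (int i = 0; i < one.size(); i++) {
            System.out.println("DataRecord " + i + ": " + one.get(i));
        }
        // prints:
        // DataRecord 0: {DATA_SET_ONE=0, DATA_SET_TWO=0}
        // DataRecord 1: {DATA_SET_ONE=1, DATA_SET_TWO=1}
        // DataRecord 2: {DATA_SET_ONE=2}
    }
}
```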

setCharacterSet

void dataSet.setCharacterSet ( String characterSet )

Description

Set the character set to be used for rendering dataSet values.

Parameters

  • characterSet Name of a character set recognized by Java, as a string. Java provides a list of supported character sets in its documentation.

Return Values

Returns void.

Change Log

Version Description
5.0 Added for all editions.

This will only change the character set on the current data set. If you want it to be changed for all data sets, you would need to change it in the settings dialog box or screen-scraper.properties file.

Examples

Set Character Set

 // Set the character set of the dataSet
 dataSet.setCharacterSet("UTF-8");
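
Since the parameter must name a character set that Java recognizes, it can be worth checking a name before passing it in. The sketch below uses the standard java.nio.charset.Charset.isSupported call to do so. The wrapper class and the isUsable method are hypothetical names introduced for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper around the standard Charset lookup.
class CharsetCheckSketch {

    // Returns true only when the running JVM can resolve the given name.
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false; // the name contains characters not allowed in charset names
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsable("UTF-8"));         // prints true
        System.out.println(isUsable("NOT-A-CHARSET")); // prints false
        System.out.println(isUsable("bad name!"));     // prints false
    }
}
```

A script could call such a check and fall back to the default character set when it returns false.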

size

int dataSet.size ( )

Description

Get the number of DataRecords in the DataSet.

Parameters

This method does not receive any parameters.

Return Values

Returns the number of DataRecords in the DataSet, as an integer.

Change Log

Version Description
6.0.3a Available for all editions.

Examples

Get the Number of DataRecords in the DataSet

 // Loop through each of the data records.
 for( i = 0; i < dataSet.size(); i++ )
 {
     // Store the current data record in the variable myDataRecord.
     myDataRecord = dataSet.getDataRecord( i );

     // Output the "PRODUCT_NAME" value from the data record to the log.
     log.info( "Product name: " + myDataRecord.get( "PRODUCT_NAME" ) );
 }

writeToFile

void dataSet.writeToFile ( String fileName ) (professional and enterprise editions only)

Description

Write DataSet string and integer contents to a file. Fields are tab-delimited and records are separated by hard returns (line breaks).

Parameters

  • fileName File path where the contents of the DataSet should be written. If the file already exists, the contents are appended to it.

Return Values

Returns void. If the file cannot be written to then an error will be thrown.

Change Log

Version Description
4.5 Available for professional and enterprise editions.

Examples

Write DataSet Contents to a File

 // Writes the data found in the current data set to the file
 // "extracted_data.txt".

 dataSet.writeToFile( "C:/site_data/extracted_data.txt" );
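
Because the output is tab-delimited with one record per line, it can be read back with standard Java file APIs. The sketch below shows one way to parse such a file. The class name, the sample rows, and the choice of UTF-8 are assumptions made for illustration: the actual encoding and column order of a written file are not specified here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical reader for a tab-delimited, line-per-record text file.
class TabFileReaderSketch {

    // Splits one line on tab characters; the -1 limit keeps trailing empty fields.
    static List<String> parseLine(String line) {
        return Arrays.asList(line.split("\t", -1));
    }

    // Reads every non-empty line of the file as one record.
    static List<List<String>> readRows(Path file) throws IOException {
        List<List<String>> rows = new ArrayList<List<String>>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (!line.isEmpty()) {
                rows.add(parseLine(line));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data written to a temporary file for the demonstration.
        Path sample = Files.createTempFile("extracted_data", ".txt");
        Files.write(sample, "Phoenix\tAZ\nProvo\tUT\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readRows(sample)); // prints [[Phoenix, AZ], [Provo, UT]]
        Files.delete(sample);
    }
}
```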