SqlDuplicateFilter

Overview

SqlDuplicateFilters are designed to filter duplicates when more information than just a primary key might define a duplicate entry. For example, you might define a unique person by their SSN, driver's license number, or by their first name, last name, and phone number. It is also possible that a single person may have multiple phone numbers, and if any of them match then the duplicate constraint should be met. Using an SqlDuplicateFilter can check for conditions such as this and correctly recognize duplicate entries.

This feature is only available for Professional and Enterprise editions of screen-scraper.

Examples

Register a new duplicate filter

 // Import classes
 import com.screenscraper.datamanager.sql.*;

 //Get the data manager
 SqlDataManager dm = session.getVariable( "_DATAMANAGER" );

 // Register a new duplicate filter
 // Check for duplicate people, so register it for the people table
 SqlDuplicateFilter nameFilter = SqlDuplicateFilter.register("people", dm);

 //Add constraints to match when a first name, middle initial, and last name match a different row in the database
 nameFilter.addConstraint( "people", "first_name" );
 nameFilter.addConstraint( "people", "middle_initial" );
 nameFilter.addConstraint( "people", "last_name" );

Match Duplicates across tables

Sometimes the data will need to be filtered across multiple tables, or possibly different constaints might indicate a duplicate. An example of this is a person might be a duplicate if their SSN matches OR if their driver's license number matches. Alternatively, they may be a duplicate when they have the same first name, last name, and phone number.

import com.screenscraper.datamanager.sql.SqlDuplicateFilter;

/*
Perform the setup of the SqlDataManager, as shown previously, and name the variable dm.
*/


//register an SqlDuplicateFilter with the DataManager for the social security number
SqlDuplicateFilter ssnDuplicate = SqlDuplicateFilter.register( "person", dm );
ssnDuplicate.addConstraint( "person", "ssn" );

//register an SqlDuplicateFilter with the DataManager for the drivers license
SqlDuplicateFilter licenseDuplicate = SqlDuplicateFilter.register( "person", dm );
licenseDuplicate.addConstraint( "person", "drivers_license" );

//register an SqlDuplicateFilter with the DataManager for the name/phone number
//where the person table has a child table named phone.
SqlDuplicateFilter namePhoneDuplicate = SqlDuplicateFilter.register( "person", dm );
namePhoneDuplicate.addConstraint( "person", "first_name" );
namePhoneDuplicate.addConstraint( "person", "last_name" );
namePhoneDuplicate.addConstraint( "phone", "phone_number" );

Duplicate filters are checked in the order they are added, so consider perfomance when creating duplicate filters. If, for instance, most duplicates will match on the social security number, create that filter before the others. Also make sure to add indexes into your database on those columns that you are selecting by or else performance will rapidly degrade as your database gets large.

Duplicates will be filtered by any one of the filters created. If multiple fields must all match for an entry to be a duplicate, create a single filter and add each of those fields as constraints, as shown in the third filter created above. In other words, constraints added to a single filter will be ANDed together, while seperate filters will be ORed.

addConstraint

void sqlDuplicateFilter.addConstraint ( String table, String column ) (professional and enterprise editions only)

Description

Add a constraint that checks the value of new entries against the value of entries already in the database for a given column and table.

Parameters

  • table Name of the database table, either the same table the filter is registered to or one of it's children
  • column The column that will be checked in the table for a duplicate with new values

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Register a new duplicate filter

 // Import classes
 import com.screenscraper.datamanager.sql.*;

 //Get the data manager
 SqlDataManager dm = session.getVariable( "_DATAMANAGER" );

 // Register a new duplicate filter
 // Check for duplicate people, so register it for the people table
 SqlDuplicateFilter nameFilter = SqlDuplicateFilter.register("people", dm);

 //Add constraints to match when a first name, middle initial, and last name match a different row in the database
 nameFilter.addConstraint( "people", "first_name" );
 nameFilter.addConstraint( "people", "middle_initial" );
 nameFilter.addConstraint( "people", "last_name" );

Sometimes the data will need to be filtered across multiple tables, or possibly different constaints might indicate a duplicate. An example of this is a person might be a duplicate if their SSN matches OR if their driver's license number matches. Alternatively, they may be a duplicate when they have the same first name, last name, and phone number.

import com.screenscraper.datamanager.sql.SqlDuplicateFilter;

/*
Perform the setup of the SqlDataManager, as shown previously, and name the variable dm.
*/


//register an SqlDuplicateFilter with the DataManager for the social security number
SqlDuplicateFilter ssnDuplicate = SqlDuplicateFilter.register( "person", dm );
ssnDuplicate.addConstraint( "person", "ssn" );

//register an SqlDuplicateFilter with the DataManager for the social security number
SqlDuplicateFilter licenseDuplicate = SqlDuplicateFilter.register( "person", dm );
licenseDuplicate.addConstraint( "person", "drivers_license" );

//register an SqlDuplicateFilter with the DataManager for the name/phone number
//where the person table has a child table named phone.
SqlDuplicateFilter namePhoneDuplicate = SqlDuplicateFilter.register( "person", dm );
namePhoneDuplicate.addConstraint( "person", "first_name" );
namePhoneDuplicate.addConstraint( "person", "last_name" );
namePhoneDuplicate.addConstraint( "phone", "phone_number" );

Duplicate filters are checked in the order they are added, so consider perfomance when creating duplicate filters. If, for instance, most duplicates will match on the social security number, create that filter before the others. Also make sure to add indexes into your database on those columns that you are selecting by or else performance will rapidly degrade as your database gets large.

Duplicates will be filtered by any one of the filters created. If multiple fields must all match for an entry to be a duplicate, create a single filter and add each of those fields as constraints, as shown in the third filter created above. In other words, constraints added to a single filter will be ANDed together, while seperate filters will be ORed.

register

SqlDuplicateFilter SqlDuplicateFilter.register ( String table, SqlDataManager dataManager ) (professional and enterprise editions only)

Description

Create an SqlDuplicateFilter for a specific table and register it with the data manager.

Parameters

  • table Name of the database table with the primary key, as a string.
  • dataManager The data manager that will use this filter when adding entries to the database.

Return Values

Returns an SqlDuplicateFilter that can then be configured for duplicate entries.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Register a new duplicate filter

 // Import classes
 import com.screenscraper.datamanager.sql.*;

 //Get the data manager
 SqlDataManager dm = session.getVariable( "_DATAMANAGER" );

 // Register a new duplicate filter
 // Check for duplicate people, so register it for the people table
 SqlDuplicateFilter nameFilter = SqlDuplicateFilter.register("people", dm);

 //Add constraints to match when a first name, middle initial, and last name match a different row in the database
 nameFilter.addConstraint( "people", "first_name" );
 nameFilter.addConstraint( "people", "middle_initial" );
 nameFilter.addConstraint( "people", "last_name" );

Match Duplicates across tables

 // Import classes
 import com.screenscraper.datamanager.sql.*;

 //Get the data manager
 SqlDataManager dm = session.getVariable( "_DATAMANAGER" );

 // Register a new duplicate filter
 // Check for duplicate people, so register it for the people table
 SqlDuplicateFilter personFilter = SqlDuplicateFilter.register("people", dm);

 // Catch duplicates when a new entry has the same first name, last name, and phone number as another entry
 // Note that phone is a child table of people
 personFilter.addConstraint( "people", "first_name" );
 personFilter.addConstraint( "people", "last_name" );
 personFilter.addConstraint( "phone", "phone_number" );