tidyDataRecord
DataRecord sutil.tidyDataRecord ( DataRecord record, boolean ignoreLowerCaseKeys ) (professional and enterprise editions only)
DataRecord sutil.tidyDataRecord ( DataRecord record, Map<String, Boolean> settings ) (professional and enterprise editions only)
DataRecord sutil.tidyDataRecord ( DataRecord record, Map<String, Boolean> settings, boolean ignoreLowerCaseKeys ) (professional and enterprise editions only)
DataRecord sutil.tidyDataRecord ( ScrapeableFile scrapeableFile, DataRecord record ) (professional and enterprise editions only)
DataRecord sutil.tidyDataRecord ( ScrapeableFile scrapeableFile, DataRecord record, boolean ignoreLowerCaseKeys ) (professional and enterprise editions only)
DataRecord sutil.tidyDataRecord ( ScrapeableFile scrapeableFile, DataRecord record, Map<String, Boolean> settings ) (professional and enterprise editions only)
DataRecord sutil.tidyDataRecord ( ScrapeableFile scrapeableFile, DataRecord record, Map<String, Boolean> settings, boolean ignoreLowerCaseKeys ) (professional and enterprise editions only)
Description
Tidies the DataRecord by performing the operations enabled in the given settings map. If no settings map is given, the values obtained from sutil.getDefaultTidySettings() are used. Each value in the record that is a String is tidied; keys are not modified. The given record is not modified. Instead, a new record with the tidied values is returned.
Parameters
- record The DataRecord to tidy (values in the record will not be overwritten with the tidied values)
- scrapeableFile (optional) The current ScrapeableFile, used for resolving relative URLs when tidying links
- settings (optional) The operations to perform when tidying, using a Map<String, Boolean>
The tidy settings and their default values are given below. If a key is missing from the settings map, that operation will not be performed.
Map Key | Default Value | Description of operation performed |
---|---|---|
trim | true | Trims whitespace from values |
convertNullStringToLiteral | true | Converts the string 'null' (without quotes) to the null literal (unless it has quotes around it, such as "null") |
convertLinks | true | Preserves links by converting <a href="link">text</a> to text (link). Will try to resolve URLs if scrapeableFile isn't null. Note that if there isn't a start and end <a> tag, this will do nothing |
removeTags | true | Removes HTML tags, and attempts to convert line break HTML tags such as <br> to a new line in the result |
removeSurroundingQuotes | true | Removes quotes from values surrounded by them: "value" becomes value |
convertEntities (professional and enterprise editions only) | true | Converts HTML entities |
removeNewLines | false | Removes all new lines from the text, replacing them with a space |
removeMultipleSpaces | true | Converts multiple spaces to a single space, and preserves new lines |
convertBlankToNull | false | Converts blank strings to the null literal |
- ignoreLowerCaseKeys (optional) True if values with keys containing lowercase characters should be ignored
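The way a settings map switches operations on and off can be illustrated with a small standalone sketch. This is not screen-scraper's implementation: the class TidySettingsSketch, the helpers enabled and tidy, and the order in which the operations run are all assumptions made for illustration. Only the map keys come from the settings listed above, and only a subset of them is modelled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that mimics a few of the documented tidy operations.
class TidySettingsSketch {

    // A key that is absent (or mapped to false) means the operation is skipped.
    static boolean enabled(Map<String, Boolean> settings, String key) {
        return Boolean.TRUE.equals(settings.get(key));
    }

    // Applies the enabled operations to one value. The ordering is an assumption.
    static String tidy(String value, Map<String, Boolean> settings) {
        if (value == null) {
            return null;
        }
        String result = value;
        if (enabled(settings, "removeNewLines")) {
            result = result.replaceAll("\\r?\\n", " ");   // new lines become spaces
        }
        if (enabled(settings, "removeMultipleSpaces")) {
            result = result.replaceAll(" {2,}", " ");     // new lines are left alone
        }
        if (enabled(settings, "trim")) {
            result = result.trim();
        }
        // Checked before quotes are stripped, so a quoted "null" stays a string.
        if (enabled(settings, "convertNullStringToLiteral") && result.equals("null")) {
            return null;
        }
        if (enabled(settings, "removeSurroundingQuotes")
                && result.length() >= 2
                && result.startsWith("\"") && result.endsWith("\"")) {
            result = result.substring(1, result.length() - 1);
        }
        if (enabled(settings, "convertBlankToNull") && result.trim().isEmpty()) {
            return null;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Boolean> settings = new HashMap<>();
        settings.put("trim", true);
        settings.put("removeMultipleSpaces", true);
        // "removeSurroundingQuotes" is left out on purpose, so the quotes survive.
        System.out.println(tidy("  \"a   b\"  ", settings)); // prints "a b" with the quotes kept
    }
}
```

In a real script, a safer starting point is probably the map returned by sutil.getDefaultTidySettings(), adjusted as needed, since a hand-built map silently disables every operation whose key is missing.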
Return Values
A new DataRecord containing all the tidied values and any values that were not Strings in the original record. Values that were Strings but were not tidied, as well as the DATARECORD value, are not included in the returned record.
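The sketch below models the return behaviour described above, using a plain Map<String, Object> as a stand-in for DataRecord. It is one reading of the description, not the library's code: TidyRecordSketch, tidyRecord, tidyValue and hasLowerCase are hypothetical names, and the assumption that String values skipped because of ignoreLowerCaseKeys are the ones left out of the result is an inference from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of which entries end up in the returned record.
class TidyRecordSketch {

    // Placeholder for the real per-value tidying; here it only trims.
    static String tidyValue(String value) {
        return value.trim();
    }

    // True if the key contains at least one lowercase character.
    static boolean hasLowerCase(String key) {
        return key.chars().anyMatch(Character::isLowerCase);
    }

    static Map<String, Object> tidyRecord(Map<String, Object> record, boolean ignoreLowerCaseKeys) {
        Map<String, Object> tidied = new LinkedHashMap<>(); // the input is never written to
        for (Map.Entry<String, Object> entry : record.entrySet()) {
            String key = entry.getKey();
            Object value = entry.getValue();
            if ("DATARECORD".equals(key)) {
                continue;                                   // never carried over
            }
            if (!(value instanceof String)) {
                tidied.put(key, value);                     // non-String values are copied as-is
            } else if (ignoreLowerCaseKeys && hasLowerCase(key)) {
                // Assumption: an ignored String is "not tidied", so it is omitted.
            } else {
                tidied.put(key, tidyValue((String) value)); // key unchanged, value tidied
            }
        }
        return tidied;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("NAME", "  Widget  ");
        record.put("price_raw", " 9.99 ");
        record.put("COUNT", 3);
        record.put("DATARECORD", "<tr>...</tr>");

        Map<String, Object> tidied = tidyRecord(record, true);
        System.out.println(tidied);                              // {NAME=Widget, COUNT=3}
        System.out.println("[" + record.get("NAME") + "]");      // [  Widget  ]
    }
}
```

Because the result can contain fewer keys than the input, code that saves the tidied record should not assume every original key is still present.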
Change Log
Version | Description |
---|---|
5.5.26a | Available in all editions. |
5.5.28a | Now uses a Map for the settings, rather than bit flags. |
Examples
Tidy all values in an extracted DataRecord
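DataRecord tidied = sutil.tidyDataRecord( scrapeableFile, dataRecord ); // assumes the script's scrapeableFile and dataRecord variables are in scope
// Run code here to save the tidied record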