tidyString
String sutil.tidyString ( String value ) (professional and enterprise editions only)
String sutil.tidyString ( String value, Map<String, Boolean> settings ) (professional and enterprise editions only)
String sutil.tidyString ( ScrapeableFile scrapeableFile, String value ) (professional and enterprise editions only)
String sutil.tidyString ( ScrapeableFile scrapeableFile, String value, Map<String, Boolean> settings ) (professional and enterprise editions only)
Description
Tidies the string by performing actions based on the values of the settings map.
Parameters
- value The String to tidy
- settings (optional) The operations to perform when tidying, using a Map<String, Boolean>
The tidy settings and their default values are given below. If a key is missing in the settings map, that operation will not be performed.
Map Key | Default Value | Description of operation performed |
---|---|---|
trim | true | Trims whitespace from values |
convertNullStringToLiteral | true | Converts the string 'null' (without quotes) to the null literal (unless it has quotes around it, such as "null") |
convertLinks | true | Preserves links by converting <a href="link">text</a> to text (link); will try to resolve URLs if scrapeableFile isn't null. Note that if there isn't a start and end <a> tag, this will do nothing |
removeTags | true | Removes HTML tags, and attempts to convert line break HTML tags such as <br> to a new line in the result |
removeSurroundingQuotes | true | Removes quotes from values surrounded by them -- "value" becomes value |
convertEntities (professional and enterprise editions only) | true | Converts HTML entities |
removeNewLines | false | Removes all new lines from the text. Replaces them with a space |
removeMultipleSpaces | true | Converts multiple spaces to a single space, and preserves new lines |
convertBlankToNull | false | Converts blank strings to the null literal |
- scrapeableFile (optional) The current ScrapeableFile, used for resolving relative URLs when tidying links
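Because a key that is missing from the settings map disables that operation, changing a single default means supplying every other key as well. The sketch below shows one way to do that in plain Java. `TidySettings` and its `defaults()` method are hypothetical names, not part of screen-scraper; only the keys and default values are taken from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of screen-scraper): builds a settings map
// holding every documented key at its documented default value.
class TidySettings {
    static Map<String, Boolean> defaults() {
        Map<String, Boolean> settings = new LinkedHashMap<String, Boolean>();
        settings.put("trim", true);
        settings.put("convertNullStringToLiteral", true);
        settings.put("convertLinks", true);
        settings.put("removeTags", true);
        settings.put("removeSurroundingQuotes", true);
        settings.put("convertEntities", true);
        settings.put("removeNewLines", false);
        settings.put("removeMultipleSpaces", true);
        settings.put("convertBlankToNull", false);
        return settings;
    }

    public static void main(String[] args) {
        // Start from the documented defaults, then override one operation.
        Map<String, Boolean> settings = defaults();
        settings.put("removeNewLines", true);
        System.out.println(settings);
        // Inside a screen-scraper script the map would then be passed on:
        // String tidied = sutil.tidyString(value, settings);
    }
}
```

Starting from the full set of defaults keeps the override explicit and avoids silently switching off operations that were simply left out of the map.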
Return Values
The tidied string
Change Log
Version | Description |
---|---|
5.5.26a | Available in all editions. |
5.5.28a | Now uses a Map for the settings, rather than bit flags. |
Examples
Tidy a comment extracted from a website
Assuming the extracted text's HTML code was:
<a href="http://www.somelink.com">This</a> was great because of these reasons:<br />
1 - Some reason<br />
2 - Another reason<br />
3 - Final reason
String comment = sutil.tidyString(scrapeableFile, dataRecord.get("COMMENT"));
The output text would be:
This (http://www.somelink.com) was great because of these reasons:
1 - Some reason
2 - Another reason
3 - Final reason
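For readers without screen-scraper at hand, the following is a rough plain-Java approximation of the convertLinks and removeTags steps applied to the comment above. It is not screen-scraper's implementation: `TidySketch`, its method names, and the regular expressions are all assumptions made for illustration, and it does not resolve relative URLs.

```java
import java.util.regex.Pattern;

// Illustrative approximation only; not how sutil.tidyString is implemented.
class TidySketch {
    // Matches a complete <a href="link">text</a> element.
    private static final Pattern LINK = Pattern.compile(
            "<a\\s+href=\"([^\"]*)\"[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Rewrites <a href="link">text</a> as: text (link)
    static String convertLinks(String html) {
        return LINK.matcher(html).replaceAll("$2 ($1)");
    }

    // Turns <br> variants (plus any whitespace after them) into one newline,
    // then strips every remaining tag.
    static String removeTags(String html) {
        String withBreaks = html.replaceAll("(?i)<br\\s*/?>\\s*", "\n");
        return withBreaks.replaceAll("<[^>]+>", "");
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.somelink.com\">This</a> was great because of these reasons:<br />\n"
                + "1 - Some reason<br />\n"
                + "2 - Another reason<br />\n"
                + "3 - Final reason";
        System.out.println(removeTags(convertLinks(html)));
    }
}
```

Links are converted before tags are removed; in the reverse order the <a> element would already be stripped and the URL lost.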
Run only specific operations
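import java.util.HashMap;
import java.util.Map;

// Only the operations present in the map are performed; all others are skipped.
Map settings = new HashMap();
settings.put("convertEntities", true);
settings.put("trim", true);
String text = sutil.tidyString(" A String to tidy", settings);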
mikes on 10/27/2011 at 1:59 pm