Tidies the string by performing actions based on the values of the settings map.
The available tidy settings and their default values are listed below. An operation whose key is missing from the settings map is not performed.
Map Key | Default Value | Description of operation performed |
---|---|---|
trim | true | Trims whitespace from values |
convertNullStringToLiteral | true | Converts the unquoted string null to the null literal; a quoted value such as "null" is left unchanged |
convertLinks | true | Preserves links by converting <a href="link">text</a> to text (link). Tries to resolve URLs if scrapeableFile isn't null. Does nothing unless both the opening and closing <a> tags are present |
removeTags | true | Removes HTML tags and attempts to convert line-break tags such as <br> to new lines in the result |
removeSurroundingQuotes | true | Removes quotes from values surrounded by them, so "value" becomes value |
convertEntities (professional and enterprise editions only) | true | Converts HTML entities |
removeNewLines | false | Removes all new lines from the text, replacing each with a space |
removeMultipleSpaces | true | Converts runs of multiple spaces to a single space while preserving new lines |
convertBlankToNull | false | Converts blank strings to the null literal |
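
The way the settings map gates each operation can be sketched in Python. This is an illustrative emulation only, not the product's implementation: the function name `tidy_sketch` is hypothetical, the order of operations is an assumption, and only the simpler string operations from the table are covered.

```python
import re

def tidy_sketch(value, settings):
    """Hypothetical emulation of a few tidy operations (order is assumed)."""
    if value is None:
        return None
    # An operation runs only if its key is present and true;
    # a missing key means the operation is skipped.
    if settings.get("trim"):
        value = value.strip()
    # Checked before quote removal so that a quoted "null" stays a string.
    if settings.get("convertNullStringToLiteral") and value == "null":
        return None
    if (settings.get("removeSurroundingQuotes") and len(value) >= 2
            and value[0] == '"' and value[-1] == '"'):
        value = value[1:-1]
    if settings.get("removeNewLines"):
        value = re.sub(r"\r?\n", " ", value)
    if settings.get("removeMultipleSpaces"):
        # Only spaces are collapsed, so new lines are preserved.
        value = re.sub(r" {2,}", " ", value)
    if settings.get("convertBlankToNull") and value.strip() == "":
        return None
    return value

settings = {"trim": True, "removeSurroundingQuotes": True,
            "removeMultipleSpaces": True}
tidied = tidy_sketch('  "some   value"  ', settings)
```

With an empty map, the sketch returns the value unchanged, which mirrors the rule that a missing key disables its operation.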
Returns the tidied string.
Version | Description |
---|---|
5.5.26a | Available in all editions. |
5.5.28a | Now uses a Map for the settings, rather than bit flags. |
Assuming the extracted text's HTML code was:
<a href="http://www.somelink.com">This</a> was great because of these reasons:<br />
1 - Some reason<br />
2 - Another reason<br />
3 - Final reason
The output text would be:
This (http://www.somelink.com) was great because of these reasons:
1 - Some reason
2 - Another reason
3 - Final reason
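
The transformation in this example can be approximated with two regular-expression passes, one for convertLinks and one for removeTags. This is a rough sketch under stated assumptions, not the actual implementation: `convert_links` and `remove_tags` are hypothetical helper names, URL resolution against a scrapeableFile is omitted, and a line break that directly follows a `<br>` tag is assumed to be absorbed so that no blank lines appear.

```python
import re

def convert_links(html):
    # <a href="link">text</a> becomes text (link); an anchor missing
    # either its start or end tag does not match and is left alone.
    pattern = r'<a\s+[^>]*href="([^"]*)"[^>]*>(.*?)</a>'
    return re.sub(pattern, r"\2 (\1)", html, flags=re.IGNORECASE | re.DOTALL)

def remove_tags(html):
    # Assumption: <br> (plus a directly following line break) maps to one new line.
    html = re.sub(r"<br\s*/?>\n?", "\n", html, flags=re.IGNORECASE)
    # Strip any remaining tags.
    return re.sub(r"<[^>]+>", "", html)

source = (
    '<a href="http://www.somelink.com">This</a> was great because of these reasons:<br />\n'
    "1 - Some reason<br />\n"
    "2 - Another reason<br />\n"
    "3 - Final reason"
)

# Links are converted first, since removing tags would discard the href.
result = remove_tags(convert_links(source))
print(result)
```

Converting links before removing tags matters in this sketch: once the `<a>` tag is stripped, the URL can no longer be recovered.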