The cleanser lets you declare a set of source words or phrases and their replacements for use in match rules. Create a text file in which each line contains a source string and its replacement string, separated by “=>”. For examples, see the documentation page.
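As a rough illustration of the file format only, the sketch below reads dictionary text into an ordered list of source/replacement pairs. The function name parse_dictionary is hypothetical, and skipping lines that lack “=>” is an assumption of this sketch; it is not the CleanseAdapter's actual parser.

```python
def parse_dictionary(text):
    """Parse dictionary text into an ordered list of (source, replacement) pairs."""
    rules = []
    for line in text.splitlines():
        if "=>" not in line:
            continue  # assumption: blank or malformed lines are skipped
        # Split on the first "=>" only; spaces are kept because they are significant.
        source, _, replacement = line.partition("=>")
        rules.append((source, replacement))
    return rules


rules = parse_dictionary("st=>street\nave=>avenue\n")
print(rules)
```

A list is used rather than a dict so that file order and repeated source values are preserved.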
The most common questions about the CleanseAdapter dictionary:
How does the sequence of entries in the dictionary file impact the outcome?
Entries in the dictionary are processed from top to bottom. The order has no impact unless the same source value appears more than once.
Note that if source values repeat, the outcome depends on the order of the entries.
If a source term matches multiple entries, how is it processed?
Having multiple entries with the same source value and different replacements is not recommended. If the file contains such entries, the behavior is as follows:
Example 1:
st=>street
st=>str
Outcome: strreet
Example 2:
st=>str
st=>street
Outcome: streetr
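The two outcomes above are consistent with each entry being applied in file order to the result of the previous entry. The sketch below reproduces them under that assumption, taking the input to be the string "st". The function apply_rules is hypothetical and uses plain substring replacement; it is a model of the observed behavior, not the CleanseAdapter's implementation.

```python
def apply_rules(text, rules):
    """Apply each (source, replacement) pair in order to the running result."""
    for source, replacement in rules:
        text = text.replace(source, replacement)
    return text


# Example 1: st=>street, then st=>str
print(apply_rules("st", [("st", "street"), ("st", "str")]))  # strreet

# Example 2: st=>str, then st=>street
print(apply_rules("st", [("st", "str"), ("st", "street")]))  # streetr
```

In both cases the second entry matches the "st" at the start of the first entry's output, which is why the results look mangled.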
Should punctuation rules (period, comma, exclamation point, etc.) be at the start or end of the file?
The order of punctuation rules makes no difference. The usual practice is to place them toward the bottom of the file.
What are the recommendations for handling spaces?
To handle spaces, put a space before and/or after the source and replacement strings.
Examples:
[space]A[space]=>[space]ABC[space]
A[space]=>ABC[space]
[space]A=>[space]ABC
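To show why the surrounding spaces matter, the sketch below compares a rule with spaces on both sides against the same rule without them. It assumes [space] stands for a literal space character and models replacement as plain substring substitution with Python's str.replace; the sample value "PLAZA A WEST" is made up for illustration.

```python
value = "PLAZA A WEST"  # hypothetical sample input

# Rule " A => ABC ": only the standalone "A" between two spaces is replaced.
with_spaces = value.replace(" A ", " ABC ")
print(with_spaces)

# Rule "A=>ABC": every "A" is replaced, including those inside "PLAZA".
without_spaces = value.replace("A", "ABC")
print(without_spaces)
```

Under these assumptions the spaced rule leaves "PLAZA" intact, while the unspaced rule also rewrites the letters inside it.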
What is the file size limit?
The file should be smaller than 10 MB. If it contains more than 1000 lines, performance may be impacted.
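A dictionary can be checked against these limits before it is uploaded. The sketch below is one way to do that; the function name check_dictionary_text is hypothetical, and it assumes UTF-8 encoding and that "10 MB" means 10 * 1024 * 1024 bytes, neither of which is confirmed by the documentation.

```python
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" interpreted as 10 MiB
MAX_LINES = 1000              # above this, performance may be impacted


def check_dictionary_text(text):
    """Return a list of warnings for dictionary text; an empty list means no issues."""
    warnings = []
    size = len(text.encode("utf-8"))  # assumption: file is stored as UTF-8
    if size >= MAX_BYTES:
        warnings.append(f"file is {size} bytes; it should be smaller than {MAX_BYTES}")
    line_count = len(text.splitlines())
    if line_count > MAX_LINES:
        warnings.append(f"file has {line_count} lines; more than {MAX_LINES} may slow matching")
    return warnings


print(check_dictionary_text("st=>street\nave=>avenue\n"))
```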
References:
String Replacement Cleanser - link