Why is verifyMatch not matching as expected for a fuzzy match rule using DistinctWordsComparator?

Question

For Individual entities, the fuzzy match rule on the first name is not working as expected.
Match analysis (verifyMatch) reports the fuzzy match as false.


Example:
Source First Name: SHARO
Target First Name: SHARON

The match comparator class configured for this attribute is:

{
    "attribute": "configuration/entityTypes/Individual/attributes/FirstName",
    "parameters": [
        {
            "parameter": "pattern",
            "value": "[a-zA-Z]+"
        },
        {
            "parameter": "useStemmer",
            "value": "true"
        },
        {
            "parameter": "useSoundex",
            "value": "true"
        },
        {
            "parameter": "useNoiseIfEmpty",
            "value": "true"
        }
    ],
    "class": "com.reltio.match.comparator.DistinctWordsComparator"
}

Generated document for the 1st entity:

"FirstName": [
    "sharo"
],
"FirstName~.@ps": [
    "10"
],

Generated document for the 2nd entity:

"FirstName": [
    "sharon"
],
"FirstName~.@ps": [
    "10"
],

Answer

DistinctWordsComparator will not work in this case. Possible workarounds that you can use are:

  1. DistinctWordsComparator with "thresholdChars": "x". With this parameter the comparator splits the value into characters, and "thresholdChars" sets the allowed number or percentage of differences. A possible issue: this comparator also matches character transpositions (e.g. the words 'sharo' and 'rosha' will be matched).

  2. DamerauLevenshteinDistance or DynamicDamerauLevenshteinDistance

  3. SoundexComparator with maxCodeLen=2.

  4. A pattern in DistinctWordsComparator. The pattern can be configured to split the value, for example to compare only the first 5 characters.
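
To illustrate the trade-off between workarounds 1 and 2, here is a minimal Python sketch. It is not Reltio's implementation. It assumes that character-level comparison behaves like an unordered "bag of characters" check, and it uses the optimal string alignment variant of Damerau-Levenshtein distance. The function names `char_bag_differences` and `osa_distance` are hypothetical and exist only for this illustration.

```python
from collections import Counter


def char_bag_differences(a: str, b: str) -> int:
    """Count characters that differ when order is ignored (assumed model
    of a character-level comparison; not Reltio's actual algorithm)."""
    ca, cb = Counter(a), Counter(b)
    return sum(((ca - cb) + (cb - ca)).values())


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein):
    insertions, deletions, substitutions, and adjacent transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition, e.g. "ab" -> "ba"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Order-insensitive comparison cannot tell these two words apart.
print(char_bag_differences("sharo", "rosha"))
# Edit distance sees "sharo" and "sharon" as one insertion apart.
print(osa_distance("sharo", "sharon"))
```

Under these assumptions, an order-insensitive character comparison treats 'sharo' and 'rosha' as identical, while an edit-distance comparator keeps them apart and still treats 'sharo' and 'sharon' as close.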

For example, DynamicDamerauLevenshteinDistance can be tested with the following request:

POST https://prod-h360.reltio.com/reltio/tools/matching/compare
Body:

{
    "first": "SHARO",
    "second": "SHARON",
    "comparatorClass": {
        "parameters": [
            {
                "parameter": "pattern",
                "value": "[a-zA-Z]+"
            }
        ],
        "class": "com.reltio.match.comparator.DynamicDamerauLevenshteinDistance"
    },
    "fuzzy": true
}

The response from this test shows the expected match:

{
    "fuzzy": true,
    "first": "SHARO",
    "second": "SHARON",
    "equals": true,
    "relevance": 0.9722222222222222
}
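
To try several comparators against the same pair of values, the request body can be generated instead of edited by hand. The sketch below only builds the JSON body in the shape shown in this article. The helper name `build_compare_payload` is hypothetical, and sending the request (including authentication) is left out because those details are not covered here.

```python
import json


def build_compare_payload(first, second, comparator_class, parameters=None, fuzzy=True):
    """Build a request body in the shape used by the compare test above.
    `parameters` is a plain dict, converted to the parameter/value list form."""
    return {
        "first": first,
        "second": second,
        "comparatorClass": {
            "parameters": [
                {"parameter": name, "value": value}
                for name, value in (parameters or {}).items()
            ],
            "class": comparator_class,
        },
        "fuzzy": fuzzy,
    }


payload = build_compare_payload(
    "SHARO",
    "SHARON",
    "com.reltio.match.comparator.DynamicDamerauLevenshteinDistance",
    {"pattern": "[a-zA-Z]+"},
)
print(json.dumps(payload, indent=4))
```

Swapping the class name and parameters (for example SoundexComparator with maxCodeLen) gives a body for each of the other workarounds.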

 
