Every unique crosswalk must have a unique key. When a source system provides a primary key with the entity in a record, the primary key must be used as the crosswalk key in the Reltio platform.
Sometimes a source system does not provide a key for its data. For example, each of the following two customer records carries a unique ID for the customer, but no separate ID for the address:
- 1500764, John Smith, 123 Main Street, Canton OH 87552
- 2786453, Jane Smith, 123 Main Street, Canton OH 87552
What is a surrogate crosswalk?
It is a unique ID generated by Reltio using a specific set of attributes.
What purpose does it serve?
If a source system doesn't provide the crosswalk ID, a surrogate crosswalk creates a unique ID for it. It also reduces the number of crosswalks on an entity (profile or record).
Examples of Surrogate Crosswalks
Example 1 (no unique ID from source system for crosswalk)
If your source system (perhaps it is a simple file) doesn’t provide a unique ID for an entity you wish to create, you can specify a source name for the crosswalk and ask Reltio to synthesize the crosswalk value. This is called a surrogate crosswalk.
Reltio synthesizes a surrogate crosswalk value by concatenating crosswalk attributes of your choice and then generating a hash key.
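The idea of concatenating attribute values and hashing the result can be sketched in a few lines of Python. This is only an illustration of the concept: the `surrogate_key` function, the `|` delimiter, the choice of SHA-256, and the sample values are all assumptions, not Reltio's documented algorithm.

```python
import hashlib


def surrogate_key(attribute_values, delimiter="|"):
    """Concatenate the chosen attribute values and hash the result.

    The delimiter and SHA-256 are illustrative assumptions, not the
    algorithm Reltio actually uses.
    """
    # Treat missing values as empty strings so the key stays deterministic.
    joined = delimiter.join("" if v is None else str(v) for v in attribute_values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Made-up sample values: first name, last name, tax ID.
key_a = surrogate_key(["John", "Smith", "123-45-6789"])
key_b = surrogate_key(["John", "Smith", "123-45-6789"])
key_c = surrogate_key(["Jane", "Smith", "123-45-6789"])

print(key_a == key_b)  # identical attribute values yield the same key
print(key_a == key_c)  # a change in any attribute yields a different key
```

Because the key depends only on the chosen attribute values, loading the same record twice resolves to the same crosswalk rather than creating a new one.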
The attributes that will be used to synthesize the crosswalk value are declared in the “surrogateCrosswalks” collection of the entity definition.
Each source that requires a surrogate crosswalk must be declared separately.
Note: Only simple attributes drawn from the entity type can be used to form a surrogate crosswalk key. You cannot use nested or reference sub-attributes.
Let’s take a look at how to configure a surrogate crosswalk. In this example, we have defined a surrogate crosswalk using the first and last names and the tax ID of an employee, for the Finance source. Note that you have to repeat the surrogate crosswalk declaration for each source that needs a surrogate crosswalk.
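A configuration along these lines might look like the following sketch, written as a Python dictionary and printed as JSON. Only the "surrogateCrosswalks" collection name comes from this article. The `source` and `attributes` field names, the URI layout, and the attribute names `FirstName`, `LastName`, and `TaxID` are assumptions for illustration, so check them against your tenant's configuration before use.

```python
import json

# Hypothetical fragment of an Employee entity type definition.
# Only the "surrogateCrosswalks" collection name is taken from the article;
# the field names and URI layout below are assumptions.
ATTR_BASE = "configuration/entityTypes/Employee/attributes"

employee_type = {
    "uri": "configuration/entityTypes/Employee",
    "surrogateCrosswalks": [
        {
            # One declaration per source that needs a surrogate crosswalk.
            "source": "configuration/sources/Finance",
            # Simple attributes whose values are combined into the key.
            "attributes": [
                f"{ATTR_BASE}/FirstName",
                f"{ATTR_BASE}/LastName",
                f"{ATTR_BASE}/TaxID",
            ],
        }
    ],
}

print(json.dumps(employee_type, indent=2))
```

A second source would need its own entry in the same list, with its own `source` value and attribute list.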
Now let’s take a look at the request. You can use any of the available approaches to create a new entity, but in this example we create an Employee with first and last names plus a tax ID, which are used to generate the surrogate, as well as an address and phone number.
You still have to supply a value for the crosswalk ID, but this will be ignored, so you can set it to anything you like.
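A request body for this Employee could be assembled as in the sketch below. The payload shape (a list of entities, each with `type`, `attributes`, and `crosswalks`), the attribute names, and the sample values are assumptions for illustration. The `simple` helper is hypothetical and exists only to keep the example short.

```python
import json


def simple(value):
    # Wrap a value in the list-of-{"value": ...} shape assumed for attributes.
    return [{"value": value}]


employee = {
    "type": "configuration/entityTypes/Employee",
    "attributes": {
        # These three values feed the surrogate crosswalk.
        "FirstName": simple("John"),
        "LastName": simple("Smith"),
        "TaxID": simple("123-45-6789"),
        # Additional data that plays no part in the surrogate key.
        "Phone": simple("555-0100"),
        "Address": [
            {
                "value": {
                    "AddressLine1": simple("123 Main Street"),
                    "City": simple("Canton"),
                }
            }
        ],
    },
    "crosswalks": [
        {
            "type": "configuration/sources/Finance",
            # Required but ignored for a surrogate crosswalk, so any value works.
            "value": "placeholder",
        }
    ],
}

payload = json.dumps([employee], indent=2)
print(payload)
```

The crosswalk `value` is present only because the field is required. The effective crosswalk value is synthesized from the three surrogate attributes.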
- The list of attributes defined in a surrogate crosswalk should be the same as the list configured in the match rule.
- All the attributes that are used in the match rule should be defined in the surrogate crosswalk, even if the source does not provide the information.
- Even if the match rule is lenient, with few attributes, the same set of attributes should be used in the surrogate crosswalk.
Example 2 (crosswalks for refEntity and refRelation are not known, and cardinality is high)
In this example, the crosswalks for refEntity and refRelation are not known.
- The body of this request is the same as the prior example, except that “refEntity” and “refRelation” are not specified.
- A new instance of the “Product” entity will be created with a Reltio crosswalk.
- This could trigger a match rule and cause a merge with an existing entity.
- This approach is commonly used when the crosswalk or ID of the referenced entity is not known.
- CAUTION: If the cardinality is high (in this example, suppose 1000 customers all reference the same product), the product entity could accumulate a large number of crosswalks, which could impact performance. Always try to specify a refEntity crosswalk; otherwise, consider using a surrogate crosswalk with this approach.