Audience: Customer tenant administrators.
Applies to: Reltio Data Cloud tenants after Clone or Snapshot Restore operations
Purpose
This article provides a checklist to verify that a Reltio tenant clone or snapshot restore completed successfully and that search, matching, and downstream analytics are consistent.
Physical tenant configuration cannot be cloned, so compare the physical tenant configuration of the source and target tenants using the following API endpoint. If you need assistance, open a support ticket.
GET {envUri}/reltio/tenants/{tenantId}
Scope & Assumptions
- You have administrator access to the target tenant.
- You can use Reltio Console and/or the Tasks API.
- Replace placeholders in API examples, such as https://<env>.reltio.com and <tenant_id>, with your own values.
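The physical tenant comparison described under Purpose can be scripted. The following is a minimal sketch, not an official Reltio client: the helper names (physical_tenant_url, fetch_json, diff_config) are hypothetical, and it assumes the endpoint returns JSON and accepts a Bearer access token.

```python
import json
import urllib.request


def physical_tenant_url(env_uri: str, tenant_id: str) -> str:
    """Build the endpoint from this article: GET {envUri}/reltio/tenants/{tenantId}."""
    return f"{env_uri.rstrip('/')}/reltio/tenants/{tenant_id}"


def fetch_json(url: str, token: str) -> dict:
    """GET a JSON document. The Bearer header is an assumption about your auth setup."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def diff_config(source, target, path=""):
    """Return (path, source_value, target_value) tuples wherever the two configs differ.

    Nested dictionaries are compared key by key; any other value is compared as a whole.
    A key missing on one side is reported with None for that side.
    """
    if isinstance(source, dict) and isinstance(target, dict):
        differences = []
        for key in sorted(set(source) | set(target)):
            child = f"{path}.{key}" if path else str(key)
            differences += diff_config(source.get(key), target.get(key), child)
        return differences
    return [] if source == target else [(path, source, target)]
```

To use it, fetch both documents with fetch_json(physical_tenant_url(env_uri, tenant_id), token) for the source and the target, then pass them to diff_config. Differences such as tenant IDs or names are expected; review the remainder.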
Quick Checklist
Confirm Support Operation Status
Ensure the clone/restore request shows Completed in your support case before enabling integrations or write access to the target tenant.
Background: Overview & FAQ — Tenant cloning, backups, and snapshots:
- Cloning is the process of creating an exact copy of an entire tenant in the Reltio platform.
- Restoring from a snapshot will replace your entire environment with the state it was in at the time of the snapshot. Use it only when absolutely necessary, with full awareness of the potential for data loss. If you need a targeted rollback (e.g., only configuration), it is recommended to re-import the saved configuration files manually rather than perform a full restore. Supported data size for free snapshot-and-restore operations is up to 50 TB.
Documentation: https://docs.reltio.com/en/reltio/whats-in-the-box/whats-in-the-box-at-a-glance/tenants-at-a-glance/tenant-architecture/tenant-cloning-backups-and-snapshots
Run a Consistency Check (Search vs. Main Storage) - Only run if a problem is found with the Clone.
This is the primary integrity check to verify search indices are consistent with primary storage. Consistency check refers to the systematic process of ensuring that data remains consistent, accurate, and synchronized across different storage layers, partitions, or components within a system. The Memory Safe Cassandra ES Consistency Task is one of several consistency tasks that can be run to address specific issues detected in your system. These tasks are time-consuming and should be executed based on the specific issues you encounter.
Only run this task if a problem is found with the completed clone process.
Run a Memory Safe Cassandra ES Consistency Task, especially after a clone, when you suspect or observe inconsistency between Cassandra and Elasticsearch (ES).
Consider running it when you see any of the following:
- Search / UI not reflecting the latest data
  - Version mismatches between ES and Cassandra
- Known ES index disruptions around the time of clone
  - Index rebuilds were interrupted or failed.
  - Bulk operations or data loads during/after clone that may have skipped ES events.
  - Any operational incident where ES was unavailable or lagging significantly while writes to Cassandra continued.
- Entities present in one store but missing in the other
  - IDs that exist in Cassandra but are absent in ES, or vice versa.
  - Symptoms: search shows fewer objects than entity counts or UI shows ghost records not resolvable by API.
- Match / Potential Match behavior that suggests ES lag
  - Potential matches or search facets not aligning with what you see in the underlying entities, and you’ve already ruled out:
    - Match rule config issues
    - “Not a Match” flags
Console: Tenant Management → New Job → Consistency check.
How-to: https://docs.reltio.com/en/applications/console/tenant-management-applications/tenant-management-at-a-glance/jobs-at-a-glance/creating-a-consistency-check-job
API: ES Consistency Task: https://docs.reltio.com/en/developer-resources/system-administration-apis/system-administration-apis-at-a-glance/tasks-api/memory-safe-cassandra-es-consistency-task
API Example
POST {ApplicationURL}/api/{tenant_id}/esCassandraConsistencyCheck?fixInconsistency=true
Success criteria: Task completes without inconsistency counts.
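As an illustration, the request above could be issued from a script along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, a Bearer access token is assumed for authentication, and the response body is returned as raw text because its format is not described in this article.

```python
import urllib.parse
import urllib.request


def consistency_check_url(application_url: str, tenant_id: str, fix: bool) -> str:
    """Build the task URL shown in this article, with fixInconsistency set explicitly."""
    query = urllib.parse.urlencode({"fixInconsistency": str(fix).lower()})
    return f"{application_url.rstrip('/')}/api/{tenant_id}/esCassandraConsistencyCheck?{query}"


def start_consistency_check(application_url: str, tenant_id: str, token: str, fix: bool) -> str:
    """POST the task request and return the raw response text.

    Assumption: the API accepts an empty POST body and a Bearer token.
    """
    request = urllib.request.Request(
        consistency_check_url(application_url, tenant_id, fix),
        data=b"",  # empty body, so Content-Length: 0 is sent
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

The fix flag has no default in this sketch so that the caller decides deliberately whether the task should repair what it finds.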
Rebuild/verify matching
The Rebuild Match Table task is a process in Reltio's Environment Management that recalculates and recreates match tables for entities. This task is essential when there are changes in the grouping functionality or when inconsistencies are detected. You should plan to run Rebuild Match Table after a clone or restore from a snapshot as part of secondary storage synchronization.
API example
POST {ApplicationURL}/api/{tenantId}/rebuildmatchtable?distributed=true&taskPartsCount=6
Key query parameters:
- tenantId: required for platform admin call; in the tenant-scoped URL it’s in the path
- entityType: limit to one L3, e.g. configuration/entityTypes/Individual
- skipEntitiesCount: default 0
- entitiesLimit: default infinity
- distributed: true/false (default false)
- taskPartsCount: used when distributed=true
- deleteOldCF: default false
- distributedTaskIndex: for individual parts when distributed
- rebuildInBackground: default false
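The query parameters listed above can be assembled programmatically to avoid typos. The sketch below is illustrative only: rebuild_match_table_url is a hypothetical helper, the allowed names are taken from the list in this article, and the check on taskPartsCount simply encodes the note that it applies when distributed=true.

```python
import urllib.parse

# Query parameters named in this article (tenantId is part of the path here).
ALLOWED_PARAMS = {
    "entityType", "skipEntitiesCount", "entitiesLimit", "distributed",
    "taskPartsCount", "deleteOldCF", "distributedTaskIndex", "rebuildInBackground",
}


def rebuild_match_table_url(application_url: str, tenant_id: str, **params) -> str:
    """Build the tenant-scoped Rebuild Match Table URL from keyword arguments."""
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    if "taskPartsCount" in params and not params.get("distributed"):
        raise ValueError("taskPartsCount is only used when distributed=true")
    # Render Python booleans as lowercase true/false, as in the article's example.
    normalized = {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    base = f"{application_url.rstrip('/')}/api/{tenant_id}/rebuildmatchtable"
    query = urllib.parse.urlencode(normalized)
    return f"{base}?{query}" if query else base
```

For example, rebuild_match_table_url(app_url, tenant_id, distributed=True, taskPartsCount=6) reproduces the request shown above.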
Validate downstream analytics syncs (GBQ/Snowflake/Databricks)
Do not run syncToDataPipeline automatically after every clone or restore, and especially do not run it blindly after a data-only clone or snapshot restore.
If this is a fresh analytics / Data Pipeline setup, or you need a full backfill:
- Running syncToDataPipeline can be appropriate, but:
  - Expect full reindex, including merges.
  - Consider pausing or protecting downstream consumers first.
  - Coordinate with CE/PS so they’re aware of the volume.
If you just did a clone or snapshot restore for functional testing / UAT, and analytics are already broadly aligned:
- Do not automatically run syncToDataPipeline.
- Prefer more targeted actions:
  - If only the schema changed: run appropriate schema sync/configuration steps for your analytics adapter, not a full sync.
  - If only certain objects need refreshing, use more-scoped reindex/pipeline options (where available) instead of a full tenant sync.
If your only goal after clone/restore is to make UI and analytics roughly consistent, but there were no major model/match changes:
- Running syncToDataPipeline is risky because of the ReindexMergesTask behavior.
- Coordinate with support to either:
  - Exclude merge reindex in your procedure (if and when such control is available for your adapter), or
  - Protect downstream systems (stop consumers, increase capacity, clear queues afterward).
Practical recommendation
Given the current known behavior:
Default stance: After a clone or snapshot restore, do not run syncToDataPipeline by default.
Only run it when:
- There is a clearly defined need for a full backfill into Data Pipeline / GBQ / Delta Lake, and
- Stakeholders understand that all merges, interactions, etc., will be replayed, and
- You’ve put controls in place for downstream impact.
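If the post-clone procedure is automated, the three conditions above can be encoded as an explicit gate so that syncToDataPipeline is never triggered by default. This is a sketch of one way to do that; the class and field names are hypothetical and the code does not call any Reltio API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncPreconditions:
    full_backfill_needed: bool          # clearly defined need for a full backfill
    stakeholders_accept_replay: bool    # merges, interactions, etc. will be replayed
    downstream_controls_in_place: bool  # consumers paused or otherwise protected


def unmet_preconditions(conditions: SyncPreconditions) -> list[str]:
    """Return the names of conditions that are not yet satisfied (empty means go)."""
    return [name for name, satisfied in vars(conditions).items() if not satisfied]


def may_run_sync(conditions: SyncPreconditions) -> bool:
    """All three conditions must hold; the default stance is not to run."""
    return not unmet_preconditions(conditions)
```

Returning the unmet condition names, rather than only a boolean, lets a runbook or pipeline log state why the sync was skipped.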
DPH counts are different after a clone.
Operational smoke tests to verify the clone
Search a few golden entities to confirm searchability.
Open Potential Matches for known duplicates to confirm the index is populated.
Execute an AutoMatch verification on a known pair using /entities/_verifyMatches to ensure tokens/intersections are present and rules fire as expected (after Rebuild Match Tables).
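The match verification step could be scripted roughly as follows. This sketch rests on several assumptions: the helper names are hypothetical, the full URL is assumed to follow the {ApplicationURL}/api/{tenantId} pattern used elsewhere in this article, the call is assumed to be a POST with a JSON body and a Bearer token, and the body itself is supplied by the caller because its schema is not described here.

```python
import json
import urllib.request


def verify_matches_request(application_url: str, tenant_id: str, token: str,
                           payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a request to /entities/_verifyMatches.

    The payload identifying the known pair must follow the Reltio API documentation;
    this sketch passes it through unchanged.
    """
    url = f"{application_url.rstrip('/')}/api/{tenant_id}/entities/_verifyMatches"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and parse the response, assuming it is JSON."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request preparation from sending makes it possible to inspect the URL and body before anything is sent to the target tenant.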