Overview
Rebuild Match Table recalculates match tokens and match documents for selected entity types. It’s typically required after you introduce or change match rules or after cleaning up token issues. The task supports distributed execution and can be tuned with taskPartsCount; it supports a quiet mode to skip match events if you don’t want downstream traffic during maintenance.
When should you run it?
After enabling or modifying match rules (Organization, Location, Person, etc.).
After removing over-collisioned tokens (see below), because that operation requires a follow-up rebuild to repopulate MATCH tables.
Pre-flight checklist
Analyze your match rules
Run Analyze Match Strategy (Console) or use Match Rule Analyzer v2 to detect token explosion, heavy comparators, or high-collision tokens. Tuning before a rebuild often saves days of runtime.
Deal with over-collisioned tokens (if present)
Run RemoveOvercollisionedTokens first, then plan the rebuild. The task resets the token state and fixes the MATCH tables, but it explicitly requires a subsequent rebuild.
Decide on your event policy
If this is maintenance and you don’t need re-emitted match events, set maintenanceOptions=skipMatchEvents. This refreshes internal match structures without generating match events to external queues.
Decide when to start the task. Avoid competing workloads that starve the rebuild job or amplify environment impact.
Avoid peak data loads, bulk updates, and performance tests.
Avoid concurrent large tasks on other big tenants in the same environment.
Capacity & concurrency planning
Distributed mode: set distributed=true to split the job into sub-tasks that can be processed in parallel across available API nodes. Control the degree of parallelism via taskPartsCount. Effective throughput depends on the number of available nodes and on other tasks running in the environment.
Avoid environmental contention: starting many resource-intensive tasks simultaneously across multiple tenants can reduce throughput or cause tasks to bounce between SCHEDULED and PROCESSING when nodes are under pressure. Use the Tasks API to observe global load.
Practical starting point
Large tenants: run distributed=true with a conservative taskPartsCount (well below your tenant’s ceiling). Increase it only if throughput and task stability appear healthy.
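The "start conservative, increase only when healthy" guidance can be written down as a small decision helper. This is a minimal sketch, not part of the product: the function name, the step size, and the two boolean health signals are assumptions made for illustration, and the article does not say whether taskPartsCount can be changed on a running task, so treat the result as the value for the next run.

```python
def next_task_parts_count(current, ceiling, throughput_healthy, tasks_stable, step=2):
    """Suggest a taskPartsCount for the next run (hypothetical helper).

    current            -- taskPartsCount used in the last run
    ceiling            -- the tenant's upper bound, as you understand it
    throughput_healthy -- True if objects/sec stayed steady
    tasks_stable       -- True if sub-tasks did not bounce back to SCHEDULED
    step               -- illustrative increment, not a documented value
    """
    if current < 1 or ceiling < 1:
        raise ValueError("current and ceiling must be positive")
    # Increase only when both signals look healthy; otherwise hold.
    if throughput_healthy and tasks_stable:
        return min(current + step, ceiling)
    return current
```

For example, next_task_parts_count(4, 16, True, True) suggests 6, while any unhealthy signal keeps the value at 4.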
Runbook
Option A — Console (Tenant Management)
Go to Console → Tenant Management → Jobs → Rebuild Match Table. Select entity type(s), choose distributed mode and parts, set skipMatchEvents if desired, and start.
Option B — API (recommended for automation)
Endpoint
POST {ApplicationURL}/api/{tenant_id}/rebuildmatchtable
Common query parameters
entityType=configuration/entityTypes/{EntityType}
distributed=true|false
taskPartsCount={N} (only when distributed=true)
maintenanceOptions=skipMatchEvents (optional)
All parameters and behaviors are defined in the task reference.
Example (quiet maintenance on one entity type)
POST {ApplicationURL}/api/{tenant_id}/rebuildmatchtable\
?entityType=configuration/entityTypes/Organization\
&distributed=true\
&taskPartsCount=8\
&maintenanceOptions=skipMatchEvents
(Replace {tenant_id}, {ApplicationURL}, and Organization as needed.)
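For automation, the same request URL can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function name and its arguments are illustrative assumptions, and the host in the usage note is a placeholder. It only builds the URL from the query parameters listed above. Sending the POST and any authentication headers are outside what this article describes and are not shown.

```python
from urllib.parse import urlencode


def build_rebuild_url(application_url, tenant_id, entity_type,
                      distributed=True, task_parts_count=None,
                      skip_match_events=False):
    """Build the rebuildmatchtable URL (hypothetical helper)."""
    params = [
        ("entityType", f"configuration/entityTypes/{entity_type}"),
        ("distributed", "true" if distributed else "false"),
    ]
    # taskPartsCount is only meaningful when distributed=true.
    if distributed and task_parts_count is not None:
        params.append(("taskPartsCount", str(task_parts_count)))
    # Quiet maintenance: skip match events.
    if skip_match_events:
        params.append(("maintenanceOptions", "skipMatchEvents"))
    base = f"{application_url.rstrip('/')}/api/{tenant_id}/rebuildmatchtable"
    # safe="/" leaves the slashes in the entity type URI unescaped.
    return f"{base}?{urlencode(params, safe='/')}"
```

Calling build_rebuild_url("https://example.invalid", "myTenant", "Organization", True, 8, True) yields the same query string as the example above, with the placeholder host and tenant filled in.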
Filtering scope (advanced)
When you only need to rebuild a subset, use tokenization/rebuild tasks with query filters; understand that filtered entities can still match outside the filter.
Monitoring & control
Track status via Tasks API: SCHEDULED, PROCESSING, PAUSING, PAUSED, CANCELING, etc. Low objects/sec for long periods or frequent flips back to SCHEDULED suggest environment contention.
Get active tasks to see competing work:
GET {ApplicationURL}/api/{tenant_id}/tasks
(Returns tasks across tenants with active statuses.)
Get a specific rebuild match task:
GET {ApplicationURL}/api/{tenant_id}/tasks/<task Id>
Pause/Stop is supported on rebuild and related tasks; use it if you observe instability or need to de-risk peak hours.
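The "frequent flips back to SCHEDULED" signal can be checked mechanically once you have recorded a task's status at each poll. The sketch below assumes you already collect the status strings in order (how you poll the Tasks API and parse its response is not shown here). The function names and the flip threshold are illustrative assumptions, not documented limits.

```python
def count_flips_to_scheduled(statuses):
    """Count PROCESSING -> SCHEDULED transitions in an ordered status list."""
    flips = 0
    # Compare each observation with the one that follows it.
    for previous, current in zip(statuses, statuses[1:]):
        if previous == "PROCESSING" and current == "SCHEDULED":
            flips += 1
    return flips


def looks_contended(statuses, max_flips=2):
    """Flag likely environment contention (hypothetical threshold)."""
    return count_flips_to_scheduled(statuses) > max_flips
```

A task observed as SCHEDULED, PROCESSING, PROCESSING shows no flips; one that keeps alternating between PROCESSING and SCHEDULED is a candidate for pausing or rescheduling.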
Post-run steps
If analysis indicated token collision issues, confirm that MATCH tables look healthy after the rebuild. (Collision cleanup requires rebuild.)
Optionally run any post-rebuild validation provided in the task documentation (where applicable) to double-check completeness.
Common pitfalls & how to avoid them
Starting many large rebuilds across tenants at once → Stagger start times and keep parts conservative; scale up only if nodes are free and throughput is steady.
Inefficient rules make rebuilds run for days → Always run Analyze Match Strategy and fix token-heavy or fuzzy rules before kicking off.
Unexpected downstream churn → Use maintenanceOptions=skipMatchEvents for quiet maintenance windows.
FAQ
Q: Can I rebuild multiple entity types at once?
Yes, but for large volumes it’s safer (and easier to monitor) to run per entity type—e.g., Organization first, then Location—especially when using distributed mode.
Q: How do I choose taskPartsCount?
Start conservatively relative to your tenant’s capacity; monitor throughput and task stability, then adjust. Parallelism only helps if API nodes are available.
Q: What about Rebuild Match Table v2 and follow-up processing?
Review the v2 page for version-specific behavior and the ProcessRebuiltMatchTask if you plan to process v2 results and emit events accordingly.
Related documentation
Rebuild Match Table Task (parameters, distributed mode, skip events, pause/stop).
Distributed mode (how sub-tasks run in parallel across nodes).
Analyze Match Strategy (Console) & Match Rule Analyzer (API).
Inspections for Tokens & Over-collisioned Tokens (what to fix before rebuild).
Remove Overcollisioned Tokens (requires a subsequent rebuild).
Tasks API (statuses, monitoring, active tasks).
https://docs.reltio.com/en/developer-resources/load-and-export-apis/load-and-export-apis-at-a-glance/export-service-apis/export-tasks-management-api/status-of-an-export-task