Analyze Match Strategy completes with 0 processed objects and UI shows “There is not enough data to provide analysis results”

Summary

The Analyze match strategy job analyzes match behavior using a sample (“sample match pairs from your tenant”), not a full scan of all entities. If the job cannot construct a usable sample, it may complete successfully but still report:

  • Number of processed objects = 0

  • Throughput = 0

  • UI message: not enough data (sample data set of 0 entities)

Symptoms

  • Job runs for a long time and completes, but:

    • Processed object count remains 0

    • Throughput remains 0

  • Match Analysis page shows “There is not enough data to provide analysis results” (sample = 0)

 

What we observed in this tenant

From your tasks/history export:

Task executed (twice):

  • AnalyzeMatchRulesProfilingTask

  • Entity type: configuration/entityTypes/Location

  • maxObjectsPerType: 100000

  • samplingAlgorithm: MATCHES_AWARE

  • Result:

    • totalObjectsProcessed: 0

    • objectsProcessed["configuration/entityTypes/Location"]: 0

    • throughput: 0.0
       

Root cause

Your run used samplingAlgorithm: MATCHES_AWARE. Reltio documents that MATCHES_AWARE uses existing match information in the tenant to build a subset of entities. If that sampler can’t find eligible entities/pairs based on existing match artifacts, the sample can be 0, even if millions of entities exist.

 

Why did the UI job not let you fix it?

The Console job is documented as a “create an Analyze match strategy job” workflow, but the documented UI steps do not expose a control to select the sampling algorithm (e.g., SEQUENCE vs MATCHES_AWARE). The documentation only states that the job analyzes using sample match pairs.

So when the Console-created task runs with MATCHES_AWARE (as your history shows), you cannot switch the algorithm in the UI.

Recommendation

Option A: Run Match Rule Analyzer v2 (Dynamic) with SEQUENCE sampling

This avoids dependency on “existing match information” and samples by iterating through stored entities. The API explicitly supports SEQUENCE.

Submit profiling

POST {platformUrl}/tools/{tenantId}/analyzeMatchRules/v2 
  • Use your request with:

    • samplingAlgorithm: { "name": "SEQUENCE" }

    • set entityTypes to ["configuration/entityTypes/Location"]

Example of payload

{
  "entityTypes": [
    "Location"
  ],
  "profiling": {
    "enabled": true,
    "maxObjectsPerType": 20000,
    "timeout": 36000,
    "samplingAlgorithm": {
      "name": "SEQUENCE"
    },
    "analysisTypes": [
      {
        "analysisType": "matchToken",
        "enabled": true,
        "perMatchGroup": true,
        "splitByMatchGroupType": true
      },
      {
        "analysisType": "matchDocumentMatches",
        "enabled": true,
        "perMatchGroup": true,
        "splitByMatchGroupType": false
      }
    ],
    "scopes": [
      "ALL",
      "INTERNAL",
      "EXTERNAL",
      "NONE"
    ],
    "useSkippedRules": true
  }
}

Important documented constraint: Reltio documents that maxObjectsPerType cannot be set above 20000 when matchDocumentMatches is enabled. If you want 100000, disable matchDocumentMatches.
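
The submit step and the constraint above can be sketched in Python as follows. This is a minimal illustration, not Reltio's client: the function name build_profiling_request, the Bearer token header, and the example.invalid host are assumptions, and the 20000 limit is taken from this article. The function only builds the request; it does not send it.

```python
import json
import urllib.request

# Limit quoted in this article for runs that enable matchDocumentMatches.
MAX_OBJECTS_WITH_DOCUMENT_MATCHES = 20000


def build_profiling_request(platform_url, tenant_id, token, payload):
    """Return an unsent POST request for the analyzeMatchRules/v2 endpoint."""
    profiling = payload.get("profiling", {})
    enabled = {
        analysis["analysisType"]
        for analysis in profiling.get("analysisTypes", [])
        if analysis.get("enabled", False)
    }
    limit = profiling.get("maxObjectsPerType", 0)
    if "matchDocumentMatches" in enabled and limit > MAX_OBJECTS_WITH_DOCUMENT_MATCHES:
        raise ValueError(
            "maxObjectsPerType must not exceed 20000 when matchDocumentMatches is enabled"
        )
    url = f"{platform_url.rstrip('/')}/tools/{tenant_id}/analyzeMatchRules/v2"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumption: token-based auth
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Validating before sending avoids waiting on a job that the documented constraint would reject.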

Retrieve results

GET {platformUrl}/tools/{tenantId}/analyzeMatchRules/v2/profiling/{profilingId} 
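
Because profiling can run for a long time, retrieval is usually a polling loop. The sketch below assumes Bearer token authentication and leaves the "finished" test to the caller, since the exact status values are not confirmed here. The names profiling_url, fetch_profiling, and wait_for_result are hypothetical helpers.

```python
import json
import time
import urllib.request


def profiling_url(platform_url, tenant_id, profiling_id):
    """Build the GET URL for a profiling result."""
    base = platform_url.rstrip("/")
    return f"{base}/tools/{tenant_id}/analyzeMatchRules/v2/profiling/{profiling_id}"


def fetch_profiling(url, token):
    """Fetch and decode the profiling result (assumes a JSON response body)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}  # assumption: token auth
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_result(fetch, is_finished, attempts=60, delay_seconds=30):
    """Call fetch() until is_finished(result) is true or attempts run out."""
    for _ in range(attempts):
        result = fetch()
        if is_finished(result):
            return result
        time.sleep(delay_seconds)
    raise TimeoutError("profiling did not finish within the allowed attempts")
```

A caller would pass something like lambda: fetch_profiling(url, token) as fetch, plus a predicate on the status field that matches the values the tenant actually returns.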

Option B: Validate prerequisites for sampling

These checks determine why MATCHES_AWARE found 0:

  • Confirm that match rules exist and that the required permissions are present (the Analyze Match Strategy button can be disabled if either is missing).

  • Confirm the tenant has match artifacts/match info usable by MATCHES_AWARE (this is the “unknown” that would explain 0 samples).


How to read and interpret the output

GET {platformUrl}/tools/{tenantId}/analyzeMatchRules/v2/profiling/{profilingId}

Everything below is based on what Reltio documents for Match Rule Analyzer v2 (Dynamic) and related matching concepts.

Validate execution status and “did it actually process”

In the GET response, start with the top-level execution metadata that tells you whether the run was usable:

  • status: should be something like completed/failed/in-progress (exact enum depends on the implementation, but you’ll see a status field in the response examples/description).

  • duration / timing fields: sanity check long vs short run.

  • totalObjectsProcessed and objectsProcessed per entity type: this is the single most important “is the output meaningful” gate.

    • If totalObjectsProcessed = 0, you will get “not enough data” style results (same symptom you saw in the UI-based task).
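
The processed-object gate can be written as a small check. This sketch assumes the decoded response exposes totalObjectsProcessed and objectsProcessed at the top level, as in the task history shown earlier; the function name processed_summary is hypothetical.

```python
def processed_summary(result):
    """Summarize whether a profiling result processed any objects."""
    total = result.get("totalObjectsProcessed", 0)
    per_type = result.get("objectsProcessed", {})
    # Entity types that were requested but contributed nothing to the sample.
    empty_types = sorted(name for name, count in per_type.items() if count == 0)
    return {"usable": total > 0, "total": total, "empty_types": empty_types}
```

With the values from this tenant's history, usable is False and configuration/entityTypes/Location appears in empty_types.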

Common interpretation

  • maxObjectsPerType is a cap, not a guarantee. The run can process fewer (including 0) depending on how the sampler and eligibility behave.

  • If you used MATCHES_AWARE, it uses existing match information to build the subset, so it can collapse to 0 even when the entity type is large.

Did the analyzer find any rules and produce stats?

The response contains per-entity-type results (because you can submit multiple types). For each entity type section:

Confirm match rule scopes included

You’ll see which scopes were evaluated (or at least the request echo under profiling.scopes). Default scope set is ALL, NONE, INTERNAL, EXTERNAL.

Practical read-out:

  • If you have match rules scoped to INTERNAL only, but you accidentally ran scopes excluding INTERNAL, you’d get “no rule activity” even if entities exist.

  • In your case, the payload included all four scopes, which is the broadest scope setting.
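
A quick way to catch a scope mismatch is to compare the scopes your match rules use with the scopes in the request. The sketch below is an illustration only: missing_scopes is a hypothetical helper, and the list of rule scopes is assumed to come from your own review of the match rule configuration.

```python
def missing_scopes(requested_scopes, rule_scopes):
    """Return rule scopes that the profiling request did not evaluate."""
    return sorted(set(rule_scopes) - set(requested_scopes))
```

An empty list means every scope used by your rules was evaluated. A non-empty list names the scopes to add to profiling.scopes.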

Check whether skipped/bypassed rules were included

  • useSkippedRules: Reltio notes skipped match rules are included by default; this field lets you control that behavior.

Interpret each enabled analysis type

The “meat” of the output is inside the analysisTypes results. Here’s how to interpret the ones you’ve been focusing on:

matchToken

What it tells you: token phrase generation behavior (a direct driver of match performance). Reltio emphasizes that the number of match token phrases is important for matching performance.

What to look for in the output:

  • Whether results are split perMatchGroup and/or splitByMatchGroupType (depends on your request).

  • Summary stats (commonly min/max/avg) and histograms (distribution of generated tokens). The documentation explains the histogram concept and parameters like number of bins, start, and bin size.

How to interpret:

  • Too many tokens → performance risk / over-candidate generation.

  • Too few tokens → low recall risk (missed candidates).
    Reltio provides dedicated “inspections for tokens” to flag these patterns.
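
The too-many and too-few reading can be sketched as a simple threshold check. The input shape (average token count per match group) and both thresholds are placeholders chosen for illustration; they are not Reltio's inspection limits, and flag_token_counts is a hypothetical helper.

```python
def flag_token_counts(avg_tokens_by_group, low=1.0, high=100.0):
    """Flag match groups whose average token count is outside [low, high]."""
    flags = {}
    for group, average in avg_tokens_by_group.items():
        if average > high:
            flags[group] = "too many tokens"
        elif average < low:
            flags[group] = "too few tokens"
    return flags
```

Tune low and high to the tenant's data before relying on the flags, and prefer Reltio's own token inspections where they are available.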

matchDocumentMatches

What it tells you: correlation/overlap behavior between match groups based on match document outcomes (useful to detect redundant rules and rule correlation). Reltio provides inspections that specifically use matchDocumentMatches outputs to identify redundant match groups.

Important operational constraint:
Reltio documents that matchDocumentMatches is expensive, and the v2 Dynamic documentation limits maxObjectsPerType to 20000 when this analysis is enabled.

How to interpret:

  • Look for high correlation between rules: two rules producing near-identical outcomes may be candidates for consolidation/removal.

matchDocumentsPerMatchGroup

What it tells you: distribution/frequency of match documents by match group.

How to interpret:

  • If a match group does not appear in any match document, Reltio’s “useless match group” inspection suggests it may not be used.
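
The same check can be expressed as a short sketch. It assumes you have a list of configured match groups and a mapping from match group to match document count extracted from the matchDocumentsPerMatchGroup output; both shapes, and the name unused_match_groups, are hypothetical.

```python
def unused_match_groups(configured_groups, documents_per_group):
    """Return configured match groups that appear in no match document."""
    return [
        group
        for group in configured_groups
        if documents_per_group.get(group, 0) == 0
    ]
```

A group listed here is only a candidate for review. A small or unrepresentative sample can also produce a zero count.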

Interpret “inspections” 

If your request enabled inspections, treat these as actionable findings layered on top of the raw statistics:

  • Useless match group: match group does not appear in match documents (based on matchDocumentsPerMatchGroup).

  • Redundant match group: high or full correlation between rules (based on matchDocumentMatches).

  • Token inspections: high number of tokens, too many tokens by a subset, too few tokens, etc.

If the GET output still shows “0 processed,” how to use the response to debug

If totalObjectsProcessed = 0, use the echoed config in the response to triage:

  1. Sampling algorithm

    • MATCHES_AWARE depends on existing match information; try SEQUENCE (iterates stored entities) or SEARCH. Supported algorithms are listed in the doc.

  2. Scopes

    • Ensure the match rules you care about are within the evaluated scopes (ALL, NONE, INTERNAL, EXTERNAL).

  3. Analysis types

    • If you enabled heavy analyses (like matchDocumentMatches), consider starting with just matchToken to prove sampling is working, then add other analyses. (Reltio notes matchDocumentMatches is expensive; the v2 page highlights performance considerations/limits.)

  4. Go deeper on match artifacts if needed

    • If you suspect the problem is “no match artifacts exist,” use Reltio’s match document/token APIs to inspect match tokens/documents for a specific entity (the documentation explains match tokens/documents and how to retrieve match documents for an entity).
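
The first three triage steps above can be sketched as one function over the echoed configuration. This assumes the echoed profiling object has the same shape as the request payload shown earlier in this article; triage_zero_processed is a hypothetical helper and the suggestion texts are illustrative.

```python
DEFAULT_SCOPES = {"ALL", "NONE", "INTERNAL", "EXTERNAL"}


def triage_zero_processed(profiling):
    """Return ordered suggestions for a run that processed 0 objects."""
    suggestions = []
    algorithm = profiling.get("samplingAlgorithm", {}).get("name")
    if algorithm == "MATCHES_AWARE":
        suggestions.append("Retry with samplingAlgorithm SEQUENCE or SEARCH.")
    # When scopes are not echoed, assume the documented default of all four.
    scopes = set(profiling.get("scopes", DEFAULT_SCOPES))
    missing = sorted(DEFAULT_SCOPES - scopes)
    if missing:
        suggestions.append("Scopes not evaluated: " + ", ".join(missing))
    heavy = [
        analysis["analysisType"]
        for analysis in profiling.get("analysisTypes", [])
        if analysis.get("enabled", False) and analysis["analysisType"] != "matchToken"
    ]
    if heavy:
        suggestions.append(
            "Retry with only matchToken enabled; also running: " + ", ".join(heavy)
        )
    return suggestions
```

An empty list means none of the three configuration checks explains the empty sample, which points toward step 4 and a closer look at match artifacts.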

A simple “review checklist” you can paste into an investigation ticket

  • status indicates completion, not failure.

  • profiling.samplingAlgorithm.name is what we intended (SEQUENCE if avoiding MATCHES_AWARE dependency).

  • totalObjectsProcessed > 0 and objectsProcessed[entityType] > 0.

  • scopes include the match rule scopes we need (INTERNAL, EXTERNAL, ALL, NONE).

  • matchToken histogram/stats look reasonable (no explosion, no near-zero).

  • matchDocumentMatches correlation flags redundant rules (if enabled).

  • Any inspections findings are addressed (useless/redundant groups, token issues).

 
