How can the customer confirm the Snowflake data warehouse contents with data passed from the Snowflake connector - Direct Connect?

Question

How can the customer confirm the contents of the Snowflake data warehouse using data passed from the Snowflake connector (Direct Connect)?

 

Answer

How Reltio events flow with the Snowflake (Direct Connect) pipeline:

  1. Reltio emits change events (entities, relationships, interactions, matches, merges, workflows, activities, links). 

  2. Direct Connect streams data straight into your Snowflake internal stage (no external cloud storage). It uses a managed connector and Reltio auto-provisions Snowflake objects (landing tables, streams, views, tasks). 

  3. Data lands in “landing tables.” These hold the raw arrivals for each object type. 

  4. Continuous merges/upserts move data from landing → final object tables in your Snowflake schema (entities/relationships/etc), keeping them synchronized with Reltio. 

Here is a checklist to make sure syncToDataPipeline is working correctly for Snowflake (Direct Connect).

Confirm the pipeline is set up and enabled.

  • Validating the Snowflake connector confirms connectivity and permissions between Reltio and Snowflake. After you configure the connector in the Reltio Console for Snowflake (Direct Connect) and the required objects are provisioned in Snowflake, the system validates the connection settings (account, warehouse, database, schema, organization, and role) and displays a success or failure message. Once validation succeeds, you can proceed with data synchronization.
  • You can validate the adapter configuration via API by using a POST request:

    POST /api/tenants/<tenantId>/adapters/<adapterName>/validate

    This returns a 200 OK if the validation is successful.
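
As a rough illustration, the validation call above could be prepared from Python as follows. This is a sketch under assumptions: the base URL is a placeholder you must supply (the path above does not include a host), the bearer-token header mirrors the curl examples in this article and is assumed to apply here as well, and the helper names are hypothetical.

```python
import urllib.request


def build_validate_request(base_url, tenant_id, adapter_name, token):
    """Prepare (but do not send) the adapter validation request."""
    # base_url is an assumption: the host that serves the adapter API.
    url = (
        f"{base_url.rstrip('/')}/api/tenants/{tenant_id}"
        f"/adapters/{adapter_name}/validate"
    )
    return urllib.request.Request(
        url,
        method="POST",
        # Bearer auth is assumed, mirroring the curl examples in this article.
        headers={"Authorization": f"Bearer {token}"},
    )


def validation_succeeded(status_code):
    """Per the text above, a 200 OK means validation succeeded."""
    return status_code == 200
```

To send the request, pass it to urllib.request.urlopen and feed the response status into validation_succeeded. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so a failed validation may surface as an exception rather than a return value.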

Check tenant health & queues.

  • Make sure your tenant is healthy and not backlogged before you sync. Use the Status API to check the event queue sizes:

GET https://<env>.reltio.com/reltio/status/tenant/<tenant_id>?details=all

Run the syncToDataPipeline task.

This endpoint “synchronizes data … from the platform to the enabled connector.” 

POST https://<ApplicationURL>/reltio/api/<tenant_id>/syncToDataPipeline?distributed=true&options=parallelExecution

Monitor the task until it finishes.

  • Track the task via the Tasks API (get active tasks / get task by ID). 

# List active tasks for this tenant
curl -H "Authorization: Bearer <token>" \
  "https://<ApplicationURL>/<tenant_id>/tasks"

# Get a specific task
curl -H "Authorization: Bearer <token>" \
  "https://<ApplicationURL>/<tenant_id>/tasks/<taskId>"

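The monitoring step can be automated with a small polling loop. The sketch below is deliberately generic: the shape of the Tasks API response is not shown in this article, so the caller supplies both the fetch function and the "is it finished" check. The function name, the default interval, and the default timeout are hypothetical choices.

```python
import time


def poll_until(fetch, is_done, interval_s=10.0, timeout_s=3600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until is_done(result) is true or time runs out.

    fetch   -- callable returning the latest task status (caller-defined)
    is_done -- callable deciding whether that status means "finished"
    """
    deadline = clock() + timeout_s
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_s)
```

In practice, fetch would wrap the "get a specific task" request above, and is_done would inspect whichever field your tenant's response uses to report completion.
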
Validate events reached Snowflake

  • Call the Tenant Queue Status API on the Data Pipeline Hub. When the event count is 0, the sync has drained. (Counts can fluctuate if your tenant is busy.)

GET https://<env>-data-pipeline-hub.reltio.com/status/tenant/<tenant_id>/details
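
Because counts can fluctuate on a busy tenant, a single zero reading may be momentary. One possible safeguard, sketched below, is to require several consecutive zero readings before treating the queue as drained. The sketch assumes you extract the event count from each response yourself (the response format is not shown in this article); the function name and the threshold of 3 are arbitrary, hypothetical choices.

```python
def queue_has_drained(event_counts, required_zero_readings=3):
    """Return True when the most recent readings are all zero.

    event_counts -- event counts in the order they were observed
    """
    if required_zero_readings < 1:
        raise ValueError("required_zero_readings must be at least 1")
    recent = event_counts[-required_zero_readings:]
    # Not enough readings yet counts as "not drained".
    return len(recent) == required_zero_readings and all(c == 0 for c in recent)
```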

Cross-check counts and timing

  • For an authoritative count of events moved in a time window (up to 1 day, within the last 30 days), use the Event Monitoring API.

  • Validate in Snowflake: Direct Connect loads the landing tables, then merges into the final object tables. A completion row is written to the landing table at the end of synchronization; use it, along with row counts, to reconcile. Also review eventState (e.g., DATAPIPELINE_PROCESSED) where available.
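
The reconciliation itself can be as simple as comparing two sets of counts per object type. The sketch below assumes you have already collected the expected counts (for example, from the Event Monitoring API) and the actual counts (from COUNT(*) queries in Snowflake) into plain dictionaries. The dictionary layout and the function name are hypothetical and do not reflect any API response format.

```python
def reconcile_counts(expected, actual):
    """Return {object_type: (expected, actual, difference)} for mismatches only.

    A missing object type on either side is treated as a count of 0.
    """
    mismatches = {}
    for object_type in sorted(set(expected) | set(actual)):
        exp = expected.get(object_type, 0)
        act = actual.get(object_type, 0)
        if exp != act:
            mismatches[object_type] = (exp, act, act - exp)
    return mismatches
```

An empty result means every object type matched. A negative difference means Snowflake holds fewer rows than expected for that object type.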

Verification in Snowflake

Set these once at the top of your worksheet and reuse:

-- replace with what you configured in Reltio Console
USE DATABASE <YOUR_DB>;
USE SCHEMA <YOUR_SCHEMA>;

-- All tables in the schema (quick scan)
SHOW TABLES IN SCHEMA <YOUR_DB>.<YOUR_SCHEMA>;

-- Streams in the schema
SHOW STREAMS IN SCHEMA <YOUR_DB>.<YOUR_SCHEMA>;

-- Tasks in the schema and their state
SHOW TASKS IN SCHEMA <YOUR_DB>.<YOUR_SCHEMA>;

Confirm that the landing tables are receiving events and the final tables are filling. The docs provide dataset definitions (column names such as uri, timestamps, etc.) to help you filter and reconcile.

Record the window you care about (e.g., from when you triggered syncToDataPipeline until “complete”).

Inspect the most recent rows in the landing table.

WITH params AS (
  SELECT TO_TIMESTAMP_NTZ('<SYNC_START_UTC>') AS start_ts
)
-- Latest 50 landing rows that arrived since the sync started
SELECT lt.*
FROM   <DB>.<SCHEMA>.<LANDING_TABLE> lt
CROSS JOIN params p
WHERE  lt.<TIMESTAMP_COLUMN> >= p.start_ts
ORDER BY lt.<TIMESTAMP_COLUMN> DESC
LIMIT 50;

 

Count rows in your entities' landing tables by time window.

Use the timestamp column your pipeline exposes for landing rows (see your “Landing table datasets” doc for the exact field name in your environment), and optionally keep only successfully processed events via eventState. Replace all <…> placeholders.

WITH params AS (
  SELECT
    TO_TIMESTAMP_NTZ('2025-11-15 00:00:00') AS start_ts,
    TO_TIMESTAMP_NTZ('2025-11-16 00:00:00') AS end_ts
)
SELECT 'ENTITIES' AS object_type, COUNT(*) AS event_count
FROM   <DB>.<SCHEMA>.<ENTITIES_LANDING_TABLE>, params
WHERE  <EVENT_TIME_COLUMN> BETWEEN start_ts AND end_ts
  -- Optional, where your landing table exposes eventState:
  -- AND <EVENT_STATE_COLUMN> = 'DATAPIPELINE_PROCESSED'
;

  • The exact column names are listed in the Landing table datasets for Snowflake. You can use those names as they appear in your environment. 

  • Reltio guidance also references filtering on eventState values such as DATAPIPELINE_PROCESSED when reconciling event counts. 

If your landing table stores raw JSON (e.g., a "dataTable" with a VARIANT column):

Extract the event’s timestamp from the JSON and count by window. Use the key name present in your event payload (see Reltio Events/examples pages to confirm the timestamp field used in your tenant’s payload). 

-- Example where the JSON has a top-level "eventTime" field.
SELECT COUNT(*) AS event_count
FROM   <DB>.<SCHEMA>."dataTable"
WHERE  TO_TIMESTAMP_NTZ(data:"eventTime"::string)
       BETWEEN TO_TIMESTAMP_NTZ('2025-11-15 00:00:00')
           AND TO_TIMESTAMP_NTZ('2025-11-16 00:00:00');

If your payload uses a different key (e.g., "timestamp"), swap data:"eventTime" accordingly.

 

Reltio provides an Event Monitoring API that returns “how many events were transferred” for a specified interval (any 1-day window within the last 30 days). 

GET https://<env>-data-pipeline-hub.reltio.com/api/tenants/<tenant_id>/monitoring/_eventMonitoring
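
Before calling the Event Monitoring API, it can help to check that the window you plan to request fits the limits stated above (at most 1 day, within the last 30 days). The sketch below only validates the window; how the window is passed to the API is not shown in this article, so no request is built. The function name and the wording of the messages are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def check_monitoring_window(start, end, now=None):
    """Return a list of problems with the window; an empty list means it fits.

    start, end, now -- timezone-aware datetimes (now defaults to current UTC)
    """
    if now is None:
        now = datetime.now(timezone.utc)
    problems = []
    if end <= start:
        problems.append("end must be after start")
    if end - start > timedelta(days=1):
        problems.append("window is longer than 1 day")
    if start < now - timedelta(days=30):
        problems.append("window starts more than 30 days ago")
    if end > now:
        problems.append("window ends in the future")
    return problems
```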

If something is off

  • URIs missing in final tables: verify that the streams and tasks that move data from the landing tables to the staging/final tables are running and healthy.

  • Escalate to Support if needed.
