Question
How can the customer confirm the contents of the Snowflake data warehouse using data passed from the Snowflake connector (Direct Connect)?
Answer
How Reltio events flow with the Snowflake (Direct Connect) pipeline:
Reltio emits change events (entities, relationships, interactions, matches, merges, workflows, activities, links).
Direct Connect streams data straight into your Snowflake internal stage (no external cloud storage). It uses a managed connector and Reltio auto-provisions Snowflake objects (landing tables, streams, views, tasks).
Data lands in “landing tables.” These hold the raw arrivals for each object type.
Continuous merges/upserts move data from landing → final object tables in your Snowflake schema (entities/relationships/etc), keeping them synchronized with Reltio.
Here is a checklist to make sure syncToDataPipeline is working correctly for Snowflake (Direct Connect).
Confirm the pipeline is set up and enabled.
- Validation of the Snowflake connector confirms successful connectivity and permissions between Reltio and Snowflake. Once you have configured the connector in the Console and provisioned all required objects inside Snowflake, a success or failure message is displayed indicating the validation outcome. This process ensures that the pipeline connection details, including account, warehouse, database, schema, organization, and role, are correctly configured. After validation is successful, you can proceed with data synchronization.
- After completing configuration in the Reltio Console for Snowflake (Direct Connect), the system validates the connection settings and displays the result.
- You can validate the adapter configuration via API by using a POST request:
POST /api/tenants/<tenantId>/adapters/<adapterName>/validate
This returns a 200 OK if the validation is successful.
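The validation call can also be scripted. The following is a minimal Python sketch, not an official client: the `base_url` host is an assumption (the article only shows the path), the function names are hypothetical, and the bearer-token header mirrors the curl examples used later in this article.

```python
import urllib.request
from urllib.parse import quote


def build_validate_url(base_url: str, tenant_id: str, adapter_name: str) -> str:
    """Return the adapter validation URL for the given tenant and adapter."""
    return (
        f"{base_url.rstrip('/')}/api/tenants/{quote(tenant_id, safe='')}"
        f"/adapters/{quote(adapter_name, safe='')}/validate"
    )


def validate_adapter(base_url, tenant_id, adapter_name, token, timeout=30) -> bool:
    """POST to the validate endpoint; True when the response status is 200."""
    request = urllib.request.Request(
        build_validate_url(base_url, tenant_id, adapter_name),
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    # so a failed validation surfaces as an exception, not as False.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

Keeping the URL builder separate from the network call makes it easy to check the path before sending anything.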
Check tenant health & queues.
Please make sure your tenant is healthy and not backlogged before you sync: you can use the Status API to check the event queue sizes.
GET https://<env>.reltio.com/reltio/status/tenant/<tenant_id>?details=all
Run the syncToDataPipeline task.
This endpoint “synchronizes data … from the platform to the enabled connector.”
POST https://<ApplicationURL>/reltio/api/<tenant_id>/syncToDataPipeline?distributed=true&options=parallelExecution
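If you prefer to trigger the task from a script, the request can be sketched in Python as follows. The function names are hypothetical, the bearer-token header is taken from the curl examples in this article, and the response body is returned unparsed because its shape is not described here.

```python
import urllib.request
from urllib.parse import urlencode


def build_sync_url(application_url: str, tenant_id: str) -> str:
    """Return the syncToDataPipeline URL with the query options shown above."""
    query = urlencode({"distributed": "true", "options": "parallelExecution"})
    return (
        f"{application_url.rstrip('/')}/reltio/api/{tenant_id}"
        f"/syncToDataPipeline?{query}"
    )


def start_sync(application_url, tenant_id, token, timeout=30) -> bytes:
    """POST to syncToDataPipeline and return the raw response body."""
    request = urllib.request.Request(
        build_sync_url(application_url, tenant_id),
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
    # HTTP errors (4xx/5xx) are raised as urllib.error.HTTPError.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Note the time at which you call `start_sync`; it is the start of the window you reconcile in Snowflake later.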
Monitor the task until it finishes.
Track the task via the Tasks API (get active tasks / get task by ID).
# List active tasks for this tenant
curl -H "Authorization: Bearer <token>" \
"https://<ApplicationURL>/<tenant_id>/tasks"
# Get a specific task
curl -H "Authorization: Bearer <token>" \
"https://<ApplicationURL>/<tenant_id>/tasks/<taskId>"
Validate events reached Snowflake
Run Tenant Queue Status API on the Data Pipeline Hub; when the event count is 0, sync has drained. (Counts can fluctuate if your tenant is busy.)
GET https://<env>-data-pipeline-hub.reltio.com/status/tenant/<tenant_id>/details
Cross-check counts and timing
For an authoritative “how many events moved” in a time window (≤1 day within last 30 days), use Event Monitoring API.
Validate in Snowflake: Direct Connect loads landing tables, then merges into final object tables. A completion row is written to the landing table at the end of synchronization; use it, along with row counts, to reconcile. Also review eventState (e.g., DATAPIPELINE_PROCESSED) where available.
If something looks off (quick triage)
Queues not zero after sync: recheck setup and monitor again; ongoing activity can keep counts > 0. Reference https://docs.reltio.com/en/developer-resources/data-integration-apis/data-integration-apis-at-a-glance/reltio-data-pipeline-for-snowflake-2.0-apis/tenant-queue-status-api.
Per-entity troubleshooting: Use the Entity Monitoring API to see event types/states/timestamps for a specific object.
Verification in Snowflake
Set these once at the top of your worksheet and reuse:
-- replace with what you configured in Reltio Console
USE DATABASE <YOUR_DB>;
USE SCHEMA <YOUR_SCHEMA>;
-- All tables in the schema (quick scan)
SHOW TABLES IN SCHEMA <YOUR_DB>.<YOUR_SCHEMA>;
-- All streams in the schema (quick list)
SHOW STREAMS IN SCHEMA <YOUR_DB>.<YOUR_SCHEMA>;
-- All tasks in the schema (quick list and state)
SHOW TASKS IN SCHEMA <YOUR_DB>.<YOUR_SCHEMA>;
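When a schema holds many tasks, the SHOW TASKS output can be filtered in a script. This is a small sketch that assumes each row is available as a dictionary keyed by the SHOW TASKS column names (`name`, `state`), for example from a dictionary-style cursor; the function name is hypothetical.

```python
def tasks_not_started(show_tasks_rows):
    """Return the names of tasks whose state is anything other than 'started'."""
    return [
        row["name"]
        for row in show_tasks_rows
        # Snowflake reports task state as 'started' or 'suspended'.
        if str(row["state"]).lower() != "started"
    ]


if __name__ == "__main__":
    sample = [
        {"name": "MERGE_ENTITIES", "state": "started"},     # hypothetical task name
        {"name": "MERGE_RELATIONS", "state": "suspended"},  # hypothetical task name
    ]
    print(tasks_not_started(sample))
```

Any name in the returned list is a candidate explanation for landing rows that never reach the final tables.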
Confirm landing is receiving events and final tables are filling: the docs provide dataset definitions (column names like uri, timestamps, etc.) to help you filter and reconcile.
Record the window you care about (e.g., from when you triggered syncToDataPipeline until “complete”).
Inspect the rows that arrived in the landing table since the sync started (the query returns the 50 most recent).
WITH params AS (
SELECT
TO_TIMESTAMP_NTZ('<SYNC_START_UTC>') AS start_ts
)
SELECT *
FROM <DB>.<SCHEMA>.<LANDING_TABLE> lt, params
WHERE lt.<TIMESTAMP_COLUMN> >= start_ts
ORDER BY lt.<TIMESTAMP_COLUMN> DESC
LIMIT 50;
You can use the Landing table datasets doc to confirm your timestamp column name. The completion row, “with details of the sync,” has its timestamp in the Timestamp field. Reference https://docs.reltio.com/en/applications/data-integrations/data-pipelines-at-a-glance/reltio-data-pipeline-for-snowflake-at-a-glance/snowflake-pipeline-datasets/datasets-for-the-snowflake-data-schema/landing-table-datasets-for-snowflake
Count rows in your entities' landing tables by time window.
Use the timestamp column your pipeline exposes for landing rows (see your “Landing table datasets” doc for the exact field name in your environment), and optionally keep only successfully processed events via eventState. Replace all <…> placeholders.
WITH params AS (
SELECT
TO_TIMESTAMP_NTZ('2025-11-15 00:00:00') AS start_ts,
TO_TIMESTAMP_NTZ('2025-11-16 00:00:00') AS end_ts
)
SELECT 'ENTITIES' AS object_type, COUNT(*) AS event_count
FROM <DB>.<SCHEMA>.<ENTITIES_LANDING_TABLE>, params
WHERE <EVENT_TIME_COLUMN> BETWEEN start_ts AND end_ts;
The exact column names are listed in the Landing table datasets for Snowflake. You can use those names as they appear in your environment.
Reltio guidance also references filtering on eventState values such as DATAPIPELINE_PROCESSED when reconciling event counts.
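To run the same window count across several landing tables, the query text can be generated. The sketch below is illustrative only: the table labels, table names, and column names are whatever your environment uses, the helper names are hypothetical, and the `%(name)s` placeholders assume a client that uses pyformat-style binding (the default in the Snowflake Connector for Python).

```python
import re

# Plain unquoted Snowflake identifiers only; quoted names are out of scope here.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _ident(name: str) -> str:
    """Reject anything that is not a plain unquoted identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def window_count_sql(database, schema, tables, time_column, state_column=None):
    """Build one UNION ALL query that counts landing rows per object type.

    `tables` maps a label (e.g. 'ENTITIES') to a landing table name.
    Window bounds and the optional state value stay as bind placeholders.
    """
    prefix = f"{_ident(database)}.{_ident(schema)}"
    where = f"{_ident(time_column)} BETWEEN %(start_ts)s AND %(end_ts)s"
    if state_column is not None:
        where += f" AND {_ident(state_column)} = %(event_state)s"
    selects = [
        f"SELECT '{_ident(label)}' AS object_type, COUNT(*) AS event_count "
        f"FROM {prefix}.{_ident(table)} WHERE {where}"
        for label, table in tables.items()
    ]
    return "\nUNION ALL\n".join(selects)
```

The generated text would then be passed to the client's execute call together with a parameter dictionary holding `start_ts`, `end_ts`, and, when a state column is given, `event_state` (for example `DATAPIPELINE_PROCESSED`).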
If your landing table stores raw JSON (e.g., a "dataTable" with a VARIANT column).
Extract the event’s timestamp from the JSON and count by window. Use the key name present in your event payload (see Reltio Events/examples pages to confirm the timestamp field used in your tenant’s payload).
-- Example where the JSON has a top-level "eventTime" field.
SELECT COUNT(*) AS event_count
FROM <DB>.<SCHEMA>."dataTable"
WHERE TO_TIMESTAMP_NTZ(data:"eventTime"::string)
BETWEEN TO_TIMESTAMP_NTZ('2025-11-15 00:00:00')
AND TO_TIMESTAMP_NTZ('2025-11-16 00:00:00');
If your payload uses a different key (e.g., "timestamp"), swap data:"eventTime" accordingly.
Reltio provides an Event Monitoring API that returns “how many events were transferred” for a specified interval (any 1-day window within the last 30 days).
GET https://<env>-data-pipeline-hub.reltio.com/api/tenants/<tenant_id>/monitoring/_eventMonitoring
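Because the API only accepts a window of at most one day within the last 30 days, it can help to check the interval before sending the request. The following sketch covers only that check; the function name is hypothetical, and the request's parameter names are deliberately left out because they are not shown in this article.

```python
from datetime import datetime, timedelta, timezone


def check_monitoring_window(start, end, now=None):
    """Raise ValueError unless the window is at most 1 day long and
    lies within the last 30 days. All datetimes must be timezone-aware."""
    now = now or datetime.now(timezone.utc)
    if end <= start:
        raise ValueError("end must be after start")
    if end - start > timedelta(days=1):
        raise ValueError("window is longer than 1 day")
    if start < now - timedelta(days=30):
        raise ValueError("window starts more than 30 days ago")
    if end > now:
        raise ValueError("window ends in the future")


if __name__ == "__main__":
    utc = timezone.utc
    # Same window as the SQL examples, checked against a fixed "now".
    check_monitoring_window(
        datetime(2025, 11, 15, tzinfo=utc),
        datetime(2025, 11, 16, tzinfo=utc),
        now=datetime(2025, 11, 20, tzinfo=utc),
    )
    print("window accepted")
```

Using the same start and end values here and in the Snowflake window queries keeps the two counts comparable.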
If something is off
URIs missing in final tables: verify your streams and processing tasks that move data from landing to staging/final are running/healthy.
Escalate to Support if needed.