Snowflake Connector: attributeFormat Options Explained

Overview

The Snowflake Connector / Snowflake Adapter supports three different attributeFormat options that control how Reltio attributes are represented in Snowflake.
Each option is designed for different analytics, reporting, and data lineage needs.

Choosing the right format is a trade-off between metadata richness, schema simplicity, and query complexity.

The supported options are:

  • STANDARD

  • FLATTEN

  • FLATTEN_SINGLE_VALUE

Example (standard Snowflake adapter with external staging bucket):

{
  "dataPipelineConfig": {
    "enabled": true,
    "activityLogEnabled": true,
    "dataFilteringEnabled": true,
    "messagingName": "<envName>-datapipeline-events_<tenantId>",
    "adapters": [
      {
        "type": "snowflake",
        "name": "snowflake",
        "enabled": true,
        "cloudProvider": "AWS",
        "awsConfig": {
          "awsCredentials": {
            "authMethod": "ROLE",
            "awsRole": "arn:aws:iam::<account>:role/<role>",
            "useRoleChaining": true
          }
        },
        "stagingBucket": "<bucket-name>",
        "ovOnly": false,
        "useLegacyUpdatedTime": false,
        "attributeFormat": "STANDARD",           // <── set it here
        "dataFilteringEnabled": true
      }
    ]
  }
}

You update this via the tenant config PUT on the Reltio Platform:

  • GET current physical tenant config

  • Modify the dataPipelineConfig.adapters[?type=="snowflake"] block

  • PUT it back

For Snowflake “internal stage” / direct-connect style

Same concept, but with snowflakeConfig instead of cloudProvider + stagingBucket:

{
 "dataPipelineConfig": {
   "enabled": true,
   "activityLogEnabled": true,
   "messagingName": "<envName>-datapipeline-events_<tenantId>",
   "adapters": [
     {
       "name": "snowflake_intrl_stg",
       "type": "snowflake",
       "enabled": true,
       "serializeInitialSourcesInCrosswalks": true,
       "snowflakeConfig": {
         "account": "<account_name>",
         "warehouse": "<warehouse>",
         "database": "<database>",
         "schema": "<schema>",
         "organization": "<organization>",
         "stage": "<internal_stage>",
         "role": "<role>"
       },
       "useLegacyUpdatedTime": false,
       "ovOnly": false,
       "attributeFormat": "FLATTEN_SINGLE_VALUE",  // <── set desired format here
       "batchSize": 1000,
       "dataFilteringEnabled": true
     }
   ]
 }
}

Switching an adapter’s attributeFormat from FLATTEN to FLATTEN_SINGLE_VALUE changes the JSON shape of the ATTRIBUTES column (arrays → scalars). Treat this as a breaking change for any downstream SQL that reads adapter output. Reltio supports STANDARD, FLATTEN, and FLATTEN_SINGLE_VALUE formats in adapter configuration.

FLATTEN_SINGLE_VALUE example: {"FirstName": "Jon"}
FLATTEN example: {"FirstName": ["Jon","Doe"]}.

What changes: After you update the adapter config to "attributeFormat": "FLATTEN_SINGLE_VALUE", newly written data from the pipeline will use the new scalar-per-attribute structure.

What doesn’t: Changing attributeFormat does not retro-convert rows already written to Snowflake. To repopulate historical data in the new shape, use a full syncToDataPipeline.

Procedure

  1. Change config
    Update the Snowflake adapter config.

  2. Run a synchronization
    Trigger a sync so the pipeline (micro-batches/new files) is produced in the new format going forward.

POST https://<env>.reltio.com/reltio/api/<tenant_id>/syncToDataPipeline

Note: Existing Snowflake rows remain as previously written (FLATTEN) unless you rebuild/backfill.

What to do with existing tables?

Option A — Clean rebuild (recommended)

  • Drop/truncate the adapter-created Snowflake tables.

  • Run a full syncToDataPipeline (backfill) So all rows are rewritten in FLATTEN_SINGLE_VALUE.
    This avoids mixed formats in the same table.

Option B — Coexistence/migration logic (only if you must keep data)

  • Keep the table, but expect mixed shapes:

    • Older rows → FLATTEN (arrays)

    • Newer rows → FLATTEN_SINGLE_VALUE (scalars)

  • All downstream SQL must detect and handle both array vs. scalar forms—this is brittle and error-prone; prefer a rebuild


Details on format

STANDARD

Description

STANDARD is the most detailed and verbose format.
Each attribute is exported as a full attribute object, including value and all associated metadata.

Structure

Attributes are represented as arrays of objects, even for single-valued attributes.

Example

{
  "FirstName": [
    {
      "id": "attr_1",
      "value": "Jon",
      "ov": true,
      "uri": "attruri/123",
      "pin": false,
      "ignore": false
    }
  ]
}

What You Get

  • Attribute value(s)

  • OV indicator (ov / isOv)

  • Internal attribute IDs

  • Attribute URIs

  • Pin and ignore flags

  • Potentially sources, timestamps, and lineage metadata (depending on configuration)

When to Use

Use STANDARD when:

  • You need full data lineage and metadata

  • You want to reapply survivorship logic downstream

  • You need to debug the OV selection

  • You are performing advanced analytics on attribute history or sources

  • You are comfortable with:

    • Larger JSON payloads

    • More complex LATERAL FLATTEN SQL in Snowflake

Trade-offs

  • Largest payload size

  • Most complex SQL queries

  • Highest flexibility and transparency

 

FLATTEN

Description

FLATTEN simplifies the structure by keeping only attribute values, while still preserving multi-valued attributes as arrays.

All metadata fields are removed.

Structure

Attributes are represented as arrays of values.

Example

{
  "FirstName": ["Jon", "Johnny"],
  "LastName": ["Doe"]
}

What You Get

  • Attribute values only

  • Arrays for multi-valued attributes

What You Don’t Get

  • OV flags

  • Attribute IDs or URIs

  • Pin or ignore indicators

  • Source or lineage metadata

When to Use

Use FLATTEN when:

  • You only need raw attribute values for analytics or reporting

  • You do not need OV logic or lineage

  • You still want to support multi-valued attributes

  • You are comfortable handling arrays in Snowflake SQL

Trade-offs

  • Smaller payload than STANDARD

  • Less context and traceability

  • Moderate SQL complexity due to arrays

FLATTEN_SINGLE_VALUE

Description

FLATTEN_SINGLE_VALUE is the simplest format.
Each attribute is exported as a single scalar value, typically the OV (primary) value chosen by Reltio.

Arrays are not used.

Structure

Attributes are represented as simple key-value pairs.

Example

{
  "FirstName": "Jon",
  "LastName": "Doe"
}

What You Get

  • One value per attribute

  • Simplified schema

What You Don’t Get

  • Non-OV or alternate values

  • Any metadata (OV flags, IDs, URIs, sources, pins, ignore flags)

When to Use

Use FLATTEN_SINGLE_VALUE when:

  • You want the simplest possible schema

  • Data will be consumed by BI or visualization tools

  • You only care about the primary (OV) value

  • You want to:

    • Avoid arrays

    • Avoid LATERAL FLATTEN

    • Minimize query complexity

Trade-offs

  • Least flexible

  • No historical or alternate value visibility

  • Best performance and usability for straightforward reporting

FormatPayload SizeMetadataMulti-ValuesQuery ComplexityBest For
STANDARDLargestFullYesHighLineage, survivorship, debugging
FLATTENMediumNoneYesMediumAnalytics on values only
FLATTEN_SINGLE_VALUESmallestNoneNoLowBI reporting, dashboards

Snowflake SQL query patterns by attributeFormat

Shape Examples:

STANDARD format

 

{
  "FirstName": [
    { "id": "1", "value": "Jon", "isOv": true, "uri": "...", "pin": false, "ignore": false },
    { "id": "2", "value": "Johnny", "isOv": false, "uri": "...", "pin": false, "ignore": false }
  ],
  "Email": [
    { "value": "jon.doe@example.com", "isOv": true },
    { "value": "j.doe@work.com", "isOv": false }
  ]
}

FLATTEN format

{
  "FirstName": ["Jon", "Johnny"],
  "Email": ["jon.doe@example.com", "j.doe@work.com"]
}

FLATTEN_SINGLE_VALUE format

 

{
  "FirstName": "Jon",
  "Email": "jon.doe@example.com"
}

No arrays, no OV/metadata – just a scalar per attribute.

Query

STANDARD – without LATERAL FLATTEN

Pick OV value by indexing the first OV element (assuming OV is first in the array).

SELECT
    URI,
    TYPE,
    ATTRIBUTES:"FirstName"[0]:"value"::STRING AS FIRST_NAME_OV,
    ATTRIBUTES:"Email"[0]:"value"::STRING     AS EMAIL_OV
FROM MDM.ENTITIES
WHERE TYPE = 'Contact';

If OV might not be index 0 and you just want the first value regardless of OV:

SELECT
    URI,
    TYPE,
    (ATTRIBUTES:"FirstName"[0]:"value")::STRING AS FIRST_NAME_ANY,
    (ATTRIBUTES:"Email"[0]:"value")::STRING     AS EMAIL_ANY
FROM MDM.ENTITIES;

STANDARD – with LATERAL FLATTEN

Return all values for an attribute with their OV flag.

SELECT
    e.URI,
    e.TYPE,
    f.value:"value"::STRING  AS EMAIL_VALUE,
    f.value:"isOv"::BOOLEAN  AS EMAIL_IS_OV
FROM MDM.ENTITIES e,
     LATERAL FLATTEN(input => e.ATTRIBUTES:"Email") f
WHERE e.TYPE = 'Contact';

Multiple attributes, each flattened separately:

SELECT
    e.URI,
    fn.value:"value"::STRING  AS FIRST_NAME_VALUE,
    fn.value:"isOv"::BOOLEAN  AS FIRST_NAME_IS_OV,
    em.value:"value"::STRING  AS EMAIL_VALUE,
    em.value:"isOv"::BOOLEAN  AS EMAIL_IS_OV
FROM MDM.ENTITIES e
     ,LATERAL FLATTEN(input => e.ATTRIBUTES:"FirstName") fn
     ,LATERAL FLATTEN(input => e.ATTRIBUTES:"Email") em;

To get only the OV email using flatten:

SELECT
    e.URI,
    em.value:"value"::STRING AS EMAIL_OV
FROM MDM.ENTITIES e,
     LATERAL FLATTEN(input => e.ATTRIBUTES:"Email") em
WHERE em.value:"isOv"::BOOLEAN = TRUE;

FLATTEN – without LATERAL FLATTEN

Take the first value in each array:

SELECT
    URI,
    TYPE,
    ATTRIBUTES:"FirstName"[0]::STRING AS FIRST_NAME_PRIMARY,
    ATTRIBUTES:"Email"[0]::STRING     AS EMAIL_PRIMARY
FROM MDM.ENTITIES
WHERE TYPE = 'Contact';

If you don’t care which value and just want to display all as JSON:

SELECT
    URI,
    TYPE,
    ATTRIBUTES:"FirstName" AS FIRST_NAME_ARRAY,
    ATTRIBUTES:"Email"     AS EMAIL_ARRAY
FROM MDM.ENTITIES;

 

FLATTEN – with LATERAL FLATTEN

Return one row per value (explode multi‑valued attributes):

SELECT
   e.URI,
   e.TYPE,
   f.value::STRING AS EMAIL_VALUE
FROM MDM.ENTITIES e,
    LATERAL FLATTEN(input => e.ATTRIBUTES:"Email") f
WHERE e.TYPE = 'Contact';

Both first name and email exploded:

SELECT
   e.URI,
   fn.value::STRING AS FIRST_NAME_VALUE,
   em.value::STRING AS EMAIL_VALUE
FROM MDM.ENTITIES e
    ,LATERAL FLATTEN(input => e.ATTRIBUTES:"FirstName") fn
    ,LATERAL FLATTEN(input => e.ATTRIBUTES:"Email") em;

If you want to keep one row per entity, but concatenate multiple values.

SELECT
   e.URI,
   LISTAGG(f.value::STRING, ', ') AS ALL_EMAILS
FROM MDM.ENTITIES e,
    LATERAL FLATTEN(input => e.ATTRIBUTES:"Email") f
GROUP BY e.URI;

 

FLATTEN_SINGLE_VALUE – without LATERAL FLATTEN

Cleanest case; direct scalar extraction:

SELECT
    URI,
    TYPE,
    ATTRIBUTES:"FirstName"::STRING AS FIRST_NAME,
    ATTRIBUTES:"Email"::STRING     AS EMAIL
FROM MDM.ENTITIES
WHERE TYPE = 'Contact';

If you had an attribute that is still an array or a nested object.

-- Example: AddressLines still modeled as an array even in FLATTEN_SINGLE_VALUE
SELECT
    e.URI,
    a.value::STRING AS ADDRESS_LINE
FROM MDM.ENTITIES e,
     LATERAL FLATTEN(input => e.ATTRIBUTES:"AddressLines") a;

 

 

Quick selection guidance

  • STANDARD

    • Use when you need isOv, id, uri, etc.

    • Almost always pair with LATERAL FLATTEN if you care about non‑OV values.

  • FLATTEN

    • Values only, as arrays.

    • Use [0] when you want a “primary” value without flattening, or LATERAL FLATTEN for full multi‑value expansion.

  • FLATTEN_SINGLE_VALUE

    • One scalar per attribute, no arrays.

    • Best for simple reporting; usually no LATERAL FLATTEN needed.

 

 

 

Was this article helpful?
0 out of 0 found this helpful

Comments

0 comments

Please sign in to leave a comment.