Switch from Account Key to Client Credentials in the Reltio Data Pipeline for Databricks

Problem:

After the Reltio Data Pipeline for Databricks is configured to use client credentials, no files arrive in Azure Data Lake Storage (ADLS).

 

Cause:

In most cases the storage account is not configured correctly. For the required settings, see the documentation topic "Configure Azure Cloud Storage for Databricks".

 

Solution:

Work with your Azure team to verify that the storage account is configured correctly.

The following recommendations make the setup more stable:

  1. Standardize by Integration
    • Keep Client Credentials as the preferred authentication method for the Reltio Data Pipeline.
    • If another pipeline must use an Auth Key, that is acceptable, but do not share secrets or roles between pipelines.
  2. Segregate Write Locations (strongly recommended)
    • Use separate containers that mirror Reltio’s Staging and Target pattern.
    • If the pipelines must share a container, give each one a unique top-level prefix (e.g., reltio/ versus other-pipeline/) and never write both systems to the same folder prefix. When using Export Service, either accept the auto-generated path or specify a distinct prefix.
  3. Harden Permissions per Method
    • Client Credentials: Assign your service principal the minimum necessary role (Storage Blob Data Contributor), scoped to the container that Reltio uses. Avoid granting broad account-level roles.
  4. Ensure Writes Are Idempotent and Time-Partitioned
    • Configure your jobs to partition data by date and time (a common practice with Reltio Export’s automatic pathing) so that pipelines running close together in time do not overwrite each other's output.
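
The recommendations above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the example prefixes, and the YYYY/MM/DD/HHMMSS folder layout are hypothetical and are not the path format that Reltio generates. The container scope string follows the standard Azure resource ID pattern for a blob container, which is the kind of scope a container-level role assignment would use.

```python
from datetime import datetime, timezone


def container_scope(subscription_id, resource_group, account, container):
    """Build the Azure resource ID of one blob container.

    A role such as Storage Blob Data Contributor can be assigned at this
    scope instead of at the whole storage account.
    """
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


def prefixes_overlap(a, b):
    """Return True if one folder prefix equals or is nested inside the other."""
    a_parts = [p for p in a.split("/") if p]
    b_parts = [p for p in b.split("/") if p]
    shorter = min(len(a_parts), len(b_parts))
    return a_parts[:shorter] == b_parts[:shorter]


def partitioned_path(prefix, run_time):
    """Return a date- and time-partitioned folder under a pipeline's prefix.

    Hypothetical layout (assumption): <prefix>/YYYY/MM/DD/HHMMSS/ in UTC.
    run_time should be a timezone-aware datetime.
    """
    utc = run_time.astimezone(timezone.utc)
    return f"{prefix.strip('/')}/{utc:%Y/%m/%d/%H%M%S}/"


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    print(container_scope("<subscription-id>", "<resource-group>", "<account>", "<container>"))
    print(prefixes_overlap("reltio/", "other-pipeline/"))
    print(partitioned_path("reltio/", datetime.now(timezone.utc)))
```

A check like prefixes_overlap can be run once when a new pipeline is onboarded, so that two pipelines are never pointed at the same or nested folders in a shared container.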