DLT pipeline fails with PARSE_SYNTAX_ERROR on CREATE OR REPLACE STREAMING LIVE VIEW

Summary

When running a Reltio Databricks Delta Live Tables (DLT) pipeline, the pipeline may fail during initialization with a SQL parse error similar to:

[PARSE_SYNTAX_ERROR] Syntax error at or near 'OR'. SQLSTATE: 42601
Failing statement:
CREATE OR REPLACE STREAMING LIVE VIEW activitiesSnapshots
-------^^^
AS SELECT * FROM STREAM(LIVE.landingTable) WHERE objectType = "activity"

The cause is that Databricks has deprecated the legacy LIVE schema and older DLT SQL syntax in newer DLT releases, while your DLT notebook was generated by an older Reltio Data Pipeline Hub (DPH) version that still uses the deprecated syntax.

The fix is to redeploy the DLT notebook with the updated syntax by running the Reltio configure_pipeline action.

 

Symptoms

You may see one or more of the following:

  • The DLT pipeline fails immediately on start, before any data is processed.

  • In the DLT event log or notebook output you see:

Error class: org.apache.spark.sql.catalyst.parser.ParseException
Message: [PARSE_SYNTAX_ERROR] Syntax error at or near 'OR'. SQLSTATE: 42601 (line 85, pos 7)

  • The highlighted SQL statement looks like:

CREATE OR REPLACE STREAMING LIVE VIEW activitiesSnapshots AS SELECT *
FROM STREAM(LIVE.landingTable) WHERE objectType = "activity";

  • The error points at OR in CREATE OR REPLACE, or otherwise flags the LIVE keyword as unrecognized in the current DLT publishing mode.

Important:
This failure occurs during SQL parsing, before any Reltio data is read. It is not a data‑quality or schema.json/L3 mismatch issue.
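
As an illustration of the signature described above, the following Python sketch checks a piece of event-log text for the parse error and the legacy statement. The function name and the returned keys are hypothetical, and the patterns cover only the message quoted in this article; real log output may be formatted differently.

```python
import re

# Error signature as quoted in this article (error class tag and SQLSTATE).
PARSE_ERROR_RE = re.compile(
    r"\[PARSE_SYNTAX_ERROR\]\s+Syntax error at or near 'OR'.*?SQLSTATE:\s*42601",
    re.DOTALL,
)
# The legacy statement that the parser rejects.
LEGACY_VIEW_RE = re.compile(
    r"CREATE\s+OR\s+REPLACE\s+STREAMING\s+LIVE\s+VIEW", re.IGNORECASE
)


def classify_event_message(message: str) -> dict[str, bool]:
    """Report which parts of the known failure signature appear in the text."""
    return {
        "parse_error_near_or": bool(PARSE_ERROR_RE.search(message)),
        "legacy_live_view_statement": bool(LEGACY_VIEW_RE.search(message)),
    }
```

If both values are true for a message copied from the DLT event log, the failure matches the symptoms listed here. If neither is true, the pipeline is probably failing for a different reason.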

Root Cause

What changed in Databricks

Databricks has deprecated the legacy LIVE virtual schema and old DLT SQL patterns in newer DLT releases:

  • The LIVE virtual schema is considered legacy and is tied to the legacy publishing mode.

  • Databricks recommends migrating pipelines to the default publishing mode and has documented that legacy mode and LIVE support will be removed in future versions.

Relevant Databricks documentation: Enable the default publishing mode in a pipeline | Databricks on AWS

As part of these changes:

  • Pipelines created or migrated to the new default publishing mode no longer support some legacy patterns, including combinations like
    CREATE OR REPLACE STREAMING LIVE VIEW ....

  • The LIVE keyword is treated differently or ignored in default publishing mode, and some legacy constructs are now errors instead of warnings.

When this affects Reltio Databricks pipelines

You are impacted if all of the following are true:

  1. You are running a Delta Lake / Databricks adapter pipeline from Reltio.

  2. The DLT notebook in your Databricks workspace was originally deployed by an older DPH version that still uses the legacy LIVE syntax.

  3. You have not re‑run the Reltio configure_pipeline action from a newer DPH version (so the notebook has not been refreshed).

  4. Your Databricks workspace has been upgraded to a DLT version that deprecates legacy LIVE patterns.

  5. Your DLT pipeline is running in default publishing mode (new pipelines after the Databricks change, or pipelines manually migrated).

In this situation, the Reltio‑generated notebook still contains SQL such as:

CREATE OR REPLACE STREAMING LIVE VIEW activitiesSnapshots AS SELECT * FROM
 STREAM(LIVE.landingTable) WHERE objectType = "activity";

The new DLT engine in default publishing mode no longer accepts this combination and raises PARSE_SYNTAX_ERROR on pipeline start.

What this is NOT

This error is not caused by:

  • Issues with Reltio profile/entity data.

  • A mismatch between schema.json and the L3 data model in your tenant.

  • Changes to your Reltio configuration or to the Databricks adapter in the tenant.

When schema.json / L3 drift is the cause, you will see schema validation or runtime errors later in the run, not a SQL parse failure during notebook initialization.

Impact

  • The DLT pipeline does not start, so no new Reltio events are processed into your Delta tables.

  • Downstream consumers of the Delta tables (reports, analytics, downstream pipelines) will see stale or missing data until the DLT pipeline is able to start successfully again.

 

Resolution

The supported and recommended fix is to redeploy the DLT notebook using the latest Reltio Data Pipeline Hub logic, which generates SQL compatible with the current Databricks DLT version.

  1. Identify your Delta Lake adapter name from your Data Pipeline configuration (for example, Deltalakeprod).

  2. Call the configure_pipeline action for that adapter:

POST https://<region>-data-pipeline-hub.reltio.com/api/tenants/{tenantId}/adapters/{adapterName}/actions/configure_pipeline

What this does:

  • Regenerates the DLT pipeline configuration and notebooks.

  • Updates the SQL to remove deprecated LIVE combinations and align with the current DLT publishing mode.

  • Keeps the data model and target tables the same from your perspective.
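
The call in step 2 can be sketched in Python as follows. This is a minimal sketch, not Reltio's reference client: the function name is hypothetical, and sending the access token as a Bearer header is an assumption, since this article does not describe authentication. The URL pattern is the one shown above.

```python
import urllib.request


def build_configure_pipeline_request(
    region: str, tenant_id: str, adapter_name: str, access_token: str
) -> urllib.request.Request:
    """Build (but do not send) the configure_pipeline POST request."""
    url = (
        f"https://{region}-data-pipeline-hub.reltio.com/api/tenants/"
        f"{tenant_id}/adapters/{adapter_name}/actions/configure_pipeline"
    )
    return urllib.request.Request(
        url,
        method="POST",
        # Assumption: the API accepts an OAuth-style bearer token.
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

Passing the returned object to urllib.request.urlopen() sends the request. Building it separately makes it easy to log or review the target URL before anything changes in the workspace.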

After the action completes successfully:

  1. Go to the DLT pipeline in the Databricks UI.

  2. Start an update/run of the pipeline.

  3. Confirm that it now initializes without PARSE_SYNTAX_ERROR and that tables/views (including activitiesSnapshots and other streaming tables) are created successfully.
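
Step 2 can also be triggered without the UI. The sketch below assumes the Databricks Pipelines REST API endpoint POST /api/2.0/pipelines/{pipeline_id}/updates and a personal access token sent as a bearer token. The function name and all argument values are placeholders, so verify the endpoint against the REST API reference for your workspace before relying on it.

```python
import json
import urllib.request


def build_start_update_request(
    workspace_url: str, pipeline_id: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a request that starts a pipeline update."""
    url = f"{workspace_url.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates"
    # full_refresh=False asks for a normal incremental update, not a full reset.
    body = json.dumps({"full_refresh": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

As with any update started from the UI, check the pipeline's event log afterwards to confirm that initialization no longer reports PARSE_SYNTAX_ERROR.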

Optional: save-schema-file (only if the schema has changed)

If you have changed your Reltio data model and need to update the Databricks schema, you may also run the save-schema-file action:

POST https://<region>-data-pipeline-hub.reltio.com/api/tenants/{tenantId}/adapters/{adapterName}/actions/save-schema-file

Note: The root cause is usually deprecated SQL syntax, not schema drift. Running save-schema-file alone does not fix the SQL parse error; you need configure_pipeline to refresh the notebook.

If you must get the pipeline running immediately and cannot run configure_pipeline right away:

  1. Open the DLT notebook referenced in the error in your Databricks workspace.

  2. Locate the failing block, for example:

    CREATE OR REPLACE STREAMING LIVE VIEW activitiesSnapshots AS SELECT * FROM STREAM(LIVE.landingTable) WHERE objectType = "activity";

  3. Update the statement to conform to the current DLT SQL rules for your DLT version and publishing mode (for example, by removing unsupported LIVE syntax as described in the Databricks migration guide, "Enable the default publishing mode in a pipeline | Databricks on AWS").

  4. Re‑run the pipeline.

Caution:
Any manual changes to the Reltio‑generated notebook may be overwritten the next time configure_pipeline is run. Treat this as a temporary stopgap, not a permanent solution.

 

How to Confirm You’re Hitting This Specific Issue

Use this quick checklist:

  1. Error type

    • DLT event log shows org.apache.spark.sql.catalyst.parser.ParseException.

    • Message includes [PARSE_SYNTAX_ERROR] Syntax error at or near 'OR'. SQLSTATE: 42601.

  2. Failing SQL

    • Logs point to a CREATE OR REPLACE STREAMING LIVE VIEW ... statement.

    • The notebook includes STREAM(LIVE.landingTable) or other LIVE references.

  3. Databricks version & mode

    • Workspace is on a recent Lakeflow / DLT release that deprecates legacy LIVE syntax (see Databricks docs above).

    • The pipeline is running in default publishing mode, not legacy LIVE mode.

If all three match, you are very likely affected by this Databricks syntax deprecation and should follow the resolution steps above.
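
Item 2 of the checklist can be partly automated by scanning an exported copy of the notebook source. The Python sketch below is illustrative only: the function and pattern names are hypothetical, the patterns come from the failing statement quoted in this article, and other legacy constructs may exist that it does not detect. It matches line by line, so a CREATE statement whose keywords are split across lines would be missed.

```python
import re

# Patterns derived from the statement quoted in this article.
LEGACY_PATTERNS = {
    "CREATE OR REPLACE STREAMING LIVE VIEW": re.compile(
        r"\bCREATE\s+OR\s+REPLACE\s+STREAMING\s+LIVE\s+VIEW\b", re.IGNORECASE
    ),
    "LIVE.<table> reference": re.compile(r"\bLIVE\.\w+", re.IGNORECASE),
}


def find_legacy_live_syntax(notebook_source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each legacy match, 1-indexed."""
    hits = []
    for line_number, line in enumerate(notebook_source.splitlines(), start=1):
        for name, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits
```

An empty result after configure_pipeline has run suggests the notebook was refreshed. Any remaining hits point to the lines worth inspecting.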

 

Prevention and Best Practices

To avoid similar issues when Databricks deprecates legacy behaviors:

  1. Refresh notebooks after Databricks upgrades

    • After any Databricks DLT upgrade, re‑run configure_pipeline for your active Delta Lake adapters so that notebooks are regenerated with the latest supported syntax.

  2. Monitor Databricks release notes and deprecations

  3. Limit manual edits to Reltio notebooks

    • Prefer fixing issues by running the appropriate Reltio actions (configure_pipeline, save-schema-file) instead of permanently editing Reltio‑generated notebooks in Databricks.

 

When to Contact Reltio Support

Open or update a Reltio ticket if:

  • You have run configure_pipeline and the pipeline still fails with PARSE_SYNTAX_ERROR, or

  • You aren’t sure which adapter/notebook is affected, or

  • You need help confirming your Databricks DLT version and publishing mode regarding this deprecation.

Include the following details:

  • Tenant ID and environment 

  • Workspace and DLT pipeline name 

  • Full error message and a snippet of the failing SQL from the DLT event log.

  • Confirmation of recent Databricks upgrades or changes to publishing mode (legacy vs default).

  • Whether you have already run configure_pipeline and/or save-schema-file.


 
