How to re-synchronize RI and GBQ with MDM?

Question

How can I re-synchronize GBQ and Reltio? This should be performed when a change is made to the data model or when the GBQ counts appear to be out of sync.

Answer

  • Make sure that the tenant has GBQ streaming access enabled. Please refer to this link to find the proper RI environment.
GET https://<RI_env>.reltio.com/api/v1.0/configuration/<tenantId>
  • In the response, check that GBQ analytics is enabled:
{
    "status": "success",
    "configuration": {
        "authData": {
            ...
        },
        "updatesConfig": {
           ...
            "consumeGbqEvents": true,
            "hoursToGbqCompaction": 168,
            ...
        },
        "analyticsEnabled": true,
        ...
    }
}

If the above is not set, refer to https://reltio.jira.com/wiki/spaces/IRD/pages/381353985/RIQ+Configuration+API to set the values as expected.
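The check above can be scripted once the configuration response has been parsed into a dictionary. The following is a minimal sketch, not an official client: the helper name gbq_streaming_enabled is hypothetical, and treating both consumeGbqEvents and analyticsEnabled as required is an assumption based on the sample response.

```python
def gbq_streaming_enabled(config_response: dict) -> bool:
    """Return True when the GBQ-related flags shown in the sample response are set.

    Assumption: both "consumeGbqEvents" and "analyticsEnabled" must be true.
    """
    configuration = config_response.get("configuration", {})
    updates = configuration.get("updatesConfig", {})
    return (
        config_response.get("status") == "success"
        and updates.get("consumeGbqEvents") is True
        and configuration.get("analyticsEnabled") is True
    )


# Sample mirroring the fields shown in the response above (elided parts left out).
sample = {
    "status": "success",
    "configuration": {
        "updatesConfig": {"consumeGbqEvents": True, "hoursToGbqCompaction": 168},
        "analyticsEnabled": True,
    },
}
print(gbq_streaming_enabled(sample))  # prints True
```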

  • Clear out the GBQ dataset
POST https://<RI_env>.reltio.com/api/v1.0/gbq/cleanup

Body:

{
    "tenantId": "<tenantID>"
}

Expect a response with HTTP status code 200:

{ "status": "success" }
  • If the dataset holds a lot of data, the request might take a long time. To avoid a timeout, run the cleanup in asynchronous mode:
POST https://<RI_env>.reltio.com/api/v1.0/gbq/cleanup?async=true

Body:

{
    "tenantId": "<tenantID>"
}
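For scripted use, the cleanup call can be prepared as shown below. This is a sketch under stated assumptions: the helper name build_cleanup_request and the environment name my-ri-env are placeholders, and bearer-token authorization is assumed because the article does not show the authentication headers. The request is only built here, not sent.

```python
import json
import urllib.request


def build_cleanup_request(ri_env: str, tenant_id: str, token: str,
                          run_async: bool = False) -> urllib.request.Request:
    """Build the POST request for the GBQ cleanup endpoint (sync or async)."""
    url = f"https://{ri_env}.reltio.com/api/v1.0/gbq/cleanup"
    if run_async:
        # Async mode avoids a client timeout on large datasets.
        url += "?async=true"
    body = json.dumps({"tenantId": tenant_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; not shown in the article.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_cleanup_request("my-ri-env", "myTenant", "<access token>", run_async=True)
print(request.get_method(), request.full_url)
# prints: POST https://my-ri-env.reltio.com/api/v1.0/gbq/cleanup?async=true
```

To send it, pass the request to urllib.request.urlopen and check that the parsed response is { "status": "success" }.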
  • Before running the re-synchronization of RI and GBQ, check the RIQ queue status.
    • The ROLE_ADMIN_TENANT role is required to execute the API below.
    • Execute the API below to find the RIQ queue status.
    • The status must be green before the re-synchronization process is started.
GET https://<RI_env>.reltio.com/api/v1.0/tenants/{tenantId}/status/queues

Response example (the queues are not empty yet, so wait before re-synchronizing):

{
    "status": "yellow",
    "payload": {
        "total": {
            "size": 5418,
            "dlqSize": 0
        }
    },
    "description": "Queues are not empty. Please wait till they will be processed.",
    "message": "Queues are not empty."
}
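Waiting for the green status can be automated with a simple polling loop. The sketch below is illustrative only: wait_for_green_queues is a hypothetical helper, the polling interval and attempt limit are arbitrary, and the shape of the green response is assumed to match the yellow example. The HTTP call itself is injected as a function so the loop can be tried without network access.

```python
import time
from typing import Callable


def wait_for_green_queues(fetch_status: Callable[[], dict],
                          poll_seconds: float = 30.0,
                          max_attempts: int = 20) -> dict:
    """Poll the queue status until it reports "green" or attempts run out."""
    for attempt in range(max_attempts):
        response = fetch_status()
        if response.get("status") == "green":
            return response
        if attempt < max_attempts - 1:
            time.sleep(poll_seconds)
    raise TimeoutError("RIQ queues did not reach green status")


# Simulated responses standing in for real GET .../status/queues calls.
# The green payload is an assumed shape, not taken from the article.
responses = iter([
    {"status": "yellow", "payload": {"total": {"size": 5418, "dlqSize": 0}}},
    {"status": "green", "payload": {"total": {"size": 0, "dlqSize": 0}}},
])
result = wait_for_green_queues(lambda: next(responses), poll_seconds=0)
print(result["status"])  # prints green
```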
  • Execute S3 Synchronization job
POST https://<RI_env>.reltio.com/api/v1.0/jobs

Body:

{
    "name": "synchronize",
    "tenant": "<TenantID>",
    "tasks": [
        {
            "application": "EntitiesExport",
            "payload": {}
        },
        {
            "application": "InteractionsExport",
            "payload": {}
        },
        {
            "application": "RelationsExport",
            "payload": {}
        },
        {
            "application": "MatchesExport",
            "payload": {}
        },
        {
            "application": "MergesExport",
            "payload": {}
        }
    ]
}

Response example:

{
    "id": "4FvvVFXh",
    "uri": "/api/v1.0/jobs/4FvvVFXh",
    "status": "JOB_PENDING"
}

You can monitor the job status by using the following call:

GET https://<RI_env>.reltio.com/api/v1.0/jobs/<job_id>
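The job body and the monitoring URL can be generated rather than typed by hand. This is a minimal sketch: the helper names build_synchronize_job and job_status_url, and the environment name my-ri-env, are placeholders introduced for illustration. The monitoring URL is derived from the uri field of the job response shown above.

```python
import json

# Export applications listed in the synchronize job body above.
SYNC_APPLICATIONS = [
    "EntitiesExport",
    "InteractionsExport",
    "RelationsExport",
    "MatchesExport",
    "MergesExport",
]


def build_synchronize_job(tenant_id: str) -> dict:
    """Build the body of the S3 synchronization job for one tenant."""
    return {
        "name": "synchronize",
        "tenant": tenant_id,
        "tasks": [{"application": app, "payload": {}} for app in SYNC_APPLICATIONS],
    }


def job_status_url(ri_env: str, job_response: dict) -> str:
    """Build the monitoring URL from the "uri" field of the job response."""
    return f"https://{ri_env}.reltio.com{job_response['uri']}"


job_body = build_synchronize_job("<TenantID>")
print(json.dumps(job_body, indent=4))

job_response = {"id": "4FvvVFXh", "uri": "/api/v1.0/jobs/4FvvVFXh", "status": "JOB_PENDING"}
print(job_status_url("my-ri-env", job_response))
# prints: https://my-ri-env.reltio.com/api/v1.0/jobs/4FvvVFXh
```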

 

  • Execute GcsExport Job
POST https://<RI_env>.reltio.com/api/v1.0/jobs
Body:
{
    "name": "GcsExport",
    "tenant": "<tenant_id>",
    "tasks": [
        {
            "application": "GcsExport",
            "payload": {
                "objectClass": "ENTITIES",
                "force": "true"
            }
        },
        {
            "application": "GcsExport",
            "payload": {
                "objectClass": "RELATIONS",
                "force": "true"
            }
        },
        {
            "application": "GcsExport",
            "payload": {
                "objectClass": "INTERACTIONS",
                "force": "true"
            }
        },
        {
            "application": "GcsExport",
            "payload": {
                "objectClass": "MATCHES",
                "force": "true"
            }
        },
        {
            "application": "GcsExport",
            "payload": {
                "objectClass": "MERGES",
                "force": "true"
            }
        }
    ]
}
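Because the five tasks differ only in objectClass, the body can be generated from a list. The following sketch assumes nothing beyond the fields shown above; the helper name build_gcs_export_job is hypothetical. Note that "force" is kept as the string "true", exactly as in the article.

```python
import json

# Object classes exported in the GcsExport job body above.
OBJECT_CLASSES = ["ENTITIES", "RELATIONS", "INTERACTIONS", "MATCHES", "MERGES"]


def build_gcs_export_job(tenant_id: str) -> dict:
    """Build the GcsExport job body with one forced task per object class."""
    return {
        "name": "GcsExport",
        "tenant": tenant_id,
        "tasks": [
            {
                "application": "GcsExport",
                "payload": {"objectClass": object_class, "force": "true"},
            }
            for object_class in OBJECT_CLASSES
        ],
    }


gcs_job = build_gcs_export_job("<tenant_id>")
print(json.dumps(gcs_job, indent=4))
```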


  • Verify that the re-sync was successful using one of the following methods:

1. Run an API call to check the total number of entities:

GET https://<env>.reltio.com/reltio/api/<tenant_id>/entities/_total

Response example:

{
    "total": 1443268
}
This number should equal the entity count returned by GBQ:
  • Open a GBQ query (follow the instructions - link)
  • Use the following script to count all entities by type:
SELECT
    Type,
    COUNT(*)
FROM
    `customer-facing.views_riq_dw_<env>_<tenant_id>.entities_merged`
WHERE
    deleted = FALSE AND softDeleted = FALSE
GROUP BY Type
  • Add up the counts for all entity types and compare the total with the number returned by the API call above.


If the number of entities in GBQ equals the number returned by the API call, the sync completed successfully.
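Since the GBQ query returns one count per entity type, the comparison amounts to summing those counts and checking the sum against the API total. A minimal sketch follows: the helper name counts_match is hypothetical, and the per-type figures are invented purely to illustrate the arithmetic. Only the total 1443268 comes from the response example.

```python
from typing import Dict


def counts_match(api_total: int, gbq_counts_by_type: Dict[str, int]) -> bool:
    """Compare the API entity total with the sum of the per-type GBQ counts."""
    return api_total == sum(gbq_counts_by_type.values())


# Hypothetical per-type counts (illustrative values, not real query output).
gbq_counts = {"TypeA": 1000000, "TypeB": 443268}
print(counts_match(1443268, gbq_counts))  # prints True
```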

 

2. Alternatively, run the following Qubole script and check the resulting count for each entity type:

 

// Connection settings for the Reltio Analytics Framework
val url = "https://<environment>-af.reltio.com"
val tenant = "<tenantId>"
val token = "<access token>"

import com.reltio.analytics.framework._
import com.reltio.analytics.data.application._
import com.reltio.analytics.data.persist._
import com.reltio.analytics.data.persist.attributes._
import com.reltio.analytics.objects.transformation._
import org.apache.spark.sql._
import com.reltio.analytics.data.delete._

// Log in using the notebook's sqlContext
val aframe = AnalyticsFramework.login(sqlContext, url, tenant, token)

// Load the entities of the given type and count them
val df = aframe.entities(s"configuration/entityType/<specify type here>", deltaWindow = null, activeOnly = false)
df.count

 
