Why is my POST processinstances/_generateFromQuery failing with a socket hang up?


I am running the API through Postman to generate a task, but I am not able to send the request.



To summarize the challenge:

  • The customer is facing issues with a backlog of potential matches for the Organization Entity
    • Currently, the number of Organizations that have >0 potential matches is 150,000
  • Using the _generateFromQuery API for Organization is causing 1.7M potential matches to be created
    • The best practice when using this query is to generate no more than 100 tasks

The business challenge for the customer is the sheer number of tasks that would be generated. Even with a perfect search, we would be looking at roughly 400k to 500k tasks, a volume that would overwhelm the data stewards' inboxes.
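
To put those figures side by side, here is a small arithmetic sketch. It uses only the numbers quoted above and assumes the 100-task best practice is read as a ceiling per _generateFromQuery call; the variable names are illustrative.

```python
# Figures quoted in the summary above.
orgs_with_matches = 150_000          # Organizations with >0 potential matches
potential_matches = 1_700_000        # potential matches created by the query
tasks_per_call = 100                 # assumed ceiling per _generateFromQuery call
estimated_tasks = (400_000, 500_000) # rough task range even with a perfect search

# Average number of potential matches per affected Organization.
avg_matches_per_org = potential_matches / orgs_with_matches

# Ceiling division: how many calls of at most 100 tasks would be needed.
calls_needed = [-(-t // tasks_per_call) for t in estimated_tasks]

print(f"average potential matches per organization: {avg_matches_per_org:.1f}")
print(f"calls needed at {tasks_per_call} tasks each: "
      f"{calls_needed[0]:,} to {calls_needed[1]:,}")
```

Under these assumptions the backlog would take thousands of calls at the recommended size, which is why narrowing the query matters more than retrying the same request.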

Next Steps

The first step for any of our solutions is to clear down the PMR process list using the API and filter provided.

  • Try _generateFromQuery in a lower environment with much stricter filters, such as Country, the entity URI with the highest match count, or a single match rule.
    • This would allow batch-style updates for important regions
  • Run the following API request to check the number of profiles returned:
GET https://{{env}}.reltio.com/reltio/api/{{TenantID}}/entities?filter=(equals(type,'configuration/entityTypes/Individual') and 
(range(matches,1,10) and equals(matchRules,'configuration/entityTypes/Individual/matchGroups/Cont BFD MarkedAsDuplicate')))
  • Take a much closer look at the match rules and the match pairs being generated using Match Rule Analyzer and data profiling.
  • Make sure that older, closed PMR processes are terminated:
curl --location --request POST 'https://<workflow environment>-workflow.reltio.com/workflow-adapter/workflow/<tenantId>/jobs/terminateProcessInstances?background=true' \
--header 'Authorization: Bearer 370bf645-4c00-44fb-b451-6700970362c7' \
--header 'Content-Type: application/json' \
--header 'EnvironmentURL: https://<data environment>' \
--data-raw '{

Once the above task is completed, make sure that no processes are present:

curl --location --request POST 'https://<workflow environment>-workflow.reltio.com/workflow-adapter/workflow/<tenantId>/processInstances/_search' \
--header 'Authorization: Bearer 370bf645-4c00-44fb-b451-6700970362c7' \
--header 'EnvironmentURL: https://<data environment>' \
--header 'Content-Type: application/json' \
--data-raw '{
   "processType": ""

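The profile-count check under Next Steps relies on a filter expression that is easy to mistype by hand. The following is a minimal sketch, not an official client, of composing that same filter and percent-encoding it into the GET URL. The helper names build_match_filter and build_entities_url, and the example values "my-env" and "myTenant", are hypothetical; only the filter syntax and URL shape are taken from the request shown earlier.

```python
from urllib.parse import quote


def build_match_filter(entity_type, min_matches, max_matches, match_rule):
    """Compose the filter expression used in the GET request above."""
    type_uri = f"configuration/entityTypes/{entity_type}"
    rule_uri = f"{type_uri}/matchGroups/{match_rule}"
    return (
        f"(equals(type,'{type_uri}') and "
        f"(range(matches,{min_matches},{max_matches}) and "
        f"equals(matchRules,'{rule_uri}')))"
    )


def build_entities_url(env, tenant_id, filter_expr):
    """Return the entities URL with the filter percent-encoded."""
    # Encode everything (spaces, quotes, parentheses) so the filter
    # survives intact as a single query-string value.
    encoded = quote(filter_expr, safe="")
    return f"https://{env}.reltio.com/reltio/api/{tenant_id}/entities?filter={encoded}"


# Hypothetical environment and tenant values, for illustration only.
expr = build_match_filter("Individual", 1, 10, "Cont BFD MarkedAsDuplicate")
url = build_entities_url("my-env", "myTenant", expr)
print(expr)
print(url)
```

Narrowing the match range or switching to a single match rule then becomes a one-argument change, which makes it easier to confirm that a filter returns a manageable number of profiles before it is reused for task generation.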