Question:
I created a data load job definition in the dev tenant (Job Definition ID 90c109ee-5787-4219-86fd-8b79657b0d74); "NOTC.CSV" is the original file that was used to create the mapping. I have uploaded several other files via API calls and triggered the job to run. All the API calls returned 200 (success) responses, but the job failed to pick up the new files.
The UI shows "ERROR: Job bulk load cancelled as no new files are available for processing.", and it looks like the newly uploaded files were not picked up by the data load job. Do you know what needs to be configured?
Here are the API calls and the responses:
- Upload the file: https://361-dataloader.reltio.com/dataloader/api/<tenantId>/storage/bf4c46c5-af26-4b5d-8619-64efcc90ba5c/upload?replace=true&projectID=bf4c46c5-af26-4b5d-8619-64efcc90ba5c
- Trigger the data load job: https://361-dataloader.reltio.com/dataloader/api/<tenantId>/project/bf4c46c5-af26-4b5d-8619-64efcc90ba5c/jobs/run
Answer
Note: The following steps use the REST calls defined in https://developer.reltio.com/private/swagger.htm?module=Data+Ingestion
- Create a storage
POST https://361-dataloader.reltio.com/dataloader/api/<tenantId>/storage/_account
Request body ("accountType" is one of UPLOADED, GCP, AWS_S3, SFTP, AZURE):
{ "accountName": "HCPLoad",
"accountType": "UPLOADED",
"credentials": "*****" }
- Create Mapping
curl --location --request POST 'https://361-dataloader.reltio.com/dataloader/api/<tenantId>/_mapping' \
--header 'Authorization: Bearer 5e4d2f6c-5fa0-472a-9b7b-c225f91af1a8' \
--header 'Content-Type: application/json' \
--data-raw '{
"mappingName": "bulk load mapping",
Response:
{
"createdBy": "gloria.faley@reltio.com",
"createdDate": 1665600980060,
"mappingId": 25091,
"mappingName": "bulk load mapping",
"tenantId": "<tenantId>",
- Create a source and link it to the storage and get a storage id.
POST https://361-dataloader.reltio.com/dataloader/api/<tenantId>/project/data/source
Request body ("accountType" is one of UPLOADED, GCP, AWS_S3, SFTP, AZURE):
{
"shareSourceInfo": true,
"bucketName": "reltio_customer-facing_dataloader",
"sourcePath": "abc/reltio/entities",
"fileMask": "HCP",
"storageAccount": {
"accountName": "HCPLoad",
"accountType": "UPLOADED",
"credentials": ""
}
}
Response body contains - "storageId": "<storageId>".
- Create a project; link the storageId and mappingId from the steps above to the project.
POST https://361-dataloader.reltio.com/dataloader/api/<tenantId>/project
Request Body:
{ "name": "TestJob",
"updateType": "FULL_UPDATE",
"storageDetails": {
"storageId": "<storageId from step above>"
},
"mappingDetails": {
"mappingId": <mapping id from steps above>
},
"schedulingDetails": {},
"progressStatus": "Define",
"environment": "dev",
"loadType": "ENTITIES",
"additionalAttributes": {
"alwaysCreateDCR": false
}
}
- Upload a file from your system to remote storage and get a new storage ID.
curl --location --request POST 'https://361-dataloader.reltio.com/dataloader/api/<tenantId>/storage/90c109ee-5787-4219-86fd-8b79657b0d74/upload?replace=true&projectID=3a438ad8-2afa-4cc6-96fe-c6003a842e50' \
--header 'Authorization: Bearer 5e4d2f6c-5fa0-472a-9b7b-c225f91af1a8' \
--header 'Content-Type: multipart/form-data' \
--form 'file=@"/C:/Users/gloria/Dropbox/My PC/Downloads/bulk_upload.CSV"'
Response body:
{
"createdBy": "gloria.faley@reltio.com",
"createdDate": 1665598577000,
"storageId": "39cac35a-125c-4418-bac7-eb15edb2f250",
"bucketName": "reltio_customer-facing_dataloader",
"sourcePath": "90c109ee-5787-4219-86fd-8b79657b0d74/bulk_upload.CSV",
"shareSourceInfo": false,
"delimiter": ",",
"storageAccount": {
"createdBy": "gloria.faley@reltio.com",
"createdDate": 1665598577000,
"accountId": 100382,
"accountName": "bulk_upload",
"accountType": "UPLOADED"
},
"totalRecords": 38
}
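The "storageId" in this response is the value the project is updated with in the next step. As a minimal sketch, assuming the response has been captured in a shell variable, it could be pulled out like this. The helper name extract_storage_id is hypothetical, and the sed pattern assumes "storageId" appears only once in the response, as it does above; a JSON-aware tool would be more robust.

```shell
# Hypothetical helper: print the "storageId" value found in a JSON string.
# Assumes the field appears once; this is a text match, not a JSON parser.
extract_storage_id() {
  printf '%s\n' "$1" \
    | sed -n 's/.*"storageId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
    | head -n 1
}

# Shortened copy of the upload response shown above.
response='{ "createdBy": "gloria.faley@reltio.com", "storageId": "39cac35a-125c-4418-bac7-eb15edb2f250", "totalRecords": 38 }'

new_storage_id=$(extract_storage_id "$response")
echo "$new_storage_id"
```

The resulting value can then be used wherever the new storage ID is needed in the following step.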
- Update the project with the new storage id that contains the upload.
curl --location --request PUT 'https://361-dataloader.reltio.com/dataloader/api/<tenantId>/project/3a438ad8-2afa-4cc6-96fe-c6003a842e50' \
--header 'Authorization: Bearer 5e4d2f6c-5fa0-472a-9b7b-c225f91af1a8' \
--header 'Content-Type: application/json' \
Response Body
{
"createdBy": "penmac_dl_usr",
"updatedBy": "gloria.faley@reltio.com",
"createdDate": 1665416344000,
"id": "3a438ad8-2afa-4cc6-96fe-c6003a842e50",
"name": "TestJob",
"tenantId": "<tenantId>",
"checkForUpdates": false,
"updateType": "FULL_UPDATE",
"storageDetails": {
"createdBy": "gloria.faley@reltio.com",
"updatedBy": "gloria.faley@reltio.com",
"createdDate": 1665598577000,
"updatedDate": 1665602378196,
"storageId": "39cac35a-125c-4418-bac7-eb15edb2f250",
"bucketName": "reltio_customer-facing_dataloader",
"sourcePath": "90c109ee-5787-4219-86fd-8b79657b0d74/bulk_upload.CSV",
"shareSourceInfo": false,
"delimiter": ",",
"storageAccount": {
"createdBy": "gloria.faley@reltio.com",
"updatedBy": "gloria.faley@reltio.com",
"createdDate": 1665598577000,
"updatedDate": 1665602378196,
"accountId": 100382,
"accountName": "bulk_upload",
"accountType": "UPLOADED"
},
"totalRecords": 38
},
"mappingDetails": {
"createdBy": "gloria.faley@reltio.com",
"updatedBy": "gloria.faley@reltio.com",
"createdDate": 1665600980000,
"updatedDate": 1665602378196,
"mappingId": 25091,
"mappingName": "bulk load mapping",
"tenantId": "<tenantId>",
"mappingSummary": {
...
},
"shareMappingDetails": true,
"mappingForObjectType": "ENTITIES",
"objectTypeUri": "configuration/entityTypes/Individual"
},
"schedulingDetails": {
"createdDate": 1665416344000,
"id": "fce755da-f27b-41e5-9b1d-621b7e295b78",
"startDate": 0,
"endDate": 0
},
"progressStatus": "Define",
"environment": "dev",
"loadType": "ENTITIES",
"additionalAttributes": {
"alwaysCreateDCR": false
}
}
- Create a job and run using the projectId.
curl --location --request POST 'https://361-dataloader.reltio.com/dataloader/api/<tenantId>/project/3a438ad8-2afa-4cc6-96fe-c6003a842e50/jobs/run' \
--header 'Authorization: Bearer 5e4d2f6c-5fa0-472a-9b7b-c225f91af1a8' \
--header 'Content-Type: application/json' \
Request body:
{
"name": "TestJob",
"storageDetails": [{
"bucketName": "reltio_customer-facing_dataloader",
"sourcePath": "90c109ee-5787-4219-86fd-8b79657b0d74",
"filemask": "bulk_upload"
}
]
}
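In the example above, the run request's "sourcePath" is the folder part of the uploaded file's "sourcePath", and "filemask" is the file name without its extension. A minimal sketch of building the body that way is shown below; build_run_body is a hypothetical helper, and the split rule is only inferred from this one example, not from the API documentation. It also assumes the path contains a "/" and that none of the values need JSON escaping.

```shell
# Hypothetical helper: build a jobs/run request body from an uploaded file's sourcePath.
# Usage: build_run_body <job name> <bucket name> <folder/file.ext>
build_run_body() {
  job_name=$1
  bucket=$2
  source_path=$3
  folder=${source_path%/*}    # everything before the last "/"
  file=${source_path##*/}     # file name with extension
  mask=${file%.*}             # file name without extension
  printf '{"name":"%s","storageDetails":[{"bucketName":"%s","sourcePath":"%s","filemask":"%s"}]}\n' \
    "$job_name" "$bucket" "$folder" "$mask"
}

run_body=$(build_run_body "TestJob" "reltio_customer-facing_dataloader" \
  "90c109ee-5787-4219-86fd-8b79657b0d74/bulk_upload.CSV")
echo "$run_body"
```

The printed JSON could be passed to the curl call above with --data-raw "$run_body".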
Response body
{
"createdBy": "penmac_dl_usr",
"createdDate": 1665603129006,
"id": "0939dd69-77b7-4bf6-a7c9-a5c97d6f5d14",
"name": "TestJob",
"projectDetails": {
"createdBy": "penmac_dl_usr",
"updatedBy": "gloria.faley@reltio.com",
"createdDate": 1665416344000,
"updatedDate": 1665602378000,
"id": "3a438ad8-2afa-4cc6-96fe-c6003a842e50",
"name": "TestJob",
"tenantId": "<tenantId>",
"checkForUpdates": false,
"updateType": "FULL_UPDATE",
"storageDetails": {
"createdBy": "gloria.faley@reltio.com",
"updatedBy": "gloria.faley@reltio.com",
"createdDate": 1665598577000,
"updatedDate": 1665602378000,
"storageId": "39cac35a-125c-4418-bac7-eb15edb2f250",
"bucketName": "reltio_customer-facing_dataloader",
"sourcePath": "90c109ee-5787-4219-86fd-8b79657b0d74/mdm_experian_bulk_upload (1).CSV",
"shareSourceInfo": false,
"delimiter": ",",
"storageAccount": {
"createdBy": "gloria.faley@reltio.com",
"updatedBy": "gloria.faley@reltio.com",
"createdDate": 1665598577000,
"updatedDate": 1665602378000,
"accountId": 100382,
"accountName": "bulk_upload",
"accountType": "UPLOADED"
},
"totalRecords": 38
},
- Check that the job ran as expected.
curl --location --request GET 'https://361-dataloader.reltio.com/dataloader/api/<tenantId>/project/job/0939dd69-77b7-4bf6-a7c9-a5c97d6f5d14' \
--header 'Authorization: Bearer 5e4d2f6c-5fa0-472a-9b7b-c225f91af1a8' \
--header 'Content-Type: application/json'
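For each new file, the upload, project update, run, and status calls above follow the same URL patterns, with the storage ID, project ID, and job ID as separate values. A minimal sketch of helpers that assemble those URLs follows. The function names and the TENANT_ID value are hypothetical; the paths are copied from the calls in the steps above, and nothing here contacts the server.

```shell
BASE_URL="https://361-dataloader.reltio.com/dataloader/api"
TENANT_ID="myTenant"   # placeholder; substitute your tenant ID

# Each helper prints one URL; arguments are tenant ID followed by the relevant IDs.
upload_url()  { printf '%s/%s/storage/%s/upload?replace=true&projectID=%s\n' "$BASE_URL" "$1" "$2" "$3"; }
project_url() { printf '%s/%s/project/%s\n' "$BASE_URL" "$1" "$2"; }
run_url()     { printf '%s/%s/project/%s/jobs/run\n' "$BASE_URL" "$1" "$2"; }
job_url()     { printf '%s/%s/project/job/%s\n' "$BASE_URL" "$1" "$2"; }

STORAGE_ID="90c109ee-5787-4219-86fd-8b79657b0d74"
PROJECT_ID="3a438ad8-2afa-4cc6-96fe-c6003a842e50"
JOB_ID="0939dd69-77b7-4bf6-a7c9-a5c97d6f5d14"

upload_url  "$TENANT_ID" "$STORAGE_ID" "$PROJECT_ID"
project_url "$TENANT_ID" "$PROJECT_ID"
run_url     "$TENANT_ID" "$PROJECT_ID"
job_url     "$TENANT_ID" "$JOB_ID"
```

Each printed URL can be given to the matching curl command in place of the hard-coded one.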
Comments
Hi,
Thanks for the detailed steps!
We have one question. What should we pass as a value to "credentials", when accountType = AWS_S3?
Thanks in advance!!
{
"shareSourceInfo": true,
"bucketName": "reltio_customer-facing_dataloader",
"sourcePath": "abc/reltio/entities",
"fileMask": "HCP",
"storageAccount": {
"accountName": "HCPLoad",
"accountType": "AWS_S3",
"credentials": "??????"
}
}