Question
Can you give me some examples of GBQ queries to access information using Reporting and Analytics tables and views?
Answer
Each row of the table represents a single entity of that given type. Since each entity has at least one but most likely multiple crosswalks, each table has a crosswalk
column. In Reltio each crosswalk is considered as an object. Therefore, the crosswalk object is mapped to STRUCT
in Reltio Analytics. Furthermore, as the crosswalks can be more than one for each entity or each row in our Reltio Analytics table, the column is also REPEATED
.
REPEATED STRUCT
in Reltio Analytics. The STRUCT
has the following properties of a crosswalk as sub-columns:
values
attributes
uri
updateDate
deleteDate
reltioLoadDate
source
sourceTable
refEntityOrRelationId
createDate
How to Unnest Columns
SELECT
cw AS Crosswalk
FROM `<dataset_id>.entity_Individual_source`,
UNNEST (crosswalks) AS cw
SELECT
cw AS Crosswalk,
tg AS Tag
FROM `<dataset_id>.entity_Individual_source`,
UNNEST (crosswalks) AS cw,
UNNEST (tags) as tg
STRUCT
as shown below.
SELECT
cw.Uri AS CrosswalkUri,
cw.Source AS CrosswalkSource,
tg AS Tag
FROM `<dataset_id>.entity_Individual_source`,
UNNEST (crosswalks) AS cw,
UNNEST (tags) as tg
Get the number of entities for a particular entity type
SELECT count(distinct Id)
FROM `customer-facing-hipaa.views_riq_dw_<envirnoment>_<tenantID>.entity_<entityType>`,
UNNEST(crosswalks) as cw where deleted = false
How to gather crosswalk source system name and id for a specific entity
SELECT
Id as entity_Id,
cwk.value as crosswalk_id,
SUBSTR(cwk.source, REGEXP_INSTR(cwk.source, '/', 1, 2)) as crosswalk_source,
FROM
`customer-facing.views_riq_<customer view>` as T,
UNNEST(T.crosswalks) AS cwk,
UNNEST(cwk.attributes) AS cwk_attr
--where id = '1AcTIHmT'
group by Id, crosswalk_id, crosswalk_source
order by crosswalk_source
limit 1000
Response in output
[
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1AcTIHmT",
"crosswalk_source": "ReltioCleanser"
},
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1-38-3601",
"crosswalk_source": "BRAZIL"
}
]
How to gather entity information and the crosswalk source system name and id
CREATE TEMP FUNCTION
FlattenedAttributes(json_row STRING)
RETURNS ARRAY<STRUCT<uri STRING,
isOv STRING,
value STRING>>
LANGUAGE js AS """
function collectAttributes(attributes, baseUri, arr) {
for (attrName in attributes) {
var attrArray = attributes[attrName];
for (attrIndex in attrArray) {
var attr = attrArray[attrIndex];
if (baseUri != "" && !baseUri.endsWith("/")) {
baseUri = baseUri + "/";
}
var uri = baseUri + attrName + "/" + attr["Id"];
var value = attr["value"];
if (value === Object(value)) {
collectAttributes(value, uri, arr);
}
else {
arr.push({"uri": uri, "isOv": attr["isOv"], "value": attr["value"]});
}
}
}
return arr
}
var row = JSON.parse(json_row);
return collectAttributes(row["attributes"], "", []);
""";
SELECT
id as entity_Id,
cwk.value as crosswalk_id,
SUBSTR(cwk.source, REGEXP_INSTR(cwk.source, '/', 1, 2)+1) as crosswalk_source,
flattened_attribute.value AS attribute_Value,
substr(flattened_attribute.uri, 0, REGEXP_INSTR(flattened_attribute.uri, '/')-1 ) as attribute_Name,
FROM
`customer-facing.views_<customer view>` T,
UNNEST(T.crosswalks) AS cwk,
UNNEST(cwk.attributes) AS cwk_attr,
UNNEST(FlattenedAttributes(TO_JSON_STRING(t))) AS flattened_attribute
where id = '1AcTIHmT'
group by entity_Id, crosswalk_id, crosswalk_source, attribute_Name, attribute_Value
order by crosswalk_source, crosswalk_id, attribute_name
Response
[
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1AcTIHmT",
"crosswalk_source": "ReltioCleanser",
"attribute_Value": "BR",
"attribute_Name": "TownshipDistrictName"
},
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1-38-3601",
"crosswalk_source": "BRAZIL",
"attribute_Value": "Brazil",
"attribute_Name": "Addresses"
},
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1-38-3601",
"crosswalk_source": "BRAZIL",
"attribute_Value": "CARMO DO",
"attribute_Name": "TownshipDistrictName"
},
Capture Information from HCO entity
Select
distinct
del,
concat(IFNULL(TriggerDCR,''),',',IFNULL(PublishToDownstream,'')) as TriggerDCR,
CrosswalkValue,
concat(IFNULL(Name,''),',',IFNULL(Country,''),',',IFNULL(AddressLine1,'')) as Name,
concat(IFNULL(AddressStatus,''),',',IFNULL(TypeCode,''),',',IFNULL(SubTypeCode,'')) as AddressStatus,
concat(IFNULL(HospitalType,''),',',IFNULL(OwnershipStatus,''),',',IFNULL(CompanyType,'')) as HospitalType,
concat(IFNULL(PostalCode,''),',',IFNULL(PhoneNumber,''),',',IFNULL(WebsiteURL,'')) as PostalCode,
concat(IFNULL(DoingBusinessAsName,''),',',IFNULL(VeevaId,''),',',IFNULL("null",'')) as DoingBusinessAsName,
concat(IFNULL(AlternateBusinessName2,''),',',IFNULL(AddressLine2,''),',',IFNULL(LicenseInformation,'')) as AlternateBusinessName2,
concat(IFNULL(OfficialName,''),',',IFNULL(TotalCensusBeds,''),',',IFNULL(NumEmployees,'')) as OfficialName,
concat(IFNULL(NumsOfProviders,''),',',IFNULL(ResidentCount,''),',',IFNULL(NumPatients,'')) as NumsOfProviders,
concat(IFNULL(Numberofpatientsdischargedannually,''),',',IFNULL(Longitude,''),',',IFNULL(Latitude,'')) as Numberofpatientsdischargedannually,
concat(IFNULL(NumOperationAnnually,''),',',IFNULL(NumOPCAnnually,''),',',IFNULL(InternetHospitalCertification,'')) as NumOperationAnnually,
InternetHospitalInfo as InternetHospitalInfo,
concat(IFNULL(SpecialtyandKeyDepartment,''),',',IFNULL(SpecialtyCertification,''),',',IFNULL(HospitaltypeByOwnership,'')) as SpecialtyandKeyDepartment,
UPDATEDTIME
FROM(
SELECT distinct
T.deleted as del,
aTriggerDCR.value as TriggerDCR,
aPublishToDownstream.value as PublishToDownstream,
T.Id as CrosswalkValue,
aName.value as Name,
aCountry.value as Country,
adState.value as State,
adCity.value as City,
adCongressionalDistrict.value as District,
adLine1.value as AddressLine1,
adStatus.value as AddressStatus,
aTypeCode.value as TypeCode,
aSubTypeCode.value as SubTypeCode,
aHospitalType.value as HospitalType,
aOwnershipStatus.value as OwnershipStatus,
aCompanyType.value as CompanyType,
adPostalCode.value as PostalCode,
aPhoneNumber.value as PhoneNumber,
aWebsiteURL.value as WebsiteURL,
aDoingBusinessAsName.value as DoingBusinessAsName,
aVeevaId.value as VeevaId,
-- aOrganizationID.value as OrganizationID,
aAlternateBusinessName2.value as AlternateBusinessName2,
adAddressLine2.value as AddressLine2,
aLicenseInformation.value as LicenseInformation,
aOfficialName.value as OfficialName,
aTotalCensusBeds.value as TotalCensusBeds,
aNumEmployees.value as NumEmployees,
aNumsOfProviders.value as NumsOfProviders,
aResidentCount.value as ResidentCount,
aNumPatients.value as NumPatients,
aNumberofpatientsdischargedannually.value as Numberofpatientsdischargedannually,
aLongitude.value as Longitude,
aLatitude.value as Latitude,
aNumOperationAnnually.value as NumOperationAnnually,
aNumOPCAnnually.value as NumOPCAnnually,
aInternetHospitalCertification.value as InternetHospitalCertification,
aInternetHospitalInfo.value as InternetHospitalInfo,
aSpecialtyandKeyDepartment.value as SpecialtyandKeyDepartment,
aSpecialtyCertification.value as SpecialtyCertification,
aHospitaltypeByOwnership.value as HospitaltypeByOwnership,
date(updatedTime) as UPDATEDTIME,
ROW_NUMBER() OVER (PARTITION BY T.Id, Type ORDER BY updatedTime desc) AS row_num
FROM
`customer-facing.views_riq_<customer view>` T
left join unnest(attributes.TriggerDCR) as aTriggerDCR
left join unnest(attributes.PublishToDownstream) as aPublishToDownstream
left join unnest(attributes.Name) as aName
left join unnest(attributes.Country) as aCountry,
unnest(attributes.SourceName) as aSourceName
left join unnest(attributes.Address) as aAdd
left join unnest(aAdd.value.AddressLine1) adLine1
left join unnest(aAdd.value.City) adCity
left join unnest(aAdd.value.StateProvince) adState
left join unnest(aAdd.value.CongressionalDistrict) adCongressionalDistrict
left join unnest(aAdd.value.Status) adStatus
left join unnest(attributes.TypeCode) as aTypeCode
left join unnest(attributes.SubTypeCode) as aSubTypeCode
left join unnest(attributes.HospitalType) as aHospitalType
left join unnest(attributes.OwnershipStatus) as aOwnershipStatus
left join unnest(attributes.CompanyType) as aCompanyType
left join unnest(aAdd.value.Zip) as adZip
left join unnest(adZip.value.PostalCode) adPostalCode
left join unnest(attributes.Phone) as aPhone
left join unnest(aPhone.value.Number) aPhoneNumber
left join unnest(attributes.WebsiteURL) as aWebsiteURL
left join unnest(attributes.DoingBusinessAsName) as aDoingBusinessAsName
left join unnest(attributes.VeevaId) as aVeevaId
-- left join unnest(attributes.OrganizationID) as aOrganizationID
left join unnest(attributes.AlternateBusinessName2) as aAlternateBusinessName2
left join unnest(aAdd.value.AddressLine2) as adAddressLine2
left join unnest(attributes.LicenseInformation) as aLicenseInformation
left join unnest(attributes.OfficialName) as aOfficialName
left join unnest(attributes.TotalCensusBeds) as aTotalCensusBeds
left join unnest(attributes.NumEmployees) as aNumEmployees
left join unnest(attributes.NumsOfProviders) as aNumsOfProviders
left join unnest(attributes.ResidentCount) as aResidentCount
left join unnest(attributes.NumPatients) as aNumPatients
left join unnest(attributes.Numberofpatientsdischargedannually) as aNumberofpatientsdischargedannually
left join unnest(attributes.GeoLocation) as aGeoLocation
left join unnest(aGeoLocation.value.Longitude) aLongitude
left join unnest(aGeoLocation.value.Latitude) aLatitude
left join unnest(attributes.NumOperationAnnually) as aNumOperationAnnually
left join unnest(attributes.NumOPCAnnually) as aNumOPCAnnually
left join unnest(attributes.InternetHospitalCertification) as aInternetHospitalCertification
left join unnest(attributes.InternetHospitalInfo) as aInternetHospitalInfo
left join unnest(attributes.SpecialtyandKeyDepartment) as aSpecialtyandKeyDepartment
left join unnest(attributes.SpecialtyCertification) as aSpecialtyCertification
left join unnest(attributes.HospitaltypeByOwnership) as aHospitaltypeByOwnership
order by updatedTime DESC
) where row_num=1 and del = false
Response
{ "del": false, "TriggerDCR": ",true", "CrosswalkValue": "ZUKTGin", "Name": "ACME Company", "AddressStatus": "Active, 10 Main Street,", "HospitalType": "Private", "PostalCode": ",0513 8881 3752,http://www.jshaz.com/", "DoingBusinessAsName": "Acme Comany,3475245120,null", "AlternateBusinessName2": ",,", "OfficialName": "Offically Acme,,", "NumsOfProviders": "1,,", "Numberofpatientsdischargedannually": "20000,,", "NumOperationAnnually": ",,", "InternetHospitalInfo": null, "SpecialtyandKeyDepartment": ",,", "UPDATEDTIME": "2021-12-23" },
See the name of the loser and winner
Select * From
(Select T.Id,eventTime,updatedTime,T.deleted,T.deletedTime,T.endDate,T.versionTime, aName, acW.Uri as cwURI,acW,
ROW_NUMBER() OVER (PARTITION BY T.Id, Type ORDER BY versionTime DESC, eventTime DESC) AS row_num
FROM
`customer-facing.riq_dw_<customer view>.entity_H*` as T,
unnest(T.attributes.Name) as aName,
unnest(T.crosswalks) as acW
where T.Id in ('VteC3zw')
) T
where (cwURI like '%RQkhwfx%')
order by T.updatedTime desc
Results
"Name": [
{
"type": "configuration/entityTypes/HCP/attributes/Name",
"ov": true,
"value": "BUIT, StressTest5 D",
"uri": "entities/VteC3zw/attributes/Name/41bC3ClS6"
},
{
"type": "configuration/entityTypes/HCP/attributes/Name",
"ov": false,
"value": "LUSKS , SAM
{
"type": "configuration/entityTypes/HCP/attributes/Name",
"ov": false,
"value": "BUIT , JIM ", <<=== loser name
"uri": "entities/VteC3zw/attributes/Name/3tRKV3n56"
},
{
"type": "configuration/entityTypes/HCP/attributes/Name",
"ov": false,
"value": "ROSK, ALAN",
"uri": "entities/VteC3zw/attributes/Name/19zr5D0Sy"
}
],
Field definition
Each entity table also contains the following columns:
tags
- The column holds the tags that can be attached to the entityREPEATED
column.createdTime
- Timestamp of the entity created in the source system.updatedTime
- Timestamp of the last update of the entity in the source system.startDate
- The date when the entity was created in the Reltio system.endDate
- The date when the entity was soft-deleted (if deleted) in the Reltio system.uri
- The URI of the entity.updatedBy
- The ID of the person who made the last update.type
- The type of entity. Since one table holds information about one entity type, it is expected that all the records in the given table have the same value in this column.createdBy
- The creator of the entity.eventTime
- The time at which the event was logged.versionTime
- The version of objects in the data frame.deleted
- Boolean value indicates whether the entity is deleted or not.deletedTime
- Timestamp of the entity’s deletion in the source system.id
- The ID of the entity in the Reltio system.insertedTime
- Timestamp of an entity inserted in the GBQ streaming table. For GBQ batch table,insertedTime
is equal to theupdatedTime
.
Comments
Please sign in to leave a comment.