Examples of GBQ queries using Analytics and Reporting

Question

Can you give me some examples of GBQ queries to access information using Reporting and Analytics tables and views?

Answer

Each row of the table represents a single entity of that given type. Since each entity has at least one but most likely multiple crosswalks, each table has a crosswalk column. In Reltio each crosswalk is considered as an object. Therefore, the crosswalk object is mapped to STRUCT in Reltio Analytics. Furthermore, as the crosswalks can be more than one for each entity or each row in our Reltio Analytics table, the column is also REPEATED.

The array of crosswalk objects in Reltio is mapped to REPEATED STRUCT in Reltio Analytics. The STRUCT has the following properties of a crosswalk as sub-columns:
  • values
  • attributes
  • uri
  • updateDate
  • deleteDate
  • reltioLoadDate
  • source
  • sourceTable
  • refEntityOrRelationId
  • createDate

How to Unnest Columns

Unnesting columns in Reltio Analytics is the process of flattening a list of values in a column. This is a fairly simple task but has a great influence on performance. Therefore you must perform this task with caution. The following is an example of unnesting:
SELECT
  cw AS Crosswalk
FROM `<dataset_id>.entity_Individual_source`,
UNNEST (crosswalks) AS cw
You are recommended to give an alias of your column as it helps you to avoid duplicate column errors. You can also unnest multiple columns at once as shown below.
SELECT
  cw AS Crosswalk,
  tg AS Tag
FROM `<dataset_id>.entity_Individual_source`,
UNNEST (crosswalks) AS cw,
UNNEST (tags) as tg
You can also partially select sub-columns if your nested column is a STRUCT as shown below.
SELECT
  cw.Uri AS CrosswalkUri,
  cw.Source AS CrosswalkSource,
  tg AS Tag
FROM `<dataset_id>.entity_Individual_source`,
UNNEST (crosswalks) AS cw,
UNNEST (tags) as tg

 

Get the number of entities for a particular entity type

SELECT count(distinct Id) 
FROM `customer-facing-hipaa.views_riq_dw_<envirnoment>_<tenantID>.entity_<entityType>`,
UNNEST(crosswalks) as cw where deleted = false

 

How to gather crosswalk source system name and id for a specific entity

SELECT
  Id as entity_Id,
  cwk.value as crosswalk_id,
SUBSTR(cwk.source, REGEXP_INSTR(cwk.source, '/', 1, 2)) as crosswalk_source,

FROM
`customer-facing.views_riq_<customer view>` as T,
UNNEST(T.crosswalks) AS cwk,
UNNEST(cwk.attributes) AS cwk_attr
--where id = '1AcTIHmT'
group by Id, crosswalk_id, crosswalk_source
order by crosswalk_source
limit 1000

Response in output

[
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1AcTIHmT",
"crosswalk_source": "ReltioCleanser"
},
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1-38-3601",
"crosswalk_source": "BRAZIL"
}
]

How to gather entity information and the crosswalk source system name and id

CREATE TEMP FUNCTION
FlattenedAttributes(json_row STRING)
RETURNS ARRAY<STRUCT<uri STRING,
isOv STRING,
value STRING>>
LANGUAGE js AS """
function collectAttributes(attributes, baseUri, arr) {
for (attrName in attributes) {
var attrArray = attributes[attrName];
for (attrIndex in attrArray) {
var attr = attrArray[attrIndex];
if (baseUri != "" && !baseUri.endsWith("/")) {
baseUri = baseUri + "/";
}
var uri = baseUri + attrName + "/" + attr["Id"];
var value = attr["value"];
if (value === Object(value)) {
collectAttributes(value, uri, arr);
}
else {
arr.push({"uri": uri, "isOv": attr["isOv"], "value": attr["value"]});
}
}
}
return arr
}
var row = JSON.parse(json_row);
return collectAttributes(row["attributes"], "", []);
""";

SELECT
  id as entity_Id,
  cwk.value as crosswalk_id,
SUBSTR(cwk.source, REGEXP_INSTR(cwk.source, '/', 1, 2)+1) as crosswalk_source,
  flattened_attribute.value AS attribute_Value,
substr(flattened_attribute.uri, 0, REGEXP_INSTR(flattened_attribute.uri, '/')-1 ) as attribute_Name,
FROM
`customer-facing.views_<customer view>` T,
UNNEST(T.crosswalks) AS cwk,
UNNEST(cwk.attributes) AS cwk_attr,
UNNEST(FlattenedAttributes(TO_JSON_STRING(t))) AS flattened_attribute
where id = '1AcTIHmT'
group by entity_Id, crosswalk_id, crosswalk_source, attribute_Name, attribute_Value
order by crosswalk_source, crosswalk_id, attribute_name

Response

[
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1AcTIHmT",
"crosswalk_source": "ReltioCleanser",
"attribute_Value": "BR",
"attribute_Name": "TownshipDistrictName"
},
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1-38-3601",
"crosswalk_source": "BRAZIL",
"attribute_Value": "Brazil",
"attribute_Name": "Addresses"
},
{
"entity_Id": "1AcTIHmT",
"crosswalk_id": "1-38-3601",
"crosswalk_source": "BRAZIL",
"attribute_Value": "CARMO DO",
"attribute_Name": "TownshipDistrictName"
},

Capture Information from HCO entity

Select
distinct
del,
concat(IFNULL(TriggerDCR,''),',',IFNULL(PublishToDownstream,'')) as TriggerDCR,
CrosswalkValue,
concat(IFNULL(Name,''),',',IFNULL(Country,''),',',IFNULL(AddressLine1,'')) as Name,
concat(IFNULL(AddressStatus,''),',',IFNULL(TypeCode,''),',',IFNULL(SubTypeCode,'')) as AddressStatus,
concat(IFNULL(HospitalType,''),',',IFNULL(OwnershipStatus,''),',',IFNULL(CompanyType,'')) as HospitalType,
concat(IFNULL(PostalCode,''),',',IFNULL(PhoneNumber,''),',',IFNULL(WebsiteURL,'')) as PostalCode,
concat(IFNULL(DoingBusinessAsName,''),',',IFNULL(VeevaId,''),',',IFNULL("null",'')) as DoingBusinessAsName,
concat(IFNULL(AlternateBusinessName2,''),',',IFNULL(AddressLine2,''),',',IFNULL(LicenseInformation,'')) as AlternateBusinessName2,
concat(IFNULL(OfficialName,''),',',IFNULL(TotalCensusBeds,''),',',IFNULL(NumEmployees,'')) as OfficialName,
concat(IFNULL(NumsOfProviders,''),',',IFNULL(ResidentCount,''),',',IFNULL(NumPatients,'')) as NumsOfProviders,
concat(IFNULL(Numberofpatientsdischargedannually,''),',',IFNULL(Longitude,''),',',IFNULL(Latitude,'')) as Numberofpatientsdischargedannually,
concat(IFNULL(NumOperationAnnually,''),',',IFNULL(NumOPCAnnually,''),',',IFNULL(InternetHospitalCertification,'')) as NumOperationAnnually,
InternetHospitalInfo as InternetHospitalInfo,
concat(IFNULL(SpecialtyandKeyDepartment,''),',',IFNULL(SpecialtyCertification,''),',',IFNULL(HospitaltypeByOwnership,'')) as SpecialtyandKeyDepartment,
UPDATEDTIME

FROM(
SELECT distinct
T.deleted as del,
aTriggerDCR.value as TriggerDCR,
aPublishToDownstream.value as PublishToDownstream,
T.Id as CrosswalkValue,
aName.value as Name,
aCountry.value as Country,
adState.value as State,
adCity.value as City,
adCongressionalDistrict.value as District,
adLine1.value as AddressLine1,
adStatus.value as AddressStatus,
aTypeCode.value as TypeCode,
aSubTypeCode.value as SubTypeCode,
aHospitalType.value as HospitalType,
aOwnershipStatus.value as OwnershipStatus,
aCompanyType.value as CompanyType,
adPostalCode.value as PostalCode,
aPhoneNumber.value as PhoneNumber,
aWebsiteURL.value as WebsiteURL,
aDoingBusinessAsName.value as DoingBusinessAsName,
aVeevaId.value as VeevaId,
-- aOrganizationID.value as OrganizationID,
aAlternateBusinessName2.value as AlternateBusinessName2,
adAddressLine2.value as AddressLine2,
aLicenseInformation.value as LicenseInformation,
aOfficialName.value as OfficialName,
aTotalCensusBeds.value as TotalCensusBeds,
aNumEmployees.value as NumEmployees,
aNumsOfProviders.value as NumsOfProviders,
aResidentCount.value as ResidentCount,
aNumPatients.value as NumPatients,
aNumberofpatientsdischargedannually.value as Numberofpatientsdischargedannually,
aLongitude.value as Longitude,
aLatitude.value as Latitude,
aNumOperationAnnually.value as NumOperationAnnually,
aNumOPCAnnually.value as NumOPCAnnually,
aInternetHospitalCertification.value as InternetHospitalCertification,
aInternetHospitalInfo.value as InternetHospitalInfo,
aSpecialtyandKeyDepartment.value as SpecialtyandKeyDepartment,
aSpecialtyCertification.value as SpecialtyCertification,
aHospitaltypeByOwnership.value as HospitaltypeByOwnership,
date(updatedTime) as UPDATEDTIME,

ROW_NUMBER() OVER (PARTITION BY T.Id, Type ORDER BY updatedTime desc) AS row_num
FROM
`customer-facing.views_riq_<customer view>` T
left join unnest(attributes.TriggerDCR) as aTriggerDCR
left join unnest(attributes.PublishToDownstream) as aPublishToDownstream
left join unnest(attributes.Name) as aName
left join unnest(attributes.Country) as aCountry,
unnest(attributes.SourceName) as aSourceName
left join unnest(attributes.Address) as aAdd
left join unnest(aAdd.value.AddressLine1) adLine1
left join unnest(aAdd.value.City) adCity
left join unnest(aAdd.value.StateProvince) adState
left join unnest(aAdd.value.CongressionalDistrict) adCongressionalDistrict
left join unnest(aAdd.value.Status) adStatus
left join unnest(attributes.TypeCode) as aTypeCode
left join unnest(attributes.SubTypeCode) as aSubTypeCode
left join unnest(attributes.HospitalType) as aHospitalType
left join unnest(attributes.OwnershipStatus) as aOwnershipStatus
left join unnest(attributes.CompanyType) as aCompanyType
left join unnest(aAdd.value.Zip) as adZip
left join unnest(adZip.value.PostalCode) adPostalCode
left join unnest(attributes.Phone) as aPhone
left join unnest(aPhone.value.Number) aPhoneNumber
left join unnest(attributes.WebsiteURL) as aWebsiteURL
left join unnest(attributes.DoingBusinessAsName) as aDoingBusinessAsName
left join unnest(attributes.VeevaId) as aVeevaId
-- left join unnest(attributes.OrganizationID) as aOrganizationID
left join unnest(attributes.AlternateBusinessName2) as aAlternateBusinessName2
left join unnest(aAdd.value.AddressLine2) as adAddressLine2
left join unnest(attributes.LicenseInformation) as aLicenseInformation
left join unnest(attributes.OfficialName) as aOfficialName
left join unnest(attributes.TotalCensusBeds) as aTotalCensusBeds
left join unnest(attributes.NumEmployees) as aNumEmployees
left join unnest(attributes.NumsOfProviders) as aNumsOfProviders
left join unnest(attributes.ResidentCount) as aResidentCount
left join unnest(attributes.NumPatients) as aNumPatients
left join unnest(attributes.Numberofpatientsdischargedannually) as aNumberofpatientsdischargedannually
left join unnest(attributes.GeoLocation) as aGeoLocation
left join unnest(aGeoLocation.value.Longitude) aLongitude
left join unnest(aGeoLocation.value.Latitude) aLatitude
left join unnest(attributes.NumOperationAnnually) as aNumOperationAnnually
left join unnest(attributes.NumOPCAnnually) as aNumOPCAnnually
left join unnest(attributes.InternetHospitalCertification) as aInternetHospitalCertification
left join unnest(attributes.InternetHospitalInfo) as aInternetHospitalInfo
left join unnest(attributes.SpecialtyandKeyDepartment) as aSpecialtyandKeyDepartment
left join unnest(attributes.SpecialtyCertification) as aSpecialtyCertification
left join unnest(attributes.HospitaltypeByOwnership) as aHospitaltypeByOwnership
order by updatedTime DESC
) where row_num=1 and del = false

Response

{
    "del": false,
    "TriggerDCR": ",true",
    "CrosswalkValue": "ZUKTGin",
    "Name": "ACME Company",
    "AddressStatus": "Active, 10 Main Street,",
    "HospitalType": "Private",
    "PostalCode": ",0513 8881 3752,http://www.jshaz.com/",
    "DoingBusinessAsName": "Acme Comany,3475245120,null",
    "AlternateBusinessName2": ",,",
    "OfficialName": "Offically Acme,,",
    "NumsOfProviders": "1,,",
    "Numberofpatientsdischargedannually": "20000,,",
    "NumOperationAnnually": ",,",
    "InternetHospitalInfo": null,
    "SpecialtyandKeyDepartment": ",,",
    "UPDATEDTIME": "2021-12-23"
  },

See the  name of the loser and winner

Select * From
(Select T.Id,eventTime,updatedTime,T.deleted,T.deletedTime,T.endDate,T.versionTime, aName, acW.Uri as cwURI,acW,
ROW_NUMBER() OVER (PARTITION BY T.Id, Type ORDER BY versionTime DESC, eventTime DESC) AS row_num

FROM
`customer-facing.riq_dw_<customer view>.entity_H*` as T,

unnest(T.attributes.Name) as aName,
unnest(T.crosswalks) as acW
where T.Id in ('VteC3zw')
) T
where (cwURI like '%RQkhwfx%')
order by T.updatedTime desc

Results

 "Name": [
            {
"type": "configuration/entityTypes/HCP/attributes/Name",
"ov": true,
"value": "BUIT, StressTest5 D",
"uri": "entities/VteC3zw/attributes/Name/41bC3ClS6"
            },
            {
"type": "configuration/entityTypes/HCP/attributes/Name",
"ov": false,
"value": "LUSKS , SAM
            {
"type": "configuration/entityTypes/HCP/attributes/Name",
"ov": false,
"value": "BUIT , JIM ", <<=== loser name
"uri": "entities/VteC3zw/attributes/Name/3tRKV3n56"
            },
            {
"type": "configuration/entityTypes/HCP/attributes/Name",
"ov": false,
"value": "ROSK, ALAN",
"uri": "entities/VteC3zw/attributes/Name/19zr5D0Sy"
            }
        ],

Field definition

Each entity table also contains the following columns:

  • tags - The column holds the tags that can be attached to the entity REPEATED column.
  • createdTime - Timestamp of the entity created in the source system.
  • updatedTime - Timestamp of the last update of the entity in the source system.
  • startDate - The date when the entity was created in the Reltio system.
  • endDate - The date when the entity was soft-deleted (if deleted) in the Reltio system.
  • uri - The URI of the entity.
  • updatedBy - The ID of the person who made the last update.
  • type - The type of entity. Since one table holds information about one entity type, it is expected that all the records in the given table have the same value in this column.
  • createdBy - The creator of the entity.
  • eventTime - The time at which the event was logged.
  • versionTime - The version of objects in the data frame.
  • deleted - Boolean value indicates whether the entity is deleted or not.
  • deletedTime - Timestamp of the entity’s deletion in the source system.
  • id - The ID of the entity in the Reltio system.
  • insertedTime - Timestamp of an entity inserted in the GBQ streaming table. For GBQ batch table, insertedTime is equal to the updatedTime.

 

 

 

 

 

Was this article helpful?
0 out of 0 found this helpful

Comments

0 comments

Please sign in to leave a comment.