RESOURCES / Articles

Using Vlookup to avoid having redundant data in Datorama

January 17, 2023

Here, we will consider an example where DCM (DoubleClick Campaign Manager) data is received from 2 sources. One is a direct retrieval from DCM itself from their API (Application Programming Interface), while the other source is second-hand, via DCM email reports sent to a TotalConnect Stream from multiple advertisers and agencies. Since a publisher can deal with hundreds of advertisers, there is a risk of the same data coming from both the API and the email reports. This will lead to inflation of delivered numbers in the dashboard.

We have an automated process in place to avoid this duplication from happening. Given below are the steps that can be followed for this –

  1. Initially the DCM API is created, and this is what we consider the ‘base’ reference, as we set the parameters and get the data at the time we schedule. This API stream is specific to our needs.
  2. A report is created with a filter to this DCM API stream and we pull all the ‘Media Buy Keys’ (Placement Ids) from there.

  3. The output of this report is used to create a new lookup data stream through a Generic TotalConnect or a LiteConnect stream type.

  4. The DCM TotalConnect stream is set up next, to filter the incoming data by the ‘Media Buy Key’ ID. In this setup, please use the formula given below –
  5. var a = VLOOKUP(csv[‘Placement ID’],[MainDupe_Check_Entity_Key] ,[Main_Dupe_Check_Entity_Key],true);

    IF(ISEMPTY(a), csv[‘Placement ID’],’Discard’)

    In this formula, the Placement IDs from the email reports are checked against the IDs from the original API stream and returns a ‘Discard’ value if a match is found. This way all the duplicates are grouped under ‘Discard’. Please note that in the Vlookup formula, we can also input the DCM API name as a data stream filter to directly lookup these Placement IDs against API Media Buy Keys. However, using a base reference lookup table gives us flexibility if more than one API streams are present.

  6. In the dashboard, Pivots, or reports, add this filter = ‘Media Buy Key’ not equals ‘Discard’ to remove these discarded values from final views.

  7. The above process can be customized to satisfy other instances where existing databases need to have redundant info filtered out.