The Do’s and Don’ts of Data Enrichment
The recent Analytics Nexus event, hosted by Claravine, included an industry-leading lineup of speakers. If you weren’t able to attend, we hope to see you at our next event!
One of those terrific presenters was Adam Greco, a senior partner at Analytics Demystified and an advisor to Claravine. Adam literally wrote the book on SiteCatalyst, and he took the time to share with our audience some of the tactics he’s learned to make full use of the capabilities of data enrichment.
But first, what is data enrichment? Simply put, data enrichment is the act of merging an existing database of first-party customer data with third-party data from another authoritative source. Brands generally do this to enhance the data they already have, so they can make more informed decisions.
You might opt to use data enrichment in situations where, for one reason or another, it’s not possible to capture all the data you want via onsite JavaScript tagging. For example:
- Your analytics tool does not provide a sufficient number of dimensions
- Some of the data you want isn’t available at the time of collection
- You need flexibility to modify or update your data over time
- There is data you wish to collect, and you don’t want it visible to JavaScript debuggers
Each analytics tool uses its own terminology for what we’re describing here as data enrichment. Adobe uses the term SAINT classifications, Google calls it Dimension Widening, and tools like Snowplow Analytics simply call it Data Enrichment. The names vary, but whichever analytics tool you use, data enrichment is the process of layering additional data on top of the data you’ve already collected (usually via JavaScript), matched on a key the two share.
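To make the idea concrete, here is a minimal sketch in plain Python of what that overlay amounts to: a lookup table joined to collected data on a shared key. The names here (`hits`, `metadata`, `page_id`, `enrich`) are made up for illustration and do not correspond to any particular analytics tool’s API.

```python
# Hypothetical collected hits: each row carries only the key captured on the page.
hits = [
    {"page_id": "blog-101", "views": 120},
    {"page_id": "blog-102", "views": 45},
    {"page_id": "blog-103", "views": 80},
]

# Hypothetical metadata uploaded later, keyed by the same value.
metadata = {
    "blog-101": {"blog_post_type": "how-to", "author": "A. Writer"},
    "blog-102": {"blog_post_type": "news", "author": "B. Writer"},
}


def enrich(rows, lookup, key):
    """Return new rows with lookup attributes merged in.

    Rows whose key has no metadata yet pass through unchanged.
    """
    return [{**row, **lookup.get(row[key], {})} for row in rows]


enriched = enrich(hits, metadata, "page_id")
for row in enriched:
    print(row)
```

The collected rows are never modified. The metadata lives in its own table, which is what lets you add or change it later without re-tagging the site.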
Metadata can be added in a number of ways. You can upload it in a browser, send it over FTP, or use an API. Or, as we recommend, you can use a third-party platform like Claravine and upload your metadata automatically using preset tables. This minimizes the risk of human error and can save your team a lot of time.
It’s up to you to decide when and how often your metadata gets added. More importantly, you should look at each dimension and ask: Is there any additional data I can attach to this dimension that will help me do more analysis and help my users understand what’s happening on my site or in my app?
One valuable feature of data enrichment is that you’re able to aggregate data in the same way you might use a pivot table in Excel. Uploading your metadata allows you to group together different types of content. For example, you could enrich your company’s blog posts by adding a piece of metadata called blog post type. After you’ve uploaded the metadata, your analytics tool will allow you to see all your blog posts grouped by type. This gives you an easy way to sum up your metrics over time and see how assets perform by category, without needing to know the blog post type at the moment the page loads.
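The pivot-table comparison can be sketched in a few lines of Python. This is an illustration only: the post IDs, view counts, and the `views_by_type` helper are invented for the example, and a real analytics tool does this grouping for you once the metadata is uploaded.

```python
from collections import defaultdict

# Hypothetical page-view counts, keyed by the post ID captured at collection time.
views_by_post = {"post-a": 120, "post-b": 45, "post-c": 80, "post-d": 30}

# Hypothetical "blog post type" metadata uploaded afterwards.
# post-d has not been classified yet.
post_type = {"post-a": "how-to", "post-b": "news", "post-c": "how-to"}


def views_by_type(views, types, default="unclassified"):
    """Sum views per blog post type; posts without metadata fall into `default`."""
    totals = defaultdict(int)
    for post, count in views.items():
        totals[types.get(post, default)] += count
    return dict(totals)


totals = views_by_type(views_by_post, post_type)
print(totals)
```

Keeping an explicit bucket for unclassified posts is a design choice worth copying: it shows you how much of your traffic still lacks metadata.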
And remember: metadata can be used like any other data point in your implementation. For example, if you add a metadata attribute called source to your campaign tracking code to see where your blog views come from, you can then use source as the report view in your analytics tool. Adding 20 new pieces of metadata to your analytics implementation opens up the possibility of 20 more reports you can generate to refine and slice your data.
Another great way to take advantage of this capability is by using metadata attributes for segmentation. That’s right, you can segment your data not just by the root thing you captured with JavaScript, but by any metadata you’ve added. For example, if you get a lot of blog traffic from your competitors, you can create a metadata attribute called competitor flag and segment those views out of your analysis. This way, when you’re doing a conversion analysis on that content, you won’t include data from a bunch of people who are unlikely to actually purchase your services.
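Here is a rough sketch of that competitor-flag segment in Python. The company names, the visit rows, and the `conversion_rate` helper are all hypothetical; the point is only to show how excluding flagged traffic changes the number you report.

```python
# Hypothetical visits to blog content, keyed by the visitor's company.
visits = [
    {"company": "acme", "converted": True},
    {"company": "globex", "converted": False},
    {"company": "rivalco", "converted": False},
    {"company": "rivalco", "converted": False},
]

# Hypothetical "competitor flag" metadata uploaded after collection.
competitor_flag = {"rivalco": True}


def conversion_rate(rows):
    """Share of rows that converted; 0.0 for an empty segment."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r["converted"]) / len(rows)


# Segment: everyone whose company is not flagged as a competitor.
non_competitors = [
    v for v in visits if not competitor_flag.get(v["company"], False)
]

all_rate = conversion_rate(visits)
segment_rate = conversion_rate(non_competitors)
print(all_rate, segment_rate)
```

With this sample data, half the visits come from a flagged competitor, so the segmented rate is twice the unsegmented one.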
It can be tricky to know when to pass a data point into a dimension directly via onsite JavaScript and when to upload it as a metadata attribute, and this usually needs to be evaluated case by case. A good rule of thumb is this: if there’s a data point you want to lock in as it was at the time of the visit, like a prospect’s city or age, then you’ll want to pass it directly into your analytics tool. Imagine instead that you just capture a User ID, and then upload the prospect’s city and age as metadata. If that person moves, the updated metadata applies to their entire history, so it will look like all that person’s activity took place in the city they moved to. The same thing happens with age. If you want your demographic information to be accurate, you probably don’t want it constantly overridden by newer data.
Conversely, there may be times when you want the ability to override some data. Suppose you launch a campaign code with a tag identifying it as paid search, but it wasn’t actually paid search. If you upload that tag as metadata rather than capturing it at collection time, you can easily replace it after the fact with the correct value.
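Both sides of this rule of thumb can be seen in one small Python sketch. Everything in it is hypothetical (the field names, the `report` function, the sample user and campaign). It assumes that metadata is applied when a report is run, so changing the lookup changes every historical row that shares the key.

```python
# Hypothetical hits. `city_at_hit` was captured directly at collection time;
# `user_id` and `campaign` are keys that get enriched later.
hits = [
    {"user_id": "u1", "month": "2023-01", "city_at_hit": "Denver", "campaign": "cmp-7"},
    {"user_id": "u1", "month": "2023-06", "city_at_hit": "Austin", "campaign": "cmp-7"},
]

# Hypothetical lookups. The user lookup only knows the latest city,
# and the campaign was mislabeled as paid search.
user_city = {"u1": "Austin"}
campaign_channel = {"cmp-7": "paid search"}


def report(rows, cities, channels):
    """Build report rows, applying the lookups as they stand right now."""
    return [
        {
            "month": r["month"],
            "locked_city": r["city_at_hit"],
            "enriched_city": cities[r["user_id"]],
            "channel": channels[r["campaign"]],
        }
        for r in rows
    ]


before = report(hits, user_city, campaign_channel)

# Correct the mislabeled campaign in the lookup; the hits are untouched.
campaign_channel["cmp-7"] = "email"
after = report(hits, user_city, campaign_channel)

print(before[0])
print(after[0])
```

The January row keeps Denver in the locked-in column but shows Austin in the enriched one, which is the skew described above. The same mechanism is what makes the campaign fix painless: one change to the lookup corrects every row.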
Do you have some good data enrichment tips you’d like to share? Let us know! And be sure to bookmark the Claravine blog and the Analytics Demystified blog for more analytics content.