Dirty Data is Affecting Your Customer Personalisation Goals - Here's Everything You Need to Know
An average company uses over 35 applications to collect customer data, but most of this data is unstructured and rife with errors. If you're aiming to create personalised experiences for your customers, you must first sort this dirty data and attempt to get a single, consolidated version of the truth. Here's how.
Working on a customer personalisation plan? Don't forget to add a data cleaning plan to it. Here's why.
You've probably heard or read this a gazillion times already - data is the new fuel, new gold, new lifeline, etc. The irony is, despite knowing how important data is to the success of any digital transformation, analytics, or personalisation plan, most businesses do not have data they can trust.
Some Statistics on Data Quality
In fact, studies and surveys consistently report the high percentage of CEOs who do not have confidence in their organisation's data quality.
- Just 20% of organizations publish data provenance and data lineage. ~ O'Reilly
- Only 3% of companies' data meets quality standards. ~ HBR
I'm not being an alarmist here, but in my experience working with enterprise-level clients, this is a ground reality - one that the C-level executives wish would go away.
I haven't even included the statistics on what percentage of consumers write off a business or service provider because of bad experiences caused by bad data (accidental emails, inaccurate customer records, and missed key information, to name a few).
So if your next goal is to personalise experiences for your customers, it's time to take data quality into consideration.
Understanding Dirty Data & How it Affects Data Quality
When there's a discussion on data quality, companies often aim to solve the big problems while ignoring the seemingly small and simple ones.
They try hard to meet GDPR and state compliance rules, implement data security solutions, and invest in powerful data storage systems, expensive data lakes, and so on. It's imperative to realise that unless the small problems are fixed, these big investments won't take you far.
Let's start with dirty data.
Data in its raw form is inherently dirty. Misspellings, typos, abbreviations all over, incomplete addresses and phone numbers, and inaccurate information are some examples of dirty data. Since this is a constant issue, organizations simply hire a data analyst or specialist who ends up spending 80% of their time fixing it. Although there are commercial solutions out there that help with data matching, data cleaning, data profiling, and a host of other things, companies are reluctant to buy *yet* another solution for what they see as a minor problem.
And this is where things get bad.
Best-in-class software can profile, match, and clean a million rows of data in 45 minutes at most. It takes a data analyst months to do the same. This is why bad data is a persistent problem in most companies.
Data analysts are constantly trying to fix quality issues that keep coming in and eventually, the analyst gets frustrated because they're simply not doing the job they are supposed to be doing - that is analyzing and deriving insights from data!
But dirty data is not just about fixing spelling mistakes. On a deeper level, dirty data has to do with data that is:
- Obsolete and hasn't been updated for eons.
- Spread across multiple, disparate systems in different formats and structures.
- Siloed away, lost in some database hardly ever seeing the light of day.
- Duplicated, messy, unstructured, incomplete, inaccurate and well, practically useless.
Of all these problems, duplicated data remains the most challenging. Most companies do not have data governance in place, which means their data is heavily duplicated. It's worse with web forms and call center data. Web forms are filled in by users themselves, which means they will always be rife with errors and duplication. Any time a user updates their information with a new phone number, a new email address, or a new physical address, a duplicate entry is created. Every time a data entry operator makes a mistake with a unique identifier, a duplicate is created.
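To make the duplicate problem concrete, here is a minimal sketch of how such entries might be flagged. It is an illustration only, not how any particular commercial tool works: the field names (`name`, `email`, `phone`), the sample records, and the 0.85 similarity threshold are all assumptions, and it uses only Python's standard library.

```python
import re
from difflib import SequenceMatcher


def normalise(record):
    """Lower-case, trim, and strip punctuation so trivial differences vanish."""
    name = re.sub(r"[^a-z ]", "", record["name"].lower())
    name = re.sub(r"\s+", " ", name).strip()
    email = record["email"].strip().lower()
    phone = re.sub(r"\D", "", record["phone"])  # keep digits only
    return {"name": name, "email": email, "phone": phone}


def is_probable_duplicate(a, b, name_threshold=0.85):
    """Flag two records as likely duplicates (threshold is an assumed value)."""
    a, b = normalise(a), normalise(b)
    if a["email"] and a["email"] == b["email"]:
        return True
    if a["phone"] and a["phone"] == b["phone"]:
        return True
    # Fall back to fuzzy name similarity to catch typos such as Jon/John.
    return SequenceMatcher(None, a["name"], b["name"]).ratio() >= name_threshold


# Hypothetical sample data: the first two rows describe the same person.
records = [
    {"name": "Jon Smith", "email": "jon.smith@example.com", "phone": "555-0101"},
    {"name": "John Smith", "email": "J.Smith@Example.com ", "phone": "(555) 0101"},
    {"name": "Maria Garcia", "email": "maria@example.com", "phone": "555-0199"},
]

duplicate_pairs = [
    (i, j)
    for i in range(len(records))
    for j in range(i + 1, len(records))
    if is_probable_duplicate(records[i], records[j])
]
print(duplicate_pairs)
```

Comparing every pair like this is fine for a few hundred rows but will not scale to millions, which is part of why dedicated matching software exists.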
Data today is not as simple as it was years ago. You're not limited to just basic contact information. You've got household data, social media data, metadata (device logins, etc), app data, and a whole lot more. To understand your customers, you have to make sense of these data sources, while also ensuring they meet quality standards.
It is challenging, but it's not impossible.
Here are some ways you can deal with this.
Create a Data Quality Management Plan
You know what the problem is. Start working towards the solution. Get C-level buy-in. Demonstrate the impact bad data is having on your personalisation plans. Show the costs in numbers. Do your research. Determine whether you want to hire an in-house team or whether you can use a top-line data quality vendor to do the cleaning and matching work.
In my experience at both ends of the spectrum, being a marketing manager handling data and now being a data consultant for a data solutions provider, I can tell you that when companies have a knee-jerk reaction to bad data, they make costly mistakes.
Data cleaning today is not as simple as applying Excel or Oracle filters. You need to profile data at a deeper level to understand its errors. You need to know what kind of issues are plaguing your data. Then, you need to match millions of rows of data to weed out duplicates and create clean records. Sure, you can hire a team to do this, but if you're going to end up spending millions of dollars only to miss your personalisation goals by a mile, you have to ask whether it's worth it.
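As a rough idea of what profiling means in practice, the sketch below counts a few common problems in a small set of rows. Everything in it is an assumption made for illustration: the field names, the deliberately loose email pattern, and the rule that a phone number with fewer than 7 digits is suspect. Real profiling tools check far more than this.

```python
import re
from collections import Counter

# Deliberately loose pattern: something@something.something (an assumption).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile(rows):
    """Count missing values, malformed emails, and suspiciously short phones."""
    report = Counter()
    for row in rows:
        report["rows"] += 1
        for field, value in row.items():
            if value is None or not str(value).strip():
                report[f"missing_{field}"] += 1
        email = (row.get("email") or "").strip()
        if email and not EMAIL_PATTERN.match(email):
            report["invalid_email"] += 1
        digits = re.sub(r"\D", "", row.get("phone") or "")
        if digits and len(digits) < 7:  # assumed minimum length
            report["short_phone"] += 1
    return dict(report)


# Hypothetical sample rows with typical problems.
sample = [
    {"name": "Jon Smith", "email": "jon.smith@example.com", "phone": "555-010-1234"},
    {"name": "", "email": "maria[at]example.com", "phone": "0101"},
    {"name": "Lee Wong", "email": None, "phone": ""},
]

summary = profile(sample)
for issue, count in summary.items():
    print(f"{issue}: {count}")
```

A summary like this gives you the numbers to show how bad the problem is before you decide between an in-house team and a vendor.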
Once you understand this crucial difference and know what steps you need to take to fix your data quality, draw out that chart, make those graphs, fill those numbers, and present them to the team.
Do Not Leave Data Quality Up to the IT Department
You don't have to be in IT to resolve data problems. You can be a business user who's frustrated at having to fix hundreds of rows of data before sending a marketing campaign. You can be a marketing manager who is tasked to understand the customer journey but who doesn't have access to the right data.
You can be literally anyone in the workplace and can have a problem with data quality. Your job is to formulate the plan, meet other business users, understand challenges, and present a solution to executives. Involve your CEO, CIO, CMO, and other senior leadership to support you in getting the data you need to achieve your goals.
Consolidating Data to Get a Single Source of Truth
If you're working for a large enterprise, chances are your customer data is siloed away, lost in some Amazonian jungle. A renowned bank attempted to create this personalised experience and had to spend six months just collecting, sorting, and cleaning data! With dozens of services spread across dozens of vendor and third-party resources, the bank had a tough time consolidating its data. There are dozens of institutions and companies going through the same struggle.
The process is simple:
Gather data -> Clean data -> Match data -> Get a single source of truth.
The execution is tough.
A single source of truth is a consolidated version of your customer's profile, their interaction, and their relationship with your company. To deliver a personal experience, you must have a unified view. This view should answer questions like:
- What are your customers' preferences?
- What do they want or expect from your service/product?
- What is important to them?
- What makes them choose you over your competitors?
- What activities, benefits, or services make your customers happy?
The answer to all these questions lies in your data. All you need is to have a master record that leads you from Point A to Point Z, showing the whole journey.
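The consolidation step can be sketched in a few lines. This is a simplified illustration under stated assumptions: records from different systems are matched on a normalised email address, each record carries an ISO-formatted `updated` date, and the most recent non-empty value wins. The source names and fields are hypothetical, and real matching usually needs more than a single key.

```python
from collections import defaultdict


def build_master_records(records):
    """Merge records that share an email into one master record per customer."""
    grouped = defaultdict(list)
    for record in records:
        key = record["email"].strip().lower()  # assumed matching key
        grouped[key].append(record)

    masters = {}
    for key, group in grouped.items():
        # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
        group.sort(key=lambda r: r["updated"])
        master = {"email": key, "sources": []}
        for record in group:
            master["sources"].append(record["source"])
            for field in ("name", "phone", "city"):
                value = (record.get(field) or "").strip()
                if value:
                    master[field] = value  # newer non-empty values overwrite older
        masters[key] = master
    return masters


# Hypothetical records gathered from three separate systems.
records = [
    {"source": "crm", "email": "Jon.Smith@example.com", "name": "Jon Smith",
     "phone": "", "city": "Leeds", "updated": "2021-03-01"},
    {"source": "web_form", "email": "jon.smith@example.com ", "name": "Jon Smith",
     "phone": "555-0101", "city": "", "updated": "2022-07-15"},
    {"source": "call_centre", "email": "maria@example.com", "name": "Maria Garcia",
     "phone": "555-0199", "city": "Madrid", "updated": "2022-01-10"},
]

masters = build_master_records(records)
for master in masters.values():
    print(master)
```

Keeping the list of contributing sources on each master record is a small design choice that makes it easier to trace where a value came from later.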
To Conclude, Get Your Data Right Before Creating a Personalised Experience
In my years of working with all kinds of businesses, including Fortune 500 companies, I have seen the same mistake repeated over and over again. Companies only realise the problem of data quality when it results in a failed transformation project, a flawed report, or analytics that don't give the insights expected.
If you truly want to give your users an excellent, personalised experience, start with your data.