Pär Österlund - 12 Nov 2018
In one system, we have John's name and phone number. In another, we have John's name and address. In a third, we have John's name and his old address.
Someone who doesn't know John and looks at these pieces of information has no way of knowing whether they all pertain to the same John Wilson. The only solution is to find a fourth or fifth system that contains information that ties together all the otherwise disconnected Johns.
Inability to piece together information can be expensive. We'll consider the business implications in a moment, but let's start with the legal implications.
A legal jungle
GDPR requires companies to give data subjects, such as customers, transparency regarding their personal data. This means that John can call and ask for access to all the information a company has collected about him.
Going back to our example of three different, disconnected representations of John, we can see why granting such a request might be tricky. If the relationships between the different representations of John remain ambiguous, it is impossible to know which pieces of information to give John when he comes calling.
Leave something out? You're breaking the law. Give out someone else's data? You're breaking the law.
The only way to be compliant is to piece the data together to be sure which of the Johns are the same.
Solutions for tying together data across the enterprise
The solution to this data fragmentation problem is to create a single customer view. This means connecting all the different versions so that you can see what John has bought, when he has been in contact with customer care, what type of marketing he has responded to and any other data that might be stored somewhere in an enterprise information system.
Having a single customer view is valuable because it helps you understand customer behavior and profitability. The big question is: how do we achieve this data nirvana?
The typical solution for connecting data across systems, regions and functions is to start a master data management (MDM) project. This can work, but the traditional consultant-driven MDM project has three problems: it is expensive, it is slow and its results are inaccurate.
Why is this? Remember John and the different data stored about him in different enterprise systems? Well, what the consultants do is essentially to create an advanced set of rules that try to connect the different pieces of data. You can just imagine how much time this requires and how inaccurate the result will be.
If this is a bad approach, why does third-party data make it better? Excellent question, thank you very much. I'm happy to explain.
Matching pieces of data
If we take a look into the database hosted at Bisnode, we can see it contains a lot of data about John. It has his name, his address, his five previous addresses, his phone number, the name of his spouse, the names of his kids and so on.
Thus, we can use the external database as a tool to connect the different pieces of information about John together. One database had his name and address. No problem, we get a good match on that. Another one contained his name and his old address. Again, no problem, perfect match available. The third one had his name and his phone number. Did you guess already? Yes, perfect match.
This means that the whole logic of matching pieces of data together can be outsourced to someone else. And since match quality depends on the amount and quality of the data available for the process, the larger reference database wins.
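To make the matching idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `ReferencePerson` class, the `match` function, the IDs and the sample addresses and phone numbers are invented, and a real matching service would rely on fuzzy matching rather than the exact comparison used here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ReferencePerson:
    """One entry in a hypothetical reference database."""
    person_id: str
    name: str
    addresses: Set[str] = field(default_factory=set)  # current and previous
    phones: Set[str] = field(default_factory=set)


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(value.lower().split())


def match(record: dict, reference: List[ReferencePerson]) -> Optional[str]:
    """Return the ID of the only reference person consistent with the record, else None."""
    name = normalize(record["name"])
    address = record.get("address")
    phone = record.get("phone")
    candidates = []
    for person in reference:
        if normalize(person.name) != name:
            continue
        known_addresses = {normalize(a) for a in person.addresses}
        address_ok = address is not None and normalize(address) in known_addresses
        phone_ok = phone is not None and phone in person.phones
        if address_ok or phone_ok:
            candidates.append(person.person_id)
    # Refuse to guess: an ambiguous or missing match yields None.
    return candidates[0] if len(candidates) == 1 else None


# Invented sample data: two different people share the name John Wilson.
REFERENCE = [
    ReferencePerson("P-001", "John Wilson", {"1 Main Street", "9 Old Road"}, {"555-0100"}),
    ReferencePerson("P-002", "John Wilson", {"4 Hill Lane"}, {"555-0199"}),
]

records = [
    {"name": "John Wilson", "phone": "555-0100"},         # name and phone number
    {"name": "John Wilson", "address": "1 Main Street"},  # name and address
    {"name": "John Wilson", "address": "9 Old Road"},     # name and old address
]

ids = [match(r, REFERENCE) for r in records]
print(ids)  # ['P-001', 'P-001', 'P-001']
```

The name alone is not enough to choose between the two John Wilsons, so a record with no address or phone number returns `None` instead of a guess. It is the previous addresses and phone numbers in the reference data that make all three records resolve to the same person.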
Cheaper, faster and more accurate
The key here is literally a key – an ID. When using an external matching engine, we get a unique and persistent ID that's assigned to John. This same ID will be used for any instance of John, in any system.
Once we have the IDs in all systems, it's trivial to match them together. This, in turn, means that the master data project run by the consultants will be cheaper, faster and give more accurate results.
From a technical point of view, we can just call the same matching API with John's information and get the ID populated into a CRM, billing system, campaign management tool and marketing automation system. This way we can significantly cut down on the amount of infrastructure needed to create a master database.
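As a rough sketch of what this could look like, the Python below stamps an ID onto every record and then groups records by that ID. The `lookup_id` function is a hypothetical stand-in for the external matching API (a real one would be called over the network), and the system names, record layout and sample data are invented for the example.

```python
from collections import defaultdict

# Hypothetical stand-in for the external matching service: it maps a
# (name, identifying detail) pair to a persistent ID.
REFERENCE_KEYS = {
    ("john wilson", "555-0100"): "P-001",
    ("john wilson", "1 main street"): "P-001",
    ("john wilson", "9 old road"): "P-001",
    ("mary smith", "555-0177"): "P-002",
}


def lookup_id(name, detail):
    """Return the persistent ID for a name plus one identifying detail, or None."""
    return REFERENCE_KEYS.get((name.strip().lower(), detail.strip().lower()))


def stamp_ids(systems):
    """Store the persistent ID on every record, in every system, in place."""
    for records in systems.values():
        for record in records:
            record["person_id"] = lookup_id(record["name"], record["detail"])


def single_customer_view(systems):
    """Group records from all systems by persistent ID, skipping unmatched ones."""
    view = defaultdict(list)
    for system_name, records in systems.items():
        for record in records:
            if record.get("person_id") is not None:
                view[record["person_id"]].append((system_name, record["payload"]))
    return dict(view)


# Invented sample data for three separate systems.
systems = {
    "crm": [
        {"name": "John Wilson", "detail": "555-0100", "payload": "called customer care"},
    ],
    "billing": [
        {"name": "John Wilson", "detail": "1 Main Street", "payload": "invoice 2018-10"},
        {"name": "Mary Smith", "detail": "555-0177", "payload": "invoice 2018-11"},
    ],
    "campaigns": [
        {"name": "John Wilson", "detail": "9 Old Road", "payload": "opened autumn mailing"},
    ],
}

stamp_ids(systems)
view = single_customer_view(systems)
for system_name, payload in view["P-001"]:
    print(system_name, "->", payload)
```

Once the ID is stored next to each record, consolidation is a plain group-by on one column. No system needs to know how any other system spells John's name or which of his addresses it holds.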
From a GDPR perspective this is great. We can now give John all his data without risking either leaving something out or giving him someone else's data. If we want, we can poll the same database for updates to John's data, thereby also meeting the GDPR requirement that personal data be kept accurate and up to date.
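A data access request then becomes a simple filter on the ID. The sketch below is illustrative only: the `subject_access_export` and `unresolved_records` functions, the record layout and the sample data are assumptions, and it presumes the IDs have already been stored in each system.

```python
def subject_access_export(person_id, systems):
    """Collect every record stamped with person_id, grouped by system."""
    export = {}
    for system_name, records in systems.items():
        hits = [r for r in records if r.get("person_id") == person_id]
        if hits:
            export[system_name] = hits
    return export


def unresolved_records(systems):
    """List records that still lack an ID and so cannot be safely attributed to anyone."""
    return [
        (system_name, record)
        for system_name, records in systems.items()
        for record in records
        if record.get("person_id") is None
    ]


# Invented sample data with IDs already in place.
systems = {
    "crm": [
        {"person_id": "P-001", "data": "phone 555-0100"},
        {"person_id": "P-002", "data": "phone 555-0199"},
    ],
    "billing": [
        {"person_id": "P-001", "data": "1 Main Street"},
        {"person_id": None, "data": "J. Wilson, address unknown"},
    ],
}

export = subject_access_export("P-001", systems)
pending = unresolved_records(systems)
print(export)
print(pending)
```

The export contains only records carrying John's ID, so the other John Wilson's data stays out. Records without an ID are reported separately, because they are the ones that could still cause us to leave something out.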
In order to understand our customer data, we need to be able to connect data from tens or hundreds of systems across the enterprise. This is challenging.
To be able to piece all the data together, we need an ID that is both unique and persistent. This ID is easiest to generate through an external matching service based on a large referential database, because it can work as an authoritative data source of both up-to-date and historical data.
By storing the ID in all the different systems that contain consumer data, we can significantly decrease the technical complexity, and thus the cost, of data consolidation.
Data consolidation has both business and compliance benefits. We get an overview of the customer, which in turn increases our understanding of the business, increases ROI on marketing investments and enables us to provide better service. It also makes it easier, faster and cheaper to achieve GDPR compliance.