The constant evolution and growth of technology has meant that the volume of data has grown exponentially. How big data is managed is still an ongoing battle: there are technology solutions that can help businesses solve this challenge, but without good governance over data quality, data becomes meaningless and unactionable. Another way of looking at it: garbage in, garbage out.
I honestly believe that data should be a brand’s competitive advantage; the challenge is that data quality is so often an afterthought. If we go back 10 years, data was primarily used for dashboards. Now, with unlimited data available, the capabilities of machine learning and artificial intelligence require good data, in large volumes, to provide the best outputs.
A Gartner survey stated that poor data quality is financially hurting businesses to the value of $15m every year. To put that into context, brands in the UK are spending 26% of their budgets on martech. There is an imbalance: brands are willing to spend on technology but do not put in the time investment to ensure the high level of data quality needed to answer business questions. 60% to 80% of time is spent cleaning data rather than building models and developing insights, which is what will drive the most value for the business.
How a business manages its data quality is a good indicator of where it is in the data maturity lifecycle.
Data quality is not only a marketing data issue; it affects every data source a business has access to, from the call centre to CRM to created content. They all need to adhere to a consistent standard of data quality. And because how data works changes dynamically, managing data quality requires an equally dynamic approach.
When starting the process of getting to better data quality, there are six key steps:
Step 1: Understanding business objectives and requirements – Ensure data quality standards are defined against the business objectives, so that the business questions can be answered.
Step 2: Integration of data sources – Get access to all available data sources. Understand the structure of each one and how the different sources link together.
Step 3: Health of data sources – From the data available, what percentage is good, what percentage is missing and what percentage is bad? This will help you understand the complexity of the task and what rules will need to be put in place to ensure a high standard of data quality.
I would recommend running a regular data health check-up and plotting the results on a chart to measure progress.
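The health check in Step 3 can be sketched with pandas; the sample data, column names and validity rules below are hypothetical, chosen only to show the good/missing/bad split:

```python
import pandas as pd

# Hypothetical sample extract (values chosen to show each category)
df = pd.DataFrame({
    "email": ["a@example.com", None, "not-an-email", "b@example.com"],
    "age": [34, None, -5, 29],
})

# Simple validity rules per column; values failing a rule count as "bad"
rules = {
    "email": lambda s: s.str.contains("@", na=False),
    "age": lambda s: s.between(0, 120),
}

def health_report(df, rules):
    """Return good/missing/bad percentages for each column with a rule."""
    n = len(df)
    rows = []
    for col, rule in rules.items():
        missing = int(df[col].isna().sum())
        good = int(rule(df[col]).sum())
        bad = n - missing - good
        rows.append({"column": col,
                     "good_pct": 100 * good / n,
                     "missing_pct": 100 * missing / n,
                     "bad_pct": 100 * bad / n})
    return pd.DataFrame(rows)

print(health_report(df, rules))
```

Run on a schedule, the percentages from this report are exactly what you would plot over time to show progress.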
Step 4: Developing a schema – Once you have understood the different data sources available, create a schema for how the data should be processed.
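As a sketch of what such a schema could look like in Python with pandas — the target names, source-column mappings and dtypes here are assumptions for illustration, not a prescribed format:

```python
import pandas as pd

# Hypothetical schema: target column name -> source column and agreed dtype
schema = {
    "customer_id": {"source": "CustID", "dtype": "int64"},
    "email": {"source": "Email_Address", "dtype": "string"},
}

def apply_schema(raw, schema):
    """Rename source columns to the schema names and cast to the agreed dtypes."""
    renamed = raw.rename(columns={v["source"]: k for k, v in schema.items()})
    out = renamed[list(schema)].copy()
    for col, spec in schema.items():
        out[col] = out[col].astype(spec["dtype"])
    return out

# Illustrative raw extract using one source's naming conventions
raw = pd.DataFrame({"CustID": ["101", "102"],
                    "Email_Address": ["a@x.com", "b@x.com"]})
print(apply_schema(raw, schema))
```

Keeping the schema as data rather than code means every source can be mapped to the same target shape before cleaning and merging.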
Step 5: Clean and merge data sources – On a sample set of data, implement the changes from the schema created in Step 4 across all available data sources. This will provide a much richer data story than before.
Once the changes are consistently implemented, the business will be able to make decisions based on good, trustworthy data.
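A minimal clean-and-merge sketch in pandas, assuming two hypothetical extracts (a CRM file and a call-centre log) that share a `customer_id` key:

```python
import pandas as pd

# Hypothetical CRM and call-centre extracts sharing a customer_id key
crm = pd.DataFrame({"customer_id": [1, 2, 3],
                    "email": [" A@X.COM ", "b@x.com", None]})
calls = pd.DataFrame({"customer_id": [1, 2],
                      "last_call": ["2024-01-05", "2024-02-10"]})

# Clean: normalise emails and drop rows missing the field entirely
crm["email"] = crm["email"].str.strip().str.lower()
crm = crm.dropna(subset=["email"])

# Merge: a left join keeps every cleaned CRM record, enriched with call history
merged = crm.merge(calls, on="customer_id", how="left")
print(merged)
```

The joined table tells a richer story than either extract alone: each customer record now carries its call history, and records too broken to trust were removed before the join.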
Step 6: Create a process using Python to ensure that the health of the data is regularly monitored, and correlate the results with the impact on the business decisions being made.
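A minimal sketch of such a monitoring process, assuming a pandas DataFrame and a simple history that can be plotted as suggested in Step 3 (all names and data here are illustrative):

```python
import pandas as pd
from datetime import date

def record_health(df, history):
    """Append today's overall missing-data percentage so it can be charted over time."""
    missing_pct = round(float(df.isna().mean().mean()) * 100, 1)
    history.append({"date": date.today().isoformat(), "missing_pct": missing_pct})
    return history

# Illustrative run: in practice this would fire on a schedule, not ad hoc
history = []
df = pd.DataFrame({"email": ["a@x.com", None], "age": [30, 25]})
record_health(df, history)
print(history)
```

In practice the function would be triggered on a schedule (for example via cron or a pipeline orchestrator), and the accumulated history reviewed alongside the business decisions the data feeds.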
What is important to note is that this process is fundamental; without it, data quality will break down.
Data quality should be the cornerstone of the data strategy that executes the business strategy.