Data Quality Happyland: How to Get There

Originally published 20 April 2009

Many organizations suffer from poor data quality. Some know about their issues, but for many the full extent of the costs they incur due to data nonquality remains largely hidden.1 Since nobody is happy with poor quality, the ubiquitous question is, “How do I accomplish sustainable improvement in data quality?”

This question is challenging because every company needs different measures and a unique approach. One size does not fit all. Despite the many differences, there is nonetheless a discernible pattern that predictably leads to improvement. In this article, I’ll present a simple and familiar framework to guide your progress. Identifying where you are in this structure helps you determine where your highest leverage points for change are, as well as how you can make your improvement effort last.

A Familiar Framework: Awareness and Competence 

For this article, I will use a familiar framework to describe the stages that companies go through when they embark on data quality improvement efforts. It consists of a classic 2x2 matrix crossing awareness and competence.

Figure 1: Organizing for Data Quality

In Figure 1, there is a natural progression moving from the bottom left quadrant via the top left corner to the top right and then on to the bottom right. Along the way different efforts are required (see arrows 1, 2 and 3).

Inform

Many companies have data quality issues that they are largely unaware of. At the very least, most people within the organization are unfamiliar with these issues and with the (downstream) costs incurred as a result. The way to get “out” of this place is by informing people (arrow 1). Tell as many people as possible about the issues you face and, preferably, what the current status quo costs.2

Senior management, certainly, should be informed about your data quality problems. Translating the consequences of data nonquality into the one dimension that every manager in every industry understands, namely dollars, is a great way to draw their attention. Such an effort requires hard work, and you will have to make assumptions to finalize the cost calculations. Include the calculation model and the assumptions you made when you present the result. You may want to present the figure as a range to acknowledge the uncertainty in your calculations. But make sure “a” number gets calculated. Financials have a tendency to “stick,” to be remembered quite well.
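To illustrate what such a calculation model might look like, here is a minimal sketch in Python. The cost drivers, volumes and per-record costs are hypothetical placeholders, not figures from any real organization; the point is only that each assumption is written down explicitly and that the outcome is reported as a range.

```python
from dataclasses import dataclass


@dataclass
class CostAssumption:
    """One cost driver with a low and a high estimate (all figures hypothetical)."""
    name: str
    affected_records_per_year: int
    cost_per_record_low: float
    cost_per_record_high: float


def annual_cost_range(assumptions):
    """Sum the low and the high estimates over all cost drivers."""
    low = sum(a.affected_records_per_year * a.cost_per_record_low for a in assumptions)
    high = sum(a.affected_records_per_year * a.cost_per_record_high for a in assumptions)
    return low, high


# Hypothetical assumptions; replace these with your own cost drivers.
assumptions = [
    CostAssumption("Undeliverable mailings", 20_000, 0.50, 1.00),
    CostAssumption("Manual correction of order errors", 5_000, 4.00, 10.00),
]

low, high = annual_cost_range(assumptions)
print(f"Estimated annual cost of data nonquality: ${low:,.0f} to ${high:,.0f}")
```

Listing the assumptions as data rather than burying them in a spreadsheet formula makes it easy to show management exactly what the number rests on.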

Educate

The second transition you will want to go through is moving from awareness/lack of competence to awareness/competence (arrow 2 in Figure 1). The way you get there is through education. With this transition your goal is to train people in practices that will prevent poor data from entering your systems.

At the front end, this could be by training data entry staff in best practices, such as checking and rechecking manual entries. Maybe you want to reconfirm the importance and cost associated with poor quality (which was brought to their attention in phase one), particularly as a standard practice for new trainees. But it could also mean simply designing better user interfaces that enforce and facilitate higher quality data entry.
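As a rough illustration of an interface that enforces better data entry, the sketch below checks a record before it is accepted. The field names (`name`, `email`) and the deliberately simple email pattern are assumptions made for this example only.

```python
import re

# A deliberately loose pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

REQUIRED_FIELDS = ("name", "email")  # hypothetical required fields


def validate_entry(record):
    """Return a list of problems found in one data entry record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"{field} is missing")
    email = str(record.get("email") or "").strip()
    if email and not EMAIL_PATTERN.match(email):
        problems.append("email is not in a valid format")
    return problems


print(validate_entry({"name": "Ann", "email": "ann@example.com"}))
print(validate_entry({"name": " ", "email": "ann.example.com"}))
```

An entry screen could refuse to save a record until this list is empty, which gives the person entering the data immediate feedback instead of leaving the error for downstream users to discover.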

On the back end, it means showing how the data warehouse staff can track and establish data quality, preventing poor quality data from entering your data warehouse. For instance, this might be putting deduplication technology in place that leads to fewer duplicate records from the ETL process or reporting the number of errors occurring in your audit dimensions.3
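The sketch below shows the idea in its simplest form: a load step that drops duplicate records and keeps a count of rejected rows per reason, which could then be reported alongside the load. It uses exact matching on a normalized key, a much cruder rule than commercial deduplication technology, and the field names (`name`, `postal_code`) are assumptions made for this example.

```python
from collections import Counter


def normalize_key(record):
    """Build a matching key from name and postal code (a deliberately simple rule)."""
    name = " ".join(record["name"].lower().split())
    postal = record["postal_code"].replace(" ", "").upper()
    return (name, postal)


def load_with_audit(records):
    """Return the accepted records plus a count of rejected rows per reason."""
    seen = set()
    clean = []
    audit = Counter()
    for record in records:
        if not record.get("name") or not record.get("postal_code"):
            audit["missing_field"] += 1
            continue
        key = normalize_key(record)
        if key in seen:
            audit["duplicate"] += 1
            continue
        seen.add(key)
        clean.append(record)
    return clean, audit


# Hypothetical input rows.
rows = [
    {"name": "Jan  Jansen", "postal_code": "1234 ab"},
    {"name": "jan jansen", "postal_code": "1234AB"},
    {"name": "", "postal_code": "5678CD"},
]
clean, audit = load_with_audit(rows)
print(len(clean), dict(audit))
```

Tracking the audit counts per load makes it possible to see whether error rates fall over time as upstream practices improve.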

Rethinking Accountabilities

The third stage in this voyage is meant to solidify new working practices. This phase should ensure that the new practices become ingrained in the organization and that producing high quality data becomes the norm.4 You accomplish this by restructuring accountabilities (arrow 3 in Figure 1).

Issues you will be facing in this phase are things like organizational alignment and performance targets. If you were rewarding data entry staff for speed, you will now need to add performance objectives that also reward staff for making fewer errors. Otherwise, you put them in a lose/lose quandary when they are overloaded with work, and the only way to deliver quality would be by missing their productivity targets.

Misalignment between departments occurs when the people suffering from lack of quality data can’t influence resource allocation where (upstream) data quality should be produced. For instance, marketing often incurs the costs of sloppy data handling by data entry staff. The problem holder is the person “suffering” from poor data quality, in this case marketing. The problem owner controls the resources needed to resolve the problem, in this case the manager of data entry. Organizational alignment is the result of bringing problem holder and problem owner as close together as possible.  

The Journey Continues

After you have gone full circle, new and improved levels of data quality will have become the norm. You will have controls in place, and awareness about the importance of quality operations will continue to grow. What happens then, invariably, is that new and heretofore ignored areas will become the object of scrutiny. For any other process where the possibility exists to drive out nonquality, you will run through this loop again.

As the process of improving data quality becomes increasingly familiar, you may attempt to do several things in parallel. Be aware that each phase builds on the previous ones, so it is practically impossible to “skip” any steps.

Some companies may know they have costly data quality issues, and some may not. However, everybody prefers good quality data. Hard to argue with that. The question for many is how to reach their data quality happyland. Every company is different. To provide guidance on this journey, the framework I have provided helps you identify where you are and the commensurate steps to take. The order of these steps may not be set in stone, but the underlying dependencies help determine what to do and how to assess progress.

Organizations typically start with awareness creation, drawing attention to the data quality problems at hand. The next step is developing skills and competencies. This can mean training staff, ranging from front-line data entry to back-end data warehouse ETL specialists. But this phase also includes improving user interfaces to enable better data entry, or supporting data warehouse staff with specialist technology (data cleansing tools, etc.). Finally, to make change sustainable, the root causes of data quality problems need to be addressed. These typically lie in poorly aligned objectives.

The entire transformation path can be summarized as inform-educate-transform, where each phase builds upon the previous one. After data nonquality has been driven out of one process, the organization will learn to signal similar opportunities in other processes, and the quest for ensuring data quality and making it the default continues.

References:


1. Jack Olson, Data Quality – The Accuracy Dimension, 2003.
2. Larry English, Improving Data Warehouse and Business Information Quality, 1999.
3. Ralph Kimball & Joe Caserta, The Data Warehouse ETL Toolkit, 2004.
4. Philip Crosby, Quality is Free, 1980.

  • Tom Breur
    Tom Breur, Principal with XLNT Consulting, has a background in database management and market research. For the past 10 years, he has specialized in how companies can make better use of their data. He is an accomplished teacher at universities, MBA programs and for the Certified Business Intelligence Professional (CBIP) program. He is a regular keynoter at international conferences. Currently, he is a member of the editorial board of the Journal of Targeting, the Journal of Financial Services Management and Banking Review. He acts as an advisor for The Council of Financial Competition and the Business Banking Board and has been cited in, among others, Harvard Management Update on state-of-the-art data analytics. His company, XLNT Consulting, helps companies align their IT resources with corporate strategy, or in plain English, he helps companies make more money with their data. For more information you can email him at tombreur@xlntconsulting.com or call +31646346875.

     
