Most organizations have at least some level of discomfort with their data quality. It may be considered a “minor issue,” it may be considered a “serious risk,” and sometimes problems seem so intractable and dangerous that nobody even wants to think about them. Of course, sometimes management is completely oblivious to costs (and risks) associated with poor quality data.
Whenever costs of poor quality data are acknowledged, some drive emerges for betterment. But where to begin? And how? As the Chinese proverb says, even a journey of a thousand miles begins with a single step. But which one?
You need to balance short- and long-term data quality objectives. A short-term gain might be alleviating knowledge workers’ frustration with blatantly erroneous data, which can cause time-consuming lookups or failed communications with customers. A long-term objective might be improved interface design that error-proofs data entry.
And besides the need to balance short- and long-term improvement initiatives, there is a question about scope. Are you looking to cope with the current backlog of information needs, or should you (also) be working on tomorrow’s data infrastructure (Olson, 2003)?
Short-Term Versus Long-Term Improvement
Most data quality teams are all too familiar with omnipresent opportunities for data quality repair. Even if you apply the Pareto principle of attacking the most serious and expensive problems first, there always seem to be more areas in need of fixes. Sometimes, picking “low hanging fruit” may well be a cop-out to avoid dealing with structural process improvement.
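To make the Pareto approach concrete, here is a minimal sketch of how a team might shortlist the costliest problems first. The `Issue` class, the `pareto_shortlist` function, the 80% threshold, and all cost figures are hypothetical and serve only as illustration; real cost estimates would have to come from your own assessment.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    name: str
    annual_cost: float  # estimated yearly cost, all in the same currency


def pareto_shortlist(issues, threshold=0.8):
    """Return the costliest issues that together reach `threshold` of total cost."""
    ranked = sorted(issues, key=lambda i: i.annual_cost, reverse=True)
    total = sum(i.annual_cost for i in ranked)
    if total <= 0:
        return []
    shortlist, running = [], 0.0
    for issue in ranked:
        shortlist.append(issue)
        running += issue.annual_cost
        if running / total >= threshold:
            break
    return shortlist


# Hypothetical figures, for illustration only.
issues = [
    Issue("duplicate customer records", 55_000),
    Issue("invalid postal codes", 30_000),
    Issue("missing phone numbers", 8_000),
    Issue("inconsistent product codes", 4_000),
    Issue("free-text country names", 3_000),
]

shortlist = pareto_shortlist(issues)
for issue in shortlist:
    print(f"{issue.name}: {issue.annual_cost:,.0f}")
```

With these made-up numbers, two of the five issues account for 85% of the estimated cost. A ranking like this only tells you where repair pays off soonest; it says nothing about why the errors arise in the first place.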
It is very easy, if not downright seductive, to get bogged down in one cleanup after another because the results of these improvements are visible right away. It works, so why argue with success?
But there is also a problem with the “quick fix.” What English (2009) refers to as “inspect and correct” is really the knowledge workers’ equivalent of scrap and rework in manufacturing. Someone upstream has produced dirty data that you’re now obliged to fix. Apparently there was either no time, insufficient skills, or no motivation to “get it right” the first time around. Consequently, you’re redoing the same work. Clearly this way of working adds only cost, and no value, to your primary business process.
Unfortunately this is a zero sum game: all time and effort spent on the “quick fix” cannot be spent working on structural process improvement. That’s why you should always be wary of cleaning up at a downstream point in your data production process. Correcting data errors only makes sense once you have taken preventive measures to keep new errors from entering your systems.
However, when you only take long-term measures, it might take a (very) long time for the erroneous data to go stale to the point where they no longer disrupt your (primary) business processes. And all (historic) reporting from these faulty data will continue to be off, and lead to poor quality decisions. When this poses unacceptable problems, a wholesale cleanup (for instance, data entry by manual reconfirmation) might be the only way to go.
Micro Versus Macro Scope of Betterment
There always seem to be abundant data improvement and integration projects waiting. As soon as business intelligence (BI) starts generating value for the business, this stock of outstanding requests won’t cease. Somehow, answering questions invariably leads to more and new questions being asked. It’s like a law of nature.
What’s the problem, you may wonder? Just get to work and manage the backlog of outstanding requests as best you can. But there is a problem with an exclusive focus on (current) business requests. Data play a crucial role in enabling knowledge workers. Data are also crucial for future strategic purposes and for the business to stay competitive. It’s this dual role that you need to be aware of. It’s this same dual role that threatens a balanced view on priorities. Staying competitive means working toward tomorrow’s data infrastructure as well as supporting today’s needs. BI professionals need to shed light on how the current infrastructure should evolve in order to meet tomorrow’s needs. And what kind of investments are required to evolve and “future proof” your BI environment? This refers to skill sets in your BI team, as well as capital expenditures.
When your databases grow in size, and their scope expands, they require refactoring to ensure you stay nimble. This preventive maintenance, if you will, is essential to keep your data warehouse from becoming “new” legacy. It’s exceedingly difficult to explain and justify these investments in light of future costs of change. But you need to do it anyway. Informing and educating stakeholders is an ongoing effort to ensure business alignment.
Achieving both micro and macro improvement of your BI efforts is a matter of strategic alignment. And again, this is a zero sum game: You can only invest your resources once. And again, it may be tempting to go all out after outstanding business requests. Doing anything less requires managing expectations with tremendous care.
Typically BI departments don’t get involved in strategy making. They often enable by providing numbers, though. Yet ensuring strategic alignment is everyone’s business. And who can envision the roadmap to tomorrow’s infrastructure better than we? The future may demand integration of completely new data streams, and maybe additional external data.
Also, there seems to be a universal push to lower data latency. But not everybody can dream up the kind of architecture that will support near real-time data warehousing. This often places additional demands on your data models, network bandwidth, and hardware. Striking an economic balance between those three is hard. And putting it into a robust and agile framework is even harder. For many, if not most, companies, substantially driving down data latency means replacing their data warehouse architecture altogether.
In many companies, BI is positioned as an enabler that acts as custodian for data resources in the organization. There is nothing wrong with that role. What can go wrong, however, is that BI expertise is insufficiently called for when making tactical or strategic tradeoffs.
When the BI team spends increasing amounts of time cleaning up data (“inspect and correct”), they have a responsibility to inform the business about inefficiencies that occur when you’re correcting downstream data without putting commensurate preventive measures in place upstream. We may be working on this data ecosystem, but we are also part of it. So we need to act on our responsibility when imbalances occur.
Many BI teams are frantically attempting to manage their backlog of projects. When your BI environment gets inundated with change requests, this doesn’t necessarily mean you’ve done a poor job specifying requirements. More often, it is a symptom of success: answering important questions leads to more and new questions being raised.
However, by the time your pursuit of the backlog of user requests gets in the way of future proofing your data warehouse, there could be considerable risk associated with being “successful.” While you are working to support current business cases for BI, there is a market “out there” with potential competitors who aren’t burdened by the status quo. Preparing for new and innovative competitors is likely to place additional demands on BI.
The whole purpose of exposing this conundrum is to bring such tradeoffs to the conscious level. And when long-term investments are deferred, then at least that decision is made in full sight of both internal and external considerations. You can always come back to those questions in light of current and potential future competitors, because the best companies never cease to raise the bar to stay ahead of the game.
References:
Jack Olson (2003), Data Quality – the Accuracy Dimension. ISBN# 1558608915
Larry English (2009), Information Quality Applied: Best Practices for Improving Business Information, Processes and Systems. ISBN# 047013477X
SOURCE: Data Quality: Balancing Today’s Needs With Tomorrow’s Demands
Recent articles by Tom Breur