Blog: Tom Breur

Tom Breur

Welcome to my blog!

About the author

Tom Breur, Principal with XLNT Consulting, has a background in database management and market research. For the past 10 years, he has specialized in how companies can make better use of their data. He is an accomplished teacher at universities, in MBA programs and for the Certified Business Intelligence Professional (CBIP) program. He is a regular keynoter at international conferences. Currently, he is a member of the editorial board of the Journal of Targeting, the Journal of Financial Services Management and Banking Review. He acts as an advisor for The Council of Financial Competition and the Business Banking Board and was cited in, among others, Harvard Management Update about state-of-the-art data analytics. His company, XLNT Consulting, helps companies align their IT resources with corporate strategy, or in plain English, he helps companies make more money with their data. For more information you can call +31646346875.


March 2009 Archives

Data quality as a formal, separate discipline is now about 10 years old. Classic books like Redman's (1997) "Data Quality for the Information Age" and English's (1999) "Improving Data Warehouse and Business Information Quality" have opened up discussions in many companies and settings about whether data quality merits special and separate attention. Since then, a dozen or so books have been written specifically about data quality. One of the main problems, though, seems to be that few people agree on what data quality really is. Depending on whom you ask, you are likely to get a wide variety of answers and definitions.

When you ask an ETL programmer what data quality is, he will point to the number of conflicts in your audit dimensions when he is merging disparate data sources. Front-end BI tool users will refer to the number of fields available, and how richly they qualify and describe some unit of interest, say a customer or order or shipment. Other analysts will refer to the predictive power that attributes hold for producing models with great lift. And still others will talk about sparsely populated fields with too many n/a or missing values.
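To make the contrast concrete, here is a minimal Python sketch of two of these viewpoints: the share of missing values in a field, and the number of conflicts when two sources are merged. Everything in it is an assumption made up for illustration: the function names `missing_rate` and `count_conflicts`, the sample `crm` and `billing` records, and the choice of which values count as "missing". It is not a standard definition of either measure.

```python
from typing import Any

# Assumption: these placeholder values all count as "missing".
MISSING = (None, "", "n/a")


def missing_rate(records: list[dict[str, Any]], field: str) -> float:
    """Fraction of records whose value for `field` is missing."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in MISSING)
    return missing / len(records)


def count_conflicts(
    source_a: list[dict[str, Any]],
    source_b: list[dict[str, Any]],
    key: str,
    field: str,
) -> int:
    """Number of keys present in both sources whose `field` values disagree."""
    b_by_key = {r[key]: r for r in source_b}
    conflicts = 0
    for r in source_a:
        other = b_by_key.get(r[key])
        if other is not None and r.get(field) != other.get(field):
            conflicts += 1
    return conflicts


# Hypothetical sample data: the same customers seen by two systems.
crm = [
    {"id": 1, "city": "Utrecht"},
    {"id": 2, "city": "n/a"},
    {"id": 3, "city": None},
    {"id": 4, "city": "Delft"},
]
billing = [
    {"id": 1, "city": "Utrecht"},
    {"id": 2, "city": "Leiden"},
    {"id": 4, "city": "Den Haag"},
]

print(missing_rate(crm, "city"))                     # sparsity view
print(count_conflicts(crm, billing, "id", "city"))   # ETL / merge view
```

Note that the same data set scores differently depending on which question you ask: customer 2 is a "missing value" in one view and a "conflict" in the other, which is exactly the kind of ambiguity described above.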

What appears to be missing at this stage of maturity is an overarching framework, a set of adjectives, to specify what aspect of data quality we are talking about, and where in the BI cycle it pertains. Until we resolve this confusion, most conversations about data quality are likely to remain, well, low quality...

Posted March 8, 2009 10:12 AM