Blog: Jill Dyché

Jill Dyché

There you are! What took you so long? This is my blog and it's about YOU.

Yes, you. Or at least it's about your company. Or people you work with in your company. Or people at other companies that are a lot like you. Or people at other companies that you'd rather not resemble at all. Or it's about your competitors and what they're doing, and whether you're doing it better. You get the idea. There's a swarm of swamis, shrinks, and gurus out there already, but I'm just a consultant who works with lots of clients, and the dirty little secret - shhh! - is my clients share a lot of the same challenges around data management, data governance, and data integration. Many of their stories are universal, and that's where you come in.

I'm hoping you'll pour a cup of tea (if this were another Web site, it would be a tumbler of single-malt, but never mind), open the blog, read a little bit and go, "Jeez, that sounds just like me." Or not. Either way, welcome on in. It really is all about you.

About the author

Jill is a partner and co-founder of Baseline Consulting, a technology and management consulting firm specializing in data integration and business analytics. Jill is the author of three acclaimed business books, the latest of which is Customer Data Integration: Reaching a Single Version of the Truth, co-authored with Evan Levy. Her blog, Inside the Biz, focuses on the business value of IT.

Editor's Note: More articles and resources are available in Jill's BeyeNETWORK Expert Channel. Be sure to visit today!


By Carol Newcomb, Senior Consultant

Minding Your Metadata

The second part of my summertime primer addresses ‘Minding your Metadata’. I can just hear the collective groans and yawns now. Sorry, but metadata collection is one of those necessary evils that may not be fun in the doing, but having it available as a resource to understand your data and use it appropriately is invaluable. And you just might find some interesting surprises along the way!



Metadata: What Is It & Why Do I Need It?

As you start your Root Cause Analysis (see last week’s primer), you first need to examine existing data definitions (or lack thereof). Metadata is the foundation of good data management and forms the basis for Data Governance. Pardon me for stating the obvious, but metadata is fundamental to resolving data issues, and it is the first place to look when investigating data quality problems.

Metadata is “data about data”. Plain and simple. It includes descriptive information about electronic data used in common daily business practice. Metadata includes items usually found in a data dictionary: field name, field length, retention rules, and security access, as well as additional descriptive information that may include data origin (source or system), creation/entry date, method of creation (key-entry or the result of a calculation), purpose of the data (its intended use), how frequently it gets updated or refreshed, and current location in a database (table, view, schema). If a data element is the result of calculation logic or groupings (such as age categories), the business rules used to generate the resulting data values should be collected as part of the metadata.
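To make that list a little more concrete, here is a minimal Python sketch of what a single data dictionary entry might look like. Everything in it is an assumption for illustration: the class name `DataElementMetadata`, the field names, the sample values, and especially the age cut-offs are invented, not taken from any real system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataElementMetadata:
    """One data dictionary entry; the field names are illustrative only."""
    field_name: str
    field_length: int
    source_system: str        # data origin (source or system)
    creation_method: str      # "key-entry" or "calculated"
    purpose: str              # intended use of the data
    refresh_frequency: str    # how often it is updated or refreshed
    location: str             # table, view, or schema
    business_rule: Optional[str] = None  # logic behind calculated values


def age_category(age: int) -> str:
    """Hypothetical grouping rule; the cut-offs are invented for this example."""
    if age < 18:
        return "0-17"
    if age < 65:
        return "18-64"
    return "65+"


# The rule that produces the values is recorded alongside the element itself.
age_group = DataElementMetadata(
    field_name="AGE_GROUP",
    field_length=5,
    source_system="patient_registration",
    creation_method="calculated",
    purpose="Demographic reporting",
    refresh_frequency="nightly",
    location="reporting.patient_dim",
    business_rule="0-17 if age < 18; 18-64 if age < 65; otherwise 65+",
)

print(age_group.field_name, "->", age_group.business_rule)
print(age_category(42))
```

The point of the sketch is the last field: because the grouping logic is written down next to the element, nobody has to reverse-engineer what “18-64” means.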

A good example of metadata that you may use every day would be ‘document properties’ in a Word document. This feature captures the original document creation date, the most recent access and update times, the document creator, and counts of characters, words, and pages. If the document should be private, this will be indicated in its properties. You may also tag the document with keywords to make it easier for you or others to find.
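If you would like to see this kind of metadata for yourself, the short Python sketch below reads a few properties that the file system keeps for any file. Note the assumption: it shows file-system properties (size, last modified, last accessed), not the properties Word stores inside the document, and the helper name `file_properties` is made up for this example.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_properties(path: str) -> dict:
    """Return a few file-system properties for the file at `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "last_accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


# Write a small throwaway file so the example runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("Minding your metadata")
    sample_path = handle.name

props = file_properties(sample_path)
print(props)

# Clean up the throwaway file.
os.remove(sample_path)
```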

A few of the benefits of Metadata Management include:


  • Clarify rules for data entry

  • Reduce ambiguity around appropriate use of data elements

  • Eliminate problems associated with not having data definitions, business rules or transformation logic available

  • Validate legitimate values at the data element level

  • Provide evidence to regulators that security and confidentiality are protected

  • Centralize the storage and accessibility of metadata for end-users

  • Reduce the amount of effort required to research data results.
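
As one illustration of the “validate legitimate values” benefit above, here is a rough Python sketch that checks a record against the domain of valid values recorded for each field. The field names, the allowed values, and the function `invalid_fields` are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical domains of valid values, as a metadata repository might hold them.
VALID_VALUES: Dict[str, Set[str]] = {
    "gender_code": {"F", "M", "U"},
    "age_group": {"0-17", "18-64", "65+"},
}


def invalid_fields(record: Dict[str, str]) -> List[str]:
    """Return the names of fields whose value is outside its documented domain."""
    return [
        field
        for field, value in record.items()
        if field in VALID_VALUES and value not in VALID_VALUES[field]
    ]


print(invalid_fields({"gender_code": "F", "age_group": "18-64"}))
print(invalid_fields({"gender_code": "X", "age_group": "18-64"}))
```

Fields with no documented domain are simply skipped here. Whether an undocumented field should pass or fail is itself a governance decision.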


A Metadata Management Repository is a central location or system to collect and store metadata that may exist in disparate parts of the organization (data dictionaries, systems, spreadsheets, or people’s brains). The repository stores detailed definitions centrally on a network where other users can find them.
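
For readers who think in code, a toy version of the idea might look like the Python sketch below. It is only a sketch under stated assumptions: the class `MetadataRepository`, its `register` and `lookup` methods, and the sample entries are invented here, and a real repository would be a shared system rather than an in-memory dictionary.

```python
from typing import Dict, Optional


class MetadataRepository:
    """Toy in-memory repository keyed by data element name."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, str]] = {}

    def register(self, element: str, origin: str, attributes: Dict[str, str]) -> None:
        """Merge attributes for an element and note where each one came from."""
        entry = self._entries.setdefault(element, {})
        for key, value in attributes.items():
            entry[key] = value
            entry[f"{key}_origin"] = origin

    def lookup(self, element: str) -> Optional[Dict[str, str]]:
        """Return a copy of everything known about an element, or None."""
        entry = self._entries.get(element)
        return dict(entry) if entry is not None else None


repo = MetadataRepository()
# Two disparate places contribute facts about the same element.
repo.register("customer_id", "data dictionary", {"data_type": "CHAR(10)"})
repo.register(
    "customer_id",
    "finance spreadsheet",
    {"definition": "Unique ID assigned at account opening"},
)
print(repo.lookup("customer_id"))
```

Recording the origin of each attribute keeps a trail back to the dictionary, spreadsheet, or person that supplied it.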

There are three general types of metadata that should be included in this repository:

Business Metadata: Business metadata attributes facilitate identification, understanding, and appropriate use of existing data elements. These include clear business names and descriptions, relevant business rules, descriptions of the data sources, security and privacy rules, etc.

Technical Metadata: Describes the technical attributes of data such as physical location (host server, database server, schema, etc.), data types, any transformations applied, the domain of valid values, relationships to other data elements, precision, and lineage. Technical metadata is used by business users and by IT staff to design efficient databases, queries, and applications, and to reduce duplication of data.

Operational Metadata: Describes the attributes of routine operations on data and related statistics. These include job schedules and descriptions, data movement and transformation processes, data read, update and performance statistics, volume statistics, and backup and archival information. Operational metadata is used by operations staff and DBAs to tune the system and ensure its continued efficient operation. It is also used by business users to track such events as “last use” of a field and “last load” of a data element.

Exciting stuff, huh? Well, the whole point of metadata is to have the information about data available to a multitude of users when they need it, to keep it current, and to avoid confusion around usage. So if you appreciate having a clean bathroom, and knowing where you keep your antiperspirant, you will also appreciate having good metadata! The time for spring cleaning is well overdue.

Carol Newcomb is a Senior Consultant with Baseline Consulting. She specializes in developing BI and data governance programs to drive competitive advantage and fact-based decision making. Carol has consulted for a variety of health care organizations, including Rush Health Associates, Kaiser Permanente, OSF Healthcare, the Blue Cross Blue Shield Association and more. While working at the Joint Commission and Northwestern Memorial Hospital, she designed and conducted scientific research projects and contributed to statistical analyses.



Posted June 17, 2010 6:00 AM