Several years ago a book was published – DW 2.0: The Architecture for the Next Generation of Data Warehousing. The book included several predictions for the future. One prediction was for the emergence of a large, inexpensive storage platform. Today we have Hadoop. Another prediction was for the recognition that textual data should become a part of data warehousing, and today we have textual ETL. Yet another prediction was for the emergence of enterprise-wide data, such as metadata for the enterprise. That final aspect of DW 2.0 has not yet emerged.
That raises an interesting question: What is the problem with enterprise-wide data, especially enterprise-wide metadata? There are several systemic challenges to the emergence of enterprise-wide metadata. These challenges include:
- It is not in any given vendor’s best interest to make its metadata available to anyone else. It doesn’t matter which vendor you pick. Why doesn’t Oracle make it easy for IBM and Microsoft to access its metadata? Why doesn’t IBM open its doors to Oracle and Microsoft? Why doesn’t Microsoft open its doors to IBM and Oracle? It simply is not in the long-term best interest of any of those companies to open up their metadata to their competitors. Yet when you look across the enterprise of most corporations, you find, in one form or another, technology from all of these vendors. Thus, what today’s corporations need is a look at ALL of the metadata across these vendors, but the vendors, for good reason, are not forthcoming with what is needed. The first obstacle to enterprise metadata, then, is a simple business barrier that is very real and very predictable.
- Enterprise-wide metadata covers not just current metadata, but also older legacy metadata. Some of that metadata is stored in technologies from companies that don’t even exist anymore. Finding and understanding the patchwork of enterprise-wide metadata is a random walk. The world would be a simple place if all enterprise technology were from IBM, Microsoft and Oracle. However, the reality is that the world of enterprise metadata is made up of MANY other technologies, some of which are viable and active and some of which are not.
- Lack of uniformity is another challenge. Metadata across the enterprise is about as uniform as automobiles. There are old cars and new cars. There are sports cars and there are SUVs. There are jeeps and bandwagons. There are trucks, and there are motorcycles. There are red cars and yellow cars. There are convertibles, and there are coupes. There are family cars, and there are sidecars. In short, there is an amazing panoply of vehicles in the world. There really is not much uniformity at all when it comes to the vehicles we use for transportation. Similarly, there is not much uniformity for enterprise-wide metadata. Every vendor has its own interpretation of what metadata is appropriate, and every vendor is different from every other vendor. There simply is nothing like a universal, agreed-upon standard for metadata. When you look across the enterprise, the rainbow of the different types of metadata becomes apparent.
- Metadata is text. Regardless of how metadata is defined, it always comes out as text. And – prior to textual ETL – the world has had a hard time digesting text. Text is non-repetitive. Text needs context in order to be understood. Text comes in different languages. Text can be informal and often comes in shorthand. In short, anything that comes in the form of text is trouble. Text requires definition in order to be useful. Consequently, it’s no surprise that enterprise-wide metadata is a mystery. It is wrapped up in the form of text.
There probably are many other reasons why enterprise-wide metadata is troublesome (as if these were not enough!).
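To make the uniformity problem concrete, here is a minimal sketch of what reconciling metadata from two sources might involve. Everything in it is an assumption made for illustration: the source names `vendor_a` and `vendor_b`, their field names, and the `ColumnMetadata` shape are hypothetical and do not describe any real vendor's catalog or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """A hypothetical common shape for column-level metadata."""
    source: str
    table: str
    column: str
    data_type: str


# Hypothetical field names: each source describes the same facts differently.
FIELD_MAPS = {
    "vendor_a": {"table": "TABLE_NAME", "column": "COLUMN_NAME", "data_type": "DATA_TYPE"},
    "vendor_b": {"table": "tbl", "column": "col", "data_type": "type"},
}


def normalize(source: str, record: dict) -> ColumnMetadata:
    """Map one source-specific record onto the common shape."""
    field_map = FIELD_MAPS[source]  # raises KeyError for an unknown source

    def clean(key: str) -> str:
        # Metadata arrives as text, so trim and lowercase it before comparing.
        return str(record[field_map[key]]).strip().lower()

    return ColumnMetadata(source, clean("table"), clean("column"), clean("data_type"))


a = normalize("vendor_a", {"TABLE_NAME": "CUSTOMER ", "COLUMN_NAME": "CUST_ID", "DATA_TYPE": "NUMBER"})
b = normalize("vendor_b", {"tbl": "customer", "col": "cust_id", "type": "integer"})

# The names now line up, but the data types still do not.
print(a.table == b.table, a.column == b.column, a.data_type == b.data_type)
```

Even in this toy case, renaming fields and cleaning up text only goes so far. Whether `number` and `integer` mean the same thing is a question the mapping cannot answer by itself, and every additional source or legacy system needs its own entry.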
Having stated that, there are some very interesting developments on the horizon that hold promise. The hard nut of enterprise-wide metadata may finally be cracked. When it is, the vision of DW 2.0 will, at last, become a real possibility.
SOURCE: Data Warehousing and Metadata: The Enterprise Perspective
Recent articles by Bill Inmon