Blog: Ronald Damhof

Ronald Damhof

I have been a BI/DW practitioner for more than 15 years. In the last few years, I have become increasingly annoyed - even frustrated - by the lack of (scientific) rigor in the field of data warehousing and business intelligence. It is not uncommon for the knowledge worker to be disillusioned by the promise of business intelligence and data warehousing because vendors and consulting organizations create their "own" frameworks, definitions, super-duper tools etc.

What the field needs is more connectedness (grounding and objectivity) to the scientific community. The scientific community needs to realize the importance of increasing their level of relevance to the practice of technology.

For the next few years, I have decided to attempt to build a solid bridge between science and technology practitioners. As a dissertation student at the University of Groningen in the Netherlands, I hope to discover ways to accomplish this. With this blog I hope to share some of the things I learn in my search and begin discussions on this topic within the international community.

Your feedback is important to me. Please let me know what you think. My email address is Ronald.damhof@prudenza.nl.

About the author >

Ronald Damhof is an information management practitioner with more than 15 years of international experience in the field.

His areas of focus include:

  1. Data management, including data quality, data governance and data warehousing;
  2. Enterprise architectural principles;
  3. Exploiting data to its maximum potential for decision support.

Ronald is an Information Quality Certified Professional (International Association for Information and Data Quality; one of the first 20 to pass this prestigious exam), a Certified Data Vault Grandmaster (the only person in the world to have this level of certification), and a Certified Scrum Master. He is a strong advocate of agile and lean principles and practices (e.g., Scrum). You can reach him at +31 6 269 671 84, through his website at http://www.prudenza.nl/ or via email at ronald.damhof@prudenza.nl.

It is only by means of good and respectful discussion that knowledge
and insight will evolve. This post should be regarded as such.

This post is a third reaction to the first article in a series of three
written by a highly respected thought leader in the field
and publisher on the B-Eye-Network: Rick van der Lans. The papers are
titled 'The Flaws of the Classic Data Warehouse Architecture'.

This blog post is a reaction to the first part. It deals with the flaws of the classic data warehouse architecture (CDWA).

Rick identifies five flaws, which lead in articles two and three to a new architecture. This post addresses the third flaw.

- My reaction to flaw #1 can be read here
- My reaction to flaw #2 can be read here

Flaw 3 according to Rick
Rick signals the need to do analytics on external data and on unstructured data stores. I quote Rick: 'Most vendors and analysts propose to handle these two forms of data as follows: if you want to analyze external or unstructured data, copy it into the CDW to make it available for analytics and reporting'. Rick wonders why. Unstructured data can be 'handled' at the source, and external data can be handled by mashup tools.

My reaction to flaw 3
Where are these vendors and analysts that propose to copy unstructured data into the CDW? I do not know them... really I don't. And if they exist - I agree with Rick; don't do it. Especially for unstructured data, I think other architectural choices are better at the moment. But where is the flaw in the CDWA? The CDWA was not meant for unstructured data and still is not. I still do not see the flaw...

But for external data, I really believe that in the years to come there is still a solid business case for getting this data into your data warehouse. Sure enough - especially for more situational BI - mashups offer a very fast time to market for new informational products. Still, I believe the security issue is not to be underestimated, nor is the need to perform analytics on combinations of internal data and - multiple sources of - external data.

Mashups also need solid architectures.......

So:
- I challenge the notion that the vendors and analysts in data warehousing massively propose to put unstructured data in the DWH. The CDWA was not meant for that purpose. Not a flaw.
- There are solid business cases for getting your external data into your DWH. The CDWA is still a valid approach. Not a flaw.
- Mashups - as does new (BI) technology in general - surely offer new features and promising functionality.


Posted June 22, 2009 5:03 AM