Blog: David Loshin
http://beyenetwork.be/blogs/loshin/

Welcome to my BeyeNETWORK Blog. This is going to be the place for us to exchange thoughts, ideas and opinions on all aspects of the information quality and data integration world. I intend this to be a forum for discussing changes in the industry, as well as how external forces influence the way we treat our information asset. The value of the blog will be greatly enhanced by your participation! I intend to introduce controversial topics here, and I fully expect that reader input will "spice it up." Here we will share ideas, vendor and client updates, problems, questions and, most importantly, your reactions. So keep coming back each week to see what is new on our Blog!

Copyright 2011
Thu, 20 Jan 2011 11:24:38 -0700

Meet Me for Breakfast, Data Quality, and MDM - 3 Upcoming Events

Data quality and MDM tool company Ataccama has invited me to be the guest speaker at a series of breakfast seminar events in early March at the following locations:

March 1 Bridgewater, NJ

March 2 Chicago, IL

March 3 Charlotte, NC

The topic is "Strategic Business Value from your Enterprise Data," and I will be discussing aspects of business value drivers for Data Quality and MDM. I believe that attendees will also get a copy of my book "Master Data Management."

I participated in a few similar events at the end of 2010 and found that some of the attendees posed some extremely interesting challenges, and I hope to share some new insights at these upcoming events!

http://www.beyenetwork.be/blogs/loshin/archives/2011/01/meet_me_for_bre.php Thu, 20 Jan 2011 11:24:38 -0700
Webinar: Fundamental Techniques To Maximize the Value of Your Enterprise Data

I will be presenting at a webinar hosted by Talend on December 2 at 2:00PM EDT, 11:00AM PDT on Fundamental Techniques to Maximize the Value of Your Enterprise Data. In this presentation I will discuss the convergence of the value of three interconnected techniques: master data management, data integration, and data quality. As data repurposing grows, so do the challenges in centralizing semantics, and we will look at some common challenges. Join me on Dec 2!

http://www.beyenetwork.be/blogs/loshin/archives/2010/11/webinar_fundame.php Mon, 29 Nov 2010 12:56:03 -0700
The Practitioner's Guide to Data Quality Improvement

Just published! My new book on data quality improvement, called The Practitioner's Guide to Data Quality Improvement, was released a few weeks ago and is now available. The book provides practical information about the business impacts of poor data quality and offers pragmatic suggestions on building your data quality roadmap, assessing data quality, and adapting data quality tools and technology to improve profitability, reduce organizational risk, increase productivity, and enhance overall trust in enterprise data.

I have an accompanying web site for the book at www.dataqualitybook.com. At that site I am posting my ongoing thoughts about data quality (and other topics!) and you can download a free sample chapter on data quality maturity!

Please visit the site, check out the chapter, and let me know your thoughts by email: loshin@knowledge-integrity.com.

http://www.beyenetwork.be/blogs/loshin/archives/2010/11/the_practitione.php Wed, 10 Nov 2010 13:45:15 -0700
Geographic Data Services

If you have read my articles and blog entries over the years, you may know that I have a real fondness for geographic-based data analysis. I have loved maps since I was a kid (when I was in elementary school I used to visit all the local gas stations, back when they handed out road maps for free). Today, with the ubiquity of handheld GPS systems, location-based services are rapidly becoming a critical component of any enterprise information management program.

I just finished a paper on location-based services and am doing a webinar on it this Thursday. Register at http://bit.ly/cO82dv. I am looking forward to seeing you at the webinar!

http://www.beyenetwork.be/blogs/loshin/archives/2010/10/geographic_data.php Tue, 26 Oct 2010 03:37:43 -0700
Event: Breakfast Seminar in Boston 10/28

I will be the guest speaker at an executive breakfast seminar in Boston on October 28th to discuss the critical link between Data Quality, MDM and Data Governance. If you are interested in attending, please register through http://bit.ly/cM5q7k - looking forward to seeing you there!

http://www.beyenetwork.be/blogs/loshin/archives/2010/10/event_breakfast.php Fri, 15 Oct 2010 08:56:21 -0700
Data Integration, 80%, and Webinar This Thursday

They say that data integration accounts for 80% of the effort of a data warehousing project (or of a variety of other enterprise applications). But who are "they"? I know that the figure is often presented as the typical resource and time investment for data integration activities, but I have not tracked down a source for it. I seem to recall seeing it in some data warehousing book, but do not remember which one.

Nonetheless, there is no reason for data integration to consume that amount of effort if the right steps are taken ahead of time to reduce the confusion and complexity of ambiguous semantics and structure. I will discuss these issues at a webinar this Thursday, August 12 - hope you can make it!

http://www.beyenetwork.be/blogs/loshin/archives/2010/08/data_integratio.php Tue, 10 Aug 2010 06:05:38 -0700
Data Governance: "In Action" or "Inaction"?

In the past week I have been sent email from two different organizations offering me information about data governance, and in both cases there seems to be minimal dog food self-ingestion.

The first example is actually the last few of a string of emails that I have received over a nine-month period, each of which is addressed to "Jack." In the past month I have gotten six emails about a webinar on data governance. I responded to the sender three times. The first time I asked whether data quality was part of their talk on data governance, perhaps a tongue-in-cheek way of hoping that they'd notice that my email name ("David Loshin") and their salutation name did not match. No response from them. When I got the next one addressed to Jack, I emailed back saying that my name was not Jack. No response. I responded to the last email I got from them with a simpler question: Does anyone actually respond to emails sent to that email address? Apparently not. They don't know Jack ;-).

The second example is perhaps funnier. The email I received concerned a new white paper including material on "the inability to use information for strategic business advantage" and recognizing data as an asset to "improve customer experiences." Its salutation was "{FIRST_NAME}," which is perhaps a little more correct (I do indeed have a first name even if I don't typically use it), although equally indicative of an absence of oversight on the process of producing the information end-product (i.e., the emails).
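Both slips are the kind of thing an automated pre-send check could catch. The following is only a minimal Python sketch: the `{UPPER_CASE}` placeholder pattern is inferred from the one salutation quoted above, and the function names are hypothetical rather than taken from any real mailing tool.

```python
import re

# Assumption: merge fields look like {FIRST_NAME}, i.e. upper-case letters
# and underscores inside braces, as in the email quoted above.
PLACEHOLDER = re.compile(r"\{[A-Z_]+\}")


def unresolved_placeholders(message: str) -> list[str]:
    """Return every merge field that was never filled in."""
    return PLACEHOLDER.findall(message)


def salutation_matches(salutation_name: str, recipient_name: str) -> bool:
    """Check that the greeting uses one of the recipient's own name tokens."""
    tokens = {token.lower() for token in recipient_name.split()}
    return salutation_name.strip().lower() in tokens


draft = "Dear {FIRST_NAME}, please download our new white paper."
print(unresolved_placeholders(draft))              # fields still unfilled
print(salutation_matches("Jack", "David Loshin"))  # greeting vs. recipient
```

A check like this would only be one control point; someone still has to own the process and act on what it flags.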

http://www.beyenetwork.be/blogs/loshin/archives/2010/07/data_governance_1.php Wed, 28 Jul 2010 11:00:39 -0700
Thoughts on Parsing and Standardization, and Upcoming Webinar

Last week I had an interesting discussion regarding technical aspects of data cleansing, particularly in the context of acquired data. The challenge posed was that the organization needed to collect data sets from numerous sources with no ability to introduce any types of data controls or data validations. In other words, the data they got was what it was, and if they wanted to use it, they'd have to clean it up themselves.

So the discussion led to talk about tools for cleansing, and I mentioned that most products today provide some means of parsing and standardization as a prelude to entity resolution, matching, and consolidation. In fact, I will be continuing this discussion at a web seminar next week on Parsing and Standardization, and I hope you can attend!
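To make "parsing and standardization as a prelude to matching" concrete, here is a minimal Python sketch. It is not how any particular product works: the `SUFFIX_MAP` table and the sample addresses are hypothetical, and real tools ship far larger rule sets.

```python
# Hypothetical lookup table mapping street-type variants to one standard form.
SUFFIX_MAP = {
    "st": "STREET",
    "street": "STREET",
    "ave": "AVENUE",
    "avenue": "AVENUE",
    "rd": "ROAD",
    "road": "ROAD",
}


def standardize_address(raw: str) -> str:
    """Parse a free-form address into tokens and standardize each one."""
    standardized = []
    for token in raw.split():            # parsing: split on any whitespace
        key = token.strip(".,").lower()  # drop stray punctuation and case
        if not key:
            continue
        # standardization: map known variants, upper-case everything else
        standardized.append(SUFFIX_MAP.get(key, key.upper()))
    return " ".join(standardized)


first = standardize_address("200 Taylor Ave N")
second = standardize_address("200  taylor avenue n.")
print(first)
print(first == second)  # identical forms make downstream matching simpler
```

Once two differently typed records reduce to the same standardized string, the matching and consolidation steps have far less ambiguity to deal with.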

http://www.beyenetwork.be/blogs/loshin/archives/2010/07/thoughts_on_par.php Thu, 22 Jul 2010 13:51:40 -0700
Semantic Consistency, Master Data Models, and Upcoming Webinar!

In the past week, we have had a number of conversations with folks struggling with specific aspects of data integration for master data management. The main issue is that secondary users of what will eventually be master data are not necessarily bound to abide by the primary users' data definitions. For example, the concept of "customer" means something different to the sales department than it does to those in customer support.

The upshot is that as data element definitions are reinterpreted, the results of sums, counts, and other aggregations start to be skewed. Ultimately, the resulting reports are inconsistent, leading to a need for reconciliations and then to a loss of trust in the master data asset.

One way to address this is a concerted effort to normalize semantics prior to executing the data consolidation. This may shake out semantic inconsistencies and reduce the need for reconciliations.
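A small Python sketch shows how two definitions of "customer" produce counts that do not reconcile. The `Party` record and both departmental rules are invented for illustration; they are not drawn from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    party_id: int
    has_purchased: bool
    has_support_contract: bool


# Hypothetical departmental definitions of "customer".
def is_customer_for_sales(party: Party) -> bool:
    return party.has_purchased


def is_customer_for_support(party: Party) -> bool:
    return party.has_support_contract


parties = [
    Party(1, True, True),
    Party(2, True, False),
    Party(3, True, False),
    Party(4, False, True),
    Party(5, False, False),
]

# The same data set yields two different "customer counts".
sales_count = sum(is_customer_for_sales(p) for p in parties)
support_count = sum(is_customer_for_support(p) for p in parties)

# The records where the definitions disagree are the ones that would
# later need reconciliation.
conflicts = [
    p.party_id
    for p in parties
    if is_customer_for_sales(p) != is_customer_for_support(p)
]

print(sales_count, support_count, conflicts)
```

Surfacing the disagreeing records before consolidation is one way to start the conversation about which definition, or which pair of distinct terms, the master model should carry.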

More importantly, it implies the need for best practices in developing master data models. To that end, I will be presenting a talk on Accelerating MDM Initiatives with Master Data Modeling at a webinar sponsored by Embarcadero on July 28th. Lots of folks have already signed up, and I hope that it will provide an open forum for discussing some critical issues regarding master data modeling.

http://www.beyenetwork.be/blogs/loshin/archives/2010/07/semantic_consis.php Wed, 21 Jul 2010 16:52:29 -0700
Data Quality and Data Virtualization

As more organizations are starting in earnest to consider deploying master data systems, they are also beginning to see where some fundamental issues may block the creation of a full-scale master data repository. This had me thinking about potential ways around some of these issues (especially where governmental regulations prevent moving data across borders!), and one aspect I started considering is the use of a data federation or virtualization model that can abstract the perception of a unified view without necessarily copying the data.

At the same time, I was approached by Composite Software to do some research on the feasibility of incorporating data quality management techniques within a data virtualization framework. That activity has only continued to pique my interest in integrating virtualization and MDM, and it also allowed me to revisit some ideas I explored in my book on data quality management. Meanwhile, the result of that task is an interesting white paper, which you can access via Composite's web site - scroll down to the "analyst reports" section of the page. Let me know your thoughts!
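For readers who have not worked with federation, the following Python sketch illustrates the general idea of a unified view that leaves records where they live. It is an assumption-laden toy, not Composite's design: the class name, the region keys, and the dictionary-backed sources are all hypothetical stand-ins for remote, in-region systems.

```python
from typing import Callable, Optional

# Each source is a lookup function standing in for a remote system that
# keeps its records in its own region.
Source = Callable[[str], Optional[dict]]


class FederatedCustomerView:
    """Answers lookups by querying the sources on demand.

    No central repository is built; a record is fetched from its home
    source only at the moment it is requested.
    """

    def __init__(self, sources: dict[str, Source]):
        self._sources = sources

    def get(self, customer_id: str) -> Optional[dict]:
        for region, lookup in self._sources.items():
            record = lookup(customer_id)
            if record is not None:
                # Tag the answer with where it came from.
                return {**record, "source_region": region}
        return None


# Hypothetical in-memory stand-ins for two regional systems.
eu_store = {"c1": {"name": "Ana"}}
us_store = {"c2": {"name": "Bob"}}

view = FederatedCustomerView({"eu": eu_store.get, "us": us_store.get})
print(view.get("c2"))
print(view.get("missing"))
```

A data quality rule, such as a standardization step, could be applied inside `get` as records pass through, which is roughly where quality management and virtualization meet.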

http://www.beyenetwork.be/blogs/loshin/archives/2010/06/data_quality_an.php Tue, 15 Jun 2010 10:16:39 -0700
Executive Briefings on MDM and Data Governance

In my most recent article on b-eye-network, I discussed the questions raised by serious consideration of instituting MDM, and how MDM directly depends on sound data management practices associated with data quality and, more importantly, data governance. If this interests you, I will be participating in a set of executive lunch meetings with Initiate Systems to discuss these ideas in greater detail, one in Chicago on May 25 and one in San Jose on May 26. Here are the links for more information:

Chicago, May 25 at 10:00 AM - 1:00PM CDT

San Jose, May 26 at noon - 3:00PM PDT

If you plan to go, let me know!

http://www.beyenetwork.be/blogs/loshin/archives/2010/05/executive_brief.php Tue, 04 May 2010 10:06:00 -0700
Puget Sound DAMA Day April 20 - Introduction to MDM

I am excited to be the featured speaker at the Puget Sound DAMA Day event. I will be teaching a full-day course: Introduction to Master Data Management. The course is adapted from some of the courses I have taught at the Enterprise Data World conference, and incorporates material from my recent book on MDM.

The event takes place at the Best Western Executive Inn - 200 Taylor Ave N, Seattle, and registration includes the full-day seminar, all seminar materials, continental breakfast and buffet lunch.

http://www.beyenetwork.be/blogs/loshin/archives/2010/04/puget_sound_dam.php Mon, 12 Apr 2010 18:41:24 -0700
Book: Key Performance Indicators

About a year ago I came across a very good book by David Parmenter called Key Performance Indicators that provided a nice breakdown of the concepts and processes associated with articulating performance measures in relation to business objectives. One nice feature was a well-organized taxonomy of measures.

Well, I recently got my hands on the revised version of the book, and am definitely looking forward to reading through it. If you get a chance to read it, please share your thoughts!

http://www.beyenetwork.be/blogs/loshin/archives/2010/04/book_key_perfor.php Thu, 08 Apr 2010 13:52:26 -0700
How Do You Know What Data is Master Data?

On Monday, March 15 I conducted a full-day master data management tutorial at the Enterprise Data World conference. As a forum for discussing pragmatic MDM best practices, one section of the day was set aside for a panel discussion among representatives from four vendors:

  • Dan Soceanu from DataFlux
  • Marty Moseley from Initiate Systems - An IBM Company
  • Ravi Shankar from Informatica (formerly Siperian)
  • Jim Walker from Talend

I posed the question to all four: What defines data as "master data"? The first round of answers focused on the standard answer: data concepts that "are important" to the business and are shared by two or more applications. I responded that this was not a practical guide, and then rephrased the question: What can the people in the audience do when they get back from the conference to start identifying data entities as master data?

Again, I did not get the answers I was looking for - all four suggested that the task was not one that could be done at your desk, that it required knowledge of the business, that subject matter experts had to be consulted.

All true, but again, not executable, so I reframed the question again: knowing that there is bound to be variation, replication, duplication, redundancy, and differences in semantics, what is a process for reviewing the data to decide which data elements of which data entities belong in a unified master view?

At that point the answer became a little clearer: you can't tell unless you understand what each data element is, how it is used, what its definition is, how many applications use it, and in what types of usage scenarios. In addition, you need oversight of the process for analyzing the data, capturing and sharing the results, and having all that information validated by subject matter experts.

As moderator, I responded by summarizing: "in order to determine what data is master data, you need to analyze the data, document all the information about the data, and have policies for overseeing that process. That sounds like data profiling, metadata management, and data governance." (nods all around)

But it has to be more than that; there has to be a more operationalized method that results in a clear determination of which data elements of which data entities are to be mastered.
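As one possible first step toward something executable, the "shared by two or more applications" test from the panel can at least be turned into a candidate list. The Python sketch below is not the operationalized method called for above: the application names and data elements in `usage` are a hypothetical inventory, of the kind that data profiling and metadata management would have to produce.

```python
from collections import defaultdict

# Hypothetical inventory: application -> data elements it reads or writes.
usage = {
    "billing": {"customer.name", "customer.address", "invoice.total"},
    "crm": {"customer.name", "customer.email", "customer.address"},
    "support": {"customer.name", "ticket.status"},
}


def shared_elements(usage_by_app, min_apps=2):
    """List data elements used by at least `min_apps` applications."""
    apps_by_element = defaultdict(set)
    for app, elements in usage_by_app.items():
        for element in elements:
            apps_by_element[element].add(app)
    return {
        element: sorted(apps)
        for element, apps in sorted(apps_by_element.items())
        if len(apps) >= min_apps
    }


candidates = shared_elements(usage)
for element, apps in candidates.items():
    print(element, apps)
```

The output is only a list of candidates. Whether the applications mean the same thing by each shared element is exactly what the subject matter experts and the governance process still have to settle.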

http://www.beyenetwork.be/blogs/loshin/archives/2010/03/how_do_you_know.php Wed, 17 Mar 2010 13:01:07 -0700
And Now the Other MDM Shoe Has Dropped

IBM to acquire Initiate Systems.

http://www.beyenetwork.be/blogs/loshin/archives/2010/02/and_now_the_oth.php Wed, 03 Feb 2010 07:25:22 -0700