SPOTLIGHT: The Impact of Big Data Q&A with Ivan Chong of Informatica

Originally published 23 August 2011

BeyeNETWORK Spotlights focus on news, events and products in the business intelligence ecosystem that are poised to have a significant impact on the industry as a whole; on the enterprises that rely on business intelligence, analytics, performance management, data warehousing and/or data governance products to understand and act on the vital information that can be gleaned from their data; or on the providers of these mission-critical products.

Presented as a Q&A-style article, these interviews with leading voices in the industry, including software vendors, end users and independent consultants, are conducted by the BeyeNETWORK and present the behind-the-scenes view that you won’t read in press releases.

This BeyeNETWORK spotlight features Ron Powell's interview with Ivan Chong, general manager of Informatica’s data quality business unit. He has also headed up Informatica's product marketing and worked in the corporate strategy group doing business development. Ron and Ivan discuss how big data has changed the way organizations integrate and manage data.

Big data is probably one of the hottest areas for the BeyeNETWORK, and Informatica 9.1 is all about data integration for big data initiatives. You have a rather unique and interesting definition of big data as a confluence of three trends. It would be great to hear about the three trends and why those trends make up your definition of big data.

Ivan Chong: We hear a lot from the industry in terms of the different definitions they have for big data and the different drivers. To gain a complete picture, our point of view is that there are these three large initiatives that are now converging. The first of these trends involves traditional transaction data, where the growth in data volume has outpaced the ability of IT to effectively manage and process that data for analytics. You'll observe that a number of vendors have invested in data warehouse or analytic appliances that source from traditional systems and then load enterprise transaction data. This is what we refer to as big transaction data.

Now, another trend we see is what we refer to as big interaction data, and that has come about through the rapid adoption of social data and device data. It has forced the industry to reconsider how IT can invest in infrastructure to tap into these additional sources of data. Obviously, as many people know, the data volumes in this area are quite staggering and far more demanding than what was previously required with traditional enterprise transaction data.

That leads us to the third component of our viewpoint of big data, which is that it is precisely these volume requirements and the diversity of data now being considered in the enterprise that are driving consideration of alternatives outside of the traditional database processing platforms. Specifically, many large organizations are evaluating Hadoop as an alternative to these traditional relational systems, and this is what we refer to as big data processing. Thus our point of view on big data is that it is the confluence of big transaction data, big interaction data, and big data processing.

Well, I think that's a very all-encompassing definition for big data, and I couldn’t agree with you more. Let’s talk about the latest release of your flagship product. In the press release, you mentioned that Informatica 9.1 delivers trustworthy transaction data and authoritative master data. I know that enterprises everywhere are fully aware of the importance of being able to trust their data, and that has always been a challenge to accomplish. But in the past, it seemed organizations viewed master data management [MDM] as something nice to have but not necessarily something they wanted to spend time doing. Do you agree and how are organizations tackling that problem now?

Ivan Chong: Well, we speak to many companies that understand that data needs to be mastered and managed outside of the original source application systems. In the past, I think you were alluding to one school of thought that you could master the data through transactional systems, so having a separate system for mastering and managing the data is a “nice to have.” But that's very difficult to accomplish now that there is a lot of adoption of cloud applications and social data that may originate from lots of different places in the enterprise, and third-party data as you have globalized partner networks. All of that is leading IT to the conclusion that your origination of data may exist in lots of different places outside of your enterprise transaction systems that you manage internally. And so you have to own your data as you define your business, even though it may be outsourced, even though it may be resident in cloud computing infrastructure. You still have to manage and own your data, and that makes MDM a “must have.”

I totally agree. When I think of master data, the next area I always think of is data governance. It's an area that I feel is finally getting a lot of attention within most enterprises. Part of the reason is probably that organizations now have so much data that they need a way to know what they have, to know what applications and services use that data, and to be confident that they're in conformance with regulations and service level agreements. Is that what you're also seeing with your customers?

Ivan Chong: Yes, it is. We see many customers that are driven to implement data governance in response to regulatory pressures. However, these customers are also looking to leverage data governance so that they can use their data for business benefit, not just to comply with regulations. Business benefit is a key objective for many of the successful data governance initiatives that we have observed among our customers. They're coordinating efforts across business units and departments, first so that they can understand the impact of data defects on their business and therefore manage their business with the right investments in data quality. But then they want to reuse the data quality rules that they developed for regulatory compliance across other initiatives that are applied purely to driving improvements in the business processes that they run. On top of that, we see that they're looking at the cost and liabilities of managing their data. Data governance councils allow them to coordinate that assessment, and then they look to apply technologies like information lifecycle management to minimize the expense of having to maintain all their data.

Aren't there some pretty complicated challenges to provision all the different data types? It seems like as every year goes by we're seeing more and more data types and data coming from different areas. Can you talk about the challenges in that area?

Ivan Chong: Yes, the challenges are tremendous. Businesses have to handle change, and they need to ensure that they’re agile enough to respond at the pace of their business, not at the pace of how they can rework all the infrastructure in their internal data systems. So a lot of customers come to Informatica because they want help implementing a true data services-based architecture. By having an adaptive data services architecture, they can insulate the consumers of their data from all the churn that exists at the source system layer. You referred to data coming from lots of different areas. Not only is it coming from more diverse source systems, but it also brings with it a lot of upgrade, migration and retirement issues that they have to deal with. They don’t want all that change at the root layer to propagate through to the data consumers. They want to maintain service level agreements there as they rework their infrastructure and improve and invest in their data.

Not too many years ago, enterprises had a lot of structured data sources and many companies tried to manage them internally on their own with homegrown systems. But given all the different data types and requirements, it seems to me that any midsized to large organization today has to engage a vendor like Informatica to manage all of these different types of data because it's almost impossible to do without a commercial product. Do you have that same feeling?

Ivan Chong: Yes. We've had to grow our business to be more than just the reporting data that people use in a data warehouse. In the early days, people would say, "Well, if I get an ETL system from my database vendor or from my application vendor, what makes Informatica unique?" And as we explained our value proposition, we adopted that as our mission: to line up with exactly how the customer views the data. We're not going to have an agenda to augment a database system or tie a customer to a specific application system. We're going to look at purely how they view data as an asset. And if that means we've got to adopt newer sources of data, then we'll invest in that. If that means we've got to refine the data to make it more valuable to the customer by filling quality gaps or enriching it, then we'll invest in that. And if that means looking at data that's considered outside of the enterprise such as cloud-based data or perhaps third-party business-to-business data exchange, we'll adopt that as well. Because we have a unique perspective, we're in a position to align ourselves with how customers are looking to leverage data as an asset.

Well, obviously, there were some solid business reasons on the part of your customers that precipitated the development of some of the new features in 9.1. Could you talk about some of those business imperatives and the new features that resulted?

Ivan Chong: Sure. In general, businesses are increasing their investments in data analytics. A specific business imperative that drives this type of investment is the goal of attracting and retaining customers. Now, data about customers comes from far more than just traditional enterprise systems these days, and this is what we were discussing earlier. Our recent 9.1 release showcases several new capabilities that enable organizations to achieve a customer-centric approach. One example that I thought I would highlight in response to your question is that our MDM product allows customers more flexibility to leverage data by providing a multi-styled data architecture, registry-based or hub-based. As it turns out, both approaches offer distinct capabilities toward gaining a complete view of customers and then using that knowledge to provide superior customer service.

Well, obviously, when you look at businesses today, the whole global nature of what's happening and how competitive it has become, what do you see for the future? Obviously, the data volumes are exploding. I think I just read something from Forrester that said data volumes in the next 40 years are going to grow by 50 times. It's just mind-boggling to even hear something like that. What is your prediction for the future? What other sources of unstructured and semi-structured data do you see as potential data sources that will require integration into the enterprise?

Ivan Chong: Earlier we discussed a new wave of interaction data, and I think in general people recognize that social data and mobile device data are being used by our customers today. We believe that in the future these sources of data will become a vital component of enterprise data. There's so much rich interaction that's captured through these data sources, and a key for many businesses is how effectively they’ll be able to integrate that data with traditional enterprise transactions. If you look at how people are dealing with these sources of data today, a lot of it appears to be more experimental. They analyze the data in isolation, so you'll see sentiment analysis that's performed on social data. But that's not really leveraging social data to its fullest extent. Our customers as well as a number of key industry luminaries feel that the true value comes when you combine all of that interaction data with transactional information. Then you get the complete view of what it is that you can best do to service your customer and partner with them to meet their needs.

Without having a product like Informatica or something to help manage all of that information, it seems very daunting for any organization of any size. Would you agree?

Ivan Chong: Yes. We believe very strongly that data itself has value outside of the applications that generate it. I think in the past, people and organizations would look at data as purely the residue of an investment they would make in an enterprise software application system. But if you take that approach, the challenges that you mentioned with volume, with diversity, with consolidation and reconciliation of the data are truly daunting. We see this as something feasible if you take an approach where data itself has its own platform that you invest in. The platform then allows you to deal with the fact that there are increasing numbers of sources, that there are increasing volumes, that there is increasing diversity, and you can focus on specifying the rules that you need to apply to harvest data as an asset for your business. What are the ways in which you are looking to mine the data for actionable information? What are the ways you are looking to quickly leverage insight from the data and apply it toward your ongoing operational processes? Those are things that a data platform would provide. And Informatica is certainly championing this notion that data itself needs a platform.

It's been a pleasure speaking with you today, and I’m looking forward to seeing the response to 9.1.

Ivan Chong: Thanks so much, Ron.

  • Ron Powell
    Ron is an independent analyst, consultant and editorial expert with extensive knowledge and experience in business intelligence, big data, analytics and data warehousing. Currently president of Powell Interactive Media, which specializes in consulting and podcast services, he is also Executive Producer of The World Transformed Fast Forward series. In 2004, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). He maintains an expert channel and blog on the BeyeNETWORK and may be contacted by email at 

    More articles and Ron's blog can be found in his BeyeNETWORK expert channel.
