Tuesday, May 02, 2006

CDI compared to other master data

There is a good article on CDI by Jill Dyche, a co-founder of Baseline Consulting and someone who has clearly seen a lot of real-world CDI projects. She does a good job of explaining how CDI projects have traditionally been quite transaction-oriented, with hubs serving up customer data via middleware to other applications. CDI hubs are at one end of the MDM spectrum, firmly at the "operational" level. At the other end are "analytic" MDM applications, which enable companies to take a cross-enterprise view of key information like assets, people, products and channels. Understanding the differences between the multiple, conflicting definitions embedded in the source systems is a major job in itself, and will usually result in a master data repository. This in turn can be a feed into a corporate warehouse. A few pioneering companies have taken the final logical step and hooked up their master data repositories, via middleware like Tibco or IBM WebSphere, to their operational systems, so that the master data repository becomes the true master source, driving changes as required back down into operational systems like ERP and CRM.

CDI hubs have started at the other end, linking up to systems providing customer data, often in real-time. Customer data represents a high-value area of MDM: in the case of consumers, the data is often quite simple but high in volume, and requires fairly simple processing to match a customer record in one system to one in another (e.g. matching "A. Hayler" vs. "Andy Hayler"). However, this is only part of the answer, as even in the case of "customer" things can get more complex. Suppose you are a company like Shell and you want to treat Unilever as a key global account. Finding out all the information about Unilever is not just a simple keyword-matching exercise, since Unilever trades under many different subsidiary names and brands around the world, e.g. its main Indian subsidiary is not called Unilever but Hindustan Lever. It also owns a company called Algida, and I defy even the cleverest fuzzy logic algorithm to associate "Algida" with "Unilever" (such examples are why you should always be sceptical about vendors selling matching algorithms). For more complex situations like this, human intervention is required in order to correctly add up all the elements of Unilever's business.
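To make the contrast concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular CDI product works: the similarity measure is the standard library's difflib.SequenceMatcher, the 0.6 threshold is an arbitrary assumption, and the KNOWN_PARENTS table and resolve_account function are hypothetical names standing in for the human-maintained knowledge discussed above.

```python
from difflib import SequenceMatcher
from typing import List, Optional

# Hypothetical, hand-maintained table of subsidiaries and brands.
# In practice this is the part that needs human input (or a bought-in directory).
KNOWN_PARENTS = {
    "hindustan lever": "Unilever",
    "algida": "Unilever",
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve_account(name: str, candidates: List[str],
                    threshold: float = 0.6) -> Optional[str]:
    """Map a name to a master account.

    The curated table is consulted first; fuzzy matching is only a fallback.
    Returns None when neither gives an answer, i.e. a case for human review.
    """
    key = name.strip().lower()
    if key in KNOWN_PARENTS:
        return KNOWN_PARENTS[key]
    best = max(candidates, key=lambda c: similarity(name, c), default=None)
    if best is not None and similarity(name, best) >= threshold:
        return best
    return None


# Simple consumer-style match: the strings are textually close.
print(similarity("A. Hayler", "Andy Hayler"))
# Corporate hierarchy: the strings have almost nothing in common.
print(similarity("Algida", "Unilever"))
# Only the curated table links the two.
print(resolve_account("Algida", ["Unilever", "Shell"]))
```

The string comparison copes with the "A. Hayler" case, but the Algida link comes entirely from the curated table, which is exactly the part an algorithm cannot supply on its own.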

This issue can become considerably more complex with things like "asset" or "product", which can have a whole hierarchy of sub-types. This is why CDI hub technology tends to be used specifically for consumer information. Other types of MDM technology are required to manage more complex data and the workflows that surround updating it, e.g. no automated system is going to just create a new brand; this requires numerous approvals and has various knock-on effects on other master data.

I would argue that, at least at present, you are likely to require one kind of technology to handle general purpose MDM data, whether customer or asset or whatever, from an analytical viewpoint, and potentially a separate technology to handle operational updates, perhaps in real-time. Of course it would be nice if a single product did everything, but at present nobody can truly claim this. What does seem a missed opportunity is the way that vendors have made their technology so very specific to particular types of master data, e.g. PIM and CDI. While operational and analytic needs are inherently different, there is no reason at all not to take a generic approach to all types of master data. Customers can hardly be expected to buy a separate hub for every type of master data.

2 Comments:

Anonymous Srikanth said...

Andy,
On your views on MDM,
(1) Is MDM a matter of concern for large distributed enterprises only? Is it possible to give a more quantifiable indicator of who would need it, say those with revenues above $100mn and operating in at least 3 countries?
(2) You ask whether there is any Fuzzy logic to associate 'Algida' with Unilever. My response is, why not simply use standard directories such as D&B where the hierarchy is established? Why go for expensive fixes?
Would love to hear your views.

5:16 AM  
Blogger Andy Hayler said...

Good questions.

1. MDM is definitely not only a problem for large companies. Even small companies can encounter problems with inconsistent master data. A small software company might have Great Plains accounting software, salesforce.com for leads, a support call tracker, an intranet for marketing material and a version control package. Do you think they will definitely have consistent definitions of terms such as "customer"? Do you think there could be any data duplication? Data issues can occur in even small companies, almost anywhere that multiple software packages and systems are used. Of course, the bigger the company, the greater the likely problems.

2. Of course, in the case I gave you could look up Dun & Bradstreet, so perhaps it was not a good example. However, there are many examples where different codes are used for the same product, or (perhaps worse) the same codes are used for different products, and in many cases these will not be easily spotted without human intervention. My point is that smart data quality tools can certainly be helpful in spotting some data quality problems, but that human beings will need to intervene in a large number of cases.
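As a rough illustration of the kind of check a tool can do, here is a small Python sketch. The record layout (system, code, description) and the function name find_code_conflicts are assumptions made up for the example, and the normalisation is deliberately naive. It only produces candidates for a person to review; it cannot decide which code is right.

```python
from collections import defaultdict


def find_code_conflicts(records):
    """Flag product-code conflicts across systems.

    records: iterable of (system, code, description) tuples (hypothetical layout).
    Returns two dicts:
      - codes that carry more than one description (same code, different products)
      - descriptions that carry more than one code (same product, different codes)
    """
    by_code = defaultdict(set)
    by_desc = defaultdict(set)
    for _system, code, description in records:
        # Naive normalisation: lower-case and collapse whitespace.
        norm = " ".join(description.lower().split())
        by_code[code].add(norm)
        by_desc[norm].add(code)
    same_code_diff_product = {c: sorted(d) for c, d in by_code.items() if len(d) > 1}
    same_product_diff_code = {d: sorted(c) for d, c in by_desc.items() if len(c) > 1}
    return same_code_diff_product, same_product_diff_code


# Made-up sample data from three imaginary systems.
sample = [
    ("ERP", "P100", "Blue Widget"),
    ("CRM", "P100", "Red Gadget"),
    ("ERP", "P200", "Green Sprocket"),
    ("Web", "X-9", "green  sprocket"),
]
clashes, duplicates = find_code_conflicts(sample)
print(clashes)
print(duplicates)
```

A check like this only catches descriptions that are textually identical after normalisation. Two systems describing the same product in different words would slip straight through, which is where the human beings come in.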

6:33 AM  
