MDM Data Quality Processes

MDM can help establish and enhance your data quality processes.
In previous posts, we discussed the confusion between MDM and data quality and examined MDM as a technique for prioritizing data quality issues.
At a high level, MDM approaches data quality by defining two key continuous processes:
- MDM Benchmark Development: Creation and maintenance of the data quality Benchmark Master Entity View, e.g., a benchmark for customer data and its relationships.
- MDM Benchmark Proliferation: Proliferation of the benchmark data to other systems.
In most modern MDM solutions, data hub technologies focus on the MDM benchmark development process and its challenges. This process creates and maintains the MDM benchmark record in a data hub. It addresses not only data integrity, completeness, standardization, validation and stewardship but also record accuracy.
Data accuracy often requires verification with the individuals who know the accurate values for certain attributes.
Most publications and discussions focus on this first process and on the data hub components that support it, which may include probabilistic and deterministic matching, record merge capabilities, ETL and real-time processing components, data profiling and data quality tools, etc. Data hub vendors have developed methodologies, techniques and implementation options for these processes, their metrics and their reporting in the data hub.
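To make the difference between the two matching styles concrete, here is a minimal sketch. It is not how any particular data hub implements matching: the field names, weights and threshold are hypothetical, and a simple weighted string-similarity score stands in for a true probabilistic model, which would normally be built on estimated match and non-match probabilities per attribute.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def deterministic_match(a: dict, b: dict, key: str = "email") -> bool:
    """Deterministic rule: two records match only if a chosen key is identical."""
    ka, kb = a.get(key), b.get(key)
    return bool(ka) and bool(kb) and normalize(ka) == normalize(kb)


# Hypothetical attribute weights (assumption for illustration); they sum to 1.0.
WEIGHTS = {"name": 0.5, "address": 0.3, "phone": 0.2}


def similarity_score(a: dict, b: dict) -> float:
    """Simplified stand-in for probabilistic matching: weighted string similarity."""
    score = 0.0
    for field, weight in WEIGHTS.items():
        va, vb = a.get(field), b.get(field)
        if va and vb:  # attributes missing on either side contribute nothing
            ratio = SequenceMatcher(None, normalize(va), normalize(vb)).ratio()
            score += weight * ratio
    return score


def is_probable_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Treat the pair as a match when the score reaches the (hypothetical) threshold."""
    return similarity_score(a, b) >= threshold
```

In practice, pairs that score just below the threshold are typically routed to a data steward for review rather than being rejected outright.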
When a master entity record is created or changed in one of the operational systems, the data hub receives the update with a sub-second delay. The data hub then applies the change to the benchmark record according to attribute survivorship rules defined by the business and data governance.
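The survivorship step can be sketched roughly as follows. This is an illustration under stated assumptions, not a vendor implementation: the source names, the trust ranking, the attribute names and the two rules (`most_recent`, `most_trusted`) are all hypothetical examples of what the business and data governance might define.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AttributeValue:
    value: str
    source: str           # operational system that supplied the value
    updated_at: datetime  # when that system recorded the value


# Hypothetical trust ranking of source systems: lower number = more trusted.
SOURCE_PRIORITY = {"crm": 0, "billing": 1, "web": 2}


def most_recent(current: AttributeValue, incoming: AttributeValue) -> AttributeValue:
    """Rule: the newest value survives."""
    return incoming if incoming.updated_at >= current.updated_at else current


def most_trusted(current: AttributeValue, incoming: AttributeValue) -> AttributeValue:
    """Rule: the value from the more trusted source survives; ties go to the newest."""
    cur_rank = SOURCE_PRIORITY.get(current.source, 99)
    inc_rank = SOURCE_PRIORITY.get(incoming.source, 99)
    if inc_rank < cur_rank:
        return incoming
    if inc_rank == cur_rank:
        return most_recent(current, incoming)
    return current


# Hypothetical per-attribute rules; attributes not listed fall back to most_recent.
SURVIVORSHIP_RULES = {"email": most_recent, "legal_name": most_trusted}


def apply_update(benchmark: dict, update: dict) -> dict:
    """Return a new benchmark record with the update applied attribute by attribute."""
    merged = dict(benchmark)  # leave the caller's record untouched
    for attr, incoming in update.items():
        current = merged.get(attr)
        if current is None:
            merged[attr] = incoming  # first value seen for this attribute
        else:
            rule = SURVIVORSHIP_RULES.get(attr, most_recent)
            merged[attr] = rule(current, incoming)
    return merged
```

With these example rules, a newer email arriving from the web channel would replace the one on the benchmark record, while a legal name from the same update would not displace the value supplied by the more trusted CRM system.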
Consequently, at any point in time, give or take a sub-second delay, the data hub holds the benchmark record for the master entity to the best of the enterprise's knowledge. For all practical purposes, this makes the hub the benchmark.
The benchmark proliferation process is equally important for data quality. It is no doubt a great accomplishment for an enterprise to define, create and maintain the golden (benchmark) record in the data hub. That said, from the enterprise data quality perspective a critical challenge remains.
Enterprise stakeholders are looking to establish and maintain data quality in numerous systems across the enterprise. They care about the quality of data in the data hub mostly because it helps them solve their data quality problems in other data sources, applications and systems.
The data hub is a means of establishing and maintaining data quality across the enterprise, not the ultimate goal of an MDM initiative.
Analytical systems and operational systems interact with the data hub differently. We’ll make this distinction in the next post.