On Data Volumes, Performance Considerations & Reference Data

As you build your MDM roadmap, you must consider how to manage growing data volumes and reference data.
Over the past few weeks, we have been surveying the many domains of MDM that you encounter while building the right MDM roadmap for your organization. Today, we’ll discuss how to manage growing data volumes while maintaining performance. We’ll also discuss some important considerations for reference data.
A master data hub can manage millions, even billions, of records. The cost of hardware depends on these volumes and on a variety of performance requirements, such as average and peak transaction rates for reads and writes.
Additionally, you must consider the number of physical servers, the disk space and the other hardware components you’ll need to support all the required architectural tiers and environments of the solution. These needs, in turn, depend on the data volumes, transaction rates, and availability requirements. Together, these factors can increase the cost of the hardware.
Even more important, MDM data hub license costs typically depend on the number of records that will be stored in or processed by the hub, and that number grows over time. Hence, your roadmap should take this growth into account to determine how the hardware configuration and the price will change over time.
On Reference Data
A typical list of master data domains includes Party, Product, Location, Supplier and other entities, each with many thousands of records or more. Over time, the volume of data grows dramatically.
Master data records change continuously. For instance, an insurance company or a bank can onboard new customers every day. People come to a hospital and become new patients, at which point new patient records are created in the patient registration system.
Reference data, by contrast, is for the most part represented by relatively static codes that are created and managed by the enterprise. These codes are important characteristics of master entities.
From the application perspective, these codes are displayed in drop-down boxes. For instance, accounts managed by the enterprise are characterized by account type. Products are tagged by class, category, type and brand. Customers can be categorized by marketing segment, category, classification, gender, country of citizenship, etc.
Data governance and the business define and manage reference lists, and reconcile their codes so that they are used consistently across systems. Reference data reconciliation is an important problem that must be addressed as part of an MDM initiative.
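One common way to picture code reconciliation is a crosswalk that maps each source system's local code to a single canonical code. The sketch below is only an illustration under that assumption: the system names, the codes and the `reconcile_codes` helper are hypothetical, and real reconciliation is usually driven by governance decisions rather than a static dictionary.

```python
def reconcile_codes(records, crosswalk):
    """Map (source_system, local_code) pairs to canonical reference codes.

    Returns the reconciled canonical codes plus the set of pairs that have
    no mapping yet, which would go to data governance for review.
    """
    reconciled = []
    unmapped = set()
    for source_system, local_code in records:
        canonical = crosswalk.get((source_system, local_code))
        if canonical is None:
            unmapped.add((source_system, local_code))
        else:
            reconciled.append(canonical)
    return reconciled, unmapped


if __name__ == "__main__":
    # Hypothetical crosswalk: two systems encode the same account types differently.
    crosswalk = {
        ("crm", "CHK"): "CHECKING",
        ("billing", "01"): "CHECKING",
        ("crm", "SAV"): "SAVINGS",
    }
    records = [("crm", "CHK"), ("billing", "01"), ("billing", "99")]
    reconciled, unmapped = reconcile_codes(records, crosswalk)
    print(reconciled)
    print(unmapped)
```

Keeping unmapped codes in a separate review set, rather than guessing a mapping, reflects the point that the business and data governance own the reference lists.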
Reference code reconciliation can be approached as a workstream of an MDM program or managed as a separate project. Reference data may or may not be managed by the same software as master data. If reference data is part of the MDM program, it is an important domain of the MDM roadmap and affects the initiative’s cost.
This is part of a series, Building an MDM Roadmap. For other posts and a complete index, view the Table of Contents.