More Data Quality Metrics: Standardization, Availability, Adoption and Reference Data

Determining how to weigh and measure each of these data quality metrics is important to building a successful data governance program.

In the past couple of weeks, we’ve examined the importance of establishing data quality metrics for your MDM program and dug into the details of Uniqueness, Completeness, Latency and Consistency. Now, let’s review the remaining categories.

Standardization and Validation

Standardization and validation rules can be defined for a number of attributes, including address, phone, email, SSN, driver's license number and passport number. Standardization refers to the proper format of an attribute, while validation focuses on verifying that the given value points to an existing address, phone, etc.

In this category, you can track the percent of non-standard or invalid attribute values.
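
As a minimal sketch of this metric, the snippet below computes the percent of non-standard values for one attribute. The phone and email patterns are illustrative assumptions, not rules from any particular MDM product; a real program would take its formats from governance policy. Validation (checking that a value points to something that actually exists) usually requires an external service and is not shown.

```python
import re

# Hypothetical format rules, assumed for illustration only.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def percent_nonstandard(values, pattern):
    """Return the percentage of values that do not fully match the pattern."""
    if not values:
        return 0.0
    # A missing value (None) is counted as non-standard.
    bad = sum(1 for v in values if v is None or not pattern.fullmatch(v))
    return 100.0 * bad / len(values)

phones = ["555-123-4567", "5551234567", "555-123-456", "555-987-6543"]
print(percent_nonstandard(phones, PHONE_PATTERN))  # 2 of the 4 values fail the format
```

The same function can be reused per attribute and per source system, which makes it easy to feed a breakdown by system later on.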

Availability

Availability can be expressed in terms of the number of business users of the MDM hub and the number of applications from which the data in the MDM hub can be accessed. Another aspect of availability can be measured in terms of the hub’s downtime.
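
The downtime side of availability can be sketched as a simple percentage. The function name and the outage list format below are assumptions made for this example, and the sketch assumes outages do not overlap and fall entirely inside the reporting period.

```python
from datetime import datetime, timedelta

def availability_percent(period_start, period_end, outages):
    """Percent of the period the hub was up.

    outages: list of (start, end) datetime pairs, assumed non-overlapping
    and fully contained in the reporting period.
    """
    total = (period_end - period_start).total_seconds()
    down = sum((end - start).total_seconds() for start, end in outages)
    return 100.0 * (total - down) / total

# Hypothetical 30-day reporting period with two outages.
start = datetime(2024, 1, 1)
end = start + timedelta(days=30)
outages = [
    (datetime(2024, 1, 5, 2, 0), datetime(2024, 1, 5, 5, 0)),      # 3 hours
    (datetime(2024, 1, 20, 22, 0), datetime(2024, 1, 21, 2, 12)),  # 4 hours 12 minutes
]
print(availability_percent(start, end, outages))
```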

User Adoption

The number of search API or web service calls from business applications to the data hub and the number of users that initiated these calls can be used to measure user adoption of the data hub.
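
A rough sketch of these adoption counts, assuming the hub's calls are logged as (user, application, operation) records. The log layout and field names are hypothetical; actual hub logs will differ.

```python
from collections import Counter

# Hypothetical call log entries: (user_id, calling_application, operation)
call_log = [
    ("alice", "crm", "search"),
    ("bob", "crm", "search"),
    ("alice", "billing", "get"),
    ("alice", "crm", "search"),
]

def adoption_metrics(log):
    """Summarize call volume, distinct users and calls per application."""
    calls_per_app = Counter(app for _, app, _ in log)
    return {
        "total_calls": len(log),
        "distinct_users": len({user for user, _, _ in log}),
        "calls_per_application": dict(calls_per_app),
    }

print(adoption_metrics(call_log))
```

Tracking these counts over time, rather than as a single snapshot, shows whether adoption is actually growing.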

Reference Data, Code Semantics Reconciliation and Relationships

Reference data, referential integrity and code semantics reconciliation are important from two perspectives:

  • To infer relationships between master entities. For example, if a relationship between an individual entity and an organization entity defines employment, the quality of the employment attributes can be critical to determining the relationship.
  • To support the integrity of transactions and the correctness of analytical queries, which depend on distinct value lists, codes and their cross-system translations.

The number of violations in these two categories is an important metric that can be defined by data governance.
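
Counting violations in both categories might look like the sketch below. The country code list, the CRM translation table and the record shapes are all invented for illustration; the point is only that each metric reduces to counting records that fail a lookup.

```python
# Hypothetical reference data: the governed list of hub country codes and a
# cross-system translation table from one source system's local codes.
VALID_COUNTRY_CODES = {"US", "CA", "GB"}
CRM_TO_HUB_COUNTRY = {"USA": "US", "CAN": "CA", "UK": "GB"}

def count_code_violations(records, translation, valid_codes):
    """Count records whose source code has no translation to a valid hub code."""
    return sum(
        1 for r in records
        if translation.get(r["country"]) not in valid_codes
    )

def count_orphan_relationships(employments, person_ids, org_ids):
    """Count employment links that point to a missing person or organization."""
    return sum(
        1 for person_id, org_id in employments
        if person_id not in person_ids or org_id not in org_ids
    )

crm_records = [{"country": "USA"}, {"country": "UK"}, {"country": "XX"}]
employments = [(1, 10), (2, 11), (3, 10)]  # (person_id, org_id)

print(count_code_violations(crm_records, CRM_TO_HUB_COUNTRY, VALID_COUNTRY_CODES))
print(count_orphan_relationships(employments, person_ids={1, 2}, org_ids={10}))
```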

Other Considerations

The metrics I’ve discussed are a good starting point for a data governance organization that is looking to establish data quality policies for master data and better define its controls. Both of these tasks are critical to beginning data governance operations as an LOB.

At the same time, you should also consider building a data governance dashboard. A dashboard is a critical tool that can slice and dice your data quality metrics (by system, time, department, etc.). The dashboard should also be able to perform a "what if" analysis to estimate the impact of planned data quality improvement efforts.

For example, the dashboard should be able to estimate how an improvement in data completeness on a certain attribute will impact your metrics. This will help the data governance organization streamline data quality improvement activities, optimize resources and maximize the ROI.
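
One simple way to picture a "what if" estimate is a weighted score that is recomputed under a proposed scenario. The attribute names, completeness figures and weights below are made up for the example, and a weighted average is only one possible way to combine metrics; your governance team would choose its own weighting.

```python
# Hypothetical current completeness per attribute (fraction of records populated)
# and governance-assigned weights that sum to 1.0.
completeness = {"email": 0.60, "phone": 0.80, "address": 0.90}
weights = {"email": 0.5, "phone": 0.25, "address": 0.25}

def overall_score(completeness, weights):
    """Weighted average of per-attribute completeness."""
    return sum(completeness[a] * weights[a] for a in weights)

def what_if(completeness, weights, attribute, target):
    """Estimate the change in overall score if one attribute reaches a target."""
    scenario = dict(completeness)
    scenario[attribute] = target
    return overall_score(scenario, weights) - overall_score(completeness, weights)

# Estimated gain if email completeness is raised from 60% to 80%.
gain = what_if(completeness, weights, "email", 0.80)
print(round(gain, 3))
```

Running the same comparison for each candidate improvement gives a rough ranking of where effort is likely to pay off most.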

For more tips on the role of data quality metrics in MDM and data governance, read my recent series, How MDM Helps Data Quality and Governance, or two posts from my colleague Marty Moseley, Measuring What Matters and Measuring What Makes the Business Tick.

