Learning how to track data quality metrics can save your business money, and measuring data quality does not need to be complicated. The best approach combines common data quality dimensions with data quality KPIs to calculate business results. From there, you build trust that your business data will perform well in the right context at the right time.
Read further to learn more and to see how measuring data quality metrics works through four examples.
How Do You Define Data Quality?
Data quality refers to the overall state of qualitative or quantitative pieces of information. It is the condition of the information available, based on tangible metrics. Generally, data is considered high quality if it can deliver the insights needed to make better business decisions.
Furthermore, high data quality means that you can be assured of the same results as previously demonstrated, given the same situation and inputs. Data quality tools such as WinPure are vital in achieving the level of data quality necessary for a profitable business that wishes to see continued growth.
What Are the 6 Dimensions of Data Quality?
Data quality is commonly described in terms of six core dimensions or characteristics:
- Data Completeness: Data needed to describe entities in data sets exists in the database system.
- Data Accuracy: Data in the database systems matches agreed-upon facts about the real world.
- Timeliness: The span between data inputs and expected outputs meets business needs.
- Uniqueness: Each real-world entity is recorded only once in a data set, with no duplicate records describing the same entity.
- Consistency: When datasets move from one location to another, their contents remain the same.
- Validity: Data conforms to the rules, formats, and ranges defined for it; given a prediction of how a data set should behave, the data repeatedly demonstrates that behavior.
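As a rough illustration of how two of these dimensions can be turned into numbers, the sketch below scores completeness and uniqueness as simple ratios. The records, field names, and both helper functions are hypothetical and chosen only for this example; real data sets usually need fuzzier matching rules.

```python
# Hypothetical patient records; the field names are assumptions for illustration.
records = [
    {"patient_id": 1, "name": "Ana Ruiz", "check_in": "2021-03-02"},
    {"patient_id": 2, "name": "Ben Cole", "check_in": None},
    {"patient_id": 2, "name": "Ben Cole", "check_in": None},
    {"patient_id": 3, "name": "", "check_in": "2021-03-09"},
]
REQUIRED_FIELDS = ("patient_id", "name", "check_in")


def completeness(rows, required):
    """Share of rows in which every required field is populated."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(field) not in (None, "") for field in required)
        for row in rows
    )
    return complete / len(rows)


def uniqueness(rows, key):
    """Share of rows that carry a distinct value for the key field."""
    if not rows:
        return 0.0
    return len({row[key] for row in rows}) / len(rows)


completeness_score = completeness(records, REQUIRED_FIELDS)  # 1 of 4 rows is complete
uniqueness_score = uniqueness(records, "patient_id")         # 3 distinct ids in 4 rows
```

A score of 1.0 on either ratio would mean no gaps or no repeated keys; how far below 1.0 is acceptable depends on the business need.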
Related Reading: To learn more about data management, head to our Master Data Management (MDM) Guide, where we cover the Master Data Management process, including MDM architecture and framework.
What Are Data Quality Metrics?
Data quality metrics are measurements used to assess the quality of your business data, and they can draw on any combination of the six dimensions above. However, data quality metrics are only as relevant as the shared understanding of what good data quality looks like and what purpose it serves within the business.
What Are Examples of Quality Metrics?
Using the healthcare industry as an example, take an end-of-month patient registrations report. Run on the 15th, it contains fewer patients than when it is rerun on the 30th. The business says the report has a data quality issue because it is incomplete, while IT argues that the report is adequate. Who is right?
Data Quality KPIs
This is where data quality KPIs, or key performance indicators, come in. They are necessary for understanding data quality goals because they connect business objectives to specific data quality dimensions, such as accuracy.
In the example above, the business defines the patient registrations data quality KPI as a complete patient list from day 1 to day 30. IT defines it as a comprehensive patient list as of the day the report was printed (patients registered on the 20th would not show up on a report run on the 15th).
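The disagreement can be made concrete with a small sketch: the same report earns a different completeness score depending on which definition it is measured against. The registration dates and the completeness helper below are hypothetical, used only to illustrate the two yardsticks.

```python
from datetime import date

# Hypothetical registration dates for June 2021 (illustrative data only).
registrations = [date(2021, 6, day) for day in (2, 9, 14, 20, 27)]


def completeness(report, expected):
    """Fraction of the expected registrations that appear in the report."""
    expected = set(expected)
    return len(set(report) & expected) / len(expected)


run_date = date(2021, 6, 15)
report = [d for d in registrations if d <= run_date]

# IT's yardstick: everything registered up to the day the report was printed.
it_score = completeness(report, [d for d in registrations if d <= run_date])

# The business's yardstick: everything registered from day 1 to day 30.
business_score = completeness(report, registrations)
```

Here the report is fully complete by IT's definition and only 60% complete by the business's definition, so neither side is wrong until the KPI is agreed.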
As you can see, data quality measures require a shared understanding of data quality KPIs in terms of data quality dimensions. Furthermore, given the many data quality dimensions and business requirements, you will need several different kinds of data quality measurements to assess whether you have the data quality you need.
Data quality measures describe the entire set of values calculated from how well data quality KPIs meet data quality dimensions. The closer the data quality scores are to the desired results, where and when they are needed, the more confident organizations can be in their data.
How Do You Measure Data Quality Metrics?
So, how do you measure data quality?
The first step to measure data quality requires objectively calculating functions (counts, averages, sums, and totals) of data quality KPIs that sufficiently cover data quality dimensions.
You take these measurements not only against the database but, most importantly, against the business need.
For example, you may want a complete list of monthly patient registrations to see whether more of your at-risk subjects are checking into your clinics for health screening tests. You perform a technical review by checking the database for patients without a check-in date. You also check for patients with positive test results whose registration information is missing or was never recorded.
You also assess data quality metrics against the business need. Count how many of the patients you recommended for a health screening test at your clinics did not follow through (they registered at a different clinic or ignored the directive). From that population, survey the non-registered subjects for feedback and check your database for missing data.
You measure data quality metrics within the database and the business contexts.
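The two views can be sketched side by side. The patient records and field names below are hypothetical; in particular, the sketch assumes that a missing check-in date is how a patient who did not follow through appears in the database.

```python
# Hypothetical patient records; field names are assumptions for illustration.
patients = [
    {"id": "p1", "check_in": "2021-06-03", "recommended": True},
    {"id": "p2", "check_in": None, "recommended": True},
    {"id": "p3", "check_in": "2021-06-11", "recommended": False},
    {"id": "p4", "check_in": None, "recommended": True},
]

# Database view: which records lack a check-in date?
missing_check_in = [p["id"] for p in patients if p["check_in"] is None]

# Business view: of the patients recommended for screening, how many checked in?
recommended = [p for p in patients if p["recommended"]]
followed_through = [p["id"] for p in recommended if p["check_in"] is not None]
follow_through_rate = len(followed_through) / len(recommended)

# A missing date may mean the patient never came or that the data was never
# recorded; these are the subjects to survey for feedback.
to_survey = [p["id"] for p in recommended if p["check_in"] is None]
```

The counts alone cannot tell a no-show from a recording gap, which is why the survey step matters.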
What Are the Four Examples of Data Quality Metrics?
To give a better sense of how to apply data quality metrics to your data quality KPIs, the four examples below illustrate how to measure them.
Keep in mind these examples cover some but not all data quality measurements. These cases will guide you in setting up and using data quality metrics relevant to your business goals and the characteristics you wish to measure.
Dimensions: Accuracy, Timeliness
Data Quality KPI: Fast and accurate identification of COVID-19 among those with concerning coughs.
Data Quality Metrics: Number of people with a problematic cough who did not register at a hospital or clinic and had inaccurate data. The time that cellphone users with a bad cough took to enroll for a COVID-19 test.
How To Calculate: First, combine all the data in the data systems and search for any required data cells that are not populated. Count the patient registrations with positive COVID-19 test results at a particular clinic, then compare that registration list in the tracker with the clinic's list and note any discrepancies. Over one month, find the time gap between a person being notified about their cough and registering at a clinic.
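A minimal sketch of the discrepancy and time-gap calculations follows. The user ids, dates, and dictionary layout are hypothetical and do not reflect the actual tracker's data model.

```python
from datetime import date
from statistics import mean

# Hypothetical data: when each tracker user was notified about their cough,
# and when (if ever) they registered at a clinic.
notified = {
    "u1": date(2021, 1, 4),
    "u2": date(2021, 1, 6),
    "u3": date(2021, 1, 10),
}
registered = {
    "u1": date(2021, 1, 5),
    "u3": date(2021, 1, 17),
}

# Discrepancy between the two lists: notified users with no clinic registration.
never_registered = sorted(set(notified) - set(registered))

# Timeliness: days from notification to registration for those who registered.
gaps_in_days = {
    user: (registered[user] - notified[user]).days
    for user in registered
    if user in notified
}
average_gap = mean(list(gaps_in_days.values()))
```

The average hides outliers, so a median or a maximum gap may be a better fit for some KPIs.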
Quality Metrics Example Case Study: WinPure Data Matching In Ground-Breaking COVID-19 Tracker Project
Dimensions: Validity, Uniqueness
Data Quality KPI: A single view of unique, active Centura Health donors with no duplicated identifying data.
Data Quality Metrics: The count of duplicated identities should be 0. The number of identified Centura Health donors. Count of individuals falsely identified as a donor. Feedback from customers that they have received duplicate mail.
How To Calculate: Count the number of duplicate identities that show up in the view; it should be 0. Total the number of records cleansed in an hour with automation versus with manual processes. Find the percentage of help desk tickets requiring resolution for duplicate mailings.
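One simple way to count duplicate identities is to normalize the identifying fields and count repeated keys, as sketched below. The donor records, ticket categories, and the identity_key rule are hypothetical; this is exact matching after normalization, not the fuzzy matching a dedicated data matching tool would apply.

```python
from collections import Counter

# Hypothetical donor records (illustrative only).
donors = [
    {"name": "Maria Lopez", "email": "maria.lopez@example.org"},
    {"name": "maria lopez ", "email": "Maria.Lopez@example.org"},
    {"name": "John Smith", "email": "jsmith@example.org"},
]


def identity_key(record):
    """Normalize case and whitespace so trivially different entries compare equal."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)


counts = Counter(identity_key(donor) for donor in donors)

# Every record beyond the first for a given identity counts as a duplicate.
duplicate_count = sum(n - 1 for n in counts.values() if n > 1)

# Hypothetical help desk ticket categories for the same period.
tickets = ["duplicate_mail", "address_change", "duplicate_mail", "other"]
duplicate_mail_pct = 100 * tickets.count("duplicate_mail") / len(tickets)
```

For the KPI to be met, duplicate_count should reach 0 and the duplicate mail percentage should fall over time.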
Quality Metrics Example Case Study: Fast & Accurate Data Matching Saves Precious Time
Dimensions: Consistency, Uniqueness
Data Quality KPI: Over 162,000 address records flow to the Customer Relationship Management (CRM) database clean, correct, and in a standard format.
Data Quality Metrics: Percentage of address data changed from one system to the CRM. The number of duplicate records in the CRM.
How To Calculate: Count the number of addresses where the critical data changed from one system to another. Count duplicate mailings sent, as reported by customers and the system, and compare them to duplicate records in the CRM.
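The first calculation can be sketched as a comparison of the same customer's address in both systems. The customer ids and addresses below are made up for illustration, and the sketch assumes both systems share a customer id.

```python
# Hypothetical addresses keyed by customer id in the source system and the CRM.
source = {
    "c1": "12 High St, Leeds",
    "c2": "5 Park Rd, York",
    "c3": "9 Mill Ln, Bath",
    "c4": "31 Kings Ave, Hull",
}
crm = {
    "c1": "12 High St, Leeds",
    "c2": "5 Park Road, York",
    "c3": "9 Mill Ln, Bath",
    "c4": "31 Kings Ave, Hull",
}

# Only compare customers that exist in both systems.
shared = source.keys() & crm.keys()

# Addresses whose content differs after the move from the source to the CRM.
changed = sorted(cid for cid in shared if source[cid] != crm[cid])
pct_changed = 100 * len(changed) / len(shared)
```

A change is not necessarily an error: standardization is expected to alter some addresses, so the changed list is a starting point for review rather than a defect count.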
Quality Metrics Example Case Study: Address Verification Provided Significant Cost Savings
Dimensions: Completeness, Accuracy
Data Quality KPI: Ticket sales data in Birmingham Hippodrome’s database matches the data retrieved by other agencies that sell tickets.
Data Quality Metrics: The number of ticket records with complete data retrieved by other agents matches the number in Birmingham Hippodrome’s database. The contents of the ticket data profiled in Birmingham Hippodrome’s database match those found at the other agencies.
How To Calculate: Check Birmingham Hippodrome’s database for any empty values in fields required by both internal IT and the ticket agencies. Of the duplicate mail sent to customers, check what percentage corresponds to duplicate records in Birmingham Hippodrome’s database.
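A sketch of the completeness and accuracy checks between the venue's database and an agency's records might look like the following. The ticket ids, field names, and values are hypothetical and are not taken from the case study.

```python
# Fields assumed to be required by both internal IT and the ticket agencies.
REQUIRED = ("seat", "price")

# Hypothetical ticket records keyed by ticket id.
venue = {
    "t1": {"seat": "A1", "price": 30.0},
    "t2": {"seat": "", "price": 45.0},
    "t3": {"seat": "C7", "price": 25.0},
}
agency = {
    "t1": {"seat": "A1", "price": 30.0},
    "t2": {"seat": "B4", "price": 45.0},
    "t3": {"seat": "C7", "price": 20.0},
    "t4": {"seat": "D2", "price": 30.0},
}

# Completeness: venue records with an empty required field.
incomplete = sorted(
    tid for tid, rec in venue.items()
    if any(rec.get(field) in (None, "") for field in REQUIRED)
)

# Completeness: tickets the agency sold that the venue has no record of.
missing_from_venue = sorted(agency.keys() - venue.keys())

# Accuracy: tickets present in both systems whose contents disagree.
mismatched = sorted(
    tid for tid in venue.keys() & agency.keys() if venue[tid] != agency[tid]
)
```

When all three lists are empty, the record counts and the record contents agree across the two sources.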
Final Words On Data Quality Metrics
Data quality metrics are commonly organized by six agreed-upon dimensions: completeness, accuracy, timeliness, uniqueness, consistency, and validity. But these metrics need to pertain to your business context for the business to trust that its data will perform correctly in the right context at the right time. Connect your data quality metrics to data quality KPIs and the six data quality dimensions to better understand your data quality.