
Customers, coworkers, and computer systems all need critical data completeness to communicate well with one
another.

Businesses rely on data completeness, a data quality characteristic, to run well. Managers need correct names,
addresses, and emails to build and maintain customer relationships. Also, IT professionals need enough data completeness
and accuracy to make customer data available across many systems
without errors.

Unfortunately, business projects all too often lack the required data completeness. For example, while a college
grows its corporate relations department, colleagues may provide contact information with missing addresses, telephone
numbers, and names. This patchwork of incomplete data not only hinders sending invitations to contacts but also causes
headaches when upgrading to a new interdepartmental database system.

So, before beginning your next marketing, sales, or IT initiative, you want to understand how data completeness will
affect it. Read further to assess what makes data complete, see how incomplete data happens, and ensure the level of
data completeness and accuracy you need to succeed.

We’ll explore and explain data completeness in detail, including:

  • What is data completeness?
  • How does data become incomplete?
  • How to identify missing data?
  • How do you know if data is missing randomly?

What is data completeness?

Data completeness measures the availability of data on hand. It is a data quality characteristic demonstrating data
comprehensiveness. In the data completeness example for Robert Johnson, 18 cells have content and two are
missing data. Comparing the number of filled cells against the total number of cells (18 of 20) gives 90% data
completeness.
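
The ratio described here can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `completeness_ratio` and the 20-field sample record are assumptions made up for the example, and a cell counts as filled when it is neither `None` nor blank text.

```python
def completeness_ratio(record):
    """Return the fraction of fields in a record that hold a non-blank value."""
    def is_filled(value):
        return value is not None and str(value).strip() != ""

    if not record:
        return 0.0
    filled = sum(1 for value in record.values() if is_filled(value))
    return filled / len(record)


# Hypothetical 20-field record: 18 filled fields and 2 blank ones.
record = {f"field_{i}": "value" for i in range(18)}
record["winter_address"] = ""   # blank text counts as missing
record["fax"] = None            # None counts as missing

print(f"{completeness_ratio(record):.0%}")  # 18 of 20 fields are filled
```

Note that the function only checks for presence. It says nothing about whether the filled values are correct.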

Sufficient data completeness can be below 100% as long as it covers the critical data needed. For example, say
Robert Johnson has a summer address in New York and a winter address in Florida. His record has two address
entries, one for summer and one for winter.

But not all your customers have two homes. Most people live in the same place year-round. In these cases, it
would make sense that the person has only one address, leaving the other blank. Adequate data completeness would
mean having at least one address per contact.

Just because an entry has content does not guarantee that the data is accurate. Whether Bob Johnson has an
address of “6 East Bridge Drive” or “In Situ” would not matter to a completeness check. In either case, the data
will be checked off as complete.

How does data become incomplete?

Data can become incomplete just by doing everyday business, for example:

  • Customers choosing not to provide information: For example, a client may prefer
    to be contacted only by email, leaving their phone and street address data blank.
  • Customers not knowing how to enter their data correctly: A customer may miss
    entering a company name because they never saw the entry field.
  • Coworkers experiencing training issues: Employees may not understand data input
    and management requirements, or they may make mistakes, omitting names, addresses, or emails
    or creating other data entry errors.
  • Coworkers judge the data as non-essential: A professor may just enter students’
    names and social security numbers for grading, without emails and addresses.
  • Computer systems setup: A database’s setup may not meet business requirements.
    For example, a marketer may create a contact data spreadsheet but not make the name column
    mandatory, so names go missing for some customers.
  • Computer systems integration: The data architectures of two systems may not
    share the same data completeness criteria, causing errors during a data merge.
    For example, an Excel spreadsheet may allow missing street addresses. Integrating data with
    missing addresses into the Enterprise Data Management system then fails because that system
    requires street addresses to exist.
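
One way to avoid a failed merge in the integration scenario is to check source rows against the target system's mandatory fields before loading them. The sketch below assumes a hypothetical rule set (`REQUIRED_BY_TARGET`) and made-up rows. A real Enterprise Data Management system defines its own requirements.

```python
# Assumed completeness rule of the target system (hypothetical).
REQUIRED_BY_TARGET = ("name", "street_address")


def split_for_merge(rows, required=REQUIRED_BY_TARGET):
    """Separate rows that satisfy the target's required fields from those that do not."""
    ready, rejected = [], []
    for row in rows:
        missing = [field for field in required if not (row.get(field) or "").strip()]
        if missing:
            rejected.append((row, missing))  # keep the reason for later backfilling
        else:
            ready.append(row)
    return ready, rejected


# Hypothetical rows exported from a spreadsheet that allows blank addresses.
rows = [
    {"name": "Contact A", "street_address": "6 East Bridge Drive"},
    {"name": "Contact B", "street_address": ""},
]

ready, rejected = split_for_merge(rows)
print(len(ready), "ready;", len(rejected), "held back")
for row, missing in rejected:
    print(row["name"], "is missing", missing)
```

Holding back the incomplete rows, with the list of missing fields attached, leaves a worklist for backfilling instead of an error in the middle of the merge.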

How to identify missing data?

By their nature, customers, coworkers, and computer systems cause incomplete data. Recognizing and
identifying this missing data early on saves a lot of headaches. Follow these steps:

  • Identify critical data that must be complete: You need to know, according to
    your business requirements, what data must be available and the level of importance. This work
    may seem unnecessary, but exceptions do exist. For example, many customer relationship
    management systems require the customer’s last name. But a few people, like the magician Teller
    of Penn and Teller, go by a single name. Should your customers and stakeholders identify by only
    one name, having a mandatory first and last name field may not make sense.
  • Get feedback from data governance: Good data governance gets different business
    divisions together to talk about critical data needs and clarify what data to fill. For example,
    various college departments discussed a required status field assigned to a student. The
    college’s employees recognized that the same person could be a student, alum, donor, or a
    professor at different times and need specific services based on that status.
  • Profile data: Once you have identified the critical data to complete and compiled
    the results from your data governance conversations, apply your findings through data profiling.
    Data profiling provides a systematic data quality assessment across existing data sets that
    uncovers missing data entries. Software makes data profiling easier by sorting through
    thousands of data entries and using patterns to locate incomplete data. The Data Profiling
    function in WinPure’s award-winning Clean & Match software returns the percentage of missing
    data values per field and highlights that incomplete data, as shown below.
    Figure: Data Profiling function from WinPure Clean & Match, highlighting missing data.
  • Prioritize and resolve critical data elements as needed: If your level of data
    completeness meets your threshold and the remaining missing entries are acceptable, then you can
    proceed. Should unavailable necessary data remain a problem, prioritize and research information
    to backfill it. Use automation tools to help you locate and complete the data. Address
    verification software can help you find full addresses from only a portion of the data.
    Reverse lookup directories, like USPhoneBook.com, find first and last names from a
    phone number.
  • Check your work through data profiling: Repeat the profile data step to confirm
    that your data completeness meets the data quality you need. If you find the data usable, start
    your business projects. If not, repeat the previous step.
  • Periodically revise your data completeness criteria: Expect that business needs
    will change. Perhaps, in the future, people will use emails, not phone numbers, to call each
    other on the computer. A situation like this would change which critical data must be
    complete.
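
The profiling and checking steps above can be approximated without dedicated software. The sketch below is a generic illustration in plain Python, not how any particular product works: the function names, field names, thresholds, and sample rows are all assumptions. It reports the percentage of missing values per field, then flags critical fields whose missing share exceeds the agreed limit.

```python
def profile_missing(rows, fields):
    """Return the percentage of blank values for each field across all rows."""
    total = len(rows)
    report = {}
    for field in fields:
        blanks = sum(1 for row in rows if not str(row.get(field) or "").strip())
        report[field] = 100.0 * blanks / total if total else 0.0
    return report


def fields_over_threshold(report, max_missing_pct):
    """List fields whose missing percentage exceeds their allowed maximum.

    Fields without an entry in max_missing_pct are treated as non-critical.
    """
    return sorted(
        field for field, pct in report.items()
        if pct > max_missing_pct.get(field, 100.0)
    )


# Hypothetical contact rows with gaps in email and phone.
rows = [
    {"name": "Contact A", "email": "a@example.com", "phone": "555-0100"},
    {"name": "Contact B", "email": "", "phone": None},
    {"name": "Contact C", "email": "c@example.com", "phone": ""},
    {"name": "Contact D", "email": "d@example.com", "phone": "555-0101"},
]

report = profile_missing(rows, ["name", "email", "phone"])
# Assumed business rule: names must always exist, emails may be missing in up to 10% of rows.
needs_work = fields_over_threshold(report, {"name": 0.0, "email": 10.0})

print(report)
print("Backfill first:", needs_work)
```

Rerunning the same two functions after backfilling corresponds to the "check your work" step, and editing the threshold dictionary corresponds to revising the completeness criteria.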

How do you know if data is missing randomly?

Implicit and explicit business requirements will clue you in to whether incomplete data should be
expected or considered random. For example, say you send marketing mailings domestically, only within
Canada. There would be no need to investigate why country name entries remain empty.

However, say you plan on sending a pledge letter to prospects across North America (the US, Mexico,
Canada, the Virgin Islands, Costa Rica, Belize, Greenland, etc.). Then you would want to have complete
country information in your customer relationship management system.

Should some country entries be filled in and others left empty, you will have randomly missing data.
At this point, the data may or may not meet adequate data completeness, but you will want to identify
the missing data and follow the steps in the “How to identify missing data?” section.

If you have an email, a phone number, or part of an address, you may have enough data completeness even
with the missing country values. You could still contact people. However, you would need to assess
whether that meets business needs.
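
To support that assessment, contacts with a missing country can be grouped by whether another contact channel exists. This is a rough sketch under assumed field names (`country`, `email`, `phone`) and invented sample contacts. Whether the fallback channels are actually sufficient remains a business decision.

```python
def is_blank(value):
    """Treat None and whitespace-only text as missing."""
    return value is None or not str(value).strip()


def triage_missing_country(contacts, fallback_fields=("email", "phone")):
    """Group contacts lacking a country by whether a fallback channel is filled in."""
    reachable, unreachable = [], []
    for contact in contacts:
        if not is_blank(contact.get("country")):
            continue  # country is present, nothing to triage
        if any(not is_blank(contact.get(field)) for field in fallback_fields):
            reachable.append(contact["name"])
        else:
            unreachable.append(contact["name"])
    return reachable, unreachable


# Hypothetical prospects for a North America mailing.
contacts = [
    {"name": "Contact A", "country": "Canada", "email": "", "phone": ""},
    {"name": "Contact B", "country": "", "email": "b@example.com", "phone": ""},
    {"name": "Contact C", "country": None, "email": "", "phone": None},
]

reachable, unreachable = triage_missing_country(contacts)
print("Missing country but still reachable:", reachable)
print("Missing country and no fallback:", unreachable)
```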

To conclude

Data completeness and accuracy significantly impact data quality, determining whether customers,
coworkers, and computer systems can communicate well to do business. Over time, expect to see
missing data in your system as part of doing business.

Early on, you want to identify and define critical data that needs to be available. Then, you
want to profile your data to see whether your data completeness is acceptable.

Prioritize and resolve critical data completeness issues. Data quality software and automation
tools that present and fix missing data well make dealing with incomplete data simpler. Be sure
to periodically review your data completeness criteria and adjust them according to business
changes.

Written by Farah Kim

Farah Kim is a human-centric product marketer and specializes in simplifying complex information into actionable insights for the WinPure audience. She holds a BS degree in Computer Science, followed by two post-grad degrees specializing in Linguistics and Media Communications. She works with the WinPure team to create awareness on a no-code solution for solving complex tasks like data matching, entity resolution and Master Data Management.
