Entity Resolution with WinPure

Artificial intelligence has gained a notorious reputation since the launch of ChatGPT, but truth be told, there is more to AI than merely rewriting sentences. It can, for example, resolve complex entities within data sets. Now that’s buzzworthy!

In this piece, I’ll walk you through the innovative collaboration we have with Senzing, the company behind the world’s most powerful entity resolution engine. I’ll also give you a brief overview of the AI principles used in building the entity resolution module and key capabilities of the software.

Let’s roll.

But first, what is entity resolution & why is it challenging? 

Entity resolution, also known as record matching, is the process of consolidating all the data you have on a real-world entity into a single view. For example, in the image below, Mary Jane Smith has multiple digital manifestations of her data. When businesses want to implement targeted marketing campaigns or create personalized offers, they need to “resolve” these variations to get a complete picture of Mary Jane Smith, which includes her most up-to-date data such as contact information, demographic information, and other user data that gives the company a unified overview of who she is.

How data on real-world entities is stored in organizations: with varying, incorrect, and oftentimes incomplete information.

In some cases, entity resolution is also treated as record de-duplication, where redundant information such as multiple phone numbers or physical addresses is identified and treated.
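To make the idea of a consolidated view concrete, here is a minimal Python sketch. It assumes the records have already been matched to the same person, and it uses hypothetical field names and a simple "latest non-empty value wins" rule. It is an illustration only, not how WinPure or Senzing build their records.

```python
from datetime import date

# Hypothetical records that have already been judged to describe the same person.
records = [
    {"name": "Mary Jane Smith", "email": "mj.smith@example.com", "phone": None,
     "updated": date(2021, 3, 1)},
    {"name": "M. J. Smith", "email": None, "phone": "555-0100",
     "updated": date(2023, 6, 15)},
    {"name": "Mary Smith", "email": "mary.smith@example.com", "phone": None,
     "updated": date(2024, 1, 10)},
]


def consolidate(records):
    """Build one view, keeping the most recently updated non-empty value per field."""
    unified = {}
    # Process oldest first so that newer values overwrite older ones.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if field != "updated" and value:
                unified[field] = value
    return unified


golden = consolidate(records)
print(golden)
```

Note that this naive rule keeps the newest name even though an older record holds a fuller one, which is exactly why real tools let you choose which value survives for each field.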

Entity resolution is a complex process that can be summarized into three key challenges:

  • Computationally expensive & complex matching: Determining whether records represent the same entity requires specialized algorithms that are complex and computationally expensive. For example, customer record matching can range from 10M to 100M comparisons per hour, depending on data cleaning steps. This is far slower than simple string or numeric comparisons. Additionally, matching names from different cultures, or entities with different spellings or nicknames, is a herculean task that requires specialist talent in data-matching algorithms.

  • Data cleansing & data quality: ER requires advanced data preparation techniques. You need to clean, standardize, and fix data inconsistencies before you can run the data through a match algorithm to detect duplicates. The data cleansing process itself is time-consuming and problematic, especially since modern databases seldom hold structured data. It’s mostly semi-structured data streaming in from third-party sources, or from poorly designed web forms that are notorious for collecting bad data.

  • No easy way to streamline the ER process: One of the biggest setbacks for most organizations is the inability to streamline an ER process. Companies with limited resources cannot afford to hire specially trained staff, while large organizations struggle with team alignment. Teams work in isolation, and key processes such as data cleansing and data profiling are broken down into parts, with different users working toward different goals. This disjointed effort causes internal conflicts, disrupts business processes, and in severe cases can lead to data loss or data privacy challenges.
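The first two challenges above are easier to appreciate with a small sketch. The Python below is my own illustration, not WinPure code: a toy standardization step with hypothetical field names, followed by the arithmetic behind naive all-pairs matching. It assumes every record is compared with every other record and reads the 10M to 100M figure quoted above as pairwise comparisons per hour.

```python
import re


def standardize(record):
    """Trim whitespace, normalise case, and keep only the digits of a phone number."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "phone": re.sub(r"\D", "", record["phone"]),
    }


def pair_count(n):
    """Number of distinct record pairs when every record is compared with every other."""
    return n * (n - 1) // 2


def hours_needed(n, comparisons_per_hour):
    """Time for a naive all-pairs run at a given comparison rate."""
    return pair_count(n) / comparisons_per_hour


raw = {"name": "  mary   JANE smith ", "email": " MJ.Smith@Example.com ",
       "phone": "(555) 010-0199"}
print(standardize(raw))

# One million customer records at the optimistic rate of 100M comparisons per hour:
print(pair_count(1_000_000))                 # about 5 * 10**11 pairs
print(hours_needed(1_000_000, 100_000_000))  # roughly 5,000 hours
```

The pair count grows with the square of the number of records, which is why matching tools generally avoid comparing every record with every other one.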

Hopefully, I didn’t make it sound too bad 😂

But the good news is that a better, easier, faster, and more efficient solution exists.

And no, this isn’t a matter of trusting AI blindly. For the first time in the history of data management, you now have a single solution that combines the entire data quality management (DQM) framework and an entity resolution module, all without requiring a single line of code.

Data quality & entity resolution capabilities rolled into one …

Senzing offers the world’s first purpose-built AI engine for entity resolution, which is used in fraud prevention systems, government identity resolution systems, and banking and insurance systems to meet KYC goals. It is undoubtedly a powerful real-time entity resolution engine, unmatched in its accuracy and efficiency.

Senzing’s entity resolution API is now integrated into WinPure, which offers a no-code interface for data cleaning, matching, consolidation, and setting golden records.

Powered by the SENZING INSIDE™ AI engine, WinPure now has a module for entity resolution built into the DQM cycle. So now, after cleaning and standardizing your data, you can use the AI-powered ER module to look for hidden relationships between entities.

Unifying disparate data with entity resolution

More importantly, the AI uses a common-sense, principles-based approach to look for potential duplicates, a feature that could save you hours of manually tweaking match algorithms.

What is a common sense & principles-based approach to entity resolution?


“Don’t throw things at other people’s stuff” is a good example of common sense: it draws on a fundamental principle, which can be regarded as a special form of generalized knowledge.

The use of principles is a key reason the WinPure software does not need training, tuning, or experts to deploy into new domains or to add new data sets, new languages, etc.

Okay, now to the interesting part:

How does it actually work?

WinPure’s Entity AI uses smart techniques to accurately match and identify data about people. It understands that some pieces of information are unique to one person, like a Social Security Number (SSN), while other details, like a date of birth (DOB), can be shared by many people.

The AI looks at three main things:

👉🏼 how often a piece of data is shared (frequency),

👉🏼 whether one person can have more than one instance of that data (exclusivity), and

👉🏼 how much that data stays the same over time (stability).

For example, an SSN is unique and doesn’t change, while a home address might change frequently. This approach helps ensure the AI correctly identifies and matches data.
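Here is a rough Python sketch of how those three behaviors could feed into a match score. Everything in it is my own assumption for illustration (the `AttributeBehavior` class, the behavior table, and the integer weights); the actual rules inside the engine are not published in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeBehavior:
    frequency: str   # "one": usually held by one entity; "few" or "many": shared
    exclusive: bool  # True if an entity normally has only one value
    stable: bool     # True if the value rarely changes over time


# Hypothetical behavior table for three attributes.
BEHAVIORS = {
    "ssn": AttributeBehavior("one", exclusive=True, stable=True),
    "dob": AttributeBehavior("many", exclusive=True, stable=True),
    "address": AttributeBehavior("few", exclusive=False, stable=False),
}

# Illustrative weights: rarer values count for more when they agree.
AGREEMENT_WEIGHT = {"one": 10, "few": 5, "many": 2}


def score(a, b):
    """Add evidence when values agree, subtract it when they conflict."""
    total = 0
    for attr, behavior in BEHAVIORS.items():
        left, right = a.get(attr), b.get(attr)
        if left is None or right is None:
            continue  # a missing value is neither evidence for nor against
        if left == right:
            total += AGREEMENT_WEIGHT[behavior.frequency]
        else:
            # A conflict only matters much if the attribute is exclusive and stable.
            total -= 10 if (behavior.exclusive and behavior.stable) else 1
    return total


r1 = {"ssn": "123-45-6789", "dob": "1990-01-01", "address": "1 Main St"}
r2 = {"ssn": "123-45-6789", "dob": "1990-01-01", "address": "9 Oak Ave"}
r3 = {"ssn": "987-65-4321", "dob": "1990-01-01", "address": "5 Elm Rd"}

print(score(r1, r2))  # same SSN and DOB, new address
print(score(r1, r3))  # only the DOB is shared
```

In this toy version, a changed address barely dents the score, while a shared date of birth alone cannot outweigh a conflicting SSN.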

Furthermore, the AI can handle messy real-world data that doesn’t always behave as expected.

For example, it can spot issues like multiple people sharing the same SSN or one person having multiple DOBs. The AI flags these anomalies for further review and adjusts its understanding accordingly. The software uses about 30 built-in principles to determine if entities are the same, possibly the same, or related.

It considers factors like how often data is shared, whether an attribute can be duplicated, and how stable the data is over time. These principles help the AI effectively match data right out of the box for various types of entities, such as people and companies.
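The anomaly checks described above can be pictured with a few lines of Python. This is a hedged sketch with made-up data and a hypothetical `entity_id` label standing in for an already-resolved entity; it only shows the kind of check involved, not the engine's own logic.

```python
from collections import defaultdict

# Made-up records; "entity_id" is a hypothetical label for a resolved entity.
records = [
    {"entity_id": "A", "ssn": "111-11-1111", "dob": "1980-05-01"},
    {"entity_id": "B", "ssn": "111-11-1111", "dob": "1975-09-12"},
    {"entity_id": "C", "ssn": "222-22-2222", "dob": "1990-02-03"},
    {"entity_id": "C", "ssn": "222-22-2222", "dob": "1990-03-02"},
]


def find_anomalies(records):
    """Return SSNs shared by several entities and entities with several DOBs."""
    entities_by_ssn = defaultdict(set)
    dobs_by_entity = defaultdict(set)
    for record in records:
        entities_by_ssn[record["ssn"]].add(record["entity_id"])
        dobs_by_entity[record["entity_id"]].add(record["dob"])

    shared_ssns = {ssn: sorted(ids) for ssn, ids in entities_by_ssn.items()
                   if len(ids) > 1}
    multiple_dobs = {eid: sorted(dobs) for eid, dobs in dobs_by_entity.items()
                     if len(dobs) > 1}
    return shared_ssns, multiple_dobs


shared_ssns, multiple_dobs = find_anomalies(records)
print(shared_ssns)    # SSNs that break the "one per person" expectation
print(multiple_dobs)  # entities whose DOB is not as stable as expected
```

Flagged values like these are candidates for review rather than automatic merges.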

You can learn more about the module’s cultural name awareness and relationship detection, as well as dozens of other functionalities, in this technical AI-powered entity resolution guide.

Sounds pretty cool, eh? Well, the founders of WinPure and Senzing partnered up around a shared goal: democratizing data quality management and entity resolution, enabling businesses large and small to finally solve their data challenges without spending a fortune. Read the partnership press release here.

What is the current state of entity resolution processes in organizations?

Over the past few years, we’ve worked with dozens of organizations that share a similar struggle with ER. Highlighted below are some common themes:

❌ Attempting to perform ER using spreadsheets: IT and business teams in organizations are still using spreadsheet programs to manually comb through thousands of rows of data, employing a multitude of formulas to clean and match records. However, with increasing complexities in data structures, it has become a struggle to improve data quality with spreadsheets.

❌ Attempting to build data matching algorithms in-house: Not many businesses can afford to invest in highly trained specialists to build in-house algorithms. Even if they can, it is a time-consuming process that takes months of effort before any significant DQM or ER project can be initiated and completed.

❌ Outsourcing to third-party organizations or consultants: SMEs outsource entity resolution tasks because they don’t have the capabilities or resources to tackle ER challenges. While this is the easiest route, it is also one that causes significant delays, and costs can increase tenfold. Third-party organizations do not have ‘context’ awareness of the data and can therefore implement suggestions that cause internal conflicts when the data is used in downstream applications.

Because of outdated practices, entity resolution has become but a distant dream, too hard to bring to reality.

But with the right solution and the right team to help you navigate these complexities, we are confident you can resolve your entity challenges in just weeks, using little to no additional resources, both in terms of human talent and infrastructure.

What are the benefits of using WinPure’s entity resolution?

Think of WinPure as an assistant to which you can offload the most strenuous data matching tasks, so you can improve efficiency and work on strategic data-driven initiatives.

In this era, manually performing ER can become a significant bottleneck for organizations. The sheer volume and complexity of data often render traditional methods impractical and time-consuming.

Leveraging advanced solutions like WinPure allows organizations to move beyond the limitations of manual entity resolution, freeing up valuable resources to focus on higher-level tasks that drive business value.

No-code interface for entity resolution. Simply map, label, and let the AI handle the rest.

But beyond business value, you get a host of data quality features bundled into a single piece of software.

  • Easy-to-use interface
  • Global Name Management
  • Purpose-built Module for Entity Resolution
  • Minimal data preparation
  • AI with Built-In Privacy by Design (PbD)
  • Nonobvious relationship awareness
  • Granular control with advanced profiling, cleansing and fuzzy match capabilities

You get full control of your data quality process. Whether you want to set specific fuzzy match thresholds for granular resolution or you want AI to do the job, it’s all possible on the WinPure platform.
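If you are wondering what a fuzzy match threshold actually does, here is a minimal Python sketch. WinPure’s own matching functions aren’t described in this post, so the sketch uses `difflib.SequenceMatcher` from Python’s standard library as a stand-in, and the `similarity` and `is_match` helpers and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity between two strings, from 0.0 (nothing shared) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(a, b, threshold=0.8):
    """Treat two values as a match when their similarity reaches the threshold."""
    return similarity(a, b) >= threshold


pairs = [
    ("Mary Jane Smith", "Mary J. Smith"),
    ("Mary Jane Smith", "mary jane smith"),
    ("Mary Jane Smith", "John Doe"),
]

for left, right in pairs:
    print(left, "|", right, "|", round(similarity(left, right), 2),
          "|", is_match(left, right))
```

Raising the threshold produces fewer but safer matches, while lowering it catches more variations at the cost of more false positives. That trade-off is what you control when you set thresholds by hand.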

Tangible benefits of achieving entity resolution using WinPure.

Schedule a Hands-on Trial Experience

Want to see how WinPure works on your sample data? Schedule a hands-on trial experience and be prepared to be blown away by the ease, uniqueness, and unmatched capabilities of the solution.

Written by Farah Kim

Farah Kim is a human-centric product marketer who specializes in simplifying complex information into actionable insights for the WinPure audience. She holds a BS degree in Computer Science, followed by two post-grad degrees specializing in Linguistics and Media Communications. She works with the WinPure team to create awareness of a no-code solution for solving complex tasks like data matching, entity resolution, and Master Data Management.


