Hello and welcome to our first-ever Q&A session. We’ve organized this event in response to the feedback we’ve received from you, our WinPure clients. Your input is important to us, and we want to focus on solving the real-life issues you’ve been facing.
This isn’t just another event; it’s an opportunity for us to come together as a community. We’ll be using WinPure software as a tool to find practical solutions to your challenges. We’re looking forward to making these sessions a regular thing, so we can continue learning from each other.
Here is a summary of the main topics that were covered during the session.
Let’s dive in.
Introduction to the Session
We had three customers waiting in the wings, each with a question that came up during that day’s training session. We delved into these real-world issues and explored how WinPure software could offer effective solutions. In essence, the customers were looking to resolve issues with their master records and create golden records.
All three of our customers are professionals with experience in data management and data governance. We had:
➡️Francesca, an implementation manager, wanted to use WinPure to match and consolidate her data to create master records identifying accounts with the highest invoiced amounts.
➡️Daniella has experience working with various systems in the machinery industry and wanted to use WinPure to clean up her master data.
➡️Mark has worked in the SAP arena, primarily in data migration, for almost 25 years and wanted to use WinPure for a large de-duplication and recursion process.
All three customers were already using WinPure to resolve some of their pressing challenges, which they discussed during the live show.
The First Challenge – Identifying Master Record by Highest Total Spend
Francesca is immersed in a project that involves navigating through thousands of vendor accounts across multiple systems. Her challenge lies in identifying the master record based on the highest invoiced amount for a particular account. She poses the question: How can she efficiently pinpoint the master vendor account based on the highest invoice amount?
To tackle Francesca’s challenge, the first step involved matching based on VAT registration numbers. An exact match was used for this criterion, as VAT numbers are unique identifiers that don’t lend themselves well to fuzzy matching.
Watch how Kathryn uses WinPure to identify master records by highest spend.
Next, a rule was defined to consider only amounts invoiced from January 2020 onwards when identifying the highest spender. The software then selects the record with the maximum invoice amount as the master record.
Selecting the master record means deciding which record in a group of matches should survive as the reference. That could be the record with the highest total spend, as in Francesca’s case, or the one with the most populated data fields. What makes this approach particularly effective is its adaptability; users can configure the criteria for selecting the master record to suit their own scenarios. This ensures that the chosen master record is the most relevant and useful for each specific case.
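To make the rule concrete, here is a minimal Python sketch of the same logic outside WinPure. It is not WinPure's implementation: the field names, the sample rows, and the reading of "highest invoiced amount" as total spend per account since the cutoff are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical vendor invoice rows; field names and values are illustrative only.
invoices = [
    {"account_id": "A1", "vat": "GB123456789", "invoice_date": date(2019, 11, 5), "amount": 9000.0},
    {"account_id": "A1", "vat": "GB123456789", "invoice_date": date(2021, 3, 1), "amount": 1200.0},
    {"account_id": "B7", "vat": "GB123456789", "invoice_date": date(2020, 6, 15), "amount": 4300.0},
    {"account_id": "B7", "vat": "GB123456789", "invoice_date": date(2022, 1, 10), "amount": 700.0},
    {"account_id": "C3", "vat": "DE811111111", "invoice_date": date(2020, 2, 2), "amount": 250.0},
]

CUTOFF = date(2020, 1, 1)  # only amounts invoiced from January 2020 onwards count


def pick_masters(rows, cutoff=CUTOFF):
    """Return {vat: (master_account_id, total_since_cutoff)}."""
    # Sum invoiced amounts per (VAT number, account); older invoices add nothing.
    totals = defaultdict(float)
    for row in rows:
        key = (row["vat"], row["account_id"])
        totals[key] += row["amount"] if row["invoice_date"] >= cutoff else 0.0

    # Accounts sharing an identical VAT number form one match group (exact match).
    # Within each group, the account with the highest total becomes the master.
    masters = {}
    for (vat, account), total in totals.items():
        if vat not in masters or total > masters[vat][1]:
            masters[vat] = (account, total)
    return masters


masters = pick_masters(invoices)
```

In this sample, account A1 has the single largest invoice, but it predates the cutoff, so B7 ends up as the master for the shared VAT number.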
The Second Challenge – Excluding Specific Matches in Multi-System Data
Daniella was dealing with multiple systems connected via an interface, which created one large list composed of three different sub-lists. Each dataset within these lists was linked by a unique number. Daniella wanted to exclude records with matching Geschäftspartner (business partner) numbers from being identified as matches by WinPure’s software.
Isolating Exact Matches for Targeted Data Matching
To address this, they decided to use an exact match for the Geschäftspartner numbers, effectively isolating them. These exact matches were then moved into a separate file. This allowed them to proceed with fuzzy matching based on company names and addresses, without the Geschäftspartner numbers interfering. The result was a set of match results that met Daniella’s specific criteria, successfully overcoming the challenge.
Watch how Kathryn does this:
In situations where records have matching numbers but differ in other attributes, users have the option to apply specific matching criteria to refine the results. One common approach is to use exact number matching for numerical data. By thoughtfully choosing these criteria, users can intentionally exclude certain attributes from the matching process. This level of customization leads to more accurate and targeted match results, aligning closely with the user’s specific needs and objectives.
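The two-pass idea can be sketched in a few lines of Python. This is an illustration under assumptions, not WinPure's matching engine: the `partner_no` and `name` fields, the sample records, the 0.9 threshold, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are all stand-ins chosen for the example.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical combined list from several systems; values are illustrative only.
records = [
    {"id": 1, "partner_no": "1001", "name": "Mueller Maschinenbau GmbH"},
    {"id": 2, "partner_no": "1001", "name": "Mueller Maschinenbau"},
    {"id": 3, "partner_no": "2002", "name": "Schmidt Hydraulik AG"},
    {"id": 4, "partner_no": "3003", "name": "Schmidt Hydraulic AG"},
    {"id": 5, "partner_no": "4004", "name": "Nordwerk Antriebe KG"},
]


def split_exact(rows, key="partner_no"):
    """Pass 1: set aside records whose partner number appears more than once."""
    counts = Counter(r[key] for r in rows)
    linked = [r for r in rows if counts[r[key]] > 1]
    remaining = [r for r in rows if counts[r[key]] == 1]
    return linked, remaining


def fuzzy_pairs(rows, threshold=0.9):
    """Pass 2: report pairs whose names are similar enough to be likely duplicates."""
    pairs = []
    for a, b in combinations(rows, 2):
        score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
        if score >= threshold:
            pairs.append((a["id"], b["id"]))
    return pairs


linked, remaining = split_exact(records)   # "linked" would go to the separate file
matches = fuzzy_pairs(remaining)           # fuzzy matching runs only on the rest
```

Records 1 and 2 share a partner number, so they are set aside and never show up as fuzzy matches, while records 3 and 4 are caught on name similarity alone. Comparing every pair is fine for a small sample but scales quadratically, so a real run on large lists would need some form of blocking.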
The Third Challenge – Achieving Golden Records
In the project Mark is currently working on, he is dealing with more than 50 source systems, each containing customer and vendor information. This data is often duplicated across systems, leading to a situation where the same address could exist as both a customer and a supplier, potentially resulting in up to 100 records for the same address details. His team’s aim is to merge this fragmented data into a single “golden record” that consolidates the data and references it back to the original source systems. Importantly, this process will be carried out in phases or releases. For instance, the first phase involves merging data from the initial three systems. The challenge then becomes how to match new data from subsequent releases to this golden record, ensuring ongoing data integrity and management.
Watch how Kathryn creates the golden record here.
Stage 1: Assigning Golden Record IDs
Kathryn began by assuming a unique account number for each record, which could vary—it might be an existing account number, a unique company registration number, or a simple numerical sequence. She duplicated this unique column and renamed it as the “Golden Record ID” for clarity. The matching criteria included the VAT number and company name, with a 95% match for the latter to account for minor variations. Additional matching criteria included the postcode, city, and CIN number.
The first step was to ensure that the original IDs were preserved while generating a unique Golden Record ID. Kathryn selected the “most populated” record as the master record in this case. She then updated and overwrote the newly created Golden Record ID with the ID from the master record. This ensured that each entry retained its original ID while also being assigned the correct Golden Record ID.
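As a rough illustration of this step, the Python sketch below copies the ID column, groups matching records, picks the most populated record in each group as the master, and writes the master's ID into the Golden Record ID of every record in the group. The field names and sample rows are hypothetical, and grouping on an exact VAT number is a simplification of the criteria used in the session, which also included a 95% company-name match, postcode, city, and CIN.

```python
from collections import defaultdict

# Hypothetical records from three source systems; values are illustrative only.
records = [
    {"id": "S1-001", "vat": "GB111", "name": "Acme Ltd", "postcode": "AB1 2CD", "city": ""},
    {"id": "S2-417", "vat": "GB111", "name": "Acme Limited", "postcode": "AB1 2CD", "city": "Leeds"},
    {"id": "S3-090", "vat": "GB222", "name": "Birch Supplies", "postcode": "", "city": "York"},
]


def populated(record):
    """Count the fields that actually hold a value."""
    return sum(1 for value in record.values() if value not in ("", None))


def assign_golden_ids(rows, match_key="vat"):
    # Duplicate the ID column: every record starts with its own ID as Golden Record ID.
    out = [dict(r, golden_record_id=r["id"]) for r in rows]

    # Group records that match (simplified here to an exact match on one key).
    groups = defaultdict(list)
    for r in out:
        groups[r[match_key]].append(r)

    # The most populated record in each group is the master; its ID overwrites
    # the Golden Record ID of the whole group. The original "id" is never touched.
    for group in groups.values():
        master = max(group, key=populated)
        for r in group:
            r["golden_record_id"] = master["id"]
    return out


with_golden_ids = assign_golden_ids(records)
```

Here S2-417 has a city where S1-001 does not, so it becomes the master and both records end up with `golden_record_id` set to `S2-417`, while each keeps its own `id`.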
Stage 2: Merging Records While Preserving IDs
Once the Golden Record IDs were established, the next step was to merge the records. During this merging process, all original IDs were preserved in the new, consolidated record. This resulted in a single record with a unique Golden Record ID, but without losing any of the original IDs from the various source systems.
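One way to picture the merge is the short Python sketch below. It is only an approximation of what the tool does: the `source_ids` list, the sample rows, and the "first non-empty value wins" survivorship rule are assumptions introduced for the example.

```python
# Hypothetical rows that already carry a Golden Record ID; values are illustrative only.
rows = [
    {"id": "S1-001", "golden_record_id": "S2-417", "name": "Acme Ltd", "city": ""},
    {"id": "S2-417", "golden_record_id": "S2-417", "name": "Acme Limited", "city": "Leeds"},
    {"id": "S3-090", "golden_record_id": "S3-090", "name": "Birch Supplies", "city": "York"},
]


def merge_by_golden_id(records):
    """Collapse each Golden Record ID group into one record, keeping all source IDs."""
    merged = {}
    for r in records:
        gid = r["golden_record_id"]
        target = merged.setdefault(gid, {"golden_record_id": gid, "source_ids": []})
        # Preserve the original ID so the golden record can be traced back to its sources.
        target["source_ids"].append(r["id"])
        # Assumed survivorship rule: keep the first non-empty value seen for each field.
        for field, value in r.items():
            if field in ("id", "golden_record_id"):
                continue
            if value and not target.get(field):
                target[field] = value
    return list(merged.values())


golden = merge_by_golden_id(rows)
```

The three input rows collapse into two golden records, and the first one lists both `S1-001` and `S2-417` in `source_ids`, which is what allows it to be referenced back to the original systems.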
Preparing for Future Releases
For subsequent data releases, Mark would need to either load the Golden Records into the tool or combine them with the new data set. Kathryn suggested that customers often create unique IDs within the sheet to serve as the Golden Record ID for future matching. Mark agreed, noting that he would input the new system-generated ID into the sheet and create a rule based on that new ID being the primary selection parameter.
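A later release could then be handled along the lines of the following Python sketch. It assumes the existing golden records are loaded alongside the new data, that an exact VAT match stands in for the full matching criteria, and that an existing Golden Record ID always takes priority as the selection parameter; the function and field names are hypothetical.

```python
# Hypothetical golden records from an earlier release and rows from a new release.
golden = [
    {"golden_record_id": "G-0001", "vat": "GB111", "name": "Acme Limited"},
]
new_release = [
    {"id": "S4-010", "vat": "GB111", "name": "ACME Ltd."},
    {"id": "S4-011", "vat": "GB333", "name": "Cedar Tools"},
    {"id": "S5-900", "vat": "GB333", "name": "Cedar Tools Ltd"},
]


def match_release(golden_rows, new_rows, match_key="vat"):
    """Attach each new row to an existing golden record, or start a new one."""
    # Existing Golden Record IDs are indexed first, so they win over new IDs.
    index = {g[match_key]: g["golden_record_id"] for g in golden_rows}
    assigned = []
    for r in new_rows:
        gid = index.get(r[match_key])
        if gid is None:
            # No golden record yet: this row's own ID seeds a new golden record.
            gid = r["id"]
            index[r[match_key]] = gid
        assigned.append(dict(r, golden_record_id=gid))
    return assigned


assigned = match_release(golden, new_release)
```

The first new row attaches to the existing golden record `G-0001`, while the two rows for the unseen VAT number form a new group under `S4-011`.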
This comprehensive approach not only solved Mark’s immediate challenge but also provided a scalable solution for future data management needs.
Work with WinPure to Create, Manage, and Dedupe Golden Records
Working with WinPure offers a streamlined approach to creating, managing, and deduplicating Golden Records. The platform’s robust matching algorithms and customizable criteria make it easy to consolidate data from multiple source systems into a single, authoritative record.
Whether you’re working with customer, vendor, CRM, or product data, WinPure can help you with data cleaning, data matching, and data deduplication.
Get in touch with us using the form below!