What’s Needed for Entity Resolution Success?

Launching a successful entity resolution solution involves many considerations.
Identifying persons of interest from existing and new data sources is essential for national security and law enforcement efforts. However, the current situation is less than optimal.
Entity resolution technology that can find and compare data within and across organizations and disparate data sources can provide huge benefits to the Intelligence Community, integrated law enforcement and immigration and border control agents. However, technical challenges can impede efforts to deploy these systems.
When selecting an entity resolution solution, an organization should review a list of critical feature and implementation requirements to ensure that it is selecting the most accurate, flexible and cost-effective system to meet its needs.
There are so many different considerations that a comprehensive list can be daunting. However, here are some of the most important:
High Degree of Accuracy
Entities that do not want to be found often employ deception techniques, which makes entity resolution a difficult task. The solution needs to be able to make sense of bad data and accurately match and link records based on non-regular attributes while minimizing false positives and false negatives.
In other words, an accurate, configurable matching algorithm that can produce reliable results from sparse, low-quality or fraudulent data is a necessity.
Advanced Relationship Analysis
Identifying obvious and non-obvious relationships across disparate data sets requires the ability to correlate and link related entities and compare relevant attributes that are associated with a person, place or thing against one another.
The system must be able to do advanced comparisons that include fuzzy matching and examine the degree of similarities and differences across attributes before declaring that two or more entities are related.
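As a rough illustration of the kind of comparison described above, here is a minimal Python sketch that scores two records by fuzzy similarity across several attributes before declaring them related. It is not how any particular product works: the attribute names, the weights and the 0.85 threshold are all assumptions chosen for the example, and the standard library's difflib stands in for whatever matching algorithm a real solution would use.

```python
from difflib import SequenceMatcher

# Hypothetical attribute weights; a real deployment would tune these.
WEIGHTS = {"name": 0.5, "dob": 0.3, "city": 0.2}


def similarity(a: str, b: str) -> float:
    """Return a 0..1 fuzzy similarity, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict, weights: dict = WEIGHTS) -> float:
    """Weighted average similarity over attributes present in both records."""
    total = 0.0
    weight_sum = 0.0
    for attr, weight in weights.items():
        value_a, value_b = rec_a.get(attr), rec_b.get(attr)
        if not value_a or not value_b:
            continue  # sparse data: skip attributes missing on either side
        total += weight * similarity(value_a, value_b)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


def is_match(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Declare a match only when the weighted score clears the (assumed) threshold."""
    return match_score(rec_a, rec_b) >= threshold


if __name__ == "__main__":
    a = {"name": "Jon Smith", "dob": "1980-01-01", "city": "Boston"}
    b = {"name": "John Smith", "dob": "1980-01-01"}
    print(round(match_score(a, b), 3), is_match(a, b))
```

Raising the threshold reduces false positives at the cost of more false negatives, and lowering it does the reverse, which is why the threshold and weights need to be configurable.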
Scale
The software should scale to resolve across billions of entities, not just millions, in both batch and real time. In batch mode, it should support the migration of mass volumes of data (on the order of billions of records).
Performance
In real time, the solution should deliver sub-second response times against volumes in the billions while upholding established service level agreements.
Transparency
The solution should be configurable, tunable, and explainable. It should not be a black box. Analysts need to feel comfortable with the results being returned and want to see how the solution came to its deductions and conclusions.
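To make the idea of an explainable result concrete, here is a small Python sketch in which a match decision is returned together with the evidence behind it. Everything in it is an assumption for illustration: the function name explain_match, the weights, the threshold and the use of difflib for similarity are hypothetical and not taken from any specific product.

```python
from difflib import SequenceMatcher


def explain_match(rec_a: dict, rec_b: dict, weights: dict, threshold: float) -> dict:
    """Return a match decision plus the per-attribute evidence that produced it."""
    evidence = []
    total = 0.0
    weight_sum = 0.0
    for attr, weight in weights.items():
        value_a, value_b = rec_a.get(attr), rec_b.get(attr)
        if not value_a or not value_b:
            # Record that the attribute was skipped so the analyst can see why.
            evidence.append({"attribute": attr, "weight": weight,
                             "similarity": None, "note": "missing on one side"})
            continue
        sim = SequenceMatcher(None, value_a.lower(), value_b.lower()).ratio()
        evidence.append({"attribute": attr, "weight": weight,
                         "similarity": round(sim, 3), "note": "compared"})
        total += weight * sim
        weight_sum += weight
    score = total / weight_sum if weight_sum else 0.0
    return {
        "score": round(score, 3),
        "threshold": threshold,
        "decision": "match" if score >= threshold else "no match",
        "evidence": evidence,
    }


if __name__ == "__main__":
    result = explain_match(
        {"name": "Jon Smith", "city": "Boston"},
        {"name": "John Smith"},
        weights={"name": 0.7, "city": 0.3},  # assumed, tunable weights
        threshold=0.8,                       # assumed, tunable threshold
    )
    for row in result["evidence"]:
        print(row)
    print(result["decision"], result["score"])
```

Because the weights and threshold are passed in as parameters, an analyst can adjust them and see exactly how the decision changes.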
Next week, I’ll cover several additional considerations. In the meantime, are there any that you would suggest? Add your thoughts in the comments.