Entity Resolution Taken One Step Further: “Spot The Difference”

In a recent blog post, Jeffrey Huth described a simplified analogy for explaining entity resolution management to laymen. He proposed that rather than sticking with the more abstract business terms we use in the field, we should explain entity management as if we were talking about Google searches.

Although I wholeheartedly agree with the example he provided, it still requires people to understand the principles of Google, something which my techno-less mother might still comprehend but my 7-year-old daughter certainly would not. So I decided to simplify the analogy even more and make it accessible even to first graders.

The “Spot the Difference” game is a common theme found in Sunday comics and activity books. In it, a person has to decide which image differs from the others, using several visual cues to make the determination. An example, courtesy of My First Pony (Kappa Books, 2005), is provided below. Before reading on, see if you can Spot the Difference.

Entity resolution can help you spot the difference

(For those of you still reading, the answer is 2. If you look at the butterfly and the top of the party hat, you’ll see it.)

Of course, this example is one of the best illustrations of the fundamentals of entity management, and we can explain it to a 7-year-old as follows:

“Consider that each pony either matches or doesn’t match with one another. Those that do match can be grouped into what we call an entity. Those that differ can be put into another entity. We know that we have three ponies (“members”) that can be put together into an entity, and one pony goes to another entity. If we have even more pictures, we can have still more members within each entity. Entity management is about comparing members and combining them either into similar entities or different ones.”
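For readers who prefer code to ponies, the grouping described above can be sketched in a few lines of Python. This is only a toy illustration: the attribute names (`butterfly`, `hat_top`) and their values are invented for the example, and members are placed in the same entity only when every attribute matches exactly, which is far simpler than what a real entity resolution engine does.

```python
from collections import defaultdict

# Hypothetical pony "members"; attribute names and values are made up
# for illustration. Pony 2 is the odd one out, as in the puzzle.
ponies = {
    1: {"butterfly": "pink", "hat_top": "star"},
    2: {"butterfly": "blue", "hat_top": "ball"},
    3: {"butterfly": "pink", "hat_top": "star"},
    4: {"butterfly": "pink", "hat_top": "star"},
}

def group_into_entities(members):
    """Put members whose attributes match exactly into the same entity."""
    entities = defaultdict(list)
    for member_id, attrs in members.items():
        # Members with identical attribute sets share the same key.
        key = tuple(sorted(attrs.items()))
        entities[key].append(member_id)
    return list(entities.values())

entities = group_into_entities(ponies)
print(entities)  # [[1, 3, 4], [2]]
```

Three ponies land in one entity and the fourth gets an entity of its own, just as in the explanation for the 7-year-old.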

What our brains attempt to do when comparing and singling out the images is not much different from what the Initiate algorithm attempts with patient records.

In fact, we could stretch the analogy even further to discuss which attributes require the highest weightings, which features are most important in determining thresholds between same and different, and even bring in the concept of merging different attributes (something which Huth does nicely in his example).
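To make the idea of weightings and thresholds a little more concrete, here is a minimal Python sketch. It is not the Initiate algorithm; the weights, the threshold, and the attribute names are all assumptions chosen for illustration. Each attribute on which two members agree contributes its weight to a score, and the pair is treated as the same entity only if the score reaches the threshold.

```python
# Hypothetical weights: how much agreement on each attribute counts.
WEIGHTS = {"butterfly": 3.0, "hat_top": 2.0, "mane_color": 1.0}
# Assumed cutoff between "same" and "different".
THRESHOLD = 4.0

def match_score(a, b, weights=WEIGHTS):
    """Sum the weights of the attributes on which both members agree."""
    return sum(
        weight
        for attr, weight in weights.items()
        if attr in a and attr in b and a[attr] == b[attr]
    )

def same_entity(a, b, threshold=THRESHOLD):
    """Decide whether two members belong to the same entity."""
    return match_score(a, b) >= threshold

pony_a = {"butterfly": "pink", "hat_top": "star", "mane_color": "brown"}
pony_b = {"butterfly": "pink", "hat_top": "star", "mane_color": "black"}
pony_c = {"butterfly": "blue", "hat_top": "ball", "mane_color": "brown"}

print(match_score(pony_a, pony_b), same_entity(pony_a, pony_b))  # 5.0 True
print(match_score(pony_a, pony_c), same_entity(pony_a, pony_c))  # 1.0 False
```

In this sketch a small disagreement on a low-weight attribute (mane color) does not split two ponies apart, while disagreement on the heavily weighted butterfly and hat does. Choosing those weights and the threshold well is where the real work lies.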

Of course, I will leave the delivery of these topics for the reader to expound upon as it could become very broad in scope and quite taxing on the 7-year-olds still trapped in the discussion.



1 Responses »

  1. Great way of explaining what Entity Resolution is about.
    Thanks.
