Entity Resolution to Build a Better “Watch List”

A new watch list must incorporate not only data but also probability

Yesterday, we discussed reasonable security questions and the common rationale as to why Umar Abdulmutallab was not flagged on a watch list – a lack of information about his background and intentions.

Today, let’s examine the two big things that will hopefully come out of the “watch list review” requested by the administration:

Data. We should not be afraid to create more data sources and integrate more information. The fear is that useful information gets lost in a sea of worthless data. Entity resolution technology can make sense of all that information and resolve identities and the relationships between them. Entity resolution is a type of master data management (MDM) that deals explicitly with the challenges faced by the intelligence community and law enforcement.

Entity resolution can determine that Abdulmutallab is on a UK watch list and has possible terrorist ties. An entity resolution solution with sufficient scalability, accuracy and performance can process billions of records. Software such as Initiate Master Data Service has demonstrated this with law enforcement and intelligence customers globally.
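To make the idea concrete, here is a minimal sketch of matching records across two sources. It is not how Initiate Master Data Service or any production system works. The record fields, the 0.7/0.3 weights, the 0.85 threshold and the function names are all assumptions made up for illustration, and the sample records are invented placeholders.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, and sort tokens so word order is ignored."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(sorted(cleaned.split()))


def match_score(a, b):
    """Blend name similarity with date-of-birth agreement (weights are illustrative)."""
    name_sim = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    dob_sim = 1.0 if a["dob"] == b["dob"] else 0.0
    return 0.7 * name_sim + 0.3 * dob_sim


def resolve(records_a, records_b, threshold=0.85):
    """Return (id_a, id_b, score) for every cross-source pair at or above the threshold."""
    matches = []
    for a in records_a:
        for b in records_b:
            score = match_score(a, b)
            if score >= threshold:
                matches.append((a["id"], b["id"], round(score, 3)))
    return matches


# Invented placeholder records from two hypothetical sources.
source_a = [{"id": "A1", "name": "Smith, John A.", "dob": "1980-01-02"}]
source_b = [
    {"id": "B1", "name": "John A Smith", "dob": "1980-01-02"},
    {"id": "B2", "name": "Jane Doe", "dob": "1975-06-30"},
]

matches = resolve(source_a, source_b)
print(matches)
```

Comparing every pair like this does not scale to billions of records. Real systems narrow the candidates first and use far richer matching than a single string comparison, but the shape of the problem is the same: different spellings and formats in different sources, resolved to one identity with a score attached.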

Probability. The whole notion of “a list” should be revisited. Abdulmutallab was not on the No-fly List or the Selectee List. He appeared only in the Terrorist Identities Datamart Environment (TIDE) as a person of interest with possible terrorist ties. As a result, the answer to the binary question of whether Abdulmutallab should be allowed to fly was yes.

But when we are already dealing with confidence levels in the data (he might have been a terrorist), we need our watch list process to deal with confidence levels as well.

A list implies static, deterministic decision making (e.g., Is he on the no-fly list? No). Rather, we should introduce the notion of probability and a grey area.

Start with the initial question: Is he a potentially dangerous passenger? Possibly, because he appears on the TIDE list. Next, consider the other factors. Does the likelihood of his being dangerous increase if he bought a one-way ticket and checked no bags? Yes. Does it increase again if he’s been denied entry to an allied country? Yes.
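One simple way to express this kind of step-by-step reasoning is a Bayesian odds update, in which each new factor multiplies the current odds by a likelihood ratio. The sketch below assumes the factors are independent, which real factors often are not. The prior and the likelihood ratios are invented numbers for illustration only, not estimates of anything.

```python
def update_probability(prior, likelihood_ratios):
    """Apply each likelihood ratio to the prior odds and convert back to a probability.

    Assumes the factors are independent of one another (a naive simplification).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical numbers: a low starting probability, then two factors that
# each make the "dangerous" hypothesis more likely.
prior = 0.01
factors = {
    "one-way ticket, no checked bags": 5.0,
    "denied entry to an allied country": 8.0,
}

posterior = update_probability(prior, list(factors.values()))
print(f"prior={prior:.3f} posterior={posterior:.3f}")
```

A likelihood ratio above 1 raises the probability and one below 1 lowers it, so the same mechanism handles reassuring information as well as worrying information.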

Using all of the relevant information about Abdulmutallab and applying the concept of probability to the question of whether he should have been allowed on a plane leads to much better decision making. But it requires that all of the relevant information be shared and available to the decision maker.

Yet information sharing itself is fraught with challenges if not done correctly.

Commentators have recently been calling vociferously for better information sharing. Information sharing is clearly a necessary part of the solution to prevent another terrorist attack, but it is not as easy as it sounds.

Information sharing can be damaging and involves risk. Let’s assume, as some people propose, that the government starts sharing all of its lists across multiple agencies and branches of government. We can’t envision all of the complications, but it is fairly clear that some “list sharing” can result in serious adverse consequences.

As information is shared with more individuals, the chance of leaks grows. Simplistic, uncontrolled “list sharing” can help terrorists understand which names are on the “lists” and which are not. This can unintentionally aid terrorist operations, and possibly reveal the techniques and individuals responsible for making the terrorist names available to the intelligence agencies.

Politics aside, it is fairly clear that “list sharing” across agencies is a complex master data governance problem applied to multiple “lists” or, in more “physical” terms, multiple data hubs.

Master data governance is a control discipline whose primary focus is cross-functional master data quality, consistency, and sharing. It includes the technologies, processes, and design options that enable controlled sharing of master data while minimizing total risk, which includes both the risk of sharing and the risk of not sharing.

This is an opportunity to use entity resolution in conjunction with advanced master data governance to connect the dots across intelligence sources while minimizing the risks of information sharing.

To be able to resolve a person like Abdulmutallab, you must be able to do many things quickly:

  • Integrate many sources with massive amounts of data into information and relationships
  • Minimize the chance of compromising the lists
  • Be willing to assign confidence scores
  • Define data governance procedures for the grey area
  • Assimilate new data that can change how you perceive a situation and move it out of the grey area into black and white
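The last three points can be sketched as a decision rule with two cut-offs instead of a single yes/no list. Scores between the cut-offs fall into the grey area and get their own procedure. The cut-off values and the action names below are hypothetical and chosen only to show the shape of the rule.

```python
def triage(score, clear_below=0.05, deny_above=0.60):
    """Map a confidence score in [0, 1] to an action; both cut-offs are illustrative."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < clear_below:
        return "clear"
    if score > deny_above:
        return "deny boarding"
    return "secondary screening"  # the grey area gets its own governed procedure


# Hypothetical scores for one traveler as new data is assimilated over time.
score_history = [0.02, 0.30, 0.75]
actions = [triage(score) for score in score_history]
print(actions)
```

Where the cut-offs sit is a policy decision rather than a technical one, which is why the governance procedures for the grey area have to be defined explicitly.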

Abdulmutallab is no longer a ‘possible’ terrorist. He now permanently has the label of ‘known terrorist’. The problem facing our government is finding persons like Abdulmutallab and recognizing them as dangerous before they have an opportunity to do harm.

My colleagues Jonathan McDonald and Scott Schumacher have written extensively on entity resolution. View their index.


1 Response

  1. Congratulations on some informative posts describing the application of entity resolution. For further reading on the topic, IdentityResolutionDaily.com provides a source of regular posts about identity resolution and entity analytics. A recent article addressed this very subject:

    "Actionable Identity Intelligence from Identity Resolution"
    http://identityresolutiondaily.com/682/actionable-identity-intelligence-from-identity-resolution/

    Bob Barker
    Editor
    IdentityResolutionDaily.com