Homeland Security: What Can Mathematics Do? Fred Roberts, Professor of Mathematics, Rutgers University; Chair, RU Homeland Security Research Initiative; Director, DIMACS Center.
Mathematical methods have become important tools in preparing plans for defense against terrorist attacks, especially when combined with powerful, modern computer methods for analysis and simulation.
After Pearl Harbor: Mathematics and mathematicians played a vitally important role in the US World War II effort.
Critical War-Effort Contributions Included: • Code breaking (pictured: Enigma machine) • Creation of the mathematics-based field of Operations Research: logistics, optimal scheduling, inventory, strategic planning
But: Terrorism is Different. Can Mathematics Really Help? 5 + 2 = ? 1, 2, 3, …
I’ll Illustrate with Mathematics Projects I’m Involved in. There are Many Others. • Bioterrorism Sensor Location • Monitoring Message Streams • Identification of Authors • Detecting a Bioterrorist Attack through “Syndromic Surveillance”
OUTLINE • Bioterrorism Sensor Location • Monitoring Message Streams • Identification of Authors • Detecting a Bioterrorist Attack through “Syndromic Surveillance”
Early warning is critical in defense against terrorism • This is a crucial factor underlying the government’s plans to place networks of sensors/detectors to warn of a bioterrorist attack The BASIS System – Salt Lake City
Locating Sensors is not Easy • Sensors are expensive • How do we select them and where do we place them to maximize “coverage,” expedite an alarm, and keep the cost down? • Approaches that improve upon existing, ad hoc location methods could save countless lives in the case of an attack and also save money in capital and operational costs.
Two Fundamental Problems • Sensor Location Problem: • Choose an appropriate mix of sensors • Decide where to locate them for best protection and early warning
Two Fundamental Problems • Pattern Interpretation Problem: When sensors set off an alarm, help public health decision makers decide • Has an attack taken place? • What additional monitoring is needed? • What was its extent and location? • What is an appropriate response?
The Sensor Location Problem • Approach is to develop new algorithmic methods. • Developing new algorithms involves fundamental mathematical analysis. • Analyzing how efficient algorithms are involves fundamental mathematical methods. • Implementing the algorithms on a computer is often a separate problem – which needs to go hand in hand with the basic mathematics of algorithm development.
Greedy Algorithms • Find the most important location first and locate a sensor there. • Find second-most important location. • Etc. • Builds on earlier mathematical work at Institute for Defense Analyses (Grotte, Platt) • “Steepest ascent approach.” • No guarantee of “optimal” or best solution. • In practice, gets pretty close to optimal solution.
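The greedy idea can be sketched as follows. This is only an illustration under stated assumptions: "most important" is taken to mean "covers the most locations not yet covered," and the candidate sites A, B, C and their coverage sets are hypothetical. It is not the Institute for Defense Analyses method.

```python
def greedy_sensor_placement(coverage, budget):
    """Pick up to `budget` sites, each time taking the site that adds
    the most not-yet-covered locations (steepest ascent)."""
    covered = set()
    chosen = []
    for _ in range(budget):
        best_site, best_gain = None, 0
        for site in sorted(coverage):
            if site in chosen:
                continue
            # Marginal gain: locations this site would newly cover.
            gain = len(coverage[site] - covered)
            if gain > best_gain:
                best_site, best_gain = site, gain
        if best_site is None:  # no remaining site adds coverage
            break
        chosen.append(best_site)
        covered |= coverage[best_site]
    return chosen, covered


# Hypothetical candidate sites and the locations each one would cover.
coverage = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 5},
    "C": {3, 4, 6},
}
sites, covered = greedy_sensor_placement(coverage, budget=2)
print(sites, sorted(covered))
```

With two sensors, the greedy choice here is A then B, covering five locations, while B and C together would cover all six: no guarantee of the best solution, but close.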
Algorithmic Approaches II : Variants of Classic Facility Location Theory Methods
Location Theory • Old problem in Operations Research: Where to locate facilities (fire houses, garbage dumps, etc.) to best serve “users” • Often deal with a network with nodes, edges, and distances along edges • Users u1, u2, …, un are located at nodes • One approach: locate the facility at node x chosen so that sum of distances to users is minimized. • Minimize: $\sum_{i=1}^{n} d(x, u_i)$
Location Theory: A Network [Figure: a network on six nodes a, b, c, d, e, f, with six edges each labeled 1] • 1’s represent distances along edges • Nodes are places for users or facilities
Location Theory: Example [Figure: the same network with users u1, u2, u3 placed at nodes] Sum of distances $\sum_{i} d(x, u_i)$ for each candidate node x: • x=a: 1+1+2=4 • x=b: 2+0+1=3 • x=c: 3+1+0=4 • x=d: 2+2+1=5 • x=e: 1+3+2=6 • x=f: 0+2+3=5 • x=b is optimal
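The table of sums can be reproduced with a short script. The network layout is an assumption inferred from the distances on the slide: a cycle a-b-c-d-e-f-a with unit edges, with u1 at f, u2 at b, and u3 at c. The function name is illustrative.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path distances from `source` when every edge has length 1."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Assumed reading of the figure: a 6-cycle with unit-length edges.
graph = {
    "a": ["b", "f"], "b": ["a", "c"], "c": ["b", "d"],
    "d": ["c", "e"], "e": ["d", "f"], "f": ["e", "a"],
}
users = ["f", "b", "c"]  # assumed positions of u1, u2, u3

# Sum of distances from each candidate facility node x to all users.
totals = {x: sum(bfs_distances(graph, x)[u] for u in users) for x in graph}
best = min(totals, key=totals.get)
print(totals, best)
```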
Variants of Classic Facility Location Theory Methods: Complications • We don’t have a network with nodes and edges; we have points in a city • Sensors can only be at certain locations (size, weight, power source, hiding place) • We need to place more than one sensor • Instead of “users,” we have places where potential attacks take place. • Potential attacks take place with certain probabilities. • Wind, buildings, mountains, etc. add complications.
Variants of Classic Facility Location Theory Methods: Complications • These more complex problems are hard! • The best-known algorithms for solving these “higher-dimensional” variants of the classic location problem are due to Rafail Ostrovsky -- a partner on our project. • The mathematics-based approximation methods due to Ostrovsky and his colleagues are promising.
Algorithmic Approaches III : Variants of Air Pollution Monitoring Models
Variants of Air Pollution Monitoring Models • Long history of using mathematical models to locate air pollution monitors. • Use fluid dynamics • Use plume models. • Large computer simulations needed. • Long used in nuclear weapons defense.
Variants of Air Pollution Monitoring Models • Mathematical challenge: Modify air pollution monitor placement modeling tools for complex biological agents. • E.g.: Complications arise when applying the models to cities: Buildings make it hard!
The Pattern Interpretation Problem (PIP) • It will be up to the Decision Maker to decide how to respond to an alarm from the sensor network.
Approaching the PIP: Minimizing False Alarms One approach: Redundancy. • Could require two or more sensors to make a detection before an alarm is considered confirmed • Could require same sensor to register two alarms: Portal Shield requires two positives for the same agent during a specific time period.
Approaching the PIP: Minimizing False Alarms • Could place two or more sensors at or near the same location. Require two proximate sensors to give off an alarm before we consider it confirmed. Redundancy has drawbacks: cost, delay in confirming an alarm. We need mathematical methods to analyze the tradeoff between lowered false alarm rate and extra cost/delay
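One way to start on the tradeoff is a "k of n" calculation. The sketch below assumes sensors fire independently and identically, which real co-located sensors may not; the false-alarm and detection probabilities are made-up numbers for illustration only.

```python
from math import comb


def prob_at_least_k(n, k, p):
    """Probability that at least k of n independent sensors fire,
    when each fires with probability p (binomial tail)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


p_false = 0.01   # hypothetical per-sensor false-alarm probability
p_detect = 0.95  # hypothetical per-sensor detection probability

single_false = prob_at_least_k(1, 1, p_false)    # one sensor, one alarm
double_false = prob_at_least_k(2, 2, p_false)    # two sensors, both must alarm
double_detect = prob_at_least_k(2, 2, p_detect)  # chance both detect a real event
print(single_false, double_false, double_detect)
```

Under these assumptions, requiring two sensors cuts the false-alarm probability from 0.01 to 0.0001, but also lowers the chance of confirming a real event from 0.95 to about 0.90, on top of the cost of the second sensor.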
Approaching the PIP: Using Decision Rules • Existing sensors come with a sensitivity level specified and sound an alarm when the number of particles collected is sufficiently high – above threshold.
Approaching the PIP: Using Decision Rules • Let f(x) = number of particles collected at sensor x in the past 24 hours. Sound an alarm if f(x) > T. • Alternative decision rule: alarm if two sensors reach 90% of threshold, three reach 75% of threshold, etc. Alarm if: f(x) > T for some x, or if f(x1) > .9T and f(x2) > .9T for some x1,x2, or if f(x1) > .75T and f(x2) > .75T and f(x3) > .75T for some x1,x2,x3.
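The alternative decision rule can be written compactly as a list of (number of sensors, fraction of threshold) pairs. A minimal sketch; the sensor names, counts, and T = 100 are hypothetical.

```python
def should_alarm(counts, threshold, rules=((1, 1.0), (2, 0.9), (3, 0.75))):
    """Return True if, for some (m, fraction) rule, at least m sensors
    have a particle count strictly above fraction * threshold."""
    for num_sensors, fraction in rules:
        exceeding = sum(1 for c in counts.values() if c > fraction * threshold)
        if exceeding >= num_sensors:
            return True
    return False


T = 100  # hypothetical threshold
one_high = should_alarm({"s1": 101, "s2": 5, "s3": 5}, T)     # f(x) > T for one x
two_near = should_alarm({"s1": 95, "s2": 92, "s3": 10}, T)    # two above .9T
not_enough = should_alarm({"s1": 95, "s2": 80, "s3": 10}, T)  # only two above .75T
print(one_high, two_near, not_enough)
```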
Approaching the PIP: Using Decision Rules • Prior work along these lines in missile detection (Cherikh and Kantor)
Bioterrorism Sensor Location: Partner Agencies/Institutions • Defense Threat Reduction Agency • MITRE Corporation • Los Alamos National Laboratory • Institute for Defense Analyses • New York City Dept. of Health
OUTLINE • Bioterrorism Sensor Location • Monitoring Message Streams • Identification of Authors • Detecting a Bioterrorist Attack through “Syndromic Surveillance”
Monitoring Message Streams: Algorithmic Methods for Automatic Processing of Messages
Objective: Monitor huge communication streams, in particular, streams of textualized communication, to automatically detect pattern changes and "significant" events. Motivation: monitoring email traffic, news, communiques, faxes.
Technical Approaches: • Given stream of text in any language. • Decide whether "events" are present in the flow of messages. • Event: new topic or topic with unusual level of activity. • Suppose events have been classified into classes or groups: group 1, group 2, … • A new message comes in. Does it fit into group 1? Into group 2? Or does it (and related messages) define a new group of interest?
One Approach: “Bag of Words” • List all the words of interest that may arise in the messages being studied: w1, w2,…,wn • Bag of words vector b has k as the ith entry if word wi appears k times in the message. • Sometimes, use “bag of bits”: Vector of 0’s and 1’s; count 1 if word wi appears in the message, 0 otherwise.
“Bag of Words” Example Words: w1 = bomb, w2 = attack, w3 = strike, w4 = train, w5 = plane, w6 = subway, w7 = New York, w8 = Los Angeles, w9 = Madrid, w10 = Tokyo, w11 = London, w12 = January, w13 = March
“Bag of Words” Message 1: Strike Madrid trains on March 1. Strike Tokyo subway on March 2. Strike New York trains on March 11. Bag of words: b1 = (0,0,3,2,0,1,1,0,1,1,0,0,3). Words: w1 = bomb, w2 = attack, w3 = strike, w4 = train, w5 = plane, w6 = subway, w7 = New York, w8 = Los Angeles, w9 = Madrid, w10 = Tokyo, w11 = London, w12 = January, w13 = March
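The vector b1 can be computed mechanically. A minimal sketch, assuming case-insensitive matching and that a trailing "s" is ignored (so "trains" counts as "train"); that matching rule is an assumption chosen to reproduce the slide, not something the slides specify.

```python
import re

VOCAB = ["bomb", "attack", "strike", "train", "plane", "subway",
         "New York", "Los Angeles", "Madrid", "Tokyo", "London",
         "January", "March"]


def bag_of_words(message, vocab=VOCAB):
    """Entry i is the number of times vocab[i] appears in the message
    (case-insensitive, optional plural 's')."""
    counts = []
    for term in vocab:
        pattern = r"\b" + re.escape(term) + r"s?\b"
        counts.append(len(re.findall(pattern, message, flags=re.IGNORECASE)))
    return tuple(counts)


message1 = ("Strike Madrid trains on March 1. Strike Tokyo subway on March 2. "
            "Strike New York trains on March 11.")
b1 = bag_of_words(message1)
# "Bag of bits": 1 if the word appears at all, 0 otherwise.
bits1 = tuple(1 if k > 0 else 0 for k in b1)
print(b1)
print(bits1)
```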
The Approach: “Bag of Words” • Key idea: how close are two such vectors? • Suppose known messages have been classified into different groups: group 1, group 2, … • A message comes in. Which group should we put it in? Or is it “new”? • You look at the bag of words vector associated with the incoming message and see if it “fits” closely to typical vectors associated with a given group.
The Approach: “Bag of Words” • Your performance can improve over time. • You “learn” how to classify better. • Typically you do this “automatically” and try to develop mathematical methods that will allow a machine to “learn” from past data.
“Bag of Words” Message 2: Bomb Madrid trains on March 1. Attack Tokyo subway on March 2. Strike New York trains on March 11. Bag of words: b2 = (1,1,1,2,0,1,1,0,1,1,0,0,3). Words: w1 = bomb, w2 = attack, w3 = strike, w4 = train, w5 = plane, w6 = subway, w7 = New York, w8 = Los Angeles, w9 = Madrid, w10 = Tokyo, w11 = London, w12 = January, w13 = March
“Bag of Words” Note that b1 and b2 are “close” b1 = (0,0,3,2,0,1,1,0,1,1,0,0,3) b2 = (1,1,1,2,0,1,1,0,1,1,0,0,3) Close could be measured using distance d(b1,b2) = number of places where b1,b2 differ (“Hamming distance” between vectors). Here: d(b1,b2) = 3 The messages are “similar” – could belong to the same group or class of messages.
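The distance computation is a one-liner. The sketch below also prints the sum of absolute differences as a point of comparison; that second measure is not from the slides and is included only to show that Hamming distance ignores how much two entries differ.

```python
def hamming_distance(u, v):
    """Number of positions at which two equal-length vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(1 for a, b in zip(u, v) if a != b)


b1 = (0, 0, 3, 2, 0, 1, 1, 0, 1, 1, 0, 0, 3)
b2 = (1, 1, 1, 2, 0, 1, 1, 0, 1, 1, 0, 0, 3)

d_hamming = hamming_distance(b1, b2)
# Alternative (an assumption, not from the slides): sum of absolute differences.
d_absolute = sum(abs(a - b) for a, b in zip(b1, b2))
print(d_hamming, d_absolute)
```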
“Bag of Words” Message 3: Go on strike against Madrid trains on March 1. Go on strike against Tokyo subway on March 2. Go on strike against New York trains on March 11. Bag of words b3 = same as b1. BUT: message 3 is quite different from message 1. Shows complexity of problem. Maybe missing some key words like “go” or maybe we should use pairs of words like “on strike” (“bigrams”)
One Approach: k-Nearest Neighbor (kNN) Classifiers • How kNN Classifiers Work: • Find k most similar “training” messages (neighbors) • Assign a message to those groups that are most common among neighbors (using weighting by distance) • kNN classifiers had been considered inefficient since finding neighbors is slow
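A minimal kNN sketch over bag-of-words vectors, under several assumptions: Hamming distance as the similarity measure, a vote weight of 1/(1 + distance), and a single winning group returned (the slides allow assigning several groups). The group labels and the two "travel" training vectors are hypothetical.

```python
from collections import defaultdict


def hamming(u, v):
    """Number of positions where the two vectors differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def knn_classify(query, training, k=3):
    """training: list of (vector, group) pairs. The k nearest vectors
    vote for their group, with closer neighbors weighted more heavily."""
    neighbors = sorted(training, key=lambda item: hamming(query, item[0]))[:k]
    votes = defaultdict(float)
    for vector, group in neighbors:
        votes[group] += 1.0 / (1.0 + hamming(query, vector))
    return max(votes, key=votes.get)


training = [
    ((0, 0, 3, 2, 0, 1, 1, 0, 1, 1, 0, 0, 3), "transit threat"),  # b1
    ((1, 1, 1, 2, 0, 1, 1, 0, 1, 1, 0, 0, 3), "transit threat"),  # b2
    ((0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0), "travel"),
    ((0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 1, 1, 0), "travel"),
]
query = (1, 0, 2, 2, 0, 1, 1, 0, 1, 0, 0, 0, 2)
label = knn_classify(query, training, k=3)
print(label)
```

Note that the sort compares the query against every training message, which is exactly the slow step that motivates the speed-ups.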
Speeding up kNN • Can finding neighbors be made fast enough to make kNN practical? • Mathematics can help. • Store text and classes “sparsely” • Use “inverted file” heuristics that group input by word, not by “document” and compute similarities using only the few words occurring in the document • Result: New methods are 10 to 100 times faster with only a 2-10% loss in “effectiveness” (according to some standard measures) • Software delivered to sponsors.
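The inverted-file idea can be sketched as follows. This is an illustration, not the delivered software: similarity is assumed to be the dot product of word counts, and the three stored messages are hypothetical. Only the stored messages that share a word with the query are ever touched.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """docs: dict doc_id -> list of words.
    Returns word -> {doc_id: count}, i.e. data grouped by word."""
    index = defaultdict(dict)
    for doc_id, words in docs.items():
        for word, count in Counter(words).items():
            index[word][doc_id] = count
    return index


def top_k_similar(index, query_words, k):
    """Score documents by dot product of word counts, visiting only the
    postings of words that occur in the query."""
    scores = defaultdict(int)
    for word, q_count in Counter(query_words).items():
        for doc_id, d_count in index.get(word, {}).items():
            scores[doc_id] += q_count * d_count
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


docs = {
    "m1": "strike madrid train march".split(),
    "m2": "bomb tokyo subway march".split(),
    "m3": "plane london january".split(),
}
index = build_inverted_index(docs)
nearest = top_k_similar(index, "strike tokyo train march".split(), k=2)
print(nearest)
```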
Streaming Data • We often have just one shot at the data as it comes “streaming by” because there is so much of it. This calls for powerful new algorithms.