Content Analytics Solutions
September 2010
Content Analytics – An Increasingly Important Solution Component
• Event Timeline Analysis: plotting specific events against a timeline.
• Social Network Analysis: showing relationships between people, organisations, phone numbers, etc.
Social Network and Event Timeline Analysis are just two examples of this – there are many more.
An Open Information Centric Architecture

SOURCE         EXTRACT                     CORRELATED
Structured     URN 12345678, Born 1970     URN 12345678, Born 1970
File System    Luke, Born 22/01/1970       URN 12345678, Born 22/01/1970
Web Articles   J S Luke, Age mid-30s       URN 12345678, Age mid-30s

FUSED: URN 12345678, Name J S Luke, Born 22/01/1970, Age 34
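The fusion step on this slide can be illustrated with a minimal Python sketch. It assumes correlation has already attached the URN to every extracted record, and it uses "longest value wins" as a stand-in for "most specific value wins"; both the rule and the function name `fuse` are assumptions for illustration, not the method used in the deck. The slide's fused "Age 34" is not reproduced, because the slide does not say how it was derived.

```python
from collections import defaultdict

# Correlated records using the slide's example values. Carrying the extracted
# names through to this stage is an assumption made for the sketch.
correlated = [
    {"urn": "12345678", "born": "1970"},                        # structured source
    {"urn": "12345678", "name": "Luke", "born": "22/01/1970"},  # file system
    {"urn": "12345678", "name": "J S Luke", "age": "mid-30s"},  # web articles
]


def fuse(records):
    """Merge records per URN, keeping the longest value seen for each attribute.

    'Longest value wins' approximates 'most specific value wins' and is an
    assumption of this sketch only.
    """
    fused = defaultdict(dict)
    for record in records:
        entity = fused[record["urn"]]
        for key, value in record.items():
            if key == "urn":
                continue
            if len(value) > len(entity.get(key, "")):
                entity[key] = value
    return dict(fused)


result = fuse(correlated)
print(result)
```

A production fusion rule would normally weigh source reliability and recency rather than string length; the sketch only shows the shape of the merge.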
An Open Information Centric Architecture
Stages: SOURCE, EXTRACT, STORE, CORRELATION & FUSION, TOOLS
Sources:
• Structured
• File System
• Web Articles
• …..
Tools:
Visualisation
• i2 GIS
• ESRI
• Tenet
Search & Discovery Engines
• OmniFind
• IBM Content Analytics
Data Fusion & Mining
• SPSS
• EAS
An IBM Content Analytics Solution in Law Enforcement

Source report:
PC 143 (Hunter) 15 June 2006 23:47
Suspect identified himself as John Setsuko. Matched description given by night club doorman (IC1, Male, Age 22-24 yrs, blue Everton shirt). Stopped whilst driving White Ford Mondeo, W563 WDL. Address given as 22 East Dene Ridge, Copdock, Ipswich. Searched at scene and found in possession of 1oz Cannabis Resin and lockable pocket knife.

Extracted fields:
Arresting_Officer         PC 143
Arrest_Date_Time          15/06/2006 : 23:47
Suspect_Forename          John
Suspect_Surname           Setsuko
Suspect_VRN               W563WDL
Suspect_Vehicle_Colour    White
Suspect_Vehicle_Make      Ford Mondeo
Suspect_Addr_Street       22 East Dene Ridge
Suspect_Addr_Town         Ipswich
Evidence_1_Description    1 oz Cannabis Resin
Evidence_2_Description    Lockable pocket knife

Now, this police department can:
• Check for errors & inconsistencies with existing databases
• Provide management with actionable information
• Have improved search capabilities
• Perform identity resolution and relationship mining
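The step from free-text arrest report to named fields can be sketched with plain regular expressions. This is an illustrative approximation only: the patterns, the function name `extract_fields`, and the abridged report text are assumptions, and the deck does not describe how IBM Content Analytics performs the extraction.

```python
import re
from datetime import datetime

# Abridged, typo-corrected text of the arrest report shown on the slide.
REPORT = (
    "PC 143 (Hunter) 15 June 2006 23:47 Suspect identified himself as "
    "John Setsuko. Stopped whilst driving White Ford Mondeo, W563 WDL. "
    "Address given as 22 East Dene Ridge, Copdock, Ipswich."
)


def extract_fields(text):
    """Pull a few of the slide's fields out of free text (hypothetical patterns)."""
    fields = {}

    officer = re.match(r"PC \d+", text)
    if officer:
        fields["Arresting_Officer"] = officer.group(0)

    when = re.search(r"\d{1,2} [A-Z][a-z]+ \d{4} \d{2}:\d{2}", text)
    if when:
        parsed = datetime.strptime(when.group(0), "%d %B %Y %H:%M")
        fields["Arrest_Date_Time"] = parsed.strftime("%d/%m/%Y : %H:%M")

    name = re.search(r"identified himself as (\w+) (\w+)", text)
    if name:
        fields["Suspect_Forename"], fields["Suspect_Surname"] = name.groups()

    vehicle = re.search(r"driving (\w+) (\w+ \w+),", text)
    if vehicle:
        fields["Suspect_Vehicle_Colour"], fields["Suspect_Vehicle_Make"] = vehicle.groups()

    # One letter, 1-3 digits, optional space, three letters (assumed plate format).
    vrn = re.search(r"\b([A-Z]\d{1,3}) ?([A-Z]{3})\b", text)
    if vrn:
        fields["Suspect_VRN"] = vrn.group(1) + vrn.group(2)

    address = re.search(r"Address given as ([^,]+), [^,]+, (\w+)\.", text)
    if address:
        fields["Suspect_Addr_Street"], fields["Suspect_Addr_Town"] = address.groups()

    return fields


fields = extract_fields(REPORT)
for key, value in fields.items():
    print(f"{key:24}{value}")
```

Patterns this rigid break as soon as officers phrase a report differently, which is one reason rule-based extraction needs a test corpus and ongoing maintenance.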
IBM Visual Search For A Government Agency
The Goal: Reduce the time analysts spend locating relevant information.
The Problem: Keyword search technologies do not allow the definition of complex searches. For example, “find every person mentioned in a document describing drug smuggling associated with another person mentioned in a document describing organised crime.”
The Solution: Deployment of a graphical search interface enabling the definition of complex patterns.
Find groups of 3 people who are linked together and are associated with the same organization
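A pattern query of this kind can be sketched in Python as a search over a small relationship graph. The names, links and organisations below are invented for illustration, and reading "linked together" as "every pair in the group is directly linked" is an assumption; the deck does not specify how the visual search engine evaluates the pattern.

```python
from itertools import combinations

# Hypothetical data: direct person-to-person links and organisation memberships.
links = {
    frozenset(pair)
    for pair in [("Ann", "Bob"), ("Bob", "Cy"), ("Ann", "Cy"), ("Cy", "Dee")]
}
member_of = {
    "Ann": {"Acme"},
    "Bob": {"Acme", "Zed"},
    "Cy": {"Acme"},
    "Dee": {"Acme"},
}


def linked_trios(links, member_of):
    """Return trios whose members are pairwise linked and share an organisation."""
    trios = []
    for trio in combinations(sorted(member_of), 3):
        all_linked = all(
            frozenset(pair) in links for pair in combinations(trio, 2)
        )
        shared = set.intersection(*(member_of[person] for person in trio))
        if all_linked and shared:
            trios.append((trio, sorted(shared)))
    return trios


trios = linked_trios(links, member_of)
print(trios)
```

Enumerating every combination is fine for a sketch but grows cubically with the number of people; a real deployment would push the pattern down to an index or graph store.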
IBM Success at a Government Agency
The automated solution saved each analyst over 6 hours per day, improving the quality and consistency of analysis.
The Goal: Identify the re-occurrence of phone numbers within historical documents.
The Problem: Using keyword search technologies had historically resulted in large numbers of false hits for credit card, visa and other reference numbers. The tedious nature of the task also resulted in oversights and errors.
The Solution:
• Deployment of an automated software solution to analyze documents and identify recurring phone numbers
• Semantic rules were used to ensure a high degree of accuracy
• All extracted phone numbers were compared against other documents with the results visualized through a carefully designed User Interface.
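The idea of combining a number pattern with semantic rules can be sketched as follows. Everything here is an assumption for illustration: the rule that a phone number has 11 digits and starts with 0, the cue words that mark a number as a card, visa or reference number, the sample documents, and the function names. The deck does not disclose the actual rules used.

```python
import re
from collections import defaultdict

# Any run of 10 or more digits, optionally broken up by spaces.
CANDIDATE = re.compile(r"\b\d[\d ]{8,}\d\b")
# Words shortly before a number that suggest it is not a phone number.
REJECT_CUES = ("card", "visa", "ref")


def phone_numbers(text):
    """Return digit strings that pass the (assumed) format and context rules."""
    found = set()
    for match in CANDIDATE.finditer(text):
        digits = match.group(0).replace(" ", "")
        context = text[max(0, match.start() - 20):match.start()].lower()
        if len(digits) != 11 or not digits.startswith("0"):
            continue  # format rule: wrong length or prefix
        if any(cue in context for cue in REJECT_CUES):
            continue  # context rule: looks like a card, visa or reference number
        found.add(digits)
    return found


def recurring(docs):
    """Map each phone number to the documents it appears in, if more than one."""
    seen = defaultdict(set)
    for doc_id, text in docs.items():
        for number in phone_numbers(text):
            seen[number].add(doc_id)
    return {n: sorted(ids) for n, ids in seen.items() if len(ids) > 1}


# Hypothetical documents for illustration.
docs = {
    "doc1": "Caller rang from 01632 960123 at noon.",
    "doc2": "Contact 01632 960123. Payment by card 4929 1234 5678 9012.",
    "doc3": "Visa ref 01234567890 was issued in March.",
}
hits = recurring(docs)
print(hits)
```

The 16-digit card number fails the format rule, and the 11-digit visa reference passes the format rule but is rejected by its context, which is the kind of false hit the slide attributes to plain keyword search.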
What Are The Inhibitors?
• What’s the business case?
• How good is the text analytics?
• How do we know how good the text analytics is?
• How do we respond to changes in the content and, of course, the business environment?
• Are we creating, rather than solving, a problem when we invest in text analytics?
New Architectural Models For Text Analytics
[Diagram labels: Real-time Analysis; Index Driven Annotation Engine; Interactive Rule Development & Manual Annotation; Node 1, Node 2, … Node n; Enterprise Services; Geospatial Analysis; Network Analysis; Semantic Search; …]
• Large scale development / training / test corpus
• Near real-time feedback on impact
• Analytics as opposed to speculation (mining instead of prospecting)
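One common way to get feedback on the impact of a rule change against a test corpus is to compare extractor output with manual annotations using precision and recall. The sketch below assumes that approach; the deck does not name a metric, and the annotations, rule-set outputs and function name are hypothetical.

```python
def precision_recall(predicted, gold):
    """Compare extracted (document, value) pairs with manually annotated ones."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical manual annotations for a three-document test corpus.
gold = {
    ("doc1", "01632960123"),
    ("doc2", "01632960123"),
    ("doc3", "01632960456"),
}
# Hypothetical output of two versions of the extraction rules.
old_rules = {("doc1", "01632960123"), ("doc2", "4929123456789012")}
new_rules = {("doc1", "01632960123"), ("doc2", "01632960123")}

old_scores = precision_recall(old_rules, gold)
new_scores = precision_recall(new_rules, gold)
print("old rules: precision %.2f, recall %.2f" % old_scores)
print("new rules: precision %.2f, recall %.2f" % new_scores)
```

Re-running a comparison like this after every rule edit gives a measured answer to whether a change helped, rather than an impression.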