ESRC Society Today
The Background Theory
Methods for Sharing Online Resources
December 2004
Cormac.Connolly@ESRC.ac.uk
Methods for Sharing Online Resources…
• How does a child recognise what a dog looks like?
• What exactly is content?
• Waving a magic wand…
• Putting theory into practice.
• Serendipity Effect
• ESRC World
Unstructured vs Structured
• Structured: 20% (Oracle, DB2, MS SQL)
• Unstructured: 80%
A Unique Combination of Technologies
• Automatic
• Data Agnostic
• Language Independent
• Fast
• Scalable
• Accurate
• Dynamic & Realtime
• Includes Voice & Video
• Fully XML compatible
• + and includes Legacy Methods
= Autonomy granted single source supplier status by US Government for its unique technology. Also used by UK Government.
Understanding + Automation Removes Manual Processes
Information Theory and Bayesian Inference: Integration Through Understanding
Content sources:
• Notes
• Database
• News Feeds
• Internet
• Email
• Files
• Document Management
• XML
• Audio/Media
Process Automation:
• Aggregation
• Automatic Categorization
• Hyperlinking
• Profiling
• Personalization
• Collaboration
• Delivery
• Retrieval
• Routing
• Alerting
Manual Processes:
• Aggregate content, tag & categorize
• Hypertext links to similar content
• Personalization from forms/questionnaires, geodemographic profiling
• Searching for information
• Answering customer inquiries via a help desk
• Reformatting for multi-channel delivery, e.g. PDF to XML
• E-mailing information to relevant recipients
Example Functions…
• Taxonomies: Fully hierarchical and relational, dynamic, individualized, trainable & editable by example or legacy methods
• Automatic Categorization: Requires very few documents for initial training. Fully dynamic views of categorized content. Manual supervision is available if required
• Clustering: Ability to take information, or people, and cluster them automatically into related groups
• Profiling: User profiles can be automatically matched and connected for collaboration, creating centers of expertise
• Personalization: Fully granular, automatic and individual personalization, configurable by users or administrator
• Alerting: Accurate, scalable, proactive alerting. Avoids the problems of keyword systems. Implicit or explicit alert subject setting
• Hyperlinking: Fully automatic hyperlink generation across data types
• Retrieval: Natural language, concept matching, full legacy Boolean, metadata and XML, distributed and federated, refine by example, combinational, cross-lingual, user feedback, results weighting, parametric
Legacy Compatibility Module
IDOL
• Putting information into context
• Conceptual Indexes
• Conceptual Profiles
• Contextual Categories
• Collaboration & Expertise Networks
• Legacy Systems
• Legacy Indexes
• Legacy Topics
• Legacy Profiling
• Legacy Collaboration Systems
Legacy Compatibility Module - LCM
Additional benefits of being able to integrate with a whole host of Document Management Systems and Legacy Retrieval and Collaboration Systems in order to leverage the existing user-document relationships that reside within the knowledge base.
Supported Repositories…
• Oracle 9i
• Oracle Database
• Lotus Notes
• Lotus Quickplace
• Documentum
• ATG Dynamo
• Intershop
• Exchange Server
• FileNet
• iManage Server
• Microsoft SQL
• Sybase
• DB2
• ODBC Databases
• Microsoft SharePoint
• OpenText LiveLink
• PCDocs
• Siebel 2000
• Vignette
Meta Data…
FULL META DATA SUPPORT
• All document Meta-Data supported
• e.g. Price, Colour, Image, Size, Author, Summary, Type, Security, Meta-tags
• Strings, Numbers, Dates, Bits supported
• Conceptual search + mixed Conceptual / Meta search
• Full Meta-data Boolean search
• Meta-data weighting
• Biasing / Filtering by Meta-Data
• Advanced Compound Sorting
• Boolean Meta-Data Categorization
• Powerful per document free field structure
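To make "Filtering by Meta-Data" and "Advanced Compound Sorting" concrete, here is a minimal sketch in plain Python. It is not Autonomy's API: the `Doc` record, the `filter_and_sort` helper and the sample documents are all hypothetical, chosen only to show a numeric filter followed by a two-key sort.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    # Hypothetical metadata fields, loosely modelled on the slide's examples.
    title: str
    author: str
    price: float
    published: date


def filter_and_sort(docs, max_price):
    """Keep documents priced at or below max_price, then sort by
    author (A to Z) and, within each author, newest first."""
    hits = [d for d in docs if d.price <= max_price]
    return sorted(hits, key=lambda d: (d.author, -d.published.toordinal()))


# Illustrative data only.
docs = [
    Doc("Labrador care", "Smith", 12.5, date(2003, 5, 1)),
    Doc("Dog training", "Jones", 9.0, date(2004, 1, 15)),
    Doc("Puppy basics", "Smith", 8.0, date(2004, 6, 30)),
    Doc("Breed atlas", "Jones", 40.0, date(2002, 3, 9)),
]

result_titles = [d.title for d in filter_and_sort(docs, max_price=20)]
print(result_titles)
```

The compound sort is expressed as a tuple key: the first element orders by author, and the negated date ordinal reverses the order within each author.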
XML Support…
FULL XML SUPPORT
• Just another file format:
• Read XML natively
• Products can output XML
• Advanced XML field mappings / operations
• ALL Autonomy operations available on XML
Retrieval Methods…
1. Legacy Methods: dog AND pet AND labrador
2. Bayesian Inference
3. Information Theory
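The legacy method in item 1 can be sketched as a Boolean AND query over an inverted index. This is a generic textbook illustration, not Autonomy's implementation; the helper names `build_inverted_index` and `boolean_and` and the three sample documents are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, query):
    """Return ids of documents that contain every term of an
    'a AND b AND c' style query."""
    terms = [t.strip().lower() for t in query.split(" AND ")]
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()


# Illustrative documents only.
docs = {
    1: "my pet dog is a golden labrador",
    2: "the labrador is a popular dog breed",
    3: "a pet cat sleeps all day",
}

index = build_inverted_index(docs)
matches = boolean_and(index, "dog AND pet AND labrador")
print(matches)
```

Document 2 is plainly about labradors, yet the query misses it because the word "pet" never appears. That is the kind of keyword limitation the conceptual methods in items 2 and 3 are meant to address.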
Statistics Generation from The Corpus
The DRE (Dynamic Reasoning Engine), using Bayesian Inference and Shannon’s Information Theory, builds “Bags” of statistics from a corpus of documents.
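The slides do not say which statistics the DRE actually collects. As a rough sketch of the general idea only, the snippet below builds a per-term "bag" using Shannon's information content, I(t) = -log2 P(t), where P(t) is taken (as an assumption) to be the fraction of documents containing term t. The function name `term_information` and the sample corpus are hypothetical.

```python
import math
from collections import Counter


def term_information(corpus):
    """Return {term: information in bits}, with I(t) = -log2(P(t)) and
    P(t) estimated as the fraction of documents containing the term."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for text in corpus:
        # Count each term at most once per document.
        doc_freq.update(set(text.lower().split()))
    return {t: -math.log2(df / n_docs) for t, df in doc_freq.items()}


# Illustrative corpus only.
corpus = [
    "the dog is a pet",
    "the labrador is a dog",
    "the cat is a pet",
    "the weather is fine",
]

info = term_information(corpus)
for term in ("the", "dog", "labrador"):
    print(term, info[term])
```

A term that occurs in every document ("the") carries no information, while a rare term ("labrador") carries the most, so rare terms say more about what a document is about.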
Natural Language Response…
“Tell me about the golden labrador...”
Serendipity effect…
“Autonomy shines when finding interesting or unanticipated matches between texts or digital assets … important and needed to drive collaboration … or knowledge sharing activities …” Forrester
“Autonomy can understand and analyse huge amounts of information … (it can) categorise the ideas that they contain and build a sophisticated idea of what it is looking at without human help … Autonomy has developed a program that reads, analyses and acts upon text, a breakthrough in artificial intelligence” Sunday Times
ESRC World…
Potential Content Resources
• SOSIG
• UK Data Archive
• Selected MIMAS targets
• IBSS
• ESRC Investment Websites
• 3rd Party Materials
• Commissioned Content
“Search” belittles Autonomy’s capability as an enabling technology for personalization, knowledge management, and collaboration - an automated “Intelligent Data Operating Layer” for unstructured content… AMR, July 2003