Individualized Knowledge Access. David Karger Lynn Andrea Stein Mark Ackerman Ralph Swick. Information Access. A key task in Oxygen: help people manage and retrieve information Three overlapping projects: Haystack: information storage and retrieval application clients
Individualized Knowledge Access David Karger Lynn Andrea Stein Mark Ackerman Ralph Swick
Information Access • A key task in Oxygen: help people manage and retrieve information • Three overlapping projects: • Haystack: • information storage and retrieval • application clients • Semantic Web: next-generation metadata • Volt: collaborative access
Presentation Overview • Motivation • Information access behavior and goals • System Design & Architecture • Data Model • Interacting data and UI components • Working applications • Base haystack • Frontpage • Volt
Problem Scenario • I try solving problems using my data: • Information gathered personally • High quality, easy for me to understand • Not limited to publicly available content • My organization: • Personal annotations and meta-data • Choose own subject arrangement • Optimize for my kind of searching • Adapts to my needs
Then Turn to a Friend • Leverage • They organize information for their own use • Let them find things for me too • Shared vocabulary • They know me and what I want • Personal expertise • They know things not in any library • Trust • Their recommendations are good
Last to Library/web • Answer usually there • But hard to find • Wish: rearrange to suit my needs • Wish: help from my friends in looking
Lessons • Individualized access • Best tools adapt to individual ways of organizing and seeking data • Individualized knowledge • People know more than they publish • That knowledge is useful to them and others • Collaborative use • Right incentives lead to sharing and joint use
Haystack • Individualized access • My data collection, organization • Search tools tuned for me • Collaborate to leverage individual knowledge • Access unpublished information in others’ haystacks • Self-interest leads to public benefit • Lens to personalize access to the world library • Rearrange presentation to suit my personal needs
Example • I want info on probabilistic models in data mining • My haystack doesn’t know, but “probability” is in lots of email I got from Tommi Jaakkola • Tommi told his haystack that “Bayesian” refers to “probability models” • Tommi has read several papers on Bayesian methods in data mining • Some are by Daphne Koller • I read/liked other work by Koller • My Haystack queries “Daphne Koller Bayes” on Yahoo • Tommi’s haystack can rank the results for me…
Gathering Data • Haystack archives anything • Web pages browsed, email sent and received, address book, documents written • And any properties, relationships • Text of object (for text search) • Author, title, color, citations, quotations, annotations, quality, last usage • Users freely add types, relationships
Semantic Web • Arbitrary objects, connected by named links • No fixed schema • User extensible • Sharable by any application • A new “file system”? [Figure: example graph with nodes Doc, HTML, Haystack, D. Karger, Outstanding and named links type, title, quality, author, says]
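The data model on this slide (arbitrary objects connected by named links, with no fixed schema) can be illustrated with a minimal sketch of an in-memory triple store. This is not Haystack's actual implementation: the class name `TripleStore`, its methods, the identifier `doc1`, and the pairing of link names with values are all assumptions made for illustration.

```python
class TripleStore:
    """Toy store of (subject, link, object) triples. All names here are hypothetical."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, link, obj):
        """Record one named link from subject to obj."""
        self._triples.add((subject, link, obj))

    def view(self, subject):
        """Return the (link, object) pairs leaving one node, sorted for stable output."""
        return sorted((l, o) for s, l, o in self._triples if s == subject)

    def query(self, subject=None, link=None, obj=None):
        """Match triples against a pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (link is None or t[1] == link)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
# Sample data: the pairing of links and values is a guess, not taken from the slide.
store.add("doc1", "type", "HTML")
store.add("doc1", "title", "Haystack")
store.add("doc1", "author", "D. Karger")
# No fixed schema: a new link name can be introduced at any time.
store.add("doc1", "recommended-by", "a friend")

print(store.view("doc1"))
print(store.query(link="author"))
```

Because every fact is a triple, any client that understands triples can read data written by another client without agreeing on a schema in advance.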
Gathering Data • Active user input • Interfaces let user add data, note relationships • Mining data from prior data • Plug-in services opportunistically extract data • Passive observation of user • Plug-ins to other interfaces record user actions • Other Users
[Figure: architecture diagram. Labels, in extraction order: Spider; Machine Learning Services; Web Viewer; Volt Viewer/Editor; Web Observer Proxy; Mail Observer Proxy; Data Extraction Services; Triple Store; Deduction; Clients; Data Sources]
Sample Applications • Because every component uses the same Semantic Web constructs, a variety of application clients can share information • Web Browser: data viewer • FrontPage: personalized information filter • Volt: collaboration tool
Haystack via Web • Web server interface • Basic operations: • Insert objects • View objects • Queries
Haystack via Web • Viewer shows one node and associated arrows • Service notices we’ve archived a directory; so archives the objects it contains (and so on…)
Haystack via Web • Services detect document type, extract relevant metadata • Output can specialize by type of object
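The service behavior described on the last two slides (a service notices an archived directory and archives its contents, while another detects object types) can be sketched as a small work-queue loop. This is an assumption-laden illustration, not the Haystack plug-in API: `archive`, `directory_service`, `type_service`, and the `children_of` mapping standing in for a file system are all hypothetical.

```python
import posixpath
from collections import deque


def archive(root, children_of, services):
    """Archive root, then let every service react to each newly archived object."""
    archived, triples = [], []
    pending, seen = deque([root]), set()
    while pending:
        obj = pending.popleft()
        if obj in seen:
            continue
        seen.add(obj)
        archived.append(obj)
        for service in services:
            new_triples, new_objects = service(obj, children_of)
            triples.extend(new_triples)
            pending.extend(new_objects)  # newly discovered objects get archived too
    return archived, triples


def directory_service(obj, children_of):
    """If obj is a directory, link it to its contents and request their archiving."""
    kids = children_of.get(obj, [])
    return [(obj, "contains", k) for k in kids], list(kids)


def type_service(obj, children_of):
    """Guess a type from the file extension (a stand-in for real type detection)."""
    if obj in children_of:
        return [(obj, "type", "directory")], []
    ext = posixpath.splitext(obj)[1].lstrip(".").lower() or "unknown"
    return [(obj, "type", ext)], []


# Hypothetical directory tree used only for this example.
tree = {"papers": ["papers/a.html", "papers/old"], "papers/old": ["papers/old/b.ps"]}
archived, triples = archive("papers", tree, [directory_service, type_service])
print(archived)
```

Each service only returns new triples and new objects to archive, so services stay independent of one another and the "and so on…" recursion falls out of the queue.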
Mediation • Haystack can be a lens for viewing data from the rest of the world • Stored content shows what user knows/likes • Selectively spider “good” sites • Filter results coming back • Compare to objects user has liked in the past • Can learn over time • Example - personalized news service
News Service • Scavenges articles from your favorite news sources • HTML parsing/extracting services • Over time, learns types of articles that interest you • Prioritizes those for display • Content provider no longer controls viewing experience • No more ads
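One simple way to "compare to objects the user has liked in the past" and prioritize articles is word-overlap similarity against a profile built from liked articles. The slides do not say which learning method Haystack uses; the sketch below swaps in plain bag-of-words cosine similarity, and the function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def words(text):
    """Lowercased word counts for one piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(candidates, liked):
    """Order candidate articles by similarity to everything the user has liked."""
    profile = Counter()
    for text in liked:
        profile.update(words(text))
    return sorted(candidates, key=lambda text: cosine(words(text), profile), reverse=True)


liked = ["bayesian models for data mining", "probability models in machine learning"]
candidates = ["celebrity gossip roundup", "new bayesian data mining models released"]
print(rank(candidates, liked))
```

Adding each newly liked article to `liked` is what lets the ranking change over time; a real system would likely weight terms and decay old preferences.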
Collaborative Access • Want to leverage others’ work in organizing information • No need to “publish” expertise • Exposed automatically---without effort • Self interest helps others
Volt • Volt is about collaboration between people • The Haystack architecture allows easy collaboration among individuals • Semantic Web references to Haystack objects • Individuals share parts of their Haystack • Group spaces and shared notebooks
Collaborators • Those I interact with • Frequent mail contact • Frequent visits to their home page • Those with shared content • And who have same opinions about content • Collaborative filtering techniques • Referrals • Expertise search engine
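The idea of finding collaborators "with shared content" who "have same opinions about content" can be sketched as a basic agreement score over items both users have rated. This is one simple collaborative-filtering heuristic, not necessarily the one Volt uses; the function names, the `min_shared` threshold, and the sample users and ratings are assumptions.

```python
def agreement(mine, theirs):
    """Fraction of co-rated items on which two users gave the same rating."""
    shared = mine.keys() & theirs.keys()
    if not shared:
        return 0.0
    return sum(1 for item in shared if mine[item] == theirs[item]) / len(shared)


def suggest_collaborators(me, others, min_shared=2):
    """Rank other users by agreement, skipping those with too little shared content."""
    scored = []
    for name, ratings in others.items():
        if len(me.keys() & ratings.keys()) >= min_shared:
            scored.append((agreement(me, ratings), name))
    # Highest agreement first; ties broken alphabetically for stable output.
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical ratings: +1 means liked, -1 means disliked.
me = {"paper1": 1, "paper2": 1, "paper3": -1}
others = {
    "tommi": {"paper1": 1, "paper2": 1, "paper3": -1},
    "alex": {"paper1": -1, "paper2": 1},
    "sam": {"paper9": 1},
}
print(suggest_collaborators(me, others))
```

The `min_shared` threshold keeps a single coincidental match from making a stranger look like a strong collaborator.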
Volt Expertise Beacons • Group spaces and shared notebooks • Create individual and group profiles • Profiles can be used to find other people • Allows targeted search • “Who else is working on this project?” • User controls visibility/privacy
Summary • Next generation information access • Semantic Web • provides a language and capabilities for meta-data • Haystack • teases out individual knowledge, • stores it in a coherent fashion, and • allows a variety of application clients to leverage individual meta-data • Volt • turns individual knowledge into a community resource
More Info • http://haystack.lcs.mit.edu/ • http://www.w3c.org/2001/sw • karger@mit.edu • las@ai.mit.edu • ackerman@lcs.mit.edu • swick@w3.org