Electronic Environments

damien

Presentation Transcript


  1. Electronic Environments
     Cost trends: hardware cost < software cost < information cost < people time
     Virtuality (transcend space)
     Timeliness (minimize time)
     Interactivity
     Multimedia
     Trends: resource sharing, collaboration, dynamic representation, the WWW
     Critical need for text and multimedia management systems!

  2. Information Seeking Perspective
     Information seeking is a human-centered process
     Analytical <------------------> Browse: a continuum of strategies and tactics
     Close coupling of queries, results, and usage
     Interactive, iterative process
     Information retrieval has focused on documents (not concepts or answers)

  3. Electronic Text Retrieval
     1. Text retrieval is more complex than data retrieval from a DBMS.
     2. Distinguish searching for word matches from searching for concept matches.
     3. Distinguish subject search from keyword search:
        Subject --> Search on a controlled vocabulary (e.g., LC subject headings). The results point to documents.
        Keyword --> Search all words in particular fields or text fragments. The results point to documents.
     4. Distinguish exact-match from partial-match retrieval.
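The exact-match versus partial-match distinction in point 4 can be illustrated with a minimal Python sketch. The function names and the scoring rule (fraction of query terms found) are assumptions chosen for illustration; the slides do not prescribe a particular measure.

```python
def exact_match(query_terms, doc_terms):
    """Return True only if every query term occurs in the document."""
    return set(query_terms) <= set(doc_terms)


def partial_match(query_terms, doc_terms):
    """Return the fraction of query terms found in the document (0.0 to 1.0).

    This scoring rule is an illustrative assumption, not taken from the slides.
    """
    query = set(query_terms)
    if not query:
        return 0.0
    return len(query & set(doc_terms)) / len(query)


# Toy document and query (hypothetical data).
doc = "text retrieval is more complex than data retrieval".split()
query = ["text", "retrieval", "ranking"]

print(exact_match(query, doc))    # exact match is all-or-nothing
print(partial_match(query, doc))  # partial match yields a degree of match
```

Exact match gives a yes/no answer, so a document missing a single query term is rejected outright; partial match lets such a document still be returned with a lower score.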

  4. Approaches to Text Retrieval
     1. Surrogate Search: Search a set of predefined words that point to related documents. Requires indexing via some controlled vocabulary.
        pros: natural transition from paper systems; computationally cheap
        cons: limited access; human indexing required
     2. Full-Text Search: Search every word in every document.
        pros: broadens access; possible to automate indexing
        cons: computationally expensive; matches words rather than concepts
     3. Knowledge-Based Search: Search a set of concepts that are related to concepts in documents.
        pros: improved retrieval
        cons: computationally expensive; theoretical at present

  5. Full-Text Search
     Search every word (or variant) in the document except stop words.
     Methods:
        Text scanning
        Indexes (inverted files)
        Vectors
        Signatures
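Of the four methods listed, signatures are not expanded on later slides, so a rough Python sketch may help. It assumes the common superimposed-coding scheme: each non-stop word sets a few bits, and a document's signature is the OR of its word signatures. The signature width, bits per word, stop-word list, and use of MD5 as the hash are all illustrative assumptions, not details from the slides.

```python
import hashlib

SIGNATURE_BITS = 64   # assumed signature width
BITS_PER_WORD = 3     # assumed number of bit positions chosen per word
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "is"}  # illustrative list


def word_signature(word):
    """Map a word to a bit pattern using bytes of its MD5 digest."""
    digest = hashlib.md5(word.lower().encode("utf-8")).digest()
    sig = 0
    for i in range(BITS_PER_WORD):
        sig |= 1 << (digest[i] % SIGNATURE_BITS)
    return sig


def document_signature(text):
    """OR together the signatures of every non-stop word in the text."""
    sig = 0
    for word in text.split():
        if word.lower() not in STOP_WORDS:
            sig |= word_signature(word)
    return sig


def may_contain(doc_sig, word):
    """True if all of the word's bits are set in the document signature.

    False positives are possible; false negatives are not.
    """
    w = word_signature(word)
    return (doc_sig & w) == w


sig = document_signature("the aardvark and the abacus")
print(may_contain(sig, "aardvark"))  # a word in the text always passes the filter
```

Because different words can set overlapping bits, a signature hit only means the document may contain the word; a text scan of the candidate documents is still needed to confirm.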

  6. Inverted File
     Words point to word number, offset, surrogate, or document:
        aardvark   *Doc 3, Doc 7, Doc 45, Doc 67, .....
        abacus     Doc 2, Doc 16, Doc 33, Doc 45, Doc 67, .....
        . . . .
        zygote     Doc 7, Doc 33, Doc 67, Doc 123, .....
     Find all documents, then apply logical operators to combine them.
     A query either matches or does not match.
     * actually Doc3,Para5,Word45
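A minimal Python sketch of an inverted file with Boolean combination follows. The four toy documents are hypothetical; they only loosely echo the postings shown on the slide, and the index here records document IDs only (not paragraph or word offsets).

```python
from collections import defaultdict


def build_inverted_file(docs):
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def boolean_and(index, *words):
    """Documents containing every listed word."""
    postings = [index.get(w, set()) for w in words]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *words):
    """Documents containing at least one listed word."""
    postings = [index.get(w, set()) for w in words]
    return set.union(*postings) if postings else set()


# Hypothetical toy collection keyed by document number.
docs = {
    3: "aardvark",
    7: "aardvark zygote",
    33: "abacus zygote",
    67: "aardvark abacus zygote",
}
index = build_inverted_file(docs)

print(sorted(boolean_and(index, "aardvark", "zygote")))  # [7, 67]
print(sorted(boolean_or(index, "aardvark", "abacus")))   # [3, 7, 33, 67]
```

Each lookup retrieves a full postings set first and the logical operator is applied afterwards, so a document is either in the result or not; there is no ranking.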

  7. Vectors
     Each document (or surrogate) is represented by a vector defined by every word in the collection.
        Doc 1    0 0 1 1 0 0 ..... 0
        Doc 2    0 0 0 0 1 1 ..... 0
        .
        Doc 7    1 0 0 1 0 0 ..... 1   (has aardvark and zygote)
        .
        Doc 33   0 1 0 0 0 0 ..... 1   (has abacus and zygote)
        .
        Doc 67   1 1 0 0 0 0 ..... 1   (has aardvark, abacus and zygote)
        .
        Doc N
     Queries are expressed as vectors and matched to document vectors.
     Degrees of matching are possible.
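To make "degrees of matching" concrete, here is a small Python sketch that ranks binary document vectors against a query vector. Cosine similarity is used as one common choice of measure; the slide does not name a specific one. The three-word vocabulary and the document contents are a reduced, hypothetical version of the slide's example.

```python
import math

# Assumed tiny vocabulary; a real collection has one dimension per word.
VOCAB = ["aardvark", "abacus", "zygote"]

# Hypothetical documents, following the annotations on the slide.
DOCS = {
    7: ["aardvark", "zygote"],
    33: ["abacus", "zygote"],
    67: ["aardvark", "abacus", "zygote"],
}


def to_vector(words):
    """Binary vector: 1 if the vocabulary term is present, else 0."""
    present = set(words)
    return [1 if term in present else 0 for term in VOCAB]


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query_words):
    """Return (score, doc_id) pairs sorted from best to worst match."""
    q = to_vector(query_words)
    scored = [(cosine(q, to_vector(words)), doc_id) for doc_id, words in DOCS.items()]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


for score, doc_id in rank(["aardvark", "zygote"]):
    print(f"Doc {doc_id}: {score:.3f}")
```

Unlike the Boolean inverted-file lookup, every document receives a score, so Doc 33 (which shares only "zygote" with the query) is still retrieved, just ranked below Doc 7 and Doc 67.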

  8. Document Alternatives
     Paragraphs, passages
     SGML codes
     Related problems:
        text summarization/auto abstracting
        auto categorization

  9. Multimedia
     Linguistic surrogates
     Images: color, texture, luminosity, shape
     Video: same as stills, but add motion
     Sound: speaker attributes, pitch, duration

  10. Retrieval Trends
     1. More full-text databases (e.g., the Web!)
     2. More statistical engines for ranking results (e.g., PLS, Inquiry, RetrievalWare, Topic)
     3. Evolution in traditional markets (e.g., Dialog's Target, West's WIN, Mead's Freestyle)
     4. WWW engines and services (Yahoo, Alta Vista, etc.)
     5. Relevance feedback added
     6. Multimedia developments
