
information retrieval


alexanderc

Presentation Transcript


  1. information retrieval mon feb 08 2016 data… & information organization

  2. SPSS Workshop in Odum… • Monday, February 29 • 2:00 – 3:30 pm • Davis Library, Room 219 (same lab room) • introduction to SPSS and how to work with data saved in SPSS format • no registration required • Anyone need an “SPSS Cheat Sheet”?

  3. framework for today’s lecture…

  4. info organization activity • in a small group, examine the cards that identify various “documents” in a collection • on the table organize the document surrogates into some sort of schema – grouping by category (like items with like) • choose your own organization scheme and hierarchy • if desired, write on the blank cards to create new or uber categories • be ready to share your organization method with the class

  5. STRUCTURED vs unstructured data • easy to envision structured data in terms of “tables”

       Employee   Manager   Salary
       Smith      Jones     68000
       Chang      Smith     65000
       Ivy        Smith     50000

     • typically allows numerical range and exact match (for text) queries, e.g., Salary < 60000 AND Manager = Smith.
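The slide's example query can be sketched in a few lines of Python. This is only an illustration: the list-of-dicts layout and the `query` helper are assumptions made for the sketch, not part of the lecture, and a real system would use a database such as the MS Access one shown on the next slides.

```python
# Hypothetical in-memory version of the employee table from the slide.
employees = [
    {"employee": "Smith", "manager": "Jones", "salary": 68000},
    {"employee": "Chang", "manager": "Smith", "salary": 65000},
    {"employee": "Ivy", "manager": "Smith", "salary": 50000},
]


def query(rows, max_salary, manager):
    """Combine a numerical range test (salary) with an exact text match (manager)."""
    return [
        row["employee"]
        for row in rows
        if row["salary"] < max_salary and row["manager"] == manager
    ]


# Salary < 60000 AND Manager = Smith
matches = query(employees, 60000, "Smith")
print(matches)
```

Because every row has the same named fields, both conditions can be checked directly, which is what makes structured data easy to query.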

  6. tables in a MS Access relational database – defines each entity in a social networking site

  7. Data entry form in a MS Access relational database – create each record

  8. structured vs UNSTRUCTURED data • typically refers to free text • email is a good example of unstructured data. it's indexed by date, time, sender, recipient, and subject, but the body of an email remains unstructured • other examples of unstructured data include books, documents, medical records, and social media posts
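The email example can be made concrete with Python's standard `email` module. The message below is invented for illustration (addresses and text are hypothetical); the point is only that the headers come out as named fields while the body is a single block of free text.

```python
from email import message_from_string

# A made-up message: structured headers, then a blank line, then free text.
raw = """From: alice@example.org
To: bob@example.org
Subject: Lunch on Friday?
Date: Mon, 08 Feb 2016 09:30:00 -0500

Hi Bob, are you free for lunch on Friday? I found a great noodle place.
"""

msg = message_from_string(raw)

# Structured part: each header is a named field that can be indexed and matched.
structured = {field: msg[field] for field in ("From", "To", "Subject", "Date")}

# Unstructured part: the body is just a string with no predefined fields.
body = msg.get_payload()

print(structured)
print(body)
```

An exact-match query such as "Subject = Lunch on Friday?" works on the headers, but finding every message that mentions noodles requires searching the free text itself.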

  9. journal article is an example of unstructured data

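  10. [Diagram: retrieval model. A document collection (corpus) and a query each pass through a representation function; the document side yields an index (categories, subject headings); a matching function compares the query representation with the index and returns results.]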
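One possible reading of this slide's components, as a minimal Python sketch. The function names (`represent`, `build_index`, `match`), the three sample documents, and the choice of a simple "all query terms must appear" rule are assumptions made for illustration; the lecture does not prescribe a particular representation or matching function.

```python
from collections import defaultdict


def represent(text):
    """Representation function: reduce text to a set of lowercase terms."""
    terms = {word.strip(".,;:!?\"'()") for word in text.lower().split()}
    return terms - {""}


def build_index(corpus):
    """Index: map each term to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for term in represent(text):
            index[term].add(doc_id)
    return index


def match(index, query):
    """Matching function: ids of documents containing every query term."""
    terms = represent(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical three-document corpus.
corpus = {
    1: "Metadata describes an information resource.",
    2: "Subject headings organize a document collection.",
    3: "Users search the collection with a query.",
}
index = build_index(corpus)
results = match(index, "document collection")
print(sorted(results))
```

The same representation function is applied to both documents and queries here, so that the two sides can be compared term by term.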

  11. KWIC: key word in context
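A key-word-in-context listing shows each occurrence of a term together with the words around it. The sketch below is a simplified illustration: the `kwic` function, the bracket notation, and the three-word window are assumptions, and the sample text borrows phrasing from later in this lecture.

```python
def kwic(text, keyword, width=3):
    """Return each occurrence of keyword with up to `width` words on each side."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        # Ignore trailing punctuation and case when comparing.
        if word.strip(".,;:!?").lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


text = "We organize to enable retrieval. Better retrieval rewards the effort spent organizing."
lines = kwic(text, "retrieval")
for line in lines:
    print(line)
```

Seeing the surrounding words lets a reader judge whether a hit is relevant without opening the full document.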

  12. metadata

  13. What is Metadata? • Classic definition: data about data • Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. (NISO) • 3 primary “types”: • Descriptive • Structural • Administrative (rights management, preservation)
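As a rough illustration of the three types, the record below groups metadata for one imaginary article. Every field name and value is hypothetical and chosen only for this sketch; real schemas define their own elements.

```python
# Hypothetical metadata record for a single document, grouped by type.
record = {
    "descriptive": {  # helps users find and identify the resource
        "title": "Organizing Personal Photo Collections",
        "creator": "A. Author",
        "subjects": ["information organization", "photographs"],
    },
    "structural": {  # describes how the resource is put together
        "format": "PDF",
        "pages": 12,
        "sections": ["Introduction", "Method", "Results"],
    },
    "administrative": {  # supports rights management and preservation
        "rights": "All rights reserved",
        "last_integrity_check": "2016-02-08",
    },
}


def find_by_subject(records, subject):
    """Return titles of records whose descriptive metadata lists the subject."""
    return [
        r["descriptive"]["title"]
        for r in records
        if subject in r["descriptive"]["subjects"]
    ]


titles = find_by_subject([record], "photographs")
print(titles)
```

Retrieval here relies only on the descriptive group, while the other two groups serve use and management of the resource.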

  14. digital forensics

  15. How do we organize a collection of “documents” so that users can find what they need?

  16. from Glushko reading… • what three types/forms of categorization does Glushko discuss in the Categorization in the Wild piece? • give a real-world example of a categorization system and briefly describe the purpose behind it (i.e. what problem is it trying to address?)

  17. from Glushko reading… • Cultural categorization • Embodied in culture and language • Acquired implicitly through development via parent-child interactions, language, and experience • Formal education can build on this, but non-formal cultural system can often dominate • Traditional perspective for thinking and research about categorization

  18. From Glushko reading… • Individual categorization • A system developed by an individual for organizing a personal domain to aid memory, retrieval, or usage • Can serve social goals to convey information, develop a community, manage reputation • Have exploded with the advent of social computing, especially in applications based on “tagging” • An individual’s system of tags in web applications is sometimes called a “folksonomy”

  19. From Glushko reading… • Institutional categorization • Systems created to serve institutional goals and facilitate sharing of information and increase interoperability • Helps to streamline interactions and transactions so that consistency, fairness and higher yields can result.

  20. Let’s look at a database of magazine & journal articles…to see how information is organized – with particular attention to value-added SUBJECT TERMS/HEADINGS (categorization) …Academic Search Premier >> UNC Libraries Homepage: http://www.lib.unc.edu/ >> E-Research by Discipline >> Frequently Used >> Academic Search Premier [off-campus log in with onyen/password] Handout Activity #2

  21. info organization & search • We organize to enable retrieval • The more effort put into organizing information, the more effectively it can be retrieved • The more effort we put into retrieving information, the less it needs to be organized first • We need to think in terms of investment, allocation of costs and benefits between the organizer and retriever • The allocation differs according to the relationship between them; who does the work and who gets the benefit?

  22. final notes… • Homework #2: Database report • sign up for a database – or talk with me about a suggestion • next Wednesday – 5-min reports in class • Wednesday: “Information Retrieval” intro with Dr. Jaime Arguello (required reading prep) • Wednesday: Data to Story Project – speed date/pitch
