Lecture 02: Information

Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS. Tuesday and Thursday, 10:30 am - 12:00 pm, Fall 2003. http://www.sims.berkeley.edu/academics/courses/is202/f03/ IS 202: Information Organization and Retrieval.

Presentation Transcript


  1. Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2003 http://www.sims.berkeley.edu/academics/courses/is202/f03/ Lecture 02: Information IS 202: Information Organization and Retrieval

  2. Lecture Outline • What Is Information? • History of Information Search and Organization • Discussion Questions • Action Items for Next Time

  3. Lecture Outline • What Is Information? • History of Information Search and Organization • Discussion Questions • Action Items for Next Time

  4. What is Information? • There is no “correct” definition • Can involve philosophy, psychology, signal processing, physics • Cookie Monster’s definition: • “news or facts about something”

  5. What is Information? • Oxford English Dictionary • Information • Informing, telling; thing told, knowledge, items of knowledge, news • Knowledge • Knowing, familiarity gained by experience; person’s range of information; a theoretical or practical understanding of; the sum of what is known

  6. Assignment 1 - Discussion • What is information, according to your background or area of expertise?

  7. What Is Information? • Relating data to a context (“situational interpretation”) • Anything that is important to anyone (“significance”) • World, data, information, knowledge • Requires community of interpretation • All information is dependent on context • Capable of being recorded and stored and transmitted (also in physical form – e.g., fossils) • Information must be recorded • Information is a record of something that can be reused • Information is a commodity

  8. What Is Information? • Negentropy • Potential energy to become knowledge • Potential for it to be built upon • Does information have to be related to “true” data? • Can information be downgraded to data if it is forgotten?

  9. Types of Information • Differentiation by form • Differentiation by content • Differentiation by quality • Differentiation by associated information

  10. Information Properties • Information can be communicated electronically • Broadcasting • Networking • Information can be easily duplicated and shared • Problems of ownership • Problems of control (Adapted from ‘Silicon Dreams’ by Robert W. Lucky)

  11. Intuitive Notion (Losee 97) • Information must • Be something, although the exact nature (substance, energy, or abstract concept) is not clear • Be “new”: repetition of previously received messages is not informative • Be “true”: false or counterfactual information is “mis-information” • Be “about” something • This human-centered approach emphasizes meaning and use of message

  12. Information from the Human Perspective • Levels in cognitive processing • Perception • Observation/attention • Reasoning, assimilating, forming inferences • Knowledge • “Justified true belief” • Belief • An idea held based on some support; an internally accepted statement, result of inductive processes combining observed facts with a reasoning process

  13. Information from the Human Perspective • Does information require a human mind? • Communication and information transfer among ants • A tree falls in the forest … is there information there? • Existence of quarks

  14. Meaning vs. Form • Form of information as the information itself • Meaning of a signal vs. the signal itself • What aspects of a document are information? • Representation (Norman 93) • Why do we write things down? • Socrates thought writing would obliterate serious thought • Sounds and gestures fade away • Artifacts help us to reason • Anything not present in the representation can be ignored • Things left out of the representation are often what we don’t know how to represent

  15. Information • Consider Borges’ infinite Library of Babel… • It has all possible data combinations of letters • Does it therefore contain all possible information? • What about all possible knowledge? • What about wisdom? • Is the Internet a prototype Library of Babel?

  16. Information Theory • Claude Shannon, 1940s, studying communication • Ways to measure information • Communication: producing the same message at its destination as that seen at its source • Problem: a “noisy channel” can distort the message • Between transmitter and receiver, the message must be encoded • Semantic aspects are irrelevant • [Diagram: Message Source, Transmitter, Channel (with Noise), Receiver, Destination]
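One way to make "ways to measure information" concrete is Shannon's entropy, the average number of bits per symbol, H = Σ p · log2(1/p). The sketch below is illustrative only and is not taken from the lecture: the function name `shannon_entropy` is hypothetical, and it assumes symbol probabilities can be estimated from their frequencies in a single message. Note that it looks only at symbol statistics, never at meaning, which matches the slide's point that semantic aspects are irrelevant.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Average information per symbol, in bits, estimated from symbol frequencies."""
    counts = Counter(message)
    total = len(message)
    # Sum p * log2(1/p) over each distinct symbol; an empty message yields 0.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A message with one repeated symbol carries no surprise; a message that
# uses more symbols evenly carries more bits per symbol.
for text in ("aaaa", "abab", "abcd"):
    print(text, shannon_entropy(text))
```

The measure rises as the next symbol becomes harder to predict, which is the sense in which information reduces uncertainty.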

  17. Information Theory • [Diagram: Message, Source, Encoding, Channel (with Noise), Decoding, Destination, Message] • [Diagram: Message, Source, Encoding (Writing/Indexing), Storage, Decoding (Retrieval/Reading), Destination, Message] • Better called “Technical Communication Theory” • Communication may be over time and space
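The source, encoding, channel, noise, decoding, destination pipeline can be sketched in a few lines. This is a toy illustration under stated assumptions, not Shannon's own construction: the channel is assumed to flip each bit independently, the encoding is a simple three-fold repetition code chosen for brevity, and the names `encode`, `noisy_channel`, and `decode` are hypothetical.

```python
import random


def encode(bits, repeat=3):
    """Transmitter: repeat every bit so the receiver can outvote noise."""
    return [b for b in bits for _ in range(repeat)]


def noisy_channel(bits, flip_prob, rng):
    """Channel: flip each bit independently with probability flip_prob."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


def decode(bits, repeat=3):
    """Receiver: majority vote over each group of repeated bits."""
    groups = [bits[i:i + repeat] for i in range(0, len(bits), repeat)]
    return [1 if sum(g) * 2 > len(g) else 0 for g in groups]


message = [1, 0, 1, 1, 0, 0, 1, 0]
rng = random.Random(202)  # fixed seed so a run is repeatable
received = decode(noisy_channel(encode(message), 0.1, rng))
errors = sum(sent != got for sent, got in zip(message, received))
print(f"{errors} of {len(message)} bits wrong at the destination")
```

Redundancy added at the encoding step is what lets the destination recover the message despite noise, at the cost of sending three times as many bits.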

  18. Human Communication Theory? • [Diagram: Message, Source, Encoding, Channel (with Noise), Decoding, Destination, Message]

  19. Communication Theory • Encompasses a vast array of disciplines • Mass communications, literary and media theory, rhetoric, sociology, psychology, linguistics, law, cognitive science, information science, engineering, etc. • Questions • What and how we communicate • Why we communicate • What happens when communication “works” and when it doesn’t • How to improve communication

  20. Why Study Communication Theory? • Our understanding of what, how, and why we communicate informs our • Theory of information and practice of information production • Analysis, design, and evaluation of information systems and applications • How we work together in teams • How we read texts and talk with one another in this course • Law and public policy

  21. Etymology of “Communication” • Communication - c.1384, from O.Fr. communicacion, from L. communicationem (nom. communicatio), from communicare "to impart, share," lit. "to make common," from communis (see common). • Common - 13c., from O.Fr. comun, from L. communis "shared by all or many," from L. com- "together" + munia "public duties," those related to munia "office." Alternate etymology is that Fr. got it from P.Gmc. *gamainiz (cf. O.E. gemæne), from PIE *kom-moini "shared by all," from base *moi-, *mei- "change, exchange." • Remuneration - c.1400, from L. remunerationem, from remunerari "to reward," from re- "back" + munerari "to give," from munus (gen. muneris) "gift, office, duty." Remunerative is from 1677.

  22. What and How Do We Communicate? • What “gifts” do we give each other? • What do we do with these gifts? • How does this gift exchange bring us together (or not)?

  23. The Conduit Metaphor • Language functions like a conduit, transferring thoughts bodily from one person to another • In writing and speaking, people insert their thoughts or feelings in the words • Words accomplish the transfer by containing the thoughts or feelings and conveying them to others • In listening or reading, people extract the thoughts and feelings once again from the words

  24. Conduit Metaphor: Minor Frameworks • Thoughts and feelings are ejected by speaking or writing into an external “idea space” • Thoughts and feelings are reified in this external space, so they exist independent of any need for living beings to think or feel them • These reified thoughts and feelings may, or may not, find their way back into the heads of living humans

  25. Toolmakers’ Paradigm

  26. Semantic Pathology • Semantic Pathology • “Whenever two or more incompatible senses capable of figuring meaningfully in the same context develop around the same name” • Example • “This text is confusing.” • Text(1) = The layout/font of the text is confusing. • Text(2) = The argument of the text is confusing. • Question: Where is Text(2)?

  27. Lecture Outline • What Is Information? • History of Information Search and Organization • Discussion Questions • Action Items for Next Time

  28. Origins: Physical Representations • Very early history of content representation • Sumerian tokens and “envelopes” • Alexandria - pinakes • Indices

  29. Origins: Mental Representations • Rhetorical mnemonic theory and practice (“memoria”) • Memory palaces • An organization and retrieval technology for concepts that combines physical and virtual places (“loci”) • Examples • Simonides of Ceos • Cicero’s “testes”

  30. Origins: Bibliographic Representations • Biblical indexes and concordances • Hugo de St. Caro – 1247 A.D. : 500 monks – KWOC • Book indexes (Nuremberg Chronicle) • Library catalogs • Journal indexes • “Information explosion” following WWII • Bush and Memex • Cranfield studies of indexing languages and information retrieval • Development of bibliographic databases • Index Medicus – production and Medlars searching
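A KWOC (keyword out of context) index lists each significant word as a heading, followed by the full titles that contain it. The sketch below is a minimal illustration rather than a description of any historical concordance: the function name `kwoc_index` and the short stop list are assumptions, and the sample titles are simply ones that appear elsewhere in this lecture.

```python
import re
from collections import defaultdict

STOPWORDS = {"a", "an", "and", "of", "the"}  # assumed, deliberately minimal


def kwoc_index(titles):
    """Map each non-stopword keyword to the full titles that contain it."""
    index = defaultdict(list)
    for title in titles:
        for word in re.findall(r"[a-z]+", title.lower()):
            if word not in STOPWORDS and title not in index[word]:
                index[word].append(title)
    return dict(sorted(index.items()))  # headings in alphabetical order


titles = [
    "Information Organization and Retrieval",
    "Women, Fire, and Dangerous Things",
    "Silicon Dreams",
]
for keyword, entries in kwoc_index(titles).items():
    print(f"{keyword.upper()}: {'; '.join(entries)}")
```

What took 500 monks by hand is here a single pass over the text, though the choice of which words count as keywords remains a human decision.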

  31. How Much Information Today? • See report by Hal Varian and Peter Lyman http://www.sims.berkeley.edu/research/projects/how-much-info/ • Total annual information production including print, film, magnetic media, etc. • Upper Bound 2,120,539 Terabytes (10^12 bytes) • Lower Bound 635,480 Terabytes • I.e., between 1 and 2 Exabytes per year (10^18 bytes) • How do we organize THIS?
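The unit conversion behind these bounds is simple arithmetic. The sketch below uses only the two terabyte figures quoted on the slide and the decimal definitions of terabyte and exabyte given there; the variable names are illustrative.

```python
TERABYTE = 10**12  # bytes, as defined on the slide
EXABYTE = 10**18   # bytes, as defined on the slide

lower_tb = 635_480      # lower bound from the slide, in terabytes
upper_tb = 2_120_539    # upper bound from the slide, in terabytes

lower_eb = lower_tb * TERABYTE / EXABYTE
upper_eb = upper_tb * TERABYTE / EXABYTE
print(f"{lower_eb:.2f} to {upper_eb:.2f} exabytes per year")
```

The exact range works out to roughly 0.64 to 2.12 exabytes, so the slide's "between 1 and 2 Exabytes" reads as a rounded summary of these bounds.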

  32. Lecture Outline • What Is Information? • History of Information Search and Organization • Discussion Questions • Action Items for Next Time

  33. Discussion Questions (Borges) • Yuri Takhteyev on Borges • How does Borges' view of information compare to Shannon's (information as reducing uncertainty)? • Why does Borges arrange the books randomly? What difference would it make in the story? (This question is also raised by Dennett in the “Library of Mendel,” so we may want to leave it till that discussion) • What leads the Librarians to postulate the existence of the Man of the Book? Does that logic make sense?

  34. Discussion Questions (Borges) • Yuri Takhteyev on Borges • What is the significance of the sentence: “I cannot combine some characters - htcmrlchtdj - which the divine library has not foreseen and which in one of its secret tongues do not contain a terrible meaning”? • What is the significance of the Librarian's conclusion that the “Library is unlimited and cyclical”?

  35. Discussion Questions (Dennett) • Joshua Solomin on Dennett • It is mentioned that books over 500 pages in length can be represented in the Library by having them span multiple Library volumes; and that by doing this, some Library volumes will be reused. But Dennett (from Quine) reduces this case to the case where the entire Library can be represented by a 1 and a 0, simply reused in different combinations. I would argue that this reductive case is no longer useful, because you then have to store the formulae for reproducing each book from your 1 and 0, which would be just as bad as storing the volumes themselves. So, does this strategy of reducing the content of a volume and re-using volumes help with the volume of information at all? If so, at what point between the 500-page volume and the 1-character volume will the strategy break down? Or would it be argued that it doesn't break down, but rather the strategy is still useful when condensed to a 1 or 0?

  36. Discussion Questions (Dennett) • Joshua Solomin on Dennett • Dennett mentions “even finding one readable volume in this huge storehouse is unlikely in the extreme.” If no parse-able information can be gleaned from a given volume (or piece of data), is it still useful? Can it be said that some piece of data is absolutely useless, or is it more that we simply haven't yet developed an encoding system that corresponds to it (that would allow us to decode meaning from it)? Or perhaps some third option? What could be a possible strategy for declaring some volumes “useless,” in order to reduce the scope of the Library to something easier to deal with?

  37. Discussion Questions (Dennett) • Joshua Solomin on Dennett • It is observed that while Borges did not order his Library, attempting to do so would have its own problems associated with it. Dennett's solution is a kind of alphabetizing, organized in multiple dimensions. Is there some better way to perform this sorting? Assuming that we didn't want to have 1,000,000 dimensions to our file cabinet (the number of characters per volume), could we perform some kind of intelligent grouping of volumes? What kind of metadata could be developed from this sorted Library to facilitate searching -- e.g., a section devoted to books about whales, with subsections on books involving sea captains as well as books involving wooden boys who become human? Would this save us anything over Dennett's alphabetizing?
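The scale behind these questions can be estimated without ever writing the number out. The sketch below takes the 1,000,000 characters per volume from the question above; the alphabet size of 100 is an assumption made here for illustration, and the function name `digits_in_library_size` is hypothetical. It counts the decimal digits in alphabet_size ** chars_per_volume using logarithms, and also reports how many bits one character needs if every volume is rewritten using only a 1 and a 0.

```python
import math


def digits_in_library_size(chars_per_volume: int, alphabet_size: int) -> int:
    """Decimal digits in alphabet_size ** chars_per_volume, via logarithms."""
    return math.floor(chars_per_volume * math.log10(alphabet_size)) + 1


CHARS_PER_VOLUME = 1_000_000  # from the discussion question
ALPHABET_SIZE = 100           # assumed for illustration

digits = digits_in_library_size(CHARS_PER_VOLUME, ALPHABET_SIZE)
bits_per_char = math.ceil(math.log2(ALPHABET_SIZE))
print(f"The number of distinct volumes has {digits:,} decimal digits")
print(f"Each character needs {bits_per_char} bits in a two-symbol encoding")
```

Reducing the alphabet to two symbols does not shrink the Library; it only makes each volume longer, which is one way to frame the question of where the reuse strategy stops helping.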

  38. Discussion Questions (Reddy) • Katherine Ahern and Brooke Maury on Reddy • Is there any model of communication other than the conduit metaphor and the toolmaker's paradigm? Do these two visions leave any aspects of communication out? • If information is not actually stored in the 'signal', then is the only value in this transmitted matter how one interprets it?

  39. Discussion Questions (Reddy) • Katherine Ahern and Brooke Maury on Reddy • What is the value of information (ideas, data, facts, etc.) without someone to receive, decode and interpret that information? • Reddy seems to put the responsibility on the user or consumer of information in terms of correct interpretation. However, are there tools that can be 'packaged' with the information, that can assist in this unpacking? • How does one develop a common context from which we can establish the rules or semantics of information exchange?

  40. Discussion Questions (Reddy) • Katherine Ahern and Brooke Maury on Reddy • Reddy suggests that the increase in signals (i.e., libraries, recordings, and mass communication) has resulted in less culture, because the skill of reconstructing or “extracting” ideas is neglected. What are the implications for information organization and retrieval? Is it our job to somehow facilitate this reconstruction? Does Reddy's analysis even allow the possibility of facilitating extraction of ideas? If so, how does one encode information in such a way as to minimize the confusion and lack of clarity around its meaning during transmission and upon reception?

  41. Discussion Questions (Reddy) • Katherine Ahern and Brooke Maury on Reddy • Is Reddy's analogy of the evil magician representing language appropriate? Are subscribers to the conduit metaphor doomed to think others hostile or insane? Perhaps the 'evil magician' is our own laziness or failure to do the work of communication.

  42. Lecture Outline • What Is Information? • History of Information Search and Organization • Discussion Questions • Action Items for Next Time

  43. Homework (!) • Read Introduction and Chapters 1 – 2 of George Lakoff’s Women, Fire, and Dangerous Things • Create your SIMS home page

  44. Next Time • Human Categorization

  45. Sign Up for Office Hours • Prof. Marc Davis • Thursdays 2:00 pm – 4:00 pm • 314 South Hall
