
Information Science 2005

Information Science 2005. Tefko Saracevic, PhD, School of Communication, Information and Library Studies, Rutgers University, New Brunswick, New Jersey, USA (http://www.scils.rutgers.edu/~tefko). Information science: a short definition.


Presentation Transcript


  1. Information Science 2005 Tefko Saracevic, PhD School of Communication, Information and Library Studies Rutgers University New Brunswick, New Jersey USA http://www.scils.rutgers.edu/~tefko © Tefko Saracevic

  2. Information science: a short definition • “the science dealing with the efficient collection, storage, and retrieval of information” (Webster)

  3. Organization of presentation • Big picture – problems, solutions, social place • Structure – main areas in research & practice • Technology – information retrieval – largest part • Information – representation; bibliometrics • People – users, use, seeking, context • Digital libraries – whose are they anyhow? • Paradigm shift – distancing of areas • Conclusions – big questions for the future

  4. Scope • Evolution and state of the field in the last decade of the old and first decade of the new century

  5. The big picture: Problems addressed • Bit of history: Vannevar Bush (1945) • Defined problem as “... the massive task of making more accessible our bewildering store of knowledge.” • Problem still with us & growing

  6. … solution • Bush suggested a machine: “Memex ... association of ideas ... duplicate mental processes artificially.” • Technological fix to problem • Still with us: technological determinant

  7. At the base of information science: Problem • Trying to control content in • Information explosion • exponential growth of information artifacts, if not of information itself • PLUS today • Communication explosion • exponential growth of means and ways by which information is communicated, transmitted, accessed, used

  8. Technological solution, BUT … • Applying technology to solving problems of effective use of information • BUT: from a HUMAN & SOCIAL and not only TECHNOLOGICAL perspective

  9. … or a symbolic model • [Diagram with three labeled elements: People, Information, Technology]

  10. Problems & solutions: SOCIAL CONTEXT • Professional practice AND scientific inquiry related to: Effective communication of knowledge records - ‘literature’ - among humans in the context of social, organizational, & individual need for and use of information • Taking advantage of modern information technology

  11. … or as White & McCain put it: “modeling the world of publications with a practical goal of being able to deliver their content to inquirers [users] on demand.”

  12. Elaboration • Knowledge records = texts, sounds, images, multimedia, web ... ‘literature’ in given domains • content-bearing structures – central to information science • Communication = human-computer-literature interface • study of information science is the interface between people & literatures • Information need, seeking, and use = raison d'être • Effectiveness = relevance, utility

  13. General characteristics • Interdisciplinarity - relations with a number of fields, some more or less predominant • Technological imperative - driving force, as in many modern fields • Information society - social context and role in evolution - shared with many fields

  14. Structure: Composition of the field • Like many fields, information science has different areas of concentration & specialization • They change, evolve over time • grow closer, grow apart • ignore each other, less or more

  15. … most importantly, different areas • receive more or less funding & emphasis • producing great imbalances in work & progress • attracting different audiences & fields • this includes • vastly different levels of support for research and • huge commercial investments & applications

  16. How to view structure? • by decomposing areas & efforts in research & practice emphasizing Technology or Information or People

  17. Part 3. Technology • Identified with information retrieval (IR) • by far the biggest effort and investment • international & global • commercial interest large & growing

  18. Information Retrieval – definition & objective • “IR: ... intellectual aspects of description of information, ... search, ... & systems, machines...” (Calvin Mooers, 1951) • How to provide users with relevant information effectively? For that objective: 1. How to organize information intellectually? 2. How to specify the search & interaction intellectually? 3. What techniques & systems to use effectively?

  19. Streams in IR Research & Development 1. Information science: • Services, users, use • Human-computer interaction • Cognitive aspects 2. Computer science: • Algorithms, techniques • Systems aspects 3. Information industry: • Products, services, Web • Market aspects • Problem: • relative isolation – discussed later

  20. Contemporary IR research • Now mostly done within computer science • e.g. Special Interest Group on IR, Association for Computing Machinery (ACM SIGIR) • Spread globally • e.g. major IR research communities emerged in China, Korea, Singapore • Branched outside of information science - “everybody does information retrieval” • data mining, machine learning, natural language processing, artificial intelligence, computer graphics …

  21. Text REtrieval Conference (TREC) • Started in 1992, now probably ending • “support research within the IR community by providing the infrastructure necessary for large-scale evaluation” • Methods • provides large test beds, queries, relevance judgments, comparative analyses • essentially using the 1960s Cranfield methodology • organized around tracks • various topics – changing over years
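
The Cranfield-style evaluation mentioned in slide 21 compares a system's ranked output for a query against human relevance judgments. A minimal sketch of that comparison, assuming simple set-based precision and recall at a cutoff; the function name, document IDs, and judgments are hypothetical illustrations, not TREC data or TREC's own tooling:

```python
def precision_recall(ranked_doc_ids, relevant_doc_ids, cutoff):
    """Return (precision, recall) for the top `cutoff` results of one query."""
    relevant = set(relevant_doc_ids)
    retrieved = ranked_doc_ids[:cutoff]
    # A "hit" is a retrieved document that the assessors judged relevant.
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical system run and relevance judgments for a single query.
run = ["d3", "d7", "d1", "d9", "d4"]
judged_relevant = {"d1", "d3", "d8"}

precision, recall = precision_recall(run, judged_relevant, cutoff=5)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Averaging such figures over many queries and many systems is what makes the comparative analyses possible; real evaluations typically use richer measures than this two-number summary.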

  22. TREC impact • International – big impact on creating research communities • Annual conferences • report, exchange results, foster cooperation • Results • mostly in reports, available at http://trec.nist.gov/ • overviews provided as well • but only a fraction published in journals or books

  23. TREC tracks 2004 (103 groups from 21 countries) • Genomics with 4 sub tracks • HARD (High Accuracy Retrieval from Documents) • Novelty (new, nonredundant information) • Question answering • Robust (improving poorly performing topics) • Terabyte (very large collections) • Web track • Previous tracks: ad-hoc (1992-1999), routing (1992-1997), interactive (1994-2002), filtering (1995-2002), cross language (1997-2002), speech (1997-2000), Spanish (1994-1996), video (2000-2001), Chinese (1996-1997), query (1998-2000), and a few more run for two years only

  24. Broadening of IR – ever changing, ever new areas added • Cross language IR (CLIR) • Natural language processing (NLP IR) • Music IR (MIR) • Image, video, multimedia retrieval • Spoken language retrieval • IR for bioinformatics and genomics • Summarization; text extraction • Question answering • Many human-computer interactions • XML IR • Web IR; Web search engines • DB and IR integration – structured and unstructured data

  25. Commercial IR • Search engines based on IR • But added many elaborations & significant innovations • dealing with HUGE numbers of pages fast • countering spamming & page rank games – adversarial IR • never-ending combat of algorithms • Spread & impact worldwide • about 2000 engines in over 160 countries • English was dominant, but not any more

  26. Commercial IR: brave new world • Large investments & economic sector • hope for big profits, as yet questionable • Leading to proprietary, secret IR • also aggressive hiring of best talent • new commercial research centers in different countries (e.g. MS in China) • Academic research funding is changing • brain drain from academe

  27. IR successfully effected: • Emergence & growth of the INFORMATION INDUSTRY • Evolution of IS as a PROFESSION & SCIENCE • Many APPLICATIONS in many fields • including on the Web – search engines • Improvements in HUMAN - COMPUTER INTERACTION • Evolution of INTERDISCIPLINARITY • IR has a long, proud history

  28. Part 4. Information • Several areas of investigation: • as basic phenomenon – not much progress • measures such as Shannon's not successful • concentrated on manifestations and effects • information representation • large area connected with IR, librarianship • metadata • bibliometrics • structures of literature • Covered in separate lectures
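
One way to see why a measure such as Shannon's falls short as a measure of information in the sense used here: it quantifies statistical uncertainty over symbols and is blind to meaning. A minimal sketch, assuming the standard entropy formula H = -Σ p·log2(p) over symbol frequencies; the function name and sample strings are illustrative only:

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits per symbol, of a sequence of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # Sum p * log2(p) over the observed symbol frequencies, then negate.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Two strings with the same symbol frequencies get the same entropy,
# whatever they might mean to a reader.
entropy_first = shannon_entropy("aabb")
entropy_second = shannon_entropy("abab")
print(entropy_first, entropy_second)
```

Because only symbol frequencies enter the calculation, any reordering of a text scores the same, which is why the measure says nothing about relevance or utility to a user.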

  29. Part 5. People • Professional services • in organizations – moving toward knowledge management, competitive intelligence • in industry – vendors, aggregators, Internet • Research • user & use studies • interaction studies • broadening to information seeking studies, social context, collaboration • relevance studies • social informatics

  30. User & use studies • Oldest area • covers many topics, methods, orientations • many studies related to IR • e.g. searching, multitasking, browsing, navigation • Branching into Web use studies • quantitative & qualitative studies • emergence of webmetrics

  31. Interaction • Traditional IR model concentrates on matching, not on user side & interaction • Several interaction models suggested • Ingwersen’s cognitive, Belkin’s episode, Saracevic’s stratified model • hard to get experiments & confirmation • Considered key to providing • basis for better design • understanding of use of systems • Web interactions a major new area
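
To make concrete what "concentrates on matching" means in slide 31, here is a deliberately simplified sketch that ranks documents purely by how many query terms they contain. It is an assumption-laden toy, not any specific system's algorithm; the function name and sample documents are hypothetical. Note that nothing about the user, the task, or the interaction enters the computation:

```python
def rank_by_term_overlap(query, documents):
    """Rank document IDs by the number of distinct query terms they contain."""
    query_terms = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        doc_terms = set(text.lower().split())
        score = len(query_terms & doc_terms)
        if score > 0:  # documents sharing no term with the query are dropped
            scored.append((score, doc_id))
    # Highest score first; ties broken alphabetically by document ID.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical three-document collection.
docs = {
    "d1": "digital libraries and preservation",
    "d2": "information retrieval evaluation",
    "d3": "digital information retrieval",
}
ranking = rank_by_term_overlap("information retrieval evaluation", docs)
print(ranking)
```

The interaction models listed in the slide start from the observation that the same query string can come from users with very different needs, which a pure matching function like this cannot distinguish.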

  32. Information seeking • Concentrates on broader context, not only IR or interaction: people as they move in life & work • Based on concept of social construction of information • Most active area, particularly in Europe, with annual conferences

  33. Information seeking: Sampling of theories, models • Why people seek information: • Taylor’s stages of information need • Dervin’s Sense-Making – gap, bridge • Belkin’s Anomalous State of Knowledge • Chatman’s life in the round – inf. poverty • How people seek information: • Wilson’s General Model of inf. seeking • Bates’ berrypicking – acts in searching • Kuhlthau’s information search process • Chang’s browsing model • Benoit’s communicative action - Habermas

  34. Part 7. Paradigm split in technology - people • Split from early 1980s to date into two orientations • System-centered • algorithms, TREC • continue traditional IR model • Human-(user)-centered • cognitive, situational, user studies • interaction models, some started in TREC • These became almost separate universes – one based in computer science, the other in information science & librarianship

  35. Critiques, cultures • Number of critiques (e.g. Dervin & Nilan) about isolated systems approach • calls for user-centered approaches, designs & evaluation • But user-centered studies did not deliver very useful design pointers, guides • Very different cultures: • computer science has own, more science & technology oriented • information science more humanities oriented • C.P. Snow’s two cultures

  36. Human vs. system • Human (user) side: • often highly critical, even one-sided • mantra of implications for design • but does not deliver concretely • System side: • mostly ignores user side & studies • ‘tell us what to do & we will’ • Issue NOT H or S approach • even less H vs. S • but how can H AND S work together • major challenge for the future

  37. Reconciliation? • Several efforts to provide human-centered design • but more discussion than real application • Integration of information seeking and information retrieval in context (Ingwersen & Järvelin) • Research & development toward • using search context, improving user search experiences & search quality • machine learning, incorporating semantics

  38. Funding • Most funding goes toward systems side & computer science • most (very large %) support for system work • In the digital age support is for digital • True globally

  39. Digital libraries: LARGE & growing area • “Hot” area in R&D • a number of large grants & projects in the US, European Union, & other countries up to now • will it continue? It is not growing • but “DIGITAL” big & “libraries” small • “Hot” area in practice • building digital collections, hybrid libraries • many projects throughout the world • growing at a high rate

  40. Technical problems • Substantial - larger & more complex than anticipated: • representing, storing & retrieving of library objects • particularly if originally designed to be printed & then digitized • operationally managing large collections - issues of scale • dealing with diverse & distributed collections • interoperability • assuring preservation & persistence • incorporating rights management

  41. Digital Library Initiatives in the US (DLI) • Research consortia under National Science Foundation • DLI 1: 1994-98, 3 agencies, $24M, six large projects • DLI 2: 1999-2006, 8 agencies, $60+M, 77 large & small projects in various categories • ‘digital library’ not defined to cover many topics & stretch ideas • not constrained by practice

  42. European Union • DELOS Network of Excellence on Digital Libraries • many projects throughout European Union • heavily technological • many meetings, workshops • resembles DLIs in the US • well funded, long range

  43. Research issues • understanding objects in DL • representing in many formats • non-textual materials • metadata, cataloging, indexing • conversion, digitization • organizing large collections • federated searching over distributed (various) collections • managing collections, scaling • preservation, archiving • interoperability, standardization • accessing, using

  44. DL projects in practice • Heavily oriented toward a variety of institutions – primarily libraries • but also museums, professional societies, specific domains, etc. • Main orientation: institutional missions, contexts, finances • sustainability, preservation in real world • managing growth, rights, access

  45. Agendas • Most DL research agendas are set from top down • from funding agencies to projects • imprint of the computer science community's interest & vision • Most DL practice agendas are set from bottom up • from institutions, incl. many libraries • imprint of institutional missions, interests & vision • providing access to specialized materials and collections from institution(s) that are otherwise not accessible • covering in an integral way a domain with a range of sources

  46. Connection? • DL research & DL practice presently are conducted • mostly independently of each other • minimally informing each other • & having slight or no connection • Parallel universes with little connection & interaction

  47. Conclusions: IS contributions • IS affected handling of inf. in society • Developed an organized body of knowledge & professional competencies • Applied interdisciplinarity • IR reached a mature stage • IR penetrated many fields & human activities • Stressed HUMAN in human-computer interaction

  48. Challenges • Adjust to the growing & changing social & organizational role of inf. & related inf. infrastructure • Play a positive role in globalization of information • Respond to technological imperative in human terms • Respond to changes from inf. to communication explosion - bringing own experiences to resolutions, particularly to the INTERNET • Join competition with quality • Join DIGITAL with LIBRARIES

  49. Juncture • IS is at a critical juncture in its evolution • Many fields, groups ... moving into information • big competition • entrance of powerful players • fight for stakes • To be a major player IS needs to progress in its: • research & development • professional competencies • educational efforts • interdisciplinary relations • Reexamination necessary
