
Classification Systems

Classification Systems. Spring 2006, 3 April. Bharat Mehra, IS 520 (Organization and Representation of Information), School of Information Sciences, University of Tennessee. Objectives: to understand different subject access methods; to compare these methods. Part I. Controlled Vocabulary


Presentation Transcript


  1. Classification Systems. Spring 2006, 3 April. Bharat Mehra, IS 520 (Organization and Representation of Information), School of Information Sciences, University of Tennessee

  2. Assignment 4: Subject Access Objectives: to understand different subject access methods; to compare these methods. Part I: Controlled Vocabulary • In UTK OPAC, select the subject index to browse April First and Holidays. • Look at the LC Authority Records for the two concepts to understand the structure of the controlled vocabulary: authorized heading, lead-in terms (Use For), narrower terms, broader terms, and the corresponding LCC number (similar to the Relative Index in DDC). • Look at the use of the heading Holidays in pre-coordinated subject cataloging in UTK OPAC: What types of subdivisions are being used? Find examples of topical, geographical, chronological, and form subdivisions. • Browse the list forward (Next Page button) and backward (Previous Page button) to see how various holidays (New Year’s Day and Thanksgiving Day) are dispersed in the alphabetical listing. Are headings in near proximity always related concepts? Part II: Classifications • Take a tour of DDC at http://www.oclc.org/dewey/resources/tour/default.htm • Read the comparison of DDC and LCC, both enumerative classifications, at http://staff.oclc.org/~vizine/Intercat/vizine-goetz.htm • Read “Was Ranganathan a Yahoo!?”, about the colon classification, a faceted classification, at http://scout.wisc.edu/Projects/PastProjects/toolkit/enduser/archive/1998/euc-9803.html
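The parts of an authority record named above (authorized heading, Use For lead-in terms, broader and narrower terms, class number) can be pictured as a small data structure. The sketch below is only an illustration: the class name, field names, and every sample term are invented for the example and are not copied from actual LC Authority Records.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AuthorityRecord:
    heading: str                                        # authorized heading
    use_for: List[str] = field(default_factory=list)    # lead-in terms (UF)
    broader: List[str] = field(default_factory=list)    # broader terms (BT)
    narrower: List[str] = field(default_factory=list)   # narrower terms (NT)
    class_number: Optional[str] = None                   # corresponding class number


# Invented sample data for illustration only (not real authority data).
RECORDS = [
    AuthorityRecord(
        heading="Holidays",
        use_for=["Legal holidays"],
        broader=["Days"],
        narrower=["New Year's Day", "Thanksgiving Day"],
    ),
]


def build_lookup(records: List[AuthorityRecord]) -> Dict[str, AuthorityRecord]:
    """Index records by authorized heading and by every lead-in term."""
    lookup: Dict[str, AuthorityRecord] = {}
    for record in records:
        lookup[record.heading.lower()] = record
        for term in record.use_for:
            lookup[term.lower()] = record
    return lookup


def resolve(term: str, lookup: Dict[str, AuthorityRecord]) -> Optional[str]:
    """Return the authorized heading for a searcher's term, or None."""
    record = lookup.get(term.lower())
    return record.heading if record else None
```

The point of the lookup is that a searcher who enters a lead-in term is still led to the single authorized heading, which is what keeps works on one concept together in the catalog.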

  3. Assignment 4: Subject Access Report: The report should read like a well-organized essay. No need to answer the specific questions above; just use the results you obtained as examples to illustrate or back up your arguments. You must use some examples from the above activities to make your points and to show that you gained some understanding while performing the above tasks. Your essay should have sections that include the following parts: • Summarize your understanding of the roles of controlled vocabularies in providing subject access to intellectual works • Summarize your understanding of the roles of classifications in organizing information objects in physical libraries • Compare classification systems with alphabetical subject headings or thesauri (controlled vocabulary) in providing subject access (pros and cons) • Discuss the new roles of controlled vocabularies and classifications in organizing electronic resources on the Web

  4. Subject Analysis and Classification • Subject analysis is the part of creating metadata that deals with the conceptual analysis of an information object to determine what it is about, and • Translating the “aboutness” of an information object into controlled vocabulary terms for subject headings and classification notations

  5. Knowledge Classification • A logical system for the organization of knowledge • The division of knowledge into classes usually is based on disciplines • Classes are arranged into a hierarchical and coherent framework

  6. Knowledge Classification: Multistage Process • Identifying property of interest • Distinguishing objects that possess that property or those which lack it • Grouping objects that have the property into one class • Identifying relationships between classes • Finding distinctions within classes to arrive at subclasses Classical theory: From general to specific Problems???

  7. Fuzzy Set Theory (Lotfi Zadeh) • Some categories are well defined, others are not • Continuum of property rather than discrete marks • If categories defined by properties members share, then no member should be “better” than the others (prototypes) • Categories should be independent of humans doing the categorization • Ad hoc categories: on the spur of the moment
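The contrast between a "continuum of property" and "discrete marks" can be shown with a membership function. This is a minimal sketch: the category "tall", the function names, and the thresholds are all hypothetical choices made for the example, not values from the slides.

```python
def crisp_tall(height_cm: float, threshold: float = 175.0) -> bool:
    """Classical category: an object is either in the class or out of it."""
    return height_cm >= threshold


def fuzzy_tall(height_cm: float, low: float = 160.0, high: float = 190.0) -> float:
    """Fuzzy category: degree of membership between 0.0 and 1.0.

    Heights at or below `low` are not members at all, heights at or above
    `high` are full members, and membership rises linearly in between.
    """
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)
```

Under the crisp rule a height just below the threshold is simply excluded, while the fuzzy rule gives it partial membership, which is one way to express that some members of a category are "better" examples than others.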

  8. Classificatory Structure (Tree) Top-level classes: Philosophy; Natural sciences; Literature. Under Natural sciences: Math, Astronomy, Physics, Chemistry, ……, Plants. Under Math: Algebra, Geometry. This may be arranged using indentation, as is often seen in printed schedules

  9. Classification (Print Format)
      Philosophy
      Natural sciences
      Literature
      ****************
      Natural sciences
      - Math
      - Astronomy
      - Physics
      - Chemistry
      ……
      - Plants
      Literature
      1
      Natural sciences
      - Math
      -- Algebra
      -- Geometry
      - Astronomy
      ……
      201

  10. Linearization using notations Philosophy 100; Natural sciences 500 (Math 510 (Algebra 512, Geometry 516), Astronomy 520, Physics 530, Chemistry 540, ……, Plants 580); Literature 800. The linear order of these concepts using numeric notation: 100 ... 500 510 512 ... 516 ... 520 ... 530 ... 540 ... 580 ... 800
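Linearization can be sketched as a pre-order walk of the tree: visit a class, then each of its subclasses in turn. The nested-tuple layout and the names `SCHEDULE` and `linearize` are assumptions made for this example; the notations and labels are the ones on the slide.

```python
# Each node is (notation, label, children).
SCHEDULE = [
    ("100", "Philosophy", []),
    ("500", "Natural sciences", [
        ("510", "Math", [
            ("512", "Algebra", []),
            ("516", "Geometry", []),
        ]),
        ("520", "Astronomy", []),
        ("530", "Physics", []),
        ("540", "Chemistry", []),
        ("580", "Plants", []),
    ]),
    ("800", "Literature", []),
]


def linearize(nodes):
    """Flatten the tree into shelf order: parent first, then its children."""
    order = []
    for notation, label, children in nodes:
        order.append((notation, label))
        order.extend(linearize(children))
    return order


for notation, label in linearize(SCHEDULE):
    print(notation, label)
```

Because the notation was assigned to follow the hierarchy, the pre-order walk and a plain sort of the notations give the same sequence.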

  11. Classification vs. Alphabetical Order Alphabetical order: Algebra 512; Astronomy 520; Chemistry 540; Geometry 516; Literature 800; Math 510; Natural sciences 500; Philosophy 100; Physics 530; Plants 580. Classified order: 100 (Philosophy) ... 500 (Natural sciences) 510 (Math) 512 (Algebra) ... 516 (Geometry) ... 520 (Astronomy) ... 530 (Physics) ... 540 (Chemistry) ... 580 (Plants) ... 800 (Literature)
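The two orders can be compared directly by sorting the same concepts two ways. This is a small sketch using the labels and notations from the slide; the helper `distance` is a hypothetical name introduced only to measure how far apart two concepts land in a listing.

```python
CONCEPTS = {
    "Algebra": 512, "Astronomy": 520, "Chemistry": 540, "Geometry": 516,
    "Literature": 800, "Math": 510, "Natural sciences": 500,
    "Philosophy": 100, "Physics": 530, "Plants": 580,
}

alphabetical = sorted(CONCEPTS)                   # order by label
classified = sorted(CONCEPTS, key=CONCEPTS.get)   # order by notation


def distance(order, first, second):
    """Number of positions separating two concepts in a listing."""
    return abs(order.index(first) - order.index(second))


print(distance(alphabetical, "Algebra", "Geometry"))
print(distance(classified, "Algebra", "Geometry"))
```

In the classified listing the two branches of Math sit next to each other directly under Math, whereas the alphabetical listing scatters them among unrelated concepts.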

  12. Algorithm for Browsing and Searching • Traversing the hierarchical (tree) structure is a top-down process • At each level of the hierarchy the searcher must select one node to expand to the next level • Think about how you find information using a Web directory: what is the path?
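The top-down process can be sketched with a nested dictionary standing in for a directory. The structure and the function names `browse` and `find_path` are hypothetical; the category labels reuse the example tree from the earlier slides.

```python
DIRECTORY = {
    "Philosophy": {},
    "Natural sciences": {
        "Math": {"Algebra": {}, "Geometry": {}},
        "Astronomy": {},
    },
    "Literature": {},
}


def browse(tree, choices):
    """Follow one choice per level; return the path and the next options."""
    path, level = [], tree
    for choice in choices:
        if choice not in level:
            raise KeyError(f"{choice!r} is not offered here: {sorted(level)}")
        path.append(choice)
        level = level[choice]
    return path, sorted(level)


def find_path(tree, target, prefix=()):
    """Depth-first search for the path a browser would have to follow."""
    for label, children in tree.items():
        path = prefix + (label,)
        if label == target:
            return list(path)
        found = find_path(children, target, path)
        if found:
            return found
    return None
```

For example, `find_path(DIRECTORY, "Geometry")` returns the chain of selections from the top level down, which is the "path" the slide asks about.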

  13. Library Classification • A way to organize information objects by grouping subjects in the manner that is most useful to users • The most-used systems are • LCC: Library of Congress Classification • DDC: Dewey Decimal Classification • Both are hierarchical and enumerative: they attempt to assign a designation to every subject concept needed in the system • LCC is more enumerative than DDC • UDC: now faceted

  14. Classification Schemes • Verbal description (topic by topic) of things/concepts that can be represented • Arrangement of verbal descriptions in classed or logical order • Notational system alongside each verbal description (schedules) • Cross-references for navigation within the schedules • Alphabetical index of terms used in schedule (and synonyms) • Instructions for use • Organization that maintains classification scheme

  15. Ranganathan’s Colon Classification: A Faceted Approach • Parts of the whole: faces of a diamond • Notations for subparts strung together • 5 fundamental categories of a subject • Personality (focal or most specific subject) • Matter (material) • Energy (activity, operation, process) • Space (place) • Time e.g., design of wooden furniture in eighteenth century America Facet indicators: not convenient for shelves. Convenient in the age of the Internet. Why?
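The furniture example can be analyzed into the five categories and strung together in a fixed citation order. This is a sketch only: the facet assignment is one plausible reading of the example, and the separator and `name=value` rendering are invented for illustration and are not actual Colon Classification notation.

```python
# Citation order of the five fundamental categories.
PMEST = ("personality", "matter", "energy", "space", "time")

# One plausible facet analysis of
# "design of wooden furniture in eighteenth century America".
furniture_design = {
    "energy": "design",
    "matter": "wood",
    "personality": "furniture",
    "space": "America",
    "time": "18th century",
}


def facet_string(facets, separator=" | "):
    """String the facets together in PMEST order (illustrative rendering)."""
    return separator.join(
        f"{name}={facets[name]}" for name in PMEST if name in facets
    )


def matches(facets, **wanted):
    """True if the item carries every requested facet value."""
    return all(facets.get(name) == value for name, value in wanted.items())
```

A single string like the one `facet_string` builds fixes one shelf position, but `matches(furniture_design, matter="wood")` or `matches(furniture_design, space="America")` can each retrieve the item independently, which hints at why facets suit online searching better than shelving.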

  16. Library Classification: Functions • Arrange items in a logical manner on the shelves • Locate known work through call number: shared mark on item and catalog • Collocate “like” items: chosen property is subject • Provide systematic display of bibliographic entries in printed catalogs, indexes, etc. • Help in direct retrieval

  17. Basics • Successive stages of classes and subclasses with a chosen property as the basis of each stage • Hierarchical tree structure: Genus and species • Facets, arrays, chain, citation order

  18. Classification Concepts • Broad vs. Close Classification • Classification of Knowledge vs. Classification of a Particular Collection (Literary warrant) • Integrity vs. Keeping Pace with Knowledge • Fixed vs. Relative Location • Closed vs. Open Stacks • Location Device (call number) vs. Collocation Device (classification notation)

  19. Library of Congress Classification A -- GENERAL WORKS B -- PHILOSOPHY. PSYCHOLOGY. RELIGION C -- AUXILIARY SCIENCES OF HISTORY D -- HISTORY (GENERAL) AND HISTORY OF EUROPE E -- HISTORY: AMERICA F -- HISTORY: AMERICA G -- GEOGRAPHY. ANTHROPOLOGY. RECREATION H -- SOCIAL SCIENCES J -- POLITICAL SCIENCE K -- LAW L -- EDUCATION M -- MUSIC AND BOOKS ON MUSIC N -- FINE ARTS P -- LANGUAGE AND LITERATURE Q -- SCIENCE R -- MEDICINE S -- AGRICULTURE T -- TECHNOLOGY U -- MILITARY SCIENCE V -- NAVAL SCIENCE Z -- BIBLIOGRAPHY. LIBRARY SCIENCE. INFORMATION RESOURCES (GENERAL)

  20. Library of Congress Classification • Subclass B Philosophy (General) • Subclass BC Logic • Subclass BD Speculative philosophy • Subclass BF Psychology • Subclass BH Aesthetics • Subclass BJ Ethics • Subclass BL Religions. Mythology. Rationalism • Subclass BM Judaism • Subclass BP Islam. Bahaism. Theosophy, etc. • Subclass BQ Buddhism • Subclass BR Christianity • Subclass BS The Bible • Subclass BT Doctrinal Theology • Subclass BV Practical Theology • Subclass BX Christian Denominations

  21. Library of Congress Classification Subclass B • B1-5802 Philosophy (General) • B69-99 General works • B108-5802 By period (including individual philosophers and schools of philosophy) • B108-708 Ancient • B720-765 Medieval • B770-785 Renaissance • B790-5802 Modern • B808-849 Special topics and schools of philosophy • B850-5739 By region or country • B5800-5802 By religion Subclass BC • BC1-199 Logic • BC11-39 History • BC25-39 By period • BC60-99 General works

  22. LCC—Some Features Notations • lack of built-in hierarchy • alphanumeric--linearization Advantages • comprehensive • flexible • inclusive • adaptive / hospitable Cons • difficult to search hierarchically

  23. Dewey Decimal Classification • From the divine to the mundane (except 000) • Using decimals for its categories allows the notation to be purely numerical and infinitely hierarchical • Faceted aspects: combines elements from different parts of the structure to construct a number representing the subject content • Except for general works and fiction, works are classified principally by subject, with extensions for subject relationships, place, time, or type of material, producing classification numbers of not less than three digits but otherwise of indeterminate length, with a decimal point before the fourth digit, where present • Classmarks are to be read as numbers, in the order: 050, 220, 330.973, 331, etc.
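Reading classmarks "as numbers" can be sketched by sorting them on their decimal value. The function name `shelf_order` is a hypothetical helper; the sample classmarks are the ones listed on the slide.

```python
from decimal import Decimal


def shelf_order(classmarks):
    """Sort Dewey classmarks by numeric value, keeping the original strings."""
    return sorted(classmarks, key=Decimal)


print(shelf_order(["331", "050", "330.973", "220"]))
```

Sorting on the decimal value places 330.973 before 331 no matter how many digits follow the point, which is what lets the notation grow in length without disturbing the order.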

  24. Dewey Decimal Classification Main classes=>divisions=>sections The system is made up of ten categories: • 000 Computers, information and general reference • 100 Philosophy and psychology • 200 Religion • 300 Social sciences • 400 Language • 500 Science and mathematics • 600 Technology • 700 Arts and recreation • 800 Literature • 900 History and geography 330 for economy + 94 for Europe = 330.94 European economy; 973 for United States + 005 form division for periodicals = 973.005, periodicals concerning the United States generally
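The two number-building examples on this slide follow the same mechanical pattern: append the extra digits to the three-digit base and put the decimal point after the third digit. The sketch below shows only that notation mechanic under a hypothetical function name; actual DDC number building is governed by the instructions in the schedules and tables, which this does not model.

```python
def build_number(base: str, addition: str = "") -> str:
    """Append `addition` to a three-digit base number, pointing after digit 3."""
    digits = base + addition
    if len(base) != 3 or not digits.isdigit():
        raise ValueError("expected a three-digit base and a numeric addition")
    if len(digits) == 3:
        return digits
    return digits[:3] + "." + digits[3:]


print(build_number("330", "94"))    # economy + Europe
print(build_number("973", "005"))   # United States + periodicals
```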

  25. Dewey Decimal Classification • 000 Generalities: 001 Knowledge; 002 The book; 003 Systems; 004 Data processing, computer science; 005 Computer programming, programs; 006 Special computer methods; 007 Not assigned or no longer used; 010 Bibliography; 011 Bibliographies; 012 Bibliographies of individuals • 100 Philosophy & psychology: 101 Theory of philosophy; 102 Miscellany of philosophy; 103 Dictionaries of philosophy; 104 Not assigned or no longer used; 105 Serial publications of philosophy; 106 Organizations of philosophy; 107 Education, research in philosophy • 200 Religion: 201 Philosophy of Christianity; 202 Miscellany of Christianity; 203 Dictionaries of Christianity; 204 Special topics; 205 Serial publications of Christianity; 206 Organizations of Christianity; 207 Education, research in Christianity; 208 Kinds of persons in Christianity; 209 History & geography of Christianity; 210 Natural theology; 211 Concepts of God; 212 Existence, attributes of God

  26. Comparison of DDC & LCC DDC: • Knowledge • Arabic numerals • Universal • Uneven classes • Logical placement of subjects • Developer (“generalist”) • Mnemonic LCC: • Literary warrant • Alphanumeric • American • Hospitable • Logical hierarchies often lost • Developer (“specialists”) • Confusing notation

  27. How Call Numbers Work

  28. Call Numbers (LCC) Every book is given a unique call number to serve as an address for locating the book on the shelf. The call number has two parts: the class number (Library of Congress Classification or Dewey Decimal Classification) and the Cutter number or book number

  29. Call Numbers (DDC) Every book is given a unique call number to serve as an address for locating the book on the shelf. The call number has two parts: the Dewey Decimal Classification number and the Cutter number or book number. The CUTTER NUMBER for a book usually consists of the first letter of the author's last name and a series of numbers (from a table designed to help maintain an alphabetical arrangement of names): Conley, Ellen C767; Conley, Robert C768; Cook, Robin C77; Cook, Thomas C773. How do we keep the call number unique if the library has several works by the same author? By adding a work mark or work letter, e.g., under 813.54 for Cook, Robin: C77a Acceptable Risk; C77f Fever; C77fa Fatal Cure

  30. Call Numbers (DDC) 813.54 L52f: Farthest shore, Ursula Le Guin; 813.54 L52fo: Four ways to forgiveness, Ursula Le Guin; 813.54 L52p: Planet of Exile, Ursula Le Guin; 813.54 L52Z B54: Approaches to the Fiction of Ursula Le Guin, James Bittner. 813.54 is the Dewey number for American Literature after 1945, L52 is the Cutter number for Ursula Le Guin, Z is for a work of criticism, and B54 is for James Bittner, the author of Approaches..... The capital Z, the last letter in the alphabet, ensures that all criticisms are shelved after the author's works
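The work marks in the Cook and Le Guin examples are consistent with a simple rule: take the shortest run of letters from the start of the title that has not already been used under that Cutter number. The sketch below assumes that rule and assumes titles are processed in the order shown; the function name is hypothetical, and local cataloging practice (for instance, skipping initial articles) may differ.

```python
def assign_work_marks(cutter, titles):
    """Give each title a unique book number: the Cutter plus a work mark.

    The work mark is the shortest unused prefix of the title's letters,
    so results depend on the order in which titles are processed.
    """
    used = set()
    numbers = {}
    for title in titles:
        letters = [ch for ch in title.lower() if ch.isalpha()]
        for length in range(1, len(letters) + 1):
            mark = "".join(letters[:length])
            if mark not in used:
                break
        else:
            raise ValueError(f"no unused work mark available for {title!r}")
        used.add(mark)
        numbers[title] = cutter + mark
    return numbers


print(assign_work_marks("C77", ["Acceptable Risk", "Fever", "Fatal Cure"]))
```

Because "f" is already taken by the first title beginning with F, the next one receives a two-letter mark, which keeps every call number unique while preserving rough alphabetical order by title.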

  31. Assign Call Numbers • Select appropriate class number from the schedule • Add auxiliary number from tables or based on rules to extend the class number • Add cutter number as book mark (use cutter tables)

  32. Call Numbers using LCC • QE534.2.B64 Call numbers can begin with one, two, or three letters • The first letter of a call number represents one of the 21 major divisions of the LCC System. In the example, the subject "Q" is Science. • The second letter "E" represents a subdivision of the sciences, Geology. All books in the QE's are primarily about Geology. • Books in categories E, United States History, and F, Local U.S. History and American History, do not have a second letter (exception: in Canada, FC is used for Canadian history). • Books about Law, K's, can have three letters, such as KFH, Law of Hawaii. Some areas of history (D) also have three-letter call numbers.

  33. Call Numbers using LCC Numbers after letters. • The first set of numbers in a call number helps to define a book's subject. "534.2" tells us more about the book's subject. The range QE 500-625 covers books about "Dynamic and Structural Geology" • Books with call numbers QE534.2 are specifically "Earthquakes, Seismology - General Works - 1970 to Present" • One of the most frequently used numbers in call numbers is "1", which is often used for general periodicals in a given subject area. • For example, Q1.S3 is the call number for the journal Science. • Journals are also given call numbers based on the specific subject. • For example, QE531.E32 is the call number for the journal Earthquake Spectra, as QE531 is the call number for periodicals about "Earthquakes, Seismology"

  34. Call Number using LCC • In QE534.2.B64, the B64 is taken from the two-number table and represents the last name of the author, Bruce A. Bolt. • The book is Earthquakes. • Some books have two Cutters; the first one is usually a further breakdown of the subject matter. • QA 76.76 H94 M88 is a book located in the Mathematics section of the Q's. • QA 76 is about Computer Science • The ".76" indicates Special Topics in Automation • "H94" tells us that this is a book about HTML • "M88" represents the last name of the first author, “Musciano” • The book is HTML: The Definitive Guide

  35. Call Number using LCC • Class mark: Letters Numbers Decimal ... • Cutter numbers: Letters plus numbers • --single cutter as a book mark • --double cutters • a first Cutter number as class extension by topic geographic etc.; • a second Cutter number as book number
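The structure described here (class letters, a number with an optional decimal, then one or two Cutter numbers) can be sketched as a small parser. The regular expression and the names `LccCallNumber` and `parse_lcc` are assumptions for this example; the pattern covers only the simple shapes shown on these slides, not dates, volume numbers, or work marks.

```python
import re
from typing import List, NamedTuple


class LccCallNumber(NamedTuple):
    letters: str        # one to three class letters, e.g. "QE"
    number: str         # class number, optionally with a decimal, e.g. "534.2"
    cutters: List[str]  # zero or more Cutter numbers, e.g. ["H94", "M88"]


_PATTERN = re.compile(
    r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)((?:\s*\.?\s*[A-Z]\d+)*)\s*$"
)


def parse_lcc(call_number: str) -> LccCallNumber:
    """Split a simple LCC call number into letters, number, and Cutters."""
    match = _PATTERN.match(call_number)
    if match is None:
        raise ValueError(f"unrecognized call number: {call_number!r}")
    letters, number, cutter_part = match.groups()
    return LccCallNumber(letters, number, re.findall(r"[A-Z]\d+", cutter_part))


print(parse_lcc("QE534.2.B64"))
print(parse_lcc("QA 76.76 H94 M88"))
```

With two Cutters, the first element of `cutters` is the topical extension of the class and the last is the book number.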

  36. Call Number in MARC • 050 00 $a Q184 $b I87 • 050 00 $a QA76.9.C64 $b C36
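In MARC field 050, subfield $a carries the classification number and subfield $b the item number. The sketch below parses the textual display form used on this slide (tag, indicators, then `$`-prefixed subfields); it does not read binary MARC records, and the function names are hypothetical.

```python
def parse_marc_line(line):
    """Split a displayed MARC field into tag, indicators, and subfields."""
    head, *chunks = line.split("$")
    parts = head.split()
    if not parts:
        raise ValueError("missing field tag")
    tag = parts[0]
    indicators = parts[1] if len(parts) > 1 else ""
    # Each chunk starts with the one-character subfield code.
    subfields = [(chunk[0], chunk[1:].strip()) for chunk in chunks if chunk]
    return tag, indicators, subfields


def call_number_from_050(line):
    """Join $a (classification number) and $b (item number) of an 050 field."""
    tag, _indicators, subfields = parse_marc_line(line)
    if tag != "050":
        raise ValueError(f"expected field 050, got {tag}")
    values = dict(subfields)
    return " ".join(v for v in (values.get("a"), values.get("b")) if v)


print(call_number_from_050("050 00 $a Q184 $b I87"))
print(call_number_from_050("050 00 $a QA76.9.C64 $b C36"))
```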

  37. Application -- One • How to organize periodicals on the shelves? • Method 1. Alphabetical by title • Method 2. Classification • Pros and cons of each method?

  38. Application – Two • How to organize monographs in a series? • Method 1. ASIST conference proceedings as a monograph series (see record 1) • Method 2. as a serial: journals or magazines (see record 2)

  39. Definitions • Serials: publications issued in successive parts that are intended to continue indefinitely • Monograph series contain individual objects that are complete bibliographic units (not intended to be continued indefinitely) • Pros and cons of each practice?

  40. Well organized subject headings -- beyond listing • Medical Subject Headings (MeSH): http://www.nlm.nih.gov/mesh/2005/MeSHtree.A.html

  41. Purpose of Classification • Provides meaningful subject access via retrieval tool • Provides collocation of objects of a like nature (Cutter) • Provides a logical location for similar objects • Saves user time

  42. Purpose of Classification Because books are classified by subject, you can often find several helpful books on the same shelf, or nearby

  43. Other subject access tools • Facet -- synthetical classification was developed to overcome the limitations of enumerative hierarchical classifications to allow combination of classes • Taxonomy -- organization or subject oriented: classification of things, or the principles underlying the classification • Ontology -- building shareable knowledge structures (among people, computer, …): "What are the fundamental categories of being?"

  44. Semantic Web • The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. The first steps in weaving the Semantic Web into the structure of the existing Web are already under way. In the near future, these developments will usher in significant new functionality as machines become much better able to process and "understand" the data that they merely display at present. • ---Tim Berners-Lee et al., Scientific American, May 17, 2001
