Languages for aboutness • Language • A systematic arrangement of concepts • What makes a language systematic? • What makes an indexing language systematic?
Languages for aboutness • Indexing languages: • Terminological tools • Thesauri (CV – controlled vocabulary) • Subject headings lists (CV) • Authority files for named entities (people, places, structures, organizations) • Classification / Classificatory systems • Keyword lists • Natural language systems (broad interpretation)
Translating Aboutness • Subject Access to Information through: • Evaluate, assess… • Translate to where in the language system • Assign the descriptor (term, class notation, code)
Subject Analysis • What is something about? • What is the content of an object “about”? • Different methods (Wilson, 1968) • Counting (objective method) (White House) • Purposive method (Machiavelli) • Method appealing to unity (Male, partisan pol.) • What stands out (War, inflation) • Challenges • Non-text (Aesthetics/taste, iconography/symbolism)
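The counting method can be sketched as a simple term-frequency tally. This is only an illustration: the stop-word list, the sample text, and the function name `count_candidates` are hypothetical. Wilson describes a way of reasoning about aboutness, not an algorithm, so the code shows the idea of counting and nothing more.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "from"}

def count_candidates(text, top_n=3):
    """Return the top_n most frequent non-stop words with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Hypothetical sample document.
sample = "Cats sleep. Cats hunt mice. Mice hide from cats."
top_terms = count_candidates(sample, top_n=2)
print(top_terms)
```

Counting is objective in the sense that two indexers get the same tally, but a frequent word is not necessarily what the document is about, which is why the other methods exist.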
Aboutness: How to do it! • Read the document [Intellectual reading] • look for key features • many indexers mark up the items • rarely have time to read the whole document • Determine aboutness [Conceptual analysis] • Translate aboutness into the vocabulary or scheme you are using • In general: • Subject headings: 1-3 headings • Descriptors: 5-8 descriptors • Classification: 1 notation
Features of indexing languages: • Involve rules and require maintenance • Can be generated via automatic, human, or auto-human processes • Different processes generally display different strengths and weaknesses.
Features of indexing languages: • With the exception of a few general domain tools, they are generally domain specific. • NAL Thesaurus • Asian Vegetables Thesaurus • Florida Environments Online http://www.w3.org/2001/sw/Europe/reports/thes/thes_links.html • Concepts (or concept representations) are arranged in a discernable order
Language schema designs • Classified (grouping) Hierarchies and facets • NAL Thesaurus http://agclass.nal.usda.gov/agt/agt.htm • AAT http://www.getty.edu/research/conducting_research/vocabularies/aat/ • Alphabetical (ordering) • Asian Vegetables: http://www.nre.vic.gov.au/trade/asiaveg/thes-00.htm • Florida Environments Online Thesaurus http://susdl.fcla.edu/lfnh/thesauri/feol2/
Controlled Vocabulary • A list or a database of subject terms in which each concept has a preferred term or phrase that will be used to represent it in the retrieval tool; the terms not used have references (syndetic structure), and often scope notes. • [jg adapted this definition from Taylor, Organizing Information. (2001).]
Thesaurus (structured thesaurus) • Lexical semantic relationships • Composed of indexing terms/descriptors • Descriptors - representations of concepts • Concepts - Units of meaning (Svenonius) • Algorithmic/similarity thesauri (created via machine processing)
Thesaurus • Preferred terms • Non-preferred terms • Semantic relations between terms • How to apply terms (guidelines, rules) • Scope notes • Adding terms (How to produce terms that are not listed explicitly in the thesaurus)
Preferred Terms, issues… (see handout) • Control form of the term • Spelling, grammatical form • Theatre / Theater • MLA / Modern Language Association • Choose preferred term between synonyms • Dress or Clothing?
Common thesaural identifiers • SN Scope Note • Instruction, e.g. don’t invert phrases • USE Use (another term in preference to this one) • UF Used For • BT Broader Term • NT Narrower Term • RT Related Term
Semantic Relationships • Hierarchy • Equivalence • Association
Hierarchies of Meaning • ‘Glass’ is broader than ‘Beer Glass’ and ‘Wine Glass’ • ‘Wine Glass’ is broader than ‘White wine glass’ and ‘Red wine glass’ • From: Controlled Vocabularies / Paul Miller, Interoperability Focus, UKOLN
Hierarchy • Level of generality – both preferred terms • BT (broader term) • Robins BT Birds • NT (narrower term) • Birds NT Robins • Inheritance, very specific rules
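The reciprocity between BT and NT can be sketched as a small data structure in which recording one link automatically records the other. This is a minimal sketch: the class name `Hierarchy`, its methods, and the extra "Animals" level are hypothetical and not part of the slides.

```python
from collections import defaultdict

class Hierarchy:
    """Stores BT/NT links so that each link implies its reciprocal."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its broader terms (BT)
        self.narrower = defaultdict(set)  # term -> its narrower terms (NT)

    def add_bt(self, term, broader_term):
        # Recording "term BT broader_term" also records "broader_term NT term".
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def ancestors(self, term):
        """All broader terms reachable by following BT links upward."""
        found = set()
        stack = [term]
        while stack:
            for bt in self.broader.get(stack.pop(), ()):
                if bt not in found:
                    found.add(bt)
                    stack.append(bt)
        return found

h = Hierarchy()
h.add_bt("Robins", "Birds")    # from the slide: Robins BT Birds
h.add_bt("Birds", "Animals")   # hypothetical extra level, for illustration
print(sorted(h.narrower["Birds"]))    # narrower terms of Birds
print(sorted(h.ancestors("Robins")))  # everything broader than Robins
```

Following BT links upward is one way to think about inheritance: anything indexed under Robins is also, implicitly, about Birds.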
Equivalence • When two or more terms represent the same concept • One is the preferred term (descriptor), where all the information is collected • The other is the non-preferred and helps the user to find the appropriate term
Equivalence • Non-preferred term USE Preferred term • Nuclear Power USE Nuclear Energy • Periodicals USE Serials • Preferred term UF (used for) Non-preferred term • Nuclear Energy UF Nuclear Power • Serials UF Periodicals
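The USE/UF pairing can be sketched as a lookup table plus its inverse. The term pairs come from the slide; the function names `used_for` and `resolve` are hypothetical, and a real thesaurus tool would store more than this.

```python
# USE references: non-preferred term -> preferred term (pairs from the slide).
USE = {
    "Nuclear Power": "Nuclear Energy",
    "Periodicals": "Serials",
}

def used_for(use_refs):
    """Derive the reciprocal UF references from the USE references."""
    uf = {}
    for non_preferred, preferred in use_refs.items():
        uf.setdefault(preferred, []).append(non_preferred)
    return uf

def resolve(term, use_refs=USE):
    """Return the preferred term for an entry term.

    A term with no USE reference is returned unchanged.
    """
    return use_refs.get(term, term)

UF = used_for(USE)
print(resolve("Periodicals"))   # entry term routed to its descriptor
print(UF["Nuclear Energy"])     # non-preferred terms collected under the descriptor
```

Keeping only the USE table and deriving UF from it is one way to guarantee that the two directions never disagree.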
Association • One preferred term is related to another preferred term • Non-hierarchical • “See also” function • In any large thesaurus, a significant number of terms will mean similar things or cover related areas, without necessarily being synonyms or fitting into a defined hierarchy
Association • Related Terms (RT) can be used to show these links within the thesaurus • Bed RT Bedding • Paint Brushes RT Painting • Vandalism RT Hostility • Programming RT Software
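A "see also" lookup over RT links might look like the sketch below. It assumes RT links are stored in both directions, as thesaurus standards such as Z39.19 generally expect; the helper names `add_rt` and `see_also` are hypothetical.

```python
from collections import defaultdict

related = defaultdict(set)  # term -> set of related terms (RT)

def add_rt(a, b):
    """Record an RT link in both directions."""
    related[a].add(b)
    related[b].add(a)

# RT pairs taken from the slide.
for a, b in [
    ("Bed", "Bedding"),
    ("Paint Brushes", "Painting"),
    ("Vandalism", "Hostility"),
    ("Programming", "Software"),
]:
    add_rt(a, b)

def see_also(term):
    """Return the related terms for a descriptor, sorted alphabetically."""
    return sorted(related.get(term, ()))

print(see_also("Bedding"))
```

Unlike BT/NT, the link carries no direction: neither term is more general than the other, so a single symmetric table is enough.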
Thesauri Guides • National Information Standards Organization. (2005). Guidelines for the construction, format, and management of monolingual thesauri. ANSI/NISO Z39.19-2005. Bethesda, MD: NISO Press. http://www.niso.org/standards/resources/Z39-19-2005.pdf?CFID=5559601&CFTOKEN=31747314 • Aitchison, Jean & Gilchrist, Alan. Thesaurus Construction: A Practical Guide. 3rd ed. London: Aslib, 1997. • Willpower Information Management Consultants http://www.willpower.demon.co.uk/thesprin.htm
Thesauri Directory • Indexing Resources on the WWW • http://www.slais.ubc.ca/resources/indexing/database1.htm • Controlled vocabularies • http://sky.fit.qut.edu.au/~middletm//cont_voc.html
Thesaurify • Apples • Fruit • Apple pie • Bosc pears • Oranges • Vegetables • Bartlett pears • Pears • Fruit stand
Principles…Specificity • Most specific words or phrase expressing the subject • A book about ‘cats’ • Under Cats • And not under Domestic Animals • Or Mammals • Or Zoology
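The specificity principle can be sketched as "pick the candidate that sits deepest in the hierarchy". The sketch assumes the slide's four terms form a single BT chain (Cats, Domestic Animals, Mammals, Zoology), which is a simplification made here for illustration; the names `BROADER`, `depth`, and `most_specific` are hypothetical.

```python
# Hypothetical BT chain built from the slide's example: term -> broader term.
BROADER = {
    "Cats": "Domestic Animals",
    "Domestic Animals": "Mammals",
    "Mammals": "Zoology",
}

def depth(term):
    """Number of BT steps from the term up to the top of the chain."""
    steps = 0
    while term in BROADER:
        term = BROADER[term]
        steps += 1
    return steps

def most_specific(candidates):
    """Return the candidate term that lies deepest in the hierarchy."""
    return max(candidates, key=depth)

print(most_specific(["Mammals", "Cats", "Zoology"]))
```

In practice the indexer judges which term matches the document's level of detail; depth in the hierarchy is only a rough stand-in for that judgment.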
Exhaustivity • Two degrees: • Summarization • Library Cataloging • Dublin Core • The whole Journal of the ACM • Depth indexing • An individual book’s index • More intricate metadata schemas • Individual articles from the Journal of the ACM • Yahoo Weather index http://weather.yahoo.com/
Coextensivity • Coextensivity - assign as many terms as needed to bring out the main theme and, according to guidelines, sub-themes. (p. 29, Lancaster) • “nothing more, nothing less” • a single descriptor, or a single term, is rarely coextensive • Electronic purchasing/buying for air travel • “Electronic commerce/shopping” and “Air travel” • Risk factor or safety with SUVs • “Car safety” and “SUVs”
Warrants • End-user warrant • language of the user • Literary Warrant • System… reflect the language of the literature
Coordinate Indexing • Precoordinate indexing • terms are chosen and coordinated at the time of indexing or cataloging - Subject Headings • “Wireless home computer network” • (Syntagmatic relationships) • Postcoordinate indexing • indexing terms entered discretely and combined by the searcher at the time of searching - Thesauri • Keyword Searching using Boolean Operators • “Wireless network” & “home computer” • (Paradigmatic relationships)
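The difference between the two approaches can be sketched with a toy record set: a precoordinated heading is matched as one whole string, while postcoordinated descriptors are combined with Boolean AND at search time. The records, headings, and function names below are hypothetical.

```python
# Hypothetical records: id -> (precoordinated heading, discrete descriptors).
records = {
    1: ("Wireless home computer network", {"Wireless network", "Home computer"}),
    2: ("Wireless network security", {"Wireless network", "Security"}),
    3: ("Home computer repair", {"Home computer", "Repair"}),
}

def precoordinate_lookup(heading):
    """Match the whole heading, as combined by the indexer."""
    return sorted(rid for rid, (h, _) in records.items() if h == heading)

def postcoordinate_search(*descriptors):
    """Boolean AND: keep records that carry every requested descriptor."""
    wanted = set(descriptors)
    return sorted(rid for rid, (_, d) in records.items() if wanted <= d)

print(precoordinate_lookup("Wireless home computer network"))
print(postcoordinate_search("Wireless network", "Home computer"))
print(postcoordinate_search("Wireless network"))
```

With postcoordination the searcher decides how many descriptors to combine, so dropping one widens the result set without any change to the index.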
Subject Heading Lists • Summarization (use of as many terms as required to summarize the content) • Precoordinate • E.g., “Drug abuse treatment in Britain” • Drug abuse – Treatment – Great Britain (LCSH) • Thesaurus: “Drug Abuse” and “Therapy” or “Treatment” and “Great Britain”, or “Drug Abuse Therapy” and “GB” [facets] • Significance order (from the most important heading to the least important heading)
Subject Heading Lists • Library of Congress Subject Headings (LCSH) • Sears List of Subject Headings (Sears) • Medical Subject Headings (MeSH)
Thesauri • Created according to standards: Z39.19 (ANSI/NISO) • Single-term concepts / postcoordination • “Wireless network” & “home computer” • “Terrorism” “Attacks” & “United States” • More popular in the online environment • Lend themselves to recall • Lend themselves to multilingual environments
Subject Heading Lists • Rules and guidelines • “Thesaurification” • Multi-word concepts / precoordination • “Wireless home computer network” • $y Terrorism attacks $z United States • STRINGS • Lend themselves to precision
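Recall and precision are used here in their usual information-retrieval sense: recall is the share of relevant items that were retrieved, and precision is the share of retrieved items that are relevant. The sketch below computes both for two hypothetical result sets; the record ids and the function name `precision_recall` are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved record ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical searches against the same two relevant records.
broad = precision_recall(retrieved={1, 2, 3, 4}, relevant={1, 2})
narrow = precision_recall(retrieved={1}, relevant={1, 2})
print(broad, narrow)
```

A broad search finds everything relevant but brings noise with it, while a narrow search returns only relevant items but misses some, which is the trade-off the comparison above points to.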