
Comparison of Keyword Searching Using FAST vs. Using LCSH

Presentation for the ALCTS CCS Program “FAST: A New System of Subject Access for Cataloging and Metadata,” New Orleans, Saturday, June 24, 2006, by Arlene G. Taylor.


Presentation Transcript


  1. Comparison of Keyword Searching Using FAST vs. Using LCSH
     Presentation for the ALCTS CCS Program “FAST: A New System of Subject Access for Cataloging and Metadata”
     New Orleans, Saturday, June 24, 2006
     by Arlene G. Taylor

  2. The Database
     • OCLC (i.e., Ed O’Neill and team) created a test database of bibliographic records
     • Records were a subset of WorldCat records
     • Each record had both a set of LCSH headings and a set of FAST headings
     • The FAST headings were “translated” from the LCSH headings
     • Two indexes were created by OCLC’s research team – one to search FAST headings and one to search LCSH headings
     © 2006 Arlene G. Taylor

  3. The Project
     • Participants were students in a Subject Analysis class at the University of Pittsburgh
     • Two parts:
       • Search both the LCSH and FAST indexes for newspapers in the student’s home state
       • Search four topics of interest in both the LCSH and FAST indexes
     • Students were asked to explain the differences found between the two indexes

  4. Newspaper searches
     • Results of searches for “newspapers” plus any state that has an authorized AACR2 abbreviation almost always differ between the two indexes
     • A search retrieves both records for newspapers themselves and records for works about newspapers
     • The state is abbreviated on some records, using the abbreviations in AACR2, but the searcher almost always spells out the state name (some states, e.g., Ohio and Iowa, have no abbreviation)

  5. Newspaper searches (cont.)
     • A record about newspapers may have an LCSH subject heading:
       650  0 American newspapers $z Pennsylvania $z Bucks County
     • In FAST this is translated to:
       650  7 American newspapers $2 fast
       651  7 Pennsylvania $z Bucks County $2 fast
     • A keyword search for “Newspapers Pennsylvania” will retrieve the record in both LCSH and FAST indexes.

  6. Newspaper searches (cont.)
     • A record for a newspaper itself may have the LCSH heading:
       651  0 Clearfield (Clearfield County, Pa.) $v Newspapers.
     • In FAST this is translated to:
       651  7 Pennsylvania $z Clearfield (Clearfield County) $2 fast
       655  7 Newspapers $2 fast
     • A keyword search for “Newspapers Pennsylvania” will retrieve the record only in the FAST index.
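
The two newspaper examples on slides 5 and 6 can be reproduced with a minimal Python sketch. It assumes that the keyword index is simply the set of words in a record's subject headings and that a search requires every query word to be present. The function names `keywords` and `matches` are hypothetical and are not part of the OCLC test system. The `$2 fast` source code is left out of the FAST strings so that it is not indexed as a word.

```python
import re

def keywords(headings):
    """Return the set of lowercase words found in a list of heading strings."""
    words = set()
    for heading in headings:
        text = re.sub(r"\$\w", " ", heading)  # drop subfield codes such as $z and $v
        words.update(re.findall(r"[a-z0-9]+", text.lower()))
    return words

def matches(query, headings):
    """Assumed search rule: every query word must occur in the headings."""
    return set(query.lower().split()) <= keywords(headings)

# Record about newspapers (slide 5)
lcsh_about = ["American newspapers $z Pennsylvania $z Bucks County"]
fast_about = ["American newspapers", "Pennsylvania $z Bucks County"]

# Record for a newspaper itself (slide 6)
lcsh_paper = ["Clearfield (Clearfield County, Pa.) $v Newspapers."]
fast_paper = ["Pennsylvania $z Clearfield (Clearfield County)", "Newspapers"]

query = "Newspapers Pennsylvania"
print(matches(query, lcsh_about), matches(query, fast_about))  # True True
print(matches(query, lcsh_paper), matches(query, fast_paper))  # False True
```

Under these assumptions the LCSH heading for the newspaper itself carries only the abbreviation “Pa.”, so the spelled-out state name never matches, while the FAST translation supplies “Pennsylvania” as a separate geographic heading.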

  7. Part II of the project
     • While most students understood on some level the different results they got in Part I, few of them understood their different results in Part II.
     • Therefore, the result of Part II was to generate 76 topics that I then searched again to determine results and the reasons for differences.

  8. Basic statistics
     • Number of searches – 76
     • Number of records found using the FAST index – 2371
     • Number of records found using the LCSH index – 2340
     • Number of records found in both indexes – 2200
     • Number of records found using the FAST index but not the LCSH index – 171
     • Number of records found using the LCSH index but not the FAST index – 140
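
The figures on slide 8 are internally consistent, which a short Python check makes explicit. The variable names are illustrative. The count of distinct records and the overlap share are derived here and do not appear in the presentation; they assume that “same using either index” means records retrieved by both indexes.

```python
found_fast = 2371   # records found using the FAST index
found_lcsh = 2340   # records found using the LCSH index
found_both = 2200   # records found using either index alike

fast_only = found_fast - found_both   # found with FAST, missed by LCSH
lcsh_only = found_lcsh - found_both   # found with LCSH, missed by FAST

# Derived values (not stated on the slide)
distinct = found_both + fast_only + lcsh_only
overlap_share = found_both / distinct

print(fast_only, lcsh_only)               # 171 140
print(distinct, f"{overlap_share:.1%}")   # 2511 87.6%
```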

  9. Reasons for variation in searching results
     • Invalid LCSH (or not established) not translated to FAST
     • $x and/or $v in 600 and 610 fields not indexed in the LCSH index
     • Word indexed in FAST index because it was in a 650 field with 2nd indicator 7 and a $2 at the end, but the $2 contained a code for a vocabulary other than FAST
     • Some names (personal or corporate) not translated to FAST
     • Differences between LCSH and FAST

  10. Invalid LCSH (or not established) not translated to FAST
      • At the time of creation of the FAST file we were working with, the rule was to convert LCSH (6xx, 2nd indicator 0) to FAST, but then only those headings that matched a FAST authority record were kept as FAST headings in the record.
      • 117 records found using the LCSH index were not found using the FAST index due to this “rule”
      • An example showing a result of searching for “information literacy” follows:

  11. Search for “information literacy”:
      650  0 Business $x Research.
      650  0 Business $x Research $x Computer network resources.
      650  0 Information retrieval $x Study and teaching.
      650  0 Electronic information resource literacy $x Study and teaching.
      650  7 Business $x Research $2 fast
      650  7 Business $x Research $x Computer network resources $2 fast
      650  7 Information retrieval $x Study and teaching $2 fast

  12. Invalid LCSH (or not established) not translated to FAST (cont.)
      • “Electronic information resource literacy” is in the FAST authority file, but not with the subdivision “Study and teaching.”
      • Currently the heading would have the subdivision removed and a match would be made to the heading without the subdivision.
      • A keyword search for “information literacy” in the future would find this record through the FAST index as well as the LCSH index.
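
The difference between the original matching rule and the later one described on slide 12 can be sketched in Python. This is only an illustration under stated assumptions: the authority file is modelled as a small set of strings, the function names are hypothetical, and the actual OCLC translation program is not shown in the presentation.

```python
def main_heading(heading):
    """Text before the first subfield code, without trailing punctuation."""
    return heading.split("$")[0].strip().rstrip(".")

def translate_strict(heading, authority):
    """Original rule: keep the heading only if the full string is established."""
    full = heading.strip().rstrip(".")
    return full if full in authority else None

def translate_with_fallback(heading, authority):
    """Later rule: if the full heading fails, retry without its subdivisions."""
    full = translate_strict(heading, authority)
    if full is not None:
        return full
    main = main_heading(heading)
    return main if main in authority else None

# Toy authority file (hypothetical contents, modelled on slide 11)
authority = {
    "Business $x Research",
    "Information retrieval $x Study and teaching",
    "Electronic information resource literacy",
}

heading = "Electronic information resource literacy $x Study and teaching."
print(translate_strict(heading, authority))         # None
print(translate_with_fallback(heading, authority))  # Electronic information resource literacy
```

With the fallback, the record gains a FAST heading containing both “information” and “literacy”, which is why the search would succeed in both indexes.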

  13. $x and/or $v in 600 and 610 fields not indexed for the LCSH index
      • At the time of creation of the FAST and LCSH indexes we were working with, only subfields a, b, c, d (and q in 600) in fields 600 and 610 (with 2nd indicator 0) were indexed for the LCSH index.
      • 72 records found using the FAST index were not found using the LCSH index due to this “rule”
      • An example showing a result of searching for “archives catalogs” follows:

  14. Search for “archives catalogs”:
      610 20 Baptist Missionary Society $x Archives $v Catalogs.
      650  0 Baptists $x Missions $z West Indies.
      650  0 Baptists $x Missions $z Africa.
      650  0 Baptists $x Missions $z Asia.
      610 27 Baptist Missionary Society. $2 fast
      650  7 Archives $2 fast
      650  7 Baptists $x Missions $2 fast
      651  7 Africa $2 fast
      651  7 Asia $2 fast
      651  7 West Indies $2 fast
      655  7 Catalogs $2 fast

  15. $x and/or $v in 600 and 610 fields not indexed for the LCSH index (cont.)
      • Currently these subfields would be included in the LCSH index.
      • A keyword search for “archives catalogs” in the future would find this record through the LCSH index as well as the FAST index.
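
The effect of the subfield restriction on slides 13 to 15 can be shown with a small Python sketch. It assumes a field is represented as a tag plus a list of (subfield code, value) pairs and that indexing means collecting the words of the selected subfields; the names `INDEXED_NAME_SUBFIELDS` and `indexed_words` are hypothetical.

```python
import re

# Subfields indexed for name headings under the original rule (slide 13)
INDEXED_NAME_SUBFIELDS = {"600": set("abcdq"), "610": set("abcd")}

def indexed_words(tag, subfields, restrict_names):
    """Collect index words from one field, optionally skipping $x/$v in 600/610."""
    words = set()
    for code, value in subfields:
        allowed = INDEXED_NAME_SUBFIELDS.get(tag)
        if restrict_names and allowed is not None and code not in allowed:
            continue  # e.g. $x Archives and $v Catalogs are dropped
        words.update(re.findall(r"[a-z0-9]+", value.lower()))
    return words

# 610 20 Baptist Missionary Society $x Archives $v Catalogs.
field = ("610", [("a", "Baptist Missionary Society"),
                 ("x", "Archives"),
                 ("v", "Catalogs.")])
query = {"archives", "catalogs"}

print(query <= indexed_words(*field, restrict_names=True))   # False
print(query <= indexed_words(*field, restrict_names=False))  # True
```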

  16. Word indexed in FAST index because it was in a 650 field with 2nd indicator 7 and a $2 at the end
      • Not all 2nd indicator 7, $2 designated terms are FAST terms – some are from gsafd, nasa, ram, lctgm, etc.
      • 40 records found using the FAST index were not found using the LCSH index due to this oversight
      • An example showing a result of searching for “dog training” follows:

  17. Search for “dog training”:
      650  0 Dog trainers $z Arkansas $z Blanchard Springs.
      650  7 Animal training $z Arkansas $z Blanchard Springs $y 1950-1960. $2 lctgm
      650  7 Dogs $z Arkansas $z Blanchard Springs $y 1950-1960. $2 lctgm
      650  7 Photojournalism $z Arkansas $z Little Rock $y 1950-1960. $2 lctgm
      650  7 Dog trainers $2 fast
      651  7 Arkansas $z Little Rock $2 fast

  18. Word indexed in FAST index because it was in a 650 field with 2nd indicator 7 and a $2 at the end (cont.)
      • Currently the indexing program would be refined so as not to include fields with 2nd indicator 7 and $2 unless “fast” is in $2.
      • A keyword search for “dog training” in the future would not find this record through either the LCSH index or the FAST index.
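
The refinement described on slide 18 amounts to checking the `$2` source code before a field is added to the FAST index. The following Python sketch illustrates that idea with the record from slide 17. The tuple layout, the function names, and the removal of subfield codes from the heading text are simplifying assumptions, not a description of the OCLC indexing program.

```python
import re

def words(text):
    """Lowercase word set for a heading or a query."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def fast_index_words(fields, require_fast_code):
    """Index 2nd-indicator-7 fields; optionally only those with $2 fast."""
    indexed = set()
    for second_indicator, heading, source in fields:
        if second_indicator != "7":
            continue  # LCSH fields (2nd indicator 0) never enter the FAST index
        if require_fast_code and source != "fast":
            continue  # skip lctgm, gsafd, nasa, ram, and other vocabularies
        indexed |= words(heading)
    return indexed

# (2nd indicator, heading text without subfield codes, $2 source code)
record = [
    ("0", "Dog trainers Arkansas Blanchard Springs", None),
    ("7", "Animal training Arkansas Blanchard Springs 1950-1960", "lctgm"),
    ("7", "Dogs Arkansas Blanchard Springs 1950-1960", "lctgm"),
    ("7", "Photojournalism Arkansas Little Rock 1950-1960", "lctgm"),
    ("7", "Dog trainers", "fast"),
    ("7", "Arkansas Little Rock", "fast"),
]
query = words("dog training")

print(query <= fast_index_words(record, require_fast_code=False))  # True
print(query <= fast_index_words(record, require_fast_code=True))   # False
```

In the unrestricted case “dog” comes from the FAST heading and “training” from an lctgm heading, so the match depends on a vocabulary that is not FAST at all.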

  19. Some names (personal or corporate) not translated to FAST
      • The program that translated LC 6xx headings to FAST compared names to the “FAST authority file” and validated only those that were matched in the file.
      • 20 records found using the LCSH index were not found using the FAST index due to this “rule”
      • An example showing a result of searching for “technical services” follows:

  20. Search for “technical services”:
      610 20 Kansas Real Estate Commission $x Auditing.
      610 10 Kansas. $b State Board of Technical Professions $x Auditing.
      610 10 Kansas. $b Board of Emergency Medical Services $x Auditing.
      610 27 Kansas Real Estate Commission $2 fast
      610 17 Kansas. $b State Board of Technical Professions $2 fast
      650  7 Auditing $2 fast

  21. Some names (personal or corporate) not translated to FAST (cont.)
      • The corporate name containing “technical” is in the FAST authority file, but not the name containing “services.”
      • A keyword search for “technical services” in the future would find this record through the FAST index as well as the LCSH index.

  22. Differences between LCSH and FAST
      • “Politics and government” as a subdivision in LCSH is changed to “Political science” in FAST
      • “Appropriations and expenditures” as a subdivision in LCSH is changed to “Expenditures, Public” in FAST
      • “Exhibitions” as a subdivision in LCSH is changed to “Catalogs $v Exhibition catalogs” in FAST
      • “Columbia River Watershed” and “Pacific Coast (U.S.)” were translated to FAST with “United States” as a geographic heading
      • “Arabic” is a language element in LCSH and is also coded in the 008 field; this is considered redundant in FAST
      • “Library” as a subdivision in LCSH is changed to “Libraries” in FAST
      • “Study and teaching (Higher)” as a subdivision in LCSH is changed to “Higher education” in FAST

  23. “Politics and government” as a subdivision in LCSH is changed to “Political science” in FAST
      • This change affects any keyword search using any one of the words: politics, government, political, or science
      • 1 record found using the LCSH index was not found using the FAST index, and 27 records found using the FAST index were not found using the LCSH index due to this “rule”
      • Examples showing a result of searching for “government documents” and a result of searching for “religion and science” follow:

  24. Search for “government documents”:
      651  0 Egypt $x Politics and government $y 30 B.C.-640 A.D. $v Sources.
      650  0 Legal documents $z Egypt $x History $v Sources.
      648  7 30 B.C. - 640 A.D. $2 fast
      650  7 Legal documents $2 fast
      650  7 Political science $2 fast
      651  7 Egypt $2 fast
      655  7 History $2 fast
      655  7 Sources $2 fast

  25. Search for “religion and science”:
      650  0 Islam and politics $z Algeria.
      650  0 Religion and politics $z Algeria.
      651  0 Algeria $x Politics and government.
      650  7 Islam and politics $2 fast
      650  7 Political science $2 fast
      650  7 Religion and politics $2 fast
      651  7 Algeria $2 fast

  26. “Appropriations and expenditures” as a subdivision in LCSH is changed to “Expenditures, Public” in FAST
      • This change affects any keyword search using the word “appropriations” or the word “public”
      • 23 records found using the FAST index were not found using the LCSH index due to this “rule”
      • An example is the search for “public service”:

  27. Search for “public service”:
      610 10 United States. $b Dept. of the Air Force $x Appropriations and expenditures.
      610 10 United States. $b Defense Finance and Accounting Service. $b Denver Center $x Auditing.
      610 17 United States. $b Defense Finance and Accounting Service. $b Denver Center $2 fast
      610 17 United States. $b Dept. of the Air Force. $2 fast
      650  7 Auditing $2 fast
      650  7 Expenditures, Public $2 fast

  28. “Exhibitions” as a subdivision in LCSH is changed to “Catalogs $v Exhibition catalogs” in FAST
      • This change affects any keyword search using the words: exhibition, exhibitions, or catalogs
      • 4 records found using the FAST index were not found using the LCSH index due to this “rule”
      • An example is the search for “archives catalogs”:

  29. Search for “archives catalogs”:
      610 10 United States. $b National Archives and Records Administration $x Photograph collections $v Exhibitions.
      650  0 Photography $z United States $x History $y 20th century $v Exhibitions.
      610 17 United States. $b National Archives and Records Administration $2 fast
      648  7 1900 - 1999 $2 fast
      650  7 Photograph collections $2 fast
      650  7 Photography $2 fast
      651  7 United States $2 fast
      655  7 Catalogs $v Exhibition catalogs $2 fast
      655  7 History $2 fast

  30. “Columbia River Watershed” and “Pacific Coast (U.S.)” were translated to FAST with “United States” as a geographic heading
      • This change affects any searches qualified by “United States” spelled out
      • 2 records found using the FAST index were not found using the LCSH index due to this “rule”
      • An example is the search for “endangered species United States”:

  31. Search for “endangered species United States”:
      650  0 Endangered species $z Columbia River Watershed.
      650  0 Logging $x Environmental aspects $z Columbia River Watershed.
      610 20 Plum Creek Timber Company.
      610 27 Plum Creek Timber Company $2 fast
      650  7 Endangered species $2 fast
      650  7 Logging $x Environmental aspects $2 fast
      651  7 United States $z Columbia River Watershed $2 fast

  32. “Arabic” is a language element in LCSH and is also coded in the 008 field – redundant in FAST
      • This change affects any searches using the word “Arabic.”
      • 2 records found using the LCSH index were not found using the FAST index due to this “rule”
      • An example is the search for “arabic books”:

  33. Search for “arabic books”:
      008 990614s1960 ru 000 0 ara d
      500 In Russian and Arabic.
      650  0 Russian language $v Conversation and phrase books $x Arabic.
      650  7 Russian language $2 fast
      655  7 Conversation and phrase books $2 fast

  34. “Library” as a subdivision in LCSH is changed to “Libraries” in FAST
      • This change affects any searches using the word “library” or the word “libraries”
      • 2 records found using the FAST index were not found using the LCSH index due to this “rule”
      • An example is the search for “medical libraries”:

  35. Search for “medical libraries”:
      650  0 Medicine $v Bibliography $v Catalogs.
      610 20 Moody Medical Library $v Catalogs.
      600 10 Blocker, T. G. $q (Truman Graves) $x Library $v Catalogs.
      600 17 Blocker, T. G. $q (Truman Graves) $2 fast
      610 27 Moody Medical Library. $2 fast
      650  7 Libraries $2 fast
      650  7 Medicine $2 fast
      655  7 Bibliography $v Catalogs $2 fast
      655  7 Catalogs $2 fast

  36. “Study and teaching (Higher)” as a subdivision in LCSH is changed to “Higher education” in FAST
      • This change affects any searches using the words: study, teaching, education
      • 1 record found using the FAST index was not found using the LCSH index due to this “rule”
      • An example is the search for “education policy”:

  37. Search for “education policy”:
      650  0 Arctic regions $x Research $x Government policy $z Canada.
      650  0 Research $z Arctic regions.
      651  0 Arctic regions $x Study and teaching (Higher) $z Canada.
      650  7 Education, Higher $2 fast
      650  7 Research $2 fast
      650  7 Research $x Government policy $2 fast
      651  7 Arctic regions $2 fast
      651  7 Canada $2 fast

  38. Conclusions
      • A total of 62 records were affected by real differences between LCSH and FAST – about 3%
      • The real differences affected 9 of the 76 searches – about 12% (but only 62 of the records in those 9 searches were affected – 472 records in the 9 searches were the same in both indexes)
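
The per-reason counts reported on slides 10 through 36 add up to the totals on slides 8 and 38, as the following Python tally shows. The dictionary labels are shorthand invented for this sketch. The slides do not say which denominator lies behind “about 3%”, so the two percentages computed for it are assumptions offered for comparison only.

```python
# (found only via LCSH, found only via FAST) for each reason on the slides
artifacts = {
    "invalid or unestablished LCSH not translated": (117, 0),
    "$x/$v in 600/610 not indexed for LCSH": (0, 72),
    "non-FAST $2 vocabularies indexed as FAST": (0, 40),
    "names not translated to FAST": (20, 0),
}
real_differences = {
    "Politics and government": (1, 27),
    "Appropriations and expenditures": (0, 23),
    "Exhibitions": (0, 4),
    "United States added as geographic heading": (0, 2),
    "Arabic language element": (2, 0),
    "Library -> Libraries": (0, 2),
    "Study and teaching (Higher)": (0, 1),
}

def totals(reasons):
    """Sum the LCSH-only and FAST-only counts over a group of reasons."""
    lcsh_only = sum(pair[0] for pair in reasons.values())
    fast_only = sum(pair[1] for pair in reasons.values())
    return lcsh_only, fast_only

real_lcsh, real_fast = totals(real_differences)
art_lcsh, art_fast = totals(artifacts)

print(real_lcsh + real_fast)                       # 62
print(art_lcsh + real_lcsh, art_fast + real_fast)  # 140 171
print(f"{9 / 76:.0%}")                             # 12%
# Assumed denominators for "about 3%": FAST index hits, or all distinct records
print(f"{62 / 2371:.1%}", f"{62 / 2511:.1%}")      # 2.6% 2.5%
```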

  39. Thank you!
      Arlene G. Taylor
      ataylor@mail.sis.pitt.edu
