
Analysis of Social Tagging and Book Cataloging: A Case Study

HKLA 50th Anniversary Conference Hong Kong, 5 November 2008. Analysis of Social Tagging and Book Cataloging: A Case Study. Yi-Chen Chen 陳怡蓁 Dept. of Library & Information Science National Taiwan University. Outline. Introduction Background + Related Work Research Questions


Presentation Transcript


  1. HKLA 50th Anniversary Conference Hong Kong, 5 November 2008 Analysis of Social Tagging and Book Cataloging: A Case Study Yi-Chen Chen 陳怡蓁 Dept. of Library & Information Science National Taiwan University

  2. Outline • Introduction • Background + Related Work • Research Questions • Data and Methodology • Results • Conclusions and Future Directions

  3. Background • the concept of social tagging has grown in popularity among web-based services • it is quite different from controlled-vocabulary indexing or authority-based cataloging (Mathes, 2004; Guy & Tonkin, 2006) • the emergence of social tagging has begun to challenge traditional ways of organizing information

  4. Related Work… • the difference between social tagging and traditional cataloging/indexing has been noted (Tennis, 2006), but very few studies have been conducted to verify it • little research has examined how social tagging is applied to books, since books are already catalogued with library subject headings

  5. In this study…

  6. Objective & purposes • discover the properties and functions of social tags attached to books • compare user-created tags with authoritative subject headings

  7. Case study… • A case study on LibraryThing’s tagging system • “an online service to help people catalog their books easily” • allows its members to add tags for their personal book collections

  8. Research Questions • How can tags be organized or classified into different function types? • What kinds of function tags are most often used? • How are tagging terms similar to or different from library subject headings?

  9. Two parts of our study • Part 1 ➢ investigate the functions of tags and derive a classification based on the types of functions ➢ explore which tag classes are more popular among users • Part 2 ➢ compare social tags with the LCSH assigned to the same works and examine their overlaps and variations

  10. Part 1: Analysis of classifying tags and usage frequency

  11. Data collection • LibraryThing (http://www.librarything.com) was chosen as our platform for data collection and analysis • The data used for this study were gathered from LibraryThing in July 2008.

  12. Data collection • sample of books including both fiction and non-fiction works • randomly selected from the “Most often tagged fiction” and “Most often tagged non-fiction” booklist in LibraryThing • two criteria: (1) English books; (2) the corresponding catalog records should include LCSH.

  13. Data collection • total number of works = 50 • 25 fiction + 25 non-fiction

  14. Part 1: Methodology • extract the tagging data (tag cloud and tag frequency) from these 50 works • only use the main-page tag cloud for analysis • the tag cloud that appears on the main page for each work includes only the most frequently used tags

  15. ➢ the tag frequency data were gathered from the tag cloud of each work, indicating how many times each tag was used for that particular work

  16. In total, there are 2,249 tags associated with the selected works, and 45 tags per work on average

  17. Part 1: Results (Ⅰ) • For these 2,249 tags, we analyzed their function types and classified them. • A classification framework for tags was created

  18. 1. Bibliographic description ➢ describe physical attributes of the work and can give factual information about the book • Genre/Form (e.g. “science fiction”) • Author • Country of origin • Edition (e.g. “first edition”) • Variant formats (e.g. “film”) • Audience (e.g. “kids”)

  19. 2. Subject-related ➢ tags that are intended to reflect what a work is about and deal with the content of the resource • abstract and concrete concepts, things or objects, subject areas, character names, settings or places, timeframe of the story, themes of the document, topics and the like

  20. 3. Personal reference ➢ act as reminders to oneself based on one’s personal context • Ownership (e.g. “borrowed”) • Reading Progress (e.g. “unread”, “tbr”) • Time (e.g. “2007”) • Task (e.g. “@work”, “textbook”) • Location (e.g. “bookshelf”)

  21. 4. Opinion • users can subjectively express their feelings and opinions about the resource with tags • reveal the reader’s value judgments and emotional reaction to a particular book (e.g. “favorite”, “interesting”)

  22. 5. Awards/Top list • a specific award or prize name (e.g. “Pulitzer prize”, “Nobel prize”) • the top book list (e.g. “1001books”)

  23. 6. Community • users apply such tags to the books that they wish to share or discuss with others • convey the community meaning of the resource (e.g. “book club”)
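
The six classes above can be written down as a small data structure. The sketch below is only an illustration: the `TagClass` and `EXAMPLE_TAGS` names and the `classify` helper are hypothetical, the lookup contains only the example tags quoted on the slides, and the slides do not say how the 2,249 tags were actually assigned to classes.

```python
from enum import Enum


class TagClass(Enum):
    """The six function classes named in the classification framework."""
    BIBLIOGRAPHIC = "Bibliographic description"
    SUBJECT = "Subject-related"
    PERSONAL = "Personal reference"
    OPINION = "Opinion"
    AWARDS = "Awards/Top list"
    COMMUNITY = "Community"


# Hypothetical lookup built only from the example tags quoted on the slides.
EXAMPLE_TAGS = {
    "science fiction": TagClass.BIBLIOGRAPHIC,
    "first edition": TagClass.BIBLIOGRAPHIC,
    "film": TagClass.BIBLIOGRAPHIC,
    "kids": TagClass.BIBLIOGRAPHIC,
    "borrowed": TagClass.PERSONAL,
    "unread": TagClass.PERSONAL,
    "tbr": TagClass.PERSONAL,
    "2007": TagClass.PERSONAL,
    "@work": TagClass.PERSONAL,
    "textbook": TagClass.PERSONAL,
    "bookshelf": TagClass.PERSONAL,
    "favorite": TagClass.OPINION,
    "interesting": TagClass.OPINION,
    "pulitzer prize": TagClass.AWARDS,
    "nobel prize": TagClass.AWARDS,
    "1001books": TagClass.AWARDS,
    "book club": TagClass.COMMUNITY,
}


def classify(tag):
    """Return the class of a known example tag, or None if it needs manual coding."""
    return EXAMPLE_TAGS.get(tag.strip().lower())
```

Tags outside the lookup return `None`, which reflects that Subject-related tags in particular are open-ended and cannot be enumerated in advance.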

  24. Part 1: Results • Distribution of number of tags • Distribution of tag frequency

  25. Part 1: Results • Distribution of number of tags • Distribution of tag frequency

  26. Number of tags (all works)

  27. Bibliographic description Genre/Form

  28. Personal reference Reading Progress

  29. Number of tags (fiction)

  30. Number of tags (non-fiction)

  31. Part 1: Results Distribution of number of tags Distribution of tag frequency

  32. Part 1: Results • Distribution of number of tags • Distribution of tag frequency • investigate if certain tag classes are used more frequently than others

  33. Tag frequency (all works)

  34. Tag frequency (fiction)

  35. Tag frequency (non-fiction)

  36. Part 1: Results & Findings • users are more likely to distinguish fiction works by their genre or form, and to distinguish non-fiction works by their subject

  37. Number of tags vs. tag frequency? • Bibliographic description vs. Subject-related • Although Subject-related has the largest number of tags among all the works, its tag usage frequency is not as high as that of Bibliographic description.

  38. Part 1: Results & Findings • number of tags vs. tag usage frequency? • subject matter can be divergent and expressed in a variety of words, so the usage frequency of each subject tag is lower • descriptions of bibliographic data often share common usage, especially for genre/form, resulting in clear convergence on the tagging terms
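
The two measures contrasted above can be made concrete with a short sketch. It assumes each record is a hypothetical `(work, tag, tag_class, frequency)` tuple taken from a work's tag cloud; the `summarize` helper and the sample numbers are made up for illustration and are not the study's data.

```python
from collections import defaultdict


def summarize(records):
    """Return {tag_class: (number_of_tags, total_frequency)}.

    number_of_tags counts tag entries in the tag clouds;
    total_frequency sums how many times those tags were applied.
    """
    counts = defaultdict(int)
    totals = defaultdict(int)
    for _work, _tag, tag_class, frequency in records:
        counts[tag_class] += 1
        totals[tag_class] += frequency
    return {cls: (counts[cls], totals[cls]) for cls in counts}


# Made-up sample: one convergent genre tag, three divergent subject tags.
sample = [
    ("work A", "fiction", "Bibliographic description", 900),
    ("work A", "war", "Subject-related", 40),
    ("work A", "love", "Subject-related", 35),
    ("work A", "family", "Subject-related", 25),
]
summary = summarize(sample)
```

In this made-up sample, Subject-related has more tags (3 against 1) but a far lower total frequency (100 against 900), which is the pattern the slide describes.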

  39. Part 2: Comparison of social tags and LCSH terms

  40. Part 2: Methodology • Dataset • the Bibliographic description tags and Subject-related tags (from Part 1) • the subject headings data were extracted from the LCSH terms assigned to each selected work in the Library of Congress Online Catalog (http://catalog.loc.gov/webvoy.htm)

  41. Part 2: Methodology • a subject heading string may combine a main heading with dash-separated subdivisions in complicated ways (e.g. Japan --History --20th century --Fiction) • we separated each subject heading string into its individual concept terms and excluded duplicate terms (e.g. Japan. History. 20th century. Fiction.)
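
The splitting step described above could be sketched as follows. The function names `split_heading` and `concept_terms` are hypothetical, the second heading in the usage lines is an invented example, and treating terms that differ only in case as duplicates is an assumption; the slides do not say how the separation was carried out in practice.

```python
import re


def split_heading(heading):
    """Split one LCSH string on its '--' subdivisions into concept terms."""
    parts = re.split(r"\s*--\s*", heading)
    # Drop surrounding spaces and the trailing period some records carry.
    return [p.strip(" .") for p in parts if p.strip(" .")]


def concept_terms(headings):
    """Collect the concept terms of all headings for one work, without duplicates."""
    seen = set()
    terms = []
    for heading in headings:
        for term in split_heading(heading):
            key = term.casefold()  # assumption: duplicates are case-insensitive
            if key not in seen:
                seen.add(key)
                terms.append(term)
    return terms


terms = concept_terms([
    "Japan --History --20th century --Fiction",
    "Geishas --Fiction.",  # invented second heading for illustration
])
```

Here `terms` keeps "Fiction" only once, even though it appears as a subdivision in both headings.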

  42. Part 2: Preliminary results • 1,759 tags (Bibliographic description and Subject-related tags); 35.2 tags per work • 313 LCSH terms; 6.3 LCSH terms per work

  43. Rules of comparison (1) tags and LCSH terms associated with the same work are compared term by term. (2) an overlap is identified by an exact or almost exact match in spelling, allowing for plural/singular forms and case variations. (3) abbreviations and acronyms are considered the same as the full form of the term. (4) prepositions, punctuation marks, and symbols are ignored.
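
One way to approximate these four rules in code is to normalize both sides before comparing. This is a rough sketch under stated assumptions: the `PREPOSITIONS` set, the `ABBREVIATIONS` table, and the crude strip-the-final-"s" singular rule are all hypothetical stand-ins, and nothing on the slides indicates that the comparison was automated.

```python
import re

# Hypothetical word lists; the slides do not enumerate either one.
PREPOSITIONS = {"of", "in", "on", "for", "to", "by", "with", "from", "at"}
ABBREVIATIONS = {"wwii": "world war ii", "ww2": "world war ii"}


def _singular(word):
    """Crude plural handling (rule 2): drop a final 's' from longer words."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(term):
    """Reduce a tag or LCSH term to a comparable form."""
    # Rule 2 (case) and rule 4 (punctuation and symbols).
    text = re.sub(r"[^\w\s]", " ", term.casefold())
    text = " ".join(text.split())
    # Rule 3: expand known abbreviations to their full form.
    text = ABBREVIATIONS.get(text, text)
    # Rule 4: ignore prepositions.
    words = [w for w in text.split() if w not in PREPOSITIONS]
    return " ".join(_singular(w) for w in words)


def overlap_share(tags, lcsh_terms):
    """Rule 1: share of one work's tags that match one of its LCSH terms."""
    if not tags:
        return 0.0
    headings = {normalize(t) for t in lcsh_terms}
    hits = [t for t in tags if normalize(t) in headings]
    return len(hits) / len(tags)
```

For example, `overlap_share(["japan", "Geisha", "favorite", "historical fiction"], ["Japan", "Geishas", "Fiction"])` counts two of the four tags as overlapping. A real comparison would need a proper stemmer and a curated abbreviation list, since the crude singular rule can merge unrelated words.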

  44. Overlaps between tags and LCSH • Overlapped tags • Tags not covered in SH (non-overlap)

  45. All works: 10.8% (overlap)

  46. Fiction…: 10.2% (overlap)
