Semantify del.icio.us: automatically turn your tags into senses

Semantify del.icio.us: automatically turn your tags into senses. Maurizio Tesconi, Francesco Ronzano, Andrea Marchetti, Salvatore Minutoli. Institute for Informatics and Telematics, National Research Council (C.N.R.), Pisa, Italy. Social Data on the Web @ ISWC2008, October 27, 2008.

Presentation Transcript


  1. Semantify del.icio.us: automatically turn your tags into senses. Maurizio Tesconi, Francesco Ronzano, Andrea Marchetti, Salvatore Minutoli. Institute for Informatics and Telematics, National Research Council (C.N.R.), Pisa, Italy. Social Data on the Web @ ISWC2008, October 27, 2008

  2. Overview
     • Collaborative tagging: a huge amount of (messy) social data
     • Sense-based tagging: from tags to senses
     • Automatically semantifying user tags: the Tag Disambiguation Algorithm
       • an example of application to delicious tags
       • Delicious TDA evaluation
     • Advantages of sense-based tagging
       • sense-based grouping of tags
       • organizing user resources exploiting external classifications
       • linking tagging systems to Linked Data datasets
     • Future work

  3. Collaborative tagging: a huge amount of (messy) social data
     Collaborative tagging activity generates a considerable amount of valuable social data by describing Web resources through tags (keywords).
     Delicious has:
     • more than 5 million users
     • more than 150 million bookmarked URLs
     • about 140,000 new posts each month
     Tag-based classifications of resources are often weakened by the poor structure and organization of the collected data.

  4. Collaborative tagging: a huge amount of (messy) social data
     Problems that affect tags, with the examples shown on the slide:
     • polysemy: ajax
     • different levels of precision: European, Greek
     • distinct lexical forms: hero, heroes
     • synonymy: hero, fighter
     • typos or alternate spellings: colour, color

  5. Sense-based tagging: from tags to senses
     The generally agreed causes of the lack of consistency of tag-based classifications are:
     • the complete freedom users have in choosing tags
     • the lack of any support for semantic information
     Diagram: Web resources, tags, senses, collaborative tagging systems.
     Use senses instead of tags to characterize Web resources.

  6. Sense-based tagging: from tags to senses
     Example resources: http://script.aculo.us/, http://jquery.com/, http://extjs.com/
     Tags and the senses they stand for:
     • web2.0: a perceived second generation of web-based communities and hosted services
     • framework: a reusable design for a software system (or subsystem)
     • library: a collection of subprograms used to develop software
     • javascript, js: a scripting language most often used for client-side web development

  7. Sense-based tagging: from tags to senses
     Same tags and senses as on the previous slide (web2.0, framework, library, javascript, js).
     We need a shared semantic reference, built and structured ad hoc, that:
     • has general-domain coverage
     • enjoys extensive agreement
     • is constantly updated and enriched
     • is rich in useful relations to organize and retrieve information

  8. Sense-based tagging: from tags to senses
     Tagpedia (http://www.tagpedia.org/)
     • A semantic organization and classification of tags, grouping them by meaning
     • Structured to support semantic sense-based tagging
     • Based on the model of term-concept networks
     • Populated by mining Wikipedia
     Example sense: "a motor vehicle with four wheels; usually propelled by an internal combustion engine" (http://en.wikipedia.org/wiki/Car), with the tags car, machine, auto, cars.
     Tagpedia is made up of:
     • 1,927,378 senses
     • 4,237,740 tags
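
A term-concept network of this kind can be pictured as a many-to-many mapping between tags and senses. The Python sketch below is illustrative only: the `Sense` class, the `TermConceptNetwork` class and its methods are hypothetical names, not Tagpedia's actual data model or API, and case-insensitive tag matching is an assumption. The sample entry reuses the "car" example from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One meaning, identified by the Wikipedia page that describes it."""
    wikipedia_url: str
    gloss: str


class TermConceptNetwork:
    """Hypothetical in-memory term-concept network: tags <-> senses."""

    def __init__(self):
        self._senses_by_tag = defaultdict(set)
        self._tags_by_sense = defaultdict(set)

    def add(self, tag, sense):
        # Assumption: tags are matched case-insensitively.
        key = tag.lower()
        self._senses_by_tag[key].add(sense)
        self._tags_by_sense[sense].add(key)

    def senses_of(self, tag):
        """All candidate meanings of a tag (several if the tag is polysemous)."""
        return set(self._senses_by_tag.get(tag.lower(), set()))

    def tags_of(self, sense):
        """All tags that can express a sense (its synonyms and variants)."""
        return set(self._tags_by_sense.get(sense, set()))


car = Sense(
    "http://en.wikipedia.org/wiki/Car",
    "a motor vehicle with four wheels; usually propelled by an "
    "internal combustion engine",
)
network = TermConceptNetwork()
for tag in ("car", "machine", "auto", "cars"):
    network.add(tag, car)

print(sorted(network.tags_of(car)))
print(network.senses_of("Auto") == {car})
```

Keeping both directions of the mapping makes the two lookups used in sense-based tagging cheap: from a tag to its candidate senses, and from a sense to every tag that expresses it.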

  9. Automatically semantifying user tags: Tag Disambiguation Algorithm
     "In order to ease the adoption of sense-based tagging, it is essential not to explode the amount of user interaction needed to deal with it", so...
     Tag Disambiguation Algorithm (TDA): an algorithm that points out the right meaning of each of the tags of a user of a tagging system, exploiting Tagpedia and Wikipedia.
     We assume that:
     • the more similar the meaning of a tag t described by a text W is to the one intended by the user U, the higher the number of occurrences in W of tags related to t
     • the meaning intended by the user U for the tag t doesn't change while tagging

  10. Automatically semantifying user tags: Tag Disambiguation Algorithm
      Starting from the user tagging profile, the TDA calculates the sense-rank for each meaning of a tag. The higher the sense-rank of a meaning of a user's tag, the better that meaning defines the sense intended by the user for that tag.
      Diagram: the tagging user profile, taken from the tagging system, is processed by the Tag Disambiguation Algorithm with the support of Tagpedia and Wikipedia; the output is the sense-rank calculation for each meaning of each tag.

  11. Automatically semantifying user tags: Tag Disambiguation Algorithm
      Figure: example for the resources http://script.aculo.us/ and http://jquery.com/
      • User tags: software, library, framework, javascript
      • Popular tags: javascript, library, programming, web, javascript, ajax
      • Counts shown with the popular tags: 14879, 5980, 5163, 6191, 19435, 18360
      • Related tags: javascript, framework, ajax, software, web, javascript, programming

  12. Automatically semantifying user tags: Tag Disambiguation Algorithm
      Sense-rank of each meaning of the tag "library" (definitions and descriptive texts from Tagpedia and Wikipedia):
      • Sense-rank 0.4171. A collection of information sources, resources and services, organized for use, and maintained by a public body. Text: "A library is a collection of information, sources, resources and services, organized for use, and maintained by a public body, an institution, or a private individual. In the more traditional sense, it means a collection of books. This collection and services are used by people who choose not to, or cannot afford to ..."
      • Sense-rank 1. A collection of subprograms used to develop software. Text: "In computer science, a library is a collection of subprograms used to develop software. Libraries contain helper code and data, which provide services to independent programs. This allows code and data to be shared and changed in a modular fashion. Some executables are both standalone programs and libraries ..."
      • Sense-rank 0.0018. A collection of cells, macros or functional units that perform common operations and are used to build more complex logic blocks. Text: "In electronic design, library often refers to a collection of cells, macros or functional units that perform common operations and are used to build more complex logic blocks. A standard cell library is a collection of low level logic functions such as AND, OR, INVERT, flip-flops, latches and buffers. These cells are ..."
      • Sense-rank 0. A collection of molecules in a stable form that represents some aspect of an organism. Text: "In molecular biology, a library is a collection of molecules in a stable form that represents some aspect of an organism. Two common types of libraries are cDNA libraries (formed from Complementary DNA) and genomic libraries. The nucleotide sequences of interest are preserved as inserts to a plasmid ..."
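
The slides do not give the sense-rank formula. As a minimal sketch of the first TDA assumption (tags related to t occur more often in the text that describes the intended meaning), the Python below counts occurrences of related tags in each candidate text and scales the counts so that the best meaning gets rank 1. The function name `sense_ranks`, the tokenisation, the scaling by the maximum and the shortened sample texts are all assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import Counter


def sense_ranks(candidate_texts, related_tags):
    """Rank the candidate meanings of one tag.

    candidate_texts: {sense id: text describing that meaning}
    related_tags: tags that accompany the ambiguous tag in the user profile
    Returns {sense id: rank in [0, 1]}, scaled so the best meaning gets 1.
    """
    related = {tag.lower() for tag in related_tags}
    raw = {}
    for sense_id, text in candidate_texts.items():
        # Assumption: plain word tokens, compared case-insensitively.
        words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        raw[sense_id] = sum(words[tag] for tag in related)
    best = max(raw.values(), default=0)
    if best == 0:
        return {sense_id: 0.0 for sense_id in raw}
    return {sense_id: count / best for sense_id, count in raw.items()}


# Shortened texts for three meanings of "library" (toy input).
texts = {
    "library (institution)": "A collection of information sources, resources "
                             "and services, organized for use, and maintained "
                             "by a public body.",
    "library (computing)": "A collection of subprograms used to develop "
                           "software. Libraries contain helper code and data, "
                           "which provide services to independent programs.",
    "library (biology)": "A collection of molecules in a stable form that "
                         "represents some aspect of an organism.",
}
related = ["software", "framework", "javascript", "ajax", "web", "programming"]

ranks = sense_ranks(texts, related)
for sense_id, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{rank:.4f}  {sense_id}")
```

With full Wikipedia pages instead of these one-line texts, the counts would be less sparse and the ranks would spread over the whole [0, 1] range rather than collapsing to 0 and 1.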

  13. Automatically semantifying user tags: Tag Disambiguation Algorithm
      Delicious TDA evaluation
      • 9 users, from occasional to very active ones
      • 3926 posts (bookmarked URLs)
      • 3520 tags (on average 3.38 tags/post)
      Applying the TDA, we found that:
      • 89.74% of the tags (3159) have been associated with a meaning
      • 2884 distinct meanings have been identified
      • among 2589 polysemous tags, the TDA chose the correct meaning for 89.15% (2308) of them (human review)
      • 91.52% (2891) of the disambiguated tags (3159) have been associated with the correct meaning (human review)
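
The percentages on this slide follow directly from the counts it reports. The short Python check below recomputes them; the variable and function names are chosen here for readability and do not come from the slides.

```python
def percentage(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


total_tags = 3520              # tags in the evaluated profiles
disambiguated = 3159           # tags associated with a meaning
polysemous = 2589              # tags with more than one candidate meaning
polysemous_correct = 2308      # polysemous tags given the correct meaning
disambiguated_correct = 2891   # disambiguated tags given the correct meaning

coverage = percentage(disambiguated, total_tags)
polysemous_accuracy = percentage(polysemous_correct, polysemous)
overall_accuracy = percentage(disambiguated_correct, disambiguated)

print(coverage, polysemous_accuracy, overall_accuracy)
```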

  14. Advantages of sense-based tagging
      Sense-based grouping of tags
      On average, 9% of the tags chosen by a user to describe a resource have a meaning already referred to by other tags. Examples:
      • "Carpooling is the shared use of a car": rideshare, carpool, carpooling, ridesharing
      • "Tendency to provoke laughter and provide amusement": funny, humor, humour
      • "A term that encompasses individual motion pictures": movie, film, films
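
Once every tag has been mapped to a sense, grouping is a simple inversion of that mapping. The Python sketch below uses the tags from the slide; the sense identifiers ("carpooling", "humour", "film") and both function names are placeholders introduced here, and `redundant_share` is only one possible reading of "a meaning already referred to by other tags".

```python
from collections import defaultdict


def group_by_sense(tag_to_sense):
    """Group tags that were disambiguated to the same sense."""
    groups = defaultdict(list)
    for tag, sense in tag_to_sense.items():
        groups[sense].append(tag)
    return {sense: sorted(tags) for sense, tags in groups.items()}


def redundant_share(resource_tags, tag_to_sense):
    """Fraction of a resource's tags whose sense is already covered.

    Assumes the tags are distinct and all present in `tag_to_sense`.
    """
    if not resource_tags:
        return 0.0
    distinct_senses = {tag_to_sense[tag] for tag in resource_tags}
    return (len(resource_tags) - len(distinct_senses)) / len(resource_tags)


# Placeholder sense identifiers; the tags come from the slide.
TAG_TO_SENSE = {
    "rideshare": "carpooling", "carpool": "carpooling",
    "carpooling": "carpooling", "ridesharing": "carpooling",
    "funny": "humour", "humor": "humour", "humour": "humour",
    "movie": "film", "film": "film", "films": "film",
}

groups = group_by_sense(TAG_TO_SENSE)
for sense, tags in groups.items():
    print(sense, tags)

# A resource tagged "funny", "humor" and "movie" uses three tags for two senses.
print(redundant_share(["funny", "humor", "movie"], TAG_TO_SENSE))
```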

  15. Advantages of sense-based tagging
      Organizing user resources exploiting external classifications
      Since each meaning identified by the TDA is described by a Wikipedia page, we have examined the adequacy of three hierarchical classifications built on top of Wikipedia to organize and structure tagged resources.
      Figure: the tags web2.0, framework, library, javascript and js, each linked to the Wikipedia page that describes its sense (page labels shown: wiki/Web_2.0, wiki/Car, wiki/Javascript, wiki/Framework).

  16. Advantages of sense-based tagging
      Organizing user resources exploiting external classifications
      • Wikipedia Categories:
        • 312 thousand categories, connected by subsumption relations
        • almost all Wikipedia pages are placed in at least one category
      • YAGO Classes:
        • 225 thousand classes, connected by subsumption relations
        • 1.412 million (74%) of Wikipedia pages are mapped to at least one class
      • WordNet Synsets:
        • 124 synsets, connected by hyponymy relations
        • about 450 thousand Wikipedia pages are mapped to at least one synset

  17. Advantages of sense-based tagging
      Organizing user resources exploiting external classifications
      Charts: senses coverage; tagged Web resources coverage.

  18. Advantages of sense-based tagging
      Organizing user resources exploiting external classifications
      Reduction of the total number of YAGO classes needed to classify user resources when moving up the hierarchy of ancestors:
      • Direct YAGO classes: 162
      • I level of ancestors: 132
      • II level of ancestors: 102
      • III level of ancestors: 69
      • IV level of ancestors: 42
      • V level of ancestors: 29
      • VI level of ancestors: 19
      • VII level of ancestors: 12
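
The reduction shown here comes from replacing each class by an ancestor a fixed number of levels up and counting the distinct classes that remain. The Python sketch below illustrates the idea on a made-up hierarchy; the class names are invented and are not real YAGO classes, and a single parent per class is a simplifying assumption (a real subsumption hierarchy may give a class several parents).

```python
def ancestor_at(cls, parent_of, levels):
    """Climb `levels` steps up the hierarchy, stopping early at a root."""
    for _ in range(levels):
        parent = parent_of.get(cls)
        if parent is None:
            break
        cls = parent
    return cls


def classes_needed(direct_classes, parent_of, levels):
    """Number of distinct classes left after rolling up by `levels`."""
    return len({ancestor_at(cls, parent_of, levels) for cls in direct_classes})


# Invented toy hierarchy: child class -> parent class.
PARENT_OF = {
    "js_library": "software_library",
    "css_framework": "software_framework",
    "software_library": "software",
    "software_framework": "software",
    "recipe_site": "website",
    "news_site": "website",
}
DIRECT = ["js_library", "css_framework", "recipe_site", "news_site"]

for levels in range(3):
    print(levels, classes_needed(DIRECT, PARENT_OF, levels))
```

The trade-off is the usual one for hierarchical classifications: fewer, broader classes are easier to browse but say less about each resource.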

  19. Advantages of sense-based tagging
      Linking tagging systems / delicious to Linked Data datasets
      Example: the resource http://script.aculo.us/ is tagged on delicious with "library", whose sense is "A collection of subprograms used to develop software." This sense corresponds to http://en.wikipedia.org/wiki/Library_(computing) and to http://dbpedia.org/resource/Library_(computing).
      We automatically and tightly link:
      • the socially produced data coming from tagging systems
      • the Semantic Web's DBpedia / Linked Data datasets
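
In the example on this slide, the Wikipedia page URL and the DBpedia resource URI share the same final segment, so the link can be derived by swapping the prefix. The Python sketch below does exactly that for the slide's example; `dbpedia_uri` and `link_bookmark` are hypothetical helper names, the dictionary layout is an assumption, and the sketch does not handle redirects or percent-encoded titles.

```python
WIKIPEDIA_PREFIX = "http://en.wikipedia.org/wiki/"
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_url):
    """Map an English Wikipedia page URL to the same-named DBpedia URI."""
    if not wikipedia_url.startswith(WIKIPEDIA_PREFIX):
        raise ValueError(f"not an English Wikipedia page URL: {wikipedia_url}")
    return DBPEDIA_PREFIX + wikipedia_url[len(WIKIPEDIA_PREFIX):]


def link_bookmark(bookmark_url, tag, sense_wikipedia_url):
    """Describe one sense-tagged bookmark as a plain dictionary."""
    return {
        "resource": bookmark_url,
        "tag": tag,
        "sense": dbpedia_uri(sense_wikipedia_url),
    }


link = link_bookmark(
    "http://script.aculo.us/",
    "library",
    "http://en.wikipedia.org/wiki/Library_(computing)",
)
print(link["sense"])
```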

  20. Future work
      • Improvement of the Tag Disambiguation Algorithm
      • Investigation of new ways to organize and represent users' sense-based tagged resources on the basis of external classification schemas (UMBEL)
      • Better tuning of the representation of, and access to, sense-based tagging user profiles as RDF triples
      • Development of a real service that allows users to semantify their tagging profile and experience all the advantages of this new way of tagging

  21. Thanks for your attention! Any questions?
      Semantify del.icio.us: automatically turn your tags into senses. Maurizio Tesconi, Francesco Ronzano, Andrea Marchetti, Salvatore Minutoli. Institute for Informatics and Telematics, National Research Council (C.N.R.), Pisa, Italy. Social Data on the Web @ ISWC2008, October 27, 2008.
      This work is funded by the 7th Framework Project KYOTO.

  22. ADDITIONAL SLIDES
      From: Tagpedia: a semantic reference to describe and search for Web resources. Francesco Ronzano, Maurizio Tesconi, Andrea Marchetti, Salvatore Minutoli. Institute for Informatics and Telematics, National Research Council (C.N.R.), Pisa, Italy.

  23. Building Tagpedia
      Tagpedia has been mainly populated by mining Wikipedia, to create an initially rich collection of syntag sets. We based our mining on Wikipedia's internal page organization.
      Three kinds of pages:
      • Article pages
      • Redirect pages
      • Disambiguation pages
      Example article text: "Ajax or Aias (ancient Greek) was a mythological Greek hero, the son of Telamon and Periboea and king of Salamis…"
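
The slides name the three page types but do not spell out how each one is used. One plausible reading, sketched below in Python, is that every article page yields a sense, redirect pages contribute alternative tags for their target article, and disambiguation pages attach one ambiguous term to several articles. This reading, the function `build_syntag_sets` and the toy input are assumptions made for illustration, not the authors' documented procedure.

```python
def build_syntag_sets(articles, redirects, disambiguations):
    """Build {article title: set of tags} from three kinds of pages.

    articles: iterable of article titles (each becomes one sense)
    redirects: {redirect title: target article title}
    disambiguations: {ambiguous term: [article titles it may refer to]}
    """
    # Each article starts with its own (lower-cased) title as a tag.
    syntag_sets = {title: {title.lower()} for title in articles}
    # A redirect adds an alternative tag to its target article.
    for alias, target in redirects.items():
        if target in syntag_sets:
            syntag_sets[target].add(alias.lower())
    # A disambiguation page adds the same tag to several articles.
    for term, targets in disambiguations.items():
        for target in targets:
            if target in syntag_sets:
                syntag_sets[target].add(term.lower())
    return syntag_sets


# Toy input built around the "Ajax" example; not real mining output.
syntag_sets = build_syntag_sets(
    articles=["Ajax (mythology)", "Ajax (programming)"],
    redirects={"Aias": "Ajax (mythology)", "AJAX": "Ajax (programming)"},
    disambiguations={"Ajax": ["Ajax (mythology)", "Ajax (programming)"]},
)
for title, tags in syntag_sets.items():
    print(title, sorted(tags))
```

In this toy run the tag "ajax" ends up in two syntag sets, which is how a polysemous tag would appear in the resulting collection.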

  26. Building Tagpedia: some statistics
      Tagpedia is made up of:
      • 1,927,378 syntag sets
      • 4,237,740 tags
      Chart: sources from which the tags have been generated.

  27. Building Tagpedia: some statistics
      Tagpedia is made up of:
      • 1,927,378 syntag sets
      • 4,237,740 tags
      Chart: distribution of syntag sets by the number of tags they contain.

  28. Tagpedia: semantic characterization and search patterns
      Diagram: syntag sets and tags linked to the resources http://www.myresource1.com, http://www.myresource2.com, http://www.myresource3.com and http://www.myresource4.com.

  29. Tagpedia: semantic characterization and search patterns
      Diagram: syntag sets and tags linked to the resource http://www.myresource1.com.
