Semantic Infrastructure for KM 2.0: A new approach to folksonomies and other knowledge representations. Tom Reamy, Chief Knowledge Architect, KAPS Group, Knowledge Architecture Professional Services, http://www.kapsgroup.com
2.0 Themes • “Tags are great because you throw caution to the wind, forget about whittling down everything into a distinct set of categories and instead let folks loose categorizing their own stuff on their own terms.” - Matt Haughey - MetaFilter • “It’s MySpace meets YouTube meets Wikipedia meets Google – on steroids.” • “It’s ignorance meets egotism meets bad taste meets mob rule – on steroids.” – The Cult of the Amateur – Andrew Keen • “Things fall apart; the center cannot hold; / Mere anarchy is loosed upon the world, … / The best lack all conviction, while the worst / Are full of passionate intensity.” - The Second Coming – W.B. Yeats
Agenda • Introduction • Essentials of Folksonomies • Advantages, Disadvantages, and Dangers of Folksonomies • Improving the Quality of Folksonomies • Facets and Flickr • Del.icio.us – Topics, Popularity and Findability • Semantic Infrastructure Solution • Elements of Semantic Infrastructure / KM 2.0 • Evolving Folksonomies • Ontologies and Natural Categories • Conclusion
KAPS Group • Knowledge Architecture Professional Services (KAPS) • Consulting, strategy recommendations • Knowledge architecture audits • Partners – Convera, Inxight, FAST, and others • Taxonomies: Enterprise, Marketing, Insurance, etc. • Taxonomy customization, ontology development • Intellectual infrastructure for organizations • Knowledge organization, technology, people and processes • Search, content management, portals, collaboration, knowledge management, e-learning, etc.
Essentials of Folksonomies? • Wikipedia: A folksonomy is an Internet-based information retrieval methodology consisting of collaboratively generated, open-ended labels that categorize content such as Web pages, online photographs, and Web links. • A folksonomy is most notably contrasted with a taxonomy – done by users, not professionals • Example sites – Del.icio.us and Flickr (not really – no feedback) • It is just metadata that users add • Key – social mechanism for seeing other tags
Advantages of Folksonomies • Simple (no complex structure to learn) • No need to learn difficult formal classification system • Lower cost of categorization • Distributes cost of tagging over large population • Open ended – can respond quickly to changes • Relevance – User’s own terms • Support serendipitous form of browsing • Easy to tag any object – photo, document, bookmark • Better than no tags at all • Getting people excited about metadata!
Disadvantages of Folksonomies - Quality • They don’t work very well for finding • No structure, no conceptual relationships • Flat lists do not an onomy make • Issues of scale – popular tags already showing a million hits • Limited applicability – only useful for non-technical or non-specialist domains • Either personal tags (others can’t find) or popularity tags – lose interesting terms (Power law distribution) • Most people can’t tag very well – learned skill • Errors – misspellings, single words or bad compounds, single use or idiosyncratic use
Dangers of Folksonomies • Unwisdom of Crowds • “We find that whole communities suddenly fix their minds upon one object, and go mad in its pursuit; that millions of people become simultaneously impressed with one delusion, and run after it, till their attention is caught by some new folly more captivating than the first.” • From witch hunts to tulipomania to stock market crash • Extraordinary Popular Delusions and the Madness of Crowds • Tyranny of the majority • Popularity drowns quality • Narrowing of choices, lost content
Better Folksonomies: • Will social networking make tags better? • Not so far – example of Del.icio.us – same tags • Quality and Popularity are very different things • Most people don’t tag, don’t re-tag • Study – folksonomies follow NISO guidelines – nouns, etc – but do they actually work – see analysis • Most tags deal with computers and are created by people that love to do this stuff – not regular users and infrequent users – Beware true believers!
Flickr Facets • Basic Facets – over 90% of content • Place – Amsterdam to Beach – 40% • Events, Date, People, Things / Animals, Color • Subject Matter – less than 1% • Works on lower level scales: Artparade, tourofbritain, stgilesfair, hideoutblockparty (last weeks) • Faceted navigation – extremely powerful, easy to use to find • How to develop automatic facets? • Design facet system – one time cost, some monitoring • Entity Extraction, suggested placement
Del.icio.us Tags • Design blog software music tools reference art video programming webdesign web2.0 mac howto linux tutorial web free news photography shopping blogs css imported education travel javascript food games • Development inspiration politics flash apple tips java google osx business windows iphone science productivity books toread health funny internet wordpress ajax ruby research humor fun technology search opensource • Photoshop media recipes cool work article marketing security mobile jobs rails lifehacks tutorials resources php social download diy ubuntu freeware portfolio photo movies writing graphics youtube audio online
Del.icio.us - Topics, not Facets • High level topics - photography, news, education • Get related terms by popularity, not conceptual • Photography • Synonyms - photo, photos • Related – art, design, images, camera • Related Facet – howto, tutorial, photoshop • Popularity is not quality • Dominance of computer terms • Tyranny of the majority – design (1 Mil), interior design – 3,909 • Top 25 – same set, slight order shift – social inertia • New terms - important – iphone, ipod, .net, ebooks, facebook • Dropped terms – adult, babes, britney, naked, sex, sexy
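A minimal sketch of what "related terms by popularity, not conceptual" means in practice. The bookmark data and the `related_tags` helper are hypothetical, made up for illustration; this is not how Del.icio.us itself computed related tags. It simply counts which tags co-occur with a given tag.

```python
from collections import Counter

# Hypothetical bookmarks: each set holds the tags one user gave one URL.
BOOKMARKS = [
    {"photography", "art", "design"},
    {"photography", "camera", "howto"},
    {"photography", "art", "photoshop"},
    {"design", "css", "webdesign"},
    {"photography", "photo", "art"},
]


def related_tags(bookmarks, tag, top_n=3):
    """Rank the tags that most often co-occur with `tag` (popularity only)."""
    counts = Counter()
    for tags in bookmarks:
        if tag in tags:
            counts.update(t for t in tags if t != tag)
    # Sort by count (descending), then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


print(related_tags(BOOKMARKS, "photography"))
```

A ranking like this cannot tell a synonym ("photo") from a related subject ("art") or a facet value ("howto"); it only knows how often tags appear together, which is the limitation the slide points to.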
Del.icio.us - Folksonomy Findability • Too many hits (where have we heard that before?) • Design – 1 Mil, software – 931,259, sex – 129,468 • No plurals, stemming (singular preferred) • Folksonomy – 14,073, folksonomies – 3,843, both – 1,891 • Blog – 1.7M, blogs – 516,340, Weblog – 155,917, weblogs – 36,434, blogging – 157,922, bloging – 697 • Taxonomy – 9,683, taxonomies – 1,574 • Personal tags – cool, fun, funny, etc. • Good for social research, not finding documents or sites • How good for personal use? Funny is time dependent
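The fragmentation shown above (folksonomy vs. folksonomies, blog vs. blogs) can be illustrated with a short sketch that folds plural variants together. The counts are the ones quoted on the slide (1.7M is the slide's rounded figure); the `singular` function is a deliberately naive assumption, not a real stemmer, and the summed totals count tag uses, so bookmarks carrying both variants are counted twice.

```python
from collections import defaultdict

# Tag counts as quoted on the slide; "blog" uses the rounded 1.7M figure.
TAG_COUNTS = {
    "folksonomy": 14_073,
    "folksonomies": 3_843,
    "blog": 1_700_000,
    "blogs": 516_340,
    "weblog": 155_917,
    "weblogs": 36_434,
}


def singular(tag):
    """Very naive English singularizer (illustrative assumption only)."""
    if tag.endswith("ies"):
        return tag[:-3] + "y"
    if tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]
    return tag


def merge_variants(counts):
    """Group tags by singular form; keep the most used variant as the label."""
    groups = defaultdict(dict)
    for tag, n in counts.items():
        groups[singular(tag)][tag] = n
    merged = {}
    for variants in groups.values():
        preferred = max(variants, key=variants.get)
        merged[preferred] = sum(variants.values())
    return merged


print(merge_variants(TAG_COUNTS))
```

Plural folding alone does not connect "blog" with "weblog", or catch "blogging" and the misspelled "bloging"; those need a synonym list or editorial review.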
Del.icio.us - Improving the Quality • Bundle tags – if used? • Types of relationships – ubuntu – tutorial, howto, reference, tips, install • Ontology Clusters – grow with people and software • Taxonomy Clusters – software – Linux - ubuntu • Add broad general taxonomy of most popular tags • Tags as natural categories – build up and down • Start – evolve a simple 2 level taxonomy • People assign tags to a category, build numbers • Evolve quality of tags and emerging structure of tags • Preferred term = popular (Blog/blogs – Books/book) • Add mechanisms – rank tags, taggers, categories
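One way to picture "people assign tags to a category, build numbers" is a vote tally over a simple two-level structure. The sketch below is only an illustration under stated assumptions: the `EvolvingTaxonomy` class, the `MIN_VOTES` threshold, and the sample categories are all hypothetical and do not come from the presentation.

```python
from collections import Counter, defaultdict

MIN_VOTES = 3  # hypothetical threshold before a placement is accepted


class EvolvingTaxonomy:
    """Two levels: fixed top categories, tags placed under them by votes."""

    def __init__(self, categories):
        self.categories = set(categories)
        self.votes = defaultdict(Counter)  # tag -> votes per category

    def assign(self, tag, category):
        """Record one user's vote that `tag` belongs under `category`."""
        if category not in self.categories:
            raise ValueError(f"unknown category: {category}")
        self.votes[tag][category] += 1

    def placement(self, tag):
        """Return the leading category, or None until it has enough votes."""
        tally = self.votes.get(tag)
        if not tally:
            return None
        category, n = tally.most_common(1)[0]
        return category if n >= MIN_VOTES else None

    def tree(self):
        """Return {category: [placed tags]} for all accepted placements."""
        result = {c: [] for c in sorted(self.categories)}
        for tag in sorted(self.votes):
            category = self.placement(tag)
            if category is not None:
                result[category].append(tag)
        return result


tax = EvolvingTaxonomy(["software", "media"])
for _ in range(3):
    tax.assign("ubuntu", "software")
tax.assign("youtube", "media")
print(tax.tree())
```

Tags below the threshold stay unplaced rather than being forced into a category, which leaves room for the central team to review them.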
Enterprise Environment – KM 2.0 • From internet to intranet – we’ve done this before • Remember early Intranets built on Internet model? • Smaller content repositories, more coherent • More precise targets – specific documents (the official version) not web sites • More formal – from documents to publishing procedures • More control of publishing – corporate policy • More options for tagging – part of CM system, policy, dedicated editor team, reward system
Semantic Infrastructure Solution • What won’t work: • Recommendations about count vs. non-count nouns or singular vs. plural • Link to online dictionary or Wikipedia – extra work, whole focus is on ease of tagging – any help has to be immediate and integrated - or done by a central group • New Relationship of Center and Crowd • Not top down or bottom up • Interpenetration of opposites • Integrated Solution: Content Structures, People, Technology, Policies and Procedures
Semantic Infrastructure: People • KM 2.0 (or 3.0?) • KM always concerned with social aspects of knowledge • New relationship of center and users – more sophisticated support, more freedom, more suggestions, more user input • New roles for users (taggers, part of a variety of communities – both distributed and central) • New roles for central – create feedback system, tweak the evolution of the system, develop initial candidates • Communities of Practice – apply to tagging, ranking • Community Maps – formal and informal • Map tags to communities – more useful suggestions • Use tags to uncover communities (see tech SNA)
Semantic Infrastructure - Technology • Enterprise Content Management • Place to add metadata – of all kinds, not just keywords • Policy support – important, part of job performance • Add tag clouds to input page • More sophisticated displays • Tag clouds mapped to community map • Tag clusters, taxonomy location • Semantic Software – Inxight, Teragram, etc. • Suggest terms based on text, on tag clouds • Social Networking – add semantics • SNA – apply to people and tags • KM – platforms, CoPs – social tags
Semantic Infrastructure: Putting it all together – Complexity Theory and Folksonomies: Feedback • Ranking Methods • Explicit – people rank directly • Categories, tags, taggers • Good tags, best bets for terms or categories? • Implicit – software evaluation, reverse relevance • Ranking Roles • Taggers – everyone (rewards, make it easy and fun) • Meta-taggers – everyone (but levels of meta-taggers) • Editors – tagging system, integration with taxonomy, resolve disputes, Wikipedia model
Content Structures – Best of Both Worlds • Start and end with a formal taxonomy / Ontology • Findability vastly superior • Communication with others – share tags • Take advantage of conceptual relationships • Tagging experience – folksonomies plus • Users can type any word – system looks it up – plurals, synonyms, preferred terms, spelling variations • Software suggestions – based on content of bookmark, document and on popular user tags – natural level not top down • New terms flagged and routed to central team • Facets – for both things and documents (faceted taxonomy) • Software suggests facet values, user override • Cognitively simpler task than own value, complex hierarchy
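The "folksonomies plus" tagging experience described above (user types any word, the system looks it up, new terms are routed to a central team) might be sketched as follows. The vocabulary, the synonym table, and the `resolve` function are hypothetical stand-ins; a real system would draw them from the organization's taxonomy and would use better normalization than this naive singularizer.

```python
# Hypothetical controlled vocabulary and variant table (assumptions).
PREFERRED = {"blog", "book", "photography", "taxonomy"}
SYNONYMS = {
    "weblog": "blog",        # synonym
    "photo": "photography",  # synonym
    "bloging": "blog",       # known spelling variation
}


def singular(tag):
    """Very naive English singularizer (illustrative assumption only)."""
    if tag.endswith("ies"):
        return tag[:-3] + "y"
    if tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]
    return tag


def resolve(raw, review_queue):
    """Map a free-typed tag to a preferred term, or flag it as new."""
    tag = raw.strip().lower()
    for candidate in (tag, singular(tag)):
        if candidate in PREFERRED:
            return candidate
        if candidate in SYNONYMS:
            return SYNONYMS[candidate]
    # Unknown term: keep the user's word, but route it to the central team.
    review_queue.append(tag)
    return tag


queue = []
for typed in ["Blogs", "weblogs", "taxonomies", "bloging", "iphone"]:
    print(typed, "->", resolve(typed, queue))
print("for review:", queue)
```

The user is never blocked: an unrecognized word is still accepted as a tag, which keeps tagging easy while giving the central team a stream of candidate terms.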
Conclusions: Semantic Infrastructure for KM 2.0 • Folksonomies can help – but they need help to evolve better quality • Fundamental contradiction of ease of tagging and findability will limit usefulness of Internet folksonomies • Enterprise (Intranets, KM) is where the benefits will happen • Semantic Infrastructure solution (people, policy, technology, semantics) and feedback is best approach • Evolve folksonomies, taxonomies, ontologies – not central, top-down design • Intelligent Design + Darwin = new job – Taxonomy Gods
Questions? Tom Reamy, tomr@kapsgroup.com, KAPS Group, Knowledge Architecture Professional Services, http://www.kapsgroup.com