
Chris Biemann, Karsten Böhm, Gerhard Heyer, Ronny Melz University of Leipzig

Automatically Building Concept Structures and Displaying Concept Trails for the Use in Brainstorming Sessions and Content Management Systems. Chris Biemann, Karsten Böhm, Gerhard Heyer, Ronny Melz University of Leipzig I2CS 2004 – Guadalajara - Mexico 06-23-2004. Support for Creativity.


Presentation Transcript


  1. Automatically Building Concept Structures and Displaying Concept Trails for the Use in Brainstorming Sessions and Content Management Systems
     Chris Biemann, Karsten Böhm, Gerhard Heyer, Ronny Melz, University of Leipzig
     I2CS 2004, Guadalajara, Mexico, 06-23-2004

  2. Support for Creativity
     • Acquisition of Knowledge: gathering information from structured and unstructured texts, databases, document collections, the web, etc.
     • Processing Knowledge: generation of semantic maps and associations in cooperative teamwork meetings.
     • Using Knowledge: visualization of terms and relations. Filters define views on semantically relevant contents and structures.

  3. Goal 1: Computer-aided Associating
     Software realizes:
     • Protocol function by displaying identified keywords
     • Adding associations from the database
     • Displaying keywords reflecting semantic similarity
     Desired effects:
     • Users can easily remember the session later
     • During the session, associations remind users of terms they might otherwise have forgotten
     • The weight and relatedness of different topics in a session become visible

  4. Goal 2: Semantic Map and Red Thread
     Software realizes:
     • Calculation and visualisation of large document collections by using important terms (keywords)
     • Positioning of terms reflects semantic closeness
     • Small documents can be drawn into the semantic map: red thread functionality
     Desired effects:
     • A fixed map provides orientation in the contents of the document collection
     • Important terms can be surveyed quickly
     • Red thread functionality can be used for "fast reading"

  5. Data Sources
     • Projekt Deutscher Wortschatz: word list and co-occurrences
       - for associations
       - as a reference corpus for the semantic map
     • Manual annotation: typed (coloured) edges and nodes
       - Semantic primitives
       - Semantic relations

  6. Calculating Associations: Statistical Co-occurrences
     • Co-occurrence: occurrence of two or more words within a well-defined unit of information (sentence, nearest neighbours)
     • Significant co-occurrences reflect relations between words
     • Significance measure: log-likelihood
     • This measure defines the association degree for every pair of words. High degrees result in edges in the semantic map
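
A minimal sketch of how such a significance score could be computed for one word pair. The slide names the measure only as "log-likelihood"; the code below assumes Dunning's G² statistic over a 2x2 sentence-level contingency table, which may differ from the exact formula the authors used. The function name and the counts are hypothetical.

```python
import math


def log_likelihood(n_ab, n_a, n_b, n):
    """Dunning-style G^2 score for words A and B (assumed measure).

    n_ab: sentences containing both A and B
    n_a, n_b: sentences containing A, respectively B
    n: total number of sentences
    """
    # Observed 2x2 contingency table.
    observed = [
        n_ab,                  # A and B
        n_a - n_ab,            # A without B
        n_b - n_ab,            # B without A
        n - n_a - n_b + n_ab,  # neither A nor B
    ]
    # Expected counts if A and B occurred independently of each other.
    expected = [
        n_a * n_b / n,
        n_a * (n - n_b) / n,
        (n - n_a) * n_b / n,
        (n - n_a) * (n - n_b) / n,
    ]
    # Cells with an observed count of 0 contribute nothing to the sum.
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )
```

A pair whose score exceeds some chosen threshold would then count as a significant co-occurrence; the threshold itself is not given in the slides.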

  7. Example for Co-occurrences
     Significant co-occurrences of Guadalajara: Camarena (194), Mexico (104), Mexican (58), kidnapped (43), Zavala (40), ranch (40), Avelar (37), abducted (35), Alvarez (33), drug (33), Camarena's (32), pilot (32), Caro (30), Enrique (29), agent (27), Enforcement (25), Quintero (25), gynecologist (23), tortured (23), Jalisco (22), DEA (21), Drug (21), miles (21), torture (21), Alfredo (20), Machain (18), Feb (16), bodies (16), southeast (16), Monterrey (15), Rafael (15), found (15), Radelat (14), Paso (13), consulate (13), Administration (12), Salazar (12), body (12), killed (12), outside (12), Vasquez (11), Verdugo (11), bullet-riddled (11), murder (11), El (10), Humberto (10), Lopez (10), lord (10), Felix (9), Gallardo (9), Hernandez (9), Mexico's (9), arrested (9), cartel (9), Alberto (8), City (8), March (8), Zuno (8), city (8), homicide (8), indictment (8), kidnapping (8), Caro-Quintero (7), February (7), Tijuana (7), Zuno-Arce (7), buried (7), marijuana (7), racketeering (7), slayings (7), 31-year-old (6), April (6), Consulate (6), Culiacan (6), Javier (6), Machain's (6), agents (6), office (6)
     Significant left neighbours of Guadalajara: outside (12), near (5)
     Significant right neighbours of Guadalajara: gynecologist (27), office (8), street (8), home (6), Haggadah (5), drug (5)

  8. Calculating Semantic Maps
     Requires: document collection
     • Calculate co-occurrences and keywords by differential frequency analysis: important words are much more frequent in the document collection than in a large reference corpus
     • Take the highest-ranked words from the differential frequency analysis as nodes
     • Take highly significant co-occurrences of existing nodes as further nodes
     • Remove stopwords (function words, determiners...)
     • Insert edges between nodes that have a high association degree by co-occurrence significance
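
The keyword-selection step above can be sketched roughly as follows. This covers only the differential frequency analysis and the stopword removal, not the co-occurrence expansion or the edge insertion. The ratio-based score, the add-one smoothing and all names are assumptions made for illustration; the slides do not specify the scoring formula.

```python
from collections import Counter


def keywords_by_differential_frequency(doc_tokens, ref_tokens, stopwords, top_k=10):
    """Rank words by relative frequency in the collection vs. a reference corpus."""
    doc_counts = Counter(t.lower() for t in doc_tokens)
    ref_counts = Counter(t.lower() for t in ref_tokens)
    doc_total = sum(doc_counts.values())
    ref_total = sum(ref_counts.values())

    scores = {}
    for word, count in doc_counts.items():
        if word in stopwords:
            continue  # function words never become map nodes
        doc_rel = count / doc_total
        # Add-one smoothing (an assumption) so that words unseen in the
        # reference corpus do not cause a division by zero.
        ref_rel = (ref_counts[word] + 1) / (ref_total + 1)
        scores[word] = doc_rel / ref_rel
    # Highest ratio first: much more frequent here than in the reference corpus.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The returned words would serve as the initial nodes of the map.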

  9. Positioning in Semantic Maps
     Force-directed: nodes and edges are thrown onto a plane and then driven to equilibrium by minimizing the energy
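
To illustrate the idea, here is a small spring-embedder sketch in the style of Fruchterman and Reingold: all nodes repel each other, edges pull their endpoints together, and a shrinking step size lets the layout settle. This is an assumed stand-in, not the layout code used by the authors or by Touchgraph; the force formulas, the parameters and the circular start positions are illustrative choices.

```python
import math


def force_directed_layout(nodes, edges, iterations=200, k=1.0):
    """Return {node: [x, y]} after a simple spring-embedder relaxation.

    nodes: list of node names; edges: list of (u, v) pairs.
    """
    n = len(nodes)
    # Deterministic start: nodes evenly spaced on the unit circle.
    pos = {
        v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
        for i, v in enumerate(nodes)
    }
    step = 0.1  # maximum distance a node may move per iteration
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between every pair of nodes, strength k^2 / d.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        # Attraction along edges, strength d^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Move each node along its net force, capped by the current step.
        for v in nodes:
            length = math.hypot(disp[v][0], disp[v][1]) or 1e-9
            move = min(length, step)
            pos[v][0] += disp[v][0] / length * move
            pos[v][1] += disp[v][1] / length * move
        step *= 0.98  # cooling: smaller moves as the layout settles
    return pos
```

In such a layout, terms joined by an edge end up close together, while unconnected groups drift apart.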

  10. Domain Adjustment
      Diagram of three knowledge sources, with two regions labelled "partial overlap":
      • Session knowledge / Project knowledge: enrichment of the database by incorporating task-specific knowledge and know-how.
      • Community knowledge / Domain knowledge: generation of a semantic map by processing domain-relevant documents and incorporating existing ontologies.
      • Wortschatz database (very large corpus)

  11. Visualization
      Extension of Touchgraph (www.touchgraph.com):
      • Force-directed model for positioning
      • Label fill colours for runtime type (keyword, associated, red thread)
      • Label edge colours for semantic primitives
      • Edge colours for semantic relations
      • Nodes can be displayed as labels or dots
      Legend labels from the screenshot: Is-A relation; co-hyponymy relation; white: keyword by user; grey: association from DB; primitive: noun; primitive: organisation

  12. Zooms
      • Conceptual zoom: lexicalize nodes or display them as dots
      • Granularity: reduce the number of visible nodes
      • Optical zoom: size of the window compared to the total size of the map

  13. Adding nodes in Association Mode
      • User keywords are added to the graph. They fade if they do not get connected within a certain time.
      • Grey words are added if they are associated with at least two user keywords
      Example input (translated from German): "Let's talk about Mexico. Mexicans wear sombreros, which are hats for sun protection. I would like to have a hat like that too! It is a country in Central America."
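
The rule for grey words can be sketched as a simple count over an association lookup. The dictionary-of-sets representation of the database and all names here are hypothetical; only the "at least two user keywords" criterion comes from the slide. Fading of unconnected keywords is not modelled.

```python
def associated_words(user_keywords, associations, min_links=2):
    """Return database words linked to at least `min_links` user keywords.

    associations: dict mapping a word to the set of words associated with it
    (a stand-in for the co-occurrence database).
    """
    keywords = set(user_keywords)
    link_counts = {}
    for keyword in keywords:
        for neighbour in associations.get(keyword, ()):
            if neighbour in keywords:
                continue  # user keywords are shown anyway, not as grey words
            link_counts[neighbour] = link_counts.get(neighbour, 0) + 1
    return {word for word, count in link_counts.items() if count >= min_links}
```

For example, with hypothetical associations where both "Mexico" and "hat" are linked to "sombrero", the word "sombrero" would be added in grey once both keywords have been spoken.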

  14. ... it knows lots of countries

  15. Semantic Map Example

  16. Red Thread Functionality
      Screenshot labels: Afghanistan, Georgia, Iraq
      • Given: semantic map, additional input
      • Terms from the additional input that are found in the semantic map are coloured red and connected in the sequence of their occurrence
        - red connection: the edge already existed in the semantic map
        - yellow connection: the edge is new
      • Long-range yellow edges visualize topic shifts
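
A minimal sketch of the red thread rule, assuming the map is available as a set of node names and a list of undirected edges. The function name and data layout are hypothetical; the red/yellow distinction follows the slide.

```python
def red_thread(map_nodes, map_edges, input_terms):
    """Return the trail of map terms and the coloured connections between them."""
    # Treat map edges as undirected for the lookup.
    existing = {frozenset(edge) for edge in map_edges}
    # Keep only input terms that occur in the map, in order of occurrence.
    trail = [term for term in input_terms if term in map_nodes]
    connections = []
    for prev, cur in zip(trail, trail[1:]):
        if prev == cur:
            continue  # an immediate repetition adds no connection
        # Red: the edge already exists in the map. Yellow: the edge is new.
        colour = "red" if frozenset((prev, cur)) in existing else "yellow"
        connections.append((prev, cur, colour))
    return trail, connections
```

A run of yellow connections between distant regions of the map would then show up as the topic shift mentioned above.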

  17. SemanticTalk GUI
      Screenshot labels: topic survey window, zoom rulers, local context window

  18. Embedding in the system
      • Implementation as a Java servlet on a Tomcat web server
      • MySQL database for graphs and associations
      • Linguatec VoicePro 10: interface for speech recognition
      • Several speech recognition clients can be connected via LAN

  19. Interfaces
      • Import/Export: the results obtained with SemanticTalk can be saved, loaded and exported to other tools for further processing
        - various formats for text files
        - XML/RDF/RDB for maps
        - PNG for maps
      • Retrieval:
        - words (nodes) with links to occurrences in the document collection
        - associations (edges) with links to occurrences in the document collection
        - explicit links, e.g. pictures for words

  20. Further Processing of Net Topology Structures
      Diagram labels: Product model, Exchange format (e.g. RDF), Variants, Process model, Semantic Map, Resource model, Transformation in application models

  21. Questions? THANK YOU!
