
Using TagHelper to analyze e-discussions


Presentation Transcript


  1. Using TagHelper to analyze e-discussions • Oliver Scheuer • German Research Center for Artificial Intelligence

  2. Goals • General goal • Automatically classifying contributions in e-discussions to help a moderator moderate • Example: Which contributions show critical reasoning and which do not • Summer school goal • Improve prior results (Kappa .68) by using new TagHelper options (user-defined attributes)
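The kappa mentioned here is Cohen's kappa, which measures how well the classifier's labels agree with the expert codes while correcting for agreement expected by chance. A minimal sketch of the computation in Python, using made-up labels rather than the study's data:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two codings."""
    n = len(labels_a)
    # observed agreement: fraction of items both codings label identically
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # expected agreement if the two codings were independent
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Illustrative only: expert codes vs. classifier output ("yes" = critical reasoning)
gold = ["yes", "no", "no", "yes", "no", "no", "yes", "no"]
pred = ["yes", "no", "yes", "yes", "no", "no", "no", "no"]
print(round(cohens_kappa(gold, pred), 2))  # 0.47
```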

  3. Data source • Classroom discussions conducted with the e-discussion tool Digalo • [Screenshot of the Digalo interface, annotated with: workspace, title, content, shapes (discussion moves), links (references to other contributions), discussion opener (defining the topic), participants]

  4. Coding of data • Data is already coded by pedagogical experts • "Code book": [category definitions shown as a table on the original slide]

  5. Coding of data • Coded data: does a contribution show critical reasoning? • Code distribution • yes: 367 (36%) • no: 654 (64%)

  6. Computing a classifier with TagHelper • Prerequisite • Describing discussion contributions in a formal way (attribute-value pairs) • Non-text attributes • Shape (claim, argument, question, …) • Context in the discussion map (incoming links, outgoing links, …)
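As an illustration of such an attribute-value description, here is a hypothetical representation of one contribution in Python; the attribute names are invented for this sketch, not TagHelper's actual identifiers:

```python
# Hypothetical attribute-value description of a single Digalo contribution.
contribution = {
    # non-text attributes from the discussion map
    "shape": "argument",   # claim, argument, question, ...
    "in_links": 2,         # incoming links (references from other contributions)
    "out_links": 1,        # outgoing links (references to other contributions)
    # raw text, from which TagHelper extracts further attributes
    "text": "I think the experiments are wrong because they cause harm.",
    # target label assigned by the pedagogical experts
    "critical_reasoning": "yes",
}
```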

  7. Computing a classifier with TagHelper • Text attributes (extracted by TagHelper) • "Standard" (unigrams, bigrams, POS bigrams, line length, …) • User-defined indicators ("designed attributes") • Defining key word lists for identifying CR-related properties of a contribution • Claim: "I believe", "I feel", "I mean", "I think", … • Opinion: "agree", "disagree", "against", "in favor of", "good point", … • Reasoning: "because", "caus", "therefore", "lead to", "if … then", … • Rationale: CR = (Claim | Opinion) + Reasoning
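A minimal sketch of how such designed attributes can be computed, assuming simple substring matching over the keyword lists (the lists are abbreviated from the slide, and TagHelper's actual matching mechanism may differ):

```python
# "Designed attributes": binary indicators fired by keyword lists.
# Substring matching lets a stem like "caus" also hit "because" / "causes".
KEYWORDS = {
    "claim":     ["i believe", "i feel", "i mean", "i think"],
    "opinion":   ["agree", "disagree", "against", "in favor of", "good point"],
    "reasoning": ["because", "caus", "therefore", "lead to"],
}

def designed_attributes(text):
    low = text.lower()
    atts = {name: any(kw in low for kw in kws) for name, kws in KEYWORDS.items()}
    # rationale from the slide: CR = (Claim | Opinion) + Reasoning
    atts["cr_pattern"] = (atts["claim"] or atts["opinion"]) and atts["reasoning"]
    return atts

print(designed_attributes("I think this is wrong because it causes harm."))
# {'claim': True, 'opinion': False, 'reasoning': True, 'cr_pattern': True}
```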

  8. Results • Evaluation mode • Train on 4/5 of the available data • Test on the remaining 1/5 • "Tuned SMO": • attribute selection • more powerful kernel • "Designed atts": • key word lists manually defined with TagHelper → Designed attributes did not lead to improvements → Could not beat prior results
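For illustration, a sketch of this evaluation setup in Python, using scikit-learn as a stand-in for TagHelper/Weka's SMO; the data is synthetic (same corpus size and label distribution as above), and the feature selector, k, and RBF kernel are assumptions rather than the study's exact configuration:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in data: 1021 contributions with 200 binary attributes
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1021, 200))
y = rng.choice(["yes", "no"], size=1021, p=[0.36, 0.64])

# Train on 4/5 of the data, test on the remaining 1/5
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# "Tuned SMO" analogue: attribute selection + a more powerful (RBF) kernel
model = make_pipeline(SelectKBest(chi2, k=50), SVC(kernel="rbf"))
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))
```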

  9. Discussion • Hard to improve when starting from a fairly high level • Hard for a computer to identify CR contributions when key words are missing (e.g. "because") • Example: "The computer creates interest and the interest increases concentration" • CR is sometimes hard even for human coders to identify ("human factor" as an additional error source) • Example: "Hello Mr. Chen. Your claim is partly correct and partly incorrect. You wrote that you don't support the experiments but on the other hand - that without those experiments you can't study new things! So I think you are only 50% right." (CR?)

  10. Discussion • No characteristic terms for "no CR" contributions → makes it difficult to define attributes which "guide" the algorithm to correctly classify "no CR" • CR: 94% correctly classified (5 misclassified instances) • no CR: 73% correctly classified (41 misclassified instances) • Problem of granularity • Contributions can contain large text strings with several sentences • Makes a precise classification more difficult • More time needed for further fine-tuning of the attribute space
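The per-class figures above are recall values from a confusion matrix. A sketch of the computation, with correct-classification counts back-calculated from the slide's percentages and therefore only illustrative:

```python
# rows = true class, columns = predicted class; counts reconstructed so that
# 79/84 ≈ 94% and 111/152 ≈ 73% match the slide, not taken from the experiment
confusion = {
    "CR":    {"CR": 79, "no CR": 5},     # 5 CR instances misclassified
    "no CR": {"CR": 41, "no CR": 111},   # 41 no-CR instances misclassified
}

def per_class_recall(matrix):
    return {label: row[label] / sum(row.values()) for label, row in matrix.items()}

for label, recall in per_class_recall(confusion).items():
    print(f"{label}: {recall:.0%}")
# CR: 94%
# no CR: 73%
```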

  11. Lessons learned • TagHelper is a useful tool for analyzing verbal data • "Designed" attributes are potentially helpful (even though the experiments conducted here did not lead to improvements) • they can guide the machine learning algorithm in the right direction • by making information accessible to the learning algorithm that is not otherwise represented in the training data • Fine-tuning of machine learning algorithms and the attribute space is necessary for optimal results, but also time-consuming
