
Development of a Naïve Bayesian Classifier for “Big Five” Item Domains


Presentation Transcript


  1. Development of a Naïve Bayesian Classifier for “Big Five” Item Domains Alan D. Mead Cassia K. Carter Illinois Institute of Technology

  2. Agenda • The problem: items and domain • Bayesian classification • Research Questions • Method • Results • Future Directions

  3. Items and Domains • In most tests, items are assigned to domains • To meet content specifications • To provide feedback by domain • Items are usually assigned by the item writers and double-checked during item review • For many tests, manual classification is reliable and easy

  4. Personality Domains • This research stemmed from two projects aimed at automating the development of personality items • Project 1: Personality items generated from templates • Project 2: Latent semantic analysis (LSA) was used to assemble items from a large pool according to semantic similarity between items and a construct definition • Manual classification is not perfectly reliable • It would be useful to have a systematic, automated way to classify items into domains

  5. Big Five Dimensions • The Big Five is a common, high-level taxonomy for personality constructs: • Conscientiousness (“I like order” C+) • Agreeableness (“I insult people” A-) • Neuroticism (“I often feel blue” N+) • Openness (“I do not have a good imagination” O-) • Extraversion (“I am the life of the party” E+) • But some items are ambiguous: • “I am a warm, nurturing person” E+ or A+? • “I am very traditional” O- or C+?

  6. Choice of Methodology • In this classification problem, there will be no response data • If we had response data, we could use EFA/CFA • Predictors are the presence of specific words • These data are probably of a nominal level of measurement • Sample size is the number of items to classify • We might easily have many more predictors than rows of data (even ignoring interaction terms)

  7. Choice of Methods • When X/predictors and Y/domains are metric, many techniques exist (LDF/Regression, LCA, factor analytic approaches, etc.) • When X is metric but Y is categorical, logistic regression is suitable • What to use when X and Y are not metric? • Naïve Bayesian Classifiers are one solution

  8. Bayesian Classification • Predict nominal classes (domain) from nominal predictors (presence of specific words) • Handle problems with many predictors • Have a history of successful application • Are computationally simple • Have been shown to be robust to technical issues like high degrees of multidimensionality and noise

  9. Bayesian Classification (cont.) • Compute P(domain|item) for each domain • Classify as the domain of maximum probability: Predicted domain = argmax P(domain|item) = argmax P(item|domain) P(domain) / P(item) = argmax P(item|domain) P(domain) = argmax P(w1, w2, …, wn | domain) P(domain) ≈ argmax P(w1|domain) P(w2|domain) … P(wn|domain) P(domain) • “Naïve” refers to this assumption of independence of the predictors
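
As a rough illustration of this decision rule (not the authors' actual implementation), the following Python sketch estimates P(domain) and P(word|domain) from word counts and picks the highest-scoring domain; the tokenize() helper, the ALPHA smoothing constant, and the use of log probabilities are assumptions added for the example.

    from collections import Counter, defaultdict
    import math

    ALPHA = 1.0  # illustrative smoothing constant (not from the slides)

    def tokenize(text):
        # crude stand-in for the preprocessing described on slide 13
        return [w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")]

    def train(items):
        """items: list of (item text, domain) pairs."""
        domain_counts = Counter()
        word_counts = defaultdict(Counter)   # word_counts[domain][word]
        vocab = set()
        for text, domain in items:
            domain_counts[domain] += 1
            for w in tokenize(text):
                word_counts[domain][w] += 1
                vocab.add(w)
        return domain_counts, word_counts, vocab

    def classify(text, domain_counts, word_counts, vocab):
        n_items = sum(domain_counts.values())
        best_domain, best_score = None, float("-inf")
        for domain, dcount in domain_counts.items():
            # log P(domain) + sum over words of log P(word | domain)
            score = math.log(dcount / n_items)
            total = sum(word_counts[domain].values())
            for w in tokenize(text):
                if w not in vocab:
                    continue   # slide 13: unknown words are simply discarded
                score += math.log((word_counts[domain][w] + ALPHA) /
                                  (total + ALPHA * len(vocab)))
            if score > best_score:
                best_domain, best_score = domain, score
        return best_domain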

  10. Example of NB Classification • I am the life of the party (E+) • Classified as extraversion
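
Continuing the sketch above, slide 10's item can be classified against a toy training set; these training items and their domain labels are invented here purely for illustration:

    training = [
        ("I like order and schedules", "Conscientiousness"),
        ("I insult people", "Agreeableness"),
        ("I often feel blue", "Neuroticism"),
        ("I do not have a good imagination", "Openness"),
        ("I start the party and talk to everyone", "Extraversion"),
    ]
    model = train(training)
    print(classify("I am the life of the party", *model))   # -> Extraversion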

  11. Research Questions • RQ1: How well does this method work? Can it be improved? • RQ2: Does adding additional items help improve classification accuracy? • RQ3: Does the type of item added matter? • RQ4: How should unknown words be handled?

  12. Method • Compiled a database of five forms of various Big Five personality tests; N=655 • Leave one out cross-validation (LOOCV) was used: • Hold out item 1; Train classifier on remaining items; Classify item 1 • Hold out item 2; Train classifier on remaining items; Classify item 2 • Repeat for items 3, 4, …, N • Compare predicted domain to actual domain
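
A sketch of that leave-one-out loop, reusing the hypothetical train() and classify() functions from the earlier sketch; items stands in for the N = 655 labeled (text, domain) pairs:

    def loocv_accuracy(items):
        """Hold out each item in turn, train on the rest, and score the prediction."""
        hits = 0
        for i, (text, actual_domain) in enumerate(items):
            rest = items[:i] + items[i + 1:]        # every item except the held-out one
            model = train(rest)
            if classify(text, *model) == actual_domain:
                hits += 1
        return hits / len(items)                    # proportion correctly classified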

  13. Pre-processing & Processing • Force all terms to lowercase • Discard any punctuation • Discard common words (I, am, a, the, etc.) • Use Porter stemming to produce rough lemmas (annoyed, annoy, annoys, annoying -> “anno”) • Ignore unknown words (i.e., discard them)
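
One way these steps might look in Python (a sketch, not the authors' code); it assumes NLTK's PorterStemmer is available and uses a small illustrative stop-word list in place of whatever list was actually used:

    import string
    from nltk.stem import PorterStemmer   # assumes the nltk package is installed

    STOPWORDS = {"i", "am", "a", "an", "the", "and", "of", "to"}   # illustrative only
    stemmer = PorterStemmer()

    def preprocess(item_text):
        # 1. lowercase  2. discard punctuation  3. drop common words  4. Porter-stem
        text = item_text.lower().translate(str.maketrans("", "", string.punctuation))
        return [stemmer.stem(w) for w in text.split() if w not in STOPWORDS]

    preprocess("I enjoy wild parties and imaginative games.")
    # roughly: ['enjoy', 'wild', 'parti', 'imagin', 'game']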

  14. RQ1: Classification Results • 70.5% accuracy (diagonal of the confusion matrix) • Extraversion had too few correct classifications

  15. RQ2: Adding In Items • Added items written as part of three graduate-level classes • All were Big Five items, classified by the students who wrote them • Blind manual classification followed • The final item set included only items where the original classification and two independent raters agreed • New N = 1,116

  16. Additional Items Results • Accuracy 75.3% • An increase of only about 3% over the 655-item set • Openness is now lowest; Extraversion is still low

  17. RQ3: Type of Item • Does the type of item added to the database matter? • Template Group 1: Template items where frequency words varied (“I {always/sometimes/never/rarely/often} enjoy spending time with other people”); N = 940 • Template Group 2: Manually generated templates based on IPIP items (“I have difficulty {dreaming up|conceiving of|brainstorming|devising|inventing|making up|planning|scheming|visualizing} things.”); N = 194 • Template Group 3: “I am a BLANK person” items (“I am an energetic person”); N = 1,239 • Student Item Set: Another group of student-written Big Five items, reviewed by only one rater
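
Template items like these can be expanded mechanically; the sketch below (illustrative only, not the authors' generator) expands every braced alternation, accepting either "/" or "|" as the separator:

    import itertools
    import re

    def expand_template(template):
        """Expand every {a/b/c} or {a|b|c} alternation into all combinations."""
        parts = re.split(r"\{([^}]*)\}", template)
        # odd-indexed parts are alternation groups; even-indexed parts are literal text
        choices = [re.split(r"[/|]", p) if i % 2 else [p] for i, p in enumerate(parts)]
        return ["".join(combo) for combo in itertools.product(*choices)]

    expand_template("I {always/sometimes/never/rarely/often} enjoy spending time with other people")
    # -> 5 items: 'I always enjoy ...', 'I sometimes enjoy ...', etc.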

  18. Type of Item Results

  19. Type of Item Results • Adjective-based items had the lowest accuracy • These items come down to a single word, often a unique one • Template items with high redundancy did best on their own • However, accuracy for this group dropped when it was added to the overall set • Template items with less redundancy improved overall accuracy somewhat • Adding more items doesn’t help dramatically • But adding items that carry more information does help

  20. RQ4: Unknown Terms • Unknown terms are a real problem • “I am filled with doubts about things” was effectively reduced to “things” because “doubts” and “filled” were used only in this item • Many items hinge on a single word (e.g., “workaholic”) • Solution: Replace the unknown term with sense 1 from wiktionary.org; e.g.: http://en.wiktionary.org/wiki/advance
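
A sketch of this substitution idea; the Wiktionary lookup is hidden behind a hypothetical sense1() stub with one hard-coded entry, where a real implementation would fetch and parse the wiktionary.org page for the word:

    def sense1(word):
        """Hypothetical lookup of a word's first Wiktionary sense (hard-coded toy version)."""
        toy_dictionary = {"bashful": "inclined to avoid notice"}
        return toy_dictionary.get(word)

    def replace_unknown_terms(item_text, vocab):
        out = []
        for w in item_text.lower().rstrip(".!?").split():
            if w in vocab:
                out.append(w)
            else:
                definition = sense1(w)
                out.append(definition if definition else w)   # keep the word if no sense found
        return " ".join(out)

    replace_unknown_terms("I sometimes feel bashful.", {"i", "sometimes", "feel"})
    # -> 'i sometimes feel inclined to avoid notice'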

  21. Unknown Term Example • “I sometimes feel bashful.” • “bashful” is not known • Look up “bashful”: “inclined to avoid notice” • “I sometimes feel inclined to avoid notice.” • Simplistic approach: • Ignored grammatical implications • It was not possible to match word senses, so sometimes the wrong definition was used • Did not check that the definition used known terms

  22. Results: Unknown Terms • 84% unchanged • Originally 48% correct; After defining unknown terms, 58% correct • 4 items (13%) improved; 1 item (3%) became a miss

  23. Unknown Terms • Small improvements using this method • Would work better if the correct sense could be chosen • Often sense 1 was not the correct part of speech • Some words did not have correct senses on Wiktionary • Could try using synonyms

  24. Future Directions • Find more personality items • Explore better ontologies (e.g., WordNet) • Analyze words more carefully • Part-of-speech (POS) tagging • Try using word-sense disambiguation • Search definitions for “personality-ish” definitions • Use Laplace smoothing and POS tag to handle unknown terms algorithmically
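
On the Laplace-smoothing idea in the last bullet: adding a small pseudo-count gives a word that was never seen in a domain (or never seen at all) a nonzero probability instead of forcing it to be discarded. A minimal sketch of the smoothed estimate, with an illustrative alpha parameter not taken from the slides:

    def smoothed_word_prob(word, domain, word_counts, vocab, alpha=1.0):
        """Laplace-smoothed P(word | domain): an unseen word gets
        alpha / (total + alpha * |vocab|) instead of zero."""
        total = sum(word_counts[domain].values())
        return (word_counts[domain][word] + alpha) / (total + alpha * len(vocab))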

  25. Thank you! Contact: mead@iit.edu
