
Affective patterns using words and emoticons in Twitter

Affective patterns using words and emoticons in Twitter. Tyler Schnoebelen NWAV 40 Georgetown University Oct 30, 2011. Hello, Readers. If you’re reading this presentation on the web—Hi!


Presentation Transcript


  1. Affective patterns using words and emoticons in Twitter Tyler Schnoebelen NWAV 40 Georgetown University Oct 30, 2011

  2. Hello, Readers • If you’re reading this presentation on the web—Hi! • I’ve put in notes for most of the slides, which should help build out what they are about and give some of the narrative. • I’ve reduced the quality of the images to try to get the presentation smaller, but the file is still kind of big (sorry). • Feel free to tweet this presentation: • http://bit.ly/tyleremo (this presentation) • http://bit.ly/tyleremotion (a link to my main web page about emotions and language)

  3. Get out your phones #NWAV40 @TSchnoebelen

  4. What’s ahead • Situated cues and broader patterns • In this presentation: • What can we say about the meaning of various emoticons? • What are their usage patterns? • And which words do they co-occur with? • Words describe not just the emoticons, but users, stance objects, and types of audiences that they are most/least consistent with

  5. Emoticon Dialectology • (^_^) smiling • (^_~) winking • (>_<) angry • (-_-) not amused • d-_-b listening to music

  6. Among English-Speakers? • The biggest emoticons, worldwide and in America, are faces on their sides: :) :D :( ;) :-) =) :P ;-) (: XD • Some have equal eyes • Some face right • There are tongues, winks, and wrinkly eyes, too • Some have noses

  7. Smiley stuff :) :-) (: :D :-D XD =) =D

  8. Winky stuff ;) ;-) (; ;P

  9. Tongue-y stuff :P :-P ;P =P

  10. Frown-y stuff =( :( :-( ): D:

  11. Slant-y stuff :/ :-/

  12. O-mouths :O

  13. Coverage--WorldWide • 39 million tokens for 1,479 emoticons • http://www.infochimps.com/datasets/twitter-census-smileys

  14. Corpus for this Presentation • 3,209,102 American English tweets with one of these emoticons :) and :-) and (: :D and :-D ;) and ;-) and (; :P and :-P ;P XD :O :( and :-( and ): :/ and :-/ D: =) and =D and =( and =P • 32,252,909 word tokens, 13,586 unique words

  15. Fill in the blank • @KevinHarvick Awww, leave the cute little ground hogs alone. That is so sad… [_EMOTICON_]

  16. Closest to “so sad” • @KevinHarvick Awww, leave the cute little ground hogs alone. That is so sad… [_EMOTICON_]

  17. That “awww” • @KevinHarvick Awww, leave the cute little ground hogs alone. That is so sad… [_EMOTICON_]

  18. Leave X alone • @KevinHarvick Awww, leave the cute little ground hogs alone. That is so sad… [_EMOTICON_]

  19. Qual and quant • Our intuitions are qualitative and nuanced. • But do these intuitions actually hold? • Are they built on quantifiable generalizations? • We can and should make reference to how the various linguistic resources we are using as cues get used in other situations.

  20. Probability • There are 16,348 tokens of “sad” that appear with our 25 emoticons • :) occurs with 12,531,809 words • There are 32,252,909 words that appear with any emoticon • If :) were really just a random tag with no meaning, then we’d expect there to be: • (16,348/32,252,909)*(12,531,809/32,252,909)*(32,252,909) = 6,351.986 tokens of “sad” alongside :) • Observed tokens of :) and “sad” together: only 1,972 • That is 31% of what we’d expect • Highly significant by Fisher’s exact test (p ≈ 5.91e-05) • Throughout this presentation, I’ll report Observed/Expected values that are significant at p<0.05 or better
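The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the expected-count calculation under independence, using the counts quoted on the slide; the function and variable names are mine, and the Fisher's exact test is not reproduced here.

```python
# Counts taken from the slide above.
sad_tokens = 16_348        # tokens of "sad" co-occurring with any of the emoticons
smile_words = 12_531_809   # word tokens co-occurring with :)
all_words = 32_252_909     # word tokens co-occurring with any emoticon
observed = 1_972           # observed co-occurrences of "sad" and :)


def expected_count(word_total: int, emoticon_total: int, grand_total: int) -> float:
    """Expected co-occurrences if the word and the emoticon were independent."""
    return (word_total / grand_total) * (emoticon_total / grand_total) * grand_total


expected = expected_count(sad_tokens, smile_words, all_words)
ratio = observed / expected  # the Observed/Expected value reported in the talk

print(f"expected: {expected:.3f}")
print(f"observed/expected: {ratio:.2f}")
```

A ratio below 1 means the word shows up with the emoticon less often than chance would predict; a ratio above 1 means more often.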

  21. :-/

  22. Scope and affect • Notice that “cute” and “little” are positively valenced. • But since they occur within the scope of “leave_alone”, they presumably become LESS likely to appear with smiles • @KevinHarvick Awww, leave the cute little ground hogs alone. That is so sad… [_EMOTICON_]

  23. We also can think about… • The author’s gender (female) • The main recipient’s gender (male) • The author’s social network make-up • As defined not by followers/following but by mutual-@’ing across time • (in this case, mixed)

  24. A quick note about gender • Nearly all emoticons are used by a higher percentage of women than men • The one exception is :-P • Once we distinguish tweeters based on the gender composition of their network • Instead of using “followers/following”, we use “who has consistently and mutually @’ed each other” • We see that gender makeup doesn’t change how women use most emoticons • Men are much more sensitive • In the domain of “unhappy emoticons”, let’s compare gender-biased networks with mixed gender networks • :( men with male networks avoid this, men with female networks use it a lot • :-( men with female networks use it a lot • :-/ women with female networks avoid this • :/ men with male networks avoid, but women with male networks use a lot; women with female networks avoid • Joint work with David Bamman and Jacob Eisenstein

  25. Fill in the blank • @iShell_Beelieve I LOOVE YOU MOORE!!! yessss please skype me [_EMOTICON_] haha im excited lol

  26. Clustering so far • So far we’ve looked at 14 words • They’ve distinguished happy vs. sad in the first case • And noses from no-noses in the second case • What happens when we look at our 25 emoticons across 13,586 words?

  27. Clustering overview • We can pick out 2 or 3 dimensions, but we have 25 dimensions • Lots of ways to cluster • Hierarchical clustering • Factor analysis • K-means • Model-based • Basically, they all look at distances between points • Close pairs should go in the same cluster • Distant pairs should go in different clusters

  28. Hierarchical clustering overview • Agglomerative hierarchical clustering: • Start with each point as an individual and start fusing like points together. • Then take the “fused points” and fuse them with more, building up, ultimately to one giant cluster that shows a hierarchy beneath it. • Once a fusion is made, it’s done. • You can’t appear in more than one group.
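The fusing procedure described on this slide can be sketched in plain Python. The points below are hypothetical 2-D stand-ins for emoticon profiles, not data from the talk, and average linkage is my assumption; the slides do not say which linkage rule was used.

```python
from itertools import combinations
from math import dist

# Hypothetical 2-D points standing in for emoticon profiles (not real data).
points = {
    ":)": (1.0, 1.0),
    ":-)": (1.2, 0.9),
    ":(": (5.0, 5.0),
    ":-(": (5.3, 4.8),
    ":P": (1.5, 3.0),
}


def average_linkage(a, b):
    """Mean pairwise distance between the members of two clusters."""
    return sum(dist(points[x], points[y]) for x in a for y in b) / (len(a) * len(b))


def agglomerate(labels):
    """Fuse the two closest clusters until one remains; return the merge history."""
    clusters = [(label,) for label in labels]  # every point starts on its own
    history = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)  # once a fusion is made, it is never undone
        history.append((a, b))
    return history


merges = agglomerate(points)
for left, right in merges:
    print(left, "+", right)
```

Reading the merge history from first to last gives the hierarchy from the bottom up: early merges are the tightest pairs, and the last merge joins everything into one cluster.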

  29. Results of hierarchical clustering: all words • Noses cluster separately • :O seems “playful” • Positive and negative cluster separately • Why are D: and XD together?

  30. Factor analysis • We use factor analysis to discover “latent” variables • Imagine a test that had 30 geometry questions and 20 literature questions. You give it to a few hundred kids. • Going in, we know that there are kids who will do better in one section than the other • If we did a factor analysis on their data, we’d expect to discover a latent “math” variable and a latent “reading” variable. • Look for variables (emoticons) that are correlated • Combine them into factors • Each emoticon is then more-or-less associated with each factor
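The test-score analogy can be simulated to show the correlation pattern that factor analysis looks for. This sketch is not a factor analysis itself: it only generates made-up scores from two hidden abilities and shows that scores driven by the same hidden ability correlate strongly while scores driven by different ones do not. All names and numbers are illustrative assumptions.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the toy example is repeatable


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Simulated students: two hidden abilities drive four observed scores.
geometry_a, geometry_b, literature_a, literature_b = [], [], [], []
for _ in range(300):
    math_ability = random.gauss(0, 1)     # latent, never observed directly
    reading_ability = random.gauss(0, 1)  # latent, never observed directly
    geometry_a.append(math_ability + random.gauss(0, 0.5))
    geometry_b.append(math_ability + random.gauss(0, 0.5))
    literature_a.append(reading_ability + random.gauss(0, 0.5))
    literature_b.append(reading_ability + random.gauss(0, 0.5))

within_math = pearson(geometry_a, geometry_b)
within_reading = pearson(literature_a, literature_b)
across = pearson(geometry_a, literature_a)

print(f"geometry vs. geometry:     {within_math:.2f}")
print(f"literature vs. literature: {within_reading:.2f}")
print(f"geometry vs. literature:   {across:.2f}")
```

In the emoticon setting, the observed variables are the emoticons and the factors are the groupings of emoticons that rise and fall together across words.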

  31. Factor plots • 1. Negative vs. smile • 2. “Extra-expressives”: D: XD :O vs. :) • 3. Right-facing+: (: (; ;D • 4. Noses: :-D ;-) :-)

  32. Top 4 factors • Factor 1: :( :-( :/ :-/ =( vs. :) • disappointed, expired, allergies, migraine, grrrrr • DOES NOT GO WITH…mwah, terrific, kk, thankyou, notorious • Factor 2: D: XD :O vs. :) • facepalm, jizz, shitting, mexicans, omfg, (a lot of Spanish) • DOES NOT GO WITH…imy, yayyy, thankss, ughhhhh, sry • Factor 3: (: (; ;D • ithink, swagg, yur, idgaf, kickback, wassup, cutee • DOES NOT GO WITH…jajaja, iya, wicked, wkwk, odd, brainstorm • Factor 4: :-D ;-) :-) • twitterville, hubby’s, hee, pmsl, 4get, w00t • DOES NOT GO WITH…hahaa, ooc, heyy, fever, cus, nooo

  33. What patterns emerge? • Happy and sad are different • Thank goodness this is recovered • Noses and no-noses are different • Consistent in the hierarchical clustering and factor analysis • There may also be a “right-facing” dialect • The factor analysis shows this most clearly • There may also be an “equal-eyes” dialect • The hierarchical cluster analysis shows this most clearly • Why are D: (worry) and XD (laughing face) clustering together? • Across all analyses here • And sometimes with :O • Aren’t tongues and winks different? • They don’t seem to be here

  34. But surely :) and :-) mean the same thing??? • Well, sort of. • Different types of people use them • Which is to say “people who use :) use a different vocabulary than people who use :-)” • Each emoticon’s meaning is how it is used AND who it is used by • One way of getting at their emotional meaning is to stop looking at collocations with all words and start looking at collocations with words we know to be emotional (angry, happy, sadness, frighten) • I gather 13 different lists of “emotion terms”: 10,592 unique terms • I restrict myself to the 432 words that are on 3 or more lists
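The filtering step at the end of this slide (keep only words that appear on 3 or more lists) might look like the sketch below. The four miniature lists are invented for illustration; the talk used 13 real lists that are not reproduced here.

```python
from collections import Counter

# Hypothetical miniature "emotion term" lists; the talk used 13 real lists.
emotion_lists = [
    {"angry", "happy", "sad", "afraid"},
    {"happy", "sad", "joyful"},
    {"angry", "happy", "sadness", "frighten"},
    {"happy", "angry", "sad", "bliss"},
]

MIN_LISTS = 3  # threshold from the slide: keep words on 3 or more lists

# Each list is a set, so a word counts at most once per list.
list_counts = Counter(word for terms in emotion_lists for word in terms)
kept = sorted(word for word, n in list_counts.items() if n >= MIN_LISTS)

print(kept)
```

Requiring agreement across several lists trades coverage for confidence that the surviving words really are emotion terms.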

  35. XD now in a better spot • :O still playful • (EqEyes may be a bit set apart) • With or without noses, similar affective meaning • Tongues not quite the same as winks • Although not for neg

  36. Factor plots, emotion terms only • 1. “Elaborate sad” vs. “elaborate happy” • 2. :( and :/ vs. :) • 3. Tongues • 4. D:, :-/, :O vs. =)

  37. Just emotion terms: Top Factors • Factor 1: ): =( =/ :-( vs. :D and :-) • upset, depressed, poor, sleepy, sore, hurt • DOES NOT GO WITH…nervous, worried, awful, confused, terrible • Factor 2: :( and :/ vs. :) • heartbroken, devastated, tragic, nauseous • DOES NOT GO WITH…excited, gratitude, dumb, bliss, confused, hate • Factor 3: =P :-P ;P :P • silly, lame, lazy, blah, blame, fantastic, crazy • DOES NOT GO WITH…sad, great, hope, welcome, beautiful, love, happy • Factor 4: D: :-/ :O vs. =) • nervous, ugly, attack, worried, awkward, awful, scared, ire • DOES NOT GO WITH…upset, lonely, disappointed, blah, nag, depressed, mad, blame

  38. Redux • Happy and sad are still different • Phew • Noses and no-noses are AFFECTIVELY similar • Right-facing and equal-eye emoticons also seem to be affectively similar to their left-facing/normal-eye counterparts • D: now patterns with negative stuff and away from XD • :O is playful in the hierarchical cluster analysis, more grave in the factor analysis • Tongues and winks are distinct from each other affectively

  39. So what’s the difference between… • We could ask a lot of questions, but I’ll restrict myself to… • Noses and no-noses • Winks and tongues

  40. No noses and noses

  41. Full Disclosure (Fake nose)

  42. ~14% of people vary THEIR nose use • 58,367 users with 10+ emoticon uses

  43. Functional Reduction? • Length of tweet • Frequency of emoticon use

  44. Save a character? • There’s a 140-character limit on Twitter • Do noses get tossed out to make room? • Actually, no • 30,000 random :) tweets and 30,000 random :-) tweets • People who use noses are writing MORE not less • Another way to say this is that people who leave off the noses are shortening other things, too • (Chart: average number of characters) • Sig p=2.96e-21 (by t-test)
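A comparison of mean tweet length between the two groups could be sketched as follows. The lengths are made up for illustration, and Welch's unequal-variance form of the t statistic is my assumption; the slide only says "t-test". With 30,000 tweets per group, as in the talk, even a modest difference in means produces a very small p-value.

```python
from math import sqrt
from statistics import mean, variance

# Made-up tweet lengths in characters, for illustration only.
nose_lengths = [82, 95, 101, 77, 110, 90, 88, 104]   # tweets containing :-)
no_nose_lengths = [60, 72, 55, 81, 66, 70, 58, 75]   # tweets containing :)


def welch_t(sample_a, sample_b):
    """Welch's t statistic for a difference in means (unequal variances)."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


t_stat = welch_t(nose_lengths, no_nose_lengths)

print(f"mean length with nose:    {mean(nose_lengths):.1f}")
print(f"mean length without nose: {mean(no_nose_lengths):.1f}")
print(f"Welch t statistic:        {t_stat:.2f}")
```

A positive t statistic here means the nose group writes longer tweets on average, which is the direction the slide reports.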

  45. frequent use = shorter • People who use emoticons a lot don’t use noses as much • Sig p=1.623e-28 (by t-test)
