MaxEnt is useful here too

• We last saw MaxEnt in the NLTK default tagger • What is it doing?

Logistic regression • Very common machine learning technique • Assign a positive or negative value (a weight) to every feature • Add up the values for the features that are present • The logistic function (the inverse of the logit) turns that sum into a probability • Learn the best values from training data

Example: movie reviews • funny = +1 • disappointed = -2 • seagal = -3
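
The movie review example can be sketched in a few lines of Python. This is only an illustration of the scoring step, not a trained model: the three weights come from the slide, while the bias of 0, the whitespace tokenizer, and the function names are assumptions made for the sketch.

```python
import math

# Weights taken from the slide example. The bias term is an assumption;
# a trained model would learn it along with the weights.
WEIGHTS = {"funny": 1.0, "disappointed": -2.0, "seagal": -3.0}
BIAS = 0.0


def logistic(z):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def positive_probability(text):
    """Estimate the probability that a review is positive.

    Adds up the weights of the features present in the text,
    then passes the total through the logistic function.
    """
    # Crude tokenizer (assumption): lowercase and split on whitespace.
    tokens = set(text.lower().split())
    score = BIAS + sum(w for feat, w in WEIGHTS.items() if feat in tokens)
    return logistic(score)


if __name__ == "__main__":
    for review in ["a funny movie", "funny but disappointed", "seagal again"]:
        print(review, "->", round(positive_probability(review), 3))
```

A review containing only "funny" scores +1, which the logistic function maps to roughly 0.73. A review with none of the known features scores 0 and comes out at exactly 0.5. Learning replaces the hand-picked weights with ones fitted to labeled reviews.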

High-level overview of MaxEnt • Now you have something more complicated than a yes/no question: many possible outcomes, such as tags • You’re learning a probability distribution over those outcomes instead of a single probability • The best distribution is the one that matches what you have observed and is otherwise maximally uninformative (maximum entropy) about things you don’t know • Do things you’ve never observed happen 0% of the time? No, that’s assuming information you don’t have.
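
To make "maximally uninformative" concrete, here is a small sketch that compares the entropy of two distributions over four outcomes. The tag names and probabilities are invented for illustration. A real MaxEnt model maximizes entropy subject to constraints that match the observed data, which this sketch does not attempt.

```python
import math


def entropy(dist):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Hypothetical distributions over four part-of-speech tags.
# Uniform: claims no knowledge beyond "there are four outcomes".
uniform = {"NN": 0.25, "VB": 0.25, "JJ": 0.25, "RB": 0.25}

# Skewed: says JJ and RB never happen, for example because they were
# never observed in a small sample.
skewed = {"NN": 0.7, "VB": 0.3, "JJ": 0.0, "RB": 0.0}

if __name__ == "__main__":
    print("uniform:", entropy(uniform))
    print("skewed: ", entropy(skewed))
```

The uniform distribution has the highest entropy possible for four outcomes (2 bits). The skewed one has lower entropy because it makes strong claims, including that two outcomes are impossible. With no evidence for those claims, MaxEnt prefers the distribution that does not make them.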
