
Named Entity Recognition in Query




Presentation Transcript


  1. Jiafeng Guo, Gu Xu, Xueqi Cheng, Hang Li Named Entity Recognition in Query SIGIR 2009 Presentation by Gonçalo Simões Course: Recuperação de Informação (Information Retrieval)

  2. Outline • Basic Concepts • Named Entity Recognition in Query • Conclusions

  3. Outline • Basic Concepts • Information Extraction • Named Entity Recognition • Named Entity Recognition in Query • Conclusions

  4. Information Extraction • Information Extraction (IE) proposes techniques to extract relevant information from non-structured or semi-structured texts • Extracted information is transformed so that it can be represented in a fixed format

  5. Named Entity Recognition • Named Entity Recognition (NER) is an IE task that seeks to locate and classify text segments into predefined classes (e.g., Person, Location, Time expression)

  6. Named Entity Recognition CENTER FOR INNOVATION IN LEARNING (CIL) EDUCATION SEMINAR SERIES Joe Mertz & Brian McKenzie Center for Innovation in Learning, CMU ANNOUNCEMENT: We are proud to announce that this Friday, February 17, we will have two sessions in our Education Seminar. At 12:30pm, at the Student Center Room 207, Joe Mertz will present "Using a Cognitive Architecture to Design Instructions". His session ends at 1pm. After a small lunch break, at 14:00, we meet again at Student Center Room 208, where Brian McKenzie will start his presentation. He will present "Information Extraction: how to automatically learn new models". This session ends around 15h. We hope to see you in these sessions. Please direct questions to Pamela Yocca at 268-7675.

  7. Named Entity Recognition Classes/entities: Person, Location, Temporal Expression CENTER FOR INNOVATION IN LEARNING (CIL) EDUCATION SEMINAR SERIES Joe Mertz & Brian McKenzie Center for Innovation in Learning, CMU ANNOUNCEMENT: We are proud to announce that this Friday, February 17, we will have two sessions in our Education Seminar. At 12:30pm, at the Student Center Room 207, Joe Mertz will present "Using a Cognitive Architecture to Design Instructions". His session ends at 1pm. After a small lunch break, at 14:00, we meet again at Student Center Room 208, where Brian McKenzie will start his presentation. He will present "Information Extraction: how to automatically learn new models". This session ends around 15h. We hope to see you in these sessions. Please direct questions to Pamela Yocca at 268-7675.
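The kind of annotation shown in the example above can be mimicked with a toy pattern-based tagger. This is only an illustrative sketch using the names and times from the example text; real NER systems learn statistical sequence models rather than hand-written patterns.

```python
import re

# Hand-written patterns for the seminar announcement above.
# Illustration only: real NER uses learned sequence models.
PATTERNS = {
    "Person": r"Joe Mertz|Brian McKenzie|Pamela Yocca",
    "Temporal Expression": r"\d{1,2}:\d{2}\s?[ap]m|February \d{1,2}",
}

def tag(text):
    """Return (span, class) pairs for every pattern match in text."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            spans.append((match.group(0), label))
    return spans

print(tag("At 12:30pm, Joe Mertz will present his session."))
```

Running the tagger on one sentence of the announcement yields the person and time spans, mirroring the manual annotation on the slide.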

  8. NER in IR • NER has been used for some IR tasks • Example: NER + Coreference resolution When Mozart first arrived in Vienna, he’d get up at 6am, settle into composing at his desk by 7, working until 9 or 10 after which he’d make the round of his pupils, taking a break for lunch at 1pm. If there’s no concert, he might get back to work by 5 or 6pm, working until 9pm. He might go out and socialize for a few hours and then come back to work another hour or two before going to bed around 1am. Amadeus preferred getting seven hours of sleep but often made do with five or six ...

  9. NER in IR • NER has been used for some IR tasks • Example: NER + Coreference resolution • Instead of using a bag of words, exploit the fact that the highlighted entities correspond to the same real-world entity When Mozart first arrived in Vienna, he’d get up at 6am, settle into composing at his desk by 7, working until 9 or 10 after which he’d make the round of his pupils, taking a break for lunch at 1pm. If there’s no concert, he might get back to work by 5 or 6pm, working until 9pm. He might go out and socialize for a few hours and then come back to work another hour or two before going to bed around 1am. Amadeus preferred getting seven hours of sleep but often made do with five or six ...

  10. Outline • Basic Concepts • Named Entity Recognition in Query • Introduction • NERQ Problem • Notation • Probabilistic Approach • Probability Estimation • WS-LDA Algorithm • Training Process • Experimental Results • Conclusions

  11. Introduction • 71% of the queries in search engines contain named entities • These named entities can be useful for processing the query

  12. Introduction • Motivating Examples • Consider the query “harry potter walkthrough” • The context of the query strongly indicates that the named entity “harry potter” is a “Game” • Consider the query “harry potter cast” • The context of the query strongly indicates that the named entity “harry potter” is a “Movie”

  13. Introduction • Identifying named entities can be very useful. Consider the following examples related to the query “harry potter walkthrough”: • Ranking: Documents about videogames should be pushed up in the rankings (AltaVista search) • Suggestion: Relevant suggestions can be generated, like “harry potter cheats” or “lord of the rings walkthrough”

  14. NERQ Problem • Named Entity Recognition in Query (NERQ) is the task of detecting the named entities within a query and categorizing them into predefined classes • Previous work in this area focused on query log mining, not on query processing

  15. NERQ Problem • NER vs NERQ • The techniques used in NER are designed for natural language texts • They do not perform well on queries because: • queries only have 2-3 words on average • queries are not well formed (e.g., all letters are typically lowercase)

  16. Notation • A single-named-entity query q can be represented as a triple (e,t,c) • e denotes a named entity • t denotes the context • A context is expressed as α#β, where α and β denote the left and right context, respectively, and # denotes a placeholder for the named entity • c denotes the class of e • Example • “harry potter walkthrough” is associated with the triple (“harry potter”, “# walkthrough”, “Game”)
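Splitting a query into an entity and its α#β context can be sketched in code. The function below (an illustrative sketch, not the authors' implementation) enumerates every contiguous span of a query as a candidate entity, with the remaining words forming the context:

```python
def candidate_pairs(query):
    """Enumerate all (entity, context) pairs for a query.

    Every contiguous span of terms is a candidate named entity e;
    the remaining terms form the context alpha#beta, where '#'
    marks the entity's position. Sketch only.
    """
    words = query.split()
    pairs = []
    for i in range(len(words)):
        for j in range(i + 1, len(words) + 1):
            entity = " ".join(words[i:j])
            alpha = " ".join(words[:i])
            beta = " ".join(words[j:])
            context = f"{alpha} # {beta}".strip()
            pairs.append((entity, context))
    return pairs

print(candidate_pairs("harry potter walkthrough"))
```

For “harry potter walkthrough” this produces, among others, the pair (“harry potter”, “# walkthrough”) used in the slide’s example.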

  17. Probabilistic Approach • The goal of NERQ is to detect the named entity e in query q and assign the most likely class c to e • Goal: Find (e,t,c)* such that: (e,t,c)* = argmax(e,t,c) P(q,e,t,c)

  18. Probabilistic Approach • The goal of NERQ is to detect the named entity e in query q and assign the most likely class c to e • Goal: Find (e,t,c)* such that: (e,t,c)* = argmax(e,t,c) P(q | e,t,c) P(e,t,c)

  19. Probabilistic Approach • The goal of NERQ is to detect the named entity e in query q and assign the most likely class c to e • Goal: Find (e,t,c)* such that: (e,t,c)* = argmax(e,t,c) ∈ G(q) P(e,t,c)

  20. Probabilistic Approach • For each triple (e,t,c) ∈ G(q), we only need to compute P(e,t,c) P(e,t,c) = P(t,c | e) P(e)

  21. Probabilistic Approach • For each triple (e,t,c) ∈ G(q), we only need to compute P(e,t,c) P(e,t,c) = P(t | c,e) P(c | e) P(e)

  22. Probabilistic Approach • For each triple (e,t,c) ∈ G(q), we only need to compute P(e,t,c) P(e,t,c) = P(t | c) P(c | e) P(e) (assuming the context t depends only on the class c, not on the entity e itself) • How to estimate these probabilities?
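The final factorization turns into a simple scoring routine over candidate triples. The probability tables below are made-up illustrative numbers; in the paper they would come from WS-LDA training and query-log statistics.

```python
# Hypothetical probability tables; in the paper these come from
# WS-LDA training and query-log frequencies, not hand-set values.
P_T_GIVEN_C = {("# walkthrough", "Game"): 0.05, ("# cast", "Movie"): 0.04}
P_C_GIVEN_E = {("harry potter", "Game"): 0.3, ("harry potter", "Movie"): 0.5}
P_E = {"harry potter": 1e-4}

def best_triple(candidates, classes):
    """Pick the argmax over (e, t, c) of P(t|c) * P(c|e) * P(e)."""
    best, best_score = None, 0.0
    for entity, context in candidates:
        for cls in classes:
            score = (P_T_GIVEN_C.get((context, cls), 0.0)
                     * P_C_GIVEN_E.get((entity, cls), 0.0)
                     * P_E.get(entity, 0.0))
            if score > best_score:
                best, best_score = (entity, context, cls), score
    return best

print(best_triple([("harry potter", "# walkthrough")], ["Game", "Movie"]))
```

With these toy numbers, the “# walkthrough” context pulls “harry potter” toward the “Game” class, matching the motivating example in the slides.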

  23. Probability Estimation • P(t | c), P(c | e) and P(e) can be estimated through training • The input for the training process is: • Set of seed named entities with the respective classes • Query log

  24. Probability Estimation • Consider the existence of a training data set with N triples from labeled queries T = {(ei,ti,ci) | i=1,…,N} • With this training data set, the learning problem can be formalized as maximizing the log-likelihood of the training data: Σi log P(ei,ti,ci)

  25. Probability Estimation • Building the training corpus for full queries would be difficult and time-consuming when each named entity can belong to several classes • A solution is to collect training data as T = {(ei,ti) | i=1,…,N} together with the list of possible classes for each named entity in training • With this training data set, the learning problem can be formalized as maximizing the marginal log-likelihood: Σi log Σc P(ti | c) P(c | ei) P(ei)


  27. Probability Estimation • P(t | c) and P(c | e) can be predicted using a Topic Model • There is a relationship between Topic Model and NERQ notions • Without loss of generality, the authors decided to use a variation of LDA called WS-LDA

  28. WS-LDA Algorithm • Purely unsupervised topic model learning would not work in NERQ • WS-LDA introduces weak supervision into training by using a set of named entity seeds • It is assumed that a named entity has high probabilities on its labeled classes and very low probabilities on unlabeled classes

  29. WS-LDA Algorithm • Objective function for each named entity: O(e | y, Θ) = log P(w | Θ) + λ C(y, Θ) • y, binary vector that assigns an entity to its respective classes • Θ = {α, β}, parameters of the Dirichlet and Multinomial distributions used in the process • λ, coefficient given by the user that indicates the weight of the supervision constraints • C(y, Θ), constraint function
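The shape of the objective can be sketched numerically. The constraint form below is a simplified stand-in that rewards probability mass on the entity's labeled classes; see the paper for the exact definition of C(y, Θ).

```python
def ws_lda_objective(log_likelihood, class_probs, labeled_classes, lam):
    """Sketch of the weakly supervised objective:
        O = log P(w | Theta) + lambda * C(y, Theta)
    Here the constraint simply sums the probability mass that the
    entity's class distribution places on its labeled classes
    (a simplified stand-in for the paper's constraint function).
    """
    constraint = sum(p for c, p in class_probs.items() if c in labeled_classes)
    return log_likelihood + lam * constraint

# An entity labeled only as "Game": mass on "Game" raises the objective.
print(ws_lda_objective(-10.0, {"Game": 0.7, "Movie": 0.2, "Book": 0.1},
                       {"Game"}, 2.0))
```

Larger λ pushes the optimizer harder toward distributions that agree with the seed labels, which is the trade-off examined in the "Supervision in WS-LDA" experiment later in the deck.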

  30. Training Process • The training process is divided into two steps: • Find queries of the query log containing the named entity seeds • Generate the contexts associated with the named entity seeds in the queries • Generate the query training data (ei,ti) to train the WS-LDA topic model • Use the topic model to learn P(t | c) • Scan the query log with the previously generated contexts to extract new named entities • Use the topic model to learn P(c | e) for each new entity • Estimate P(e) from the frequency of e in the query log
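The first and last of these steps (collecting seed contexts from the log, and estimating P(e) from frequencies) can be sketched as follows; the function and variable names are illustrative, not from the paper.

```python
from collections import Counter

def extract_contexts(query_log, seeds):
    """For every query containing a seed entity, replace the entity
    with '#' to obtain its context alpha#beta. Sketch only."""
    pairs = []
    for query in query_log:
        for entity in seeds:
            if entity in query:
                alpha, _, beta = query.partition(entity)
                context = f"{alpha.strip()} # {beta.strip()}".strip()
                pairs.append((entity, context))
    return pairs

def estimate_p_e(query_log, entities):
    """Estimate P(e) from the relative frequency of each entity."""
    counts = Counter(e for q in query_log for e in entities if e in q)
    total = sum(counts.values())
    return {e: counts[e] / total for e in entities}

log = ["harry potter walkthrough", "harry potter cast", "mario cheats"]
print(extract_contexts(log, ["harry potter", "mario"]))
print(estimate_p_e(log, ["harry potter", "mario"]))
```

On this three-query toy log, “harry potter” yields the contexts “# walkthrough” and “# cast”, and P(“harry potter”) comes out as 2/3 of the seed occurrences.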

  31. Outline • Basic Concepts • Named Entity Recognition in Query • Experimental Results • Data Set • NERQ by WS-LDA • WS-LDA vs Baselines • Supervision in WS-LDA • Conclusions

  32. Data Set • 6 billion queries • Four semantic classes: “Movie”, “Game”, “Book” and “Music” • 180 seed named entities from Amazon, Gamespot and Lyrics, annotated by four human annotators • 120 named entities for training • 60 named entities for testing

  33. Data Set • After training a WS-LDA model with the 120 seed named entities: • 432,304 contexts • About 1.5 million named entities

  34. NERQ by WS-LDA • NERQ conducted on queries from a separate query log with about 12 million queries • 140,000 recognition results • Evaluation with 400 randomly sampled queries

  35. NERQ by WS-LDA • Three types of errors: • Inaccurate estimation of P(e) • Uncommon contexts that were not learned • Queries containing named entities outside the predefined classes

  36. WS-LDA vs baselines • Comparison between WS-LDA and two other approaches: • A deterministic approach that learns the contexts of a class by aggregating all the contexts of named entities of the class • Latent Dirichlet Allocation

  37. WS-LDA vs baselines • Modeling Contexts of classes

  38. WS-LDA vs baselines • Modeling Contexts of classes

  39. WS-LDA vs baselines • Class prediction

  40. WS-LDA vs baselines • Convergence speed

  41. Supervision in WS-LDA • How can λ affect the performance of WS-LDA?

  42. Outline • Basic Concepts • Named Entity Recognition in Query • Experimental Results • Conclusions

  43. Conclusions • NERQ is potentially useful in many search applications • This paper is a first approach to NERQ and proposes a probabilistic approach to perform this task • WS-LDA is presented as an alternative to LDA • Experimental results indicate that the proposed approach can accurately perform NERQ
