Information Extraction From Automobile Advertisements
Nipun Bhatia, Rakshit Kumar, Shashank Senapaty
Problem Definition
• Craigslist offers only rudimentary keyword search.
• Keyword search is not a natural way to search for cars.
• It is difficult to efficiently find ads with particular attributes.
• Goal: structured search over attributes.
• Attributes: Make, Model, Price, Year, Mileage, Transmission, PostedBy, Location, Contact
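To make the goal of structured search concrete, here is a minimal Python sketch of what querying extracted attributes could look like. The `CarAd` record mirrors the attribute list above; the `search` helper, its parameters, and the sample ads are hypothetical illustrations, not part of the original project or dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CarAd:
    """One ad, with the attributes listed on the slide. Any field may be missing."""
    make: Optional[str] = None
    model: Optional[str] = None
    price: Optional[int] = None
    year: Optional[int] = None
    mileage: Optional[int] = None
    transmission: Optional[str] = None
    posted_by: Optional[str] = None
    location: Optional[str] = None
    contact: Optional[str] = None


def search(ads: List[CarAd],
           make: Optional[str] = None,
           max_price: Optional[int] = None,
           min_year: Optional[int] = None) -> List[CarAd]:
    """Hypothetical structured query: keep ads that satisfy every given filter."""
    results = []
    for ad in ads:
        if make is not None and (ad.make or "").lower() != make.lower():
            continue
        # An ad with an unknown price or year cannot satisfy a numeric filter.
        if max_price is not None and (ad.price is None or ad.price > max_price):
            continue
        if min_year is not None and (ad.year is None or ad.year < min_year):
            continue
        results.append(ad)
    return results


# Made-up sample records, for illustration only.
ads = [
    CarAd(make="Honda", model="Civic", price=5500, year=2004),
    CarAd(make="Toyota", model="Camry", price=9000, year=2008),
    CarAd(make="Honda", model="Accord", price=12000, year=2010),
]
matches = search(ads, make="honda", max_price=6000)
print([ad.model for ad in matches])  # prints ['Civic']
```

A query such as "Hondas under $6000" is awkward to express with keywords alone, but becomes a simple filter once each ad is reduced to fields.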
Dataset & Issues
• 350 postings from the cars & trucks section of Craigslist.
• Manually annotated with the attributes.
Feature Selection
• Features:
• Title: isPresentLexicon, hasDollar_hasDigit, hasParanthesis, hasDigit, hasApostrophe_hasDigit, PrevLabel, Word
• Body: isPresentTrLexicon, isPresentOwLexicon, hasDigit_hasDash, hasDigit_hasDot, hasDigit_hasParanthesis, Word_Representation, Neighbor
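The title features can be sketched as a per-token feature function. The feature names below come from the slide, but their exact definitions are not given there, so each test is an assumption based on the name alone. `CAR_LEXICON` and `title_token_features` are hypothetical names, and the lexicon contents are placeholders.

```python
# Hypothetical lexicon of makes and models; the slides do not list its contents.
CAR_LEXICON = {"honda", "civic", "toyota", "camry", "ford"}


def title_token_features(tokens, i, prev_label):
    """Build a feature dict for tokens[i] of an ad title.

    Assumed readings of the slide's feature names:
    - isPresentLexicon: token (punctuation stripped) is in the car lexicon
    - hasDollar_hasDigit: token contains both '$' and a digit (price cue)
    - hasParanthesis: token contains '(' or ')' (spelling kept from the slide)
    - hasApostrophe_hasDigit: token contains both an apostrophe and a digit,
      as in a two-digit year like '04
    - PrevLabel: label assigned to the previous token
    """
    word = tokens[i]
    has_digit = any(ch.isdigit() for ch in word)
    return {
        "isPresentLexicon": word.lower().strip("()$',.-") in CAR_LEXICON,
        "hasDollar_hasDigit": "$" in word and has_digit,
        "hasParanthesis": "(" in word or ")" in word,
        "hasDigit": has_digit,
        "hasApostrophe_hasDigit": "'" in word and has_digit,
        "PrevLabel": prev_label,
        "Word": word.lower(),
    }


# Made-up title, for illustration only.
tokens = "'04 Honda Civic - $5500 (palo alto)".split()
price_features = title_token_features(tokens, 4, "O")
print(price_features["hasDollar_hasDigit"])  # prints True
```

Combining two surface cues into one feature (for example, a dollar sign together with a digit) lets a classifier separate prices from other numbers such as years or mileage. The body features would presumably follow the same pattern with their own lexicons.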
Results
• [Figure: Body Classifier]
• [Figure: Title Classifier]