
Automatic Acquisition of Fuzzy Footprints


Presentation Transcript


  1. Automatic Acquisition of Fuzzy Footprints • Steven Schockaert, Martine De Cock, Etienne E. Kerre

  2. Introduction • Constructing fuzzy footprints • Experimental results Workshop on SEmantic Based Geographic Information Systems

  3. Geographical Question Answering (WWW) • Question: Give a list of Italian restaurants in the neighborhood of Agia Napa. • Answers: La Strada Italian Restaurant, Bosko’s ristorante, …

  4. Geographic Question Answering
     • Resources
       - Linguistic resources for question analysis, answer extraction, …
       - A traditional search engine to locate relevant documents
       - Geographic background knowledge
     • Footprints provided by gazetteers are often inadequate
       - We need a more fine-grained representation than a bounding box
       - Questions may involve vague regions such as the Alps, the Highlands, …
     • Our solution: construct footprints automatically
       - Use the web to collect relevant information
       - Use a digital gazetteer to map location names to co-ordinates
       - Use fuzzy sets to represent footprints

  5. Fuzzy Sets
     • A fuzzy set A in a universe U is a mapping from U to [0,1] (Zadeh, 1965)
       - u belongs to A ⇔ A(u) = 1
       - u doesn’t belong to A ⇔ A(u) = 0
       - u more or less belongs to A ⇔ 0 < A(u) < 1
     • [Figure: example fuzzy set “Old”]
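
As a minimal sketch of this definition, the fuzzy set “Old” can be written as a function from ages to [0,1]. The breakpoints 50 and 70 and the linear transition are assumptions made for illustration; the slide does not give them.

```python
def old(age: float, lower: float = 50.0, upper: float = 70.0) -> float:
    """Membership degree of `age` in the fuzzy set "Old".

    `lower` and `upper` are hypothetical breakpoints: below `lower` the
    degree is 0, above `upper` it is 1, with a linear ramp in between.
    """
    if age <= lower:
        return 0.0
    if age >= upper:
        return 1.0
    return (age - lower) / (upper - lower)
```

With these assumed breakpoints, an age of 60 belongs to “Old” to degree 0.5, which is the "more or less belongs" case.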

  6. Fuzzy Footprints • We represent footprints as fuzzy sets in the universe of co-ordinates • [Figure: fuzzy footprint for “South of France”]

  7. Introduction • Constructing fuzzy footprints • Experimental results

  8. Obtaining relevant locations
     • Region name: the Ardeche region
     • Web search patterns:
       - Located in the north of the Ardeche region, <city>
       - (<city>,)* and other cities in the Ardeche region
       - <city> is situated in the heart of the Ardeche region
       - …
     • Candidate locations: St-Félicien, Lamastre, St-Agrève, …
     • Co-ordinates: ADL gazetteer
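
The three patterns on this slide can be approximated with regular expressions. The sketch below is an illustration only: the `NAME` expression (capitalised words, optionally hyphenated) and the function name `candidate_cities` are assumptions, and the slide does not say how the authors actually recognise city names.

```python
import re

# Hypothetical approximation of a place name: capitalised tokens such as
# "Lamastre" or "St-Félicien", possibly several in a row.
NAME = r"[A-Z][\w-]+(?:[ -][A-Z][\w-]+)*"

def candidate_cities(text: str, region: str) -> list[str]:
    """Collect city names that match the slide's three patterns for `region`."""
    r = re.escape(region)
    found = []
    # "Located in the north of <region>, <city>"
    for m in re.finditer(rf"Located in the north of {r}, ({NAME})", text):
        found.append(m.group(1))
    # "<city> is situated in the heart of <region>"
    for m in re.finditer(rf"({NAME}) is situated in the heart of {r}", text):
        found.append(m.group(1))
    # "(<city>,)* <city> and other cities in <region>"
    for m in re.finditer(rf"((?:{NAME}, )*{NAME}) and other cities in {r}", text):
        found.extend(m.group(1).split(", "))
    return found
```

Duplicates are kept on purpose in this sketch, since how often a name is found may itself be useful evidence.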

  9. Obtaining relevant locations
     • Disambiguation of location names based on
       - the country the region is located in
       - the distance to the other locations

  10. Constructing a footprint
      • Existing approaches
        - Use the convex hull of the locations
          → web data is too noisy
          → not suitable for vague regions
        - Use the density of the locations (Purves et al., 2005)
          → reflects popularity rather than the extent of a region
      • Our solution: search for additional constraints to filter out noise

  11. Constructing a footprint • x is in the north of the Ardeche region

  12. Constructing a footprint • x is in the north of the Ardeche region • [Figure: map areas labelled “inconsistent”, “???” and “consistent”]

  13. Modelling constraints • x is located in the north of the Ardeche • [Figure: map areas labelled “Inconsistent”, “Gradual transition” and “Consistent”]

  14. Modelling constraints • x is located in the north of the Ardeche • Gradual transition: based on the average difference in y co-ordinates • [Figure: map areas labelled “Inconsistent”, “Gradual transition” and “Consistent”]
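
A constraint of this kind can be sketched as a membership function over candidate points. The reading assumed here is that if x lies in the north of the region, candidate points far to the north of x are unlikely to belong to the region. The function names, the pairwise definition of the average y difference, and the linear transition are all assumptions; the slide only states that the transition is based on the average difference in y co-ordinates.

```python
from statistics import mean

def avg_y_difference(ys: list[float]) -> float:
    """Average absolute difference in y over all pairs of candidate points."""
    diffs = [abs(a - b) for i, a in enumerate(ys) for b in ys[i + 1:]]
    return mean(diffs) if diffs else 0.0

def consistent_with_north(p_y: float, x_y: float, scale: float) -> float:
    """Degree to which a candidate at latitude p_y is consistent with
    "x is in the north of the region", where x lies at latitude x_y.

    Candidates south of x are fully consistent; the degree drops linearly
    to 0 once the candidate lies `scale` or more to the north of x.
    """
    overshoot = p_y - x_y
    if overshoot <= 0:
        return 1.0
    if scale <= 0 or overshoot >= scale:
        return 0.0
    return 1.0 - overshoot / scale
```

In use, `scale` would be derived from the candidate points, for example `avg_y_difference(ys)`, so that the width of the transition adapts to the size of the region.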

  15. Modelling constraints
      • In a similar way:
        - x is located in the south of the Ardeche
        - x is located in the west of the Ardeche
        - x is located in the east of the Ardeche
        - x is located in the north-west of the Ardeche ⇔ x is located in the north of the Ardeche and x is located in the west of the Ardeche
        - x is located in the heart of the Ardeche
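
If a compound direction such as north-west is read as the conjunction of north and west, its degree can be computed from the two component degrees. The sketch below uses the minimum, a common choice for fuzzy conjunction; the slides do not say which operator the authors use, so this is an assumption, as are the function names.

```python
from typing import Iterable

def north_west(north: float, west: float) -> float:
    """Degree for "north-west" as the fuzzy conjunction (minimum) of two degrees."""
    return min(north, west)

def satisfaction(degrees: Iterable[float]) -> float:
    """Degree to which a point satisfies all constraints; 1.0 if there are none."""
    return min(degrees, default=1.0)
```

Under this choice, a point that satisfies individual constraints to degrees 1, 0.6 and 0.4 satisfies the whole set to degree 0.4.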

  16. Modelling constraints • the Ardeche is located in the south of France • [Figure: map areas labelled “Inconsistent”, “Gradual transition” and “Consistent”]

  17. Modelling constraints • the Ardeche is located in the south of France • Gradual transition: based on the minimal bounding box for France (ADL gazetteer) • [Figure: map areas labelled “Inconsistent”, “Gradual transition” and “Consistent”]
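
A bounding-box constraint of this kind can be sketched by measuring how far north a point lies inside the country's box. The thresholds `full` and `none` below (fully consistent in the southern half, inconsistent in the northern quarter) are hypothetical; the slide only says that the transition is derived from the minimal bounding box.

```python
def south_degree(y: float, y_min: float, y_max: float,
                 full: float = 0.5, none: float = 0.75) -> float:
    """Degree to which latitude y is consistent with "R is in the south of
    the country", given the country's bounding box [y_min, y_max].

    Assumes y_max > y_min. `full` and `none` are assumed fractions of the
    box height measured from the southern edge.
    """
    frac = (y - y_min) / (y_max - y_min)  # 0 at southern edge, 1 at northern edge
    if frac <= full:
        return 1.0
    if frac >= none:
        return 0.0
    return (none - frac) / (none - full)
```

The other directions follow by mirroring the fraction or by applying the same ramp to x co-ordinates.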

  18. Modelling constraints
      • In a similar way:
        - R is located in the north of France
        - R is located in the east of France
        - R is located in the west of France
        - R is located in the north-west of France ⇔ R is located in the north of France and R is located in the west of France
        - R is located in the heart of France

  19. Modelling constraints • Heuristic: points that are too far from the median are likely to be noise • [Figure: map areas labelled “Inconsistent”, “Gradual transition” and “Consistent”]

  20. Modelling constraints • Heuristic: points that are too far from the median are likely to be noise • Gradual transition: based on the average distance to the median • [Figure: map areas labelled “Inconsistent”, “Gradual transition” and “Consistent”]
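
The median heuristic can be sketched as follows. The co-ordinate-wise median, the planar Euclidean distance, and the multipliers `inner` and `outer` (fully consistent within one average distance, inconsistent beyond two) are assumptions for illustration; the slide only says that the transition is based on the average distance to the median.

```python
from math import hypot
from statistics import mean, median

def median_constraint(points: list[tuple[float, float]],
                      inner: float = 1.0, outer: float = 2.0) -> list[float]:
    """Degree to which each point is consistent with "close to the median".

    `inner` and `outer` are hypothetical multiples of the average distance
    to the median that bound the gradual transition.
    """
    cx = median([x for x, _ in points])
    cy = median([y for _, y in points])
    dists = [hypot(x - cx, y - cy) for x, y in points]
    avg = mean(dists)
    degrees = []
    for d in dists:
        if avg == 0 or d <= inner * avg:
            degrees.append(1.0)          # close to the median: consistent
        elif d >= outer * avg:
            degrees.append(0.0)          # far away: treated as noise
        else:
            degrees.append((outer * avg - d) / ((outer - inner) * avg))
    return degrees
```

For four points clustered around (1, 1) and one outlier at (10, 10), the cluster gets degree 1 and the outlier degree 0.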

  21. Example • [Figure: map areas labelled by degree] Constraints satisfied to degree 0 • Constraints satisfied to degree 0.4 • Constraints satisfied to degree 0.6 • Constraints satisfied to degree 1

  22. Example • Constraints satisfied to degree 1

  23. Example • Constraints satisfied to degree 0.6

  24. Example • Constraints satisfied to degree 0.4

  25. Some remarks
      • If the set of constraints is inconsistent (i.e. no point satisfies all constraints), we remove a minimal set of constraints such that:
        - As many constraints as possible are preserved
        - The area of the fuzzy footprint is as high as possible
      • Imposing constraints is used to improve precision, not recall
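
The repair step can be sketched as an exhaustive search over subsets of constraints, which is only practical for a handful of constraints. Several things are assumed here and not stated on the slide: constraints are functions from a point to a degree in [0,1], they are combined with the minimum, a set counts as consistent when some grid point has a degree above 0, and the area is approximated by the sum of degrees over a finite grid.

```python
from itertools import combinations
from typing import Callable, Hashable, Sequence

Constraint = Callable[[Hashable], float]

def footprint(constraints: Sequence[Constraint], grid: Sequence[Hashable]) -> dict:
    """Fuzzy footprint on `grid`: minimum degree over all constraints per point."""
    return {p: min((c(p) for c in constraints), default=1.0) for p in grid}

def repair(constraints: Sequence[Constraint], grid: Sequence[Hashable]) -> list:
    """Keep as many constraints as possible; break ties by the largest area."""
    for size in range(len(constraints), -1, -1):
        best, best_area = None, 0.0
        for subset in combinations(constraints, size):
            area = sum(footprint(subset, grid).values())
            if area > best_area:
                best, best_area = subset, area
        if best is not None:      # some subset of this size is consistent
            return list(best)
    return []
```

The outer loop enforces the first criterion (preserve as many constraints as possible) and the inner comparison enforces the second (largest area).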

  26. Bordering regions • Footprint can be constructed using the ADL gazetteer

  27. Introduction • Constructing fuzzy footprints • Experimental results

  28. Evaluation metric
      • Precision: degree to which the fuzzy footprint F is included in the correct footprint G
      • Recall: degree to which the correct footprint G is included in the fuzzy footprint F
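
The slide does not spell out how the degree of inclusion is computed. One common choice, assumed here, is the cardinality-based measure incl(A, B) = Σ min(A(u), B(u)) / Σ A(u), evaluated over a finite grid of co-ordinates. Footprints are represented as dictionaries from grid cells to degrees, which is also an assumption of this sketch.

```python
def inclusion(a: dict, b: dict) -> float:
    """Degree to which fuzzy set `a` is included in fuzzy set `b`.

    Cells missing from a dictionary are treated as having degree 0.
    An empty `a` is considered fully included.
    """
    total = sum(a.values())
    if total == 0:
        return 1.0
    overlap = sum(min(deg, b.get(u, 0.0)) for u, deg in a.items())
    return overlap / total

def precision(f: dict, g: dict) -> float:
    """Degree to which the constructed footprint F is included in G."""
    return inclusion(f, g)

def recall(f: dict, g: dict) -> float:
    """Degree to which the correct footprint G is included in F."""
    return inclusion(g, f)
```

With crisp sets (all degrees 0 or 1) these reduce to the usual area-based precision and recall.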

  29. Test data
      • 81 political subregions of France, Italy, Canada, Australia and China
      • Divided into three groups:
        - Regions for which we found more than 30 candidate cities
        - Regions for which we found fewer than 10 candidate cities
        - Regions for which we found between 10 and 30 candidate cities
      • Gold standard: convex hull of the locations that are known to lie in the region according to the ADL gazetteer
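
The gold standard is a convex hull of known locations. One standard way to compute such a hull is Andrew's monotone chain algorithm, sketched below. It treats co-ordinates as planar (x, y) pairs, which is a simplification, and it is not necessarily the method the authors used.

```python
Point = tuple[float, float]

def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points: list[Point]) -> list[Point]:
    """Convex hull in counter-clockwise order, without collinear points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: list[Point] = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: list[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

Interior locations do not affect the result, so only the outermost known locations determine the gold-standard footprint.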

  30. Precision • Without bordering regions • With bordering regions

  31. Recall • Without bordering regions • With bordering regions

  32. Conclusions
      • New approach to approximate the footprint of an unknown region
      • Also suitable for vague regions
      • Search for constraints on the web to improve precision
      • Search for bordering regions on the web to improve recall
      • Experimental results confirm this hypothesis
      Thank you for your attention!
