Page 1: Automatic Acquisition of Fuzzy Footprints

Automatic Acquisition of Fuzzy Footprints

Steven Schockaert, Martine De Cock, Etienne E. Kerre

Page 2: Automatic Acquisition of Fuzzy Footprints

Workshop on SEmantic Based Geographic Information Systems

1. Introduction

2. Constructing fuzzy footprints

3. Experimental results

Page 3: Automatic Acquisition of Fuzzy Footprints

Geographical Question Answering

Question: Give a list of Italian restaurants in the neighborhood of Agia Napa.

Answer (found on the WWW): La Strada Italian Restaurant, Bosko’s ristorante, …

Page 4: Automatic Acquisition of Fuzzy Footprints

Geographic Question Answering

• Resources
  – Linguistic resources for question analysis, answer extraction, …
  – A traditional search engine to locate relevant documents
  – Geographic background knowledge

• Footprints provided by gazetteers are often inadequate
  – We need a more fine-grained representation than a bounding box
  – Questions may involve vague regions such as the Alps, the Highlands, …

• Our solution: construct footprints automatically
  – Use the web to collect relevant information
  – Use a digital gazetteer to map location names to co-ordinates
  – Use fuzzy sets to represent footprints

Page 5: Automatic Acquisition of Fuzzy Footprints

Fuzzy Sets

• A fuzzy set A in a universe U is a mapping from U to [0,1] (Zadeh, 1965)
  – u belongs to A ⇔ A(u) = 1
  – u doesn’t belong to A ⇔ A(u) = 0
  – u more or less belongs to A ⇔ 0 < A(u) < 1

[Figure: example membership function for the fuzzy set “Old”]
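A minimal sketch of a fuzzy set as a mapping into [0,1], using the "Old" example. The function name `old` and the breakpoints 50 and 70 are assumptions chosen for illustration; the slides do not give them.

```python
def old(age: float) -> float:
    """Membership degree of `age` in the fuzzy set "Old".

    The breakpoints are hypothetical: ages up to 50 do not belong (0),
    ages from 70 belong fully (1), ages in between belong to some degree.
    """
    young_limit, old_limit = 50.0, 70.0  # assumed values, not from the slides
    if age <= young_limit:
        return 0.0
    if age >= old_limit:
        return 1.0
    # Linear gradual transition between the two breakpoints.
    return (age - young_limit) / (old_limit - young_limit)


for age in (40, 60, 80):
    print(age, old(age))
```

Any shape that stays within [0,1] would do; the linear ramp is only the simplest choice.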

Page 6: Automatic Acquisition of Fuzzy Footprints

Fuzzy Footprints

• We represent footprints as fuzzy sets in the universe of co-ordinates

[Figure: example fuzzy footprint, “South of France”]

Page 7: Automatic Acquisition of Fuzzy Footprints

1. Introduction

2. Constructing fuzzy footprints

3. Experimental results

Page 8: Automatic Acquisition of Fuzzy Footprints

Obtaining relevant locations

Region: the Ardeche region

Search patterns:
- Located in the north of the Ardeche region, <city>
- (<city>,)* and other cities in the Ardeche region
- <city> is situated in the heart of the Ardeche region
- …

Locations found: St-Félicien, Lamastre, St-Agrève, …

Co-ordinates: ADL gazetteer
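A rough sketch of how two of the patterns above could be matched against retrieved text. The `CITY` expression is a crude stand-in for real place-name recognition, and `extract_candidates` and the sample sentences are hypothetical; the list pattern with `(<city>,)*` is left out for brevity.

```python
import re

REGION = "the Ardeche region"
# Assumed approximation of <city>: one or more capitalised tokens,
# possibly hyphenated (e.g. "St-Félicien").
CITY = r"([A-Z][\w'-]+(?:[ -][A-Z][\w'-]+)*)"

PATTERNS = [
    re.compile(rf"Located in the north of {re.escape(REGION)}, {CITY}"),
    re.compile(rf"{CITY} is situated in the heart of {re.escape(REGION)}"),
]


def extract_candidates(text):
    """Return the city names matched by any pattern, in order of discovery."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            name = match.group(1)
            if name not in found:
                found.append(name)
    return found


# Made-up sentences for illustration only.
snippets = (
    "Located in the north of the Ardeche region, Lamastre is a market town. "
    "St-Félicien is situated in the heart of the Ardeche region."
)
print(extract_candidates(snippets))
```

The extracted names would then be looked up in the gazetteer to obtain co-ordinates.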

Page 9: Automatic Acquisition of Fuzzy Footprints

Obtaining relevant locations

• Disambiguation of location names based on
  – the country the region is located in
  – the distance to the other locations
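One way the two disambiguation criteria could be combined, as a sketch only. The slides do not specify the procedure; here the unambiguous names are assumed to fix a centre (component-wise median), and each ambiguous name takes the in-country candidate closest to it. The function `disambiguate`, the data layout and the toy co-ordinates are all hypothetical.

```python
import math
from statistics import median


def disambiguate(name_to_candidates, country):
    """Pick one gazetteer entry per name.

    `name_to_candidates` maps a name to a list of (country, x, y) entries.
    """
    # Criterion 1: keep only entries in the country the region lies in.
    in_country = {}
    for name, entries in name_to_candidates.items():
        points = [(x, y) for c, x, y in entries if c == country]
        if points:
            in_country[name] = points
    if not in_country:
        return {}

    # Criterion 2: distance to the other locations. Unambiguous names
    # serve as anchors; fall back to all candidates if there are none.
    anchors = [pts[0] for pts in in_country.values() if len(pts) == 1]
    if not anchors:
        anchors = [p for pts in in_country.values() for p in pts]
    cx = median(x for x, _ in anchors)
    cy = median(y for _, y in anchors)

    return {
        name: min(pts, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
        for name, pts in in_country.items()
    }


# Toy co-ordinates, not real gazetteer data.
candidates = {
    "town_a": [("FR", 1.0, 1.0)],
    "town_b": [("FR", 2.0, 1.0)],
    "town_c": [("FR", 1.5, 1.5), ("FR", 9.0, 9.0), ("CA", 1.4, 1.4)],
}
resolved = disambiguate(candidates, "FR")
print(resolved)
```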

Page 10: Automatic Acquisition of Fuzzy Footprints

Constructing a footprint

• Existing approaches
  – Use the convex hull of the locations: web data is too noisy; not suitable for vague regions
  – Use the density of the locations (Purves et al., 2005): reflects popularity rather than the extent of a region

• Our solution: search for additional constraints to filter out noise

Page 12: Automatic Acquisition of Fuzzy Footprints

Constructing a footprint

x is in the north of the Ardeche region

[Figure: areas labelled “consistent”, “inconsistent” and “???”]

Page 14: Automatic Acquisition of Fuzzy Footprints

Modelling constraints

x is located in the north of the Ardeche

[Figure: zones labelled “Consistent”, “Gradual transition” and “Inconsistent”]

Based on the average difference in y co-ordinates
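A sketch of one possible reading of this constraint. The exact membership function is not given on the slide, so the shape below is an assumption: a candidate is fully consistent if it lies no further north than x, and its degree drops linearly to 0 over a distance equal to the average pairwise difference in y co-ordinates. The names `average_y_difference` and `consistent_with_north` are hypothetical, and y is taken to increase northwards.

```python
from itertools import combinations


def average_y_difference(points):
    """Mean absolute difference in y over all pairs (needs >= 2 points)."""
    pairs = list(combinations(points, 2))
    return sum(abs(p[1] - q[1]) for p, q in pairs) / len(pairs)


def consistent_with_north(candidate, x, spread):
    """Degree to which `candidate` is consistent with
    "x is located in the north of the region".

    Assumed shape: 1 when the candidate is not north of x, linearly
    decreasing to 0 once it lies `spread` or more to the north of x.
    """
    excess = candidate[1] - x[1]
    if excess <= 0:
        return 1.0            # consistent
    if excess >= spread:
        return 0.0            # inconsistent
    return 1.0 - excess / spread  # gradual transition


# Toy co-ordinates for illustration.
points = [(0.0, 0.0), (1.0, 3.0), (2.0, 6.0)]
spread = average_y_difference(points)
x = (1.0, 3.0)
for candidate in [(0.0, 0.0), (5.0, 5.0), (5.0, 9.0)]:
    print(candidate, consistent_with_north(candidate, x, spread))
```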

Page 15: Automatic Acquisition of Fuzzy Footprints

Modelling constraints

• In a similar way:
  – x is located in the south of the Ardeche
  – x is located in the west of the Ardeche
  – x is located in the east of the Ardeche
  – x is located in the north-west of the Ardeche: x is located in the north of the Ardeche and x is located in the west of the Ardeche
  – x is located in the heart of the Ardeche
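A sketch of how a compound direction such as north-west could be built from the two basic ones. Combining the degrees with the minimum is an assumption (a common choice for fuzzy conjunction); the slides do not say which operator is used. The helper `ramp_down`, the function names and the toy values are hypothetical.

```python
def ramp_down(value, full, zero):
    """1 for value <= full, 0 for value >= zero, linear in between."""
    if value <= full:
        return 1.0
    if value >= zero:
        return 0.0
    return (zero - value) / (zero - full)


def north_of(candidate, x, spread):
    # x is in the north: candidates far to the north of x are inconsistent.
    return ramp_down(candidate[1] - x[1], 0.0, spread)


def west_of(candidate, x, spread):
    # x is in the west: candidates far to the west of x are inconsistent.
    return ramp_down(x[0] - candidate[0], 0.0, spread)


def north_west_of(candidate, x, spread):
    # Assumed conjunction: both constraints must hold, so take the minimum.
    return min(north_of(candidate, x, spread), west_of(candidate, x, spread))


x = (0.0, 0.0)
for candidate in [(1.0, -1.0), (-1.0, 0.5), (-3.0, 0.0)]:
    print(candidate, north_west_of(candidate, x, spread=2.0))
```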

Page 17: Automatic Acquisition of Fuzzy Footprints

Modelling constraints

the Ardeche is located in the south of France

[Figure: zones labelled “Consistent”, “Gradual transition” and “Inconsistent”]

Based on the minimal bounding box for France (ADL gazetteer)
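A sketch of a constraint derived from a bounding box. The fractions are assumptions: a point counts as fully consistent with "in the south" in the lowest quarter of the box and as inconsistent in the upper half. The function name `south_of_country`, its parameters and the toy box are hypothetical.

```python
def south_of_country(y, bbox_south, bbox_north, full=0.25, zero=0.5):
    """Degree to which a point at height `y` is consistent with
    "the region is located in the south of the country".

    `full` and `zero` are assumed fractions of the bounding-box height.
    """
    rel = (y - bbox_south) / (bbox_north - bbox_south)
    if rel <= full:
        return 1.0   # consistent
    if rel >= zero:
        return 0.0   # inconsistent
    return (zero - rel) / (zero - full)  # gradual transition


# Toy bounding box from y = 0 (south edge) to y = 8 (north edge).
for y in (1.0, 3.0, 6.0):
    print(y, south_of_country(y, 0.0, 8.0))
```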

Page 18: Automatic Acquisition of Fuzzy Footprints

Modelling constraints

• In a similar way:
  – R is located in the north of France
  – R is located in the east of France
  – R is located in the west of France
  – R is located in the north-west of France: R is located in the north of France and R is located in the west of France
  – R is located in the heart of France

Page 20: Automatic Acquisition of Fuzzy Footprints

Modelling constraints

Heuristic: points that are too far from the median are likely to be noise

[Figure: zones labelled “Consistent”, “Gradual transition” and “Inconsistent”]

Based on the average distance to the median
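A sketch of the median heuristic under stated assumptions: the median is taken per co-ordinate, points within the average distance to the median are fully consistent, and points beyond twice that distance are inconsistent. The factors 1 and 2, and the name `median_consistency`, are hypothetical.

```python
import math
from statistics import median


def median_consistency(points, full_factor=1.0, zero_factor=2.0):
    """Return one consistency degree per point (points must be non-empty)."""
    mx = median(x for x, _ in points)
    my = median(y for _, y in points)
    dists = [math.hypot(x - mx, y - my) for x, y in points]
    avg = sum(dists) / len(dists)

    degrees = []
    for d in dists:
        if d <= full_factor * avg:
            degrees.append(1.0)   # consistent
        elif d >= zero_factor * avg:
            degrees.append(0.0)   # inconsistent: likely noise
        else:                     # gradual transition
            degrees.append((zero_factor * avg - d) / ((zero_factor - full_factor) * avg))
    return degrees


# Toy data: five clustered points and one outlier.
points = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (10, 0)]
print(median_consistency(points))
```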

Page 21: Automatic Acquisition of Fuzzy Footprints

Example

Constraints satisfied to degree 1

Constraints satisfied to degree 0.6

Constraints satisfied to degree 0.4

Constraints satisfied to degree 0

Page 22: Automatic Acquisition of Fuzzy Footprints

Example

Constraints satisfied to degree 1

Page 23: Automatic Acquisition of Fuzzy Footprints

Example

Constraints satisfied to degree 0.6

Page 24: Automatic Acquisition of Fuzzy Footprints

Example

Constraints satisfied to degree 0.4

Page 25: Automatic Acquisition of Fuzzy Footprints

Some remarks

• If the set of constraints is inconsistent (i.e. no point satisfies all constraints), we remove a minimal set of constraints such that:
  – As many constraints as possible are preserved
  – The area of the fuzzy footprint is as large as possible

• Imposing constraints is used to improve precision, not recall
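A brute-force sketch of the repair step described above. Several choices are assumptions: constraints are combined with the minimum, a set of constraints counts as consistent if at least one candidate point gets a degree above 0, and the footprint area is approximated by the sum of degrees over the candidate points. The names `footprint` and `repair` are hypothetical.

```python
from itertools import combinations


def footprint(points, constraints):
    """Degree of each point under all constraints (assumed: minimum)."""
    return [min((c(p) for c in constraints), default=1.0) for p in points]


def repair(points, constraints):
    """Return the indices of the constraints to keep."""
    n = len(constraints)
    for size in range(n, -1, -1):       # keep as many constraints as possible
        best, best_area = None, 0.0
        for kept in combinations(range(n), size):
            degrees = footprint(points, [constraints[i] for i in kept])
            area = sum(degrees)         # stand-in for the footprint area
            if area > best_area:        # among equal sizes, prefer larger area
                best, best_area = kept, area
        if best is not None:
            return best
    return ()


# Toy example: constraint 2 contradicts constraint 0.
points = [(0, 0), (0, 5), (0, 6), (0, 10)]
constraints = [
    lambda p: 1.0 if p[1] >= 4 else 0.0,
    lambda p: 1.0 if p[1] <= 6 else 0.0,
    lambda p: 1.0 if p[1] <= 1 else 0.0,
]
print(repair(points, constraints))
```

The search is exponential in the number of constraints, which is acceptable only for the handful of constraints a single region typically yields.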

Page 26: Automatic Acquisition of Fuzzy Footprints

Bordering regions

Footprint can be constructed using the ADL gazetteer

Page 27: Automatic Acquisition of Fuzzy Footprints

1. Introduction

2. Constructing fuzzy footprints

3. Experimental results

Page 28: Automatic Acquisition of Fuzzy Footprints

Evaluation metric

• Precision: degree to which the fuzzy footprint F is included in the correct footprint G

• Recall: degree to which the correct footprint G is included in the fuzzy footprint F
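A sketch of the two measures over a discretised map. The slides do not give the inclusion formula; the one below (overlap divided by the size of the included set) is one common choice and is an assumption here. Fuzzy sets are represented as dictionaries from grid cells to degrees, and the names `inclusion`, `precision` and `recall` are hypothetical.

```python
def inclusion(a, b):
    """Degree to which fuzzy set `a` is included in fuzzy set `b`.

    Assumed definition: sum of min(a, b) over all cells, divided by
    the sum of a. Cells missing from a dictionary have degree 0.
    """
    total = sum(a.values())
    if total == 0:
        return 1.0  # the empty set is included in every set
    overlap = sum(min(deg, b.get(cell, 0.0)) for cell, deg in a.items())
    return overlap / total


def precision(f, g):
    return inclusion(f, g)   # F included in G


def recall(f, g):
    return inclusion(g, f)   # G included in F


# Toy footprints over four grid cells.
F = {"c1": 1.0, "c2": 0.5, "c3": 0.5}   # constructed fuzzy footprint
G = {"c1": 1.0, "c2": 1.0, "c4": 1.0}   # correct (crisp) footprint
print(precision(F, G), recall(F, G))
```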

Page 29: Automatic Acquisition of Fuzzy Footprints

Test data

• 81 political subregions of France, Italy, Canada, Australia and China

• Divided into three groups:
  – Regions for which we found more than 30 candidate cities
  – Regions for which we found fewer than 10 candidate cities
  – Regions for which we found between 10 and 30 candidate cities

• Gold standard: convex hull of the locations that are known to lie in the region according to the ADL gazetteer
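A sketch of how a convex-hull gold standard could be computed from a list of known locations, using Andrew's monotone chain algorithm. The slides do not say which hull algorithm was used; the function `convex_hull` and the toy points are for illustration only.

```python
def convex_hull(points):
    """Return the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # > 0 if o -> a -> b makes a left (counter-clockwise) turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Toy locations: four corners and one interior point.
hull = convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])
print(hull)
```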

Page 30: Automatic Acquisition of Fuzzy Footprints

Precision

• Without bordering regions

• With bordering regions

Page 31: Automatic Acquisition of Fuzzy Footprints

Recall

• Without bordering regions

• With bordering regions

Page 32: Automatic Acquisition of Fuzzy Footprints

Conclusions

• New approach to approximate the footprint of an unknown region

• Also suitable for vague regions

• Search for constraints on the web to improve precision

• Search for bordering regions on the web to improve recall

• Experimental results confirm this hypothesis

Thank you for your attention!