Deep Learning for Toponym Resolution: Geocoding Based on Pairs of Toponyms

Publication: Article, Other literature type. 02 Dec 2021. France. English. Publisher: MDPI AG. Journal: ISPRS International Journal of Geo-Information, volume 10, page 818 (eissn: 2220-9964).

Funded by: ANR | IDEXLYON

Authors: Jacques Fize; Ludovic Moncla; Bruno Martins

doi: 10.3390/ijgi10120818

Abstract

Geocoding aims to assign unambiguous locations (i.e., geographic coordinates) to place names (i.e., toponyms) referenced within documents (e.g., within spreadsheet tables or textual paragraphs). This task comes with multiple challenges, such as dealing with referent ambiguity (multiple places with the same name) or reference database completeness. In this work, we propose a geocoding approach based on modeling pairs of toponyms, which returns latitude-longitude coordinates. The first input toponym is the one to be geocoded, and the second is used as context to reduce ambiguity. The proposed approach is based on a deep neural network that uses Long Short-Term Memory (LSTM) units to produce representations from sequences of character n-grams. To train our model, we use toponym co-occurrences collected from different contexts, namely textual (i.e., co-occurrences of toponyms in Wikipedia articles) and geographical (i.e., inclusion and proximity of places based on Geonames data). Experiments were conducted on multiple geographical areas of interest: France, the United States, Great Britain, Nigeria, Argentina, and Japan. Results show that models trained with co-occurrence data obtained higher geocoding accuracy, and that proximity relations in combination with co-occurrences can help to obtain slightly higher accuracy in geographical areas with fewer places in the data sources.
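
As an illustration of the input side of such a model, the following is a minimal sketch of how a pair of toponyms could be turned into two sequences of character n-gram identifiers. It is not the authors' implementation: the function names (`char_ngrams`, `build_vocab`, `encode_pair`), the n-gram size, the lowercasing step, and the `<unk>` entry are all assumptions made for this example. In a full model, the two identifier sequences would be passed through LSTM layers and mapped to a latitude-longitude output, which is not shown here.

```python
from typing import Dict, Iterable, List, Tuple


def char_ngrams(toponym: str, n: int = 4) -> List[str]:
    """Split a toponym into overlapping character n-grams (assumed scheme)."""
    text = toponym.lower()
    if len(text) <= n:
        # Names shorter than n are kept as a single unit.
        return [text]
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_vocab(toponyms: Iterable[str], n: int = 4) -> Dict[str, int]:
    """Assign an integer id to every n-gram seen; id 0 is reserved for unknowns."""
    vocab: Dict[str, int] = {"<unk>": 0}
    for name in toponyms:
        for gram in char_ngrams(name, n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def encode_pair(
    target: str, context: str, vocab: Dict[str, int], n: int = 4
) -> Tuple[List[int], List[int]]:
    """Encode (toponym to geocode, context toponym) as two id sequences."""
    def encode(name: str) -> List[int]:
        return [vocab.get(gram, 0) for gram in char_ngrams(name, n)]

    return encode(target), encode(context)


if __name__ == "__main__":
    # Hypothetical pair: "Paris" is the toponym to geocode, "Texas" is the context.
    vocab = build_vocab(["Paris", "Texas"], n=3)
    target_ids, context_ids = encode_pair("Paris", "Texas", vocab, n=3)
    print(target_ids, context_ids)
```

Keeping the target and the context as separate sequences mirrors the abstract's description of one toponym being geocoded while the other only supplies disambiguating context.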

Country

France

Related Organizations

INSA Lyon
France
Claude Bernard University Lyon 1
France
Institut National des Sciences Appliquées de Lyon
France
Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento
Portugal

Keywords

Geography (General); G1-922; toponym resolution; geocoding; deep neural networks; [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]

Impact by BIP!

- Selected citations: 15. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.