
<script type="text/javascript">
<!--
document.write('<div id="oa_widget"></div>');
document.write('<script type="text/javascript" src="https://www.openaire.eu/index.php?option=com_openaire&view=widget&format=raw&projectId=undefined&type=result"></script>');
-->
</script>
doi: 10.1145/3202662
Location extraction, also called “toponym extraction,” is a field covering geoparsing, extracting spatial representations from location mentions in text, and geotagging, assigning spatial coordinates to content items. This article evaluates five “best-of-class” location extraction algorithms. We develop a geoparsing algorithm using an OpenStreetMap database, and a geotagging algorithm using a language model constructed from social media tags and multiple gazetteers. Third-party work evaluated includes a DBpedia-based entity recognition and disambiguation approach, a named entity recognition and Geonames gazetteer approach, and a Google Geocoder API approach. We perform two quantitative benchmark evaluations, one geoparsing tweets and one geotagging Flickr posts, to compare all approaches. We also perform a qualitative evaluation recalling top N location mentions from tweets during major news events. The OpenStreetMap approach was best (F1 0.90+) for geoparsing English, and the language model approach was best (F1 0.66) for Turkish. The language model was best (F1@1km 0.49) for the geotagging evaluation. The map database was best (R@20 0.60+) in the qualitative evaluation. We report on strengths, weaknesses, and a detailed failure analysis for the approaches and suggest concrete areas for further research.
Geoparsing, Location Extraction, Geocoding, Linguistic processing, Text analysis, Information Extraction, Geotagging, Toponym Extraction, 004, Natural Language Processing
Geoparsing, Location Extraction, Geocoding, Linguistic processing, Text analysis, Information Extraction, Geotagging, Toponym Extraction, 004, Natural Language Processing
citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 92 | |
popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 1% | |
influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 1% |
views | 7 | |
downloads | 10 |