
handle: 11250/2418429
Technological advances and the ever-growing amount of online data in recent years have stimulated the emergence of research fields that investigate how to extract useful information from such data. This information extraction is a challenge due to the complexity and inherent ambiguity of natural language. In this thesis we study whether temporal data can improve the handling of polysemy in named entity disambiguation for entity linking. To test the feasibility of this hypothesis, we perform a preliminary entity linking: we identify mentions of person entities in a semantically tagged document corpus and attempt to link them to the corresponding entities in a knowledge base. We then consider whether this entity linking can be improved by assessing the temporal data available in the knowledge base and in the content and metadata of the documents in the corpus. Our study indicates that it may be possible to improve named entity disambiguation by considering the temporal aspect of the data, but that doing so is likely inexpedient, as the temporal data available in knowledge bases holds only a few explicit data points. This restricts the information that can be extracted about an entity. In practice, the temporal data can convey whether the person was alive when the document was published, and it may allow the information embedded in content-based temporal expressions to be exploited, provided that these expressions refer to dates remarkable enough to be registered in the knowledge base. Our study has also revealed pertinent aspects of language ambiguity in the context of entity linking and temporal information extraction that may be useful in further research on named entity disambiguation.
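The "alive when the document was published" check described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the `Candidate` record, its `born` and `died` fields, and the function names are hypothetical, and the sketch assumes that a missing date in the knowledge base should not rule a candidate out.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical knowledge-base entry for a person entity."""
    name: str
    born: Optional[date] = None  # None means the date is not registered
    died: Optional[date] = None


def alive_at(candidate: Candidate, when: date) -> bool:
    """Return True unless the known dates exclude the candidate at `when`."""
    if candidate.born is not None and when < candidate.born:
        return False
    if candidate.died is not None and when > candidate.died:
        return False
    return True


def filter_candidates(candidates: List[Candidate], published: date) -> List[Candidate]:
    """Keep only candidates that could have been alive at publication time."""
    return [c for c in candidates if alive_at(c, published)]


# Illustrative, made-up candidates sharing one ambiguous name.
candidates = [
    Candidate("Person A", born=date(1850, 1, 1), died=date(1910, 1, 1)),
    Candidate("Person B", born=date(1960, 1, 1)),
    Candidate("Person C"),  # no temporal data registered
]
remaining = filter_candidates(candidates, published=date(2005, 6, 1))
```

With these made-up inputs, only the candidate who died before the publication date is excluded. The candidate with no registered dates is kept, which reflects the limitation noted in the abstract: sparse temporal data can rule candidates out but rarely singles one out.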
Informatics, Databases and Search
