Views provided by UsageCounts
handle: 2117/13246
Record Linkage (RL) is an important component of data cleaning and integration, and of data processing in general. For years, many efforts have focused on improving the performance of the RL process, either by reducing the number of record comparisons or by reducing the number of attribute comparisons; both reduce computation time but increase the amount of error. However, the real bottleneck of RL is the post-process, in which experts must review the results and decide which pairs or groups of records are real links and which are false hits. In this paper we show that exploiting the semantic relationships (e.g. foreign keys) established between one or more data sources makes it possible to define a new kind of semantic blocking method that improves the number of hits and reduces the review effort.
Data cleansing, Processament electrònic de dades -- Depuració, Electronic data processing -- Data preparation, Processament electrònic de dades -- Control de qualitat, Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació::Emmagatzematge i recuperació de la informació, Data processing, Record linkage, Semantic information, :Informàtica::Sistemes d'informació::Emmagatzematge i recuperació de la informació [Àrees temàtiques de la UPC], Data integration, Electronic data processing -- Quality control, Data integration (Computer science), Blocking algorithms
| selected citations | 0 |
| popularity | Average |
| influence | Average |
| impulse | Average |
| views | 25 |
