Reassembling Fragmented Entity Names: A Novel Model for Chinese Compound Noun Processing

Publication: Article, Other literature type. Date: 14 Oct 2023. Language: English. Publisher: MDPI AG. Journal: Electronics, volume 12, page 4251 (eISSN: 2079-9292)

Authors: Yuze Pan; Xiaofeng Fu

doi: 10.3390/electronics12204251


Abstract

In the process of classifying intelligent assets, we encountered challenges with a limited dataset dominated by complex compound noun phrases. Training classifiers directly on this dataset posed risks of overfitting and potential misinterpretations due to inherent ambiguities in these phrases. Recognizing the gap in the current literature for tailored methods addressing this challenge, this paper introduces a refined approach for the accurate extraction of entity names from such structures. We leveraged the Chinese pre-trained BERT model combined with an attention mechanism, ensuring precise interpretation of each token’s significance. This was followed by employing both a multi-layer perceptron (MLP) and an LSTM-based Sequence Parsing Model, tailored for sequence annotation and rule-based parsing. With the aid of a rule-driven decoder, we reconstructed comprehensive entity names. Our approach adeptly extracts structurally coherent entity names from fragmented compound noun phrases. Experiments on a manually annotated dataset of compound noun phrases demonstrate that our model consistently outperforms rival methodologies. These results compellingly validate our method’s superiority in extracting entity names from compound noun phrases.
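The final step described above, a rule-driven decoder that rebuilds full entity names from labeled fragments, can be illustrated with a minimal sketch. The abstract does not give the paper's label set or decoding rules, so everything here is an assumption made for illustration: the tag names (B-FRAG, I-FRAG, B-HEAD, I-HEAD, O), the function name decode_entities, and the single rule that each alternative fragment is joined to the shared head that follows it. The tags are supplied by hand; in the paper they would come from the sequence labeling model.

```python
from typing import List


def decode_entities(tokens: List[str], tags: List[str]) -> List[str]:
    """Rebuild entity names from tagged tokens (hypothetical tag scheme).

    B-FRAG / I-FRAG mark alternative fragments, B-HEAD / I-HEAD mark the
    head they share, and O marks tokens outside any entity name.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities: List[str] = []
    fragments: List[str] = []  # alternative fragments of the current group
    head = ""                  # shared head of the current group

    def flush() -> None:
        """Emit the current group, then reset the decoder state."""
        nonlocal fragments, head
        if head:
            # Assumed rule: pair every fragment with the shared head.
            # A head with no fragments is emitted on its own.
            entities.extend([frag + head for frag in fragments] or [head])
        fragments, head = [], ""

    for token, tag in zip(tokens, tags):
        if tag == "B-FRAG":
            if head:  # a finished head means a new group starts here
                flush()
            fragments.append(token)
        elif tag == "I-FRAG" and fragments and not head:
            fragments[-1] += token
        elif tag == "B-HEAD":
            if head:
                flush()
            head = token
        elif tag == "I-HEAD" and head:
            head += token
        elif head:
            # O (or an ill-formed tag) after a head closes the group;
            # before a head it is skipped, e.g. a conjunction.
            flush()
    flush()
    return entities


if __name__ == "__main__":
    # "进出口公司" (import-export company) shares the head "口公司".
    tokens = list("进出口公司")
    tags = ["B-FRAG", "B-FRAG", "B-HEAD", "I-HEAD", "I-HEAD"]
    print(decode_entities(tokens, tags))
```

In this sketch, fragments that are never followed by a head are dropped rather than emitted; a real decoder would likely need further rules for such cases.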

Related Organizations

Hangzhou Dianzi University
China (People's Republic of)

Keywords

sequence parsing, entity name extraction, fragmentation, compound noun phrases, sequence labeling

Impact by BIP!

	selected citations	0
	popularity	Average
	influence	Average
	impulse	Average


gold

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
