Powered by OpenAIRE graph
Research
Data sources: ZENODO

Radical-Aligned Structure in Multilingual Transformer Representations of Chinese Characters: A Controlled Empirical Study

Authors: Maity, Aryan

Abstract

This paper investigates whether multilingual transformer models organize Chinese character representations according to Kangxi radical categories. Using a dataset of 6,306 Chinese characters across 68 radicals, we analyze embeddings from mBERT and Chinese-BERT through cosine similarity, Euclidean distance, permutation testing, bootstrap confidence intervals, and effect size analysis. Results show a small but statistically reliable radical-aligned signal at corpus scale. However, a controlled semantic experiment demonstrates that this effect disappears when semantic similarity is matched, suggesting that the observed structure is primarily driven by semantic regularities historically encoded in the Chinese writing system rather than independent orthographic encoding. The repository includes the paper, analysis code, datasets, statistical outputs, and reproducibility artifacts.
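
As an illustration of the kind of test the abstract describes, the following is a minimal sketch of a label-permutation test on cosine similarity. It is not the repository's code: the function names, the toy data generator, and the choice of statistic (mean within-group similarity minus mean between-group similarity) are assumptions made for this example. Real inputs would be model embeddings with radical labels rather than synthetic vectors.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarities(embeddings):
    """Cosine similarity for every unordered pair (i, j) with i < j."""
    n = len(embeddings)
    return {
        (i, j): cosine(embeddings[i], embeddings[j])
        for i in range(n)
        for j in range(i + 1, n)
    }


def within_minus_between(sims, labels):
    """Mean similarity of same-label pairs minus mean of different-label pairs."""
    within = [s for (i, j), s in sims.items() if labels[i] == labels[j]]
    between = [s for (i, j), s in sims.items() if labels[i] != labels[j]]
    return sum(within) / len(within) - sum(between) / len(between)


def permutation_test(embeddings, labels, n_perm=500, seed=0):
    """One-sided permutation test: shuffle labels, recompute the statistic."""
    rng = random.Random(seed)
    sims = pairwise_similarities(embeddings)  # similarities do not depend on labels
    observed = within_minus_between(sims, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if within_minus_between(sims, shuffled) >= observed:
            exceed += 1
    # The +1 terms count the observed labeling itself, so p is never exactly 0.
    return observed, (1 + exceed) / (1 + n_perm)


def make_toy_data(n_groups=3, per_group=10, dim=8, seed=1):
    """Hypothetical stand-in for embeddings: noisy points around group centers."""
    rng = random.Random(seed)
    embeddings, labels = [], []
    for g in range(n_groups):
        center = [rng.gauss(0, 3) for _ in range(dim)]
        for _ in range(per_group):
            embeddings.append([c + rng.gauss(0, 1) for c in center])
            labels.append(g)
    return embeddings, labels


if __name__ == "__main__":
    data, groups = make_toy_data()
    stat, p = permutation_test(data, groups)
    print(f"within - between = {stat:.3f}, p = {p:.4f}")
```

Shuffling the labels while keeping the similarity values fixed preserves the group sizes, so the null distribution reflects what the statistic would look like if group membership were unrelated to the embedding geometry. A controlled comparison such as the semantic matching mentioned in the abstract would need an additional step that this sketch does not attempt.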
