
This paper introduces X-Spanformer, a tokenizer-free, span-aware encoder that learns compositional segmentation directly from raw input streams using a pointer-network mechanism inspired by X-bar theory. Starting with a compact BPE seed, the model refines span boundaries through a staged curriculum involving synthetic supervision, entropy regularization, and contrastive alignment, producing softly typed spans pooled into transformer layers via a lightweight compositional interface. This joint optimization approach supports adaptable segmentation and representation across modalities such as code and natural language, validated through metrics including compression ratio, entropy decay, span-type KL divergence, and syntactic fidelity. The release includes an ONNX-compatible implementation and reproducible training recipes, positioning X-Spanformer as a foundation for interpretable, scalable encoders in structured learning, neural parsing, and program induction.
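To make two of the quantities named above concrete, the following is a minimal sketch, not the X-Spanformer implementation. It assumes that a pointer mechanism produces a softmax distribution over candidate span end positions, that entropy regularization acts on the Shannon entropy of that distribution, and that compression ratio means input symbols per predicted span. The function names, the scores, and the spans are all hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Turn raw pointer scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; lower means a sharper boundary choice."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def compression_ratio(sequence_length, spans):
    """Assumed definition: input symbols per predicted span."""
    return sequence_length / len(spans)


# Hypothetical pointer scores over four candidate span end positions.
end_probs = softmax([2.0, 0.5, 0.1, -1.0])
print("boundary entropy (nats):", entropy(end_probs))

# Hypothetical segmentation of a 12-symbol input into three spans.
spans = [(0, 4), (4, 9), (9, 12)]
print("compression ratio:", compression_ratio(12, spans))
```

Under these assumptions, a falling boundary entropy over training would correspond to the "entropy decay" mentioned above, and a higher compression ratio would mean fewer, longer spans per input.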
program induction, contrastive learning, pointer networks, span-based modeling, entropy regularization, semantic composition, code representation, multilingual NLP, modular architecture, structured learning, curriculum learning, neural parsing, tokenizer-free segmentation, compositional representation, transformer encoder, span-aware encoding, unsupervised segmentation, X-bar theory, syntactic structure, ONNX-compatible
