ZENODO
Preprint, 2025
License: CC BY
Data sources: Datacite

X-Spanformer: A Tokenizer-Free, Span-Aware Encoder Inspired by X-Bar Theory

Authors: Rawson, Kara

Abstract

This paper introduces X-Spanformer, a tokenizer-free, span-aware encoder that learns compositional segmentation directly from raw input streams using a pointer-network mechanism inspired by X-bar theory. Starting with a compact BPE seed, the model refines span boundaries through a staged curriculum involving synthetic supervision, entropy regularization, and contrastive alignment, producing softly typed spans pooled into transformer layers via a lightweight compositional interface. This joint optimization approach supports adaptable segmentation and representation across modalities such as code and natural language, validated through metrics including compression ratio, entropy decay, span-type KL divergence, and syntactic fidelity. The release includes an ONNX-compatible implementation and reproducible training recipes, positioning X-Spanformer as a foundation for interpretable, scalable encoders in structured learning, neural parsing, and program induction.
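
The evaluation metrics named in the abstract (compression ratio, entropy, span-type KL divergence) have standard textbook forms, sketched below in plain Python. This is only an illustration under stated assumptions, not the paper's implementation: the function names, the choice of natural logarithms, the definition of compression ratio as input symbols per span, and all sample values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same span types.

    eps guards against division by zero when q assigns zero mass.
    """
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def compression_ratio(num_input_symbols, num_spans):
    """Input symbols per emitted span (assumed definition, not from the paper)."""
    return num_input_symbols / num_spans


# Hypothetical values, for illustration only.
boundary_probs = [0.7, 0.2, 0.1]    # assumed distribution over candidate spans
predicted_types = [0.5, 0.3, 0.2]   # assumed predicted span-type distribution
reference_types = [0.4, 0.4, 0.2]   # assumed reference span-type distribution

span_entropy = entropy(boundary_probs)
type_kl = kl_divergence(predicted_types, reference_types)
ratio = compression_ratio(num_input_symbols=12, num_spans=4)
```

Under this reading, "entropy decay" would correspond to `span_entropy` falling over training as span predictions sharpen, and a smaller `type_kl` would indicate closer agreement between predicted and reference span types.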

Keywords

program induction, contrastive learning, pointer networks, span-based modeling, entropy regularization, semantic composition, code representation, multilingual NLP, modular architecture, structured learning, curriculum learning, neural parsing, tokenizer-free segmentation, compositional representation, transformer encoder, span-aware encoding, unsupervised segmentation, X-bar theory, syntactic structure, ONNX-compatible
