Master's thesis, 2025
License: CC BY NC ND
Data sources: OpenMETU

Token interchangeability and alpha-equivalence: enhancing the generalization capacity of language models for formal logic

Belirteç değiştirilebilirliği ve alfa-eşdeğerlilik: dil modellerinin biçimsel mantık için genelleme kabiliyetinin iyileştirilmesi
Authors: Işık, İlker

Abstract

Language models lack the notion of interchangeable tokens: symbols that are semantically equivalent yet distinct, such as bound variables in formal logic. This limitation prevents generalization to larger vocabularies and hinders the model's ability to recognize alpha-equivalence, where renaming bound variables preserves meaning. We formalize this machine learning problem and introduce alpha-covariance, a metric for evaluating robustness to such transformations. To tackle this task, we propose a dual-part token embedding strategy: a shared component ensures semantic consistency, while a randomized component maintains token distinguishability. Compared to a baseline that relies on alpha-renaming for data augmentation, our approach demonstrates improved generalization to unseen tokens in linear temporal logic solving, propositional logic assignment prediction, and copying with an extendable vocabulary, while introducing a favorable inductive bias for alpha-equivalence. Our findings establish a foundation for designing language models that can learn interchangeable token representations, a crucial step toward more flexible and systematic reasoning in formal domains.
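The two ideas named in the abstract, alpha-renaming and a dual-part token embedding, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the thesis's implementation: the function names `alpha_rename` and `dual_part_embeddings` are hypothetical, the two parts are combined by simple concatenation, the dimensions are arbitrary, and the shared part is sampled once here whereas a real model would presumably learn it.

```python
import random


def alpha_rename(tokens, interchangeable, rng):
    """Rename interchangeable tokens consistently using a random bijection.

    Tokens outside `interchangeable` (operators, parentheses) are left alone.
    """
    names = sorted(interchangeable)
    shuffled = names[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(names, shuffled))
    return [mapping.get(tok, tok) for tok in tokens]


def dual_part_embeddings(interchangeable, shared_dim, random_dim, rng):
    """Build one vector per interchangeable token: shared part + random part.

    Assumption: the parts are concatenated. The shared part is identical for
    all interchangeable tokens; the random part keeps them distinguishable.
    """
    # Stand-in for a learned vector (hypothetical; sampled once here).
    shared = [rng.gauss(0.0, 1.0) for _ in range(shared_dim)]
    table = {}
    for tok in sorted(interchangeable):
        random_part = [rng.gauss(0.0, 1.0) for _ in range(random_dim)]
        table[tok] = shared + random_part
    return table


rng = random.Random(0)
variables = {"a", "b", "c"}
formula = ["a", "&", "(", "b", "|", "a", ")"]

# An alpha-equivalent formula: same structure, variables consistently renamed.
renamed = alpha_rename(formula, variables, rng)

# Embeddings whose first half is shared and whose second half is per-token noise.
table = dual_part_embeddings(variables, shared_dim=4, random_dim=4, rng=rng)
```

In this sketch, a model robust to alpha-renaming would be expected to treat `formula` and `renamed` equivalently, and resampling the random part gives a way to represent variable names never seen during training.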

Country
Turkey
Keywords

Formal methods, Language modeling, Machine learning, Linear temporal logic
