Powered by OpenAIRE graph
Mathematics
Article . 2025 . Peer-reviewed
License: CC BY
Data sources: Crossref
https://dx.doi.org/10.48550/ar...
Article . 2025
License: arXiv Non-Exclusive Distribution
Data sources: Datacite
DBLP
Preprint . 2025
Data sources: DBLP

Probing the Topology of the Space of Tokens with Structured Prompts

Authors: Michael Robinson; Sourya Dey; Taisa Kushner

Abstract

Some large language models (LLMs) are open source and are therefore fully open for scientific study. However, many LLMs are proprietary, and their internals are hidden, which hinders the research community's ability to study their behavior under controlled conditions. For instance, the token input embedding specifies an internal vector representation of each token used by the model. If the token input embedding is hidden, latent semantic information about the set of tokens is unavailable to researchers. This article presents a general and flexible method for prompting an LLM to reveal its token input embedding, even if this information is not published with the model. Moreover, this article provides strong theoretical justification (a mathematical proof for generic LLMs) for why this method should be expected to work. If the LLM can be prompted systematically and certain benign conditions on the quantity of data collected from the responses are met, the topology of the token embedding is recovered. With this method in hand, we demonstrate its effectiveness by recovering the token subspace of the Llemma-7B LLM. We demonstrate the flexibility of this method by performing the recovery at three different times, each using the same algorithm applied to different information collected from the responses. While the prompting can be a performance bottleneck depending on the size and complexity of the LLM, the recovery runs within a few hours on a typical workstation. The results of this paper apply not only to LLMs but also to general nonlinear autoregressive processes.
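
The idea of recovering topology from responses alone can be illustrated with a toy sketch. This is not the article's algorithm; it is a minimal stand-in under stated assumptions. The 12-token vocabulary, the circular hidden embedding, the `query` function, and the nearest-neighbor graph are all hypothetical choices made for illustration. The "researcher" only calls `query`, never reads `hidden_angle`, and still finds that the tokens form one closed loop.

```python
import math

N = 12           # toy vocabulary size (assumption, not from the article)
SHARPNESS = 2.0  # logit scale of the toy model (assumption)

# Hidden from the "researcher": token t sits at this angle on a circle.
# The multiplier 5 scrambles token ids so their order reveals nothing.
hidden_angle = [2 * math.pi * ((5 * t) % N) / N for t in range(N)]

def query(token):
    """Black-box stand-in for prompting: next-token probabilities
    returned by the toy model for a one-token prompt."""
    logits = [SHARPNESS * math.cos(hidden_angle[token] - hidden_angle[j])
              for j in range(N)]
    z = sum(math.exp(v) for v in logits)
    return [math.exp(v) / z for v in logits]

def two_nearest(responses, t):
    """Ids of the two tokens whose responses are closest to token t's."""
    ranked = sorted((math.dist(responses[t], responses[u]), u)
                    for u in range(N) if u != t)
    return {ranked[0][1], ranked[1][1]}

def recover_neighbor_graph():
    """Build a neighbor graph using only the observed responses."""
    responses = [query(t) for t in range(N)]
    return {t: two_nearest(responses, t) for t in range(N)}

def is_single_cycle(graph):
    """Walk from token 0 without stepping straight back; on a circle the
    walk returns to token 0 exactly when every token has been visited."""
    prev, cur, visited = None, 0, []
    while cur not in visited:
        visited.append(cur)
        nxt = min(u for u in graph[cur] if u != prev)
        prev, cur = cur, nxt
    return cur == 0 and len(visited) == len(graph)

if __name__ == "__main__":
    graph = recover_neighbor_graph()
    print("neighbors of token 0:", sorted(graph[0]))
    print("tokens form one closed loop:", is_single_cycle(graph))
```

In this toy the responses depend smoothly and injectively on the hidden position, so nearby tokens give nearby responses and the neighbor graph closes up into a single loop through all 12 tokens. A real LLM offers no such symmetry, which is why the article's conditions on the quantity of collected data matter.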

Keywords

Mathematics - Differential Geometry, FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Differential Geometry (math.DG), Computer Science - Artificial Intelligence, I.2.7, FOS: Mathematics, 53Z50, 58Z05

  • Impact by BIP!
    selected citations: 0
    These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.