Name: Unnatural language processing: How do language models handle machine-generated prompts?
Keywords: Semantics, FOS: Computer and information sciences, Computer Science - Computation and Language, Artificial languages, Computational linguistics, Natural language processing (Computer science), Computation and Language (cs.CL)

Publication: Article, Preprint, Conference object
Date: 01 Jan 2023
Embargo end date: 01 Jan 2023
Publisher: Association for Computational Linguistics (ACL)
Journal: Findings of the Association for Computational Linguistics: EMNLP 2023
Funded by: EC | ALiEN

Authors: Kervadec, Corentin; Franzon, Francesca; Baroni, Marco

doi: 10.18653/v1/2023.findings-emnlp.959, 10.48550/arxiv.2310.15829

arXiv: http://arxiv.org/abs/2310.15829

handle: 10230/58560

Abstract

Language model prompt optimization research has shown that semantically and grammatically well-formed manually crafted prompts are routinely outperformed by automatically generated token sequences with no apparent meaning or syntactic structure, including sequences of vectors from a model's embedding space. We use machine-generated prompts to probe how models respond to input that is not composed of natural language expressions. We study the behavior of models of different sizes in multiple semantic tasks in response to both continuous and discrete machine-generated prompts, and compare it to the behavior in response to human-generated natural-language prompts. Even when producing a similar output, machine-generated and human prompts trigger different response patterns through the network processing pathways, including different perplexities, different attention and output entropy distributions, and different unit activation profiles. We provide preliminary insight into the nature of the units activated by different prompt types, suggesting that only natural language prompts recruit a genuinely linguistic circuit.
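The abstract compares prompt types by perplexity and by the entropy of the output distribution. As a rough illustration of what those two quantities measure, the following is a minimal sketch in plain Python. It is not the authors' code: the function names are hypothetical, and the probabilities and logits are made-up toy values rather than results from the paper.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a flatter, less certain distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def perplexity(token_probs):
    """exp of the mean negative log-likelihood the model assigns to each prompt token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# Toy values (assumptions for illustration only): per-token probabilities a
# model might assign to a fluent prompt versus an arbitrary token sequence.
natural_prompt_probs = [0.2, 0.1, 0.3, 0.25]
machine_prompt_probs = [1e-4, 5e-5, 2e-4, 1e-4]

# Toy next-token logits over a four-word vocabulary.
output_probs = softmax([2.0, 1.0, 0.1, -1.0])

print("natural prompt perplexity:", perplexity(natural_prompt_probs))
print("machine prompt perplexity:", perplexity(machine_prompt_probs))
print("output entropy (nats):", entropy(output_probs))
```

With a real model, the per-token probabilities and logits would come from a forward pass over the prompt; the arithmetic above stays the same.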

Findings of EMNLP 2023 Camera-Ready

Related Organizations

Universitat Pompeu Fabra
Spain

Impact by BIP!

- selected citations: 1. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Related to Research communities

EUTOPIA Open Research Portal