ZENODO
Dataset . 2025
Data sources: ZENODO
ZENODO
Dataset . 2025
Data sources: Datacite

PAN'25 Generative AI Detection (Task 2): Human-AI Collaborative Text Classification

Authors: Wang, Yuxia; Shelmanov, Artem; Mansurov, Jonibek; Tsvigun, Akim; Habash, Nizar; Aji, Alham Fikri; Artemova, Ekaterina; +5 Authors

Abstract

Dataset for the Generative AI Detection Task (Subtask 2) @ PAN 2025.

As large language models (LLMs) like GPT-4o, Claude 3.5, and Gemini 1.5-pro become increasingly accessible, machine-generated content is proliferating across diverse domains, including news, social media, education, and academia. These models produce highly fluent and coherent text, making them valuable for automating various writing tasks. However, their widespread use also raises concerns about misinformation, academic integrity, and content authenticity. Identifying the degree of human and machine involvement in text creation is crucial for addressing these challenges.

In this shared task, we focus on Human-AI Collaborative Text Classification, where the goal is to categorize documents that have been co-authored by humans and LLMs. Specifically, we aim to classify texts into six distinct categories based on the nature of human and machine contributions:

  • Fully human-written: The document is entirely authored by a human without any AI assistance.
  • Human-initiated, then machine-continued: A human starts writing, and an AI model completes the text.
  • Human-written, then machine-polished: The text is initially written by a human but later refined or edited by an AI model.
  • Machine-written, then machine-humanized (obfuscated): An AI generates the text, which is later modified to obscure its machine origin.
  • Machine-written, then human-edited: The content is generated by an AI but subsequently edited or refined by a human.
  • Deeply-mixed text: The document contains interwoven sections written by both humans and AI, without a clear separation.

Label Distribution:

Label Category                                 Train       Dev
Machine-written, then machine-humanized       91,232    10,137
Human-written, then machine-polished          95,398    12,289
Fully human-written                           75,270    12,330
Human-initiated, then machine-continued       10,740    37,170
Deeply-mixed text (human + machine parts)     14,910       225
Machine-written, then human-edited             1,368       510
Total                                        288,918    72,661
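
The label distribution above is heavily imbalanced and differs between the two splits, which is worth quantifying before training a classifier. The following is a minimal sketch, not part of the dataset release: it copies the counts listed above into a dictionary and derives per-split class shares and inverse-frequency class weights. The names `LABEL_COUNTS`, `split_shares`, and `balanced_class_weights` are hypothetical, and the weighting formula (n_samples / (n_classes * n_c)) is one common heuristic, assumed here for illustration only.

```python
# Counts copied from the label distribution above, as (train, dev).
# The dictionary layout and all names are illustrative assumptions.
LABEL_COUNTS = {
    "Machine-written, then machine-humanized": (91_232, 10_137),
    "Human-written, then machine-polished": (95_398, 12_289),
    "Fully human-written": (75_270, 12_330),
    "Human-initiated, then machine-continued": (10_740, 37_170),
    "Deeply-mixed text": (14_910, 225),
    "Machine-written, then human-edited": (1_368, 510),
}


def split_shares(counts):
    """Return {label: (train_share, dev_share)} as fractions of each split."""
    train_total = sum(train for train, _ in counts.values())
    dev_total = sum(dev for _, dev in counts.values())
    return {
        label: (train / train_total, dev / dev_total)
        for label, (train, dev) in counts.items()
    }


def balanced_class_weights(counts):
    """Inverse-frequency weights on the train split.

    Uses n_samples / (n_classes * n_c), so rarer classes get larger weights.
    """
    train_total = sum(train for train, _ in counts.values())
    n_classes = len(counts)
    return {
        label: train_total / (n_classes * train)
        for label, (train, _) in counts.items()
    }


if __name__ == "__main__":
    shares = split_shares(LABEL_COUNTS)
    weights = balanced_class_weights(LABEL_COUNTS)
    for label, (train_share, dev_share) in shares.items():
        print(
            f"{label:<42} train {train_share:6.1%}  "
            f"dev {dev_share:6.1%}  weight {weights[label]:7.2f}"
        )
```

Running the script prints one row per category. Comparing the train and dev columns shows how far the class proportions shift between the splits, for example for "Human-initiated, then machine-continued".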
