Preprint
Data sources: ZENODO

Agent Trajectory Replay for Debugging Tool-Using AI Workflow Regressions

Authors: Katta, Mukunda Rao

Abstract

Tool-using AI agents are difficult to debug because a failure may emerge from a sequence of planning steps, tool calls, intermediate errors, and final-output decisions rather than from a single response. This paper presents Agent Trajectory Replay, a small zero-dependency JavaScript package for summarizing, replaying, and diffing agent event traces. The package removes unstable timing fields before comparison, counts tool calls and errors, exposes final-output changes, and allows event handlers to rebuild state from a recorded trajectory. The contribution is a lightweight regression-debugging pattern for teams that need a simple way to compare agent behavior across model, prompt, or tool changes. This artifact bundle includes the manuscript, PDF, workflow figure, bibliography, metadata, and source notes grounded in the agent-trajectory-replay package.
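The operations named in the abstract (stripping unstable timing fields, counting tool calls and errors, exposing final-output changes, and rebuilding state through event handlers) can be illustrated with a minimal sketch. This is not the agent-trajectory-replay package's actual API: the function names (`stripUnstable`, `summarize`, `replay`, `diffTraces`), the event shape (`type`, `timestamp`, `tool`, `output`), and the list of unstable fields are all assumptions made for illustration.

```javascript
// Hypothetical sketch only: every name and the event shape below are
// assumptions, not the published package's interface.

// Fields assumed to vary between otherwise identical runs.
const UNSTABLE_FIELDS = new Set(["timestamp", "durationMs"]);

// Copy an event without its unstable fields. Top-level keys are sorted so
// that key order does not affect comparison (nested objects are not sorted).
function stripUnstable(event) {
  const clean = {};
  for (const key of Object.keys(event).sort()) {
    if (!UNSTABLE_FIELDS.has(key)) clean[key] = event[key];
  }
  return clean;
}

// Count tool calls and errors, and capture the last final output.
function summarize(trace) {
  const summary = { toolCalls: 0, errors: 0, finalOutput: null };
  for (const event of trace) {
    if (event.type === "tool_call") summary.toolCalls += 1;
    if (event.type === "error") summary.errors += 1;
    if (event.type === "final_output") summary.finalOutput = event.output;
  }
  return summary;
}

// Rebuild state by feeding each recorded event to the handler for its type.
function replay(trace, handlers, initialState) {
  let state = initialState;
  for (const event of trace) {
    const handler = handlers[event.type];
    if (handler) state = handler(state, stripUnstable(event));
  }
  return state;
}

// Compare two traces position by position after stripping unstable fields.
function diffTraces(baseline, candidate) {
  const a = baseline.map(stripUnstable);
  const b = candidate.map(stripUnstable);
  const changedSteps = [];
  for (let i = 0; i < Math.max(a.length, b.length); i += 1) {
    if (JSON.stringify(a[i]) !== JSON.stringify(b[i])) changedSteps.push(i);
  }
  const before = summarize(baseline);
  const after = summarize(candidate);
  return {
    changedSteps,
    toolCallDelta: after.toolCalls - before.toolCalls,
    errorDelta: after.errors - before.errors,
    finalOutputChanged: before.finalOutput !== after.finalOutput,
  };
}

// Two invented traces: the candidate run hits an error and retries the tool.
const baseline = [
  { type: "plan", timestamp: 1, text: "look up weather" },
  { type: "tool_call", timestamp: 2, tool: "weather" },
  { type: "final_output", timestamp: 3, output: "Sunny" },
];
const candidate = [
  { type: "plan", timestamp: 10, text: "look up weather" },
  { type: "tool_call", timestamp: 11, tool: "weather" },
  { type: "error", timestamp: 12, message: "timeout" },
  { type: "tool_call", timestamp: 13, tool: "weather" },
  { type: "final_output", timestamp: 14, output: "Cloudy" },
];

const report = diffTraces(baseline, candidate);
const rebuilt = replay(
  candidate,
  {
    tool_call: (state, e) => ({ ...state, calls: [...state.calls, e.tool] }),
    error: (state) => ({ ...state, failed: true }),
  },
  { calls: [], failed: false }
);
console.log(report, rebuilt);
```

In this sketch the first two steps compare as equal even though their timestamps differ, so the diff points at the inserted error and retry rather than at timing noise. A positional comparison is the simplest choice; an alignment-based diff would localize inserted steps more precisely at the cost of extra code.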
