Powered by OpenAIRE graph
Conference object
Data sources: ZENODO

Video Reconstruction using Diffusion-based Image-to-Video Generation with Trajectory Guidance

Authors: Bompai, Stelio; Kontopoulos, Ioannis; Spiliopoulos, Giannis; Zissis, Dimitris; Tserpes, Konstantinos

Abstract

This paper addresses the problem of reconstructing missing or dropped frames in top-down drone video of autonomous surface vehicles performing structured maritime manoeuvres. We propose a pipeline that converts raw GPS telemetry and a single reference frame into a trajectory-guided video sequence using a pre-trained image-to-video diffusion model, requiring no domain-specific fine-tuning. GPS coordinates from onboard telemetry logs are projected into image space via an equirectangular mapping, producing per-vessel motion cues that condition the SG-I2V diffusion model. The generated frames are evaluated against ground-truth video using perceptual, temporal and trajectory-based metrics, and benchmarked against optical flow extrapolation and RIFE interpolation baselines. SG-I2V produces the most natural-looking frames among all methods (BRISQUE 25.52, closest to the ground-truth score of 23.64), the most realistic motion magnitude (temporal smoothness 1.14 vs. 1.42 for the ground truth), and the strongest GPS trajectory adherence (9.31 px vs. 28.70 px for the ground truth, the latter reflecting approximate temporal alignment between footage and GPS logs rather than generation error). These results demonstrate that trajectory-guided diffusion synthesis is a viable approach to maritime video reconstruction under challenging low-texture, small-object conditions.
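
To illustrate the GPS-to-image projection step, the following is a minimal sketch of a local equirectangular mapping and a mean pixel-distance measure of trajectory adherence. It is not the authors' implementation: the function names, the north-up nadir camera, the image-centre reference point, and the `metres_per_pixel` ground sampling distance are all assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def gps_to_pixel(lat, lon, ref_lat, ref_lon, metres_per_pixel,
                 image_size, ref_pixel=None):
    """Map a GPS fix to pixel coordinates (hypothetical helper).

    Assumes a north-up, straight-down camera and a small area, so that a
    local equirectangular approximation around (ref_lat, ref_lon) holds.
    """
    width, height = image_size
    if ref_pixel is None:
        # Assumption: the reference GPS point sits at the image centre.
        ref_pixel = (width / 2.0, height / 2.0)

    # Equirectangular approximation: scale longitude by cos(reference latitude).
    east_m = (math.radians(lon - ref_lon) * EARTH_RADIUS_M
              * math.cos(math.radians(ref_lat)))
    north_m = math.radians(lat - ref_lat) * EARTH_RADIUS_M

    x = ref_pixel[0] + east_m / metres_per_pixel
    y = ref_pixel[1] - north_m / metres_per_pixel  # image y grows downward
    return x, y


def mean_pixel_error(track_a, track_b):
    """Mean Euclidean distance in pixels between two equal-length tracks."""
    if not track_a or len(track_a) != len(track_b):
        raise ValueError("tracks must be non-empty and of equal length")
    total = sum(math.hypot(ax - bx, ay - by)
                for (ax, ay), (bx, by) in zip(track_a, track_b))
    return total / len(track_a)
```

Under these assumptions, projecting each telemetry fix with `gps_to_pixel` gives a per-vessel pixel track, and `mean_pixel_error` compares that track with vessel positions detected in the generated or ground-truth frames. A real setup would also need to account for camera heading, altitude changes, and time offsets between the footage and the GPS logs.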
