
This paper describes VERGE, an interactive video content retrieval system that supports search over images extracted from videos. The current version improves the system's core retrieval modalities and fusion techniques and introduces new modalities, including optical character recognition, video question answering, underwater object detection, image quality-based retrieval, and surgical scene understanding. VERGE’s web application retains its user-friendly interface, which helps users formulate queries and efficiently explore the top search results. The system also incorporates newly extracted keyframes for the V3C dataset, generated with an algorithm developed for this purpose.
