Advancing Temporal Action Localization: Efficient Large Model Adaptation and Open-Vocabulary Recognition in Videos

Gupta, Akshita

Thesis

License: CC BY NC

Full-Text: https://atrium.lib.uoguelph.ca/bitstreams/ccc2850b-8011-4f29-a618-9c363c29a877/download

Data sources: DSpace at the University of Guelph (Atrium)

Publication · Thesis · Canada · English · Publisher: University of Guelph

Authors: Gupta, Akshita

handle: 10214/28683

Abstract

This thesis introduces two novel approaches for Temporal Action Localization (TAL) in video understanding. The Long-Short-range Adapter (LoSA) is a memory-efficient backbone adapter for untrimmed videos, modifying intermediate layers across various temporal ranges to enhance video features. It enables end-to-end adaptation of billion-parameter models like VideoMAEv2. The OVFormer framework addresses Open-Vocabulary TAL by generating rich class descriptions using a language model, aligning these with video features through cross-attention, and employing a two-stage training strategy for novel category generalization. LoSA enables efficient use of state-of-the-art video models, while OVFormer expands recognizable actions beyond predefined categories. These contributions significantly advance TAL, enhancing both capability and flexibility in action recognition and paving the way for more versatile video understanding systems.
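The abstract says OVFormer aligns language-model class descriptions with video features through cross-attention. The sketch below illustrates only that general mechanism, not the thesis's implementation. It assumes that video frame features act as queries and class-description embeddings act as keys and values, and it omits learned projections. The function names `softmax` and `cross_attention` and the toy inputs are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(video_feats, text_feats):
    """Scaled dot-product cross-attention without learned projections.

    video_feats: list of per-frame feature vectors (assumed to be the queries).
    text_feats:  list of class-description embeddings (assumed keys and values).
    Returns the text-informed frame features and the attention weights.
    """
    d = len(text_feats[0])
    scale = 1.0 / math.sqrt(d)
    attended, weights = [], []
    for q in video_feats:
        # Similarity of this frame to every class description.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in text_feats]
        w = softmax(scores)
        # Weighted sum of description embeddings for this frame.
        out = [sum(w[j] * text_feats[j][c] for j in range(len(text_feats)))
               for c in range(d)]
        attended.append(out)
        weights.append(w)
    return attended, weights


# Toy data: two frames and two class descriptions in a 2-D feature space.
frames = [[1.0, 0.0], [0.0, 1.0]]
descriptions = [[1.0, 0.0], [0.0, 1.0]]
attended, weights = cross_attention(frames, descriptions)
```

In this toy setup each frame puts more weight on the description it most resembles. A real model would add learned query, key, and value projections and multiple heads.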

Country

Canada

Related Organizations

University of Guelph
Canada

Keywords

action recognition, Temporal Action Localization, video understanding

Impact by BIP!

selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Open access route: Green