Spatio-Temporal Volume-based Video Event Detection

Doctoral thesis (English, Open Access)
Wang, Jing
  • Subject: QA75

Online and offline video clips provide rich information about dynamic events that occur over a period of time, for example human actions, crowd behaviours, and other changes in subject patterns. Although substantial progress has been made over the last three decades in 2D image feature processing and its applications in areas such as face matching and object recognition, video event detection remains one of the most challenging fields in computer vision, owing to the wide range of continuous and non-linear signals captured by an imaging system and the inherent semantic difficulty of machine-based understanding of the detected feature patterns.

To bridge the gap between pixel-level image features and the semantic "meaning" of a single recorded human event, this research has investigated the problem domain by employing the 3D Spatio-Temporal Volume (STV) structure and its global feature paradigm for event pattern recognition. The processing pipeline follows an improved Pair-wise Region Comparison (I-PWRC) and a Region Intersection (RI) based 3D template matching approach for detecting and identifying human actions under uncontrolled real-world filming conditions. To maintain the run-time performance of this innovative system design, the programme has also developed an efficient pre-filtering mechanism that reduces the number of voxels (volumetric pixels) to be processed in each operational cycle.

To further improve the system's adaptability and robustness, several optimisation techniques, such as scale-invariant template matching and event location prediction mechanisms, have also been developed and implemented. The proposed design has been tested on several renowned online computer vision research databases and benchmarked against other classic implementation strategies and systems. Satisfactory evaluation results have been obtained through statistical analysis of standard test criteria such as recall rate and processing efficiency.
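To make the two core ideas of the abstract concrete, the sketch below shows (a) how a Spatio-Temporal Volume can be built by stacking grayscale frames into a 3D array, and (b) a simple voxel pre-filter that discards static background by thresholding temporal intensity change. This is a minimal illustration of the general STV concept, not the thesis's actual I-PWRC/RI implementation; the function names, the threshold value, and the synthetic moving-square scene are all assumptions made for this example.

```python
import numpy as np

def build_stv(frames):
    """Stack 2D grayscale frames (H x W) into a 3D spatio-temporal
    volume of shape (T, H, W)."""
    return np.stack(frames, axis=0)

def prefilter_voxels(stv, threshold=10.0):
    """Mark voxels whose intensity changes by more than `threshold`
    between consecutive frames. A detector then only needs to examine
    the marked voxels, approximating the kind of pre-filtering step
    the abstract describes for reducing per-cycle workload."""
    diff = np.abs(np.diff(stv.astype(np.float32), axis=0))
    mask = np.zeros(stv.shape, dtype=bool)
    mask[1:] = diff > threshold  # frame 0 has no predecessor, stays False
    return mask

# Synthetic example: 8 frames of a 32x32 scene with a bright square
# moving 3 pixels to the right per frame.
frames = []
for t in range(8):
    f = np.zeros((32, 32), dtype=np.uint8)
    f[10:14, 2 + 3 * t : 6 + 3 * t] = 255
    frames.append(f)

stv = build_stv(frames)        # shape (8, 32, 32)
mask = prefilter_voxels(stv)
print(stv.shape)
print(mask.sum(), "active voxels out of", stv.size)
```

Only the voxels touched by the moving square are marked, so a subsequent template-matching pass over the masked region inspects a small fraction of the full volume.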
