Using stereo vision to support the automated analysis of surveillance videos

Article, Other literature type | English | Open access
Menze, Moritz ; Muhle, Daniel (2012)
  • Publisher: Copernicus GmbH
  • Journal: (issn: 1682-1750, eissn: 2194-9034)
  • Related identifiers: doi: 10.5194/isprsarchives-XXXIX-B3-47-2012, doi: 10.15488/1092
  • Subject: TA1-2040 | T | Geowissenschaften | TA1501-1820 | Applied optics. Photonics | Engineering (General). Civil engineering (General) | Technology | Stereoscopic Vision | Image Matching | Image Sequences
    • ddc: ddc:550

Video surveillance systems are no longer collections of independent cameras manually controlled by human operators. Instead, smart sensor networks are being developed that can carry out certain tasks on their own and thus support security personnel through automated analysis. One well-known task is deriving people’s positions on a given ground plane from monocular video footage. Stereoscopic processing of overlapping views can be expected to improve the accuracy of the ground position and to yield a more detailed representation of single salient people. Related work mostly relies on dedicated stereo devices or camera pairs with a small baseline. While this set-up helps with the essential step of image matching, it does not exploit the high accuracy potential of a wide baseline and the correspondingly good intersection geometry. In this paper we present a stereoscopic approach that works on overlapping views of standard pan-tilt-zoom cameras; such views can easily be generated for arbitrary points of interest by appropriately reconfiguring parts of a sensor network. Experiments on realistic surveillance footage show the potential of the suggested approach and investigate the influence of different baselines on the quality of the derived surface model. Promising estimates of people’s position and height are obtained. Although standard matching approaches yield helpful results, future work will incorporate temporal dependencies available from image sequences in order to reduce computational effort and improve the level of detail obtained.
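The trade-off between small and wide baselines mentioned in the abstract can be illustrated with the textbook error-propagation formula for the stereo normal case (parallel viewing directions), where depth Z = f·B/d and therefore σ_Z ≈ Z² / (f·B) · σ_d. The following is only a rough sketch under that assumption; it is not the method used in the paper, convergent pan-tilt-zoom views only approximately follow this model, and the function name, focal length, and matching precision are hypothetical values chosen for illustration.

```python
def depth_std(depth_m, baseline_m, focal_px, disparity_std_px=0.5):
    """Approximate depth standard deviation (metres) for normal-case stereo.

    Assumes Z = f * B / d, so first-order error propagation gives
    sigma_Z = Z**2 / (f * B) * sigma_d. All inputs are illustrative
    assumptions, not values taken from the paper.
    """
    return depth_m ** 2 / (baseline_m * focal_px) * disparity_std_px


if __name__ == "__main__":
    # Hypothetical scene: person 20 m away, focal length of 1000 pixels.
    for baseline in (0.2, 2.0, 10.0):
        sigma = depth_std(depth_m=20.0, baseline_m=baseline, focal_px=1000.0)
        print(f"baseline {baseline:5.1f} m -> depth std {sigma:.3f} m")
```

Under this simplified model the depth uncertainty is inversely proportional to the baseline, which is why a wide baseline is attractive even though it makes image matching harder.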