
handle: 20.500.12608/35518
A human motion capture system can be defined as a process that digitally records the movements of a person and then translates them into computer-animated images. To achieve this goal, motion capture systems usually rely on several types of algorithms, including pose estimation and background subtraction. The latter aims to segment moving objects from the background under multiple challenging scenarios. Recently, encoder-decoder deep neural networks designed for this task have achieved impressive results, outperforming classical approaches. The aim of this thesis is to evaluate and discuss the predictions provided by the multi-scale convolutional neural network FgSegNet_v2, a deep learning-based method that represents the current state of the art in scene-specific background subtraction. In this work, FgSegNet_v2 is trained and tested on the BBSoF S.r.l. dataset, extending its scene-specific use to a more general application across several environments.
