Catalog data as of 13 March 2019
ISBN 978-3-8439-2342-2, series: Informatik (Computer Science)
Methods for Model-based and Model-free Recognition of Articulated Actions in Multi-View Environments
349 pages, dissertation, Friedrich-Schiller-Universität Jena (2015), softcover, A5
Automatically analyzing and recognizing human actions in continuous image sequences is one of the most important and popular problems in modern computer vision and machine learning research, with applications in many aspects of daily life. For instance, human motion can be analyzed to gain knowledge about the anatomy, pathology, and therapeutic progress of patients. Such systems also make it possible to categorize recordings of public or security-relevant places into risk levels. Many people have already encountered human action recognition systems in the form of human-machine interfaces in modern entertainment and gaming products.
Most traditional approaches to visual action recognition rely on a single view and therefore suffer from self-occlusions and ambiguities. Extending these setups to the multi-view case is intended to resolve these problems and to increase robustness and reliability.
This thesis covers the entire processing pipeline, from image acquisition through object detection and tracking to feature extraction and final reasoning. First, an extensive overview of recent approaches to single- and multi-view action recognition is given and arranged within a semantic taxonomy. Then, with the help of concrete examples, two methods for realizing tracking by detection are presented.
Focusing on action recognition, the thesis then contributes two methods that reflect an active dispute in the psychology of perception. The first attempts to recognize articulated actions in a model-based fashion by applying dimensionality reduction methods to the model parameter space; the second relies on a model-free image representation scheme. The core idea behind the second approach is the use of temporal self-similarity maps, which are able to capture the entire dynamics observed from a physical system and are invariant to translation and rotation in feature space. Furthermore, these representations are largely robust to viewpoint changes. Building on this, two methods for supervised as well as unsupervised action learning and classification are derived.
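To make the invariance property of temporal self-similarity maps concrete, the following minimal sketch builds such a map as the matrix of pairwise distances between per-frame feature vectors. It is an illustration under stated assumptions, not the method of the thesis: the Euclidean distance, the 2D toy features, and the helper names `self_similarity_map` and `rotate_translate` are all chosen here for demonstration.

```python
import math


def self_similarity_map(features):
    """Return the T x T matrix of Euclidean distances between frame features."""
    return [[math.dist(a, b) for b in features] for a in features]


def rotate_translate(features, angle, shift):
    """Rotate every 2D feature vector by `angle`, then translate it by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1])
            for x, y in features]


# Hypothetical 2D features for six frames of a periodic motion (toy data).
frames = [(math.cos(t * math.pi / 3), math.sin(t * math.pi / 3))
          for t in range(6)]

ssm = self_similarity_map(frames)
ssm_moved = self_similarity_map(
    rotate_translate(frames, angle=0.7, shift=(3.0, -1.5)))

# Largest entry-wise difference between the two maps; it is zero up to
# floating-point rounding because rigid motions preserve pairwise distances.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(ssm, ssm_moved)
               for a, b in zip(row_a, row_b))
print(max_diff)
```

Because the map depends only on distances between frames, rotating or translating all feature vectors together leaves it unchanged, which is the invariance described above. Robustness to viewpoint changes is a separate, empirical property that this toy example does not demonstrate.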
Finally, results of experiments performed for the qualitative and quantitative evaluation of all presented methods are reported. A new dataset for benchmarking model-free multi-view action recognition systems, acquired in the course of this thesis, is introduced.