
Dissertations / Theses on the topic 'Gesture recognition module'


Consult the top 18 dissertations / theses for your research on the topic 'Gesture recognition module.'


You can also download the full text of each publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Terzi, Matteo. "Learning interpretable representations for classification, anomaly detection, human gesture and action recognition." Doctoral thesis, Università degli studi di Padova, 2019. http://hdl.handle.net/11577/3423183.

Full text
Abstract:
The goal of this thesis is to provide algorithms and models for classification, gesture recognition and anomaly detection, with a partial focus on human activity. In applications where humans are involved, it is of paramount importance to provide robust and understandable algorithms and models. One way to meet this requirement is to use relatively simple and robust approaches, especially when devices are resource-constrained. A second approach, when a large amount of data is available, is to adopt complex algorithms and models and make them robust and interpretable from a human point of view. This motivates our thesis, which is divided into two parts. The first part is devoted to the development of parsimonious algorithms for action/gesture recognition in human-centric applications, such as sports, and for anomaly detection in the artificial pancreas. The data sources employed for the validation of our approaches consist of collections of time series coming from sensors such as accelerometers or glycaemic monitors. The main challenge in this context is to discard (i.e. be invariant to) the many nuisance factors that make the recognition task difficult, especially when many different users are involved. Moreover, in some cases data cannot be easily labelled, making supervised approaches unviable. We therefore present the mathematical tools and background with a focus on the recognition problems, and then derive novel methods for: (i) gesture/action recognition using sparse representations, for a sport application; (ii) gesture/action recognition using a symbolic representation, and its extension to the multivariate case; (iii) model-free, unsupervised anomaly detection for detecting faults in the artificial pancreas. These algorithms are well suited to deployment on resource-constrained devices, such as wearables. In the second part, we investigate the feasibility of deep learning frameworks where human interpretation is crucial.
Standard deep learning models are not robust and, unfortunately, the approaches in the literature that ensure robustness are typically detrimental to accuracy. However, real-world applications often require a minimum level of accuracy. In view of this, after reviewing some results from the recent literature, we formulate a new algorithm able to trade off between accuracy and robustness in a semantically meaningful way, given a cost-sensitive classification problem and a required accuracy threshold. In addition, we provide a link between robustness to input perturbations and interpretability, guided by a physical minimum-energy principle: leveraging optimal transport tools, we show that robust training is connected to the optimal transport problem. Thanks to these theoretical insights, we develop a new algorithm that provides robust, interpretable and more transferable representations.
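The symbolic time-series representation mentioned in point (ii) above can be illustrated with a minimal sketch. The concrete quantisation scheme below (z-normalise, piecewise-average, map to a small alphabet, in the spirit of SAX-style discretisation) is an illustrative assumption, not the thesis' exact method, and the gesture signals are invented:

```python
# A minimal sketch of a symbolic time-series representation: a 1-D
# signal is z-normalised, piecewise-averaged, and mapped to a small
# alphabet. The quantisation scheme and signals here are illustrative
# assumptions, not the exact method of the thesis.
import statistics

def symbolise(signal, segments=4, alphabet="abcd"):
    """Discretise a 1-D signal into a short symbolic word."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal) or 1.0
    z = [(x - mean) / std for x in signal]
    # Piecewise aggregate: average each of `segments` equal chunks.
    n = len(z)
    paa = [statistics.fmean(z[i * n // segments:(i + 1) * n // segments])
           for i in range(segments)]
    # Map each average onto equal-width bins over [-2, 2].
    k = len(alphabet)
    return "".join(alphabet[min(k - 1, max(0, int((v + 2) / 4 * k)))]
                   for v in paa)

# Two repetitions of the same gesture at different amplitude and offset
# yield the same symbolic word, giving invariance to those nuisances.
g1 = [0, 1, 2, 3, 3, 2, 1, 0]
g2 = [10, 12, 14, 16, 16, 14, 12, 10]  # same shape, rescaled and shifted
print(symbolise(g1), symbolise(g2))
```

The z-normalisation step is what discards amplitude and offset as nuisance factors; the short symbolic word can then be compared cheaply on a wearable.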
APA, Harvard, Vancouver, ISO, and other styles
2

Niezen, Gerrit. "The optimization of gesture recognition techniques for resource-constrained devices." Diss., Pretoria : [s.n.], 2008. http://upetd.up.ac.za/thesis/available/etd-01262009-125121/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Rajah, Christopher. "Chereme-based recognition of isolated, dynamic gestures from South African sign language with Hidden Markov Models." Thesis, University of the Western Cape, 2006. http://etd.uwc.ac.za/index.php?module=etd&action=viewtitle&id=gen8Srv25Nme4_4979_1183461652.

Full text
Abstract:
Much work has been done in building systems that can recognize gestures, e.g. as a component of sign language recognition systems. These systems typically use whole gestures as the smallest unit for recognition. Although high recognition rates have been reported, such systems do not scale well and are computationally intensive. The reason they generally scale poorly is that they recognize gestures by building an individual model for each separate gesture; as the number of gestures grows, so does the required number of models. Beyond a certain number of gestures, this approach becomes infeasible. This work proposed that similarly good recognition rates can be achieved by building models for subcomponents of whole gestures, so-called cheremes. Instead of building models for entire gestures, we build models for cheremes and recognize gestures as sequences of such cheremes. The assumption is that many gestures share cheremes and that the number of cheremes necessary to describe gestures is much smaller than the number of gestures. This small number of cheremes then makes it possible to recognize a large number of gestures with a small number of chereme models. The approach is akin to phoneme-based speech recognition, where utterances are recognized as sequences of phonemes which are in turn combined into words.
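The compositional idea above can be sketched in a few lines. In this illustration the chereme inventory and gesture lexicon are invented; in the thesis each chereme would be a Hidden Markov Model and decoding would use probabilistic search rather than exact matching:

```python
# A minimal sketch of chereme-based recognition: gestures are "spelled"
# as sequences drawn from a small shared inventory of sub-units
# (cheremes), much as words are spelled from phonemes in speech
# recognition. The inventory and lexicon below are invented examples.
CHEREMES = {"up", "down", "left", "right", "circle"}

# Gesture lexicon: the number of models grows with the chereme
# inventory, not with the (much larger) number of gestures.
LEXICON = {
    "hello":   ("up", "circle"),
    "goodbye": ("up", "down"),
    "yes":     ("down", "down"),
    "no":      ("left", "right"),
}

def recognise(chereme_sequence):
    """Map a decoded chereme sequence back to a gesture label."""
    for gesture, spelling in LEXICON.items():
        if tuple(chereme_sequence) == spelling:
            return gesture
    return None

print(recognise(["up", "circle"]))
```

Adding a new gesture only requires a new entry in the lexicon, not a new trained model, which is the scaling argument the abstract makes.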
APA, Harvard, Vancouver, ISO, and other styles
4

Ma, Limin. "Statistical Modeling of Video Event Mining." Ohio University / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1146792818.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Yang, Ruiduo. "Dynamic programming with multiple candidates and its applications to sign language and hand gesture recognition." [Tampa, Fla.] : University of South Florida, 2008. http://purl.fcla.edu/usf/dc/et/SFE0002310.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Bodiroža, Saša. "Gestures in human-robot interaction." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät, 2017. http://dx.doi.org/10.18452/17705.

Full text
Abstract:
Gestures consist of movements of body parts and are a means of communication that conveys information or intentions to an observer. They can therefore be used effectively in human-robot interaction, and in human-machine interaction in general, as a way for a robot or a machine to infer meaning. For people to use gestures intuitively and to understand robot gestures, it is necessary to define mappings between gestures and their associated meanings -- a gesture vocabulary. A human gesture vocabulary defines which gestures a group of people would intuitively use to convey information, while a robot gesture vocabulary specifies which robot gestures are deemed fitting for a particular meaning. Effective use of these vocabularies depends on gesture recognition, i.e. the classification of body motion into discrete gesture classes using pattern recognition and machine learning. This thesis addresses both research areas, presenting the development of gesture vocabularies as well as gesture recognition techniques, with a focus on hand and arm gestures. Attentional models for humanoid robots were developed as a prerequisite for human-robot interaction and a precursor to gesture recognition. A method for defining gesture vocabularies for humans and robots, based on user observations and surveys, is explained, and experimental results are presented. Following the robot gesture vocabulary experiment, an evolutionary approach for refining robot gestures is introduced, based on interactive genetic algorithms. A robust and well-performing gesture recognition algorithm based on dynamic time warping has been developed. Most importantly, it employs one-shot learning, meaning that it can be trained with a small number of samples and used in real-life scenarios, lowering the effect of environmental constraints and gesture features. Finally, an approach for learning the relation between self-motion and pointing gestures is presented.
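The one-shot, DTW-based scheme described above can be sketched as a nearest-template classifier: each gesture class is represented by a single training example, and a query is assigned to the class of its closest template under dynamic time warping. The 1-D gesture templates below are invented for illustration; the thesis works with real motion data:

```python
# One-shot gesture recognition sketch: one DTW template per class,
# nearest-template classification. Gesture data invented for illustration.
def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) DTW distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

def classify(query, templates):
    """One-shot classification: a single training example per class."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))

templates = {
    "wave": [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0],
    "push": [0.0, 0.5, 1.0, 1.5, 2.0, 2.0, 2.0],
}
# A slower, slightly noisy wave should still match the "wave" template,
# because DTW absorbs differences in execution speed.
query = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5, 0.0, 1.0, 0.0]
print(classify(query, templates))
```

The time-warping alignment is what makes a single template per class sufficient: variations in speed between users do not change the matched class.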
APA, Harvard, Vancouver, ISO, and other styles
7

Pavllo, Dario. "Riconoscimento real-time di gesture tramite tecniche di machine learning." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/10999/.

Full text
Abstract:
Gesture recognition is a research topic that has been gaining ever more popularity, especially in recent years, thanks to technological advances in embedded devices and sensors. The aim of this thesis is to use machine learning techniques to build a system able to recognise and classify hand gestures in real time from the myoelectric (EMG) signals produced by the muscles. In addition, to allow the recognition of complex spatial movements, inertial signals are also processed, coming from an Inertial Measurement Unit (IMU) equipped with an accelerometer, a gyroscope and a magnetometer. The first part of the thesis, besides offering an overview of wearable devices and sensors, analyses several techniques for the classification of temporal sequences, highlighting their advantages and disadvantages. In particular, approaches based on Dynamic Time Warping (DTW), Hidden Markov Models (HMM), and Long Short-Term Memory (LSTM) recurrent neural networks (RNN), one of the latest developments in deep learning, are considered. The second part concerns the project itself. The Myo wearable device by Thalmic Labs is employed as a case study, and the DTW- and HMM-based techniques are applied in detail to design and build a framework able to perform real-time gesture recognition. The final chapter presents the results obtained (including a comparison of the techniques analysed), both for the classification of isolated gestures and for real-time recognition.
APA, Harvard, Vancouver, ISO, and other styles
8

Gurrapu, Chaitanya. "Human Action Recognition In Video Data For Surveillance Applications." Thesis, Queensland University of Technology, 2004. https://eprints.qut.edu.au/15878/1/Chaitanya_Gurrapu_Thesis.pdf.

Full text
Abstract:
Detecting human actions using a camera has many possible applications in the security industry. When a human performs an action, his/her body goes through a signature sequence of poses. To detect these pose changes and hence the activities performed, a pattern recogniser needs to be built into the video system. Due to the temporal nature of the patterns, Hidden Markov Models (HMM), used extensively in speech recognition, were investigated. Initially a gesture recognition system was built using novel features. These features were obtained by approximating the contour of the foreground object with a polygon and extracting the polygon's vertices. A Gaussian Mixture Model (GMM) was fit to the vertices obtained from a few frames and the parameters of the GMM itself were used as features for the HMM. A more practical activity detection system using a more sophisticated foreground segmentation algorithm immune to varying lighting conditions and permanent changes to the foreground was then built. The foreground segmentation algorithm models each of the pixel values using clusters and continually uses incoming pixels to update the cluster parameters. Cast shadows were identified and removed by assuming that shadow regions were less likely to produce strong edges in the image than real objects and that this likelihood further decreases after colour segmentation. Colour segmentation itself was performed by clustering together pixel values in the feature space using a gradient ascent algorithm called mean shift. More robust features in the form of mesh features were also obtained by dividing the bounding box of the binarised object into grid elements and calculating the ratio of foreground to background pixels in each of the grid elements. These features were vector quantized to reduce their dimensionality and the resulting symbols presented as features to the HMM to achieve a recognition rate of 62% for an event involving a person writing on a white board. 
The recognition rate increased to 80% for the "seen" person sequences, i.e. the sequences of the person used to train the models. With a fixed lighting position, the lack of a shadow removal subsystem improved the detection rate. This is because of the consistent profile of the shadows in both the training and testing sequences due to the fixed lighting positions. Even with a lower recognition rate, the shadow removal subsystem was considered an indispensable part of a practical, generic surveillance system.
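The "mesh features" described above can be sketched directly: the bounding box of the binarised object is divided into grid cells, and the feature for each cell is the ratio of foreground pixels it contains. The mask and grid size below are invented for illustration:

```python
# A minimal sketch of mesh features: per-cell foreground-pixel ratios
# over a grid laid on a binary mask (1 = foreground). Toy data only.
import numpy as np

def mesh_features(mask, grid=(2, 2)):
    """Return the foreground ratio of each grid cell, row-major order."""
    feats = []
    for row in np.array_split(mask, grid[0], axis=0):
        for cell in np.array_split(row, grid[1], axis=1):
            feats.append(cell.mean())  # fraction of foreground pixels
    return np.array(feats)

mask = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
])
print(mesh_features(mask))  # -> [1.   0.   0.   0.25]
```

In the thesis these per-cell ratios are then vector-quantised into discrete symbols before being fed to the HMM; the sketch stops at the feature vector.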
APA, Harvard, Vancouver, ISO, and other styles
9

Gurrapu, Chaitanya. "Human Action Recognition In Video Data For Surveillance Applications." Queensland University of Technology, 2004. http://eprints.qut.edu.au/15878/.

Full text
Abstract:
Detecting human actions using a camera has many possible applications in the security industry. When a human performs an action, his/her body goes through a signature sequence of poses. To detect these pose changes and hence the activities performed, a pattern recogniser needs to be built into the video system. Due to the temporal nature of the patterns, Hidden Markov Models (HMM), used extensively in speech recognition, were investigated. Initially a gesture recognition system was built using novel features. These features were obtained by approximating the contour of the foreground object with a polygon and extracting the polygon's vertices. A Gaussian Mixture Model (GMM) was fit to the vertices obtained from a few frames and the parameters of the GMM itself were used as features for the HMM. A more practical activity detection system using a more sophisticated foreground segmentation algorithm immune to varying lighting conditions and permanent changes to the foreground was then built. The foreground segmentation algorithm models each of the pixel values using clusters and continually uses incoming pixels to update the cluster parameters. Cast shadows were identified and removed by assuming that shadow regions were less likely to produce strong edges in the image than real objects and that this likelihood further decreases after colour segmentation. Colour segmentation itself was performed by clustering together pixel values in the feature space using a gradient ascent algorithm called mean shift. More robust features in the form of mesh features were also obtained by dividing the bounding box of the binarised object into grid elements and calculating the ratio of foreground to background pixels in each of the grid elements. These features were vector quantized to reduce their dimensionality and the resulting symbols presented as features to the HMM to achieve a recognition rate of 62% for an event involving a person writing on a white board. 
The recognition rate increased to 80% for the "seen" person sequences, i.e. the sequences of the person used to train the models. With a fixed lighting position, the lack of a shadow removal subsystem improved the detection rate. This is because of the consistent profile of the shadows in both the training and testing sequences due to the fixed lighting positions. Even with a lower recognition rate, the shadow removal subsystem was considered an indispensable part of a practical, generic surveillance system.
APA, Harvard, Vancouver, ISO, and other styles
10

Jaroň, Lukáš. "Ovládání počítače gesty." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236609.

Full text
Abstract:
This master's thesis describes the possibilities and principles of gesture-based computer interfaces. The work describes general approaches to gesture control. It also deals with the implementation of a selected method for detecting hands and fingers using depth maps captured by a Kinect sensor. The implementation further covers gesture recognition using hidden Markov models. For demonstration purposes, a simple photo viewer built on the developed gesture-based interface is described. The work also focuses on quality testing and accuracy evaluation of the selected gesture recogniser.
APA, Harvard, Vancouver, ISO, and other styles
11

Clay, Alexis. "La branche émotion, un modèle conceptuel pour l’intégration de la reconnaissance multimodale d’émotions dans des applications interactives : application au mouvement et à la danse augmentée." Thesis, Bordeaux 1, 2009. http://www.theses.fr/2009BOR13935/document.

Full text
Abstract:
Computer-based emotion recognition is a young but maturing field, and its growth creates new needs in terms of software modelling and integration into existing models. This thesis describes a conceptual framework for designing interactive applications that are aware of the user's emotions. Our approach builds on conceptual results from the field of multimodal interaction: we redefine the concepts of modality and multimodality within the frame of passive emotion recognition. We then describe a component-based conceptual model relying on this redefinition, the emotion branch, which facilitates the design, development and maintenance of emotionally-aware systems. Following this model, a multimodal application for recognising emotions from gestures was developed and integrated into an augmented-reality system that augments a ballet performance according to the dancer's expressed emotions.
APA, Harvard, Vancouver, ISO, and other styles
12

Fdili, Alaoui Sarah. "Analyse du geste dansé et retours visuels par modèles physiques : apport des qualités de mouvement à l'interaction avec le corps entier." Phd thesis, Université Paris Sud - Paris XI, 2012. http://tel.archives-ouvertes.fr/tel-00805519.

Full text
Abstract:
This thesis deepens the study of gesture in the context of human-computer interaction. The aim is to create new interaction paradigms that offer the user richer possibilities of expression based on gesture. One vector of gestural expression, very rarely addressed in human-computer interaction, which gives a gesture its colour and appearance, is what dance theorists and practitioners call "movement qualities". We draw on collaborations with the dance community to study the notion of movement qualities and to integrate it into gestural interaction paradigms. Our work analyses the benefits of integrating movement qualities as an interaction modality, provides the tools needed for this integration (in terms of analysis, visualisation and gestural control methods), and develops and evaluates several interaction techniques. The contributions of the thesis lie first in the formalisation of the notion of movement qualities and in the evaluation of its integration into an interactive system in terms of user experience. Regarding the visualisation of movement qualities, the work carried out during the thesis demonstrated that mass-spring physical models offer great possibilities for simulating dynamic behaviours and for real-time control. Regarding analysis, the thesis developed novel approaches for the automatic recognition of the user's movement qualities. Finally, building on these analysis and visualisation approaches, the thesis led to the implementation of a set of interaction techniques, which were applied and evaluated in the context of dance pedagogy and performance.
APA, Harvard, Vancouver, ISO, and other styles
13

Truong, Arthur. "Analyse du contenu expressif des gestes corporels." Thesis, Evry, Institut national des télécommunications, 2016. http://www.theses.fr/2016TELE0015/document.

Full text
Abstract:
Research on gesture analysis currently suffers from a lack of unified models. On the one hand, formalisations of gesture in the human sciences remain purely theoretical and do not lend themselves to quantification; on the other hand, commonly used motion descriptors are generally purely intuitive and limited to the visual aspects of the gesture. In this work, we take Laban Movement Analysis (LMA), originally designed for the study of dance movements, as a framework for building our own gesture descriptors based on expressivity. Two 3D gesture datasets are introduced. The first, ORCHESTRE-3D, is composed of pre-segmented orchestra conductors' gestures recorded in rehearsal and annotated with a lexicon of musical emotions, intended for studying the emotional content of musical conducting. The second, HTI 2014-2015, comprises sequences of varied daily actions. In a first, "global" recognition approach, we define a feature vector that characterises the gesture as a whole. This descriptor lets us discriminate between various actions and recognise the different musical emotions carried by the conductors' gestures in our ORCHESTRE-3D dataset. In a second, "dynamic" approach, we define a frame-level descriptor (i.e. one defined at every instant of the gesture). These frame descriptors are used to extract key poses of the motion, yielding at every instant a simplified representation that can be used to recognise actions on the fly. We test our approach on several gesture datasets, including our own HTI 2014-2015 corpus.
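The key-pose idea above can be sketched with a simple selection rule: keep a frame whenever its descriptor drifts far enough from the last kept key pose. The 2-D descriptors and threshold below are invented for illustration; the thesis builds its frame descriptor from Laban-inspired expressive features and may use a different extraction scheme:

```python
# A minimal key-pose extraction sketch over per-frame descriptors:
# a frame becomes a key pose when it moves > threshold away from the
# previously kept key pose. Descriptors and threshold are toy values.
import numpy as np

def extract_key_poses(frames, threshold=1.0):
    """Keep frames whose descriptor is > threshold from the last key pose."""
    key_poses = [frames[0]]
    for f in frames[1:]:
        if np.linalg.norm(f - key_poses[-1]) > threshold:
            key_poses.append(f)
    return np.array(key_poses)

# A toy gesture: descriptors drift slowly, jump to a new pose, return.
frames = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.1],
                   [2.0, 2.0], [2.1, 2.0], [0.0, 0.1]])
print(len(extract_key_poses(frames)))  # -> 3
```

The sequence of key poses is the simplified per-frame representation the abstract mentions: small enough to match against on the fly, yet preserving the salient configurations of the motion.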
APA, Harvard, Vancouver, ISO, and other styles
14

Truong, Arthur. "Analyse du contenu expressif des gestes corporels." Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2016. http://www.theses.fr/2016TELE0015.

Full text
Abstract:
Research on gesture analysis currently suffers from a lack of unified models. On the one hand, formalisations of gesture in the human sciences remain purely theoretical and do not lend themselves to quantification; on the other hand, commonly used motion descriptors are generally purely intuitive and limited to the visual aspects of the gesture. In this work, we take Laban Movement Analysis (LMA), originally designed for the study of dance movements, as a framework for building our own gesture descriptors based on expressivity. Two 3D gesture datasets are introduced. The first, ORCHESTRE-3D, is composed of pre-segmented orchestra conductors' gestures recorded in rehearsal and annotated with a lexicon of musical emotions, intended for studying the emotional content of musical conducting. The second, HTI 2014-2015, comprises sequences of varied daily actions. In a first, "global" recognition approach, we define a feature vector that characterises the gesture as a whole. This descriptor lets us discriminate between various actions and recognise the different musical emotions carried by the conductors' gestures in our ORCHESTRE-3D dataset. In a second, "dynamic" approach, we define a frame-level descriptor (i.e. one defined at every instant of the gesture). These frame descriptors are used to extract key poses of the motion, yielding at every instant a simplified representation that can be used to recognise actions on the fly. We test our approach on several gesture datasets, including our own HTI 2014-2015 corpus.
APA, Harvard, Vancouver, ISO, and other styles
15

Mihoub, Alaeddine. "Apprentissage statistique de modèles de comportement multimodal pour les agents conversationnels interactifs." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAT079/document.

Full text
Abstract:
Face-to-face interaction is one of the most fundamental forms of human communication. It is a highly complex, coupled and multimodal dynamic system involving not only speech but numerous body segments, including gaze, the orientation of the head, chest and body, and facial and brachiomanual gestures. Understanding and modeling this type of communication is a crucial step in designing interactive agents capable of engaging in credible conversations with human partners. Concretely, a model of multimodal behavior for interactive social agents faces the complex task of generating multimodal behavior given an analysis of the scene and an incremental estimation of the joint objectives pursued during the conversation. The objective of this thesis is to develop models of multimodal behavior that allow artificial agents to conduct relevant co-verbal communication with a human partner. While the vast majority of work in the field of human-agent interaction relies on rule-based models, our approach is based on statistical modeling of social interactions from traces collected during exemplary interactions demonstrated by human tutors. In this context, we introduce so-called "sensorimotor" behavior models, which perform both the recognition of joint cognitive states and the generation of social signals in an incremental way. In particular, the proposed behavior models aim to estimate the interaction unit (IU) in which the interlocutors are jointly engaged and to generate the co-verbal behavior of the human tutor given the observed behavior of his or her interlocutor(s).
The proposed models are mainly probabilistic graphical models based on Hidden Markov Models (HMMs) and Dynamic Bayesian Networks (DBNs). The models were trained and evaluated, in particular compared with classic classifiers, on datasets collected during two different face-to-face interactions. Both interactions were carefully designed so as to collect, in a minimum amount of time, a sufficient number of exemplars of mutual-attention management and multimodal deixis of objects and places. Our contributions are completed by original methods for interpreting and evaluating the properties of the proposed models. When comparing all models against the ground-truth interaction traces, the results show that the HMM, thanks to its sequential modeling properties, outperforms the simple classifiers in terms of performance. Hidden semi-Markov models (HSMMs) were also tested and yielded a better sensorimotor loop thanks to their modeling of state durations. Finally, thanks to a rich dependency structure learnt from the data, the DBN achieves the most convincing performance and also demonstrates the multimodal coordination most faithful to the original multimodal events.
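The abstract above describes classifying an observed behavior stream by probabilistic sequence models such as HMMs. A minimal illustrative sketch of that idea, not the thesis's actual models: score a discrete observation sequence under several hand-built HMMs with the scaled forward algorithm and pick the most likely one (the two toy "behavior" models, their parameters, and the binary signal labels are all assumptions for illustration):

```python
import numpy as np

def hmm_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | discrete HMM (pi, A, B))."""
    alpha = pi * B[:, obs[0]]          # joint prob. of hidden state and first symbol
    logp = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate states, weight by emission prob.
        c = alpha.sum()
        logp += np.log(c)              # accumulate scaling factors
        alpha /= c
    return logp

def classify(obs, models):
    """Pick the model under which the observed sequence is most likely."""
    return max(models, key=lambda name: hmm_log_likelihood(obs, *models[name]))

# Two toy left-to-right models over a binary signal stream (symbol 0 vs symbol 1).
pi = np.array([1.0, 0.0])
A = np.array([[0.7, 0.3],
              [0.0, 1.0]])
B_look_then_point = np.array([[0.9, 0.1],   # state 0 mostly emits symbol 0
                              [0.1, 0.9]])  # state 1 mostly emits symbol 1
B_point_then_look = B_look_then_point[:, ::-1].copy()
models = {
    "look_then_point": (pi, A, B_look_then_point),
    "point_then_look": (pi, A, B_point_then_look),
}

print(classify([0, 0, 0, 1, 1, 1], models))  # look_then_point
```

With one trained HMM per interaction unit, the same likelihood comparison yields an incremental recognizer: re-score the growing observation prefix and report the current best model.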
APA, Harvard, Vancouver, ISO, and other styles
16

"Real-time gesture recognition using MEMS acceleration sensors." 2009. http://library.cuhk.edu.hk/record=b5896907.

Full text
Abstract:
by Zhou, Shengli.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2009.
Includes bibliographical references (leaves 70-75).
Abstract also in Chinese.
Chapter 1 --- Introduction --- p.1
1.1 --- Background of Gesture Recognition --- p.1
1.2 --- HCI System --- p.2
1.2.1 --- Vision Based HCI System --- p.2
1.2.2 --- Accelerometer Based HCI System --- p.4
1.3 --- Pattern Recognition Methods --- p.6
1.4 --- Thesis Outline --- p.7
Chapter 2 --- 2D Hand-Written Character Recognition --- p.8
2.1 --- Introduction to Accelerometer Based Hand-Written Character Recognition --- p.8
2.1.1 --- Character Recognition Based on Trajectory Reconstruction --- p.9
2.1.2 --- Character Recognition Based on Classification --- p.10
2.2 --- Neural Network --- p.11
2.2.1 --- Mathematical Model of Neural Network (NN) --- p.11
2.2.2 --- Types of Neural Network Learning --- p.13
2.2.3 --- Self-Organizing Maps (SOMs) --- p.14
2.2.4 --- Properties of Neural Network --- p.16
2.3 --- Experimental Setup --- p.17
2.4 --- Configuration of Sensing Mote --- p.18
2.5 --- Data Acquisition Methods --- p.19
2.6 --- Data Preprocessing Methods --- p.20
2.6.1 --- Fast Fourier Transform (FFT) --- p.21
2.6.2 --- Discrete Cosine Transform (DCT) --- p.23
2.6.3 --- Problem Analysis --- p.25
2.7 --- Hand-Written Character Classification Using SOMs --- p.26
2.7.1 --- Recognition of All Characters in the Same Group --- p.27
2.7.2 --- Recognize the Numbers and Letters Respectively --- p.28
2.8 --- Conclusion --- p.29
Chapter 3 --- Human Gesture Recognition --- p.32
3.1 --- Introduction to Human Gesture Recognition --- p.32
3.1.1 --- Dynamic Gesture Recognition --- p.32
3.1.2 --- Hidden Markov Models (HMMs) --- p.33
3.1.2.1 --- Applications of HMMs --- p.34
3.1.2.2 --- Training Algorithm --- p.35
3.1.2.3 --- Recognition Algorithm --- p.35
3.2 --- System Architecture --- p.36
3.2.1 --- Experimental Devices --- p.36
3.2.2 --- Data Acquisition Methods --- p.38
3.2.3 --- System Work Flow --- p.39
3.3 --- Real-Time Gesture Spotting --- p.40
3.3.1 --- Introduction --- p.40
3.3.2 --- Gesture Segmentation Based on Standard Deviation Calculation --- p.42
3.3.3 --- Evaluation of Gesture Spotting Program --- p.47
3.4 --- Comparison of Data Processing Methods --- p.48
3.4.1 --- Discrete Cosine Transform (DCT) --- p.48
3.4.2 --- Discrete Wavelet Transform (DWT) --- p.49
3.4.3 --- Zero Bias Compensation and Filtering (ZBC&F) --- p.51
3.4.4 --- Comparison of Experimental Results --- p.52
3.5 --- Database Setup --- p.53
3.6 --- Experimental Results Based on the Database Obtained from Ten Test Subjects --- p.53
3.6.1 --- Experimental Results When Gestures Are Manually and Automatically "Cut" --- p.54
3.6.2 --- The Influence of Number of Dominant Frequencies on Recognition --- p.55
3.6.3 --- The Influence of Sampling Frequencies on Recognition --- p.59
3.6.4 --- Influence of Number of Test Subjects on Recognition --- p.62
3.6.4.1 --- Experimental Results When Training and Testing Subjects Are Overlapped --- p.61
3.6.4.2 --- Experimental Results When Training and Testing Subjects Are Not Overlapped --- p.62
3.6.4.3 --- Discussion --- p.65
Chapter 4 --- Conclusion --- p.68
Bibliography --- p.70
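Section 3.3.2 of the outline above covers gesture segmentation based on standard-deviation calculation. A minimal sketch of the general idea, with a sliding window over one accelerometer channel flagged as "gesture" whenever its standard deviation exceeds a threshold (the synthetic signal, window size and threshold are illustrative assumptions, not the thesis's actual parameters):

```python
import numpy as np

def spot_gesture_samples(signal, win=20, thresh=0.3):
    """Flag every sample covered by a window whose std-dev exceeds thresh."""
    active = np.zeros(len(signal), dtype=bool)
    for i in range(len(signal) - win + 1):
        if signal[i:i + win].std() > thresh:
            active[i:i + win] = True   # motion detected somewhere in this window
    return active

# Synthetic accelerometer trace: low-level sensor noise with a burst in the middle.
rng = np.random.default_rng(0)
t = np.arange(300)
signal = rng.normal(0.0, 0.01, 300)                 # device at rest
signal[100:150] += 2.0 * np.sin(0.5 * t[100:150])   # simulated hand movement

active = spot_gesture_samples(signal)
print(active[120], active[10], active[280])  # True False False
```

Only the samples around the movement burst are marked active; the spotted segment can then be handed to the downstream HMM classifier.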
APA, Harvard, Vancouver, ISO, and other styles
17

Frieslaar, Ibraheem. "Robust South African sign language gesture recognition using hand motion and shape." 2014. http://hdl.handle.net/11394/3526.

Full text
Abstract:
Magister Scientiae - MSc
Research has shown that five fundamental parameters are required to recognise any sign language gesture: hand shape, hand motion, hand location, hand orientation and facial expressions. The South African Sign Language (SASL) research group at the University of the Western Cape (UWC) has created several systems to recognise sign language gestures using single parameters. These systems are, however, limited to a vocabulary size of 20–23 signs, beyond which the recognition accuracy is expected to decrease. The first aim of this research is to investigate the use of two parameters, hand motion and hand shape, to recognise a larger vocabulary of SASL gestures at a high accuracy. Furthermore, the majority of related work in the field of sign language gesture recognition using these two parameters makes use of Hidden Markov Models (HMMs) to classify gestures. Hidden Markov Support Vector Machines (HM-SVMs) are a relatively new technique that uses Support Vector Machines (SVMs) to simulate the functions of HMMs. Research indicates that HM-SVMs may perform better than HMMs in some applications, but to our knowledge they have not previously been applied to sign language gesture recognition. This research compares the two techniques in the context of SASL gesture recognition. The results indicate that using two parameters yields a 15% increase in accuracy over the use of a single parameter. It is also shown that HM-SVMs are the more accurate technique, generally performing better than, or at least as well as, HMMs.
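The abstract's core claim is that fusing two parameters (hand motion and hand shape) separates signs that a single parameter confuses. A toy sketch of that fusion idea using concatenated feature vectors and a nearest-centroid rule (the feature values, sign names and classifier choice are illustrative assumptions, not the thesis's HM-SVM method):

```python
import numpy as np

def nearest_centroid(x, centroids):
    """Return the label of the class centroid closest to feature vector x."""
    return min(centroids, key=lambda lbl: np.linalg.norm(x - centroids[lbl]))

# Illustrative features: two signs share the same hand-motion descriptor
# but differ in their hand-shape descriptor.
motion = {"sign_a": np.array([1.0, 0.0]), "sign_b": np.array([1.0, 0.0])}
shape  = {"sign_a": np.array([0.0, 1.0]), "sign_b": np.array([1.0, 0.0])}

# Motion-only centroids are identical and cannot separate the two signs;
# concatenated motion+shape centroids can.
fused_centroids = {k: np.concatenate([motion[k], shape[k]]) for k in motion}

query = np.concatenate([np.array([1.0, 0.0]),    # observed motion
                        np.array([0.1, 0.9])])   # observed shape, close to sign_a
print(nearest_centroid(query, fused_centroids))  # sign_a
```

The same concatenation step applies regardless of the final classifier; in the thesis the fused observations feed HMMs and HM-SVMs rather than centroids.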
APA, Harvard, Vancouver, ISO, and other styles
18

Almeida, Rui Nuno de. "Portuguese sign language recognition via computer vision and depth sensor." Master's thesis, 2011. http://hdl.handle.net/10071/8231.

Full text
Abstract:
Sign languages are used worldwide by a multitude of individuals. They are mostly used by deaf communities and their teachers, or by people associated with them through ties of friendship or family. Signers are a minority of citizens, often segregated, and over the years not much attention has been given to this form of communication, even by the scientific community. In fact, in Computer Science there is some, but limited, research and development in this area. In the particular case of Portuguese Sign Language (PSL) this fact is even more evident and, to our knowledge, there is not yet an efficient system for the automatic recognition of PSL signs. With the advent and wide availability of devices such as depth sensors, there are new possibilities to address this problem. In this thesis we have specified, developed, tested and preliminarily evaluated solutions that we believe bring valuable contributions to the problem of automatic gesture recognition applied to sign languages, such as Portuguese Sign Language. In the context of this work, Computer Vision techniques were adapted to the case of depth sensors. A gesture taxonomy suited to this problem was proposed, and techniques for feature extraction, representation, storage and classification were presented. Two novel algorithms for the real-time recognition of isolated static poses were specified, developed, tested and evaluated. Two further algorithms for the recognition of isolated dynamic gestures (one of them novel) were also specified, developed, tested and evaluated. The analysed results compare well with the literature.
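The abstract mentions algorithms for recognising isolated dynamic gestures. One common baseline for matching a captured trajectory against stored templates, shown here only as an illustration and not necessarily one of the thesis's algorithms, is dynamic time warping (DTW), which tolerates differences in execution speed:

```python
def dtw(a, b):
    """Dynamic time warping distance between two 1-D trajectories."""
    n, m = len(a), len(b)
    D = [[float("inf")] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # best of diagonal match, insertion, and deletion
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D[n][m]

template = [0.0, 1.0, 2.0, 1.0, 0.0]                  # stored gesture trajectory
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]       # same gesture, executed slower
other = [2.0, 2.0, 2.0, 2.0, 2.0]                     # a different movement

print(dtw(template, slow))                  # 0.0 - warping absorbs the speed change
print(dtw(template, other) > dtw(template, slow))  # True
```

A recognizer built this way stores one template per gesture and returns the label of the template with the smallest warped distance to the observed trajectory.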
APA, Harvard, Vancouver, ISO, and other styles