Academic literature on the topic 'Multimodale Annotation'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Multimodale Annotation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Multimodale Annotation"

1

Kleida, Danae. "Entering a dance performance through multimodal annotation: annotating with scores." International Journal of Performance Arts and Digital Media 17, no. 1 (January 2, 2021): 19–30. http://dx.doi.org/10.1080/14794713.2021.1880182.

2

Debras, Camille. "How to prepare the video component of the Diachronic Corpus of Political Speeches for multimodal analysis." Research in Corpus Linguistics 9, no. 1 (2021): 132–51. http://dx.doi.org/10.32714/ricl.09.01.08.

Abstract:
The Diachronic Corpus of Political Speeches (DCPS) is a collection of 1,500 full-length political speeches in English. It includes speeches delivered in countries where English is an official language (the US, Britain, Canada, Ireland) by English-speaking politicians in various settings from 1800 up to the present time. Enriched with semi-automatic morphosyntactic annotations and with discourse-pragmatic manual annotations, the DCPS is designed to achieve maximum representativeness and balance for political English speeches from major national English varieties in time, preserve detailed metadata, and enable corpus-based studies of syntactic, semantic and discourse-pragmatic variation and change on political corpora. For speeches given from 1950 onwards, video-recordings of the original delivery are often retrievable online. This opens up avenues of research in multimodal linguistics, in which studies on the integration of speech and gesture in the construction of meaning can include analyses of recurrent gestures and of multimodal constructions. This article discusses the issues at stake in preparing the video-recorded component of the DCPS for linguistic multimodal analysis, namely the exploitability of recordings, the segmentation and alignment of transcriptions, the annotation of gesture forms and functions in the software ELAN and the quantity of available gesture data.
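
To illustrate the kind of ELAN-based processing this abstract mentions, here is a minimal pure-Python sketch that summarises gesture annotations in an ELAN (.eaf) file; the file name is hypothetical and real DCPS tiers may be organised differently.

```python
# Sketch: summarise annotations in an ELAN (.eaf) file with the standard
# library only. The file name below is hypothetical.
import xml.etree.ElementTree as ET
from collections import Counter

def tier_summary(eaf_path):
    root = ET.parse(eaf_path).getroot()
    # Map time-slot ids to millisecond values (unaligned slots default to 0).
    slots = {ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE", 0))
             for ts in root.iter("TIME_SLOT")}
    counts, durations = Counter(), Counter()
    for tier in root.iter("TIER"):
        tier_id = tier.get("TIER_ID")
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            counts[tier_id] += 1
            durations[tier_id] += end - start
    return counts, durations

if __name__ == "__main__":
    counts, durations = tier_summary("speech_example.eaf")  # hypothetical file
    for tier, n in counts.items():
        print(f"{tier}: {n} annotations, {durations[tier] / 1000:.1f} s annotated")
```
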
3

Chou, Chien-Li, Hua-Tsung Chen, and Suh-Yin Lee. "Multimodal Video-to-Near-Scene Annotation." IEEE Transactions on Multimedia 19, no. 2 (February 2017): 354–66. http://dx.doi.org/10.1109/tmm.2016.2614426.

4

Saneiro, Mar, Olga C. Santos, Sergio Salmeron-Majadas, and Jesus G. Boticario. "Towards Emotion Detection in Educational Scenarios from Facial Expressions and Body Movements through Multimodal Approaches." Scientific World Journal 2014 (2014): 1–14. http://dx.doi.org/10.1155/2014/484873.

Abstract:
We report current findings when considering video recordings of facial expressions and body movements to provide affective personalized support in an educational context from an enriched multimodal emotion detection approach. In particular, we describe an annotation methodology to tag facial expression and body movements that conform to changes in the affective states of learners while dealing with cognitive tasks in a learning process. The ultimate goal is to combine these annotations with additional affective information collected during experimental learning sessions from different sources such as qualitative, self-reported, physiological, and behavioral information. Together, these data are used to train data mining algorithms that automatically identify changes in the learners’ affective states when dealing with cognitive tasks, which in turn helps to provide personalized emotional support.
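
As a rough illustration of the fusion step described here, the following sketch merges per-window annotation tables with other affective features and trains a classifier; the file names and column layout are invented for illustration and are not the study's actual data format.

```python
# Sketch: fuse behavioural annotations with other per-window features and
# train a classifier on affective-state changes. CSV names and columns are
# hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

facial = pd.read_csv("facial_annotations.csv")      # window_id, brow_raise, smile, ...
body = pd.read_csv("body_annotations.csv")          # window_id, lean_forward, ...
physio = pd.read_csv("physiological_features.csv")  # window_id, hr_mean, eda_peaks, ...
labels = pd.read_csv("affect_labels.csv")           # window_id, state_change (0/1)

data = (facial.merge(body, on="window_id")
              .merge(physio, on="window_id")
              .merge(labels, on="window_id"))
X = data.drop(columns=["window_id", "state_change"])
y = data["state_change"]

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```
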
5

Ladilova, Anna. "Multimodal Metaphors of Interculturereality / Metaforas multimodais da interculturealidade." REVISTA DE ESTUDOS DA LINGUAGEM 28, no. 2 (May 5, 2020): 917–55. http://dx.doi.org/10.17851/2237-2083.28.2.917-955.

Abstract:
The present paper looks at the interactive construction of multimodal metaphors of interculturereality – a term coined by the author from interculturality and intercorporeality, assuming that intercultural interaction is always an embodied phenomenon, shared among its participants. For this, two videotaped sequences of a group conversation are analyzed drawing upon interaction analysis (Couper-Kuhlen; Selting, 2018). The data was transcribed following the GAT2 (Selting et al., 2011) guidelines, including gesture form annotation, which relied on the system described by Bressem (2013). Gesture function was interpreted drawing on the interactional context and on the system proposed by Kendon (2004) and Bressem and Müller (2013). The results question the validity of the classical conduit metaphor of communication (Reddy, 1979) in the intercultural context and instead propose an embodied approach to the conceptualization of the understanding process among the participants. The analysis also shows that even though the metaphors are multimodal, the metaphoric content is not always evenly distributed among the different modalities (speech, gesture). Apart from that, the metaphorical content is constructed sequentially, referring to preceding metaphors used by the same or different interlocutors and associated with metaphorical blends.
Keywords: metaphors; multimodality; interculturality; intercorporeality; migration.
6

Hardison, Debra M. "Visualizing the acoustic and gestural beats of emphasis in multimodal discourse." Journal of Second Language Pronunciation 4, no. 2 (December 31, 2018): 232–59. http://dx.doi.org/10.1075/jslp.17006.har.

Abstract:
Perceivers’ attention is entrained to the rhythm of a speaker’s gestural and acoustic beats. When different rhythms (polyrhythms) occur across the visual and auditory modalities of speech simultaneously, attention may be heightened, enhancing memorability of the sequence. In this three-stage study, Stage 1 analyzed videorecordings of native English-speaking instructors, focusing on frame-by-frame analysis of time-aligned annotations from Praat and Anvil (video annotation tool) of polyrhythmic sequences. Stage 2 explored the perceivers’ perspective on the sequences’ discourse role. Stage 3 analyzed 10 international teaching assistants’ gestures, and implemented a multistep technology-assisted program to enhance verbal and nonverbal communication skills. Findings demonstrated (a) a dynamic temporal gesture-speech relationship involving perturbations of beat intervals surrounding pitch-accented vowels, (b) the sequences’ important role as highlighters of information, and (c) improvement of ITA confidence, teaching effectiveness, and ability to communicate important points. Findings support the joint production of gesture and prosodically prominent features.
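
The beat-interval analysis described here boils down to comparing two time-aligned annotation streams. The sketch below computes gesture-speech asynchrony from two lists of time stamps; the values are invented for illustration, whereas in the study the times come from Praat (pitch accents) and Anvil (gesture beats).

```python
# Sketch: measure gesture-speech beat alignment from two lists of time stamps
# (seconds). All numbers below are illustrative.
accent_times = [1.42, 2.10, 2.87, 3.65]   # pitch-accented vowel onsets
gesture_beats = [1.38, 2.15, 2.80, 3.70]  # gesture stroke apexes

def nearest(t, candidates):
    return min(candidates, key=lambda c: abs(c - t))

asynchronies = [nearest(a, gesture_beats) - a for a in accent_times]
intervals = [b - a for a, b in zip(accent_times, accent_times[1:])]

print("mean asynchrony (s):", sum(asynchronies) / len(asynchronies))
print("inter-accent intervals (s):", [round(i, 2) for i in intervals])
```
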
7

Diete, Alexander, Timo Sztyler, and Heiner Stuckenschmidt. "Exploring Semi-Supervised Methods for Labeling Support in Multimodal Datasets." Sensors 18, no. 8 (August 11, 2018): 2639. http://dx.doi.org/10.3390/s18082639.

Abstract:
Working with multimodal datasets is a challenging task as it requires annotations, which are often time-consuming and difficult to acquire. This includes in particular video recordings, which often need to be watched as a whole before they can be labeled. Additionally, other modalities like acceleration data are often recorded alongside a video. For that purpose, we created an annotation tool that enables the annotation of datasets of video and inertial sensor data. In contrast to most existing approaches, we focus on semi-supervised labeling support to infer labels for the whole dataset. This means that, after labeling a small set of instances, our system is able to provide labeling recommendations. We aim to rely on the acceleration data of a wrist-worn sensor to support the labeling of a video recording. For that purpose, we apply template matching to identify time intervals of certain activities. We test our approach on three datasets, one containing warehouse picking activities, one consisting of activities of daily living and one about meal preparation. Our results show that the presented method is able to give hints to annotators about possible label candidates.
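
To make the template-matching idea concrete, here is a minimal sketch that slides a labelled acceleration template over an unlabelled signal and flags low-distance windows as label candidates; the signal, sampling and threshold are illustrative assumptions, not the paper's actual parameters.

```python
# Sketch: propose label candidates by sliding-window template matching over
# an acceleration signal. All data and thresholds are illustrative.
import numpy as np

def match_candidates(signal, template, threshold):
    """Return start indices where the window-to-template distance is below threshold."""
    w = len(template)
    template = (template - template.mean()) / (template.std() + 1e-8)
    hits = []
    for start in range(len(signal) - w + 1):
        window = signal[start:start + w]
        window = (window - window.mean()) / (window.std() + 1e-8)
        dist = np.linalg.norm(window - template) / np.sqrt(w)
        if dist < threshold:
            hits.append(start)
    return hits

rng = np.random.default_rng(0)
signal = rng.normal(size=3000)      # stand-in for wrist accelerometer magnitude
template = signal[500:550].copy()   # a labelled "picking" segment, for illustration
print("candidate starts:", match_candidates(signal, template, threshold=0.5)[:10])
```
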
8

Zhu, Songhao, Xiangxiang Li, and Shuhan Shen. "Multimodal deep network learning‐based image annotation." Electronics Letters 51, no. 12 (June 2015): 905–6. http://dx.doi.org/10.1049/el.2015.0258.

9

Brunner, Marie-Louise, and Stefan Diemer. "Multimodal meaning making: The annotation of nonverbal elements in multimodal corpus transcription." Research in Corpus Linguistics 9, no. 1 (2021): 63–88. http://dx.doi.org/10.32714/ricl.09.01.05.

Abstract:
The article discusses how to integrate annotation for nonverbal elements (NVE) from multimodal raw data as part of a standardized corpus transcription. We argue that it is essential to include multimodal elements when investigating conversational data, and that in order to integrate these elements, a structured approach to complex multimodal data is needed. We discuss how to formulate a structured corpus-suitable standard syntax and taxonomy for nonverbal features such as gesture, facial expressions, and physical stance, and how to integrate it in a corpus. Using corpus examples, the article describes the development of a robust annotation system for spoken language in the corpus of Video-mediated English as a Lingua Franca Conversations (ViMELF 2018) and illustrates how the system can be used for the study of spoken discourse. The system takes into account previous research on multimodality, transcribes salient nonverbal features in a concise manner, and uses a standard syntax. While such an approach introduces a degree of subjectivity through the criteria of salience and conciseness, the system also offers considerable advantages: it is versatile and adaptable, flexible enough to work with a wide range of multimodal data, and it allows both quantitative and qualitative research on the pragmatics of interaction.
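
A standardized inline annotation syntax of the kind discussed here lends itself to simple automatic extraction. The sketch below pulls nonverbal-element (NVE) tags out of transcript lines and counts them per speaker; the angle-bracket tag syntax and the example lines are hypothetical, since the actual ViMELF conventions are defined in the corpus documentation.

```python
# Sketch: extract and count hypothetical NVE tags from transcript lines.
import re
from collections import Counter

transcript = [
    "S1: yeah that was <laughs> really strange",
    "S2: <nods> i know <gazes away> it took forever",
    "S1: <raises eyebrows> seriously",
]

nve_pattern = re.compile(r"<([^<>]+)>")
counts = Counter()
for line in transcript:
    speaker, _, text = line.partition(":")
    for tag in nve_pattern.findall(text):
        counts[(speaker.strip(), tag.strip())] += 1

for (speaker, tag), n in counts.items():
    print(f"{speaker}\t{tag}\t{n}")
```
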
10

Da Fonte, Renata Fonseca Lima, and Késia Vanessa Nascimento da Silva. "MULTIMODALIDADE NA LINGUAGEM DE CRIANÇAS AUTISTAS: O "NÃO" EM SUAS DIVERSAS MANIFESTAÇÕES." PROLÍNGUA 14, no. 2 (May 6, 2020): 250–62. http://dx.doi.org/10.22478/ufpb.1983-9979.2019v14n2.48829.

Abstract:
This study analyzes the multimodal aspects of the language of autistic children in interactive contexts of negation, from a multimodal perspective on language in which gesture and vocal production are two facets of the same matrix of meaning. Methodologically, it is a qualitative and quantitative study in which data were extracted from the observation and analysis of the interactions of three autistic children aged five to six years, participants in the Grupo de Estudos e Atendimento ao Espectro Autista (GEAUT/UNICAP). Transcription was carried out with the ELAN (EUDICO Linguistic Annotator) software, which allows audio and video to be transcribed simultaneously. The data showed a semantic and temporal synchrony of different multimodal aspects of language – "gesture", "vocalization/prosody" and "gaze" – in the autistic children's negative utterances. Among these, motor stereotypies, gaze aversion and the act of turning one's back emerged as multimodal aspects peculiar to the "no" of autistic children.

Dissertations / Theses on the topic "Multimodale Annotation"

1

Völkel, Thorsten. "Multimodale Annotation geographischer Daten zur personalisierten Fußgängernavigation." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2009. http://nbn-resolving.de/urn:nbn:de:bsz:14-ds-1239804877252-19609.

Abstract:
Mobility-impaired pedestrians such as wheelchair users, blind and visually impaired people, or senior citizens impose specific requirements on the calculation of appropriate routes; the shortest path is not always the most suitable one. This thesis develops the method of multimodal annotation, which allows users themselves to extend the underlying geographical data. Based on the data acquired with this method, concepts for personalized route calculation are developed that take the individual requirements of the users into account. The method was successfully evaluated with a total of 35 users and thus forms the basis for further work in this area.
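
To illustrate how user annotations can feed into personalized routing, here is a minimal sketch of profile-dependent edge costs on an annotated graph; the graph, annotation attributes and penalty weights are invented for illustration, as the thesis defines its own annotation and routing model.

```python
# Sketch: personalised route cost over user-annotated geographic data.
# Graph, tags and penalties are illustrative.
import heapq

# edge: (target, length_m, annotations)
graph = {
    "A": [("B", 120, {"stairs"}), ("C", 200, set())],
    "B": [("D", 80, set())],
    "C": [("D", 90, {"cobblestones"})],
    "D": [],
}

def edge_cost(length, annotations, profile):
    cost = length
    for tag in annotations:
        cost += profile.get(tag, 0)  # profile maps obstacle tags to penalties
    return cost

def best_route(start, goal, profile):
    queue, seen = [(0, start, [start])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for target, length, ann in graph[node]:
            heapq.heappush(queue, (cost + edge_cost(length, ann, profile), target, path + [target]))
    return None

wheelchair = {"stairs": 10_000, "cobblestones": 300}  # effectively forbids stairs
print(best_route("A", "D", wheelchair))   # prefers A-C-D despite the longer distance
```
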
2

Völkel, Thorsten. "Multimodale Annotation geographischer Daten zur personalisierten Fußgängernavigation." Doctoral thesis, Technische Universität Dresden, 2008. https://tud.qucosa.de/id/qucosa%3A23563.

Abstract:
Mobility-impaired pedestrians such as wheelchair users, blind and visually impaired people, or senior citizens impose specific requirements on the calculation of appropriate routes; the shortest path is not always the most suitable one. This thesis develops the method of multimodal annotation, which allows users themselves to extend the underlying geographical data. Based on the data acquired with this method, concepts for personalized route calculation are developed that take the individual requirements of the users into account. The method was successfully evaluated with a total of 35 users and thus forms the basis for further work in this area.
3

Znaidia, Amel. "Handling Imperfections for Multimodal Image Annotation." PhD thesis, Ecole Centrale Paris, 2014. http://tel.archives-ouvertes.fr/tel-01012009.

Abstract:
This thesis deals with multimodal image annotation in the context of social media. We seek to take advantage of textual (tags) and visual information in order to enhance the image annotation performances. However, these tags are often noisy, overly personalized and only a few of them are related to the semantic visual content of the image. In addition, when combining prediction scores from different classifiers learned on different modalities, multimodal image annotation faces their imperfections (uncertainty, imprecision and incompleteness). Consequently, we consider that multimodal image annotation is subject to imperfections at two levels: the representation and the decision. Inspired from the information fusion theory, we focus in this thesis on defining, identifying and handling imperfection aspects in order to improve image annotation.
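
The combination of prediction scores mentioned here is a late-fusion step. Below is a minimal sketch of weighted late fusion that skips modalities missing for a given image; the weights and scores are illustrative, and the thesis develops a more elaborate treatment of imperfections based on information-fusion theory.

```python
# Sketch: weighted late fusion of per-modality annotation scores, tolerating
# missing modalities. Weights and scores are illustrative.
def fuse(scores_by_modality, reliability):
    """scores_by_modality: {modality: {label: score in [0, 1]}} (modalities may be absent)."""
    fused, total_weight = {}, 0.0
    for modality, scores in scores_by_modality.items():
        w = reliability.get(modality, 0.0)
        total_weight += w
        for label, s in scores.items():
            fused[label] = fused.get(label, 0.0) + w * s
    return {label: s / total_weight for label, s in fused.items()} if total_weight else {}

image_scores = {
    "visual": {"beach": 0.7, "dog": 0.2},
    "tags":   {"beach": 0.4, "sunset": 0.9},   # noisy, overly personalised tags
}
print(fuse(image_scores, reliability={"visual": 0.6, "tags": 0.4}))
```
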
4

Tayari, Meftah Imen. "Modélisation, détection et annotation des états émotionnels à l'aide d'un espace vectoriel multidimensionnel." Phd thesis, Université Nice Sophia Antipolis, 2013. http://tel.archives-ouvertes.fr/tel-00838803.

Abstract:
Our work falls within the field of affective computing, more precisely the modeling, detection and annotation of emotions. The objective is to study, identify and model emotions in order to support their exchange between multimodal applications. Our contribution is therefore organized around three points. First, we present a new view of the modeling of emotional states based on a generic model for representing and exchanging emotions between multimodal applications. It is a hierarchical representation model composed of three distinct layers: the psychological layer, the formal computation layer and the language layer. This model allows the representation of an infinite number of emotions and the modeling of basic emotions such as anger, sadness and fear as well as complex emotions such as simulated and masked emotions. The second point of our contribution is a monomodal emotion recognition approach based on the analysis of physiological signals. The emotion recognition algorithm relies on signal processing techniques, on a nearest-neighbor classification and on our multidimensional model of emotion representation. Our third contribution is a multimodal approach to emotion recognition. This way of processing the data yields information of better quality and higher reliability than that obtained from a single modality. The experimental results show a significant improvement of the recognition rates for the eight emotions compared with the results obtained with the monomodal approach. Finally, we integrated our work into an application for detecting depression in elderly people living in a smart home, using the physiological signals collected from various sensors installed in the home to estimate the affective state of the person concerned.
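
As a toy illustration of nearest-neighbor classification in a multidimensional emotion space, the sketch below assigns a label to a feature vector by distance to labeled reference points; the axes, reference points and query vector are invented, since the thesis defines its own vector space and signal features.

```python
# Sketch: nearest-neighbour labelling in a multidimensional emotion space.
# All coordinates below are illustrative.
import math

reference_emotions = {
    "anger":   (0.9, 0.8, 0.2),
    "sadness": (0.2, 0.3, 0.1),
    "joy":     (0.8, 0.7, 0.9),
    "fear":    (0.7, 0.9, 0.1),
}

def classify(vector, k=1):
    ranked = sorted(reference_emotions.items(),
                    key=lambda item: math.dist(vector, item[1]))
    return [label for label, _ in ranked[:k]]

sample = (0.75, 0.85, 0.15)   # e.g. derived from physiological signal features
print(classify(sample, k=2))  # -> ['fear', 'anger'] for this made-up point
```
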
5

Nguyen, Nhu Van. "Représentations visuelles de concepts textuels pour la recherche et l'annotation interactives d'images." PhD thesis, Université de La Rochelle, 2011. http://tel.archives-ouvertes.fr/tel-00730707.

Abstract:
In image retrieval today, we often handle large volumes of images, which may vary or even arrive continuously. In an image collection we thus end up with some older images and some new ones, the former already indexed and possibly annotated and the latter awaiting indexing or annotation. Since the collection is not annotated uniformly, access through textual queries is difficult. In this work we present different techniques for interacting with, browsing and searching this type of image collection. First, a short-term interaction model is used to improve the precision of the system. Second, based on a long-term interaction model, we propose to associate textual words with visual features for image retrieval by text, by visual content, or by a mixture of text and visual content. This retrieval model makes it possible to iteratively refine the annotation and the knowledge about the images. We identify four contributions in this work. The first contribution is a multimodal image retrieval system that integrates different data sources, such as image content and text; this system supports querying by image, querying by keyword, and hybrid queries. The second contribution is a new relevance feedback technique combining two classical techniques widely used in information retrieval: query point movement and query expansion. By taking advantage of non-relevant images and of the strengths of these two classical techniques, our method gives very good results for effective interactive image retrieval. The third contribution is a model called "bags of KVR" (Keyword Visual Representation), which creates links between semantic concepts and visual representations, building on the bag-of-words model. Thanks to an incremental learning strategy, this model provides the association between semantic concepts and visual features, which helps to improve the precision of image annotation and the retrieval performance. The fourth contribution is a mechanism for building up knowledge incrementally from scratch: we do not separate the annotation and retrieval phases, so the user can issue queries as soon as the system is started, while the system keeps learning as it is used. These contributions are complemented by an interface for visualization and mixed textual/visual querying. Even though only two types of information are used for now, namely text and visual content, the genericity of the proposed model allows its extension to other types of information external to the image, such as location (GPS) and time.
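
The relevance feedback combination described here (query point movement plus query expansion) can be sketched in a few lines; the vectors, weights and tags below are illustrative assumptions, not the thesis's model.

```python
# Sketch: one relevance-feedback round combining Rocchio-style query point
# movement with tag-based query expansion. All data are illustrative.
import numpy as np
from collections import Counter

def move_query(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    q = alpha * query
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(non_relevant):
        q = q - gamma * np.mean(non_relevant, axis=0)
    return q

def expand_terms(query_tags, relevant_tag_lists, top_n=2):
    counts = Counter(t for tags in relevant_tag_lists for t in tags if t not in query_tags)
    return query_tags + [t for t, _ in counts.most_common(top_n)]

query_vec = np.array([0.2, 0.1, 0.7])
relevant = np.array([[0.3, 0.2, 0.9], [0.25, 0.15, 0.8]])
non_relevant = np.array([[0.9, 0.8, 0.1]])
print(move_query(query_vec, relevant, non_relevant))
print(expand_terms(["beach"], [["beach", "sea", "sunset"], ["sea", "sand"]]))
```
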
6

Budnik, Mateusz. "Active and deep learning for multimedia." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM011.

Abstract:
The main topics of this thesis are the use of active learning methods and deep learning in the context of multimodal document retrieval, and the contributions proposed during this thesis address both. An active learning framework was introduced which allows for a more efficient annotation of broadcast TV videos thanks to the propagation of labels, the use of multimodal data and appropriate selection strategies. Several scenarios and experiments were considered in the context of person identification in videos, including the use of different modalities (such as faces, speech segments and overlaid text) and different selection strategies. The whole system was additionally validated in a dry run involving real human annotators. A second major contribution was the investigation and use of deep learning (in particular convolutional neural networks) for video retrieval. A comprehensive study was made using different neural network architectures and training techniques such as fine-tuning or separate classifiers like SVMs. A comparison was made between learned features (the output of neural networks) and engineered features. Despite the lower performance of the engineered features, fusion of these two types of features increases overall performance. Finally, the use of a convolutional neural network for speaker identification using spectrograms is explored. The results are compared to those of other state-of-the-art speaker identification systems. Different fusion approaches are also tested. The proposed approach obtains results comparable to some of the other tested approaches and offers an increase in performance when fused with the output of the best system.
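
One simple selection strategy of the kind such active learning frameworks compare is uncertainty sampling; the sketch below picks the least confident unlabelled examples for the next annotation round, with a placeholder classifier and random data rather than the thesis's person-identification pipeline.

```python
# Sketch: uncertainty-based selection of the next items to annotate.
# Data and classifier are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labelled = rng.normal(size=(40, 16))
y_labelled = rng.integers(0, 3, size=40)   # e.g. person identities
X_pool = rng.normal(size=(500, 16))        # unlabelled faces / speech segments

clf = LogisticRegression(max_iter=1000).fit(X_labelled, y_labelled)
confidence = clf.predict_proba(X_pool).max(axis=1)
to_annotate = np.argsort(confidence)[:10]  # least confident first
print("next items to send to annotators:", to_annotate)
```
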
7

Nag, Chowdhury Sreyasi [Verfasser]. "Text-image synergy for multimodal retrieval and annotation / Sreyasi Nag Chowdhury." Saarbrücken : Saarländische Universitäts- und Landesbibliothek, 2021. http://d-nb.info/1240674139/34.

8

Abrilian, Sarkis. "Représentation de comportements emotionnels multimodaux spontanés : perception, annotation et synthèse." PhD thesis, Université Paris Sud - Paris XI, 2007. http://tel.archives-ouvertes.fr/tel-00620827.

Abstract:
The objective of this thesis is to represent spontaneous emotions and the associated multimodal signs in order to contribute to the design of future interactive affective systems. Current prototypes are generally limited to the detection and generation of a few simple emotions and rely on audio or video data acted out by actors and collected in the laboratory. In order to model the complex relations between spontaneous emotions and their expression in different modalities, an exploratory approach is necessary. The exploratory approach chosen in this thesis for the study of these spontaneous emotions consists of collecting and annotating a video corpus of television interviews. This type of corpus contains emotions that are more complex than the six basic emotions (anger, fear, joy, sadness, surprise, disgust): in spontaneous emotional behavior one observes superpositions, maskings and conflicts between positive and negative emotions. We report several experiments that led to the definition of several levels of emotion representation and of multimodal behavioral parameters providing relevant information for the perception of these complex spontaneous emotions. Looking ahead, the tools developed during this thesis (annotation schemes, measurement programs, annotation protocols) can later be used to design models usable by interactive affective systems capable of detecting and synthesizing multimodal expressions of spontaneous emotions.
9

Oram, Louise Carolyn. "Scrolling in radiology image stacks : multimodal annotations and diversifying control mobility." Thesis, University of British Columbia, 2013. http://hdl.handle.net/2429/45508.

Abstract:
Advances in image acquisition technology mean that radiologists today must examine thousands of images to make a diagnosis. However, the physical interactions performed to view these images are repetitive and not specialized to the task. Additionally, automatic and/or radiologist-generated annotations may impact how radiologists scroll through image stacks as they review areas of interest. We analyzed manual aspects of this work by observing and/or interviewing 19 radiologists; stack scrolling dominated the resulting task examples. We used a simplified stack seeded with correct or incorrect annotations in our experiment on lay users. The experiment investigated the impact of four scrolling techniques: traditional scrollwheel, click+drag, sliding-touch and tilting to access rate control. We also examined the effect of visual vs. haptic annotation cues on scrolling dynamics, detection accuracy and subjective factors. Scrollwheel was the fastest scrolling technique overall for our lay participants. Combined visual and haptic annotation highlights increased the speed of target-finding in comparison to either modality alone. Multimodal annotations may be useful in radiology image interpretation; users are heavily visually loaded, and there is background noise in the hospital environment. From interviews with radiologists, we see that they are receptive to a mouse that they can use to map different movements to interactions with images as an alternative to the standard mouse usually provided with their workstation.
10

Silva, Miguel Marinhas da. "Automated image tagging through tag propagation." Master's thesis, Faculdade de Ciências e Tecnologia, 2011. http://hdl.handle.net/10362/5963.

Abstract:
Work presented within the scope of the Master's programme in Computer Engineering, as a partial requirement for obtaining the degree of Master in Computer Engineering.
Today, more and more data is becoming available on the Web. In particular, we have recently witnessed an exponential increase of multimedia content within various content sharing websites. While this content is widely available, great challenges have arisen to effectively search and browse such vast amount of content. A solution to this problem is to annotate information, a task that without computer aid requires a large-scale human effort. The goal of this thesis is to automate the task of annotating multimedia information with machine learning algorithms. We propose the development of a machine learning framework capable of doing automated image annotation in large-scale consumer photos. To this extent a study on state of art algorithms was conducted, which concluded with a baseline implementation of a k-nearest neighbor algorithm. This baseline was used to implement a more advanced algorithm capable of annotating images in the situations with limited training images and a large set of test images – thus, a semi-supervised approach. Further studies were conducted on the feature spaces used to describe images towards a successful integration in the developed framework. We first analyzed the semantic gap between the visual feature spaces and concepts present in an image, and how to avoid or mitigate this gap. Moreover, we examined how users perceive images by performing a statistical analysis of the image tags inserted by users. A linguistic and statistical expansion of image tags was also implemented. The developed framework withstands uneven data distributions that occur in consumer datasets, and scales accordingly, requiring few previously annotated data. The principal mechanism that allows easier scaling is the propagation of information between the annotated data and un-annotated data.
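
In the spirit of the k-nearest-neighbour baseline this abstract mentions, here is a minimal sketch that propagates tags from the most similar annotated photos to an un-annotated one; the features, tags and parameters are illustrative assumptions.

```python
# Sketch: k-nearest-neighbour tag propagation. All data are illustrative.
import numpy as np
from collections import Counter

def propagate_tags(query_vec, annotated_vecs, annotated_tags, k=3, top_n=3):
    dists = np.linalg.norm(annotated_vecs - query_vec, axis=1)
    neighbours = np.argsort(dists)[:k]
    votes = Counter()
    for idx in neighbours:
        weight = 1.0 / (1.0 + dists[idx])      # closer neighbours count more
        for tag in annotated_tags[idx]:
            votes[tag] += weight
    return [tag for tag, _ in votes.most_common(top_n)]

rng = np.random.default_rng(1)
annotated_vecs = rng.normal(size=(100, 32))    # visual features of labelled photos
annotated_tags = [["beach", "sea"] if i % 2 else ["city", "night"] for i in range(100)]
print(propagate_tags(rng.normal(size=32), annotated_vecs, annotated_tags))
```
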

Book chapters on the topic "Multimodale Annotation"

1

Cassidy, Steve, and Thomas Schmidt. "Tools for Multimodal Annotation." In Handbook of Linguistic Annotation, 209–27. Dordrecht: Springer Netherlands, 2017. http://dx.doi.org/10.1007/978-94-024-0881-2_7.

2

Steininger, Silke, Florian Schiel, and Susen Rabold. "Annotation of Multimodal Data." In SmartKom: Foundations of Multimodal Dialogue Systems, 571–96. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/3-540-36678-4_35.

3

Schmidt, Thomas, Susan Duncan, Oliver Ehmer, Jeffrey Hoyt, Michael Kipp, Dan Loehr, Magnus Magnusson, Travis Rose, and Han Sloetjes. "An Exchange Format for Multimodal Annotations." In Multimodal Corpora, 207–21. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04793-0_13.

4

Colletta, Jean-Marc, Ramona N. Kunene, Aurélie Venouil, Virginie Kaufmann, and Jean-Pascal Simon. "Multi-track Annotation of Child Language and Gestures." In Multimodal Corpora, 54–72. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04793-0_4.

5

Blache, Philippe, Roxane Bertrand, Gaëlle Ferré, Berthille Pallaud, Laurent Prévot, and Stéphane Rauzy. "The Corpus of Interactional Data: A Large Multimodal Annotated Resource." In Handbook of Linguistic Annotation, 1323–56. Dordrecht: Springer Netherlands, 2017. http://dx.doi.org/10.1007/978-94-024-0881-2_51.

6

Grassi, Marco, Christian Morbidoni, and Francesco Piazza. "Towards Semantic Multimodal Video Annotation." In Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. Theoretical and Practical Issues, 305–16. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-18184-9_25.

7

Cavicchio, Federica, and Massimo Poesio. "Multimodal Corpora Annotation: Validation Methods to Assess Coding Scheme Reliability." In Multimodal Corpora, 109–21. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04793-0_7.

8

Gibbon, Dafydd, Inge Mertins, and Roger K. Moore. "Representation and annotation of dialogue." In Handbook of Multimodal and Spoken Dialogue Systems, 1–101. Boston, MA: Springer US, 2000. http://dx.doi.org/10.1007/978-1-4615-4501-9_1.

9

Johnston, Michael. "Extensible Multimodal Annotation for Intelligent Interactive Systems." In Multimodal Interaction with W3C Standards, 37–64. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-42816-1_3.

10

Bunt, Harry, Volha Petukhova, David Traum, and Jan Alexandersson. "Dialogue Act Annotation with the ISO 24617-2 Standard." In Multimodal Interaction with W3C Standards, 109–35. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-42816-1_6.


Conference papers on the topic "Multimodale Annotation"

1

Xing, Yuying, Guoxian Yu, Jun Wang, Carlotta Domeniconi, and Xiangliang Zhang. "Weakly-Supervised Multi-view Multi-instance Multi-label Learning." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence (IJCAI-PRICAI-20). California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/432.

Abstract:
Multi-view, Multi-instance, and Multi-label Learning (M3L) can model complex objects (bags), which are represented with different feature views, made of diverse instances, and annotated with discrete non-exclusive labels. Existing M3L approaches assume a complete correspondence between bags and views, and also assume a complete annotation for training. However, in practice, neither the correspondence between bags, nor the bags' annotations are complete. To tackle such a weakly-supervised M3L task, a solution called WSM3L is introduced. WSM3L adapts multimodal dictionary learning to learn a shared dictionary (representational space) across views and individual encoding vectors of bags for each view. The label similarity and feature similarity of encoded bags are jointly used to match bags across views. In addition, it replenishes the annotations of a bag based on the annotations of its neighborhood bags, and introduces a dispatch and aggregation term to dispatch bag-level annotations to instances and to reversely aggregate instance-level annotations to bags. WSM3L unifies these objectives and processes in a joint objective function to predict the instance-level and bag-level annotations in a coordinated fashion, and it further introduces an alternative solution for the objective function optimization. Extensive experimental results show the effectiveness of WSM3L on benchmark datasets.
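
As a toy illustration of the label-replenishment idea mentioned in this abstract, the sketch below fills in a bag's incomplete label set from its nearest neighbouring bags; the encodings, labels and voting threshold are illustrative, not the paper's actual formulation.

```python
# Sketch: replenish a bag's labels from its k nearest neighbour bags.
# All data and thresholds are illustrative.
import numpy as np

def replenish(bag_vec, bag_labels, all_vecs, all_labels, k=3, vote=0.5):
    dists = np.linalg.norm(all_vecs - bag_vec, axis=1)
    neighbours = np.argsort(dists)[:k]
    counts = {}
    for idx in neighbours:
        for label in all_labels[idx]:
            counts[label] = counts.get(label, 0) + 1
    added = {label for label, c in counts.items() if c / k >= vote}
    return set(bag_labels) | added

rng = np.random.default_rng(2)
vecs = rng.normal(size=(50, 8))    # encoded bags (e.g. in a shared dictionary space)
labels = [{"cat"} if i % 2 else {"dog", "grass"} for i in range(50)]
print(replenish(vecs[0], {"cat"}, vecs[1:], labels[1:]))
```
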
2

Thomas, Martin. "Querying multimodal annotation." In the Linguistic Annotation Workshop. Morristown, NJ, USA: Association for Computational Linguistics, 2007. http://dx.doi.org/10.3115/1642059.1642069.

3

Podlasov, A., K. O'Halloran, S. Tan, B. Smith, and A. Nagarajan. "Developing novel multimodal and linguistic annotation software." In the Third Linguistic Annotation Workshop. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1698381.1698404.

4

Blache, Philippe. "A general scheme for broad-coverage multimodal annotation." In the Third Linguistic Annotation Workshop. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1698381.1698414.

5

Barz, Michael, Mohammad Mehdi Moniri, Markus Weber, and Daniel Sonntag. "Multimodal multisensor activity annotation tool." In UbiComp '16: The 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2968219.2971459.

6

Wieschebrink, Stephan. "Collaborative editing of multimodal annotation data." In the 11th ACM symposium. New York, New York, USA: ACM Press, 2011. http://dx.doi.org/10.1145/2034691.2034706.

7

Froumentin, Max. "Extensible multimodal annotation markup language (EMMA)." In Proceeedings of the Workshop on NLP and XML (NLPXML-2004): RDF/RDFS and OWL in Language Technology. Morristown, NJ, USA: Association for Computational Linguistics, 2004. http://dx.doi.org/10.3115/1621066.1621071.

8

Zang, Xiaoxue, Ying Xu, and Jindong Chen. "Multimodal Icon Annotation For Mobile Applications." In MobileHCI '21: 23rd International Conference on Mobile Human-Computer Interaction. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3447526.3472064.

9

Cabral, Diogo, Urândia Carvalho, João Silva, João Valente, Carla Fernandes, and Nuno Correia. "Multimodal video annotation for contemporary dance creation." In the 2011 annual conference extended abstracts. New York, New York, USA: ACM Press, 2011. http://dx.doi.org/10.1145/1979742.1979930.

10

Seta, L., G. Chiazzese, G. Merlo, S. Ottaviano, G. Ciulla, M. Allegra, V. Samperi, and G. Todaro. "Multimodal Annotation to Support Web Learning Activities." In 2008 19th International Conference on Database and Expert Systems Applications (DEXA). IEEE, 2008. http://dx.doi.org/10.1109/dexa.2008.68.
