Journal articles on the topic 'Multimodale Annotation'

Consult the top 50 journal articles for your research on the topic 'Multimodale Annotation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Kleida, Danae. "Entering a dance performance through multimodal annotation: annotating with scores." International Journal of Performance Arts and Digital Media 17, no. 1 (January 2, 2021): 19–30. http://dx.doi.org/10.1080/14794713.2021.1880182.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Debras, Camille. "How to prepare the video component of the Diachronic Corpus of Political Speeches for multimodal analysis." Research in Corpus Linguistics 9, no. 1 (2021): 132–51. http://dx.doi.org/10.32714/ricl.09.01.08.

Full text
Abstract:
The Diachronic Corpus of Political Speeches (DCPS) is a collection of 1,500 full-length political speeches in English. It includes speeches delivered in countries where English is an official language (the US, Britain, Canada, Ireland) by English-speaking politicians in various settings from 1800 up to the present time. Enriched with semi-automatic morphosyntactic annotations and with discourse-pragmatic manual annotations, the DCPS is designed to achieve maximum representativeness and balance for political English speeches from major national English varieties in time, preserve detailed metadata, and enable corpus-based studies of syntactic, semantic and discourse-pragmatic variation and change on political corpora. For speeches given from 1950 onwards, video-recordings of the original delivery are often retrievable online. This opens up avenues of research in multimodal linguistics, in which studies on the integration of speech and gesture in the construction of meaning can include analyses of recurrent gestures and of multimodal constructions. This article discusses the issues at stake in preparing the video-recorded component of the DCPS for linguistic multimodal analysis, namely the exploitability of recordings, the segmentation and alignment of transcriptions, the annotation of gesture forms and functions in the software ELAN and the quantity of available gesture data.
APA, Harvard, Vancouver, ISO, and other styles
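
The DCPS workflow described in entry 2 annotates gesture forms and functions in ELAN. As a hedged illustration only (not part of the DCPS pipeline), the short Python sketch below reads time-aligned annotations from one tier of an ELAN .eaf file, assuming the standard EAF XML layout; the file name 'speech.eaf' and the tier name 'Gesture_Form' are invented placeholders.

# Sketch: extract time-aligned annotations from one tier of an ELAN .eaf file.
# Assumes the standard EAF XML layout (TIME_ORDER / TIER / ALIGNABLE_ANNOTATION);
# "speech.eaf" and "Gesture_Form" are placeholder names, not from the article.
import xml.etree.ElementTree as ET

def read_tier(eaf_path: str, tier_id: str):
    root = ET.parse(eaf_path).getroot()
    # Map each time-slot id to its millisecond value.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE", 0))
        for ts in root.find("TIME_ORDER")
    }
    annotations = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            value = ann.findtext("ANNOTATION_VALUE", default="")
            annotations.append((start, end, value))
    return annotations

if __name__ == "__main__":
    for start_ms, end_ms, label in read_tier("speech.eaf", "Gesture_Form"):
        print(f"{start_ms:>8} {end_ms:>8}  {label}")
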
3

Chou, Chien-Li, Hua-Tsung Chen, and Suh-Yin Lee. "Multimodal Video-to-Near-Scene Annotation." IEEE Transactions on Multimedia 19, no. 2 (February 2017): 354–66. http://dx.doi.org/10.1109/tmm.2016.2614426.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Saneiro, Mar, Olga C. Santos, Sergio Salmeron-Majadas, and Jesus G. Boticario. "Towards Emotion Detection in Educational Scenarios from Facial Expressions and Body Movements through Multimodal Approaches." Scientific World Journal 2014 (2014): 1–14. http://dx.doi.org/10.1155/2014/484873.

Full text
Abstract:
We report current findings from an enriched multimodal emotion detection approach that considers video recordings of facial expressions and body movements in order to provide personalized affective support in an educational context. In particular, we describe an annotation methodology to tag facial expressions and body movements that correspond to changes in the affective states of learners while they deal with cognitive tasks during a learning process. The ultimate goal is to combine these annotations with additional affective information collected during experimental learning sessions from different sources, such as qualitative, self-reported, physiological, and behavioral information. Together, these data are used to train data mining algorithms that automatically identify changes in the learners' affective states when dealing with cognitive tasks, which helps to provide personalized emotional support.
APA, Harvard, Vancouver, ISO, and other styles
5

Ladilova, Anna. "Multimodal Metaphors of Interculturereality / Metaforas multimodais da interculturealidade." Revista de Estudos da Linguagem 28, no. 2 (May 5, 2020): 917–55. http://dx.doi.org/10.17851/2237-2083.28.2.917-955.

Full text
Abstract:
The present paper looks at the interactive construction of multimodal metaphors of interculturereality – a term coined by the author from interculturality and intercorporeality, assuming that intercultural interaction is always an embodied phenomenon, shared among its participants. For this, two videotaped sequences of a group conversation are analyzed drawing upon interaction analysis (Couper-Kuhlen; Selting, 2018). The data was transcribed following the GAT2 (Selting et al., 2011) guidelines, including gesture form annotation, which relied on the system described by Bressem (2013). Gesture function was interpreted drawing on the interactional context and on the system proposed by Kendon (2004) and Bressem and Müller (2013). The results question the validity of the classical conduit metaphor of communication (Reddy, 1979) in the intercultural context and instead propose an embodied approach to the conceptualization of the understanding process among the participants. The analysis also shows that even though the metaphors are multimodal, the metaphoric content is not always evenly distributed among the different modalities (speech, gesture). Apart from that, the metaphorical content is constructed sequentially, referring to preceding metaphors used by the same or different interlocutors and associated with metaphorical blends. Keywords: metaphors; multimodality; interculturality; intercorporeality; migration.
APA, Harvard, Vancouver, ISO, and other styles
6

Hardison, Debra M. "Visualizing the acoustic and gestural beats of emphasis in multimodal discourse." Journal of Second Language Pronunciation 4, no. 2 (December 31, 2018): 232–59. http://dx.doi.org/10.1075/jslp.17006.har.

Full text
Abstract:
Perceivers’ attention is entrained to the rhythm of a speaker’s gestural and acoustic beats. When different rhythms (polyrhythms) occur across the visual and auditory modalities of speech simultaneously, attention may be heightened, enhancing memorability of the sequence. In this three-stage study, Stage 1 analyzed video recordings of native English-speaking instructors, focusing on frame-by-frame analysis of time-aligned annotations from Praat and Anvil (a video annotation tool) of polyrhythmic sequences. Stage 2 explored the perceivers’ perspective on the sequences’ discourse role. Stage 3 analyzed the gestures of 10 international teaching assistants (ITAs) and implemented a multistep technology-assisted program to enhance verbal and nonverbal communication skills. Findings demonstrated (a) a dynamic temporal gesture-speech relationship involving perturbations of beat intervals surrounding pitch-accented vowels, (b) the sequences’ important role as highlighters of information, and (c) improvement of ITA confidence, teaching effectiveness, and ability to communicate important points. Findings support the joint production of gesture and prosodically prominent features.
APA, Harvard, Vancouver, ISO, and other styles
7

Diete, Alexander, Timo Sztyler, and Heiner Stuckenschmidt. "Exploring Semi-Supervised Methods for Labeling Support in Multimodal Datasets." Sensors 18, no. 8 (August 11, 2018): 2639. http://dx.doi.org/10.3390/s18082639.

Full text
Abstract:
Working with multimodal datasets is a challenging task, as it requires annotations which are often time-consuming and difficult to acquire. This includes in particular video recordings, which often need to be watched as a whole before they can be labeled. Additionally, other modalities like acceleration data are often recorded alongside a video. For that purpose, we created an annotation tool that enables the annotation of datasets of video and inertial sensor data. In contrast to most existing approaches, we focus on semi-supervised labeling support to infer labels for the whole dataset. This means that after labeling a small set of instances, our system is able to provide labeling recommendations. We aim to rely on the acceleration data of a wrist-worn sensor to support the labeling of a video recording. For that purpose, we apply template matching to identify time intervals of certain activities. We test our approach on three datasets: one containing warehouse picking activities, one consisting of activities of daily living, and one about meal preparation. Our results show that the presented method is able to give hints to annotators about possible label candidates.
APA, Harvard, Vancouver, ISO, and other styles
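
Entry 7 supports video labeling by template matching over wrist-worn acceleration data. The sketch below is a minimal, hedged illustration of that general idea rather than the authors' implementation: it slides a labeled template over an acceleration signal and reports windows whose normalized cross-correlation exceeds a threshold. The sampling rate, threshold, and synthetic signals are assumptions.

# Sketch: propose label candidates by sliding a labeled acceleration template
# over a longer recording (normalized cross-correlation). Illustrative only;
# the sampling rate and threshold below are assumptions, not values from the paper.
import numpy as np

def match_template(signal: np.ndarray, template: np.ndarray,
                   fs: float = 50.0, threshold: float = 0.8):
    """Return (start_s, end_s, score) intervals where the template matches."""
    t = (template - template.mean()) / (template.std() + 1e-9)
    n = len(t)
    candidates = []
    for i in range(len(signal) - n + 1):
        w = signal[i:i + n]
        w = (w - w.mean()) / (w.std() + 1e-9)
        score = float(np.dot(w, t) / n)          # Pearson correlation of the window
        if score >= threshold:
            candidates.append((i / fs, (i + n) / fs, score))
    return candidates

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    template = np.sin(np.linspace(0, 4 * np.pi, 100))   # a hypothetical "pick" motion
    signal = rng.normal(0, 0.3, 2000)
    signal[500:600] += template                          # embed one instance
    for start, end, score in match_template(signal, template):
        print(f"candidate activity {start:.2f}-{end:.2f} s (r={score:.2f})")
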
8

Zhu, Songhao, Xiangxiang Li, and Shuhan Shen. "Multimodal deep network learning‐based image annotation." Electronics Letters 51, no. 12 (June 2015): 905–6. http://dx.doi.org/10.1049/el.2015.0258.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Brunner, Marie-Louise, and Stefan Diemer. "Multimodal meaning making: The annotation of nonverbal elements in multimodal corpus transcription." Research in Corpus Linguistics 9, no. 1 (2021): 63–88. http://dx.doi.org/10.32714/ricl.09.01.05.

Full text
Abstract:
The article discusses how to integrate annotation for nonverbal elements (NVE) from multimodal raw data as part of a standardized corpus transcription. We argue that it is essential to include multimodal elements when investigating conversational data, and that in order to integrate these elements, a structured approach to complex multimodal data is needed. We discuss how to formulate a structured corpus-suitable standard syntax and taxonomy for nonverbal features such as gesture, facial expressions, and physical stance, and how to integrate it in a corpus. Using corpus examples, the article describes the development of a robust annotation system for spoken language in the corpus of Video-mediated English as a Lingua Franca Conversations (ViMELF 2018) and illustrates how the system can be used for the study of spoken discourse. The system takes into account previous research on multimodality, transcribes salient nonverbal features in a concise manner, and uses a standard syntax. While such an approach introduces a degree of subjectivity through the criteria of salience and conciseness, the system also offers considerable advantages: it is versatile and adaptable, flexible enough to work with a wide range of multimodal data, and it allows both quantitative and qualitative research on the pragmatics of interaction.
APA, Harvard, Vancouver, ISO, and other styles
10

Da Fonte, Renata Fonseca Lima, and Késia Vanessa Nascimento da Silva. "MULTIMODALIDADE NA LINGUAGEM DE CRIANÇAS AUTISTAS: O "NÃO" EM SUAS DIVERSAS MANIFESTAÇÕES." PROLÍNGUA 14, no. 2 (May 6, 2020): 250–62. http://dx.doi.org/10.22478/ufpb.1983-9979.2019v14n2.48829.

Full text
Abstract:
This study analyzes the multimodal aspects of the language of autistic children in interactive contexts of negation, from a multimodal perspective on language in which gesture and vocal production are two facets of a single matrix of meaning. Methodologically, it is a qualitative and quantitative study in which data were extracted from the observation and analysis of the interactions of three autistic children aged five to six years, participants in the Study and Care Group for the Autism Spectrum (GEAUT/UNICAP). For transcription, the ELAN software (EUDICO Linguistic Annotator), which allows simultaneous audio and video transcription, was used. The data showed a semantic and temporal synchrony of different multimodal aspects of language ("gesture", "vocalization/prosody", and "gaze") in the autistic children's negative utterances. Among these, motor stereotypies, gaze aversion, and turning one's back stood out as multimodal aspects peculiar to the "no" of autistic children.
APA, Harvard, Vancouver, ISO, and other styles
11

Völkel, Thorsten. "Personalized and adaptive navigation based on multimodal annotation." ACM SIGACCESS Accessibility and Computing, no. 86 (September 2006): 4–7. http://dx.doi.org/10.1145/1196148.1196149.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Podlasov, Alexey, Sabine Tan, and Kay O'Halloran. "Interactive state-transition diagrams for visualization of multimodal annotation." Intelligent Data Analysis 16, no. 4 (July 11, 2012): 683–702. http://dx.doi.org/10.3233/ida-2012-0544.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Tian, Feng, Quge Wang, Xin Li, and Ning Sun. "Heterogeneous multimedia cooperative annotation based on multimodal correlation learning." Journal of Visual Communication and Image Representation 58 (January 2019): 544–53. http://dx.doi.org/10.1016/j.jvcir.2018.12.028.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Sinte, Aurélie. "Répéter, redire, reformuler : analyse plurisémiotique de conférences TEDx." SHS Web of Conferences 46 (2018): 01001. http://dx.doi.org/10.1051/shsconf/20184601001.

Full text
Abstract:
This proposal is part of a broad project analyzing multimodal reformulations (MRs) in discourse construction: describing the relationships between three multimodal semiotic channels (speech (S1), co-verbal gesture (S2), and presentation materials (S3)) in scientific talks. The aim is to describe how multimodal reformulations contribute to the effectiveness of the discourse and to the construction of its coherence. MRs are studied both within each semiotic system (S1, S2, S3) and across systems (S1/S2, S1/S3, S2/S3, and S1/S2/S3 relations). The ongoing analysis proceeds as follows: identifying the passages containing MRs and the channels involved, annotating the data, analyzing the MRs and cross-channel combinations quantitatively and qualitatively, and identifying usage paradigms (from presentations without MRs to those that make extensive use of combinations across all three levels). Contrary to what others have argued, my hypothesis is that these are not two (or even three) distinct, simultaneous discourses. I consider that the linearity (of S1 on the one hand and S3 on the other) and the simultaneity of the three information sources (S1, S2, and S3) intertwine in the construction of a single but plurisemiotic discourse.
APA, Harvard, Vancouver, ISO, and other styles
15

Nguyen, Nhu Van, Alain Boucher, and Jean-Marc Ogier. "Keyword Visual Representation for Image Retrieval and Image Annotation." International Journal of Pattern Recognition and Artificial Intelligence 29, no. 06 (August 12, 2015): 1555010. http://dx.doi.org/10.1142/s0218001415550101.

Full text
Abstract:
Keyword-based image retrieval is more comfortable for users than content-based image retrieval. Because of the lack of semantic descriptions of images, image annotation is often used beforehand, by learning the association between semantic concepts (keywords) and images (or image regions). This association problem is particularly difficult but interesting, because it can be used both for annotating images and for multimodal image retrieval. However, most association models are unidirectional, from image to keywords. In addition, existing models rely on a fixed image database and prior knowledge. In this paper, we propose an original association model which provides a bidirectional image-keyword transformation. Built on the state-of-the-art Bag of Words model for image representation and including a strategy of interactive incremental learning, our model works well with an image database that starts with zero or weak knowledge and evolves with it. Objective quantitative and qualitative evaluations of the model are proposed in order to highlight the relevance of the method.
APA, Harvard, Vancouver, ISO, and other styles
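
Entry 15 describes a bidirectional image-keyword association built on a Bag-of-Words image representation. As a hedged toy sketch of the underlying idea (not the authors' model), the code below accumulates a keyword-by-visual-word co-occurrence matrix from annotated images and queries it in both directions; the vocabulary sizes and example data are invented.

# Sketch: bidirectional keyword <-> visual-word association via a co-occurrence
# matrix over Bag-of-Words image histograms. Toy illustration only; the
# vocabulary sizes and the example data are assumptions, not from the paper.
import numpy as np

N_KEYWORDS, N_VISUAL_WORDS = 4, 8
cooc = np.zeros((N_KEYWORDS, N_VISUAL_WORDS))

def add_annotated_image(bow_histogram, keyword_ids):
    """Update the co-occurrence matrix with one annotated image."""
    for k in keyword_ids:
        cooc[k] += bow_histogram

def keywords_for_image(bow_histogram, top=2):
    """Image -> keywords: score each keyword against the image's histogram."""
    scores = cooc @ bow_histogram
    return np.argsort(scores)[::-1][:top]

def visual_profile_for_keyword(keyword_id):
    """Keyword -> image space: expected visual-word distribution."""
    row = cooc[keyword_id]
    return row / row.sum() if row.sum() else row

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Pretend keyword 0 (say, "beach") co-occurs mostly with visual words 0-3.
    for _ in range(20):
        h = np.zeros(N_VISUAL_WORDS)
        h[rng.integers(0, 4)] += 5
        add_annotated_image(h, keyword_ids=[0])
    query = np.zeros(N_VISUAL_WORDS)
    query[2] = 3
    print("predicted keywords:", keywords_for_image(query))
    print("visual profile of keyword 0:", visual_profile_for_keyword(0))
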
16

Martin, J. C., G. Caridakis, L. Devillers, K. Karpouzis, and S. Abrilian. "Manual annotation and automatic image processing of multimodal emotional behaviors: validating the annotation of TV interviews." Personal and Ubiquitous Computing 13, no. 1 (May 3, 2007): 69–76. http://dx.doi.org/10.1007/s00779-007-0167-y.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

O’Halloran, Kay, Sabine Tan, Bradley Smith, and Alexey Podlasov. "Challenges in designing digital interfaces for the study of multimodal phenomena." Information Design Journal 18, no. 1 (June 9, 2010): 2–21. http://dx.doi.org/10.1075/idj.18.1.02hal.

Full text
Abstract:
The paper discusses the challenges faced by researchers in developing effective digital interfaces for analyzing the meaning-making processes of multimodal phenomena. The authors propose a social semiotic approach as the underlying theoretical foundation, because interactive digital technology is the embodiment of multimodal social semiotic communication. The paper outlines the complex issues with which researchers are confronted in designing digital interface frameworks for modeling, analyzing, and retrieving meaning from multimodal data, giving due consideration to the multiplicity of theoretical frameworks and theories which have been developed for the study of multimodal text within social semiotics, and their impact on the development of a computer-based tool for the exploration, annotation, and analysis of multimodal data.
APA, Harvard, Vancouver, ISO, and other styles
18

Carletta, Jean, Stefan Evert, Ulrich Heid, Jonathan Kilgour, Judy Robertson, and Holger Voormann. "The NITE XML Toolkit: Flexible annotation for multimodal language data." Behavior Research Methods, Instruments, & Computers 35, no. 3 (August 2003): 353–63. http://dx.doi.org/10.3758/bf03195511.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Kim, Donghyun, Kuniaki Saito, Kate Saenko, Stan Sclaroff, and Bryan Plummer. "MULE: Multimodal Universal Language Embedding." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11254–61. http://dx.doi.org/10.1609/aaai.v34i07.6785.

Full text
Abstract:
Existing vision-language methods typically support two languages at a time at most. In this paper, we present a modular approach which can easily be incorporated into existing vision-language methods in order to support many languages. We accomplish this by learning a single shared Multimodal Universal Language Embedding (MULE) which has been visually-semantically aligned across all languages. Then we learn to relate MULE to visual data as if it were a single language. Our method is not architecture specific, unlike prior work which typically learned separate branches for each language, enabling our approach to easily be adapted to many vision-language methods and tasks. Since MULE learns a single language branch in the multimodal model, we can also scale to support many languages, and languages with fewer annotations can take advantage of the good representation learned from other (more abundant) language data. We demonstrate the effectiveness of our embeddings on the bidirectional image-sentence retrieval task, supporting up to four languages in a single model. In addition, we show that Machine Translation can be used for data augmentation in multilingual learning, which, combined with MULE, improves mean recall by up to 20.2% on a single language compared to prior work, with the most significant gains seen on languages with relatively few annotations. Our code is publicly available.
APA, Harvard, Vancouver, ISO, and other styles
20

Lazaridis, Michalis, Apostolos Axenopoulos, Dimitrios Rafailidis, and Petros Daras. "Multimedia search and retrieval using multimodal annotation propagation and indexing techniques." Signal Processing: Image Communication 28, no. 4 (April 2013): 351–67. http://dx.doi.org/10.1016/j.image.2012.04.001.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Cantalini, Giorgina, and Massimo Moneglia. "The annotation of gesture and gesture/prosody synchronization in multimodal speech corpora." Journal of Speech Sciences 9 (September 9, 2020): 07–30. http://dx.doi.org/10.20396/joss.v9i00.14956.

Full text
Abstract:
This paper was written with the aim of highlighting the functional and structural correlations between gesticulation and prosody, focusing on gesture/prosody synchronization in spontaneous spoken Italian. The gesture annotation used follows the LASG model (Bressem et al. 2013), while the prosodic annotation focuses on the identification of terminal and non-terminal prosodic breaks which, according to L-AcT (Cresti, 2000; Moneglia & Raso 2014), determine speech act boundaries and the information structure, respectively. Gesticulation co-occurs with speech in about 90% of the speech flow examined, and gestural arcs are synchronous with prosodic boundaries. Gesture Phrases, which contain the expressive phase (Stroke), never cross terminal prosodic boundaries, making the utterance the maximum unit for gesture/speech correlation. Strokes may correlate with all information unit types, however only infrequently with Dialogic Units (i.e. those functional to the management of the communication). The identification of linguistic units via the marking of prosodic boundaries allows us to understand the linguistic scope of the gesture, supporting its interpretation. Gestures may be linked at different linguistic levels, namely those of: a) the word level; b) the information unit phrase; c) the information unit function; d) the illocutionary value.
APA, Harvard, Vancouver, ISO, and other styles
22

Relyea, Robert, Darshan Bhanushali, Karan Manghi, Abhishek Vashist, Clark Hochgraf, Amlan Ganguly, Andres Kwasinski, Michael E. Kuhl, and Raymond Ptucha. "Improving Multimodal Localization Through Self-Supervision." Electronic Imaging 2020, no. 6 (January 26, 2020): 14–1. http://dx.doi.org/10.2352/issn.2470-1173.2020.6.iriacv-014.

Full text
Abstract:
Modern warehouses utilize fleets of robots for inventory management. To ensure efficient and safe operation, real-time localization of each agent is essential. Most robots follow metal tracks buried in the floor and use a grid of precisely mounted RFID tags for localization. As robotic agents in warehouses and manufacturing plants become ubiquitous, it would be advantageous to eliminate the need for these metal wires and RFID tags. Not only do they suffer from significant installation costs, the removal of wires would allow agents to travel to any area inside the building. Sensors including cameras and LiDAR have provided meaningful localization information for many different positioning system implementations. Fusing localization features from multiple sensor sources is a challenging task especially when the target localization task’s dataset is small. We propose a deep-learning based localization system which fuses features from an omnidirectional camera image and a 3D LiDAR point cloud to create a robust robot positioning model. Although the usage of vision and LiDAR eliminate the need for the precisely installed RFID tags, they do require the collection and annotation of ground truth training data. Deep neural networks thrive on lots of supervised data, and the collection of this data can be time consuming. Using a dataset collected in a warehouse environment, we evaluate the performance of two individual sensor models for localization accuracy. To minimize the need for extensive ground truth data collection, we introduce a self-supervised pretraining regimen to populate the image feature extraction network with meaningful weights before training on the target localization task with limited data. In this research, we demonstrate how our self-supervision improves accuracy and convergence of localization models without the need for additional sample annotation.
APA, Harvard, Vancouver, ISO, and other styles
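
Entry 22 fuses omnidirectional camera and 3D LiDAR features for robot localization, with self-supervised pretraining of the image branch. The sketch below illustrates only the late-fusion step with a generic two-branch regressor; it is not the authors' architecture, and the feature dimensions and layer sizes are assumptions.

# Sketch: generic late fusion of precomputed camera and LiDAR feature vectors
# into a 2-D position estimate. Illustrative only; the feature dimensions and
# layer sizes are assumptions, not the architecture from the paper.
import torch
import torch.nn as nn

class FusionLocalizer(nn.Module):
    def __init__(self, img_dim: int = 512, lidar_dim: int = 256):
        super().__init__()
        self.img_head = nn.Sequential(nn.Linear(img_dim, 128), nn.ReLU())
        self.lidar_head = nn.Sequential(nn.Linear(lidar_dim, 128), nn.ReLU())
        self.regressor = nn.Sequential(
            nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2)  # (x, y) position
        )

    def forward(self, img_feat, lidar_feat):
        fused = torch.cat([self.img_head(img_feat), self.lidar_head(lidar_feat)], dim=1)
        return self.regressor(fused)

if __name__ == "__main__":
    model = FusionLocalizer()
    img = torch.randn(4, 512)      # e.g. pooled CNN features per image
    lidar = torch.randn(4, 256)    # e.g. pooled point-cloud features
    positions = model(img, lidar)
    loss = nn.functional.mse_loss(positions, torch.zeros(4, 2))
    loss.backward()
    print("predicted positions:", positions.detach())
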
23

Weiß, Christof, Frank Zalkow, Vlora Arifi-Müller, Meinard Müller, Hendrik Vincent Koops, Anja Volk, and Harald G. Grohganz. "Schubert Winterreise Dataset." Journal on Computing and Cultural Heritage 14, no. 2 (June 2021): 1–18. http://dx.doi.org/10.1145/3429743.

Full text
Abstract:
This article presents a multimodal dataset comprising various representations and annotations of Franz Schubert’s song cycle Winterreise . Schubert’s seminal work constitutes an outstanding example of the Romantic song cycle—a central genre within Western classical music. Our dataset unifies several public sources and annotations carefully created by music experts, compiled in a comprehensive and consistent way. The multimodal representations comprise the singer’s lyrics, sheet music in different machine-readable formats, and audio recordings of nine performances, two of which are freely accessible for research purposes. By means of explicit musical measure positions, we establish a temporal alignment between the different representations, thus enabling a detailed comparison across different performances and modalities. Using these alignments, we provide for the different versions various musicological annotations describing tonal and structural characteristics. This metadata comprises chord annotations in different granularities, local and global annotations of musical keys, and segmentations into structural parts. From a technical perspective, the dataset allows for evaluating algorithmic approaches to tasks such as automated music transcription, cross-modal music alignment, or tonal analysis, and for testing these algorithms’ robustness across songs, performances, and modalities. From a musicological perspective, the dataset enables the systematic study of Schubert’s musical language and style in Winterreise and the comparison of annotations regarding different annotators and granularities. Beyond the research domain, the data may serve further purposes such as the didactic preparation of Schubert’s work and its presentation to a wider public by means of an interactive multimedia experience. With this article, we provide a detailed description of the dataset, indicate its potential for computational music analysis by means of several studies, and point out possibilities for future research.
APA, Harvard, Vancouver, ISO, and other styles
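
Entry 23 aligns lyrics, scores, and recordings through explicit musical measure positions. As a hedged illustration of how such measure-based alignment can be used, the sketch below interpolates a (possibly fractional) measure position to seconds in one performance; the onset table is invented, not taken from the dataset.

# Sketch: map a (possibly fractional) measure position to audio time using a
# measure-onset table for one performance. The onset values below are invented
# placeholders, not figures from the Winterreise dataset.
import numpy as np

# measure number -> onset time in seconds for one hypothetical recording
measure_onsets = np.array([0.0, 2.1, 4.3, 6.4, 8.6, 10.7])   # measures 1..6

def measure_to_seconds(measure_position: float) -> float:
    """Linear interpolation between annotated measure onsets (1-based measures)."""
    measures = np.arange(1, len(measure_onsets) + 1)
    return float(np.interp(measure_position, measures, measure_onsets))

if __name__ == "__main__":
    # e.g. a chord annotated halfway through measure 3 -> measure position 3.5
    print(f"measure 3.5 ~ {measure_to_seconds(3.5):.2f} s in this performance")
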
24

Landolsi, Mohamed Yassine, Hela Haj Mohamed, and Lotfi Ben Romdhane. "Image annotation in social networks using graph and multimodal deep learning features." Multimedia Tools and Applications 80, no. 8 (January 8, 2021): 12009–34. http://dx.doi.org/10.1007/s11042-020-09730-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Chen, Yu-Hua, and Radovan Bruncak. "Transcribear – Introducing a secure online transcription and annotation tool." Digital Scholarship in the Humanities 35, no. 2 (March 25, 2019): 265–75. http://dx.doi.org/10.1093/llc/fqz016.

Full text
Abstract:
Reliable high-quality transcription and/or annotation (a.k.a. ‘coding’) is essential for research in a variety of areas in Humanities and Social Sciences which make use of qualitative data such as interviews, focus groups, classroom observations, or any other audio/video recordings. A good tool can facilitate the work of transcription and annotation because the process is notoriously time-consuming and challenging. However, our survey indicates that few existing tools can accommodate the requirements for transcription and annotation (e.g. audio/video playback, spelling checks, keyboard shortcuts, adding tags of annotation) in one place so that a user does not need to constantly switch between multiple windows, for example, an audio player and a text editor. ‘Transcribear’ (https://transcribear.com) is therefore developed as an easy-to-use online tool which facilitates transcription and annotation on the same interface while this web tool operates offline so that a user’s recordings and transcripts can remain secure and confidential. To minimize human errors, the functionality of tag validation is also added. Originally designed for a multimodal corpus project UNNC CAWSE (https://www.nottingham.edu.cn/en/english/research/cawse/), this browser-based application can be customized for individual users’ needs in terms of the annotation scheme and corresponding shortcut keys. This article will explain how this new tool can make tedious and repetitive manual work faster and easier and at the same time improve the quality of outputs as the process of transcription and annotation tends to be prone to human errors. The limitations of Transcribear and future work will also be discussed.
APA, Harvard, Vancouver, ISO, and other styles
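
Entry 25 mentions tag validation as a safeguard against human error during transcription and annotation. The sketch below shows one generic way such a check could work, independent of Transcribear's actual implementation: it verifies that tags belong to a declared scheme and that opening and closing tags are balanced. The tag inventory and sample transcript are invented.

# Sketch: validate annotation tags in a transcript against a declared scheme.
# The tag inventory and the sample transcript are invented examples; Transcribear's
# real validation logic is not documented here.
import re

SCHEME = {"laugh", "pause", "gesture", "overlap"}          # assumed tag inventory
TAG = re.compile(r"<(/?)([a-z]+)>")

def validate(transcript: str):
    errors, stack = [], []
    for match in TAG.finditer(transcript):
        closing, name = match.group(1) == "/", match.group(2)
        if name not in SCHEME:
            errors.append(f"unknown tag <{name}> at offset {match.start()}")
        elif closing:
            if not stack or stack[-1] != name:
                errors.append(f"unmatched closing tag </{name}>")
            else:
                stack.pop()
        else:
            stack.append(name)
    errors.extend(f"tag <{name}> never closed" for name in stack)
    return errors

if __name__ == "__main__":
    sample = "so we <laugh>thought</laugh> it was <gesture>fine <pause>"
    for problem in validate(sample):
        print("WARNING:", problem)
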
26

Diemer, Stefan, Marie-Louise Brunner, and Selina Schmidt. "Compiling computer-mediated spoken language corpora." International Journal of Corpus Linguistics 21, no. 3 (September 19, 2016): 348–71. http://dx.doi.org/10.1075/ijcl.21.3.03die.

Full text
Abstract:
This paper discusses key issues in the compilation of spoken language corpora in a computer-mediated communication (CMC) environment, using data from the Corpus of Academic Spoken English (CASE), a corpus of Skype conversations currently being compiled at Saarland University, Germany, in cooperation with European and US partners. Based on first findings, Skype is presented as a suitable tool for collecting informal spoken data. In addition, new recommendations concerning data compilation and transcription are put forward to supplement existing best practice as presented in Wynne (2005). We recommend the preservation of multimodal features during anonymisation, and the addition of annotation elements already at the transcription stage, particularly CMC-related discourse features, English as a Lingua Franca (ELF) features (e.g. non-standard language and code-switching), as well as the inclusion of prosodic, paralinguistic, and non-verbal annotation. Additionally, we propose a layered corpus design in order to allow researchers to focus on specific annotation features.
APA, Harvard, Vancouver, ISO, and other styles
27

Boers, Frank, Paul Warren, Gina Grimshaw, and Anna Siyanova-Chanturia. "On the benefits of multimodal annotations for vocabulary uptake from reading." Computer Assisted Language Learning 30, no. 7 (August 2017): 709–25. http://dx.doi.org/10.1080/09588221.2017.1356335.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Millet, Agnès, and Isabelle Estève. "Transcribing and annotating multimodality." Gesture 10, no. 2-3 (December 31, 2010): 297–320. http://dx.doi.org/10.1075/gest.10.2-3.09mil.

Full text
Abstract:
This paper deals with the central question of transcribing deaf children’s productions. We present the annotation grid we created on Elan®, explaining in detail how and why the observation of the narrative productions of 6 to 12 year-old deaf children led us to modify the annotation schemes previously available. Deaf children resort to every resource available in both modalities: voice and gesture. Thus, these productions are fundamentally multimodal and bilingual. In order to describe these specific practices, we propose considering verbal and non-verbal, vocal and gestural, materials as parts of one integrated production. A linguistic-centered transcription is not efficient in describing such bimodal productions, since describing bimodal utterances implies taking into account the ‘communicative desire’ (‘vouloir-dire’) of the children. For this reason, both the question of the transcription unit and the issue of the complexity of semiotic interactions in bimodal utterances need to be reconsidered.
APA, Harvard, Vancouver, ISO, and other styles
29

Drăgan, Nicolae Sorin. "Left/Right Polarity in Gestures and Politics." Romanian Journal of Communication and Public Relations 20, no. 3 (December 1, 2018): 53. http://dx.doi.org/10.21018/rjcpr.2018.3.265.

Full text
Abstract:
In this article we investigate how political actors involved in TV debates during the 2009 and 2014 presidential elections in Romania manage the relationship between handedness (left/right polarity in hand gestures) and political orientation (left/right polarity in politics). For this purpose we developed a multimodal analysis of some relevant sequences from these debates. The practice of integrating the meanings of different semiotic resources allows a better understanding of the meaning of verbal discourse, actions and behavior of political actors involved in a particular communication situation. In addition, the professional multimodal analysis tool ELAN allows the annotation and dynamic analysis of the semiotic behavior of the political actors involved in the analyzed sequences.
APA, Harvard, Vancouver, ISO, and other styles
30

Chang, E., Kingshy Goh, G. Sychay, and Gang Wu. "CBSA: content-based soft annotation for multimodal image retrieval using Bayes point machines." IEEE Transactions on Circuits and Systems for Video Technology 13, no. 1 (January 2003): 26–38. http://dx.doi.org/10.1109/tcsvt.2002.808079.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Chapman, Roger J., and Philip J. Smith. "Asynchronous Communications to Support Distributed Work in the National Airspace System." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 46, no. 1 (September 2002): 41–45. http://dx.doi.org/10.1177/154193120204600109.

Full text
Abstract:
This research involved the evaluation of a multimodal asynchronous communications tool to support collaborative analysis of post-operations in the National Airspace System (NAS). Collaborating authors have been shown to provide different feedback with asynchronous speech-based communications compared to text. Voice synchronized with pointing in asynchronous annotation systems has been found to be more efficient in scheduling tasks than voice-only or text-only communication. This research investigated how synchronized voice and pointing annotation over asynchronously shared slide shows composed of post-operations graphical and tabular data differs in its effect compared to text-based annotation, as collections of flights ranked low by standard performance metrics are discussed by FAA (Federal Aviation Administration) and airline representatives. The results showed that the combined problem-solving and message-creation time was shorter when working in the voice and pointing mode than in the text-based mode, without an effect on the number and type of ideas generated for improving performance. In both modes the system was also considered useful and usable by both dispatchers and traffic managers.
APA, Harvard, Vancouver, ISO, and other styles
32

Belhi, Abdelhak, Abdelaziz Bouras, and Sebti Foufou. "Leveraging Known Data for Missing Label Prediction in Cultural Heritage Context." Applied Sciences 8, no. 10 (September 30, 2018): 1768. http://dx.doi.org/10.3390/app8101768.

Full text
Abstract:
Cultural heritage represents a reliable medium for history and knowledge transfer. Cultural heritage assets are often exhibited in museums and heritage sites all over the world. However, many assets are poorly labeled, which decreases their historical value. If an asset’s history is lost, its historical value is also lost. The classification and annotation of overlooked or incomplete cultural assets increase their historical value and allows the discovery of various types of historical links. In this paper, we tackle the challenge of automatically classifying and annotating cultural heritage assets using their visual features as well as the metadata available at hand. Traditional approaches mainly rely only on image data and machine-learning-based techniques to predict missing labels. Often, visual data are not the only information available at hand. In this paper, we present a novel multimodal classification approach for cultural heritage assets that relies on a multitask neural network where a convolutional neural network (CNN) is designed for visual feature learning and a regular neural network is used for textual feature learning. These networks are merged and trained using a shared loss. The combined networks rely on both image and textual features to achieve better asset classification. Initial tests related to painting assets showed that our approach performs better than traditional CNNs that only rely on images as input.
APA, Harvard, Vancouver, ISO, and other styles
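
Entry 32 merges a CNN branch for images with a regular neural network for textual metadata and trains both with a shared loss. The scaled-down sketch below follows that general two-branch design; the input shapes, layer sizes, and single cross-entropy loss are illustrative assumptions, not the authors' exact configuration.

# Sketch: a small two-branch classifier that concatenates visual and textual
# features before a shared classification head and a single loss, in the spirit
# of the approach described above. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class MultimodalClassifier(nn.Module):
    def __init__(self, text_dim: int = 300, n_classes: int = 10):
        super().__init__()
        self.cnn = nn.Sequential(                       # visual branch
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.text = nn.Sequential(nn.Linear(text_dim, 64), nn.ReLU())  # textual branch
        self.head = nn.Linear(32 + 64, n_classes)       # merged, shared head

    def forward(self, image, text_features):
        return self.head(torch.cat([self.cnn(image), self.text(text_features)], dim=1))

if __name__ == "__main__":
    model = MultimodalClassifier()
    images = torch.randn(8, 3, 64, 64)
    text = torch.randn(8, 300)        # e.g. averaged word embeddings of metadata
    labels = torch.randint(0, 10, (8,))
    loss = nn.CrossEntropyLoss()(model(images, text), labels)  # shared loss
    loss.backward()
    print("loss:", float(loss))
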
33

Gomes de Melo Bezerra, Jéssica Tayrine, Paula Michely Soares da Silva, and Marianne Carvalho Bezerra Cavalcante. "Softwares de transcrição como auxílio para as pesquisas com enfoque multimodal no processo de aquisição da linguagem." Texto Livre: Linguagem e Tecnologia 9, no. 1 (July 13, 2016): 77–93. http://dx.doi.org/10.17851/1983-3652.9.1.77-93.

Full text
Abstract:
This paper aims to contribute to research on language acquisition by showing how two free transcription software packages can support the organization and analysis of naturally occurring communicative interactions from a multimodal perspective. The multimodal approach to language acquisition proposes that gestures and vocal productions cannot be dissociated and form a single matrix of meaning (Kendon, 1982, 2000, 2004; McNeill, 1985, 1992, 2000). To this end, the programs EUDICO Linguistic Annotator (ELAN) and Computerized Language ANalysis (CLAN) are presented. ELAN, developed by the Max Planck Institute for Psycholinguistics, is a transcription tool designed to support the annotation of audio or video recordings. It was developed for linguistic analysis and can include, in addition to language data, annotations of gestures, which makes it attractive for multimodal research, since it allows the examination of verbal, non-verbal, and contextual information. CLAN is a software package that allows simultaneous annotation of speech, gestures, and contextual elements observed in communicative interactions. It supports analyses including frequency counts, word searches, co-occurrence analysis, MLU (Mean Length of Utterance) counts, text changes, and morphosyntactic analysis (MacWhinney, 2000). Overall, the paper is expected to support research on language that analyzes its corpora from a multimodal perspective. Keywords: interaction and technologies; transcription software; language acquisition.
APA, Harvard, Vancouver, ISO, and other styles
34

Szekrényes, István. "Annotation and interpretation of prosodic data in the HuComTech corpus for multimodal user interfaces." Journal on Multimodal User Interfaces 8, no. 2 (April 29, 2014): 143–50. http://dx.doi.org/10.1007/s12193-013-0140-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Martin, Jean-Claude, Radoslaw Niewiadomski, Laurence Devillers, Stephanie Buisine, and Catherine Pelachaud. "Multimodal complex emotions: Gesture expressivity and blended facial expressions." International Journal of Humanoid Robotics 3, no. 3 (September 2006): 269–91. http://dx.doi.org/10.1142/s0219843606000825.

Full text
Abstract:
One of the challenges of designing virtual humans is the definition of appropriate models of the relation between realistic emotions and the coordination of behaviors in several modalities. In this paper, we present the annotation, representation and modeling of multimodal visual behaviors occurring during complex emotions. We illustrate our work using a corpus of TV interviews. This corpus has been annotated at several levels of information: communicative acts, emotion labels, and multimodal signs. We have defined a copy-synthesis approach to drive an Embodied Conversational Agent from these different levels of information. The second part of our paper focuses on a model of complex (superposition and masking of) emotions in facial expressions of the agent. We explain how the complementary aspects of our work on the corpus and the computational model are used to specify complex emotional behaviors.
APA, Harvard, Vancouver, ISO, and other styles
36

Bolly, Catherine T., and Dominique Boutet. "The multimodal CorpAGEst corpus: keeping an eye on pragmatic competence in later life." Corpora 13, no. 3 (November 2018): 279–317. http://dx.doi.org/10.3366/cor.2018.0151.

Full text
Abstract:
The CorpAGEst project aims to study the pragmatic competence of very old people (75 years old and more), by looking at their use of verbal and gestural pragmatic markers in real-world settings (versus laboratory conditions). More precisely, we hypothesise that identifying multimodal pragmatic patterns in language use, as produced by older adults at the gesture–speech interface, helps to better characterise language variation and communication abilities in later life. The underlying assumption is that discourse markers (e.g., tu sais ‘you know’) and pragmatic gestures (e.g., an exaggerated opening of the eyes) are relevant indicators of stance in discourse. This paper's objective is mainly methodological. It aims to demonstrate how the pragmatic profile of older adults can be established by analysing audio and video data. After a brief theoretical introduction, we describe the annotation protocol that has been developed to explore issues in multimodal pragmatics and ageing. Lastly, first results from a corpus-based study are given, showing how multimodal approaches can tackle important aspects of communicative abilities, at the crossroads of language and ageing research in linguistics.
APA, Harvard, Vancouver, ISO, and other styles
37

Ladewig, Silva, and Lena Hotze. "Zur temporalen Entfaltung und multimodalen Orchestrierung von konzeptuellen Räumen am Beispiel einer Erzählung." Linguistik Online 104, no. 4 (November 15, 2020): 109–36. http://dx.doi.org/10.13092/lo.104.7320.

Full text
Abstract:
The study presented in this article investigates the temporal unfolding and multimodal orchestration of meaning in a narration. Two aspects are focused on. First, the temporal and multimodal orchestration of conceptual spaces in the entire narrative is described. Five conceptual spaces were identified which were construed by multiple visual-kinesic modalities and speech. Moreover, the study showed that the conceptual spaces are often created simultaneously, which, however, does not lead to communication problems due to the media properties of the modalities involved (see also Schmitt 2005). The second part of the analysis zoomed in onto the phase of the narrative climax in which the multimodal production of the narrative space with role shift dominated. By applying a timeline-annotation procedure for gestures (Müller/Ladewig 2013) a temporally unfolding salience structure (Müller/Tag 2010) could be reconstructed which highlights certain semantic aspects in the creation and flow of multimodal meaning. Thus, specific information “necessary” to understand the climax of the narration was foregrounded and made prominent for a co-participant. By focusing methodically and theoretically on the temporal structure and the interplay of different modalities, the paper offers a further contribution to the current discussion about temporality, dynamics and multimodality of language (Deppermann/Günthner 2015; Müller 2008b).
APA, Harvard, Vancouver, ISO, and other styles
38

Partarakis, Nikolaos, Xenophon Zabulis, Antonis Chatziantoniou, Nikolaos Patsiouras, and Ilia Adami. "An Approach to the Creation and Presentation of Reference Gesture Datasets, for the Preservation of Traditional Crafts." Applied Sciences 10, no. 20 (October 19, 2020): 7325. http://dx.doi.org/10.3390/app10207325.

Full text
Abstract:
A wide spectrum of digital data are becoming available to researchers and industries interested in the recording, documentation, recognition, and reproduction of human activities. In this work, we propose an approach for understanding and articulating human motion recordings into multimodal datasets and VR demonstrations of actions and activities relevant to traditional crafts. To implement the proposed approach, we introduce Animation Studio (AnimIO) that enables visualisation, editing, and semantic annotation of pertinent data. AnimIO is compatible with recordings acquired by Motion Capture (MoCap) and Computer Vision. Using AnimIO, the operator can isolate segments from multiple synchronous recordings and export them in multimodal animation files. AnimIO can be used to isolate motion segments that refer to individual craft actions, as described by practitioners. The proposed approach has been iteratively designed for use by non-experts in the domain of 3D motion digitisation.
APA, Harvard, Vancouver, ISO, and other styles
39

Rieser, V., and O. Lemon. "Learning human multimodal dialogue strategies." Natural Language Engineering 16, no. 1 (April 22, 2009): 3–23. http://dx.doi.org/10.1017/s1351324909005099.

Full text
Abstract:
We investigate the use of different machine learning methods in combination with feature selection techniques to explore human multimodal dialogue strategies and the use of those strategies for automated dialogue systems. We learn policies from data collected in a Wizard-of-Oz study where different human ‘wizards’ decide whether to ask a clarification request in a multimodal manner or else to use speech alone. We first describe the data collection, the coding scheme and annotated corpus, and the validation of the multimodal annotations. We then show that there is a uniform multimodal dialogue strategy across wizards, which is based on multiple features in the dialogue context. These are generic features, available at runtime, which can be implemented in dialogue systems. Our prediction models (for human wizard behaviour) achieve a weighted f-score of 88.6 per cent (which is a 25.6 per cent improvement over the majority baseline). We interpret and discuss the learned strategy. We conclude that human wizard behaviour is not optimal for automatic dialogue systems, and argue for the use of automatic optimization methods, such as Reinforcement Learning. Throughout the investigation we also discuss the issues arising from using small initial Wizard-of-Oz data sets, and we show that feature engineering is an essential step when learning dialogue strategies from such limited data.
APA, Harvard, Vancouver, ISO, and other styles
40

Caicedo, Juan C., Jaafar BenAbdallah, Fabio A. González, and Olfa Nasraoui. "Multimodal representation, indexing, automated annotation and retrieval of image collections via non-negative matrix factorization." Neurocomputing 76, no. 1 (January 2012): 50–60. http://dx.doi.org/10.1016/j.neucom.2011.04.037.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Larkey, Edward. "Narratological Approaches to Multimodal Cross-Cultural Comparisons of Global TV Formats." Audiovisual Data in Digital Humanities 7, no. 14 (December 31, 2018): 38. http://dx.doi.org/10.18146/2213-0969.2018.jethc152.

Full text
Abstract:
This article cross-culturally compares different versions of the Quebec sitcom/sketch comedy television series Un Gars, Une Fille (1997-2002) by examining the various gender roles and family conflict management strategies in a scene in which the heterosexual couple visits the male character’s mother-in-law. The article summarizes similarities and differences in the narrative structure, sequencing and content of several format adaptations by compiling computer-generated quantitative and qualitative data on the length of segments. To accomplish this, I have used the annotation function of Adobe Premiere, and visualized the findings using Microsoft Excel bar graphs and tables. This study applies a multimodal methodology to reveal the textual organization of scenes, shots and sequences which guide viewers toward culturally proxemic interpretations. This article discusses the benefits of applying the notion of discursive proximity suggested by Uribe-Jongbloed and Espinosa-Medina (2014) to gain a more comprehensive and complex understanding of the multimodal nature of cross-cultural comparison of global television format adaptations.
APA, Harvard, Vancouver, ISO, and other styles
42

Tan, Sabine, Michael Wiebrands, Kay O’Halloran, and Peter Wignell. "Analysing student engagement with 360-degree videos through multimodal data analytics and user annotations." Technology, Pedagogy and Education 29, no. 5 (October 19, 2020): 593–612. http://dx.doi.org/10.1080/1475939x.2020.1835708.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Race, Alan M., Daniel Sutton, Gregory Hamm, Gareth Maglennon, Jennifer P. Morton, Nicole Strittmatter, Andrew Campbell, et al. "Deep Learning-Based Annotation Transfer between Molecular Imaging Modalities: An Automated Workflow for Multimodal Data Integration." Analytical Chemistry 93, no. 6 (February 3, 2021): 3061–71. http://dx.doi.org/10.1021/acs.analchem.0c02726.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Barbosa do Rêgo Barros, Isabela, Renata Fonseca Lima da Fonte, and Ana Fabrícia Rodrigues de Souza. "Ecolalia e gestos no autismo: reflexões em torno da metáfora enunciativa." Forma y Función 33, no. 1 (January 1, 2020): 173–89. http://dx.doi.org/10.15446/fyf.v33n1.84184.

Full text
Abstract:
We set out to study language in autism within the enunciative linguistic field and from a multimodal perspective on language, and we identify echolalia as belonging to the field of metaphor through its relationship with gesture and with the enunciative context. The study is grounded in Benveniste's enunciative theory and in the multimodal matrix of language proposed by McNeill, in order to discuss the possibility of echolalia coexisting as metaphor in the language of an autistic child. Methodologically, we conducted a qualitative case study in which we selected fragments of echolalia extracted from the database of the Study and Care Group for the Autism Spectrum, transcribed using the ELAN software (EUDICO Linguistic Annotator). The data showed the multimodal functioning of echolalia perceived as metaphor through an analogical transfer of denomination produced in discourse, based on the stereotyped gestures associated with it.
APA, Harvard, Vancouver, ISO, and other styles
45

De Melo, Ediclécia Sousa, Ivonaldo Leidson Barbosa Lima, and Paulo Vinícius Ávila Nóbrega. "A emergência do gesto de apontar na Síndrome de Down em contexto clínico." Entrepalavras 9, no. 3 (October 15, 2019): 442. http://dx.doi.org/10.22168/2237-6321-31601.

Full text
Abstract:
Down syndrome (DS) is a genetic condition that causes a range of changes in a child's linguistic development. With regard to language, it is important to consider its multimodal nature, treating it as a linguistic envelope (Ávila-Nóbrega, 2018) made up of gestures, speech, and gaze. It is therefore essential to observe how the multimodal linguistic envelope emerges and is mobilized by children with DS in different contexts. This study aimed to analyze the emergence of the pointing gesture in children with DS in clinical care contexts. The study was carried out at the Speech-Language Pathology Teaching Clinic of a higher education institution in Paraíba, based on recordings of speech-language therapy sessions of two dyads, each consisting of a trainee and a child with DS. The data were registered qualitatively in the ELAN (EUDICO Linguistic Annotator) program. The pointing gesture was observed to emerge frequently in the children's multimodal linguistic envelope, facilitating their interaction with the trainees as well as joint attention and reference.
APA, Harvard, Vancouver, ISO, and other styles
46

Pamart, A., F. Ponchio, V. Abergel, A. Alaoui M'Darhri, M. Corsini, M. Dellepiane, F. Morlet, R. Scopigno, and L. De Luca. "A complete framework operating spatially-oriented RTI in a 3D/2D cultural heritage documentation and analysis tool." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W9 (January 31, 2019): 573–80. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w9-573-2019.

Full text
Abstract:
Close-Range Photogrammetry (CRP) and Reflectance Transformation Imaging (RTI) are two of the most used image-based techniques for documenting and analyzing Cultural Heritage (CH) objects. Nevertheless, their potential impact in supporting the study and analysis of the conservation status of CH assets is reduced, as they remain mostly applied and analyzed separately. This is mostly because easy-to-use tools for the spatial registration of multimodal data and features for joint visualisation are missing. The aim of this paper is to describe a complete framework for effective data fusion and to present a user-friendly viewer enabling the joint visual analysis of 2D/3D data and RTI images. This contribution is framed by the ongoing implementation of automatic multimodal registration (3D, 2D RGB and RTI) into a collaborative web platform (AIOLI) enabling the management of hybrid representations through an intuitive visualization framework and also supporting semantic enrichment through spatialized 2D/3D annotations.
APA, Harvard, Vancouver, ISO, and other styles
47

Poggi, Isabella. "Signals of intensification and attenuation in orchestra and choir conduction." Normas 7, no. 1 (June 23, 2017): 33. http://dx.doi.org/10.7203/normas.7.10423.

Full text
Abstract:
Based on a model of communication according to which not only words but also body signals constitute lexicons (Poggi, 2007), the study presented aims at building a lexicon of conductors’ multimodal behaviours requesting intensification and attenuation of sound intensity. In a corpus of concerts and rehearsals, the conductors’ body signals requesting to play or sing forte, piano, crescendo, diminuendo were analysed through an annotation scheme describing the body signals, their meanings, and their semiotic devices: generic codified (the same as in everyday language); specific codified (shared with laypeople but with specific meanings in conduction); direct iconic (resemblance between visual and acoustic modality); indirect iconic (evoking the technical movement by connected movements or emotion expressions). The work outlines a lexicon of the conductors’ signals that in gesture, head, face, gaze, posture, and body convey attenuation and intensification in music.
APA, Harvard, Vancouver, ISO, and other styles
48

Celata, Chiara, Chiara Meluzzi, and Irene Ricci. "The sociophonetics of rhotic variation in Sicilian dialects and Sicilian Italian: corpus, methodology and first results." Loquens 3, no. 1 (September 29, 2016): 025. http://dx.doi.org/10.3989/loquens.2016.025.

Full text
Abstract:
SoPhISM (The SocioPhonetics of verbal Interaction: Sicilian Multimodal corpus) is an acoustic and articulatory sociophonetic corpus focused on within-speaker variation as a function of stylistic/communicative factors. The corpus is particularly intended for the study of rhotics as a sociolinguistic variable in the production of Sicilian speakers. Rhotics are analyzed according to the distinction between single-phase and multiple-phase rhotics along with the presence of constriction and aperture articulatory phases. Based on these parameters, the annotation protocol seeks to classify rhotic variants within a sufficiently granular, but internally consistent, phonetic perspective. The proposed descriptive parameters allow for the discussion of atypical realizations in terms of phonetic derivations (or simplifications) of typical closure–aperture sequences. The distribution of fricative variants in the speech repertoire of one speaker and his interlocutors shows the potential provided by SoPhISM for sociophonetic variation to be studied at the ‘micro’ level of individual speakers’ idiolects.
APA, Harvard, Vancouver, ISO, and other styles
49

Kalir, Jeremiah H., and Antero Garcia. "Civic Writing on Digital Walls." Journal of Literacy Research 51, no. 4 (October 3, 2019): 420–43. http://dx.doi.org/10.1177/1086296x19877208.

Full text
Abstract:
Civic writing has appeared on walls over centuries, across cultures, and in response to political concerns. This article advances a civic interrogation of how civic writing is publicly authored, read, and discussed as openly accessible and multimodal texts on digital walls. Drawing upon critical literacy perspectives, we examine how a repertoire of 10 civic writing practices associated with open web annotation (OWA) helped educators develop critical literacy. We introduce a social design experiment in which educators leveraged OWA to discuss educational equity across sociopolitical texts and contexts. We then describe a single case of OWA conversation among educators and use discourse analysis to examine shifting situated meanings and political expressions present in educators’ civic writing practices. We conclude by considering implications for theorizing the marginality of critical literacy, designing learning environments that foster educators’ civic writing, and facilitating learning opportunities that encourage educators’ civic writing across digital walls.
APA, Harvard, Vancouver, ISO, and other styles
50

Trujillo, James P., and Judith Holler. "The Kinematics of Social Action: Visual Signals Provide Cues for What Interlocutors Do in Conversation." Brain Sciences 11, no. 8 (July 28, 2021): 996. http://dx.doi.org/10.3390/brainsci11080996.

Full text
Abstract:
During natural conversation, people must quickly understand the meaning of what the other speaker is saying. This concerns not just the semantic content of an utterance, but also the social action (i.e., what the utterance is doing—requesting information, offering, evaluating, checking mutual understanding, etc.) that the utterance is performing. The multimodal nature of human language raises the question of whether visual signals may contribute to the rapid processing of such social actions. However, while previous research has shown that how we move reveals the intentions underlying instrumental actions, we do not know whether the intentions underlying fine-grained social actions in conversation are also revealed in our bodily movements. Using a corpus of dyadic conversations combined with manual annotation and motion tracking, we analyzed the kinematics of the torso, head, and hands during the asking of questions. Manual annotation categorized these questions into six more fine-grained social action types (i.e., request for information, other-initiated repair, understanding check, stance or sentiment, self-directed, active participation). We demonstrate, for the first time, that the kinematics of the torso, head and hands differ between some of these different social action categories based on a 900 ms time window that captures movements starting slightly prior to or within 600 ms after utterance onset. These results provide novel insights into the extent to which our intentions shape the way that we move, and provide new avenues for understanding how this phenomenon may facilitate the fast communication of meaning in conversational interaction, social action, and conversation.
APA, Harvard, Vancouver, ISO, and other styles
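
Entry 50 analyses torso, head, and hand kinematics inside a 900 ms window around utterance onset. As a hedged illustration of the kind of window-based kinematic features involved, the sketch below computes peak speed and path length for one tracked keypoint; the sampling rate, window bounds, and synthetic trajectory are assumptions, not the study's parameters.

# Sketch: compute simple kinematic features (peak speed, path length) for one
# tracked keypoint inside a time window around utterance onset. The 900 ms
# window, sampling rate and synthetic trajectory are illustrative assumptions.
import numpy as np

def window_kinematics(positions, fs, t_start, t_end):
    """positions: (n_frames, 3) array of x/y/z keypoint coordinates."""
    i0, i1 = int(t_start * fs), int(t_end * fs)
    segment = positions[i0:i1]
    step = np.linalg.norm(np.diff(segment, axis=0), axis=1)   # per-frame displacement
    speed = step * fs                                          # units per second
    return {"peak_speed": float(speed.max()), "path_length": float(step.sum())}

if __name__ == "__main__":
    fs = 100.0                                   # assumed 100 Hz motion tracking
    t = np.arange(0, 2.0, 1 / fs)
    hand = np.stack([0.2 * np.sin(2 * np.pi * t), 0.1 * t, np.zeros_like(t)], axis=1)
    # Window from 300 ms before to 600 ms after an utterance onset at t = 1.0 s
    print(window_kinematics(hand, fs, t_start=0.7, t_end=1.6))
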