

Academic literature on the topic "Audiovisual synthesis"

Author: Grafiati

Published on 10 May 2025


Browse thematic lists of journal articles, books, theses, conference reports, and other academic sources on the topic "Audiovisual synthesis".


Journal articles on the topic "Audiovisual synthesis"

Lokki, Tapio, Jarmo Hiipakka, Rami Hänninen, Tommi Ilmonen, Lauri Savioja, and Tapio Takala. "Realtime audiovisual rendering and contemporary audiovisual art." Organised Sound 3, no. 3 (December 1998): 219–33. http://dx.doi.org/10.1017/s1355771898003069.

Abstract:

Visual rendering is the process of creating synthetic images of digital models. The modelling of sound synthesis and propagation in a virtual space is called sound rendering. In this article we review different audiovisual rendering techniques suitable for realtime rendering of three-dimensional virtual worlds. Virtual environments are useful in various application areas, for example in architectural visualisation. With audiovisual rendering, lighting and acoustics of a modelled concert hall can be experienced early in the design stage of the building. In this article we demonstrate an interactive audiovisual rendering system where an animated virtual orchestra plays in a modelled concert hall. Virtual musicians are conducted by a real conductor who wears a wired data dress suit and a baton. The conductor and the audience hear the music rendered according to the acoustics of the virtual concert hall, creating a lifelike experience.

Pueo Ortega, Basilio, and Victoria Tur Viñes. "Sonido espacial para una inmersión audiovisual de alto realismo." Revista ICONO14 Revista científica de Comunicación y Tecnologías emergentes 7, no. 2 (1 July 2009): 334–45. http://dx.doi.org/10.7195/ri14.v7i2.330.

Abstract:

High-immersion video and audio systems are booming in realistic audiovisual environments. The visual and auditory sensations they create in the audience closely approximate what is perceived in the real environment they aim to recreate. To achieve this, the stimuli must contain all the necessary information, both spatial and temporal, to create the illusion that the audiovisual object is real. This article reviews the audiovisual systems that enable this recreation, with special attention to surround audio systems. The most promising 3D audio technique, Wave Field Synthesis, is described, along with various fields of application for high-realism audiovisual environments.

Adaikhanovna, Utemgaliyeva Nassikhat, Bektemirova Saule Bekmukhamedovna, Odanova Sagira Amangeldiyevna, William P. Rivers, and Akimisheva Zhanar Abdisadykkyzy. "Texts with academic terms." XLinguae 15, no. 2 (April 2022): 121–29. http://dx.doi.org/10.18355/xl.2022.15.02.09.

Abstract:

The article discusses innovations in the audiovisual translation of texts with academic terms. The modern world does not stand still as new technologies emerge that make it possible to create a large amount of audiovisual content. Every year, there are many recent films, TV series, and cartoons in foreign languages that require translation. As a result, audiovisual translation is becoming ever more relevant for research. Our paper aims to identify the main features of audiovisual translation as a particular type of translation activity. The research objective is the process of audiovisual translation as a special type of translation activity. The subject of the study pertains to features of subtitling as a type of audiovisual translation. The theoretical basis of the research consists of the works of scientists in the field of cultural studies (L.G. Dunyasheva, J. Mitri, etc.), semiotics (R. Barth, Y.M. Lotman, U. Eco), discursive linguistics (N. D. Arutyunova, T. van Dyck, M.L. Makarov, O.A. Radchenko, etc.), translation studies (V.S. Vinogradova, T.A. Volkova, V.N. Komissarova, etc.), and theory and practice of audiovisual translation (H. Dias-Synthesis, M.A. Efremova, A.V. Kozulyaeva, etc.).

Richards, Michael D., Herbert C. Goltz, and Agnes M. F. Wong. "Audiovisual perception in amblyopia: A review and synthesis." Experimental Eye Research 183 (June 2019): 68–75. http://dx.doi.org/10.1016/j.exer.2018.04.017.

Dufour, Frank, and Lee Dufour. "DreamArchitectonics: An Interactive Audiovisual Installation." Leonardo 51, no. 2 (April 2018): 105–10. http://dx.doi.org/10.1162/leon_a_01188.

Abstract:

This article presents the processes that guided the production of the interactive artwork DreamArchitectonics, attempting to render perceivable the altered experience of time characteristic of the dream-state. This project originated with the observation of dream reports that were revealed, across a broad variety of contents, to be relatively invariant in form, with this form appearing to function as a mnemonic artifact allowing the dreamer to actually remember dreams. The details of the representational process applied to oneiric time and manifested in these artifacts have been identified to resonate meaningfully with poetic expression, especially in its relationship to the sensation of movement. DreamArchitectonics aims at producing the context for an experiential synthesis of this intuition and acting as the generator of phenomenological data in a disposition that the authors envision as the most fruitful for collaboration between arts and sciences.

Li, Yuanqing, Fangyi Wang, Yongbin Chen, Andrzej Cichocki, and Terrence Sejnowski. "The Effects of Audiovisual Inputs on Solving the Cocktail Party Problem in the Human Brain: An fMRI Study." Cerebral Cortex 28, no. 10 (25 September 2017): 3623–37. http://dx.doi.org/10.1093/cercor/bhx235.

Abstract:

At cocktail parties, our brains often simultaneously receive visual and auditory information. Although the cocktail party problem has been widely investigated under auditory-only settings, the effects of audiovisual inputs have not. This study explored the effects of audiovisual inputs in a simulated cocktail party. In our fMRI experiment, each congruent audiovisual stimulus was a synthesis of 2 facial movie clips, each of which could be classified into 1 of 2 emotion categories (crying and laughing). Visual-only (faces) and auditory-only stimuli (voices) were created by extracting the visual and auditory contents from the synthesized audiovisual stimuli. Subjects were instructed to selectively attend to 1 of the 2 objects contained in each stimulus and to judge its emotion category in the visual-only, auditory-only, and audiovisual conditions. The neural representations of the emotion features were assessed by calculating decoding accuracy and brain pattern-related reproducibility index based on the fMRI data. We compared the audiovisual condition with the visual-only and auditory-only conditions and found that audiovisual inputs enhanced the neural representations of emotion features of the attended objects instead of the unattended objects. This enhancement might partially explain the benefits of audiovisual inputs for the brain to solve the cocktail party problem.

Doel, Kees van den, Dave Knott, and Dinesh K. Pai. "Interactive Simulation of Complex Audiovisual Scenes." Presence: Teleoperators and Virtual Environments 13, no. 1 (February 2004): 99–111. http://dx.doi.org/10.1162/105474604774048252.

Abstract:

We demonstrate a method for efficiently rendering the audio generated by graphical scenes with a large number of sounding objects. This is achieved by using modal synthesis for rigid bodies and rendering only those modes that we judge to be audible to a user observing the scene. We show how excitations of modes can be estimated and inaudible modes eliminated based on the masking characteristics of the human ear. We describe a novel technique for generating contact events by performing closed-form particle simulation and collision detection with the aid of programmable graphics hardware. The effectiveness of our system is shown in the context of suitably complex simulations.

Pecheranskyi, Ihor. "Brief Technical History and Audiovisual Parameters of Electromechanical Television." Bulletin of Kyiv National University of Culture and Arts. Series in Audiovisual Art and Production 6, no. 2 (20 October 2023): 263–76. http://dx.doi.org/10.31866/2617-2674.6.2.2023.289313.

Abstract:

The purpose of the research is to characterize the most important milestones in the technical history and audiovisual parameters of electromechanical (mechanical) television of the 1840s–1930s as an audiovisual ecosystem. Research methodology. The study uses, firstly, an ecosystem approach, which made it possible to qualify television as a networked audiovisual ecosystem with internal dynamics and external interactions, and secondly, media archaeology as a field that attempts to understand the early stage and electromechanical practices of television through the prism of technical history, and, thirdly, general scientific methods of analysis and synthesis, induction and deduction, generalization and abstraction when working with theoretical material. Scientific novelty. For the first time, the article comprehensively and at the appropriate theoretical level considers the most significant milestones in the development of electromechanical (mechanical) television of the period and its audiovisual parameters. Conclusions. It is proved that during the 40 years since the patent for the “Nipkow disc” was granted in 1885 to the first public demonstration of television moving images by the Scottish inventor John Logie Baird in 1925, electromechanical TV has gone through a rapid and very significant path from broadcasting a static image (analogue of photography) to transmitting a moving image (analogue of cinema). It has been substantiated that despite numerous experiments aimed at “collaborating” the means of preserving and transmitting sound and image (telegraph, radio, telephone), early mechanical television broadcasts remained silent and black and white. 
It is emphasized that further technical development and improvement of audiovisual parameters of mechanical television led to the deepening of audiovisual synthesis in the industry and its transformation, which first resulted in the emergence of electromechanical and electronic systems with the ability to preserve colour images and later regular and cable television broadcasting systems.

Schabus, Dietmar, Michael Pucher, and Gregor Hofer. "Joint Audiovisual Hidden Semi-Markov Model-Based Speech Synthesis." IEEE Journal of Selected Topics in Signal Processing 8, no. 2 (April 2014): 336–47. http://dx.doi.org/10.1109/jstsp.2013.2281036.

Zozulia, I., A. Stadnii, and A. Slobodianiuk. "Audiovisual teaching aids in the formation process of foreign language communicative competence." Teaching languages at higher institutions, no. 40 (30 May 2022): 12–28. http://dx.doi.org/10.26565/2073-4379-2022-40-01.

Abstract:

The modern visually-oriented world is the world of real and virtual possibilities due to the application of information technologies. Therefore, television and the Internet are used not only for entertainment but also for educational purposes in all the spheres of human activity including education. In didactics, audiovisuals play an important role in mastering foreign languages. The appropriateness of their use in the educational process is stipulated by their implementation facilitating all general didactic learning principles – activity, consciousness, consistency, and visualization as well. Audio and video materials are aimed at improving the perception efficiency of the program material, checking the level of its assimilation, and mastering the abilities and practical skills for application of the acquired knowledge. The aim of our analysis was an attempt to define the most basic principles for the formation of competence in a foreign language in students from non-language institutions of higher education based on the use of audiovisual teaching aids. During the study, the authors conducted an analysis and synthesis of the best pedagogical practice, generalized their observations, and experimentally verified the validity of using audiovisual teaching aids in the formation of foreign language competence of foreign students of the preparatory department and the main stage of training (1st-4th year students), and 1st-2nd year Ukrainian students of Vinnytsia National Technical University. The article clarifies the concept of foreign language competence. There have been analyzed the main methods of forming students’ foreign language competence based on the use of audio and video materials (phonetic recordings, texts recordings, audio lessons, podcasts, video clips, video films, electronic textbooks, online dictionaries, video conferences, virtual seminars, telecommunication projects). 
The authors have substantiated theoretically and tested experimentally the effectiveness of the use of audiovisual teaching aids and their influence on the formation of speech competence among Ukrainian and foreign students. They have proved that while using audiovisuals in the classroom one should take into account the cognitive patterns of students’ learning activities, and their readiness to perceive and assimilate the learning material. It is also important to ensure an organic combination of specific audiovisuals with the teacher’s professional skills. The creation of the foreign language communicative environment will help both optimize the training of future engineers of foreign language professional communication, and also form their professional communicative competence.

Theses on the topic "Audiovisual synthesis"

Mital, Parag Kumar. "Audiovisual scene synthesis." Thesis, Goldsmiths College (University of London), 2014. http://research.gold.ac.uk/10662/.

Abstract:

This thesis attempts to open a dialogue around fundamental questions of perception such as: how do we represent our ongoing auditory or visual perception of the world using our brain; what could these representations explain and not explain; and how can these representations eventually be modeled by computers? Rather than answer these questions scientifically, we will attempt to develop a computational arts practice presenting these questions to participants. The approach this thesis takes is computational scene synthesis: a computationally generative collage process where the units of the collage are built using perceptually-inspired representations. We explain how scene synthesis is built in detail and relate it to an existing lineage of collage-based practitioners. Then, working in auditory and visual domains separately, in order to bring questions of perception to the experience of the artwork, this thesis makes significant interdisciplinary strides from reviewing fundamental issues in perception in terms of experimental psychology and cognitive neuroscience, to formulating and developing perceptually-inspired computational models of large databases of audiovisual material, to finally developing these models with a computationally generative collage-based arts practice. Two final practical outputs using audiovisual scene synthesis will be explored: (1) a short film series which attempts to recreate the number 1 video of the week on YouTube using only the audiovisual content from the remaining top 10 videos; and (2) a real-time augmented reality experience presented through a virtual reality headset and headphones presenting a scene synthesis of a participant's surroundings using only previously learned audiovisual fragments. Results from both outputs demonstrate the ability for scene synthesis to provoke meaningful engagements with one's own process of perception. 
The results further demonstrate that scene synthesis is capable of highlighting both theoretical and practical gaps in our current understanding of human perception and their computational implementations.

Thomas, Zach (Zachary R.). "Audiovisual Concatenative Synthesis and 'Replica'." Thesis, University of North Texas, 2019. https://digital.library.unt.edu/ark:/67531/metadc1538747/.

Abstract:

Audiovisual concatenative synthesis is an analysis-driven granular technique using a corpus of multimedia data to sequence audio and video streams on a microtemporal level. This text outlines my development of this technique as a tool for multimedia composition, using my work, Replica, as a case study. The paper illustrates how the concepts of granular structure, gesture capture, and replication are integral not only to the software but to the architecture of the composition. In doing so, machine learning approaches to music and visual art are reviewed and related to my personal compositional practice. Additionally, I attempt to show how audiovisual concatenative synthesis provides a composer with strategies for shaping one's sense of time through disorienting audiovisual cues and tightly organized counterpoint between sound and image, stage and screen, and the real and virtual.

Melenchón, Maldonado Javier. "Síntesis Audiovisual Realista Personalizable." Doctoral thesis, Universitat Ramon Llull, 2007. http://hdl.handle.net/10803/9133.

Abstract:

A shared framework is presented for realistic, personalizable audiovisual synthesis and analysis of audiovisual sequences of talking heads and visual sequences of sign language in a domestic environment. The former offers fully synchronized animation driven by a text or speech source; the latter consists of finger spelling. Their personalization capabilities ease the creation of audiovisual sequences by non-expert users. Applications range from realistic virtual avatars for natural interaction or video games to low-bandwidth videoconferencing and visual telephony for the hard of hearing, including support for speech therapists. Long sequences can be processed with reduced resources, especially storage, thanks to the proposed scheme for incremental singular value decomposition with mean preservation. This scheme is complemented by three others: the decremental, the split, and the composed ones.

Mohamadi, Tayeb. "Synthèse à partir du texte de visages parlants : réalisation d'un prototype et mesures d'intelligibilité bimodale." Grenoble INPG, 1993. http://www.theses.fr/1993INPG0010.

Abstract:

The aim of this study is the geometric analysis of the different lip shapes in French, their audiovisual intelligibility, and the creation of a prototype French talking-face synthesizer. In this manuscript, we first review the role of the lips in speech production and the contribution of seeing them to the intelligibility of degraded speech (a phonetic analysis of the confusions among the selected vowels and consonants was carried out in parallel). We then present the results of a study of their geometry and movement, which identified about twenty basic lip shapes called visemes. Next, we present a prototype audiovisual text-to-speech synthesizer built from this set of visemes, along with its intelligibility evaluation. Finally, we evaluate the contribution to the intelligibility of degraded natural speech of two synthetic lip models developed at ICP, compared with the natural case.

Le, Goff Bertrand. "Synthèse à partir du texte de visage 3D parlant français." Grenoble INPG, 1997. http://www.theses.fr/1997INPG0140.

Abstract:

The research presented in this thesis focuses on the bimodality of speech. To provide a research tool for visual speech, a visual speech synthesizer was developed for French. It predicts the temporal commands of a face model from a phonetic input. First, we present the face model, which we adapted so that it can be animated by parameters directly measurable on the front and profile views of a reference speaker. The quality of the face modeling was evaluated through a set of perception tests. We then surveyed the different models that address the essential problem of speech: coarticulation. The approach we chose relies on the principle of dominance functions, which reproduce over time the influence of the production of each phonetic unit on its neighbors. A methodology, generalizable to other languages, was developed to automatically determine the characteristic coefficients of these dominance functions from data measured on a reference speaker. This visual synthesis was synchronized with an acoustic synthesizer, thus enabling audiovisual animation of the face model from any French text. This audiovisual synthesis was evaluated through several tests. The parameter trajectories produced by the visual synthesizer were compared quantitatively with the trajectories observed on the reference speaker. The visual synthesizer was also evaluated in terms of intelligibility and compared with the intelligibility of the same face model driven by analysis/synthesis. This evaluation showed that the intelligibility of the model animated by the visual synthesizer is equivalent to that of the model animated by analysis/synthesis.

Boutet, de Monvel Violaine. "Du feedback vidéo à l'IA générative : sur la récursivité dans les arts et médias." Electronic Thesis or Diss., Paris 3, 2025. http://www.theses.fr/2025PA030009.

Abstract:

This thesis raises, through the prism of feedback, a bridge between pioneer video art from the 1960s to the 1980s and the practices associated with generative AI, which the phenomenal advances in deep learning have precipitated since the mid-2010s. Retroaction in cybernetics refers to the self-regulation by the loop of natural and technological systems. Applied to closed-circuit analog, digital or hybrid setups, this automated process also qualifies the contingent effects that result from it on screen. The first section looks back at the colossal influence that information theory and the notion of noise have had on the genesis of the video genre since the advent of the medium, in 1965. It revolves around the narcissistic paradigm (Rosalind Krauss, 1976) that essentialized its canons until the late 1970s, by analyzing the central place occupied by human perception and its televisual prosthetic extension. The second section focuses on the concurrent exploration of so-called machine vision, in dialogue with the tools (Steina and Woody Vasulka, 1976). Building upon the technocratic reversal of aesthetics then inherent to real-time image processing, a transition is made from audiovisual synthesis to its cinematic, and artificial counterparts. The third section contemplates creation with generative AI models developed since the introduction of GANs, in 2014. Questioning the redistribution of agency in networks, it ultimately considers the recursive genealogy of the arts and media, as well as the conditions for a sensitive algorithmic culture between signal and data.

Dahmani, Sara. "Synthèse audiovisuelle de la parole expressive : modélisation des émotions par apprentissage profond." Electronic Thesis or Diss., Université de Lorraine, 2020. http://www.theses.fr/2020LORR0137.

Abstract:

The work of this thesis concerns the modeling of emotions for expressive audiovisual text-to-speech synthesis. Today, the results of text-to-speech synthesis systems are of good quality, however audiovisual synthesis remains an open issue and expressive synthesis is even less studied. As part of this thesis, we present an emotions modeling method which is malleable and flexible, and allows us to mix emotions as we mix shades on a palette of colors. In the first part, we present and study two expressive corpora that we have built. The recording strategy and the expressive content of these corpora are analyzed to validate their use for the purpose of audiovisual speech synthesis. In the second part, we present two neural architectures for speech synthesis. We used these two architectures to model three aspects of speech: 1) the duration of sounds, 2) the acoustic modality and 3) the visual modality. First, we use a fully connected architecture. This architecture allowed us to study the behavior of neural networks when dealing with different contextual and linguistic descriptors. We were also able to analyze, with objective measures, the network's ability to model emotions. The second neural architecture proposed is a variational auto-encoder. This architecture is able to learn a latent representation of emotions without using emotion labels. After analyzing the latent space of emotions, we presented a procedure for structuring it in order to move from a discrete representation of emotions to a continuous one. We were able to validate, through perceptual experiments, the ability of our system to generate emotions, nuances of emotions and mixtures of emotions, and this for expressive audiovisual text-to-speech synthesis.

Styles APA, Harvard, Vancouver, ISO, etc.

Majerová, Radka. "Lingvistika ve speciální pedagogice". Doctoral thesis, 2016. http://www.nusl.cz/ntk/nusl-353603.

Abstract:

Linguistics is presented as applied to addressing the difficulties of people with language handicaps in special-education settings. In the research and rehabilitation of language symptoms, it is called clinical linguistics. Clinical linguistics cooperates, in a multidisciplinary context, with psycholinguistics and neurolinguistics. The thesis outlines the need for clinical linguistics in the Czech context as well. It analyzes the diagnosis of developmental anarthria in lifelong non-speaking people with cerebral palsy. Developmental anarthria is a fairly common diagnosis in special education. Its description by clinical speech therapy proves insufficient, and the need for a clinical-linguistic analysis becomes apparent. A secondary impairment of language functions, secondary dysphasia, is identified in developmental anarthria. Secondary dysphasia in developmental anarthria manifests at all linguistic levels, and the thesis demonstrates this manifestation. Intelligent people with developmental anarthria find themselves acquiring their mother tongue late, and they can produce it only in written form. They have partial perceptual difficulties. The complete inability to produce speech causes them a subvocal inner-speech deficit. The potential of these people to acquire language through late acquisition is discussed in the context of international research on maturation and critical periods....

Books on the topic "Audiovisual synthesis"

Statistics on Selected Service Sectors in the EU: A Synthesis of Quantitative Results of Pilot Surveys on Audiovisual Services, Hotels and Travel Agencies and Transport. European Communities / Union (EUR-OP/OOPEC/OPOCE), 1997.

Pitozzi, Enrico. Body Soundscape. Edited by Yael Kaduri. Oxford University Press, 2016. http://dx.doi.org/10.1093/oxfordhb/9780199841547.013.43.

Abstract:

Starting from an interdisciplinary perspective of methodological integration of the concepts of body and sound in the contemporary dance scene, this chapter addresses the general aesthetic notion of the sonorous body. Through a survey of some key practices and pieces by Wayne McGregor, Ginette Laurin, Angelin Preljocaj, Cindy Van Acker and others, the author analyzes the audiovisual dimension of these works, developed with digital technologies and in a collaboration of choreographers with electronic musicians and sound artists such as Scanner, Kasper T. Toeplitz, Granular Synthesis, and Mika Vainio. This audiovisual tension, defined as the sonorous body, can be read through two interpretations. In the first, the sound is a body, which means the electronic sound of the scene is an acoustic material. In the second, the body is a sound, which means the body of the dancers produces the soundscape of a scene.

Book chapters on the topic "Audiovisual synthesis"

Aller, Sven, and Mark Fishel. "Adapting Audiovisual Speech Synthesis to Estonian". In Lecture Notes in Computer Science, 13–23. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-70566-3_2.

Sevillano, Xavier, Javier Melenchón, Germán Cobo, Joan Claudi Socoró and Francesc Alías. "Audiovisual Analysis and Synthesis for Multimodal Human-Computer Interfaces". In Engineering the User Interface, 1–16. London: Springer London, 2008. http://dx.doi.org/10.1007/978-1-84800-136-7_13.

Luerssen, Martin, Trent Lewis and David Powers. "Head X: Customizable Audiovisual Synthesis for a Multi-purpose Virtual Head". In AI 2010: Advances in Artificial Intelligence, 486–95. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-17432-2_49.

Almeida, Nuno, Diogo Cunha, Samuel Silva and António Teixeira. "Designing and Deploying an Interaction Modality for Articulatory-Based Audiovisual Speech Synthesis". In Speech and Computer, 36–49. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87802-3_4.

Meister, Einar, Sascha Fagel and Rainer Metsvahi. "Towards Audiovisual TTS in Estonian". In Frontiers in Artificial Intelligence and Applications. IOS Press, 2012. https://doi.org/10.3233/978-1-61499-133-5-138.

Abstract:

In the current paper we report our first results in the development of audiovisual speech synthesis for Estonian. The MASSY model, developed originally for German, serves as a prototype for the Estonian AV synthesis. First, we give an overview of the methods of AV speech synthesis and the Estonian viseme inventory, then we introduce the MASSY model and its adaptation for Estonian; finally, we discuss the ideas for further development.

Meister, Einar, Rainer Metsvahi and Sascha Fagel. "Evaluation of the Estonian Audiovisual Speech Synthesis". In Frontiers in Artificial Intelligence and Applications. IOS Press, 2014. https://doi.org/10.3233/978-1-61499-442-8-11.

Abstract:

In the current paper we report the first evaluation results of the Estonian virtual talking head. The testing scenario involved perceptual experiments with unimodal audio and bimodal audiovisual speech stimuli in five noise conditions. As expected, in the presence of noise, the scores of consonant recognition of audiovisual stimuli were always higher than the scores of audio stimuli. The average recognition error in audio-only presentation was reduced by 42%–65% when the virtual talking head was displayed along with audio.

Rojas Parra, Rosa Mariana, Ana María Salazar Montes and Lina María Rodríguez Granada. "Análisis de contenido audiovisual en el ejercicio físico de las personas adultas mayores. Revisión sistemática". In Psicología de la actividad física y el deporte. Formación y aplicación en Colombia, 142–68. Asociación Colombiana de Facultades de Psicología, 2023. http://dx.doi.org/10.61676/9786289532425.05.

Abstract:

This chapter is the result of a review study that analyzed the audiovisual content of videos published on digital platforms that promote physical exercise among older adults. The analysis was based on checking the suitability of the digital content against recommendations from various physical-activity guidelines for older adults. The four-phase method of search, appraisal, synthesis, analysis (SALSA) proposed by Codina (2017) was used. The results cover the analysis of 35 videos published between 2016 and 2021 on freely accessible platforms and channels such as YouTube, Facebook, Instagram and Google. Audiovisual content is an alternative for older adults to engage in physical activity because of its flexibility and accessibility. However, the videos reviewed have the minimum requirements for encouraging autonomous physical exercise at home. In addition, their lack of rigor and specificity could potentially put populations with specific health conditions at risk, since they do not guarantee strategies for measuring their effectiveness or impact beyond the number of views; interactive platforms that allow more personalized guidance are therefore needed.

Fernández-Martín, María José, Pilar Moreno-Crespo and Francisco Núñez-Román. "Cinema and Secondary Education in Spain". In Educational Innovation to Address Complex Societal Challenges, 103–20. IGI Global, 2024. http://dx.doi.org/10.4018/979-8-3693-3073-9.ch008.

Abstract:

Formative cinema is a motivating tool for pedagogical transformation that acknowledges the possibilities of image, sound, and video in the classroom, although there is neither a single form of instruction nor a single method of application. This research proposes the analysis of the most recent trends in the use of cinema as a didactic resource in Spanish secondary education. For this retrospective longitudinal descriptive study, a bibliographic review of the scientific production in the last 15 years (n=48) has been carried out, following the recommendations of the PRISMA 2020 protocol from a critical interpretative synthesis approach. Among the results, didactic proposals and compulsory secondary education stand out as the stage of greatest interest. The mission of teachers as a fundamental element in the transformation of cinema from an audiovisual product to a didactic product and the role of students as active spectators are highlighted.

"Sholay, Stereo Sound and the Auditory Spectacle". In Sound in Indian Film and Audiovisual Media. Amsterdam: Amsterdam University Press, 2023. http://dx.doi.org/10.5117/9789463724739_ch08.

Abstract:

Stereophonic mixing was introduced to Indian cinema much later than to Hollywood and had a relatively shorter lifespan. Following information on historical developments, critical analysis of sound in film and finding corresponding evidence in conversations with Indian sound practitioners, in the eighth chapter I demonstrate that stereophonic mixing, as an extension of magnetic recording and dubbing practices, rendered the cinematic imagination of this period as something spectacular: extravagant song-and-dance sequences shot in exotic locations and action scenes packed with studio-manipulated and synthetic sound effects which appealed to a mass audience.

"Popular Films from the Dubbing Era". In Sound in Indian Film and Audiovisual Media. Amsterdam: Amsterdam University Press, 2023. http://dx.doi.org/10.5117/9789463724739_ch06.

Abstract:

In the sixth chapter, I discuss the filmmaking period from roughly the 1960s to the late 1990s in India known as the 'dubbing era'. Many representative popular films from this era are examined to investigate how the asynchronous practices of sound in this period incorporated a technologically informed approach, using analogue sound processing with expressionistic and melodramatic overtones. Following critical analysis of sound from several films and drawing supporting evidence from the conversations with veteran practitioners, the chapter demonstrates how magnetic recording and dubbing rendered the sound of this period as something spectacular, with extravagant song-and-dance routines in foreign locations packed with studio-manipulated, synthetic effects. Add to this a deliberate lack of site-specific sounds, and this practice triggered a cinematic experience of emotive tension and affective stimulation.

Conference proceedings on the topic "Audiovisual synthesis"

Batty, Joshua, Kipps Horn and Stefan Greuter. "Audiovisual granular synthesis". In The 9th Australasian Conference. New York, New York, USA: ACM Press, 2013. http://dx.doi.org/10.1145/2513002.2513568.

Šimbelis, Vygandas 'Vegas', and Anders Lundström. "Synthesis in the Audiovisual". In CHI'16: CHI Conference on Human Factors in Computing Systems. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2851581.2889462.

Hussen Abdelaziz, Ahmed, Anushree Prasanna Kumar, Chloe Seivwright, Gabriele Fanelli, Justin Binder, Yannis Stylianou and Sachin Kajareker. "Audiovisual Speech Synthesis using Tacotron2". In ICMI '21: International Conference on Multimodal Interaction. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3462244.3479883.

Chen, Sihang, Junliang Chen and Xiaojuan Gu. "EDAVS: Emotion-Driven Audiovisual Synthesis Experience". In SIGGRAPH '24: Special Interest Group on Computer Graphics and Interactive Techniques Conference. New York, NY, USA: ACM, 2024. http://dx.doi.org/10.1145/3641234.3671080.

Silva, Samuel, and António Teixeira. "An Anthropomorphic Perspective for Audiovisual Speech Synthesis". In 10th International Conference on Bio-inspired Systems and Signal Processing. SCITEPRESS - Science and Technology Publications, 2017. http://dx.doi.org/10.5220/0006150201630172.

Bailly, Gérard. "Audiovisual speech synthesis. from ground truth to models". In 7th International Conference on Spoken Language Processing (ICSLP 2002). ISCA: ISCA, 2002. http://dx.doi.org/10.21437/icslp.2002-422.

Matthews, I. A. "Scale based features for audiovisual speech recognition". In IEE Colloquium on Integrated Audio-Visual Processing for Recognition, Synthesis and Communication. IEE, 1996. http://dx.doi.org/10.1049/ic:19961152.

Thangthai, Ausdang, Sumonmas Thatphithakkul, Kwanchiva Thangthai and Arnon Namsanit. "TSynC-3miti: Audiovisual Speech Synthesis Database from Found Data". In 2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA). IEEE, 2020. http://dx.doi.org/10.1109/o-cocosda50338.2020.9295001.

Mawass, Khaled, Pierre Badin and Gérard Bailly. "Synthesis of fricative consonants by audiovisual-to-articulatory inversion". In 5th European Conference on Speech Communication and Technology (Eurospeech 1997). ISCA: ISCA, 1997. http://dx.doi.org/10.21437/eurospeech.1997-386.

Fagel, Sascha, and Walter F. Sendlmeier. "An expandable web-based audiovisual text-to-speech synthesis system". In 8th European Conference on Speech Communication and Technology (Eurospeech 2003). ISCA: ISCA, 2003. http://dx.doi.org/10.21437/eurospeech.2003-673.
