Selected scientific literature on the topic "Active speaker detection"

Author: Grafiati

Published: 7 September 2024

Browse the list of current articles, books, theses, conference proceedings, and other scholarly sources relevant to the topic "Active speaker detection".

Journal articles on the topic "Active speaker detection"

1

Assunção, Gustavo, Nuno Gonçalves, and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection." Applied Sciences 11, no. 8 (2021): 3397. http://dx.doi.org/10.3390/app11083397.

Abstract:

Human beings have developed fantastic abilities to integrate information from various sensory sources exploring their inherent complementarity. Perceptual capabilities are therefore heightened, enabling, for instance, the well-known "cocktail party" and McGurk effects, i.e., speech disambiguation from a panoply of sound signals. This fusion ability is also key in refining the perception of sound source location, as in distinguishing whose voice is being heard in a group conversation. Furthermore, neuroscience has successfully identified the superior colliculus region in the brain as the one re

2

Pu, Jie, Yannis Panagakis, and Maja Pantic. "Active Speaker Detection and Localization in Videos Using Low-Rank and Kernelized Sparsity." IEEE Signal Processing Letters 27 (2020): 865–69. http://dx.doi.org/10.1109/lsp.2020.2996412.

3

Lindstrom, Fredric, Keni Ren, Kerstin Persson Waye, and Haibo Li. "A comparison of two active‐speaker‐detection methods suitable for usage in noise dosimeter measurements." Journal of the Acoustical Society of America 123, no. 5 (2008): 3527. http://dx.doi.org/10.1121/1.2934471.

4

Zhu, Ying-Xin, and Hao-Ran Jin. "Speaker Localization Based on Audio-Visual Bimodal Fusion." Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 3 (2021): 375–82. http://dx.doi.org/10.20965/jaciii.2021.p0375.

Abstract:

The demand for fluency in human–computer interaction is on an increase globally; thus, the active localization of the speaker by the machine has become a problem worth exploring. Considering that the stability and accuracy of the single-mode localization method are low, while the multi-mode localization method can utilize the redundancy of information to improve accuracy and anti-interference, a speaker localization method based on voice and image multimodal fusion is proposed. First, the voice localization method based on time differences of arrival (TDOA) in a microphone array and the face d

5

Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially Aware Language Acquisition." IEEE Transactions on Cognitive and Developmental Systems 12, no. 2 (2020): 250–59. http://dx.doi.org/10.1109/tcds.2019.2927941.

6

Dai, Hai, Kean Chen, Yang Wang, and Haoxin Yu. "Fault detection method of secondary sound source in ANC system based on impedance characteristics." Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 40, no. 6 (2022): 1242–49. http://dx.doi.org/10.1051/jnwpu/20224061242.

Abstract:

As an indispensable component in an active noise control system, the working states of the secondary sound sources affect directly noise reduction and the robustness of the system. Therefore, it is very crucial to detect the working states of the secondary sound sources in the process of active control in real time. In this study, a real-time fault detection method for secondary sound sources during the process of active control is presented, and the corresponding detection algorithm is numerically given and experimentally verified. By collecting the input voltage and output current of the spe

7

Ahmad, Zubair, Alquhayz, and Ditta. "Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model." Sensors 19, no. 23 (2019): 5163. http://dx.doi.org/10.3390/s19235163.

Abstract:

Speaker diarization systems aim to find ‘who spoke when?’ in multi-speaker recordings. The dataset usually consists of meetings, TV/talk shows, telephone and multi-party interaction recordings. In this paper, we propose a novel multimodal speaker diarization technique, which finds the active speaker through audio-visual synchronization model for diarization. A pre-trained audio-visual synchronization model is used to find the synchronization between a visible person and the respective audio. For that purpose, short video segments comprised of face-only regions are acquired using a face detecti

8

Wang, Shaolei, Zhongyuan Wang, Wanxiang Che, Sendong Zhao, and Ting Liu. "Combining Self-supervised Learning and Active Learning for Disfluency Detection." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (2022): 1–25. http://dx.doi.org/10.1145/3487290.

Abstract:

Spoken language is fundamentally different from the written language in that it contains frequent disfluencies or parts of an utterance that are corrected by the speaker. Disfluency detection (removing these disfluencies) is desirable to clean the input for use in downstream NLP tasks. Most existing approaches to disfluency detection heavily rely on human-annotated data, which is scarce and expensive to obtain in practice. To tackle the training data bottleneck, in this work, we investigate methods for combining self-supervised learning and active learning for disfluency detection. First, we c

9

Maltezou-Papastylianou, Constantina, Riccardo Russo, Denise Wallace, Chelsea Harmsworth, and Silke Paulmann. "Different stages of emotional prosody processing in healthy ageing–evidence from behavioural responses, ERPs, tDCS, and tRNS." PLOS ONE 17, no. 7 (2022): e0270934. http://dx.doi.org/10.1371/journal.pone.0270934.

Abstract:

Past research suggests that the ability to recognise the emotional intent of a speaker decreases as a function of age. Yet, few studies have looked at the underlying cause for this effect in a systematic way. This paper builds on the view that emotional prosody perception is a multi-stage process and explores which step of the recognition processing line is impaired in healthy ageing using time-sensitive event-related brain potentials (ERPs). Results suggest that early processes linked to salience detection as reflected in the P200 component and initial build-up of emotional representation as

10

Lahemer, Elfituri S. F., and Ahmad Rad. "An Audio-Based SLAM for Indoor Environments: A Robotic Mixed Reality Presentation." Sensors 24, no. 9 (2024): 2796. http://dx.doi.org/10.3390/s24092796.

Abstract:

In this paper, we present a novel approach referred to as the audio-based virtual landmark-based HoloSLAM. This innovative method leverages a single sound source and microphone arrays to estimate the voice-printed speaker’s direction. The system allows an autonomous robot equipped with a single microphone array to navigate within indoor environments, interact with specific sound sources, and simultaneously determine its own location while mapping the environment. The proposed method does not require multiple audio sources in the environment nor sensor fusion to extract pertinent information an
