Academic literature on the topic 'VoxCeleb2'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'VoxCeleb2.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "VoxCeleb2"

1

Seo, Soonshin, and Ji-Hwan Kim. "Self-Attentive Multi-Layer Aggregation with Feature Recalibration and Deep Length Normalization for Text-Independent Speaker Verification System." Electronics 9, no. 10 (2020): 1706. http://dx.doi.org/10.3390/electronics9101706.

Abstract:
One of the most important parts of a text-independent speaker verification system is speaker embedding generation. Previous studies demonstrated that shortcut connections-based multi-layer aggregation improves the representational power of a speaker embedding system. However, model parameters are relatively large in number, and unspecified variations increase in the multi-layer aggregation. Therefore, in this study, we propose a self-attentive multi-layer aggregation with feature recalibration and deep length normalization for a text-independent speaker verification system. To reduce the number …
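Two of the components named in the title can be illustrated compactly. The PyTorch sketch below applies squeeze-and-excitation-style feature recalibration to aggregated frame-level features and then length-normalizes the pooled embedding; the dimensions, the SE-style gating, and plain mean pooling are illustrative assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureRecalibration(nn.Module):
    """SE-style gating that re-weights the channels of aggregated
    frame-level features before pooling (illustrative only)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames) -- aggregated multi-layer features
        s = x.mean(dim=2)                                 # squeeze over time
        w = torch.sigmoid(self.fc2(F.relu(self.fc1(s))))  # excitation weights
        return x * w.unsqueeze(-1)                        # per-channel rescaling

def length_normalize(emb: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """L2-normalize embeddings onto the unit hypersphere."""
    return emb / (emb.norm(dim=-1, keepdim=True) + eps)

x = torch.randn(4, 512, 200)                              # toy frame-level input
emb = length_normalize(FeatureRecalibration(512)(x).mean(dim=2))
```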
2

Nagrani, Arsha, Joon Son Chung, Weidi Xie, and Andrew Zisserman. "Voxceleb: Large-scale speaker verification in the wild." Computer Speech & Language 60 (March 2020): 101027. http://dx.doi.org/10.1016/j.csl.2019.101027.

3

Badr, Ameer A., and Alia K. Abdul Hassan. "Age Estimation in Short Speech Utterances Based on Bidirectional Gated-Recurrent Neural Networks." Engineering and Technology Journal 39, no. 1B (2021): 129–40. http://dx.doi.org/10.30684/etj.v39i1b.1905.

Abstract:
Recently, age estimates from speech have received growing interest as they are important for many applications like custom call routing, targeted marketing, or user-profiling. In this work, an automatic system to estimate age in short speech utterances without depending on the text is proposed. From each utterance frame, four groups of features are extracted and then 10 statistical functionals are measured for each extracted dimension of the features, to be followed by dimensionality reduction using Linear Discriminant Analysis (LDA). Finally, bidirectional Gated-Recurrent Neural Network …
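The final stage of the described pipeline, a bidirectional gated-recurrent network regressing age from reduced feature sequences, might look roughly like the PyTorch sketch below; the input dimensionality, mean pooling, and regression head are assumptions, and the functional/LDA stages are presumed to have already produced the inputs.

```python
import torch
import torch.nn as nn

class BiGRUAgeEstimator(nn.Module):
    """Bidirectional GRU mapping a sequence of (LDA-reduced) feature
    vectors to a single age estimate (illustrative sketch)."""
    def __init__(self, input_dim: int = 40, hidden: int = 128):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden, batch_first=True,
                          bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)   # regression: predicted age

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, input_dim) reduced feature sequences
        out, _ = self.gru(x)
        return self.head(out.mean(dim=1)).squeeze(-1)

feats = torch.randn(8, 300, 40)                # 8 utterances, 300 frames each
ages = BiGRUAgeEstimator()(feats)              # shape: (8,)
```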
4

Lei, Lei, and Kun She. "Identity Vector Extraction by Perceptual Wavelet Packet Entropy and Convolutional Neural Network for Voice Authentication." Entropy 20, no. 8 (2018): 600. http://dx.doi.org/10.3390/e20080600.

Abstract:
Recently, the accuracy of voice authentication systems has increased significantly due to the successful application of the identity vector (i-vector) model. This paper proposes a new method for i-vector extraction. In the method, a perceptual wavelet packet transform (PWPT) is designed to convert speech utterances into wavelet entropy feature vectors, and a Convolutional Neural Network (CNN) is designed to estimate the frame posteriors of the wavelet entropy feature vectors. In the end, the i-vector is extracted based on those frame posteriors. The TIMIT and VoxCeleb speech corpora are used for experiments …
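The wavelet-entropy front end can be approximated with PyWavelets. The sketch below computes per-subband Shannon entropy for one frame using a standard wavelet packet transform ('db4', level 4) as a stand-in for the paper's perceptually designed PWPT, whose filter layout it does not reproduce.

```python
import numpy as np
import pywt

def wavelet_packet_entropy(frame: np.ndarray, wavelet: str = "db4",
                           level: int = 4) -> np.ndarray:
    """Shannon entropy of each wavelet-packet subband of one speech frame.

    A standard wavelet packet transform stands in for the paper's
    perceptual PWPT, which this sketch does not reproduce.
    """
    wp = pywt.WaveletPacket(data=frame, wavelet=wavelet, maxlevel=level)
    entropies = []
    for node in wp.get_level(level, order="freq"):
        energy = node.data ** 2
        p = energy / (energy.sum() + 1e-12)    # normalized energy distribution
        entropies.append(-(p * np.log(p + 1e-12)).sum())
    return np.array(entropies)                 # one entropy per subband

frame = np.random.randn(400)                   # e.g., a 25 ms frame at 16 kHz
features = wavelet_packet_entropy(frame)       # 2**4 = 16 entropy values
```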
5

Mo, Jianye, and Li Xu. "Weighted Cluster-Range Loss and Criticality-Enhancement Loss for Speaker Recognition." Applied Sciences 10, no. 24 (2020): 9004. http://dx.doi.org/10.3390/app10249004.

Abstract:
While traditional i-vector based methods are popular in the field of speaker recognition, deep learning has recently found more and more applications in end-to-end models due to its attractive performance. One effective practice is the integration of an attention mechanism into Convolutional Neural Networks (CNNs). In this work, a light-weight dual-path attention block is proposed by combining self-attention and the Convolutional Block Attention Module (CBAM), which helps to capture more multi-source features with negligible extra time expense. Additionally, a Weighted Cluster-Range Loss …
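One ingredient of the proposed dual-path block, CBAM's channel-attention branch, is sketched below in generic form; this is the standard CBAM recipe (average- and max-pooled descriptors through a shared MLP), not the paper's full dual-path module or its weighted cluster-range loss.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention as in CBAM: squeeze with both average and max
    pooling, share an MLP, and gate the channels (generic sketch)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width) feature map
        avg = self.mlp(x.mean(dim=(2, 3)))     # average-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))      # max-pooled descriptor
        gate = torch.sigmoid(avg + mx)
        return x * gate[:, :, None, None]      # re-weight channels

x = torch.randn(2, 64, 32, 32)
y = ChannelAttention(64)(x)                    # same shape, channels gated
```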
6

Zeng, Xianfang, Yusu Pan, Mengmeng Wang, Jiangning Zhang, and Yong Liu. "Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 12757–64. http://dx.doi.org/10.1609/aaai.v34i07.6970.

Abstract:
Recent works have shown how realistic talking face images can be obtained under the supervision of geometry guidance, e.g., facial landmark or boundary. To alleviate the demand for manual annotations, in this paper, we propose a novel self-supervised hybrid model (DAE-GAN) that learns how to reenact face naturally given large amounts of unlabeled videos. Our approach combines two deforming autoencoders with the latest advances in the conditional generation. On the one hand, we adopt the deforming autoencoder to disentangle identity and pose representations. A strong prior in talking face video …
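The core disentangling idea, re-pairing one clip's identity code with another's pose code, can be shown with a toy autoencoder; the linear encoder/decoder, 64x64 input size, and latent split below are purely illustrative and omit the deformation fields and adversarial training of the actual DAE-GAN.

```python
import torch
import torch.nn as nn

class DisentanglingAutoencoder(nn.Module):
    """Toy autoencoder whose latent code is split into identity and pose
    halves, so a face can be 'reenacted' by pairing one image's identity
    code with another's pose code (illustrative only)."""
    def __init__(self, latent: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(),
                                     nn.Linear(3 * 64 * 64, 2 * latent))
        self.decoder = nn.Linear(2 * latent, 3 * 64 * 64)
        self.latent = latent

    def encode(self, img: torch.Tensor):
        z = self.encoder(img)
        return z[:, :self.latent], z[:, self.latent:]   # identity, pose

    def decode(self, identity: torch.Tensor, pose: torch.Tensor):
        out = self.decoder(torch.cat([identity, pose], dim=1))
        return out.view(-1, 3, 64, 64)

ae = DisentanglingAutoencoder()
src, drv = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
id_src, _ = ae.encode(src)
_, pose_drv = ae.encode(drv)
reenacted = ae.decode(id_src, pose_drv)   # source identity, driver pose
```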
7

Zhong, Qinghua, Ruining Dai, Han Zhang, Yongsheng Zhu, and Guofu Zhou. "Text-independent speaker recognition based on adaptive course learning loss and deep residual network." EURASIP Journal on Advances in Signal Processing 2021, no. 1 (2021). http://dx.doi.org/10.1186/s13634-021-00762-2.

Abstract:
Text-independent speaker recognition is widely used in identity recognition and has a wide spectrum of applications, such as criminal investigation, payment certification, and interest-based customer services. In order to improve the recognition ability of log filter bank feature vectors, a method of text-independent speaker recognition based on a deep residual network model is proposed in this paper. The deep residual network is composed of a residual network (ResNet) and a convolutional attention statistics pooling (CASP) layer. The CASP layer can aggregate frame-level features …
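The pooling idea can be sketched as plain attentive statistics pooling, aggregating frame-level ResNet features into an attention-weighted mean and standard deviation; this omits whatever convolutional structure the paper's CASP layer adds, and the bottleneck width is an assumption.

```python
import torch
import torch.nn as nn

class AttentiveStatsPooling(nn.Module):
    """Aggregate frame-level features into a weighted mean and standard
    deviation; a generic attentive-statistics stand-in for the paper's
    convolutional attention statistics pooling (CASP) layer."""
    def __init__(self, channels: int, bottleneck: int = 128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Conv1d(channels, bottleneck, kernel_size=1),
            nn.Tanh(),
            nn.Conv1d(bottleneck, channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames) from the ResNet trunk
        w = torch.softmax(self.attention(x), dim=2)  # per-frame weights
        mean = (x * w).sum(dim=2)
        var = (x * x * w).sum(dim=2) - mean ** 2
        std = torch.sqrt(var.clamp(min=1e-8))
        return torch.cat([mean, std], dim=1)         # (batch, 2 * channels)

x = torch.randn(4, 256, 120)
utt = AttentiveStatsPooling(256)(x)                  # segment-level vector
```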
8

Liu, Yi, Liang He, Jia Liu, and Michael T. Johnson. "Introducing phonetic information to speaker embedding for speaker verification." EURASIP Journal on Audio, Speech, and Music Processing 2019, no. 1 (2019). http://dx.doi.org/10.1186/s13636-019-0166-8.

Abstract:
Phonetic information is one of the most essential components of a speech signal, playing an important role in many speech processing tasks. However, it is difficult to integrate phonetic information into speaker verification systems since it occurs primarily at the frame level while speaker characteristics typically reside at the segment level. In deep neural network-based speaker verification, existing methods only apply phonetic information to the frame-wise trained speaker embeddings. To address this weakness, this paper proposes phonetic adaptation and hybrid multi-task learning …
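A generic hybrid multi-task setup, a shared encoder with a segment-level speaker head and a frame-level phone head trained jointly, is sketched below; the LSTM encoder, head shapes, and 0.3 auxiliary weight are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class MultiTaskSpeakerNet(nn.Module):
    """Shared encoder with a segment-level speaker branch and a
    frame-level phone-classification branch (illustrative sketch)."""
    def __init__(self, feat_dim=40, hidden=256, n_speakers=1000, n_phones=40):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.phone_head = nn.Linear(hidden, n_phones)      # per frame
        self.speaker_head = nn.Linear(hidden, n_speakers)  # per segment

    def forward(self, x):
        # x: (batch, frames, feat_dim) acoustic features
        h, _ = self.encoder(x)
        return self.speaker_head(h.mean(dim=1)), self.phone_head(h)

model = MultiTaskSpeakerNet()
x = torch.randn(2, 150, 40)
spk_logits, phn_logits = model(x)
ce = nn.CrossEntropyLoss()
# joint objective: speaker loss plus a weighted phonetic auxiliary loss
loss = (ce(spk_logits, torch.randint(0, 1000, (2,)))
        + 0.3 * ce(phn_logits.transpose(1, 2), torch.randint(0, 40, (2, 150))))
```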

Dissertations / Theses on the topic "VoxCeleb2"

1

Lukáč, Peter. "Verifikace osob podle hlasu bez extrakce příznaků." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2021. http://www.nusl.cz/ntk/nusl-445531.

Abstract:
Speaker verification is a field that is constantly being modernized and improved, striving to meet the demands placed on it in application areas such as authorization systems, forensic analysis, etc. Improvements come from advances in deep learning, the creation of new training and test datasets, and various speaker verification challenges and workshops. In this work, we examine models for speaker verification without feature extraction. Using raw audio tracks as model inputs simplifies input processing, which lowers computational and memory requirements and reduces the number of hyperparameters …
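In the spirit of the thesis, a feature-extraction-free model consumes raw samples directly through learned 1-D convolutions; the sketch below is a generic waveform encoder with made-up kernel sizes and strides, not the specific models the thesis evaluates.

```python
import torch
import torch.nn as nn

class RawWaveformEncoder(nn.Module):
    """1-D convolutional frontend that consumes raw audio samples directly,
    replacing hand-crafted feature extraction (generic sketch)."""
    def __init__(self, embed_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=251, stride=16), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv1d(128, embed_dim, kernel_size=5, stride=2), nn.ReLU(),
        )

    def forward(self, wave: torch.Tensor) -> torch.Tensor:
        # wave: (batch, samples) raw audio, e.g. sampled at 16 kHz
        h = self.conv(wave.unsqueeze(1))       # (batch, embed_dim, frames)
        return h.mean(dim=2)                   # utterance-level embedding

emb = RawWaveformEncoder()(torch.randn(2, 32000))  # two 2-second clips
```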

Conference papers on the topic "VoxCeleb2"

1

Chung, Joon Son, Arsha Nagrani, and Andrew Zisserman. "VoxCeleb2: Deep Speaker Recognition." In Interspeech 2018. ISCA, 2018. http://dx.doi.org/10.21437/interspeech.2018-1929.

2

Chung, Joon Son, Jaesung Huh, and Seongkyu Mun. "Delving into VoxCeleb: Environment Invariant Speaker Recognition." In Odyssey 2020 The Speaker and Language Recognition Workshop. ISCA, 2020. http://dx.doi.org/10.21437/odyssey.2020-49.

3

Nagrani, Arsha, Joon Son Chung, and Andrew Zisserman. "VoxCeleb: A Large-Scale Speaker Identification Dataset." In Interspeech 2017. ISCA, 2017. http://dx.doi.org/10.21437/interspeech.2017-950.

4

Chen, Zhengyang, Shuai Wang, and Yanmin Qian. "Multi-Modality Matters: A Performance Leap on VoxCeleb." In Interspeech 2020. ISCA, 2020. http://dx.doi.org/10.21437/interspeech.2020-2229.

5

Xiao, Xiong, Naoyuki Kanda, Zhuo Chen, et al. "Microsoft Speaker Diarization System for the Voxceleb Speaker Recognition Challenge 2020." In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413832.

6

Hautamäki, Rosa González, and Tomi Kinnunen. "Why Did the x-Vector System Miss a Target Speaker? Impact of Acoustic Mismatch Upon Target Score on VoxCeleb Data." In Interspeech 2020. ISCA, 2020. http://dx.doi.org/10.21437/interspeech.2020-2715.
