
Journal articles on the topic 'Vocoders'



Consult the top 50 journal articles for your research on the topic 'Vocoders.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.

1

Karoui, Chadlia, Chris James, Pascal Barone, David Bakhos, Mathieu Marx, and Olivier Macherey. "Searching for the Sound of a Cochlear Implant: Evaluation of Different Vocoder Parameters by Cochlear Implant Users With Single-Sided Deafness." Trends in Hearing 23 (January 2019): 233121651986602. http://dx.doi.org/10.1177/2331216519866029.

Abstract:
Cochlear implantation in subjects with single-sided deafness (SSD) offers a unique opportunity to directly compare the percepts evoked by a cochlear implant (CI) with those evoked acoustically. Here, nine SSD-CI users performed a forced-choice task evaluating the similarity of speech processed by their CI with speech processed by several vocoders presented to their healthy ear. In each trial, subjects heard two intervals: their CI followed by a certain vocoder in Interval 1 and their CI followed by a different vocoder in Interval 2. The vocoders differed either (i) in carrier type (sinusoidal [SINE], band-filtered noise [NOISE], or pulse-spreading harmonic complex [PSHC]) or (ii) in the frequency mismatch between the analysis and synthesis frequency ranges (no mismatch, and two frequency-mismatched conditions of 2 and 4 equivalent rectangular bandwidths [ERBs]). Subjects had to state in which of the two intervals the CI and vocoder sounds were more similar. Despite a large intersubject variability, the PSHC vocoder was judged significantly more similar to the CI than the SINE or NOISE vocoders. Furthermore, the no-mismatch and 2-ERB mismatch vocoders were judged significantly more similar to the CI than the 4-ERB mismatch vocoder. The mismatch data were also interpreted by comparing spiral ganglion characteristic frequencies with electrode contact positions determined from postoperative computed tomography scans. Only one subject demonstrated a pattern of preference consistent with adaptation to the CI sound processor frequency-to-electrode allocation table, and two subjects showed possible partial adaptation. Those subjects with adaptation patterns presented overall small and consistent frequency mismatches across their electrode arrays.
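The carrier manipulation described above can be illustrated with a minimal channel-vocoder sketch. This is a generic textbook illustration, not the authors' implementation: the band edges, FIR filter design, and 50 Hz envelope cutoff are all illustrative assumptions.

```python
import numpy as np

def fir_lowpass(x, fc, fs, numtaps=257):
    """Windowed-sinc low-pass FIR filter (keeps the sketch dependency-free)."""
    n = np.arange(numtaps) - (numtaps - 1) / 2
    h = (2 * fc / fs) * np.sinc(2 * fc / fs * n) * np.hamming(numtaps)
    return np.convolve(x, h, mode="same")

def fir_bandpass(x, lo, hi, fs):
    """Band-pass as the difference of two low-pass responses."""
    return fir_lowpass(x, hi, fs) - fir_lowpass(x, lo, fs)

def vocode(x, fs, edges, carrier="sine", seed=0):
    """Channel vocoder: split into bands, extract envelopes, re-modulate a carrier."""
    rng = np.random.default_rng(seed)
    t = np.arange(len(x)) / fs
    y = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = fir_bandpass(x, lo, hi, fs)
        env = fir_lowpass(np.abs(band), 50.0, fs)       # envelope: rectify + low-pass
        if carrier == "sine":                           # SINE: tone at band centre
            c = np.sin(2 * np.pi * np.sqrt(lo * hi) * t)
        else:                                           # NOISE: band-limited noise
            c = fir_bandpass(rng.standard_normal(len(x)), lo, hi, fs)
        y += env * c
    return y
```

A frequency mismatch such as the 2- and 4-ERB conditions would correspond to shifting the synthesis band edges relative to the analysis edges.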
2

Roebel, Axel, and Frederik Bous. "Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet." Information 13, no. 3 (2022): 103. http://dx.doi.org/10.3390/info13030103.

Abstract:
The use of the mel spectrogram as a signal parameterization for voice generation is quite recent and linked to the development of neural vocoders. These are deep neural networks that allow reconstructing high-quality speech from a given mel spectrogram. While initially developed for speech synthesis, now neural vocoders have also been studied in the context of voice attribute manipulation, opening new means for voice processing in audio production. However, to be able to apply neural vocoders in real-world applications, two problems need to be addressed: (1) To support use in professional audio workstations, the computational complexity should be small, (2) the vocoder needs to support a large variety of speakers, differences in voice qualities, and a wide range of intensities potentially encountered during audio production. In this context, the present study will provide a detailed description of the Multi-band Excited WaveNet, a fully convolutional neural vocoder built around signal processing blocks. It will evaluate the performance of the vocoder when trained on a variety of multi-speaker and multi-singer databases, including an experimental evaluation of the neural vocoder trained on speech and singing voices. Addressing the problem of intensity variation, the study will introduce a new adaptive signal normalization scheme that allows for robust compensation for dynamic and static gain variations. Evaluations are performed using objective measures and a number of perceptual tests including different neural vocoder algorithms known from the literature. The results confirm that the proposed vocoder compares favorably to the state-of-the-art in its capacity to generalize to unseen voices and voice qualities. The remaining challenges will be discussed.
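Since the mel spectrogram is the signal parameterization at issue here, its core ingredient can be sketched as a bare-bones triangular mel filterbank. The 2595/700 constants are the common HTK-style mel formula, an assumption of this sketch rather than anything taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, fs):
    """Triangular mel filters mapping an |FFT| spectrum to one mel-spectrogram row."""
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / fs).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        for k in range(l, c):                # rising slope of triangle i
            fb[i, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):                # falling slope of triangle i
            fb[i, k] = (r - k) / max(r - c, 1)
    return fb
```

Multiplying this matrix with a magnitude spectrogram (frames as columns) yields the mel spectrogram a neural vocoder would invert back to a waveform.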
3

Harding, Eleanor E., Etienne Gaudrain, Imke J. Hrycyk, et al. "Musical Emotion Categorization with Vocoders of Varying Temporal and Spectral Content." Trends in Hearing 27 (January 2023): 233121652211411. http://dx.doi.org/10.1177/23312165221141142.

Abstract:
While previous research investigating music emotion perception of cochlear implant (CI) users observed that temporal cues informing tempo largely convey emotional arousal (relaxing/stimulating), it remains unclear how other properties of the temporal content may contribute to the transmission of arousal features. Moreover, while detailed spectral information related to pitch and harmony in music — often not well perceived by CI users— reportedly conveys emotional valence (positive, negative), it remains unclear how the quality of spectral content contributes to valence perception. Therefore, the current study used vocoders to vary temporal and spectral content of music and tested music emotion categorization (joy, fear, serenity, sadness) in 23 normal-hearing participants. Vocoders were varied with two carriers (sinewave or noise; primarily modulating temporal information), and two filter orders (low or high; primarily modulating spectral information). Results indicated that emotion categorization was above-chance in vocoded excerpts but poorer than in a non-vocoded control condition. Among vocoded conditions, better temporal content (sinewave carriers) improved emotion categorization with a large effect while better spectral content (high filter order) improved it with a small effect. Arousal features were comparably transmitted in non-vocoded and vocoded conditions, indicating that lower temporal content successfully conveyed emotional arousal. Valence feature transmission steeply declined in vocoded conditions, revealing that valence perception was difficult for both lower and higher spectral content. The reliance on arousal information for emotion categorization of vocoded music suggests that efforts to refine temporal cues in the CI user signal may immediately benefit their music emotion perception.
4

Bernstein, Lynne E., Marilyn E. Demorest, Michael P. O'Connell, and David C. Coulter. "Lipreading with vibrotactile vocoders." Journal of the Acoustical Society of America 87, S1 (1990): S124–S125. http://dx.doi.org/10.1121/1.2027907.

5

Mohammed, Zinah J., and Abdulkareem A. Kadhim. "A Comparative Study of Speech Coding Techniques for Electro Larynx Speech Production." Iraqi Journal of Information and Communication Technology 5, no. 1 (2022): 31–41. http://dx.doi.org/10.31987/ijict.5.1.185.

Abstract:
Speech coding is a means of obtaining a compact representation of speech signals for efficient storage and efficient transmission over band-limited wired or wireless channels. This is usually achieved with an acceptable representation and the least number of bits, without degrading the perceptual quality. A number of speech coding methods have already been developed, and various algorithms are used for speech analysis and synthesis. This paper compares selected coding methods for speech signals produced by an Electro Larynx (EL) device, a device used by cancer patients whose laryngeal vocal cords have been removed. The methods considered are Residual-Excited Linear Prediction (RELP), Code Excited Linear Prediction (CELP), Algebraic Code Excited Linear Prediction (ACELP), Phase Vocoders based on the Wavelet Transform (PVWT), Channel Vocoders based on the Wavelet Transform (CVWT), and a Phase Vocoder based on the Dual-Tree Rational-Dilation Complex Wavelet Transform (PVDT-RADWT). The aim here is to select the best coding approach based on the quality of the reproduced speech. The signals used in the tests are speech recorded either directly from normal speakers or produced by an EL device. The performance of each method is evaluated using both objective measures and subjective listening tests. The results indicate that the PVWT and ACELP coders perform better than the other methods, achieving about 40 dB SNR and a PESQ score of 3 for EL speech, and 75 dB SNR with a PESQ score of 3.5 for normal speech, respectively.
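Of the two objective measures quoted, PESQ requires the standardized ITU-T P.862 reference implementation, but the SNR figure can be reproduced from a one-line definition. This sketch computes global (not segmental) SNR, which is an assumption since the paper does not specify the variant used.

```python
import numpy as np

def snr_db(reference, decoded):
    """Global SNR in dB between the original and the decoded/reconstructed speech."""
    reference = np.asarray(reference, dtype=float)
    noise = reference - np.asarray(decoded, dtype=float)
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))
```

For example, a reconstruction whose error power is 1/100 of the signal power scores exactly 20 dB.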
6

Ausili, Sebastian A., Bradford Backus, Martijn J. H. Agterberg, A. John van Opstal, and Marc M. van Wanrooij. "Sound Localization in Real-Time Vocoded Cochlear-Implant Simulations With Normal-Hearing Listeners." Trends in Hearing 23 (January 2019): 233121651984733. http://dx.doi.org/10.1177/2331216519847332.

Abstract:
Bilateral cochlear-implant (CI) users and single-sided deaf listeners with a CI are less effective at localizing sounds than normal-hearing (NH) listeners. This performance gap is due to the degradation of binaural and monaural sound localization cues, caused by a combination of device-related and patient-related issues. In this study, we targeted the device-related issues by measuring sound localization performance of 11 NH listeners, listening to free-field stimuli processed by a real-time CI vocoder. The use of a real-time vocoder is a new approach, which enables testing in a free-field environment. For the NH listening condition, all listeners accurately and precisely localized sounds according to a linear stimulus–response relationship with an optimal gain and a minimal bias both in the azimuth and in the elevation directions. In contrast, when listening with bilateral real-time vocoders, listeners tended to orient either to the left or to the right in azimuth and were unable to determine sound source elevation. When listening with an NH ear and a unilateral vocoder, localization was impoverished on the vocoder side but improved toward the NH side. Localization performance was also reflected by systematic variations in reaction times across listening conditions. We conclude that perturbation of interaural temporal cues, reduction of interaural level cues, and removal of spectral pinna cues by the vocoder impairs sound localization. Listeners seem to ignore cues that were made unreliable by the vocoder, leading to acute reweighting of available localization cues. We discuss how current CI processors prevent CI users from localizing sounds in everyday environments.
7

Ozdamar, Ozcan, Rebecca E. Eilers, and D. Kimbrough Oller. "Tactile Vocoders for the Deaf." IEEE Engineering in Medicine and Biology Magazine 6, no. 3 (1987): 37–42. http://dx.doi.org/10.1109/memb.1987.5006436.

8

Van Rensburg, T. Janse, M. A. van Wyk, A. T. Potgieter, and W. H. Steeb. "Phase Vocoder Technology for the Simulation of Engine Sound." International Journal of Modern Physics C 17, no. 05 (2006): 721–31. http://dx.doi.org/10.1142/s0129183106009333.

Abstract:
For a driving simulator that is intended to be an exact replica of a certain vehicle, an accurate sound model is extremely important. Most games select between three or more prerecorded engine sounds, depending on the engine speed. Other methods use linear interpolation between engine sounds for a more accurate approximation, but this is still not ideal. By using vocoders, a technique used for the manipulation of voice, a much higher level of accuracy and realism can be obtained. This article proposes the use of vocoders for the modeling of engine sound in driving simulation and computer driving games.
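The phase-vocoder technique the authors borrow from voice processing can be sketched in its classic STFT time-stretch form. This is a generic textbook sketch, not the paper's engine-sound system; the window, hop size, and resynthesis details are illustrative assumptions.

```python
import numpy as np

def stft(x, win, hop):
    """Windowed short-time Fourier transform, one rfft per frame."""
    n = len(win)
    frames = [x[i:i + n] * win for i in range(0, len(x) - n, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

def phase_vocoder(x, rate, n=1024, hop=256):
    """Time-stretch x by a factor of 1/rate while preserving pitch."""
    win = np.hanning(n)
    S = stft(x, win, hop)
    steps = np.arange(0, S.shape[0] - 1, rate)            # fractional frame positions
    omega = 2 * np.pi * np.arange(n // 2 + 1) * hop / n   # expected phase advance/hop
    phase = np.angle(S[0]).astype(float)
    out = np.zeros(len(steps) * hop + n)
    for k, s in enumerate(steps):
        i = int(s)
        mag = np.abs(S[i])
        # Instantaneous phase increment: remove the expected advance, re-wrap.
        dphi = np.angle(S[i + 1]) - np.angle(S[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        frame = np.fft.irfft(mag * np.exp(1j * phase))
        out[k * hop : k * hop + n] += frame * win         # overlap-add resynthesis
        phase += omega + dphi                             # accumulate synthesis phase
    return out
```

With `rate=0.5` the output lasts roughly twice as long as the input at the same pitch; an engine-sound application would instead manipulate pitch and timbre between recorded samples.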
9

Fodor, Ádám, László Kopácsi, Zoltán Ádám Milacski, and András Lőrincz. "Speech De-identification with Deep Neural Networks." Acta Cybernetica 25, no. 2 (2021): 257–69. http://dx.doi.org/10.14232/actacyb.288282.

Abstract:
Cloud-based speech services are powerful practical tools but the privacy of the speakers raises important legal concerns when exposed to the Internet. We propose a deep neural network solution that removes personal characteristics from human speech by converting it to the voice of a Text-to-Speech (TTS) system before sending the utterance to the cloud. The network learns to transcode sequences of vocoder parameters, delta and delta-delta features of human speech to those of the TTS engine. We evaluated several TTS systems, vocoders and audio alignment techniques. We measured the performance of our method by (i) comparing the result of speech recognition on the de-identified utterances with the original texts, (ii) computing the Mel-Cepstral Distortion of the aligned TTS and the transcoded sequences, and (iii) questioning human participants in A-not-B, 2AFC and 6AFC tasks. Our approach achieves the level required by diverse applications.
10

Chang, W. W., and D. Y. Wang. "Quality enhancement of sinusoidal transform vocoders." IEE Proceedings - Vision, Image, and Signal Processing 145, no. 6 (1998): 379. http://dx.doi.org/10.1049/ip-vis:19982456.

11

Yip, William C., and David L. Barron. "Efficient codebook search for CELP vocoders." Journal of the Acoustical Society of America 95, no. 5 (1994): 2794. http://dx.doi.org/10.1121/1.409796.

12

Dickinson, Kay. "‘Believe’? Vocoders, digitalised female identity and camp." Popular Music 20, no. 3 (2001): 333–47. http://dx.doi.org/10.1017/s0261143001001532.

Abstract:
In the two or so years since Cher's ‘Believe’ rather unexpectedly became the number one selling British single of 1998, the vocoder effect – which arguably snagged the track such widespread popularity – grew into one of the safest, maybe laziest, means of guaranteeing chart success. Since then, vocoder-wielding tracks such as Eiffel 65's ‘Blue (Da Ba Dee)’ and Sonique's ‘It Feels So Good’ have held fast at the slippery British number one spot for longer than the now-standard one week, despite their artists' relative obscurity. Even chart mainstays such as Madonna (‘Music’), Victoria Beckham (with the help of True Steppers and Dane Bowers) (‘Out of Your Mind’), Steps (‘Summer of Love’) and Kylie Minogue (the back-ups in ‘Spinning Around’) turned to this strange, automated-sounding gimmick which also proved to be a favourite with the poppier UK garage outfits (you can hear it on hits such as Lonyo/Comme Ci Comme Ca's ‘Summer of Love’, for example).
13

Kazemi, Reza, Fernando Perez-Gonzalez, Mohammad Ali Akhaee, and Fereydoon Behnia. "Data Hiding Robust to Mobile Communication Vocoders." IEEE Transactions on Multimedia 18, no. 12 (2016): 2345–57. http://dx.doi.org/10.1109/tmm.2016.2599149.

14

Yip, William C., and David L. Barron. "Reduced codebook search arrangement for CELP vocoders." Journal of the Acoustical Society of America 95, no. 4 (1994): 2302. http://dx.doi.org/10.1121/1.408590.

15

Lin, Rong‐San, Jar‐Ferr Yang, and David Ho. "Successive bit‐vector search algorithm for celp vocoders." Journal of the Chinese Institute of Engineers 26, no. 3 (2003): 261–70. http://dx.doi.org/10.1080/02533839.2003.9670778.

16

Cabrera, Laurianne, Christian Lorenzi, and Josiane Bertoncini. "Infants Discriminate Voicing and Place of Articulation With Reduced Spectral and Temporal Modulation Cues." Journal of Speech, Language, and Hearing Research 58, no. 3 (2015): 1033–42. http://dx.doi.org/10.1044/2015_jslhr-h-14-0121.

Abstract:
Purpose: This study assessed the role of spectro-temporal modulation cues in the discrimination of 2 phonetic contrasts (voicing and place) for young infants. Method: A visual-habituation procedure was used to assess the ability of French-learning 6-month-old infants with normal hearing to discriminate voiced versus unvoiced (/aba/-/apa/) and labial versus dental (/aba/-/ada/) stop consonants. The stimuli were processed by tone-excited vocoders to degrade frequency-modulation cues while preserving: (a) amplitude-modulation (AM) cues within 32 analysis frequency bands, (b) slow AM cues only (<16 Hz) within 32 bands, and (c) AM cues within 8 bands. Results: Infants exhibited discrimination responses for both phonetic contrasts in each processing condition. However, when fast AM cues were degraded, infants required a longer exposure to vocoded stimuli to reach the habituation criterion. Conclusions: Altogether, these results indicate that the processing of modulation cues conveying phonetic information on voicing and place is “functional” at 6 months. The data also suggest that the perceptual weight of fast AM speech cues may change during development.
17

Kitroser, I., E. Chai, and Y. Ben-Shimol. "Efficient mapping of multiple VoIP vocoders in WiMAX systems." Wireless Communications and Mobile Computing 11, no. 6 (2009): 667–78. http://dx.doi.org/10.1002/wcm.841.

18

Matsubara, Keisuke, Takuma Okamoto, Ryoichi Takashima, Tetsuya Takiguchi, Tomoki Toda, and Hisashi Kawai. "Comparison of real-time multi-speaker neural vocoders on CPUs." Acoustical Science and Technology 43, no. 2 (2022): 121–24. http://dx.doi.org/10.1250/ast.43.121.

19

Vergara, Kathleen, Lynn Miskiel, D. Oller, and Rebecca Eilers. "Training Children to Use Tactual Vocoders in a Model Program." Seminars in Hearing 16, no. 04 (1995): 404–14. http://dx.doi.org/10.1055/s-0028-1083736.

20

Mathew, Lani Rachel, Ancy S. Anselam, and Sakuntala S. Pillai. "Performance Comparison of Linear Prediction based Vocoders in Linux Platform." International Journal of Engineering Trends and Technology 10, no. 11 (2014): 554–58. http://dx.doi.org/10.14445/22315381/ijett-v10p310.

21

Effer, Elizabeth A., and Stephen A. Zahorian. "An investigation of mixed‐source excitation models for linear predictive vocoders." Journal of the Acoustical Society of America 79, S1 (1986): S26. http://dx.doi.org/10.1121/1.2023136.

22

Eroğul, Osman, Hakki Gökhan İlk, and Özlem İlk. "A flexible bit rate switching method for low bit rate vocoders." Signal Processing 81, no. 8 (2001): 1737–42. http://dx.doi.org/10.1016/s0165-1684(01)00072-x.

23

Exter, Mats. "Anwendungen eines CI-Vocoders in der Logopädie – ein Blick in die Zukunft." Sprache · Stimme · Gehör 46, no. 01 (2022): 28–32. http://dx.doi.org/10.1055/a-1706-9004.

24

Wess, Jessica M., Douglas S. Brungart, and Joshua G. W. Bernstein. "The Effect of Interaural Mismatches on Contralateral Unmasking With Single-Sided Vocoders." Ear and Hearing 38, no. 3 (2017): 374–86. http://dx.doi.org/10.1097/aud.0000000000000374.

25

Iverson, Paul, Bronwen G. Evans, and Charlotte A. Smith. "Vowel formant movement and duration perceived through noise vocoders and cochlear implants." Journal of the Acoustical Society of America 115, no. 5 (2004): 2425–26. http://dx.doi.org/10.1121/1.4781379.

26

Morise, Masanori, and Yusuke Watanabe. "Sound quality comparison among high-quality vocoders by using re-synthesized speech." Acoustical Science and Technology 39, no. 3 (2018): 263–65. http://dx.doi.org/10.1250/ast.39.263.

27

Matsubara, Keisuke, Takuma Okamoto, Ryoichi Takashima, et al. "Investigation of training data size for real-time neural vocoders on CPUs." Acoustical Science and Technology 42, no. 1 (2021): 65–68. http://dx.doi.org/10.1250/ast.42.65.

28

Tsakalos, Nick, and Evangelos Zigouris. "Autocorrelation-based pitch determination algorithms for realtime vocoders with the TMS32020/C25." Microprocessors and Microsystems 14, no. 8 (1990): 511–16. http://dx.doi.org/10.1016/0141-9331(90)90050-6.

29

Korobeynikov, A. V., M. A. Boyarshinov, A. I. Nistyuk, and V. N. Emelianov. "Using the Standard P.862 to Compare the Quality of Low-Bitrate Vocoders." Intellekt. Sist. Proizv. 16, no. 4 (2019): 109. http://dx.doi.org/10.22213/2410-9304-2018-4-109-113.

Abstract:
Objective methods for evaluating speech signal quality are considered: (1) Perceptual Evaluation of Speech Quality (PESQ, ITU-T Recommendation P.862), an assessment of perceived speech quality, and (2) Listening Quality Objective (LQO, ITU-T Recommendation P.800.1), a listening-quality measure. A brief description of the PESQ methodology and its workflow is given, together with the formulas for converting Raw MOS scores to MOS-LQO and back. Low-bit-rate vocoders were selected for testing: (1) MELPe, (2) Speex, (3) Codec2. The vocoders were tested at bit rates from 700 to 4800 bit/s, using 20 audio recordings of articulation tables (WAV, 8000 Hz, 16 bit, mono). Tables and plots of the Raw MOS and MOS-LQO scores were produced for the selected vocoders. Analysis of the experimental results confirms the effectiveness of objective speech quality assessment methods, and MELPe was identified as the most promising vocoder for further development, providing MOS quality scores of 2.9–3.2 and 3.0–3.3 at bit rates of 1200 and 2400 bit/s, respectively. The Speex vocoder showed scores comparable to MELPe at a higher bit rate (4800 bit/s), while the Codec2 vocoder scored lower than MELPe.
30

Bernstein, Lynne E., Marilyn E. Demorest, David C. Coulter, and Michael P. O’Connell. "Lipreading sentences with vibrotactile vocoders: Performance of normal‐hearing and hearing‐impaired subjects." Journal of the Acoustical Society of America 90, no. 6 (1991): 2971–84. http://dx.doi.org/10.1121/1.401771.

31

Qin, Michael, and Andrew Oxenham. "Fundamental frequency discriminability and utility in normal‐hearing listeners using noise‐excited vocoders." Journal of the Acoustical Society of America 115, no. 5 (2004): 2390. http://dx.doi.org/10.1121/1.4780508.

32

Bernstein, Lynne E., Marilyn E. Demorest, David C. Coulter, and Michael P. O'Connell. "Lipreading sentences with vibrotactile vocoders: Performance of normal‐hearing and profoundly deaf subjects." Journal of the Acoustical Society of America 89, no. 4B (1991): 1958. http://dx.doi.org/10.1121/1.2029673.

33

Tsakalos, Nick, and Evangelos Zigouris. "Use of a single chip fixed-point DSP for multiple speech channel vocoders." Microprocessors and Microsystems 18, no. 1 (1994): 12–18. http://dx.doi.org/10.1016/0141-9331(94)90016-7.

34

Eilers, Rebecca E., Alan B. Cobo-Lewis, Kathleen C. Vergara, D. Kimbrough Oller, and Karen E. Friedman. "A Longitudinal Evaluation of the Speech Perception Capabilities of Children Using Multichannel Tactile Vocoders." Journal of Speech, Language, and Hearing Research 39, no. 3 (1996): 518–33. http://dx.doi.org/10.1044/jshr.3903.518.

35

Exenberger, Anna, and Paul Iverson. "Electrophysiological measures of listening effort and comprehension: Speech in noise, vocoders, and competing talkers." Journal of the Acoustical Society of America 143, no. 3 (2018): 1921. http://dx.doi.org/10.1121/1.5036266.

36

Iverson, Paul, Charlotte A. Smith, and Bronwen G. Evans. "Vowel recognition via cochlear implants and noise vocoders: Effects of formant movement and duration." Journal of the Acoustical Society of America 120, no. 6 (2006): 3998–4006. http://dx.doi.org/10.1121/1.2372453.

37

Padovani, José. "Pandemics, Delays, and Pure Data: on ‘afterlives’ (2020), for Flute and Live Electronics and Visuals." Revista Vórtex 9, no. 2 (2021): 1–14. http://dx.doi.org/10.33871/23179937.2021.9.2.17.

Abstract:
The essay addresses creative and technical aspects of the piece ‘afterlives’ (2020), for flute and live electronics and visuals. Composed and premiered in the context of the COVID-19 pandemic, the composition employs processes based on different audiovisual techniques: phase vocoders, buffer-based granulations, Ambisonics spatialization, and variable delay of video streams. The resulting sounds and images allude to typical situations of social interaction via video-conferencing applications. ‘Afterlives’ relies on an interplay between current, almost-current, and past moments of the audiovisual streams, which dephase the performer’s images and sounds. In this text, I have avoided delving deeper into the Pure Data abstractions or into the musical analysis of my composition; the main purpose is rather to present compositional and technical elements of ‘afterlives’ and discuss how they enable new experiences of time.
38

Wess, Jessica M., Nathaniel J. Spencer, and Joshua G. W. Bernstein. "Counting or discriminating the number of voices to assess binaural fusion with single-sided vocoders." Journal of the Acoustical Society of America 147, no. 1 (2020): 446–58. http://dx.doi.org/10.1121/10.0000511.

39

Wess, Jessica M., and Joshua G. W. Bernstein. "The Effect of Nonlinear Amplitude Growth on the Speech Perception Benefits Provided by a Single-Sided Vocoder." Journal of Speech, Language, and Hearing Research 62, no. 3 (2019): 745–57. http://dx.doi.org/10.1044/2018_jslhr-h-18-0001.

Abstract:
Purpose: For listeners with single-sided deafness, a cochlear implant (CI) can improve speech understanding by giving the listener access to the ear with the better target-to-masker ratio (TMR; head shadow) or by providing interaural difference cues to facilitate the perceptual separation of concurrent talkers (squelch). CI simulations presented to listeners with normal hearing examined how these benefits could be affected by interaural differences in loudness growth in a speech-on-speech masking task. Method: Experiment 1 examined a target–masker spatial configuration where the vocoded ear had a poorer TMR than the nonvocoded ear. Experiment 2 examined the reverse configuration. Generic head-related transfer functions simulated free-field listening. Compression or expansion was applied independently to each vocoder channel (power-law exponents: 0.25, 0.5, 1, 1.5, or 2). Results: Compression reduced the benefit provided by the vocoder ear in both experiments. There was some evidence that expansion increased squelch in Experiment 1 but reduced the benefit in Experiment 2, where the vocoder ear provided a combination of head-shadow and squelch benefits. Conclusions: The effects of compression and expansion are interpreted in terms of envelope distortion and changes in the vocoded-ear TMR (for head shadow) or changes in perceived target–masker spatial separation (for squelch). The compression parameter is a candidate for clinical optimization to improve single-sided deafness CI outcomes.
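The per-channel power-law manipulation (exponents 0.25–2) amounts to the following operation on a vocoder channel envelope. This is a sketch; the peak normalization is an assumption made here to keep the overall level comparable across exponents, not a detail taken from the paper.

```python
import numpy as np

def apply_power_law(envelope, exponent):
    """Compress (exponent < 1) or expand (exponent > 1) a channel envelope,
    renormalised so that the peak level is unchanged."""
    env = np.asarray(envelope, dtype=float)
    peak = env.max()
    if peak == 0:
        return env
    return peak * (env / peak) ** exponent
```

With exponent 0.5, an envelope value at a quarter of the peak is raised to half of it (compression); with exponent 2, it is pushed down to a sixteenth (expansion).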
40

Nose, Takashi, and Takao Kobayashi. "Very low bit-rate F0 coding for phonetic vocoders using MSD-HMM with quantized F0 symbols." Speech Communication 54, no. 3 (2012): 384–92. http://dx.doi.org/10.1016/j.specom.2011.10.002.

41

Goupell, Matthew J., Garrison T. Draves, and Ruth Y. Litovsky. "Recognition of vocoded words and sentences in quiet and multi-talker babble with children and adults." PLOS ONE 15, no. 12 (2020): e0244632. http://dx.doi.org/10.1371/journal.pone.0244632.

Abstract:
A vocoder is used to simulate cochlear-implant sound processing in normal-hearing listeners. Typically, there is rapid improvement in vocoded speech recognition, but it is unclear whether the improvement rate differs across age groups and speech materials. Children (8–10 years) and young adults (18–26 years) were trained and tested over 2 days (4 hours) on recognition of eight-channel noise-vocoded words and sentences, in quiet and in the presence of multi-talker babble at signal-to-noise ratios of 0, +5, and +10 dB. Children achieved poorer performance than adults in all conditions, for both word and sentence recognition. With training, vocoded speech recognition improvement rates were not significantly different between children and adults, suggesting that learning to process speech cues degraded by vocoding shows no developmental differences across these age groups and types of speech materials. Furthermore, this result confirms that the acutely measured age difference in vocoded speech recognition persists after extended training.
42

Shinohara, Yasuaki. "Japanese pitch-accent perception of noise-vocoded sine-wave speech." Journal of the Acoustical Society of America 152, no. 4 (2022): A175. http://dx.doi.org/10.1121/10.0015940.

Abstract:
A previous study has demonstrated that speech intelligibility is improved for a tone language when sine-wave speech is noise-vocoded, because noise-vocoding eliminates the quasi-periodicity of sine-wave speech. This study examined whether identification accuracy of Japanese pitch-accent words increases after sine-wave speech is noise-vocoded. The results showed that the Japanese listeners’ identification accuracy significantly increased, but their discrimination accuracy did not show a significant difference between the sine-wave speech and noise-vocoded sine-wave speech conditions. These results suggest that Japanese listeners can auditorily discriminate minimal-pair words using any acoustic cues in both conditions, but quasi-periodicity is eliminated by noise-vocoding so that the Japanese listeners’ identification accuracy increases in the noise-vocoded sine-wave speech condition. The same results were not observed when another way of noise-vocoding was used in a previous study, suggesting that the quasi-periodicity of sine-wave speech needs to be adequately eliminated by a noise-vocoder to show a significant difference in identification.
43

Laneau, Johan, Marc Moonen, and Jan Wouters. "Factors affecting the use of noise-band vocoders as acoustic models for pitch perception in cochlear implants." Journal of the Acoustical Society of America 119, no. 1 (2006): 491–506. http://dx.doi.org/10.1121/1.2133391.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Hodges, Aaron, Raymond L. Goldsworthy, Matthew B. Fitzgerald, and Takako Fujioka. "Transfer effects of discrete tactile mapping of musical pitch on discrimination of vocoded stimuli." Journal of the Acoustical Society of America 152, no. 4 (2022): A229. http://dx.doi.org/10.1121/10.0016101.

Full text
Abstract:
Many studies have found benefits of using the somatosensory modality to augment sound information for individuals with hearing loss. However, few studies have explored the use of multiple body regions sensitive to vibrotactile stimulation to convey discrete F0 information, which is important for music perception. This study explored whether a mapping of multiple finger patterns onto musical notes can be learned quickly and transferred to the discrimination of vocoded auditory stimuli. Each of eight diatonic scale notes was associated with a unique pattern of finger digits 2–5 on the dominant hand, to which a pneumatic tactile stimulation apparatus was attached. The study consisted of a pre-test and post-test with a learning phase in between. During the learning phase, normal-hearing participants had to identify common nursery song melodies presented with simultaneous auditory-tactile stimuli for about 10 min, using non-vocoded (original) audio. Pre- and post-tests examined stimulus discrimination for four conditions: original audio + tactile, tactile only, vocoded audio only, and vocoded audio + tactile. The audio vocoder simulated a 4-channel cochlear implant. Our results demonstrated that audio-tactile learning improved participants’ performance on the vocoded audio + tactile task. Performance in the tactile-only condition also improved significantly, indicating rapid learning of the audio-tactile mapping and its effective transfer.
APA, Harvard, Vancouver, ISO, and other styles
45

Laurent, Pierre‐Andre. "Method to evaluate the pitch and voicing of the speech signal in vocoders with very slow bit rates." Journal of the Acoustical Society of America 97, no. 5 (1995): 3223. http://dx.doi.org/10.1121/1.411817.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Bosen, Adam K., and Michael F. Barry. "Serial Recall Predicts Vocoded Sentence Recognition Across Spectral Resolutions." Journal of Speech, Language, and Hearing Research 63, no. 4 (2020): 1282–98. http://dx.doi.org/10.1044/2020_jslhr-19-00319.

Full text
Abstract:
Purpose The goal of this study was to determine how various aspects of cognition predict speech recognition ability across different levels of speech vocoding within a single group of listeners. Method We tested the ability of young adults ( N = 32) with normal hearing to recognize Perceptually Robust English Sentence Test Open-set (PRESTO) sentences that were degraded with a vocoder to produce different levels of spectral resolution (16, eight, and four carrier channels). Participants also completed tests of cognition (fluid intelligence, short-term memory, and attention), which were used as predictors of sentence recognition. Sentence recognition was compared across vocoder conditions, predictors were correlated with individual differences in sentence recognition, and the relationships between predictors were characterized. Results PRESTO sentence recognition performance declined with a decreasing number of vocoder channels, with no evident floor or ceiling performance in any condition. Individual ability to recognize PRESTO sentences was consistent relative to the group across vocoder conditions. Short-term memory, as measured with serial recall, was a moderate predictor of sentence recognition (ρ = 0.65). Serial recall performance was constant across vocoder conditions when measured with a digit span task. Fluid intelligence was marginally correlated with serial recall, but not sentence recognition. Attentional measures had no discernible relationship to sentence recognition and a marginal relationship with serial recall. Conclusions Verbal serial recall is a substantial predictor of vocoded sentence recognition, and this predictive relationship is independent of spectral resolution. In populations that show variable speech recognition outcomes, such as listeners with cochlear implants, it should be possible to account for the independent effects of spectral resolution and verbal serial recall in their speech recognition ability. 
Supplemental Material https://doi.org/10.23641/asha.12021051
APA, Harvard, Vancouver, ISO, and other styles
47

Shi, Yong Peng. "Research and Implementation of MELP Algorithm Based on TMS320VC5509A." Advanced Materials Research 934 (May 2014): 239–44. http://dx.doi.org/10.4028/www.scientific.net/amr.934.239.

Full text
Abstract:
A MELP vocoder is designed based on the TMS320VC5509A DSP in this article. First, the MELP algorithm is described; then the modeling approach and its realization on the DSP are proposed. Finally, a functional simulation of the encoding and decoding system is completed. Experimental results show that the synthesized signals fit the original signals well and that the speech quality obtained from the vocoder is good.
APA, Harvard, Vancouver, ISO, and other styles
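MELP (mixed-excitation linear prediction) synthesizes speech by driving an all-pole LPC filter with a blend of a periodic pulse train and noise. The sketch below is a drastically simplified, hypothetical single-frame decoder, not the MELP standard itself (which adds bandpass voicing strengths, adaptive spectral enhancement, and aperiodic pulses); the function name and all parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.signal import lfilter

def mixed_excitation_frame(lpc, gain, pitch_period, voicing, n=160, seed=0):
    """One synthesis frame of a toy mixed-excitation LPC decoder:
    blend a pulse train and noise by voicing strength, then filter
    through the all-pole LPC synthesis filter 1/A(z)."""
    rng = np.random.default_rng(seed)
    pulses = np.zeros(n)
    pulses[::pitch_period] = 1.0                      # periodic impulse train
    noise = rng.standard_normal(n) * 0.1              # aspiration-like noise
    excitation = voicing * pulses + (1.0 - voicing) * noise
    # Denominator A(z) = 1 + lpc[0] z^-1 + ... ; numerator is just the gain.
    return gain * lfilter([1.0], np.concatenate(([1.0], lpc)), excitation)
```

Fully voiced frames (`voicing` near 1) yield buzzy periodic output; unvoiced frames (`voicing` near 0) yield filtered noise, with mixtures in between.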
48

Tamati, Terrin N., Lars Bakker, Stefan Smeenk, Almut Jebens, Thomas Koelewijn, and Deniz Başkent. "Pupil response to familiar and unfamiliar talkers in the recognition of noise-vocoded speech." Journal of the Acoustical Society of America 151, no. 4 (2022): A264. http://dx.doi.org/10.1121/10.0011285.

Full text
Abstract:
In some challenging listening conditions, listeners are more accurate at recognizing speech produced by a familiar talker compared to unfamiliar talkers. However, previous studies have found little to no talker-familiarity benefit in the recognition of noise-vocoded speech, potentially due to limitations in the talker-specific details conveyed in noise-vocoded signals. Although no strong effect on performance has been observed, listening to a familiar talker may reduce the listening effort experienced. The current study used pupillometry to assess how talker familiarity could impact the amount of effort required to recognize noise-vocoded speech. Four groups of normal-hearing listeners completed talker familiarity training, each with a different talker. Then, listeners repeated sentences produced by the familiar (training) talker and three unfamiliar talkers. Sentences were mixed with multi-talker babble and were processed with an 8-channel noise-vocoder; SNR was set to a participant’s 50% correct performance level. Preliminary results demonstrate no overall talker-familiarity benefit across training groups. Examining each training group separately showed differences in pupil response for familiar and unfamiliar talkers, but the direction and size of the effect depended on the training talker. These preliminary findings suggest that normal-hearing listeners make use of limited talker-specific details in the recognition of noise-vocoded speech.
APA, Harvard, Vancouver, ISO, and other styles
49

Asyraf, Muhammad A., and Dhany Arifianto. "Effect of electric-acoustic cochlear implant stimulation and coding strategies on spatial cues of speech signals in reverberant room." Journal of the Acoustical Society of America 152, no. 4 (2022): A195. http://dx.doi.org/10.1121/10.0016005.

Full text
Abstract:
The comparison of spatial cue changes across different setups and coding strategies used in cochlear implants (CI) is investigated. In this experiment, we implement three voice-coder setups: bilateral CI, bimodal CI, and electro-acoustic stimulation (EAS). Two well-known coding strategies are used: continuous interleaved sampling (CIS) and spectral peak (SPEAK). Speech signals are convolved with appropriate binaural room impulse responses (BRIRs), creating reverberant spatial stimuli. Five reverberant conditions (including anechoic) were applied to the stimuli. Interaural level and time differences (ILD and ITD) are evaluated objectively and subjectively, and their relationship with speech intelligibility is observed. A prior objective evaluation with CIS reveals that clarity (C50) is a more important factor in spatial cue change than reverberation time. Vocoded conditions (bilateral CI) show an increase in ILD value (compression has not yet been implemented in the vocoder processing), while the ITD value deviates (decreases) from the midline. Reverberation degrades intelligibility at rates that depend on the C50 value, in both unvocoded and vocoded conditions. In the vocoded condition, the decrement in spatial cues was also accompanied by a decrement in the intelligibility of the spatial stimuli.
APA, Harvard, Vancouver, ISO, and other styles
50

Gibbs, Bobby E., Joshua G. W. Bernstein, Douglas S. Brungart, and Matthew J. Goupell. "Effects of better-ear glimpsing, binaural unmasking, and spectral resolution on spatial release from masking in cochlear-implant users." Journal of the Acoustical Society of America 152, no. 2 (2022): 1230–46. http://dx.doi.org/10.1121/10.0013746.

Full text
Abstract:
Bilateral cochlear-implant (BICI) listeners obtain less spatial release from masking (SRM; speech-recognition improvement for spatially separated vs co-located conditions) than normal-hearing (NH) listeners, especially for symmetrically placed maskers that produce similar long-term target-to-masker ratios at the two ears. Two experiments examined possible causes of this deficit, including limited better-ear glimpsing (using speech information from the more advantageous ear in each time-frequency unit), limited binaural unmasking (using interaural differences to improve signal-in-noise detection), or limited spectral resolution. Listeners had NH (presented with unprocessed or vocoded stimuli) or BICIs. Experiment 1 compared natural symmetric maskers, idealized monaural better-ear masker (IMBM) stimuli that automatically performed better-ear glimpsing, and hybrid stimuli that added worse-ear information, potentially restoring binaural cues. BICI and NH-vocoded SRM was comparable to NH-unprocessed SRM for idealized stimuli but was 14%–22% lower for symmetric stimuli, suggesting limited better-ear glimpsing ability. Hybrid stimuli improved SRM for NH-unprocessed listeners but degraded SRM for BICI and NH-vocoded listeners, suggesting they experienced across-ear interference instead of binaural unmasking. In experiment 2, increasing the number of vocoder channels did not change NH-vocoded SRM. BICI SRM deficits likely reflect a combination of across-ear interference, limited better-ear glimpsing, and poorer binaural unmasking that stems from cochlear-implant-processing limitations other than reduced spectral resolution.
APA, Harvard, Vancouver, ISO, and other styles
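The idealized better-ear glimpsing described in this entry amounts to a per-time-frequency-unit selection: keep whichever ear's mixture has the higher target-to-masker ratio (TMR). A toy numpy sketch on magnitude spectrograms follows; the function name, array layout, and `eps` smoothing are assumptions for illustration, not the stimulus-processing chain of the cited study.

```python
import numpy as np

def better_ear_glimpse(target_l, masker_l, target_r, masker_r, eps=1e-12):
    """Idealized better-ear glimpsing on magnitude spectrograms
    (frequency x time arrays): for each time-frequency unit, keep the
    mixture from whichever ear has the higher target-to-masker ratio."""
    tmr_l = (target_l + eps) / (masker_l + eps)
    tmr_r = (target_r + eps) / (masker_r + eps)
    take_left = tmr_l >= tmr_r          # boolean selection mask per T-F unit
    mix_l = target_l + masker_l
    mix_r = target_r + masker_r
    return np.where(take_left, mix_l, mix_r), take_left
```

With symmetric maskers, the mask flips between ears across time and frequency; a real listener's deficit relative to this ideal selection is one way to quantify limited glimpsing ability.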
