Journal articles on the topic 'Auditory-visual speech perception'

Consult the top 50 journal articles for your research on the topic 'Auditory-visual speech perception.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.

1

ERDENER, DOĞU, and DENIS BURNHAM. "Auditory–visual speech perception in three- and four-year-olds and its relationship to perceptual attunement and receptive vocabulary." Journal of Child Language 45, no. 2 (June 6, 2017): 273–89. http://dx.doi.org/10.1017/s0305000917000174.

Full text
Abstract:
Despite the body of research on auditory–visual speech perception in infants and schoolchildren, development in the early childhood period remains relatively uncharted. In this study, English-speaking children between three and four years of age were investigated for: (i) the development of visual speech perception – lip-reading and visual influence in auditory–visual integration; (ii) the development of auditory speech perception and native language perceptual attunement; and (iii) the relationship between these and a language skill relevant at this age, receptive vocabulary. Visual speech perception skills improved even over this relatively short time period. However, regression analyses revealed that vocabulary was predicted by auditory-only speech perception, and native language attunement, but not by visual speech perception ability. The results suggest that, in contrast to infants and schoolchildren, in three- to four-year-olds the relationship between speech perception and language ability is based on auditory and not visual or auditory–visual speech perception ability. Adding these results to existing findings allows elaboration of a more complete account of the developmental course of auditory–visual speech perception.
APA, Harvard, Vancouver, ISO, and other styles
2

Sams, M. "Audiovisual Speech Perception." Perception 26, no. 1_suppl (August 1997): 347. http://dx.doi.org/10.1068/v970029.

Full text
Abstract:
Persons with hearing loss use visual information from articulation to improve their speech perception. Even persons with normal hearing utilise visual information, especially when the stimulus-to-noise ratio is poor. A dramatic demonstration of the role of vision in speech perception is the audiovisual fusion called the ‘McGurk effect’. When the auditory syllable /pa/ is presented in synchrony with the face articulating the syllable /ka/, the subject usually perceives /ta/ or /ka/. The illusory perception is clearly auditory in nature. We recently studied the audiovisual fusion (acoustical /p/, visual /k/) for Finnish (1) syllables, and (2) words. Only 3% of the subjects perceived the syllables according to the acoustical input, ie in 97% of the subjects the perception was influenced by the visual information. For words the percentage of acoustical identifications was 10%. The results demonstrate a very strong influence of visual information of articulation in face-to-face speech perception. Word meaning and sentence context have a negligible influence on the fusion. We have also recorded neuromagnetic responses of the human cortex when the subjects both heard and saw speech. Some subjects showed a distinct response to a ‘McGurk’ stimulus. The response was rather late, emerging about 200 ms from the onset of the auditory stimulus. We suggest that the perisylvian cortex, close to the source area for the auditory 100 ms response (M100), may be activated by the discordant stimuli. The behavioural and neuromagnetic results suggest a precognitive audiovisual speech integration occurring at a relatively early processing level.
APA, Harvard, Vancouver, ISO, and other styles
3

Cienkowski, Kathleen M., and Arlene Earley Carney. "Auditory-Visual Speech Perception and Aging." Ear and Hearing 23, no. 5 (October 2002): 439–49. http://dx.doi.org/10.1097/00003446-200210000-00006.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Helfer, Karen S. "Auditory and Auditory-Visual Perception of Clear and Conversational Speech." Journal of Speech, Language, and Hearing Research 40, no. 2 (April 1997): 432–43. http://dx.doi.org/10.1044/jslhr.4002.432.

Full text
Abstract:
Research has shown that speaking in a deliberately clear manner can improve the accuracy of auditory speech recognition. Allowing listeners access to visual speech cues also enhances speech understanding. Whether the nature of information provided by speaking clearly and by using visual speech cues is redundant has not been determined. This study examined how speaking mode (clear vs. conversational) and presentation mode (auditory vs. auditory-visual) influenced the perception of words within nonsense sentences. In Experiment 1, 30 young listeners with normal hearing responded to videotaped stimuli presented audiovisually in the presence of background noise at one of three signal-to-noise ratios. In Experiment 2, 9 participants returned for an additional assessment using auditory-only presentation. Results of these experiments showed significant effects of speaking mode (clear speech was easier to understand than was conversational speech) and presentation mode (auditory-visual presentation led to better performance than did auditory-only presentation). The benefit of clear speech was greater for words occurring in the middle of sentences than for words at either the beginning or end of sentences for both auditory-only and auditory-visual presentation, whereas the greatest benefit from supplying visual cues was for words at the end of sentences spoken both clearly and conversationally. The total benefit from speaking clearly and supplying visual cues was equal to the sum of each of these effects. Overall, the results suggest that speaking clearly and providing visual speech information provide complementary (rather than redundant) information.
APA, Harvard, Vancouver, ISO, and other styles
5

Ediwarman, Ediwarman, Syafrizal Syafrizal, and John Pahamzah. "PERCEPTION OF SPEECH USING AUDIO VISUAL AND REPLICA FOR STUDENTS OF SULTAN AGENG TIRTAYASA UNIVERSITY." JOURNAL OF LANGUAGE 3, no. 2 (November 29, 2021): 95–102. http://dx.doi.org/10.30743/jol.v3i2.3695.

Full text
Abstract:
This paper examined the perception of speech using audio-visual presentation and replicas for students of Sultan Ageng Tirtayasa University. This research was aimed at discussing face-to-face conversation or speech as perceived by the ears and eyes. The prerequisites for audio-visual perception of speech were studied in detail by using ambiguous perceptual sine-wave replicas of natural speech as auditory stimuli. When the subjects were unaware that the auditory stimuli were speech, they showed only negligible integration of the auditory and visual stimuli. Once the same subjects learned to perceive the same auditory stimuli as speech, they integrated the auditory and visual stimuli in the same way as for natural speech. These results suggest a special mode of perception of multisensory speech.
APA, Harvard, Vancouver, ISO, and other styles
6

PONS, FERRAN, LLORENÇ ANDREU, MONICA SANZ-TORRENT, LUCÍA BUIL-LEGAZ, and DAVID J. LEWKOWICZ. "Perception of audio-visual speech synchrony in Spanish-speaking children with and without specific language impairment." Journal of Child Language 40, no. 3 (July 9, 2012): 687–700. http://dx.doi.org/10.1017/s0305000912000189.

Full text
Abstract:
Speech perception involves the integration of auditory and visual articulatory information, and thus requires the perception of temporal synchrony between this information. There is evidence that children with specific language impairment (SLI) have difficulty with auditory speech perception but it is not known if this is also true for the integration of auditory and visual speech. Twenty Spanish-speaking children with SLI, twenty typically developing age-matched Spanish-speaking children, and twenty Spanish-speaking children matched for MLU-w participated in an eye-tracking study to investigate the perception of audiovisual speech synchrony. Results revealed that children with typical language development perceived an audiovisual asynchrony of 666 ms regardless of whether the auditory or visual speech attribute led the other one. Children with SLI only detected the 666 ms asynchrony when the auditory component followed the visual component. None of the groups perceived an audiovisual asynchrony of 366 ms. These results suggest that the difficulty of speech processing by children with SLI would also involve difficulties in integrating auditory and visual aspects of speech perception.
APA, Harvard, Vancouver, ISO, and other styles
7

Van Engen, Kristin J., Avanti Dey, Mitchell S. Sommers, and Jonathan E. Peelle. "Audiovisual speech perception: Moving beyond McGurk." Journal of the Acoustical Society of America 152, no. 6 (December 2022): 3216–25. http://dx.doi.org/10.1121/10.0015262.

Full text
Abstract:
Although it is clear that sighted listeners use both auditory and visual cues during speech perception, the manner in which multisensory information is combined is a matter of debate. One approach to measuring multisensory integration is to use variants of the McGurk illusion, in which discrepant auditory and visual cues produce auditory percepts that differ from those based on unimodal input. Not all listeners show the same degree of susceptibility to the McGurk illusion, and these individual differences are frequently used as a measure of audiovisual integration ability. However, despite their popularity, we join the voices of others in the field to argue that McGurk tasks are ill-suited for studying real-life multisensory speech perception: McGurk stimuli are often based on isolated syllables (which are rare in conversations) and necessarily rely on audiovisual incongruence that does not occur naturally. Furthermore, recent data show that susceptibility to McGurk tasks does not correlate with performance during natural audiovisual speech perception. Although the McGurk effect is a fascinating illusion, truly understanding the combined use of auditory and visual information during speech perception requires tasks that more closely resemble everyday communication: namely, words, sentences, and narratives with congruent auditory and visual speech cues.
APA, Harvard, Vancouver, ISO, and other styles
8

Clement, Bart R., Sarah K. Erickson, Su‐Hyun Jin, and Arlene E. Carney. "Confidence ratings in auditory–visual speech perception." Journal of the Acoustical Society of America 107, no. 5 (May 2000): 2887–88. http://dx.doi.org/10.1121/1.428732.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Burnham, Denis, Kaoru Sekiyama, and Dogu Erdener. "Cross‐language auditory‐visual speech perception development." Journal of the Acoustical Society of America 123, no. 5 (May 2008): 3879. http://dx.doi.org/10.1121/1.2935787.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

McCotter, Maxine V., and Timothy R. Jordan. "The Role of Facial Colour and Luminance in Visual and Audiovisual Speech Perception." Perception 32, no. 8 (August 2003): 921–36. http://dx.doi.org/10.1068/p3316.

Full text
Abstract:
We conducted four experiments to investigate the role of colour and luminance information in visual and audiovisual speech perception. In experiments 1a (stimuli presented in quiet conditions) and 1b (stimuli presented in auditory noise), face display types comprised naturalistic colour (NC), grey-scale (GS), and luminance inverted (LI) faces. In experiments 2a (quiet) and 2b (noise), face display types comprised NC, colour inverted (CI), LI, and colour and luminance inverted (CLI) faces. Six syllables and twenty-two words were used to produce auditory and visual speech stimuli. Auditory and visual signals were combined to produce congruent and incongruent audiovisual speech stimuli. Experiments 1a and 1b showed that perception of visual speech, and its influence on identifying the auditory components of congruent and incongruent audiovisual speech, was less for LI than for either NC or GS faces, which produced identical results. Experiments 2a and 2b showed that perception of visual speech, and influences on perception of incongruent auditory speech, was less for LI and CLI faces than for NC and CI faces (which produced identical patterns of performance). Our findings for NC and CI faces suggest that colour is not critical for perception of visual and audiovisual speech. The effect of luminance inversion on performance accuracy was relatively small (5%), which suggests that the luminance information preserved in LI faces is important for the processing of visual and audiovisual speech.
APA, Harvard, Vancouver, ISO, and other styles
11

Ujiie, Yuta, and Kohske Takahashi. "Weaker McGurk Effect for Rubin’s Vase-Type Speech in People With High Autistic Traits." Multisensory Research 34, no. 6 (April 16, 2021): 663–79. http://dx.doi.org/10.1163/22134808-bja10047.

Full text
Abstract:
While visual information from facial speech modulates auditory speech perception, it is less influential on audiovisual speech perception among autistic individuals than among typically developed individuals. In this study, we investigated the relationship between autistic traits (Autism-Spectrum Quotient; AQ) and the influence of visual speech on the recognition of Rubin’s vase-type speech stimuli with degraded facial speech information. Participants were 31 university students (13 males and 18 females; mean age: 19.2, SD: 1.13 years) who reported normal (or corrected-to-normal) hearing and vision. All participants completed three speech recognition tasks (visual, auditory, and audiovisual stimuli) and the AQ–Japanese version. The results showed that accuracies of speech recognition for visual (i.e., lip-reading) and auditory stimuli were not significantly related to participants’ AQ. In contrast, audiovisual speech perception was less susceptible to facial speech perception among individuals with high rather than low autistic traits. The weaker influence of visual information on audiovisual speech perception in autism spectrum disorder (ASD) was robust regardless of the clarity of the visual information, suggesting a difficulty in the process of audiovisual integration rather than in the visual processing of facial speech.
APA, Harvard, Vancouver, ISO, and other styles
12

Brendel, O. "SOME ASPECTS OF VISUAL-AUDITORY PERCEPTION OF ORAL SPEECH WHILE VIDEO AND SOUND RECORDINGS EXAMINATION." Theory and Practice of Forensic Science and Criminalistics 21, no. 1 (December 15, 2020): 349–58. http://dx.doi.org/10.32353/khrife.1.2020_24.

Full text
Abstract:
The article considers a problematic issue that frequently arises in the examination of video and audio recordings: the visual and auditory perception of oral speech, that is, establishing the content of a conversation from its image (lip reading). Its purpose is to analyze the possibility and feasibility of examining visual-auditory perception of oral speech within the framework of the examination of video and sound recordings, taking into account the peculiarities of such research and the ability to use visual information either as an independent object of examination (lip reading) or as a supplement to the auditory analysis of a particular message. The main components of the lip-reading process and the possibility of examining visual and auditory information in order to establish the content of a conversation are considered. Attention is paid to the features of visual and auditory perception of oral speech, and the factors that contribute most to the informativeness of the overall picture of speech perception from an image are analyzed, such as active articulation, facial expressions, head movement, position of the teeth, and gestures. In addition to image quality, the duration of the speech fragment also affects the perception of oral speech from an image: a fully uttered expression is usually read better than its individual parts. The article also draws attention to the ambiguity of articulatory images of sounds and considers the McGurk effect, a perceptual phenomenon that demonstrates the interaction between hearing and vision during speech perception.
APA, Harvard, Vancouver, ISO, and other styles
13

Rosenblum, Lawrence D. "Speech Perception as a Multimodal Phenomenon." Current Directions in Psychological Science 17, no. 6 (December 2008): 405–9. http://dx.doi.org/10.1111/j.1467-8721.2008.00615.x.

Full text
Abstract:
Speech perception is inherently multimodal. Visual speech (lip-reading) information is used by all perceivers and readily integrates with auditory speech. Imaging research suggests that the brain treats auditory and visual speech similarly. These findings have led some researchers to consider that speech perception works by extracting amodal information that takes the same form across modalities. From this perspective, speech integration is a property of the input information itself. Amodal speech information could explain the reported automaticity, immediacy, and completeness of audiovisual speech integration. However, recent findings suggest that speech integration can be influenced by higher cognitive properties such as lexical status and semantic context. Proponents of amodal accounts will need to explain these results.
APA, Harvard, Vancouver, ISO, and other styles
14

Yordamlı, Arzu, and Doğu Erdener. "Auditory–Visual Speech Integration in Bipolar Disorder: A Preliminary Study." Languages 3, no. 4 (October 17, 2018): 38. http://dx.doi.org/10.3390/languages3040038.

Full text
Abstract:
This study aimed to investigate how individuals with bipolar disorder integrate auditory and visual speech information compared to healthy individuals. Furthermore, we wanted to see whether there were any differences between manic and depressive episode bipolar disorder patients with respect to auditory and visual speech integration. It was hypothesized that the bipolar group’s auditory–visual speech integration would be weaker than that of the control group. Further, it was predicted that those in the manic phase of bipolar disorder would integrate visual speech information more robustly than their depressive phase counterparts. To examine these predictions, a McGurk effect paradigm with an identification task was used with typical auditory–visual (AV) speech stimuli. Additionally, auditory-only (AO) and visual-only (VO, lip-reading) speech perceptions were also tested. The dependent variable for the AV stimuli was the amount of visual speech influence. The dependent variables for AO and VO stimuli were accurate modality-based responses. Results showed that the disordered and control groups did not differ in AV speech integration and AO speech perception. However, there was a striking difference in favour of the healthy group with respect to the VO stimuli. The results suggest the need for further research whereby both behavioural and physiological data are collected simultaneously. This will help us understand the full dynamics of how auditory and visual speech information are integrated in people with bipolar disorder.
APA, Harvard, Vancouver, ISO, and other styles
15

Hertrich, Ingo, Susanne Dietrich, and Hermann Ackermann. "Cross-modal Interactions during Perception of Audiovisual Speech and Nonspeech Signals: An fMRI Study." Journal of Cognitive Neuroscience 23, no. 1 (January 2011): 221–37. http://dx.doi.org/10.1162/jocn.2010.21421.

Full text
Abstract:
During speech communication, visual information may interact with the auditory system at various processing stages. Most noteworthy, recent magnetoencephalography (MEG) data provided first evidence for early and preattentive phonetic/phonological encoding of the visual data stream—prior to its fusion with auditory phonological features [Hertrich, I., Mathiak, K., Lutzenberger, W., & Ackermann, H. Time course of early audiovisual interactions during speech and non-speech central-auditory processing: An MEG study. Journal of Cognitive Neuroscience, 21, 259–274, 2009]. Using functional magnetic resonance imaging, the present follow-up study aims to further elucidate the topographic distribution of visual–phonological operations and audiovisual (AV) interactions during speech perception. Ambiguous acoustic syllables—disambiguated to /pa/ or /ta/ by the visual channel (speaking face)—served as test materials, concomitant with various control conditions (nonspeech AV signals, visual-only and acoustic-only speech, and nonspeech stimuli). (i) Visual speech yielded an AV-subadditive activation of primary auditory cortex and the anterior superior temporal gyrus (STG), whereas the posterior STG responded both to speech and nonspeech motion. (ii) The inferior frontal and the fusiform gyrus of the right hemisphere showed a strong phonetic/phonological impact (differential effects of visual /pa/ vs. /ta/) upon hemodynamic activation during presentation of speaking faces. Taken together with the previous MEG data, these results point at a dual-pathway model of visual speech information processing: On the one hand, access to the auditory system via the anterior supratemporal “what” path may give rise to direct activation of “auditory objects.” On the other hand, visual speech information seems to be represented in a right-hemisphere visual working memory, providing a potential basis for later interactions with auditory information such as the McGurk effect.
APA, Harvard, Vancouver, ISO, and other styles
16

BURNHAM, DENIS, BENJAWAN KASISOPA, AMANDA REID, SUDAPORN LUKSANEEYANAWIN, FRANCISCO LACERDA, VIRGINIA ATTINA, NAN XU RATTANASONE, IRIS-CORINNA SCHWARZ, and DIANE WEBSTER. "Universality and language-specific experience in the perception of lexical tone and pitch." Applied Psycholinguistics 36, no. 6 (November 21, 2014): 1459–91. http://dx.doi.org/10.1017/s0142716414000496.

Full text
Abstract:
Two experiments focus on Thai tone perception by native speakers of tone languages (Thai, Cantonese, and Mandarin), a pitch–accent language (Swedish), and a nontonal language (English). In Experiment 1, there was better auditory-only and auditory–visual discrimination by tone and pitch–accent language speakers than by nontone language speakers. Conversely and counterintuitively, there was better visual-only discrimination by nontone language speakers than by tone and pitch–accent language speakers. Nevertheless, visual augmentation of auditory tone perception in noise was evident for all five language groups. In Experiment 2, involving discrimination in three fundamental-frequency-equivalent auditory contexts, tone and pitch–accent language participants showed equivalent discrimination for normal Thai speech, filtered speech, and violin sounds. In contrast, nontone language listeners had significantly better discrimination for violin sounds than for filtered speech, which in turn was better than for normal speech. Together the results show that tone perception is determined by both auditory and visual information, by acoustic and linguistic contexts, and by universal and experiential factors.
APA, Harvard, Vancouver, ISO, and other styles
17

Treille, Avril, Coriandre Vilain, Thomas Hueber, Laurent Lamalle, and Marc Sato. "Inside Speech: Multisensory and Modality-specific Processing of Tongue and Lip Speech Actions." Journal of Cognitive Neuroscience 29, no. 3 (March 2017): 448–66. http://dx.doi.org/10.1162/jocn_a_01057.

Full text
Abstract:
Action recognition has been found to rely not only on sensory brain areas but also partly on the observer's motor system. However, whether distinct auditory and visual experiences of an action modulate sensorimotor activity remains largely unknown. In the present sparse sampling fMRI study, we determined to which extent sensory and motor representations interact during the perception of tongue and lip speech actions. Tongue and lip speech actions were selected because tongue movements of our interlocutor are accessible via their impact on speech acoustics but not visible because of its position inside the vocal tract, whereas lip movements are both “audible” and visible. Participants were presented with auditory, visual, and audiovisual speech actions, with the visual inputs related to either a sagittal view of the tongue movements or a facial view of the lip movements of a speaker, previously recorded by an ultrasound imaging system and a video camera. Although the neural networks involved in visual visuolingual and visuofacial perception largely overlapped, stronger motor and somatosensory activations were observed during visuolingual perception. In contrast, stronger activity was found in auditory and visual cortices during visuofacial perception. Complementing these findings, activity in the left premotor cortex and in visual brain areas was found to correlate with visual recognition scores observed for visuolingual and visuofacial speech stimuli, respectively, whereas visual activity correlated with RTs for both stimuli. These results suggest that unimodal and multimodal processing of lip and tongue speech actions rely on common sensorimotor brain areas. They also suggest that visual processing of audible but not visible movements induces motor and visual mental simulation of the perceived actions to facilitate recognition and/or to learn the association between auditory and visual signals.
APA, Harvard, Vancouver, ISO, and other styles
18

Irwin, Julia, Trey Avery, Lawrence Brancazio, Jacqueline Turcios, Kayleigh Ryherd, and Nicole Landi. "Electrophysiological Indices of Audiovisual Speech Perception: Beyond the McGurk Effect and Speech in Noise." Multisensory Research 31, no. 1-2 (2018): 39–56. http://dx.doi.org/10.1163/22134808-00002580.

Full text
Abstract:
Visual information on a talker’s face can influence what a listener hears. Commonly used approaches to study this include mismatched audiovisual stimuli (e.g., McGurk type stimuli) or visual speech in auditory noise. In this paper we discuss potential limitations of these approaches and introduce a novel visual phonemic restoration method. This method always presents the same visual stimulus (e.g., /ba/) dubbed with either a matched auditory stimulus (/ba/) or one that has weakened consonantal information and sounds more /a/-like. When this reduced auditory stimulus (or /a/) is dubbed with the visual /ba/, a visual influence will result in effectively ‘restoring’ the weakened auditory cues so that the stimulus is perceived as a /ba/. An oddball design was used in which participants were asked to detect the /a/ among a stream of more frequently occurring /ba/s while viewing either a speaking face or a face with no visual speech. In addition, the same paradigm was presented for a second contrast in which participants detected /pa/ among /ba/s, a contrast which should be unaltered by the presence of visual speech. Behavioral and some ERP findings reflect the expected phonemic restoration for the /ba/ vs. /a/ contrast; specifically, we observed reduced accuracy and P300 response in the presence of visual speech. Further, we report an unexpected finding of reduced accuracy and P300 response for both speech contrasts in the presence of visual speech, suggesting overall modulation of the auditory signal in the presence of visual speech. Consistent with this, we observed a mismatch negativity (MMN) effect only for the /ba/ vs. /pa/ contrast, and it was larger in the absence of visual speech. We discuss the potential utility of this paradigm for listeners who cannot respond actively, such as infants and individuals with developmental disabilities.
APA, Harvard, Vancouver, ISO, and other styles
19

Bosseler, Alexis N., Patricia K. Kuhl, Dominic W. Massaro, and Andrew N. Meltzoff. "Auditory‐visual speech perception: What isolated articulators contribute." Journal of the Acoustical Society of America 121, no. 5 (May 2007): 3045. http://dx.doi.org/10.1121/1.4781742.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Carney, Arlene E., Bart R. Clement, and Kathleen M. Cienkowski. "Talker variability effects in auditory‐visual speech perception." Journal of the Acoustical Society of America 106, no. 4 (October 1999): 2270. http://dx.doi.org/10.1121/1.427755.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Bernstein, Lynne E., Edward T. Auer, Jean K. Moore, Curtis W. Ponton, Manuel Don, and Manbir Singh. "Visual speech perception without primary auditory cortex activation." Neuroreport 13, no. 3 (March 2002): 311–15. http://dx.doi.org/10.1097/00001756-200203040-00013.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Lalonde, Kaylah, and Lynne A. Werner. "Development of the Mechanisms Underlying Audiovisual Speech Perception Benefit." Brain Sciences 11, no. 1 (January 5, 2021): 49. http://dx.doi.org/10.3390/brainsci11010049.

Full text
Abstract:
The natural environments in which infants and children learn speech and language are noisy and multimodal. Adults rely on the multimodal nature of speech to compensate for noisy environments during speech communication. Multiple mechanisms underlie mature audiovisual benefit to speech perception, including reduced uncertainty as to when auditory speech will occur, use of correlations between the amplitude envelope of auditory and visual signals in fluent speech, and use of visual phonetic knowledge for lexical access. This paper reviews evidence regarding infants’ and children’s use of temporal and phonetic mechanisms in audiovisual speech perception benefit. The ability to use temporal cues for audiovisual speech perception benefit emerges in infancy. Although infants are sensitive to the correspondence between auditory and visual phonetic cues, the ability to use this correspondence for audiovisual benefit may not emerge until age four. A more cohesive account of the development of audiovisual speech perception may follow from a more thorough understanding of the development of sensitivity to and use of various temporal and phonetic cues.
APA, Harvard, Vancouver, ISO, and other styles
23

Randazzo, Melissa, Ryan Priefer, Paul J. Smith, Amanda Nagler, Trey Avery, and Karen Froud. "Neural Correlates of Modality-Sensitive Deviance Detection in the Audiovisual Oddball Paradigm." Brain Sciences 10, no. 6 (May 28, 2020): 328. http://dx.doi.org/10.3390/brainsci10060328.

Full text
Abstract:
The McGurk effect, an incongruent pairing of visual /ga/–acoustic /ba/, creates a fusion illusion /da/ and is the cornerstone of research in audiovisual speech perception. Combination illusions occur given reversal of the input modalities—auditory /ga/-visual /ba/, and percept /bga/. A robust literature shows that fusion illusions in an oddball paradigm evoke a mismatch negativity (MMN) in the auditory cortex, in absence of changes to acoustic stimuli. We compared fusion and combination illusions in a passive oddball paradigm to further examine the influence of visual and auditory aspects of incongruent speech stimuli on the audiovisual MMN. Participants viewed videos under two audiovisual illusion conditions: fusion with visual aspect of the stimulus changing, and combination with auditory aspect of the stimulus changing, as well as two unimodal auditory- and visual-only conditions. Fusion and combination deviants exerted similar influence in generating congruency predictions with significant differences between standards and deviants in the N100 time window. Presence of the MMN in early and late time windows differentiated fusion from combination deviants. When the visual signal changes, a new percept is created, but when the visual is held constant and the auditory changes, the response is suppressed, evoking a later MMN. In alignment with models of predictive processing in audiovisual speech perception, we interpreted our results to indicate that visual information can both predict and suppress auditory speech perception.
APA, Harvard, Vancouver, ISO, and other styles
24

Watkins, Kate, and Tomáš Paus. "Modulation of Motor Excitability during Speech Perception: The Role of Broca's Area." Journal of Cognitive Neuroscience 16, no. 6 (July 2004): 978–87. http://dx.doi.org/10.1162/0898929041502616.

Full text
Abstract:
Studies in both human and nonhuman primates indicate that motor and premotor cortical regions participate in auditory and visual perception of actions. Previous studies, using transcranial magnetic stimulation (TMS), showed that perceiving visual and auditory speech increased the excitability of the orofacial motor system during speech perception. Such studies, however, cannot tell us which brain regions mediate this effect. In this study, we used the technique of combining positron emission tomography with TMS to identify the brain regions that modulate the excitability of the motor system during speech perception. Our results show that during auditory speech perception, there is increased excitability of the motor system underlying speech production and that this increase is significantly correlated with activity in the posterior part of the left inferior frontal gyrus (Broca's area). We propose that this area “primes” the motor system in response to heard speech even when no speech output is required and, as such, operates at the interface of perception and action.
APA, Harvard, Vancouver, ISO, and other styles
25

Plass, John, David Brang, Satoru Suzuki, and Marcia Grabowecky. "Vision perceptually restores auditory spectral dynamics in speech." Proceedings of the National Academy of Sciences 117, no. 29 (July 6, 2020): 16920–27. http://dx.doi.org/10.1073/pnas.2002887117.

Full text
Abstract:
Visual speech facilitates auditory speech perception, but the visual cues responsible for these benefits and the information they provide remain unclear. Low-level models emphasize basic temporal cues provided by mouth movements, but these impoverished signals may not fully account for the richness of auditory information provided by visual speech. High-level models posit interactions among abstract categorical (i.e., phonemes/visemes) or amodal (e.g., articulatory) speech representations, but require lossy remapping of speech signals onto abstracted representations. Because visible articulators shape the spectral content of speech, we hypothesized that the perceptual system might exploit natural correlations between midlevel visual (oral deformations) and auditory speech features (frequency modulations) to extract detailed spectrotemporal information from visual speech without employing high-level abstractions. Consistent with this hypothesis, we found that the time–frequency dynamics of oral resonances (formants) could be predicted with unexpectedly high precision from the changing shape of the mouth during speech. When isolated from other speech cues, speech-based shape deformations improved perceptual sensitivity for corresponding frequency modulations, suggesting that listeners could exploit this cross-modal correspondence to facilitate perception. To test whether this type of correspondence could improve speech comprehension, we selectively degraded the spectral or temporal dimensions of auditory sentence spectrograms to assess how well visual speech facilitated comprehension under each degradation condition. Visual speech produced drastically larger enhancements during spectral degradation, suggesting a condition-specific facilitation effect driven by cross-modal recovery of auditory speech spectra. The perceptual system may therefore use audiovisual correlations rooted in oral acoustics to extract detailed spectrotemporal information from visual speech.
APA, Harvard, Vancouver, ISO, and other styles
26

Phatak, Sandeep A., Danielle J. Zion, and Ken W. Grant. "Consonant Perception in Connected Syllables Spoken at a Conversational Syllabic Rate." Trends in Hearing 27 (January 2023): 233121652311566. http://dx.doi.org/10.1177/23312165231156673.

Full text
Abstract:
Closed-set consonant identification, measured using nonsense syllables, has been commonly used to investigate the encoding of speech cues in the human auditory system. Such tasks also evaluate the robustness of speech cues to masking from background noise and their impact on auditory-visual speech integration. However, extending the results of these studies to everyday speech communication has been a major challenge due to acoustic, phonological, lexical, contextual, and visual speech cue differences between consonants in isolated syllables and in conversational speech. In an attempt to isolate and address some of these differences, recognition of consonants spoken in multisyllabic nonsense phrases (e.g., aBaSHaGa spoken as /ɑbɑʃɑɡɑ/) produced at an approximately conversational syllabic rate was measured and compared with consonant recognition using Vowel-Consonant-Vowel bisyllables spoken in isolation. After accounting for differences in stimulus audibility using the Speech Intelligibility Index, consonants spoken in sequence at a conversational syllabic rate were found to be more difficult to recognize than those produced in isolated bisyllables. Specifically, place- and manner-of-articulation information was transmitted better in isolated nonsense syllables than for multisyllabic phrases. The contribution of visual speech cues to place-of-articulation information was also lower for consonants spoken in sequence at a conversational syllabic rate. These data imply that auditory-visual benefit based on models of feature complementarity from isolated syllable productions may over-estimate real-world benefit of integrating auditory and visual speech cues.
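
The Speech Intelligibility Index used above to equate audibility is, at its core, a band-importance-weighted sum of audibility. The sketch below illustrates only that simplified core idea; the function name, band weights, and audibility values are invented placeholders, not the standardized SII tables or the values used in this study.

```python
# Simplified core of the Speech Intelligibility Index (SII):
# a band-importance-weighted sum of audibility, SII = sum_i I_i * A_i,
# where the importance weights I_i sum to 1 and each audibility A_i is in [0, 1].
# The numbers below are invented placeholders, not standardized SII tables.

def speech_intelligibility_index(importance, audibility):
    """Band-importance-weighted audibility; one entry per frequency band."""
    assert len(importance) == len(audibility)
    return sum(i * a for i, a in zip(importance, audibility))

# Hypothetical 4-band example: low bands fully audible, high bands partly masked.
importance = [0.2, 0.3, 0.3, 0.2]   # placeholder band-importance weights
audibility = [1.0, 0.9, 0.5, 0.2]   # placeholder per-band audibility
print(speech_intelligibility_index(importance, audibility))  # ≈ 0.66
```
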
APA, Harvard, Vancouver, ISO, and other styles
27

Massaro, Dominic W., and Michael M. Cohen. "Perception of Synthesized Audible and Visible Speech." Psychological Science 1, no. 1 (January 1990): 55–63. http://dx.doi.org/10.1111/j.1467-9280.1990.tb00068.x.

Full text
Abstract:
The research reported in this paper uses novel stimuli to study how speech perception is influenced by information presented to ear and eye. Auditory and visual sources of information (syllables) were synthesized and presented in isolation or in factorial combination. A five-step continuum between the syllables /ba/ and /da/ was synthesized along both auditory and visual dimensions, by varying properties of the syllable at its onset. The onsets of the second and third formants were manipulated in the audible speech. For the visible speech, the shape of the lips and the jaw position at the onset of the syllable were manipulated. Subjects’ identification judgments of the test syllables presented on videotape were influenced by both auditory and visual information. The results were used to test between a fuzzy logical model of speech perception (FLMP) and a categorical model of perception (CMP). These tests indicate that evaluation and integration of the two sources of information makes available continuous as opposed to just categorical information. In addition, the integration of the two sources appears to be nonadditive in that the least ambiguous source has the largest impact on the judgment. The two sources of information appear to be evaluated, integrated, and identified as described by the FLMP, an optimal algorithm for combining information from multiple sources. The research provides a theoretical framework for understanding the improvement in speech perception by hearing-impaired listeners when auditory speech is supplemented with other sources of information.
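
For readers unfamiliar with the FLMP, here is a minimal sketch of its standard two-alternative form, in which auditory and visual degrees of support are multiplied and renormalized; the function name and numeric support values are invented for illustration and are not taken from this study.

```python
# Two-alternative fuzzy logical model of perception (FLMP): the auditory and
# visual degrees of support for one alternative (here /da/) are multiplied and
# renormalized against the support for the other alternative (/ba/).
# Support values lie in [0, 1]; the numbers below are invented.

def flmp_prob_da(a_support, v_support):
    """P(/da/ | auditory support a, visual support v) under the two-choice FLMP."""
    da = a_support * v_support
    ba = (1.0 - a_support) * (1.0 - v_support)
    return da / (da + ba)

# An ambiguous auditory token (0.5) paired with clearly /da/-like visual speech
# (0.9) is judged /da/ most of the time, illustrating the visual influence.
print(flmp_prob_da(0.5, 0.9))   # ≈ 0.90
print(flmp_prob_da(0.2, 0.9))   # ≈ 0.69
```
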
APA, Harvard, Vancouver, ISO, and other styles
28

Hilić-Huskić, Azra, Esad H. Mahmutović, and Meliha Povlakić Hadžiefendić. "VISUAL PERCEPTION OF SPEECH IN CHILDREN WITH COCHLEAR IMPLANT." Journal Human Research in Rehabilitation 8, no. 2 (September 2018): 79–84. http://dx.doi.org/10.21554/hrr.091809.

Full text
Abstract:
With the development and application of cochlear implants, a large number of people with hearing impairment achieve better perception of speech after implantation. The aim of the research was to determine the differences in the quality of speech perception of children with a cochlear implant in relation to the perception modality (auditory, visual, and audiovisual). The sample consisted of 30 deaf children with a cochlear implant, of both sexes, chronologically aged from 3 to 15 years, who regularly attend or had attended rehabilitation of hearing and speaking. The Test Lingvogram and the Articulation Test were used for testing (Vuletić, 1990). The data were processed with descriptive statistics and single-factor analysis of variance. Respondents had the weakest results for word repetition and word understanding in the visual modality, much better results in the auditory modality, and the best results in the audiovisual modality. By comparing different modalities of speech perception, it was found that the differences were statistically significant in all pairs of modalities, both in word repetition and in word understanding, at the level of statistical significance p < .05, except between the visual and auditory perception (p = .26) in word repetition, though they were clinically significant in this combination too. The reason for the better effects of the auditory and especially the audiovisual modality, relative to the visual perception of speech, in this study is the application of cochlear implants in improving hearing and listening. However, people with a cochlear implant are still persons with hearing impairment. They should always maintain a high level of quality in their visual perception of speech in communication, which can be achieved through special exercises in the process of early rehabilitation of hearing and speaking.
APA, Harvard, Vancouver, ISO, and other styles
29

Walden, Brian E., Allen A. Montgomery, Robert A. Prosek, and David B. Hawkins. "Visual Biasing of Normal and Impaired Auditory Speech Perception." Journal of Speech, Language, and Hearing Research 33, no. 1 (March 1990): 163–73. http://dx.doi.org/10.1044/jshr.3301.163.

Full text
Abstract:
Intersensory biasing occurs when cues in one sensory modality influence the perception of discrepant cues in another modality. Visual biasing of auditory stop consonant perception was examined in two related experiments in an attempt to clarify the role of hearing impairment on susceptibility to visual biasing of auditory speech perception. Fourteen computer-generated acoustic approximations of consonant-vowel syllables forming a /ba-da-ga/ continuum were presented for labeling as one of the three exemplars, via audition alone and in synchrony with natural visual articulations of /ba/ and of /ga/. Labeling functions were generated for each test condition showing the percentage of /ba/, /da/, and /ga/ responses to each of the 14 synthetic syllables. The subjects of the first experiment were 15 normal-hearing and 15 hearing-impaired observers. The hearing-impaired subjects demonstrated a greater susceptibility to biasing from visual cues than did the normal-hearing subjects. In the second experiment, the auditory stimuli were presented in a low-level background noise to 15 normal-hearing observers. A comparison of their labeling responses with those from the first experiment suggested that hearing-impaired persons may develop a propensity to rely on visual cues as a result of long-term hearing impairment. The results are discussed in terms of theories of intersensory bias.
APA, Harvard, Vancouver, ISO, and other styles
30

Everdell, Ian T., Heidi Marsh, Micheal D. Yurick, Kevin G. Munhall, and Martin Paré. "Gaze Behaviour in Audiovisual Speech Perception: Asymmetrical Distribution of Face-Directed Fixations." Perception 36, no. 10 (October 2007): 1535–45. http://dx.doi.org/10.1068/p5852.

Full text
Abstract:
Speech perception under natural conditions entails integration of auditory and visual information. Understanding how visual and auditory speech information are integrated requires detailed descriptions of the nature and processing of visual speech information. To understand better the process of gathering visual information, we studied the distribution of face-directed fixations of humans performing an audiovisual speech perception task to characterise the degree of asymmetrical viewing and its relationship to speech intelligibility. Participants showed stronger gaze fixation asymmetries while viewing dynamic faces, compared to static faces or face-like objects, especially when gaze was directed to the talkers' eyes. Although speech perception accuracy was significantly enhanced by the viewing of congruent, dynamic faces, we found no correlation between task performance and gaze fixation asymmetry. Most participants preferentially fixated the right side of the faces and their preferences persisted while viewing horizontally mirrored stimuli, different talkers, or static faces. These results suggest that the asymmetrical distributions of gaze fixations reflect the participants' viewing preferences, rather than being a product of asymmetrical faces, but that this behavioural bias does not predict correct audiovisual speech perception.
APA, Harvard, Vancouver, ISO, and other styles
31

Ika Afiyanti, Nur Rahma, Sartin T. Miolo, and Helena Badu. "Students Perceptions of the Use of English as Medium of Instruction." TRANS-KATA: Journal of Language, Literature, Culture and Education 1, no. 2 (May 30, 2021): 115–23. http://dx.doi.org/10.54923/transkata.v1i2.57.

Full text
Abstract:
This study aims to explore the students’ perception of the use of English as Medium of Instruction (EMI) in teaching and learning at SMA Negeri 1 Kota Gorontalo. This explorative qualitative study employed a purposive sampling method. It involved 112 twelfth-grade students from the IPA, IPS, and BAHASA majors of SMA Negeri 1 Kota Gorontalo as the subjects. The study utilized questionnaires for data collection, while the Likert Scale and Percentage Formula were used in data analysis. The results showed that the students had positive perceptions of the three EMI types of perception: visual, auditory, and speech perception. The average results for visual, auditory, and speech perception were 70.96% (‘strongly agree’ category), 70.49% (‘strongly agree’ category), and 57.90% (‘agree’ category), respectively. Further study with larger samples and different subjects is recommended, since this research focuses only on students’ perceptions of the use of EMI in teaching and learning at SMA Negeri 1 Kota Gorontalo.
APA, Harvard, Vancouver, ISO, and other styles
32

Lindborg, Alma, and Tobias S. Andersen. "Bayesian binding and fusion models explain illusion and enhancement effects in audiovisual speech perception." PLOS ONE 16, no. 2 (February 19, 2021): e0246986. http://dx.doi.org/10.1371/journal.pone.0246986.

Full text
Abstract:
Speech is perceived with both the ears and the eyes. Adding congruent visual speech improves the perception of a faint auditory speech stimulus, whereas adding incongruent visual speech can alter the perception of the utterance. The latter phenomenon is the case of the McGurk illusion, where an auditory stimulus such as e.g. “ba” dubbed onto a visual stimulus such as “ga” produces the illusion of hearing “da”. Bayesian models of multisensory perception suggest that both the enhancement and the illusion case can be described as a two-step process of binding (informed by prior knowledge) and fusion (informed by the information reliability of each sensory cue). However, there is to date no study which has accounted for how they each contribute to audiovisual speech perception. In this study, we expose subjects to both congruent and incongruent audiovisual speech, manipulating the binding and the fusion stages simultaneously. This is done by varying both temporal offset (binding) and auditory and visual signal-to-noise ratio (fusion). We fit two Bayesian models to the behavioural data and show that they can both account for the enhancement effect in congruent audiovisual speech, as well as the McGurk illusion. This modelling approach allows us to disentangle the effects of binding and fusion on behavioural responses. Moreover, we find that these models have greater predictive power than a forced fusion model. This study provides a systematic and quantitative approach to measuring audiovisual integration in the perception of the McGurk illusion as well as congruent audiovisual speech, which we hope will inform future work on audiovisual speech perception.
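
As background for the fusion stage such Bayesian models assume, here is a minimal sketch of standard reliability-weighted fusion of two Gaussian cues; it illustrates the general principle only and is not the specific binding-plus-fusion model fitted in this paper (the function name and all numbers are invented).

```python
# Reliability-weighted fusion of two independent Gaussian cues: each cue is
# weighted by its precision (1 / variance), so the less noisy cue dominates the
# fused estimate and the fused variance is smaller than either cue alone.
# This illustrates the generic fusion principle, not this paper's fitted model.

def fuse(x_aud, var_aud, x_vis, var_vis):
    """Return the fused estimate and its variance for two Gaussian cues."""
    w_aud = 1.0 / var_aud
    w_vis = 1.0 / var_vis
    fused_mean = (w_aud * x_aud + w_vis * x_vis) / (w_aud + w_vis)
    fused_var = 1.0 / (w_aud + w_vis)
    return fused_mean, fused_var

# Invented example: a noisy auditory cue (variance 4.0) and a reliable visual
# cue (variance 1.0); the fused percept lies much closer to the visual estimate.
print(fuse(x_aud=0.0, var_aud=4.0, x_vis=1.0, var_vis=1.0))  # (0.8, 0.8)
```
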
APA, Harvard, Vancouver, ISO, and other styles
33

Massaro, Dominic W., Michael M. Cohen, and Paula M. T. Smeele. "Perception of asynchronous and conflicting visual and auditory speech." Journal of the Acoustical Society of America 100, no. 3 (September 1996): 1777–86. http://dx.doi.org/10.1121/1.417342.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Youse, Kathleen M., Kathleen M. Cienkowski, and Carl A. Coelho. "Auditory-visual speech perception in an adult with aphasia." Brain Injury 18, no. 8 (August 2004): 825–34. http://dx.doi.org/10.1080/02699000410001671784.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Tye-Murray, Nancy, Mitchell Sommers, and Brent Spehar. "Auditory and Visual Lexical Neighborhoods in Audiovisual Speech Perception." Trends in Amplification 11, no. 4 (December 2007): 233–41. http://dx.doi.org/10.1177/1084713807307409.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Hartmann, E. Eugenie. "Perceptual Development: Visual, Auditory, and Speech Perception in Infancy." Optometry and Vision Science 76, no. 9 (September 1999): 612–13. http://dx.doi.org/10.1097/00006324-199909000-00013.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

van Wassenhove, Virginie, Ken W. Grant, and David Poeppel. "Temporal window of integration in auditory-visual speech perception." Neuropsychologia 45, no. 3 (January 2007): 598–607. http://dx.doi.org/10.1016/j.neuropsychologia.2006.01.001.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Sekiyama, Kaoru, Iwao Kanno, Shuichi Miura, and Yoichi Sugita. "Auditory-visual speech perception examined by fMRI and PET." Neuroscience Research 47, no. 3 (November 2003): 277–87. http://dx.doi.org/10.1016/s0168-0102(03)00214-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Burgmeier, Robert, Rajen U. Desai, Katherine C. Farner, Benjamin Tiano, Ryan Lacey, Nicholas J. Volpe, and Marilyn B. Mets. "The Effect of Amblyopia on Visual-Auditory Speech Perception." JAMA Ophthalmology 133, no. 1 (January 1, 2015): 11. http://dx.doi.org/10.1001/jamaophthalmol.2014.3307.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Grabowecky, Marcia, Emmanuel Guzman-Martinez, Laura Ortega, and Satoru Suzuki. "An invisible speaker can facilitate auditory speech perception." Seeing and Perceiving 25 (2012): 148. http://dx.doi.org/10.1163/187847612x647801.

Full text
Abstract:
Watching moving lips facilitates auditory speech perception when the mouth is attended. However, recent evidence suggests that visual attention and awareness are mediated by separate mechanisms. We investigated whether lip movements suppressed from visual awareness can facilitate speech perception. We used a word categorization task in which participants listened to spoken words and determined as quickly and accurately as possible whether or not each word named a tool. While participants listened to the words they watched a visual display that presented a video clip of the speaker synchronously speaking the auditorily presented words, or the same speaker articulating different words. Critically, the speaker’s face was either visible (the aware trials), or suppressed from awareness using continuous flash suppression. Aware and suppressed trials were randomly intermixed. A secondary probe-detection task ensured that participants attended to the mouth region regardless of whether the face was visible or suppressed. On the aware trials responses to the tool targets were no faster with the synchronous than asynchronous lip movements, perhaps because the visual information was inconsistent with the auditory information on 50% of the trials. However, on the suppressed trials responses to the tool targets were significantly faster with the synchronous than asynchronous lip movements. These results demonstrate that even when a random dynamic mask renders a face invisible, lip movements are processed by the visual system with sufficiently high temporal resolution to facilitate speech perception.
APA, Harvard, Vancouver, ISO, and other styles
41

Atcherson, Samuel R., Lisa Lucks Mendel, Wesley J. Baltimore, Chhayakanta Patro, Sungmin Lee, Monique Pousson, and M. Joshua Spann. "The Effect of Conventional and Transparent Surgical Masks on Speech Understanding in Individuals with and without Hearing Loss." Journal of the American Academy of Audiology 28, no. 01 (January 2017): 058–67. http://dx.doi.org/10.3766/jaaa.15151.

Full text
Abstract:
It is generally well known that speech perception is often improved with integrated audiovisual input, whether in quiet or in noise. In many health-care environments, however, conventional surgical masks block visual access to the mouth and obscure other potential facial cues. In addition, these environments can be noisy. Although these masks may not alter the acoustic properties, the presence of noise in addition to the lack of visual input can have a deleterious effect on speech understanding. A transparent (“see-through”) surgical mask may help to overcome this issue. The purpose of this study was to compare the effect of noise and various visual input conditions on speech understanding for listeners with normal hearing (NH) and with hearing impairment using different surgical masks. Participants were assigned to one of three groups based on hearing sensitivity in this quasi-experimental, cross-sectional study. A total of 31 adults participated: one talker, ten listeners with NH, ten listeners with moderate sensorineural hearing loss, and ten listeners with severe-to-profound hearing loss. Selected lists from the Connected Speech Test were digitally recorded with and without surgical masks and then presented to the listeners at 65 dB HL in five conditions against a background of four-talker babble (+10 dB SNR): without a mask (auditory only), without a mask (auditory and visual), with a transparent mask (auditory only), with a transparent mask (auditory and visual), and with a paper mask (auditory only). A significant difference was found in the spectral analyses of the speech stimuli with and without the masks; however, it amounted to no more than ∼2 dB root mean square. Listeners with NH performed consistently well across all conditions. Both groups of listeners with hearing impairment benefitted from visual input from the transparent mask. The magnitude of improvement in speech perception in noise was greatest for the severe-to-profound group. The findings confirm improved speech perception performance in noise for listeners with hearing impairment when visual input is provided using a transparent surgical mask. Most importantly, the use of the transparent mask did not negatively affect speech perception performance in noise.
APA, Harvard, Vancouver, ISO, and other styles
42

Sánchez-García, Carolina, Sonia Kandel, Christophe Savariaux, Nara Ikumi, and Salvador Soto-Faraco. "Time course of audio–visual phoneme identification: A cross-modal gating study." Seeing and Perceiving 25 (2012): 194. http://dx.doi.org/10.1163/187847612x648233.

Full text
Abstract:
When both are present, visual and auditory information are combined in order to decode the speech signal. Past research has addressed to what extent visual information contributes to distinguishing confusable speech sounds, but it has usually ignored the continuous nature of speech perception. Here we tap into the temporal course of the contribution of visual and auditory information during the process of speech perception. To this end, we designed an audio–visual gating task with videos recorded with a high-speed camera. Participants were asked to identify gradually longer fragments of pseudowords varying in the central consonant. Different Spanish consonant phonemes with different degrees of visual and acoustic saliency were included and tested on visual-only, auditory-only and audio–visual trials. The data showed different patterns of contribution of unimodal and bimodal information during identification, depending on the visual saliency of the presented phonemes. In particular, for phonemes which are clearly more salient in one modality than in the other, audio–visual performance equals that of the best unimodal condition. For phonemes with more balanced saliency, audio–visual performance was better than in both unimodal conditions. These results shed new light on the temporal course of audio–visual speech integration.
APA, Harvard, Vancouver, ISO, and other styles
43

Lüttke, Claudia S., Alexis Pérez-Bellido, and Floris P. de Lange. "Rapid recalibration of speech perception after experiencing the McGurk illusion." Royal Society Open Science 5, no. 3 (March 2018): 170909. http://dx.doi.org/10.1098/rsos.170909.

Full text
Abstract:
The human brain can quickly adapt to changes in the environment. One example is phonetic recalibration: a speech sound is interpreted differently depending on the visual speech and this interpretation persists in the absence of visual information. Here, we examined the mechanisms of phonetic recalibration. Participants categorized the auditory syllables /aba/ and /ada/, which were sometimes preceded by the so-called McGurk stimuli (in which an /aba/ sound, due to visual /aga/ input, is often perceived as ‘ada’). We found that only one trial of exposure to the McGurk illusion was sufficient to induce a recalibration effect, i.e. an auditory /aba/ stimulus was subsequently more often perceived as ‘ada’. Furthermore, phonetic recalibration took place only when auditory and visual inputs were integrated to ‘ada’ (McGurk illusion). Moreover, this recalibration depended on the sensory similarity between the preceding and current auditory stimulus. Finally, signal detection theoretical analysis showed that McGurk-induced phonetic recalibration resulted in both a criterion shift towards /ada/ and a reduced sensitivity to distinguish between /aba/ and /ada/ sounds. The current study shows that phonetic recalibration is dependent on the perceptual integration of audiovisual information and leads to a perceptual shift in phoneme categorization.
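For readers unfamiliar with the signal detection theory analysis referred to above, a minimal sketch under the standard equal-variance Gaussian model is given below; the hit and false-alarm rates are invented for illustration and are not data from the study.

```python
# Illustrative equal-variance Gaussian SDT computation (not the study's code or data):
# sensitivity d' = z(hit rate) - z(false-alarm rate); criterion c = -0.5 * (z(H) + z(FA)).
from scipy.stats import norm

def d_prime_and_criterion(hit_rate, fa_rate):
    z_h, z_fa = norm.ppf(hit_rate), norm.ppf(fa_rate)
    return z_h - z_fa, -0.5 * (z_h + z_fa)

# Hypothetical rates: 'ada' responses to /ada/ (hits) and to /aba/ (false alarms),
# before and after exposure to a McGurk trial.
before = d_prime_and_criterion(hit_rate=0.85, fa_rate=0.10)
after = d_prime_and_criterion(hit_rate=0.88, fa_rate=0.25)
print(f"d' before/after: {before[0]:.2f} / {after[0]:.2f}")  # lower d' = reduced sensitivity
print(f"c  before/after: {before[1]:.2f} / {after[1]:.2f}")  # lower c = criterion shift towards 'ada'
```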
APA, Harvard, Vancouver, ISO, and other styles
44

Callan, Daniel E., Jeffery A. Jones, Kevin Munhall, Christian Kroos, Akiko M. Callan, and Eric Vatikiotis-Bateson. "Multisensory Integration Sites Identified by Perception of Spatial Wavelet Filtered Visual Speech Gesture Information." Journal of Cognitive Neuroscience 16, no. 5 (June 2004): 805–16. http://dx.doi.org/10.1162/089892904970771.

Full text
Abstract:
Perception of speech is improved when presentation of the audio signal is accompanied by concordant visual speech gesture information. This enhancement is most prevalent when the audio signal is degraded. One means by which the brain is thought to afford this perceptual enhancement is the integration of concordant information from multiple sensory channels at common sites of convergence, multisensory integration (MSI) sites. Some studies have identified potential sites in the superior temporal gyrus/sulcus (STG/S) that are responsive to multisensory information from the auditory speech signal and visual speech movement. One limitation of these studies is that they do not control for activity resulting from attentional modulation cued by, for example, visual information signaling the onsets and offsets of the acoustic speech signal, or for activity resulting from MSI of properties of the auditory speech signal with aspects of gross visual motion that are not specific to place of articulation information. This fMRI experiment uses spatial wavelet bandpass filtered Japanese sentences presented with background multispeaker audio noise to discern brain activity reflecting MSI induced by auditory and visual correspondence of place of articulation information, while controlling for activity resulting from the above-mentioned factors. The experiment consists of a low-frequency (LF) filtered condition containing gross visual motion of the lips, jaw, and head without specific place of articulation information, a midfrequency (MF) filtered condition containing place of articulation information, and an unfiltered (UF) condition. Sites of MSI selectively induced by auditory and visual correspondence of place of articulation information were identified by the presence of activity in both the MF and UF conditions relative to the LF condition. Based on these criteria, sites of MSI were found predominantly in the left middle temporal gyrus (MTG) and the left STG/S (including the auditory cortex). By controlling for additional factors that could also induce greater activity resulting from visual motion information, this study identifies potential MSI sites that we believe are involved in improved speech intelligibility.
APA, Harvard, Vancouver, ISO, and other styles
45

Stacey, Jemaine E., Christina J. Howard, Suvobrata Mitra, and Paula C. Stacey. "Audio-visual integration in noise: Influence of auditory and visual stimulus degradation on eye movements and perception of the McGurk effect." Attention, Perception, & Psychophysics 82, no. 7 (June 12, 2020): 3544–57. http://dx.doi.org/10.3758/s13414-020-02042-x.

Full text
Abstract:
Seeing a talker’s face can aid audiovisual (AV) integration when speech is presented in noise. However, few studies have simultaneously manipulated auditory and visual degradation. We aimed to establish how degrading the auditory and visual signal affected AV integration. Where people look on the face in this context is also of interest; Buchan, Paré and Munhall (Brain Research, 1242, 162–171, 2008) found fixations on the mouth increased in the presence of auditory noise whilst Wilson, Alsius, Paré and Munhall (Journal of Speech, Language, and Hearing Research, 59(4), 601–615, 2016) found mouth fixations decreased with decreasing visual resolution. In Condition 1, participants listened to clear speech, and in Condition 2, participants listened to vocoded speech designed to simulate the information provided by a cochlear implant. Speech was presented in three levels of auditory noise and three levels of visual blurring. Adding noise to the auditory signal increased McGurk responses, while blurring the visual signal decreased McGurk responses. Participants fixated the mouth more on trials when the McGurk effect was perceived. Adding auditory noise led to people fixating the mouth more, while visual degradation led to people fixating the mouth less. Combined, the results suggest that modality preference and where people look during AV integration of incongruent syllables varies according to the quality of information available.
APA, Harvard, Vancouver, ISO, and other styles
46

Oh, Yonghee, Nicole Kalpin, and Jessica Hunter. "The impact of temporally coherent visual and vibrotactile cues on speech perception in noise performance." Journal of the Acoustical Society of America 151, no. 4 (April 2022): A221. http://dx.doi.org/10.1121/10.0011118.

Full text
Abstract:
The inputs delivered to different sensory organs provide us with complementary information about the environment. Our recent study demonstrated that presenting abstract visual information about speech envelopes substantially improves speech perception ability in normal-hearing (NH) listeners [Yuan et al., J. Acoust. Soc. Am. (2020)]. The purpose of this study was to extend this audiovisual speech perception benefit to the tactile domain. Twenty adults participated in sentence recognition threshold measurements in four different sensory modalities (AO: auditory-only; AV: auditory-visual; AT: audio-tactile; and AVT: audio-visual-tactile). The level of the target sentences [CRM speech corpus, Bolia et al., J. Acoust. Soc. Am. (2000)] was fixed at 60 dBA, and the masker (speech-shaped noise) levels were adaptively varied to find masked thresholds. The amplitudes of the visual and vibrotactile stimuli were either temporally synchronized or non-synchronized with the target speech envelope for comparison. Results show that temporally coherent multi-modal stimulation (AV, AT, and AVT) significantly improves speech perception ability when compared to audio-only (AO) stimulation. These multisensory speech perception benefits were reduced when the cross-modal temporal coherence was eliminated. These findings suggest that multisensory interactions are fundamentally important for speech perception ability in NH listeners, and that the outcome of this multisensory speech processing depends strongly on the temporal coherence between the multi-modal sensory inputs.
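Adaptive masker-level tracking of the kind described above can be implemented in many ways; as a sketch only (the authors' actual tracking rule, step size, and stopping criterion are not stated in the abstract), the code below shows a generic one-up/one-down staircase that converges on the 50% correct point, with a hypothetical present_trial function standing in for running and scoring one sentence trial.

```python
# Generic 1-up/1-down adaptive staircase for estimating a masked speech threshold.
# Purely illustrative: not the study's procedure. present_trial is a hypothetical
# placeholder that returns True when the listener's response is scored correct.
import random

def present_trial(masker_level_db):
    # Placeholder psychometric behaviour: more correct responses at lower masker levels.
    p_correct = 1.0 / (1.0 + 10 ** ((masker_level_db - 55.0) / 5.0))
    return random.random() < p_correct

def masked_threshold(start_level_db=45.0, step_db=2.0, n_reversals=8):
    level, direction, reversals = start_level_db, None, []
    while len(reversals) < n_reversals:
        new_direction = +step_db if present_trial(level) else -step_db  # harder after a correct trial
        if direction is not None and new_direction != direction:
            reversals.append(level)                                     # record a reversal
        direction = new_direction
        level += new_direction
    return sum(reversals[-6:]) / len(reversals[-6:])                    # average the last reversals

print(f"Estimated masked threshold: {masked_threshold():.1f} dB masker level")
```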
APA, Harvard, Vancouver, ISO, and other styles
47

Conrey, Brianna, and David B. Pisoni. "Auditory-visual speech perception and synchrony detection for speech and nonspeech signals." Journal of the Acoustical Society of America 119, no. 6 (June 2006): 4065–73. http://dx.doi.org/10.1121/1.2195091.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Sommers, Mitchell S., Nancy Tye-Murray, and Brent Spehar. "Auditory-Visual Speech Perception and Auditory-Visual Enhancement in Normal-Hearing Younger and Older Adults." Ear and Hearing 26, no. 3 (June 2005): 263–75. http://dx.doi.org/10.1097/00003446-200506000-00003.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

JERGER, SUSAN, MARKUS F. DAMIAN, NANCY TYE-MURRAY, and HERVÉ ABDI. "Children perceive speech onsets by ear and eye." Journal of Child Language 44, no. 1 (January 11, 2016): 185–215. http://dx.doi.org/10.1017/s030500091500077x.

Full text
Abstract:
Adults use vision to perceive low-fidelity speech; yet how children acquire this ability is not well understood. The literature indicates that children show reduced sensitivity to visual speech from kindergarten to adolescence. We hypothesized that this pattern reflects the effects of complex tasks and a growth period with harder-to-utilize cognitive resources, not a lack of sensitivity. We investigated sensitivity to visual speech in children via the phonological priming produced by low-fidelity (non-intact onset) auditory speech presented audiovisually (see dynamic face articulate consonant/rhyme: b/ag; hear non-intact onset/rhyme: –b/ag) vs. auditorily (see still face; hear exactly the same auditory input). Audiovisual speech produced greater priming from four to fourteen years, indicating that visual speech filled in the non-intact auditory onsets. The influence of visual speech depended uniquely on phonology and speechreading. Children – like adults – perceive speech onsets multimodally. Findings are critical for incorporating visual speech into developmental theories of speech perception.
APA, Harvard, Vancouver, ISO, and other styles
50

Bosker, Hans Rutger, David Peeters, and Judith Holler. "How visual cues to speech rate influence speech perception." Quarterly Journal of Experimental Psychology 73, no. 10 (April 20, 2020): 1523–36. http://dx.doi.org/10.1177/1747021820914564.

Full text
Abstract:
Spoken words are highly variable and therefore listeners interpret speech sounds relative to the surrounding acoustic context, such as the speech rate of a preceding sentence. For instance, a vowel midway between short /ɑ/ and long /a:/ in Dutch is perceived as short /ɑ/ in the context of preceding slow speech, but as long /a:/ if preceded by a fast context. Despite the well-established influence of visual articulatory cues on speech comprehension, it remains unclear whether visual cues to speech rate also influence subsequent spoken word recognition. In two “Go Fish”–like experiments, participants were presented with audio-only (auditory speech + fixation cross), visual-only (mute videos of talking head), and audiovisual (speech + videos) context sentences, followed by ambiguous target words containing vowels midway between short /ɑ/ and long /a:/. In Experiment 1, target words were always presented auditorily, without visual articulatory cues. Although the audio-only and audiovisual contexts induced a rate effect (i.e., more long /a:/ responses after fast contexts), the visual-only condition did not. When, in Experiment 2, target words were presented audiovisually, rate effects were observed in all three conditions, including visual-only. This suggests that visual cues to speech rate in a context sentence influence the perception of following visual target cues (e.g., duration of lip aperture), which at an audiovisual integration stage bias participants’ target categorisation responses. These findings contribute to a better understanding of how what we see influences what we hear.
APA, Harvard, Vancouver, ISO, and other styles
