To see other types of publications on this topic, follow the link: Perception/McGurk effect.

Journal articles on the topic 'Perception/McGurk effect'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Perception/McGurk effect.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Möttönen, Riikka, Kaisa Tiippana, Mikko Sams, and Hanna Puharinen. "Sound Location Can Influence Audiovisual Speech Perception When Spatial Attention Is Manipulated." Seeing and Perceiving 24, no. 1 (2011): 67–90. http://dx.doi.org/10.1163/187847511x557308.

Abstract:
Audiovisual speech perception has been considered to operate independent of sound location, since the McGurk effect (altered auditory speech perception caused by conflicting visual speech) has been shown to be unaffected by whether speech sounds are presented in the same or different location as a talking face. Here we show that sound location effects arise with manipulation of spatial attention. Sounds were presented from loudspeakers in five locations: the centre (location of the talking face) and 45°/90° to the left/right. Auditory spatial attention was focused on a location by presenting the majority (90%) of sounds from this location. In Experiment 1, the majority of sounds emanated from the centre, and the McGurk effect was enhanced there. In Experiment 2, the major location was 90° to the left, causing the McGurk effect to be stronger on the left and centre than on the right. Under control conditions, when sounds were presented with equal probability from all locations, the McGurk effect tended to be stronger for sounds emanating from the centre, but this tendency was not reliable. Additionally, reaction times were the shortest for a congruent audiovisual stimulus, and this was the case independent of location. Our main finding is that sound location can modulate audiovisual speech perception, and that spatial attention plays a role in this modulation.
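For readers who want a concrete sense of the attention manipulation described in this abstract, the sketch below builds a trial list in which 90% of sounds come from an attended loudspeaker location. It is a minimal illustration in Python, not the authors' experiment code; the location labels, trial count, and helper name make_trial_list are assumptions.

```python
import random

# Illustrative sketch of the spatial-attention manipulation described above:
# five loudspeaker locations, with 90% of sounds presented from the attended one.
# Counts and labels are assumptions for illustration, not the authors' design files.
LOCATIONS = ["left90", "left45", "centre", "right45", "right90"]

def make_trial_list(attended="centre", n_trials=200, p_attended=0.9, seed=1):
    rng = random.Random(seed)
    others = [loc for loc in LOCATIONS if loc != attended]
    trials = []
    for _ in range(n_trials):
        if rng.random() < p_attended:
            trials.append(attended)            # majority location focuses attention
        else:
            trials.append(rng.choice(others))  # rare trials probe the other locations
    rng.shuffle(trials)
    return trials

if __name__ == "__main__":
    trials = make_trial_list()
    print({loc: trials.count(loc) for loc in LOCATIONS})
```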
2

Magnotti, John F., Debshila Basu Mallick, and Michael S. Beauchamp. "Reducing Playback Rate of Audiovisual Speech Leads to a Surprising Decrease in the McGurk Effect." Multisensory Research 31, no. 1-2 (2018): 19–38. http://dx.doi.org/10.1163/22134808-00002586.

Abstract:
We report the unexpected finding that slowing video playback decreases perception of the McGurk effect. This reduction is counter-intuitive because the illusion depends on visual speech influencing the perception of auditory speech, and slowing speech should increase the amount of visual information available to observers. We recorded perceptual data from 110 subjects viewing audiovisual syllables (either McGurk or congruent control stimuli) played back at one of three rates: the rate used by the talker during recording (the natural rate), a slow rate (50% of natural), or a fast rate (200% of natural). We replicated previous studies showing dramatic variability in McGurk susceptibility at the natural rate, ranging from 0–100% across subjects and from 26–76% across the eight McGurk stimuli tested. Relative to the natural rate, slowed playback reduced the frequency of McGurk responses by 11% (79% of subjects showed a reduction) and reduced congruent accuracy by 3% (25% of subjects showed a reduction). Fast playback rate had little effect on McGurk responses or congruent accuracy. To determine whether our results are consistent with Bayesian integration, we constructed a Bayes-optimal model that incorporated two assumptions: individuals combine auditory and visual information according to their reliability, and changing playback rate affects sensory reliability. The model reproduced both our findings of large individual differences and the playback rate effect. This work illustrates that surprises remain in the McGurk effect and that Bayesian integration provides a useful framework for understanding audiovisual speech perception.
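The Bayesian account summarised in this abstract rests on reliability-weighted cue combination. The following sketch illustrates that principle on a toy one-dimensional phonetic axis; it is not the authors' fitted model, and the axis coding, the reliability values, and the assumption that slowed playback lowers visual reliability are all illustrative choices.

```python
import numpy as np

# Toy reliability-weighted (precision-weighted) fusion on a 1-D
# place-of-articulation axis: /ba/ = 0.0, /da/ = 0.5, /ga/ = 1.0.
# The axis and the reliability values are illustrative assumptions.
def fuse(x_aud, x_vis, rel_aud, rel_vis):
    """Optimal cue combination: precision-weighted average of the two cues."""
    return (rel_aud * x_aud + rel_vis * x_vis) / (rel_aud + rel_vis)

aud_ba, vis_ga = 0.0, 1.0
conditions = [("natural rate", 4.0), ("slowed playback (assumed lower visual reliability)", 1.0)]
for label, rel_vis in conditions:
    percept = fuse(aud_ba, vis_ga, rel_aud=4.0, rel_vis=rel_vis)
    print(f"{label}: fused estimate = {percept:.2f} (0=/ba/, 0.5=/da/, 1=/ga/)")
```

With equal reliabilities the fused estimate sits at the /da/-like midpoint (the illusion); lowering the visual weight pulls it back toward the auditory /ba/, i.e., fewer McGurk responses.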
3

Omata, Kei, and Ken Mogi. "Fusion and combination in audio-visual integration." Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 464, no. 2090 (November 27, 2007): 319–40. http://dx.doi.org/10.1098/rspa.2007.1910.

Abstract:
Language is essentially multi-modal in its sensory origin, the daily conversation depending heavily on the audio-visual (AV) information. Although the perception of spoken language is primarily dominated by audition, the perception of facial expression, particularly that of the mouth, helps us comprehend speech. The McGurk effect is a striking phenomenon where the perceived phoneme is affected by the simultaneous observation of lip movement, and probably reflects the underlying AV integration process. The elucidation of the principles involved in this unique perceptual anomaly poses an interesting problem. Here we study the nature of the McGurk effect by means of neural networks (self-organizing maps, SOM) designed to extract patterns inherent in audio and visual stimuli. It is shown that a McGurk effect-like classification of incoming information occurs without any additional constraint or procedure added to the network, suggesting that the anomaly is a consequence of the AV integration process. Within this framework, an explanation is given for the asymmetric effect of AV pairs in causing the McGurk effect (fusion or combination) based on the ‘distance’ relationship between audio or visual information within the SOM. Our result reveals some generic features of the cognitive process of phoneme perception, and AV sensory integration in general.
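As a rough, generic illustration of the self-organising-map idea mentioned above (not the authors' network, stimuli, or feature coding), the sketch below trains a tiny one-dimensional SOM on random two-dimensional vectors standing in for joint audio-visual features; the map size and learning schedule are arbitrary assumptions.

```python
import numpy as np

# Minimal 1-D self-organizing map (SOM) sketch. Inputs are 2-D vectors that
# stand in for joint audio/visual features; the coding is purely illustrative.
rng = np.random.default_rng(0)
n_units, dim = 10, 2
weights = rng.random((n_units, dim))           # map unit prototypes
data = rng.random((500, dim))                  # surrogate AV feature vectors

def train(weights, data, epochs=20, lr0=0.5, sigma0=3.0):
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)        # decaying learning rate
        sigma = sigma0 * (1 - epoch / epochs) + 0.5
        for x in data:
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))   # best-matching unit
            dist = np.abs(np.arange(n_units) - bmu)                # grid distance to BMU
            h = np.exp(-(dist ** 2) / (2 * sigma ** 2))            # neighbourhood function
            weights += lr * h[:, None] * (x - weights)             # pull units toward input
    return weights

weights = train(weights, data)
print("trained prototypes:\n", np.round(weights, 2))
```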
4

Alsius, Agnès, Martin Paré, and Kevin G. Munhall. "Forty Years After Hearing Lips and Seeing Voices: the McGurk Effect Revisited." Multisensory Research 31, no. 1-2 (2018): 111–44. http://dx.doi.org/10.1163/22134808-00002565.

Abstract:
Since its discovery 40 years ago, the McGurk illusion has been usually cited as a prototypical paradigmatic case of multisensory binding in humans, and has been extensively used in speech perception studies as a proxy measure for audiovisual integration mechanisms. Despite the well-established practice of using the McGurk illusion as a tool for studying the mechanisms underlying audiovisual speech integration, the magnitude of the illusion varies enormously across studies. Furthermore, the processing of McGurk stimuli differs from congruent audiovisual processing at both phenomenological and neural levels. This questions the suitability of this illusion as a tool to quantify the necessary and sufficient conditions under which audiovisual integration occurs in natural conditions. In this paper, we review some of the practical and theoretical issues related to the use of the McGurk illusion as an experimental paradigm. We believe that, without a richer understanding of the mechanisms involved in the processing of the McGurk effect, experimenters should be really cautious when generalizing data generated by McGurk stimuli to matching audiovisual speech events.
5

MacDonald, John. "Hearing Lips and Seeing Voices: the Origins and Development of the ‘McGurk Effect’ and Reflections on Audio–Visual Speech Perception Over the Last 40 Years." Multisensory Research 31, no. 1-2 (2018): 7–18. http://dx.doi.org/10.1163/22134808-00002548.

Abstract:
In 1976 Harry McGurk and I published a paper in Nature, entitled ‘Hearing Lips and Seeing Voices’. The paper described a new audio–visual illusion we had discovered that showed the perception of auditorily presented speech could be influenced by the simultaneous presentation of incongruent visual speech. This hitherto unknown effect has since had a profound impact on audiovisual speech perception research. The phenomenon has come to be known as the ‘McGurk effect’, and the original paper has been cited in excess of 4800 times. In this paper I describe the background to the discovery of the effect, the rationale for the generation of the initial stimuli, the construction of the exemplars used and the serendipitous nature of the finding. The paper will also cover the reaction (and non-reaction) to the Nature publication, the growth of research on, and utilizing the ‘McGurk effect’ and end with some reflections on the significance of the finding.
6

Lüttke, Claudia S., Alexis Pérez-Bellido, and Floris P. de Lange. "Rapid recalibration of speech perception after experiencing the McGurk illusion." Royal Society Open Science 5, no. 3 (March 2018): 170909. http://dx.doi.org/10.1098/rsos.170909.

Abstract:
The human brain can quickly adapt to changes in the environment. One example is phonetic recalibration: a speech sound is interpreted differently depending on the visual speech and this interpretation persists in the absence of visual information. Here, we examined the mechanisms of phonetic recalibration. Participants categorized the auditory syllables /aba/ and /ada/, which were sometimes preceded by the so-called McGurk stimuli (in which an /aba/ sound, due to visual /aga/ input, is often perceived as ‘ada’). We found that only one trial of exposure to the McGurk illusion was sufficient to induce a recalibration effect, i.e. an auditory /aba/ stimulus was subsequently more often perceived as ‘ada’. Furthermore, phonetic recalibration took place only when auditory and visual inputs were integrated to ‘ada’ (McGurk illusion). Moreover, this recalibration depended on the sensory similarity between the preceding and current auditory stimulus. Finally, signal detection theoretical analysis showed that McGurk-induced phonetic recalibration resulted in both a criterion shift towards /ada/ and a reduced sensitivity to distinguish between /aba/ and /ada/ sounds. The current study shows that phonetic recalibration is dependent on the perceptual integration of audiovisual information and leads to a perceptual shift in phoneme categorization.
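The criterion-shift and sensitivity results described here come from standard equal-variance signal detection theory. The snippet below shows the usual computation of d' and the criterion c from hit and false-alarm counts; the response counts are invented placeholders, not data from the study.

```python
from scipy.stats import norm

# Standard equal-variance signal detection measures, with invented counts.
# Treat "ada" responses to /ada/ sounds as hits and "ada" responses to /aba/
# sounds as false alarms; the numbers below are placeholders, not study data.
def sdt(hits, misses, false_alarms, correct_rejections):
    hit_rate = (hits + 0.5) / (hits + misses + 1)                # log-linear correction
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)             # sensitivity
    criterion = -0.5 * (norm.ppf(hit_rate) + norm.ppf(fa_rate))  # response bias
    return d_prime, criterion

print(sdt(hits=70, misses=30, false_alarms=20, correct_rejections=80))
```

In this convention a lower (more liberal) c means more 'ada' responses, which is the direction of the criterion shift reported in the abstract, while a drop in d' corresponds to the reduced sensitivity.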
7

Lindborg, Alma, and Tobias S. Andersen. "Bayesian binding and fusion models explain illusion and enhancement effects in audiovisual speech perception." PLOS ONE 16, no. 2 (February 19, 2021): e0246986. http://dx.doi.org/10.1371/journal.pone.0246986.

Abstract:
Speech is perceived with both the ears and the eyes. Adding congruent visual speech improves the perception of a faint auditory speech stimulus, whereas adding incongruent visual speech can alter the perception of the utterance. The latter phenomenon is the case of the McGurk illusion, where an auditory stimulus such as e.g. “ba” dubbed onto a visual stimulus such as “ga” produces the illusion of hearing “da”. Bayesian models of multisensory perception suggest that both the enhancement and the illusion case can be described as a two-step process of binding (informed by prior knowledge) and fusion (informed by the information reliability of each sensory cue). However, there is to date no study which has accounted for how they each contribute to audiovisual speech perception. In this study, we expose subjects to both congruent and incongruent audiovisual speech, manipulating the binding and the fusion stages simultaneously. This is done by varying both temporal offset (binding) and auditory and visual signal-to-noise ratio (fusion). We fit two Bayesian models to the behavioural data and show that they can both account for the enhancement effect in congruent audiovisual speech, as well as the McGurk illusion. This modelling approach allows us to disentangle the effects of binding and fusion on behavioural responses. Moreover, we find that these models have greater predictive power than a forced fusion model. This study provides a systematic and quantitative approach to measuring audiovisual integration in the perception of the McGurk illusion as well as congruent audiovisual speech, which we hope will inform future work on audiovisual speech perception.
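To make the binding-then-fusion idea concrete, here is a toy two-stage sketch in which the probability of a common audiovisual cause falls off with temporal offset (binding) and the bound percept is a precision-weighted average of the two cues (fusion). It is not the authors' model; the Gaussian synchrony likelihood, the prior, the reliabilities, and the one-dimensional phonetic axis are all assumptions for illustration.

```python
import numpy as np

# Toy two-stage model: binding probability falls off with audiovisual temporal
# offset, and the bound percept is a precision-weighted fusion of the cues.
# Parameters and the 1-D phonetic axis (0=/ba/, 0.5=/da/, 1=/ga/) are illustrative.
def p_common(offset_ms, width_ms=150.0, prior=0.7):
    likelihood = np.exp(-0.5 * (offset_ms / width_ms) ** 2)   # synchrony likelihood
    return prior * likelihood / (prior * likelihood + (1 - prior))

def fuse(x_aud, x_vis, rel_aud, rel_vis):
    return (rel_aud * x_aud + rel_vis * x_vis) / (rel_aud + rel_vis)

def percept(x_aud, x_vis, rel_aud, rel_vis, offset_ms):
    pc = p_common(offset_ms)
    fused = fuse(x_aud, x_vis, rel_aud, rel_vis)
    return pc * fused + (1 - pc) * x_aud       # model averaging: bound vs. audio-alone

for offset in (0, 150, 300):
    print(offset, "ms offset ->", round(percept(0.0, 1.0, rel_aud=2.0, rel_vis=2.0, offset_ms=offset), 2))
```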
8

Lu, Hong, and Chaochao Pan. "The McGurk effect in self-recognition of people with schizophrenia." Social Behavior and Personality: an international journal 48, no. 6 (June 2, 2020): 1–8. http://dx.doi.org/10.2224/sbp.9219.

Abstract:
The McGurk effect is a robust illusion phenomenon in the perception of speech; however, there is little research on its demonstration in nonverbal domains. Thus, we tested for the McGurk effect in the context of self-recognition. We presented a group of people with schizophrenia and a control group of people without mental illnesses, with 2 videos accompanied by a soundtrack featuring different identity information. The first video had a matched face and voice; the other featured conflicting face–voice information. The participants judged if the voice in the video was their own or someone else's. The results show there was a robust McGurk effect in self-recognition, which was stronger among participants with schizophrenia because of the influence of self-disorder. Further, people with schizophrenia were less accurate in voice self-recognition when there was conflicting face–voice identity information. Thus, presenting audiovisual-consistent information is conducive to information processing for people with schizophrenia.
9

Walker, Grant M., Patrick Sarahan Rollo, Nitin Tandon, and Gregory Hickok. "Effect of Bilateral Opercular Syndrome on Speech Perception." Neurobiology of Language 2, no. 3 (2021): 335–53. http://dx.doi.org/10.1162/nol_a_00037.

Abstract:
Speech perception ability and structural neuroimaging were investigated in two cases of bilateral opercular syndrome. Due to bilateral ablation of the motor control center for the lower face and surrounds, these rare cases provide an opportunity to evaluate the necessity of cortical motor representations for speech perception, a cornerstone of some neurocomputational theories of language processing. Speech perception, including audiovisual integration (i.e., the McGurk effect), was mostly unaffected in these cases, although verbal short-term memory impairment hindered performance on several tasks that are traditionally used to evaluate speech perception. The results suggest that the role of the cortical motor system in speech perception is context-dependent and supplementary, not inherent or necessary.
10

Sams, M. "Audiovisual Speech Perception." Perception 26, no. 1_suppl (August 1997): 347. http://dx.doi.org/10.1068/v970029.

Abstract:
Persons with hearing loss use visual information from articulation to improve their speech perception. Even persons with normal hearing utilise visual information, especially when the stimulus-to-noise ratio is poor. A dramatic demonstration of the role of vision in speech perception is the audiovisual fusion called the ‘McGurk effect’. When the auditory syllable /pa/ is presented in synchrony with the face articulating the syllable /ka/, the subject usually perceives /ta/ or /ka/. The illusory perception is clearly auditory in nature. We recently studied the audiovisual fusion (acoustical /p/, visual /k/) for Finnish (1) syllables, and (2) words. Only 3% of the subjects perceived the syllables according to the acoustical input, ie in 97% of the subjects the perception was influenced by the visual information. For words the percentage of acoustical identifications was 10%. The results demonstrate a very strong influence of visual information of articulation in face-to-face speech perception. Word meaning and sentence context have a negligible influence on the fusion. We have also recorded neuromagnetic responses of the human cortex when the subjects both heard and saw speech. Some subjects showed a distinct response to a ‘McGurk’ stimulus. The response was rather late, emerging about 200 ms from the onset of the auditory stimulus. We suggest that the perisylvian cortex, close to the source area for the auditory 100 ms response (M100), may be activated by the discordant stimuli. The behavioural and neuromagnetic results suggest a precognitive audiovisual speech integration occurring at a relatively early processing level.
11

Matchin, William, Kier Groulx, and Gregory Hickok. "Audiovisual Speech Integration Does Not Rely on the Motor System: Evidence from Articulatory Suppression, the McGurk Effect, and fMRI." Journal of Cognitive Neuroscience 26, no. 3 (March 2014): 606–20. http://dx.doi.org/10.1162/jocn_a_00515.

Abstract:
Visual speech influences the perception of heard speech. A classic example of this is the McGurk effect, whereby an auditory /pa/ overlaid onto a visual /ka/ induces the fusion percept of /ta/. Recent behavioral and neuroimaging research has highlighted the importance of both articulatory representations and motor speech regions of the brain, particularly Broca's area, in audiovisual (AV) speech integration. Alternatively, AV speech integration may be accomplished by the sensory system through multisensory integration in the posterior STS. We assessed the claims regarding the involvement of the motor system in AV integration in two experiments: (i) examining the effect of articulatory suppression on the McGurk effect and (ii) determining if motor speech regions show an AV integration profile. The hypothesis regarding experiment (i) is that if the motor system plays a role in McGurk fusion, distracting the motor system through articulatory suppression should result in a reduction of McGurk fusion. The results of experiment (i) showed that articulatory suppression results in no such reduction, suggesting that the motor system is not responsible for the McGurk effect. The hypothesis of experiment (ii) was that if the brain activation to AV speech in motor regions (such as Broca's area) reflects AV integration, the profile of activity should reflect AV integration: AV > AO (auditory only) and AV > VO (visual only). The results of experiment (ii) demonstrate that motor speech regions do not show this integration profile, whereas the posterior STS does. Instead, activity in motor regions is task dependent. The combined results suggest that AV speech integration does not rely on the motor system.
12

Han, Yueqiao, Martijn Goudbeek, Maria Mos, and Marc Swerts. "Relative Contribution of Auditory and Visual Information to Mandarin Chinese Tone Identification by Native and Tone-naïve Listeners." Language and Speech 63, no. 4 (December 30, 2019): 856–76. http://dx.doi.org/10.1177/0023830919889995.

Abstract:
Speech perception is a multisensory process: what we hear can be affected by what we see. For instance, the McGurk effect occurs when auditory speech is presented in synchrony with discrepant visual information. A large number of studies have targeted the McGurk effect at the segmental level of speech (mainly consonant perception), which tends to be visually salient (lip-reading based), while the present study aims to extend the existing body of literature to the suprasegmental level, that is, investigating a McGurk effect for the identification of tones in Mandarin Chinese. Previous studies have shown that visual information does play a role in Chinese tone perception, and that the different tones correlate with variable movements of the head and neck. We constructed various tone combinations of congruent and incongruent auditory-visual materials (10 syllables with 16 tone combinations each) and presented them to native speakers of Mandarin Chinese and speakers of tone-naïve languages. In line with our previous work, we found that tone identification varies with individual tones, with tone 3 (the low-dipping tone) being the easiest one to identify, whereas tone 4 (the high-falling tone) was the most difficult one. We found that both groups of participants mainly relied on auditory input (instead of visual input), and that the auditory reliance for Chinese subjects was even stronger. The results did not show evidence for auditory-visual integration among native participants, while visual information is helpful for tone-naïve participants. However, even for this group, visual information only marginally increases the accuracy in the tone identification task, and this increase depends on the tone in question.
13

Lüttke, Claudia S., Matthias Ekman, Marcel A. J. van Gerven, and Floris P. de Lange. "Preference for Audiovisual Speech Congruency in Superior Temporal Cortex." Journal of Cognitive Neuroscience 28, no. 1 (January 2016): 1–7. http://dx.doi.org/10.1162/jocn_a_00874.

Abstract:
Auditory speech perception can be altered by concurrent visual information. The superior temporal cortex is an important combining site for this integration process. This area was previously found to be sensitive to audiovisual congruency. However, the direction of this congruency effect (i.e., stronger or weaker activity for congruent compared to incongruent stimulation) has been more equivocal. Here, we used fMRI to look at the neural responses of human participants during the McGurk illusion—in which auditory /aba/ and visual /aga/ inputs are fused to perceived /ada/—in a large homogenous sample of participants who consistently experienced this illusion. This enabled us to compare the neuronal responses during congruent audiovisual stimulation with incongruent audiovisual stimulation leading to the McGurk illusion while avoiding the possible confounding factor of sensory surprise that can occur when McGurk stimuli are only occasionally perceived. We found larger activity for congruent audiovisual stimuli than for incongruent (McGurk) stimuli in bilateral superior temporal cortex, extending into the primary auditory cortex. This finding suggests that superior temporal cortex prefers when auditory and visual input support the same representation.
14

Stacey, Jemaine E., Christina J. Howard, Suvobrata Mitra, and Paula C. Stacey. "Audio-visual integration in noise: Influence of auditory and visual stimulus degradation on eye movements and perception of the McGurk effect." Attention, Perception, & Psychophysics 82, no. 7 (June 12, 2020): 3544–57. http://dx.doi.org/10.3758/s13414-020-02042-x.

Abstract:
Seeing a talker’s face can aid audiovisual (AV) integration when speech is presented in noise. However, few studies have simultaneously manipulated auditory and visual degradation. We aimed to establish how degrading the auditory and visual signal affected AV integration. Where people look on the face in this context is also of interest; Buchan, Paré and Munhall (Brain Research, 1242, 162–171, 2008) found fixations on the mouth increased in the presence of auditory noise whilst Wilson, Alsius, Paré and Munhall (Journal of Speech, Language, and Hearing Research, 59(4), 601–615, 2016) found mouth fixations decreased with decreasing visual resolution. In Condition 1, participants listened to clear speech, and in Condition 2, participants listened to vocoded speech designed to simulate the information provided by a cochlear implant. Speech was presented in three levels of auditory noise and three levels of visual blurring. Adding noise to the auditory signal increased McGurk responses, while blurring the visual signal decreased McGurk responses. Participants fixated the mouth more on trials when the McGurk effect was perceived. Adding auditory noise led to people fixating the mouth more, while visual degradation led to people fixating the mouth less. Combined, the results suggest that modality preference and where people look during AV integration of incongruent syllables varies according to the quality of information available.
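For the auditory side of such degradation manipulations, a common generic recipe is to mix the speech waveform with noise at a chosen signal-to-noise ratio. The sketch below shows that recipe with a synthetic tone as a stand-in for speech; it is not the stimulus code used in the study, and the SNR values and function name add_noise are placeholders.

```python
import numpy as np

# Mix a signal with white noise at a target SNR in dB (generic sketch,
# not the stimulus-preparation code used in the study).
def add_noise(signal, snr_db, rng=np.random.default_rng(0)):
    noise = rng.standard_normal(signal.shape)
    sig_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = sig_power / (10 ** (snr_db / 10.0))
    noise *= np.sqrt(target_noise_power / noise_power)   # scale noise to hit the target SNR
    return signal + noise

tone = np.sin(2 * np.pi * 220 * np.linspace(0, 1, 16000))  # stand-in for a speech clip
for snr in (10, 0, -10):                                    # placeholder noise levels
    mixed = add_noise(tone, snr)
    print(snr, "dB SNR -> rms", round(float(np.sqrt(np.mean(mixed ** 2))), 3))
```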
15

Ujiie, Yuta, and Kohske Takahashi. "Weaker McGurk Effect for Rubin’s Vase-Type Speech in People With High Autistic Traits." Multisensory Research 34, no. 6 (April 16, 2021): 663–79. http://dx.doi.org/10.1163/22134808-bja10047.

Abstract:
While visual information from facial speech modulates auditory speech perception, it is less influential on audiovisual speech perception among autistic individuals than among typically developed individuals. In this study, we investigated the relationship between autistic traits (Autism-Spectrum Quotient; AQ) and the influence of visual speech on the recognition of Rubin’s vase-type speech stimuli with degraded facial speech information. Participants were 31 university students (13 males and 18 females; mean age: 19.2, SD: 1.13 years) who reported normal (or corrected-to-normal) hearing and vision. All participants completed three speech recognition tasks (visual, auditory, and audiovisual stimuli) and the AQ–Japanese version. The results showed that accuracies of speech recognition for visual (i.e., lip-reading) and auditory stimuli were not significantly related to participants’ AQ. In contrast, audiovisual speech perception was less susceptible to facial speech perception among individuals with high rather than low autistic traits. The weaker influence of visual information on audiovisual speech perception in autism spectrum disorder (ASD) was robust regardless of the clarity of the visual information, suggesting a difficulty in the process of audiovisual integration rather than in the visual processing of facial speech.
16

Kislyuk, Daniel S., Riikka Möttönen, and Mikko Sams. "Visual Processing Affects the Neural Basis of Auditory Discrimination." Journal of Cognitive Neuroscience 20, no. 12 (December 2008): 2175–84. http://dx.doi.org/10.1162/jocn.2008.20152.

Abstract:
The interaction between auditory and visual speech streams is a seamless and surprisingly effective process. An intriguing example is the “McGurk effect”: The acoustic syllable /ba/ presented simultaneously with a mouth articulating /ga/ is typically heard as /da/ [McGurk, H., & MacDonald, J. Hearing lips and seeing voices. Nature, 264, 746–748, 1976]. Previous studies have demonstrated the interaction of auditory and visual streams at the auditory cortex level, but the importance of these interactions for the qualitative perception change remained unclear because the change could result from interactions at higher processing levels as well. In our electroencephalogram experiment, we combined the McGurk effect with mismatch negativity (MMN), a response that is elicited in the auditory cortex at a latency of 100–250 msec by any above-threshold change in a sequence of repetitive sounds. An “odd-ball” sequence of acoustic stimuli consisting of frequent /va/ syllables (standards) and infrequent /ba/ syllables (deviants) was presented to 11 participants. Deviant stimuli in the unisensory acoustic stimulus sequence elicited a typical MMN, reflecting discrimination of acoustic features in the auditory cortex. When the acoustic stimuli were dubbed onto a video of a mouth constantly articulating /va/, the deviant acoustic /ba/ was heard as /va/ due to the McGurk effect and was indistinguishable from the standards. Importantly, such deviants did not elicit MMN, indicating that the auditory cortex failed to discriminate between the acoustic stimuli. Our findings show that visual stream can qualitatively change the auditory percept at the auditory cortex level, profoundly influencing the auditory cortex mechanisms underlying early sound discrimination.
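The odd-ball logic used here (frequent /va/ standards, rare /ba/ deviants, with the MMN read off the deviant-minus-standard difference wave) can be sketched generically as below. The epochs are synthetic noise with a deliberately injected negativity, not EEG data, and the deviant probability and analysis window are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samples = 400, 300                      # 300 samples stands in for a short epoch
labels = rng.choice(["va", "ba"], size=n_trials, p=[0.85, 0.15])  # standards vs. deviants

# Synthetic "ERP" epochs: noise plus, for deviants only, a negative deflection
# in an assumed 100-250 sample window to stand in for the mismatch negativity (MMN).
epochs = rng.standard_normal((n_trials, n_samples))
mmn_window = slice(100, 250)
epochs[labels == "ba", mmn_window] -= 1.0

standard_erp = epochs[labels == "va"].mean(axis=0)
deviant_erp = epochs[labels == "ba"].mean(axis=0)
difference_wave = deviant_erp - standard_erp        # MMN shows up as a negativity here
print("mean difference in MMN window:", round(float(difference_wave[mmn_window].mean()), 2))
```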
17

Narinesingh, Cindy, Michael Wan, Herbert C. Goltz, Manokaraananthan Chandrakumar, and Agnes M. F. Wong. "Audiovisual Perception in Adults With Amblyopia: A Study Using the McGurk Effect." Investigative Opthalmology & Visual Science 55, no. 5 (May 19, 2014): 3158. http://dx.doi.org/10.1167/iovs.14-14140.

18

Jones, Jeffery A., and Daniel E. Callan. "Brain activity during audiovisual speech perception: An fMRI study of the McGurk effect." NeuroReport 14, no. 8 (June 2003): 1129–33. http://dx.doi.org/10.1097/00001756-200306110-00006.

19

Tiippana, Kaisa, Martti Vainio, and Mikko Tiainen. "Audiovisual Speech Perception: Acoustic and Visual Phonetic Features Contributing to the McGurk Effect." i-Perception 2, no. 8 (October 2011): 768. http://dx.doi.org/10.1068/ic768.

20

Burnham, Denis, and Barbara Dodd. "Language–General Auditory–Visual Speech Perception: Thai–English and Japanese–English McGurk Effects." Multisensory Research 31, no. 1-2 (2018): 79–110. http://dx.doi.org/10.1163/22134808-00002590.

Abstract:
Cross-language McGurk Effects are used to investigate the locus of auditory–visual speech integration. Experiment 1 uses the fact that [ŋ], as in ‘sing’, is phonotactically legal in word-final position in English and Thai, but in word-initial position only in Thai. English and Thai language participants were tested for ‘n’ perception from auditory [m]/visual [ŋ] (A[m]V[ŋ]) in word-initial and -final positions. Despite English speakers’ native language bias to label word-initial [ŋ] as ‘n’, the incidence of ‘n’ percepts to A[m]V[ŋ] was equivalent for English and Thai speakers in final and initial positions. Experiment 2 used the facts that (i) [ð] as in ‘that’ is not present in Japanese, and (ii) English speakers respond more often with ‘tha’ than ‘da’ to A[ba]V[ga], but more often with ‘di’ than ‘thi’ to A[bi]V[gi]. English and three groups of Japanese language participants (Beginner, Intermediate, Advanced English knowledge) were presented with A[ba]V[ga] and A[bi]V[gi] by an English (Experiment 2a) or a Japanese (Experiment 2b) speaker. Despite Japanese participants’ native language bias to perceive ‘d’ more often than ‘th’, the four groups showed a similar phonetic level effect of [a]/[i] vowel context × ‘th’ vs. ‘d’ responses to A[b]V[g] presentations. In Experiment 2b this phonetic level interaction held, but was more one-sided as very few ‘th’ responses were evident, even in Australian English participants. Results are discussed in terms of a phonetic plus postcategorical model, in which incoming auditory and visual information is integrated at a phonetic level, after which there are post-categorical phonemic influences.
21

Irwin, Julia, Trey Avery, Lawrence Brancazio, Jacqueline Turcios, Kayleigh Ryherd, and Nicole Landi. "Electrophysiological Indices of Audiovisual Speech Perception: Beyond the McGurk Effect and Speech in Noise." Multisensory Research 31, no. 1-2 (2018): 39–56. http://dx.doi.org/10.1163/22134808-00002580.

Abstract:
Visual information on a talker’s face can influence what a listener hears. Commonly used approaches to study this include mismatched audiovisual stimuli (e.g., McGurk type stimuli) or visual speech in auditory noise. In this paper we discuss potential limitations of these approaches and introduce a novel visual phonemic restoration method. This method always presents the same visual stimulus (e.g., /ba/) dubbed with a matched auditory stimulus (/ba/) or one that has weakened consonantal information and sounds more /a/-like. When this reduced auditory stimulus (or /a/) is dubbed with the visual /ba/, a visual influence will result in effectively ‘restoring’ the weakened auditory cues so that the stimulus is perceived as a /ba/. An oddball design in which participants are asked to detect the /a/ among a stream of more frequently occurring /ba/s while either a speaking face or face with no visual speech was used. In addition, the same paradigm was presented for a second contrast in which participants detected /pa/ among /ba/s, a contrast which should be unaltered by the presence of visual speech. Behavioral and some ERP findings reflect the expected phonemic restoration for the /ba/ vs. /a/ contrast; specifically, we observed reduced accuracy and P300 response in the presence of visual speech. Further, we report an unexpected finding of reduced accuracy and P300 response for both speech contrasts in the presence of visual speech, suggesting overall modulation of the auditory signal in the presence of visual speech. Consistent with this, we observed a mismatch negativity (MMN) effect for the /ba/ vs. /pa/ contrast only that was larger in absence of visual speech. We discuss the potential utility for this paradigm for listeners who cannot respond actively, such as infants and individuals with developmental disabilities.
22

Bargary, Gary, Kylie J. Barnett, Kevin J. Mitchell, and Fiona N. Newell. "Colored-Speech Synaesthesia Is Triggered by Multisensory, Not Unisensory, Perception." Psychological Science 20, no. 5 (May 2009): 529–33. http://dx.doi.org/10.1111/j.1467-9280.2009.02338.x.

Abstract:
Although it is estimated that as many as 4% of people experience some form of enhanced cross talk between (or within) the senses, known as synaesthesia, very little is understood about the level of information processing required to induce a synaesthetic experience. In work presented here, we used a well-known multisensory illusion called the McGurk effect to show that synaesthesia is driven by late, perceptual processing, rather than early, unisensory processing. Specifically, we tested 9 linguistic-color synaesthetes and found that the colors induced by spoken words are related to what is perceived (i.e., the illusory combination of audio and visual inputs) and not to the auditory component alone. Our findings indicate that color-speech synaesthesia is triggered only when a significant amount of information processing has occurred and that early sensory activation is not directly linked to the synaesthetic experience.
23

Bosker, Hans Rutger, and David Peeters. "Beat gestures influence which speech sounds you hear." Proceedings of the Royal Society B: Biological Sciences 288, no. 1943 (January 27, 2021): 20202419. http://dx.doi.org/10.1098/rspb.2020.2419.

Abstract:
Beat gestures—spontaneously produced biphasic movements of the hand—are among the most frequently encountered co-speech gestures in human communication. They are closely temporally aligned to the prosodic characteristics of the speech signal, typically occurring on lexically stressed syllables. Despite their prevalence across speakers of the world's languages, how beat gestures impact spoken word recognition is unclear. Can these simple ‘flicks of the hand’ influence speech perception? Across a range of experiments, we demonstrate that beat gestures influence the explicit and implicit perception of lexical stress (e.g. distinguishing OBject from obJECT), and in turn can influence what vowels listeners hear. Thus, we provide converging evidence for a manual McGurk effect: relatively simple and widely occurring hand movements influence which speech sounds we hear.
24

Paré, Martin, Rebecca C. Richler, Martin ten Hove, and K. G. Munhall. "Gaze behavior in audiovisual speech perception: The influence of ocular fixations on the McGurk effect." Perception & Psychophysics 65, no. 4 (May 2003): 553–67. http://dx.doi.org/10.3758/bf03194582.

25

Magnotti, John F., and Michael S. Beauchamp. "A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech." PLOS Computational Biology 13, no. 2 (February 16, 2017): e1005229. http://dx.doi.org/10.1371/journal.pcbi.1005229.

26

Hardison, Debra M. "Bimodal Speech Perception by Native and Nonnative Speakers of English: Factors Influencing the McGurk Effect." Language Learning 46, no. 1 (March 1996): 3–73. http://dx.doi.org/10.1111/j.1467-1770.1996.tb00640.x.

27

Hardison, Debra M. "Bimodal Speech Perception by Native and Nonnative Speakers of English: Factors Influencing the McGurk Effect." Language Learning 49 (1999): 213–83. http://dx.doi.org/10.1111/0023-8333.49.s1.7.

28

Dupont, Sophie, Jérôme Aubin, and Lucie Ménard. "A study of the McGurk effect in 4 and 5-year-old French Canadian children." ZAS Papers in Linguistics 40 (January 1, 2005): 1–17. http://dx.doi.org/10.21248/zaspil.40.2005.254.

Abstract:
It has been shown that visual cues play a crucial role in the perception of vowels and consonants. Conflicting consonantal stimuli presented in the visual and auditory modalities can even result in the emergence of a third perceptual unit (McGurk effect). From a developmental point of view, several studies report that newborns can associate the image of a face uttering a given vowel to the auditory signal corresponding to this vowel; visual cues are thus used by the newborns. Despite the large number of studies carried out with adult speakers and newborns, very little work has been conducted with preschool-aged children. This contribution is aimed at describing the use of auditory and visual cues by 4 and 5-year-old French Canadian speakers, compared to adult speakers, in the identification of voiced consonants. Audiovisual recordings of a French Canadian speaker uttering the sequences [aba], [ada], [aga], [ava], [ibi], [idi], [igi], [ivi] have been carried out. The acoustic and visual signals have been extracted and analysed so that conflicting and non-conflicting stimuli, between the two modalities, were obtained. The resulting stimuli were presented as a perceptual test to eight 4 and 5-year-old French Canadian speakers and ten adults in three conditions: visual-only, auditory-only, and audiovisual. Results show that, even though the visual cues have a significant effect on the identification of the stimuli for adults and children, children are less sensitive to visual cues in the audiovisual condition. Such results shed light on the role of multimodal perception in the emergence and the refinement of the phonological system in children.
29

Burnham, Denis, and Barbara Dodd. "Auditory-visual speech integration by prelinguistic infants: Perception of an emergent consonant in the McGurk effect." Developmental Psychobiology 45, no. 4 (2004): 204–20. http://dx.doi.org/10.1002/dev.20032.

30

Tona, Risa, Yasushi Naito, Saburo Moroto, Rinko Yamamoto, Keizo Fujiwara, Hiroshi Yamazaki, Shogo Shinohara, and Masahiro Kikuchi. "Audio–visual integration during speech perception in prelingually deafened Japanese children revealed by the McGurk effect." International Journal of Pediatric Otorhinolaryngology 79, no. 12 (December 2015): 2072–78. http://dx.doi.org/10.1016/j.ijporl.2015.09.016.

31

Sekiyama, Kaoru. "Differences in auditory-visual speech perception between Japanese and Americans: McGurk effect as a function of incompatibility." Journal of the Acoustical Society of Japan (E) 15, no. 3 (1994): 143–58. http://dx.doi.org/10.1250/ast.15.143.

32

Yordamlı, Arzu, and Doğu Erdener. "Auditory–Visual Speech Integration in Bipolar Disorder: A Preliminary Study." Languages 3, no. 4 (October 17, 2018): 38. http://dx.doi.org/10.3390/languages3040038.

Abstract:
This study aimed to investigate how individuals with bipolar disorder integrate auditory and visual speech information compared to healthy individuals. Furthermore, we wanted to see whether there were any differences between manic and depressive episode bipolar disorder patients with respect to auditory and visual speech integration. It was hypothesized that the bipolar group’s auditory–visual speech integration would be weaker than that of the control group. Further, it was predicted that those in the manic phase of bipolar disorder would integrate visual speech information more robustly than their depressive phase counterparts. To examine these predictions, a McGurk effect paradigm with an identification task was used with typical auditory–visual (AV) speech stimuli. Additionally, auditory-only (AO) and visual-only (VO, lip-reading) speech perceptions were also tested. The dependent variable for the AV stimuli was the amount of visual speech influence. The dependent variables for AO and VO stimuli were accurate modality-based responses. Results showed that the disordered and control groups did not differ in AV speech integration and AO speech perception. However, there was a striking difference in favour of the healthy group with respect to the VO stimuli. The results suggest the need for further research whereby both behavioural and physiological data are collected simultaneously. This will help us understand the full dynamics of how auditory and visual speech information are integrated in people with bipolar disorder.
33

Brancazio, Lawrence, and Joanne L. Miller. "Use of visual information in speech perception: Evidence for a visual rate effect both with and without a McGurk effect." Perception & Psychophysics 67, no. 5 (July 2005): 759–69. http://dx.doi.org/10.3758/bf03193531.

34

Kumar, G. Vinodh, Neeraj Kumar, Dipanjan Roy, and Arpan Banerjee. "Segregation and Integration of Cortical Information Processing Underlying Cross-Modal Perception." Multisensory Research 31, no. 5 (2018): 481–500. http://dx.doi.org/10.1163/22134808-00002574.

Abstract:
Visual cues from the speaker’s face influence the perception of speech. An example of this influence is demonstrated by the McGurk-effect where illusory (cross-modal) sounds are perceived following presentation of incongruent audio–visual (AV) stimuli. Previous studies report the engagement of specific cortical modules that are spatially distributed during cross-modal perception. However, the limits of the underlying representational space and the cortical network mechanisms remain unclear. In this combined psychophysical and electroencephalography (EEG) study, the participants reported their perception while listening to a set of synchronous and asynchronous incongruent AV stimuli. We identified the neural representation of subjective cross-modal perception at different organizational levels — at specific locations in sensor space and at the level of the large-scale brain network estimated from between-sensor interactions. We identified an enhanced positivity in the event-related potential peak around 300 ms following stimulus onset associated with cross-modal perception. At the spectral level, cross-modal perception involved an overall decrease in power at the frontal and temporal regions at multiple frequency bands and at all AV lags, along with an increased power at the occipital scalp region for synchronous AV stimuli. At the level of large-scale neuronal networks, enhanced functional connectivity at the gamma band involving frontal regions serves as a marker of AV integration. Thus, we report in one single study that segregation of information processing at individual brain locations and integration of information over candidate brain networks underlie multisensory speech perception.
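As a generic pointer to the kind of spectral measure mentioned in this abstract, the snippet below estimates power in a 30–80 Hz 'gamma' band from a synthetic signal using Welch's method; the sampling rate, band edges, and test signal are assumptions, and none of it reproduces the study's EEG pipeline.

```python
import numpy as np
from scipy.signal import welch

# Generic gamma-band power estimate via Welch's method on a synthetic signal.
# Sampling rate, band edges (30-80 Hz), and the signal itself are assumptions.
fs = 250.0                                    # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 40 * t) + 0.5 * rng.standard_normal(t.size)  # 40 Hz component + noise

freqs, psd = welch(signal, fs=fs, nperseg=256)
gamma = (freqs >= 30) & (freqs <= 80)
gamma_power = psd[gamma].sum() * (freqs[1] - freqs[0])   # approximate band integral
print("approximate gamma-band power:", round(float(gamma_power), 4))
```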
35

Fuchs, Susanne, Pascal Perrier, and Bernd Pompino-Marschall. "Speech production and perception: experimental analyses and models." ZAS Papers in Linguistics 40 (January 1, 2005): 225. http://dx.doi.org/10.21248/zaspil.40.2005.253.

Abstract:
This special issue of the ZAS Papers in Linguistics contains a collection of papers of the French-German Thematic Summerschool on "Cognitive and physical models of speech production, and speech perception and of their interaction". Organized by Susanne Fuchs (ZAS Berlin), Jonathan Harrington (IPdS Kiel), Pascal Perrier (ICP Grenoble) and Bernd Pompino-Marschall (HUB and ZAS Berlin) and funded by the German-French University in Saarbrücken, this summerschool was held from September 19th till 24th 2004 at the coast of the Baltic Sea at the Heimvolkshochschule Lubmin (Germany) with 45 participants from Germany, France, Great Britain, Italy and Canada. The scientific program of this summerschool that is reprinted at the end of this volume included 11 key-note presentations by invited speakers, 21 oral presentations and a poster session (8 presentations). The names and addresses of all participants are also given in the back matter of this volume. All participants were offered the opportunity to publish an extended version of their presentation in the ZAS Papers in Linguistics. All submitted papers underwent a review and an editing procedure by external experts and the organizers of the summerschool. As is the case in a summerschool, papers present either works in progress, or works at a more advanced stage, or tutorials. They are ordered alphabetically by their first author's name, fortunately resulting in the fact that this special issue starts out with the paper that won the award as best pre-doctoral presentation, i.e. Sophie Dupont, Jérôme Aubin and Lucie Ménard with "A study of the McGurk effect in 4 and 5-year-old French Canadian children".
36

Diesch, Eugen. "Left and Right Hemifield Advantages of Fusions and Combinations in Audiovisual Speech Perception." Quarterly Journal of Experimental Psychology Section A 48, no. 2 (May 1995): 320–33. http://dx.doi.org/10.1080/14640749508401393.

Abstract:
If a place-of-articulation contrast is created between the auditory and the visual component syllables of videotaped speech, frequently the syllable that listeners report they have heard differs phonetically from the auditory component. These “McGurk effects”, as they have come to be called, show that speech perception may involve some kind of intermodal process. There are two classes of these phenomena: fusions and combinations. Perception of the syllable /da/ when auditory /ba/ and visual /ga/ are presented provides a clear example of the former, and perception of the string /bga/ after presentation of auditory /ga/ and visual /ba/ an unambiguous instance of the latter. Besides perceptual fusions and combinations, hearing visually presented component syllables also shows an influence of vision on audition. It is argued that these “visual” responses arise from basically the same underlying processes that yield fusions and combinations, respectively. In the present study, the visual component of audiovisually incongruous CV-syllables was presented in the left and the right visual hemifield, respectively. Audiovisual fusion responses showed a left hemifield advantage, and audiovisual combination responses a right hemifield advantage. This finding suggests that the process of audiovisual integration differs between audiovisual fusions and combinations and, furthermore, that the two cerebral hemispheres contribute differentially to the two classes of response.
37

Randazzo, Melissa, Ryan Priefer, Paul J. Smith, Amanda Nagler, Trey Avery, and Karen Froud. "Neural Correlates of Modality-Sensitive Deviance Detection in the Audiovisual Oddball Paradigm." Brain Sciences 10, no. 6 (May 28, 2020): 328. http://dx.doi.org/10.3390/brainsci10060328.

Abstract:
The McGurk effect, an incongruent pairing of visual /ga/–acoustic /ba/, creates a fusion illusion /da/ and is the cornerstone of research in audiovisual speech perception. Combination illusions occur given reversal of the input modalities—auditory /ga/-visual /ba/, and percept /bga/. A robust literature shows that fusion illusions in an oddball paradigm evoke a mismatch negativity (MMN) in the auditory cortex, in absence of changes to acoustic stimuli. We compared fusion and combination illusions in a passive oddball paradigm to further examine the influence of visual and auditory aspects of incongruent speech stimuli on the audiovisual MMN. Participants viewed videos under two audiovisual illusion conditions: fusion with visual aspect of the stimulus changing, and combination with auditory aspect of the stimulus changing, as well as two unimodal auditory- and visual-only conditions. Fusion and combination deviants exerted similar influence in generating congruency predictions with significant differences between standards and deviants in the N100 time window. Presence of the MMN in early and late time windows differentiated fusion from combination deviants. When the visual signal changes, a new percept is created, but when the visual is held constant and the auditory changes, the response is suppressed, evoking a later MMN. In alignment with models of predictive processing in audiovisual speech perception, we interpreted our results to indicate that visual information can both predict and suppress auditory speech perception.
38

Hertrich, Ingo, Susanne Dietrich, and Hermann Ackermann. "Cross-modal Interactions during Perception of Audiovisual Speech and Nonspeech Signals: An fMRI Study." Journal of Cognitive Neuroscience 23, no. 1 (January 2011): 221–37. http://dx.doi.org/10.1162/jocn.2010.21421.

Abstract:
During speech communication, visual information may interact with the auditory system at various processing stages. Most noteworthy, recent magnetoencephalography (MEG) data provided first evidence for early and preattentive phonetic/phonological encoding of the visual data stream—prior to its fusion with auditory phonological features [Hertrich, I., Mathiak, K., Lutzenberger, W., & Ackermann, H. Time course of early audiovisual interactions during speech and non-speech central-auditory processing: An MEG study. Journal of Cognitive Neuroscience, 21, 259–274, 2009]. Using functional magnetic resonance imaging, the present follow-up study aims to further elucidate the topographic distribution of visual–phonological operations and audiovisual (AV) interactions during speech perception. Ambiguous acoustic syllables—disambiguated to /pa/ or /ta/ by the visual channel (speaking face)—served as test materials, concomitant with various control conditions (nonspeech AV signals, visual-only and acoustic-only speech, and nonspeech stimuli). (i) Visual speech yielded an AV-subadditive activation of primary auditory cortex and the anterior superior temporal gyrus (STG), whereas the posterior STG responded both to speech and nonspeech motion. (ii) The inferior frontal and the fusiform gyrus of the right hemisphere showed a strong phonetic/phonological impact (differential effects of visual /pa/ vs. /ta/) upon hemodynamic activation during presentation of speaking faces. Taken together with the previous MEG data, these results point at a dual-pathway model of visual speech information processing: On the one hand, access to the auditory system via the anterior supratemporal “what” path may give rise to direct activation of “auditory objects.” On the other hand, visual speech information seems to be represented in a right-hemisphere visual working memory, providing a potential basis for later interactions with auditory information such as the McGurk effect.
39

Kkese, Elena. "McGurk effect and audiovisual speech perception in students with learning disabilities exposed to online teaching during the COVID-19 pandemic." Medical Hypotheses 144 (November 2020): 110233. http://dx.doi.org/10.1016/j.mehy.2020.110233.

40

Brendel, O. "SOME ASPECTS OF VISUAL-AUDITORY PERCEPTION OF ORAL SPEECH WHILE VIDEO AND SOUND RECORDINGS EXAMINATION." Theory and Practice of Forensic Science and Criminalistics 21, no. 1 (December 15, 2020): 349–58. http://dx.doi.org/10.32353/khrife.1.2020_24.

Abstract:
A problematic issue that frequently arises in the examination of video and audio recordings, namely the question of visual and auditory perception of oral speech – the establishment of the content of a conversation based on its image (lip reading) – is considered. The article purpose is to analyze the possibility and feasibility of examining the visual-auditory perception of oral speech in the framework of the examination of video and sound recordings, considering the peculiarities of such research; the ability to use visual information either as an independent object of examination (lip reading), or as a supplementary, additional to auditory analysis of a particular message. The main components of the process of lip reading, the possibility of visual examination of visual and auditory information in order to establish the content of a conversation are considered. Attention is paid to the features of visual and auditory perception of oral speech, and the factors that contribute enormously to the informative nature of the overall picture of oral speech perception by an image are analyzed. The influence of the visual image on the speech perception by an image is considered, such as active articulation, facial expressions, head movement, position of teeth, gestures, etc. In addition to the quality of the image, the duration of the speech fragment also affects the perception of oral speech by the image: a fully uttered expression is usually read better than its individual parts. The article also draws attention to the ambiguity of articulatory images of sounds. The features of the McGurk effect – a perception phenomenon that demonstrates the interaction between hearing and vision while the perception of speech – are considered. The analysis of the possibility and feasibility of examining visual and auditory perception of oral speech within the framework of the examination of video and sound recordings is carried out, and the peculiarities of such research are highlighted.
41

Bristow, Davina, Ghislaine Dehaene-Lambertz, Jeremie Mattout, Catherine Soares, Teodora Gliga, Sylvain Baillet, and Jean-François Mangin. "Hearing Faces: How the Infant Brain Matches the Face It Sees with the Speech It Hears." Journal of Cognitive Neuroscience 21, no. 5 (May 2009): 905–21. http://dx.doi.org/10.1162/jocn.2009.21076.

Abstract:
Speech is not a purely auditory signal. From around 2 months of age, infants are able to correctly match the vowel they hear with the appropriate articulating face. However, there is no behavioral evidence of integrated audiovisual perception until 4 months of age, at the earliest, when an illusory percept can be created by the fusion of the auditory stimulus and of the facial cues (McGurk effect). To understand how infants initially match the articulatory movements they see with the sounds they hear, we recorded high-density ERPs in response to auditory vowels that followed a congruent or incongruent silently articulating face in 10-week-old infants. In a first experiment, we determined that auditory–visual integration occurs during the early stages of perception as in adults. The mismatch response was similar in timing and in topography whether the preceding vowels were presented visually or aurally. In the second experiment, we studied audiovisual integration in the linguistic (vowel perception) and nonlinguistic (gender perception) domain. We observed a mismatch response for both types of change at similar latencies. Their topographies were significantly different demonstrating that cross-modal integration of these features is computed in parallel by two different networks. Indeed, brain source modeling revealed that phoneme and gender computations were lateralized toward the left and toward the right hemisphere, respectively, suggesting that each hemisphere possesses an early processing bias. We also observed repetition suppression in temporal regions and repetition enhancement in frontal regions. These results underscore how complex and structured is the human cortical organization which sustains communication from the first weeks of life on.
APA, Harvard, Vancouver, ISO, and other styles
42

Tse, Chun-Yu, Gabriele Gratton, Susan M. Garnsey, Michael A. Novak, and Monica Fabiani. "Read My Lips: Brain Dynamics Associated with Audiovisual Integration and Deviance Detection." Journal of Cognitive Neuroscience 27, no. 9 (September 2015): 1723–37. http://dx.doi.org/10.1162/jocn_a_00812.

Full text
Abstract:
Information from different modalities is initially processed in different brain areas, yet real-world perception often requires the integration of multisensory signals into a single percept. An example is the McGurk effect, in which people viewing a speaker whose lip movements do not match the utterance perceive the spoken sounds incorrectly, hearing them as more similar to those signaled by the visual rather than the auditory input. This indicates that audiovisual integration is important for generating the phoneme percept. Here we asked when and where the audiovisual integration process occurs, providing spatial and temporal boundaries for the processes generating phoneme perception. Specifically, we wanted to separate audiovisual integration from other processes, such as simple deviance detection. Building on previous work employing ERPs, we used an oddball paradigm in which task-irrelevant audiovisually deviant stimuli were embedded in strings of non-deviant stimuli. We also recorded the event-related optical signal, an imaging method combining spatial and temporal resolution, to investigate the time course and neuroanatomical substrate of audiovisual integration. We found that audiovisual deviants elicit a short duration response in the middle/superior temporal gyrus, whereas audiovisual integration elicits a more extended response involving also inferior frontal and occipital regions. Interactions between audiovisual integration and deviance detection processes were observed in the posterior/superior temporal gyrus. These data suggest that dynamic interactions between inferior frontal cortex and sensory regions play a significant role in multimodal integration.
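To make the paradigm concrete, the Python sketch below generates a trial sequence of the kind this abstract describes: rare, task-irrelevant audiovisual deviants embedded among strings of standard stimuli. The trial count, deviant proportion, stimulus labels, and spacing constraint are illustrative assumptions, not the parameters used by Tse et al.

```python
import random

def make_oddball_sequence(n_trials=400, p_deviant=0.125, min_gap=2, seed=0):
    """Build a trial sequence for an audiovisual oddball paradigm.

    Standards are congruent audiovisual syllables; deviants are
    audiovisually incongruent (McGurk-style) syllables. All values here
    are placeholders chosen for illustration only.
    """
    rng = random.Random(seed)
    n_deviant = int(n_trials * p_deviant)
    sequence = ["AV-congruent"] * n_trials
    placed = []
    while len(placed) < n_deviant:
        idx = rng.randrange(n_trials)
        # require a minimum number of standards between successive deviants
        if all(abs(idx - p) > min_gap for p in placed):
            placed.append(idx)
    for idx in placed:
        sequence[idx] = "AV-deviant"
    return sequence

if __name__ == "__main__":
    seq = make_oddball_sequence()
    print(seq[:20], "...", f"{seq.count('AV-deviant')} deviants in {len(seq)} trials")
```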
APA, Harvard, Vancouver, ISO, and other styles
43

Luo, Xiaoxiao, Guanlan Kang, Yu Guo, Xingcheng Yu, and Xiaolin Zhou. "A value-driven McGurk effect: Value-associated faces enhance the influence of visual information on audiovisual speech perception and its eye movement pattern." Attention, Perception, & Psychophysics 82, no. 4 (January 2, 2020): 1928–41. http://dx.doi.org/10.3758/s13414-019-01918-x.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Magnotti, John, and Michael Beauchamp. "A causal inference model of multisensory speech perception provides an explanation for why some audiovisual syllables but not others produce the McGurk Effect." Journal of Vision 16, no. 12 (September 1, 2016): 580. http://dx.doi.org/10.1167/16.12.580.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Assunção, Gustavo, Nuno Gonçalves, and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection." Applied Sciences 11, no. 8 (April 10, 2021): 3397. http://dx.doi.org/10.3390/app11083397.

Full text
Abstract:
Human beings have developed fantastic abilities to integrate information from various sensory sources exploring their inherent complementarity. Perceptual capabilities are therefore heightened, enabling, for instance, the well-known "cocktail party" and McGurk effects, i.e., speech disambiguation from a panoply of sound signals. This fusion ability is also key in refining the perception of sound source location, as in distinguishing whose voice is being heard in a group conversation. Furthermore, neuroscience has successfully identified the superior colliculus region in the brain as the one responsible for this modality fusion, with a handful of biological models having been proposed to approach its underlying neurophysiological process. Deriving inspiration from one of these models, this paper presents a methodology for effectively fusing correlated auditory and visual information for active speaker detection. Such an ability can have a wide range of applications, from teleconferencing systems to social robotics. The detection approach initially routes auditory and visual information through two specialized neural network structures. The resulting embeddings are fused via a novel layer based on the superior colliculus, whose topological structure emulates spatial neuron cross-mapping of unimodal perceptual fields. The validation process employed two publicly available datasets, with achieved results confirming and greatly surpassing initial expectations.
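As a rough illustration of the kind of fusion layer the abstract describes, the sketch below projects an audio embedding and a visual embedding onto a shared grid of "map units" and combines the two projections multiplicatively, loosely echoing the idea of spatially cross-mapped unimodal fields. The map size, random projection weights, and embedding dimensions are placeholders, not the authors' trained network.

```python
import numpy as np

def fuse_audiovisual(audio_emb, visual_emb, n_map_units=64, seed=0):
    """Toy fusion of unimodal embeddings on a shared map.

    Each modality is projected onto the same set of map units and the
    projections are combined multiplicatively, so a unit responds strongly
    only when both modalities drive it. The projection matrices are random
    placeholders standing in for learned weights.
    """
    rng = np.random.default_rng(seed)
    w_audio = rng.normal(size=(audio_emb.shape[-1], n_map_units))
    w_visual = rng.normal(size=(visual_emb.shape[-1], n_map_units))
    a = np.tanh(audio_emb @ w_audio)    # audio field on the shared map
    v = np.tanh(visual_emb @ w_visual)  # visual field on the shared map
    return a * v                        # fused response of the map units

# Example: a 128-d audio embedding and a 512-d visual embedding (sizes assumed)
audio_emb = np.random.default_rng(1).normal(size=128)
visual_emb = np.random.default_rng(2).normal(size=512)
print(fuse_audiovisual(audio_emb, visual_emb).shape)  # (64,)
```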
APA, Harvard, Vancouver, ISO, and other styles
46

Campbell, Ruth. "The processing of audio-visual speech: empirical and neural bases." Philosophical Transactions of the Royal Society B: Biological Sciences 363, no. 1493 (September 7, 2007): 1001–10. http://dx.doi.org/10.1098/rstb.2007.2155.

Full text
Abstract:
In this selective review, I outline a number of ways in which seeing the talker affects auditory perception of speech, including, but not confined to, the McGurk effect. To date, studies suggest that all linguistic levels are susceptible to visual influence, and that two main modes of processing can be described: a complementary mode, whereby vision provides information more efficiently than hearing for some under-specified parts of the speech stream, and a correlated mode, whereby vision partially duplicates information about dynamic articulatory patterning. Cortical correlates of seen speech suggest that at the neurological as well as the perceptual level, auditory processing of speech is affected by vision, so that ‘auditory speech regions’ are activated by seen speech. The processing of natural speech, whether it is heard, seen or heard and seen, activates the perisylvian language regions (left>right). It is highly probable that activation occurs in a specific order. First, superior temporal, then inferior parietal and finally inferior frontal regions (left>right) are activated. There is some differentiation of the visual input stream to the core perisylvian language system, suggesting that complementary seen speech information makes special use of the visual ventral processing stream, while for correlated visual speech, the dorsal processing stream, which is sensitive to visual movement, may be relatively more involved.
APA, Harvard, Vancouver, ISO, and other styles
47

Opoku-Baah, Collins, Adriana M. Schoenhaut, Sarah G. Vassall, David A. Tovar, Ramnarayan Ramachandran, and Mark T. Wallace. "Visual Influences on Auditory Behavioral, Neural, and Perceptual Processes: A Review." Journal of the Association for Research in Otolaryngology 22, no. 4 (May 20, 2021): 365–86. http://dx.doi.org/10.1007/s10162-021-00789-0.

Full text
Abstract:
In a naturalistic environment, auditory cues are often accompanied by information from other senses, which can be redundant with or complementary to the auditory information. Although the multisensory interactions derived from this combination of information and that shape auditory function are seen across all sensory modalities, our greatest body of knowledge to date centers on how vision influences audition. In this review, we attempt to capture the state of our understanding at this point in time regarding this topic. Following a general introduction, the review is divided into 5 sections. In the first section, we review the psychophysical evidence in humans regarding vision’s influence in audition, making the distinction between vision’s ability to enhance versus alter auditory performance and perception. Three examples are then described that serve to highlight vision’s ability to modulate auditory processes: spatial ventriloquism, cross-modal dynamic capture, and the McGurk effect. The final part of this section discusses models that have been built based on available psychophysical data and that seek to provide greater mechanistic insights into how vision can impact audition. The second section reviews the extant neuroimaging and far-field imaging work on this topic, with a strong emphasis on the roles of feedforward and feedback processes, on imaging insights into the causal nature of audiovisual interactions, and on the limitations of current imaging-based approaches. These limitations point to a greater need for machine-learning-based decoding approaches toward understanding how auditory representations are shaped by vision. The third section reviews the wealth of neuroanatomical and neurophysiological data from animal models that highlights audiovisual interactions at the neuronal and circuit level in both subcortical and cortical structures. It also speaks to the functional significance of audiovisual interactions for two critically important facets of auditory perception: scene analysis and communication. The fourth section presents current evidence for alterations in audiovisual processes in three clinical conditions: autism, schizophrenia, and sensorineural hearing loss. These changes in audiovisual interactions are postulated to have cascading effects on higher-order domains of dysfunction in these conditions. The final section highlights ongoing work seeking to leverage our knowledge of audiovisual interactions to develop better remediation approaches to these sensory-based disorders, founded in concepts of perceptual plasticity in which vision has been shown to have the capacity to facilitate auditory learning.
APA, Harvard, Vancouver, ISO, and other styles
48

Ujiie, Yuta, and Kohske Takahashi. "Own-race faces promote integrated audiovisual speech information." Quarterly Journal of Experimental Psychology, September 8, 2021, 174702182110444. http://dx.doi.org/10.1177/17470218211044480.

Full text
Abstract:
The other-race effect indicates a perceptual advantage when processing own-race faces. This effect has been demonstrated in individuals’ recognition of facial identity and emotional expressions. However, it remains unclear whether the other-race effect also exists in multisensory domains. We conducted two experiments to provide evidence for the other-race effect in facial speech recognition, using the McGurk effect. Experiment 1 tested this issue among East Asian adults, examining the magnitude of the McGurk effect during stimuli using speakers from two different races (own-race vs. other-race). We found that own-race faces induced a stronger McGurk effect than other-race faces. Experiment 2 indicated that the other-race effect was not simply due to different levels of attention being paid to the mouths of own- and other-race speakers. Our findings demonstrated that own-race faces enhance the weight of visual input during audiovisual speech perception, and they provide evidence of the own-race effect in the audiovisual interaction for speech perception in adults.
APA, Harvard, Vancouver, ISO, and other styles
49

Weisz, Nathan. "Prestimulus Oscillatory Brain Activity Influences the Perception of the McGurk-Effect." Frontiers in Neuroscience 4 (2010). http://dx.doi.org/10.3389/conf.fnins.2010.06.00171.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Yasufuku, Kanako, and Gabriel Doyle. "Echoes of L1 Syllable Structure in L2 Phoneme Recognition." Frontiers in Psychology 12 (July 20, 2021). http://dx.doi.org/10.3389/fpsyg.2021.515237.

Full text
Abstract:
Learning to move from auditory signals to phonemic categories is a crucial component of first, second, and multilingual language acquisition. In L1 and simultaneous multilingual acquisition, learners build up phonological knowledge to structure their perception within a language. For sequential multilinguals, this knowledge may support or interfere with acquiring language-specific representations for a new phonemic categorization system. Syllable structure is a part of this phonological knowledge, and language-specific syllabification preferences influence language acquisition, including early word segmentation. As a result, we expect to see language-specific syllable structure influencing speech perception as well. Initial evidence of an effect appears in Ali et al. (2011), who argued that cross-linguistic differences in McGurk fusion within a syllable reflected listeners’ language-specific syllabification preferences. Building on a framework from Cho and McQueen (2006), we argue that this could reflect the Phonological-Superiority Hypothesis (differences in L1 syllabification preferences make some syllabic positions harder to classify than others) or the Phonetic-Superiority Hypothesis (the acoustic qualities of speech sounds in some positions make it difficult to perceive unfamiliar sounds). However, their design does not distinguish between these two hypotheses. The current research study extends the work of Ali et al. (2011) by testing Japanese, and adding audio-only and congruent audio-visual stimuli to test the effects of syllabification preferences beyond just McGurk fusion. Eighteen native English speakers and 18 native Japanese speakers were asked to transcribe nonsense words in an artificial language. English allows stop consonants in syllable codas while Japanese heavily restricts them, but both groups showed similar patterns of McGurk fusion in stop codas. This is inconsistent with the Phonological-Superiority Hypothesis. However, when visual information was added, the phonetic influences on transcription accuracy largely disappeared. This is inconsistent with the Phonetic-Superiority Hypothesis. We argue from these results that neither acoustic informativity nor interference of a listener’s phonological knowledge is superior, and sketch a cognitively inspired rational cue integration framework as a third hypothesis to explain how L1 phonological knowledge affects L2 perception.
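The "rational cue integration framework" mentioned in this abstract is only sketched by the authors, but one minimal way to express the underlying idea, that acoustic and visual-phonological cues are weighted rather than one dominating, is a weighted log-linear pooling of per-phoneme evidence, as in the Python sketch below. The candidate syllables, probabilities, and equal weights are hypothetical and serve only to show how a fused, /da/-like response can emerge from conflicting cues; this is not the authors' model.

```python
import numpy as np

def integrate_cues(p_audio, p_visual, w_audio=0.5, w_visual=0.5):
    """Combine per-phoneme evidence from two cues by weighted log-linear pooling.

    p_audio, p_visual: dicts mapping candidate phonemes to probabilities under
    the auditory and visual cue alone. The weights stand in for cue reliability
    and would normally be estimated from data, not fixed by hand.
    """
    phonemes = sorted(set(p_audio) & set(p_visual))
    scores = np.array([
        w_audio * np.log(p_audio[ph]) + w_visual * np.log(p_visual[ph])
        for ph in phonemes
    ])
    posterior = np.exp(scores - scores.max())  # softmax over pooled log-evidence
    posterior /= posterior.sum()
    return dict(zip(phonemes, posterior))

# Illustrative McGurk-style input: audio favours /ba/, lips favour /ga/
p_audio = {"ba": 0.7, "da": 0.2, "ga": 0.1}
p_visual = {"ba": 0.05, "da": 0.35, "ga": 0.6}
print(integrate_cues(p_audio, p_visual))  # most mass lands on the fused "da"
```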
APA, Harvard, Vancouver, ISO, and other styles
