Dissertations / theses on the topic "Auditory-visual speech perception"

Consult the 24 best dissertations / theses for your research on the topic "Auditory-visual speech perception".

1

Howard, John Graham. "Temporal aspects of auditory-visual speech and non-speech perception". Thesis, University of Reading, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.553127.

Abstract:
This thesis concentrates on the temporal aspects of the auditory-visual integratory perceptual experience. It is organized in two parts: a literature review followed by an experimentation section. After a brief introduction (Chapter One), Chapter Two begins by considering the evolution of the earliest biological structures to exploit information in the acoustic and optic environments. The second part of the chapter proposes that the auditory-visual integratory experience might be a by-product of the earliest emergence of spoken language. Chapter Three focuses on human auditory and visual neural structures. It traces the auditory and visual systems of the modern human brain through the complex neuroanatomical forms that construct their pathways, through to where they finally integrate in the high-level multi-sensory association areas. Chapter Four identifies two distinct investigative schools that have each reported on the auditory-visual integratory experience. It considers their different experimental methodologies and a number of architectural and information-processing models that have sought to emulate human sensory, cognitive, and perceptual processing, and asks how far they can accommodate bi-sensory integratory processing. Chapter Five draws upon empirical data to support the importance of the temporal dimension of sensory forms in information processing, especially bimodal processing. It considers the implications of different modalities processing differently discontinuous afferent information within different time-frames. It concludes with a discussion of a number of models of biological clocks that have been proposed as essential temporal regulators of human sensory experience. In Part Two, the experiments are presented. Chapter Six provides the general methodology, and in the following chapters a series of four experiments is reported. The experiments follow a logical sequence, each building upon information either revealed or confirmed in previously reported results. Experiments One, Three, and Four required a radical reinterpretation of the 'fast-detection' paradigm developed for use in signal detection theory, which enables the work of two discrete investigative schools in auditory-visual processing to be brought together. The use of this modified paradigm within an appropriately designed methodology produces experimental results that speak directly both to the 'speech versus non-speech' debate and to gender studies.
2

Ver Hulst, Pamela. "Visual and auditory factors facilitating multimodal speech perception". Thesis, Ohio State University, 2006. http://hdl.handle.net/1811/6629.

Abstract:
Thesis (Honors)--Ohio State University, 2006.
Title from first page of PDF file. Document formatted into pages: contains 35 p.; also includes graphics. Includes bibliographical references (p. 24-26). Available online via Ohio State University's Knowledge Bank.
3

Anderson, Corinne D. "Auditory and visual characteristics of individual talkers in multimodal speech perception". Thesis, Ohio State University, 2007. http://hdl.handle.net/1811/28373.

Abstract:
Thesis (Honors)--Ohio State University, 2007.
Title from first page of PDF file. Document formatted into pages: contains 43 p.; also includes graphics. Includes bibliographical references (p. 29-30). Available online via Ohio State University's Knowledge Bank.
4

Leech, Stuart Matthew. "The effect on audiovisual speech perception of auditory and visual source separation". Thesis, University of Sussex, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.271770.

5

Wroblewski, Marcin. "Developmental predictors of auditory-visual integration of speech in reverberation and noise". Diss., University of Iowa, 2017. https://ir.uiowa.edu/etd/6017.

Abstract:
Objectives: Elementary school classrooms that meet the acoustic requirements for near-optimum speech recognition are extremely scarce. Poor classroom acoustics may become a barrier to speech understanding as children enter school. The purpose of this study was threefold: 1) to quantify the extent to which reverberation, lexical difficulty, and presentation mode affect speech recognition in noise, 2) to examine to what extent auditory-visual (AV) integration assists with the recognition of speech in noisy and reverberant environments typical of elementary school classrooms, and 3) to understand the relationship between developing mechanisms of multisensory integration and concurrently developing linguistic and cognitive abilities. Design: Twenty-seven typically developing children and nine young adults participated. Participants repeated short sentences reproduced by 10 speakers on a 30-inch HDTV and/or over loudspeakers located around the listener in a simulated classroom environment. Signal-to-noise ratios (SNRs) for 70 percent (SNR70) and 30 percent (SNR30) correct performance were measured using an adaptive tracking procedure. Auditory-visual integration was assessed via the SNR difference between AV and auditory-only (AO) conditions, labeled speech-reading benefit (SRB). Linguistic and cognitive aptitude was assessed using the NIH-Toolbox: Cognition Battery (NIH-TB: CB). Results: Children required more favorable SNRs than adults for equivalent performance. Participants benefited from the reduction in lexical difficulty and, in most cases, from the reduction in reverberation time. Reverberation affected children's speech recognition in the AO condition and adults' in the AV condition. SRB was greater at SNR30 than at SNR70. Adults showed a marginally significant increase in AV integration relative to children, as well as an increase in SRB for lexically hard versus easy words at the high level of reverberation. Development of linguistic and cognitive aptitude accounts for approximately 35% of the variance in AV integration, with crystallized and fluid cognition composite scores identified as the strongest predictors. Conclusions: The results of this study add to the body of evidence that children require more favorable SNRs than adults to perform the same speech recognition tasks in simulated listening environments akin to school classrooms. Our findings shed light on the development of AV integration for speech recognition in noise and reverberation during the school years, and provide insight into the balance of cognitive and linguistic underpinnings necessary for AV integration of degraded speech.
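As a point of reference for the SRB measure used above: it is simply the difference between the SNR thresholds in the auditory-only and auditory-visual conditions. A minimal sketch follows (the SNR values are hypothetical, not data from this study):

```python
# Speech-reading benefit (SRB) as defined in the abstract above: the SNR
# difference between auditory-only (AO) and auditory-visual (AV) conditions
# at the same percent-correct level. The values below are hypothetical.

def speech_reading_benefit(snr_ao_db: float, snr_av_db: float) -> float:
    """Positive SRB: visual cues let the listener reach the same
    accuracy at a harder (lower) SNR."""
    return snr_ao_db - snr_av_db

# e.g., -3 dB SNR needed in AO but -7 dB in AV for 70% correct:
print(speech_reading_benefit(-3.0, -7.0))  # -> 4.0 dB of visual benefit
```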
6

Watson, D. R. "Cognitive effects of impaired auditory abilities and use of visual speech to supplement perception". Thesis, Queen's University Belfast, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.396891.

7

Lees, Nicole C. "Vocalisations with a better view : hyperarticulation augments the auditory-visual advantage for the detection of speech in noise". Thesis, University of Western Sydney, 2007. http://handle.uws.edu.au:8081/1959.7/19576.

Abstract:
Recent studies have shown that there is a visual influence early in speech processing: visual speech enhances the ability to detect auditory speech in noise. However, identifying exactly how visual speech interacts with auditory processing at such an early stage has been challenging, because this so-called AV speech detection advantage is both highly related to a specific lower-order, signal-based, optic-acoustic relationship between the second formant amplitude and the area of the mouth (F2/Mouth-area), and mediated by higher-order, information-based factors. Previous investigations have either maximised or minimised information-based factors, or minimised signal-based factors, to try to tease out the relative importance of these sources of the advantage, but they have not yet been successful in this endeavour. Maximising signal-based factors had not previously been explored. This avenue was explored in this thesis by manipulating speaking style (hyperarticulated speech was used to maximise signal-based factors, and hypoarticulated speech to minimise them) to examine whether the AV speech detection advantage is modified by these means, and to provide a clearer idea of the primary source of visual influence in the AV detection advantage. Two sets of six studies were conducted. In the first set, three recorded speech styles, hyperarticulated, normal, and hypoarticulated, were extensively analysed in physical (optic and acoustic) and perceptual (visual and auditory) dimensions ahead of stimulus selection for the second set of studies. The analyses indicated that the three styles comprise distinctive categories on the Hyper-Hypo continuum of articulatory effort (Lindblom, 1990). Most relevantly, hyperarticulated speech was more informative, and hypoarticulated speech less informative, than normal speech with regard to signal-based movement factors, both optically and visually. However, the F2/Mouth-area correlation was similarly strong for all speaking styles, thus allowing examination of signal-based visual informativeness in AV speech detection with optic-acoustic association controlled. In the second set of studies, six Detection Experiments incorporating the three speaking styles were designed to examine whether, and if so why, more visually informative (hyperarticulated) speech augmented, and less visually informative (hypoarticulated) speech attenuated, the AV detection advantage relative to normal speech, and to examine visual influence when auditory speech was absent. Detection Experiment 1 used a two-interval, two-alternative (first or second interval, 2I2AFC) detection task, and indicated that hyperarticulation provided an AV detection advantage greater than for normal and hypoarticulated speech, with less of an advantage for hypoarticulated than for normal speech. Detection Experiment 2 used a single-interval, yes-no detection task to assess responses in signal-absent conditions independently of signal-present conditions, as a means of addressing participants' reports that speech was heard when it was not presented in the 2I2AFC task. Hyperarticulation resulted in an AV detection advantage, and for all speaking styles there was a consistent response bias to indicate that speech was present in signal-absent conditions.
To examine whether the AV detection advantage for hyperarticulation was due to visual, auditory, or auditory-visual factors, Detection Experiments 3 and 4 used mismatching AV speaking style combinations (AnormVhyper, AnormVhypo, AhyperVnorm, AhypoVnorm) that were onset-matched or time-aligned, respectively. The results indicated that higher rates of mouth movement can be sufficient for the detection advantage with weak optic-acoustic associations, but, in circumstances where these associations are low, even high rates of movement have little impact on augmenting detection in noise. Furthermore, in Detection Experiment 5, in which visual stimuli consisted only of the mouth movements extracted from the three styles, there was no AV detection advantage; it seems that this is so because extra-oral information is required, perhaps to provide a frame of reference that improves the availability of mouth movement to the perceiver. Detection Experiment 6 used a new 2I-4AFC task and the measures of false detections and response bias to identify whether visual influence in signal-absent conditions is due to response bias or to an illusion of hearing speech in noise (termed here the Speech in Noise, SiN, Illusion). In the event, the SiN illusion occurred for both the hyperarticulated and the normal styles, i.e., styles with reasonable amounts of movement change. For normal speech, the responses in signal-absent conditions were due only to the illusion of hearing speech in noise, whereas for hypoarticulated speech such responses were due only to response bias. For hyperarticulated speech there is evidence for the presence of both types of visual influence in signal-absent conditions. It seems that there is more doubt about the presence of auditory speech for non-normal speech styles. An explanation of past and present results is offered within a new framework, the Dynamic Bimodal Accumulation Theory (DBAT), developed in this thesis to address the limitations of, and conflicts between, previous theoretical positions. DBAT suggests a bottom-up influence of visual speech on the processing of auditory speech; specifically, it is proposed that the rate of change of visual movements guides auditory attention rhythms 'on-line' at corresponding rates, which allows selected samples of the auditory stream to be given prominence. Any patterns contained within these samples then emerge from the course of auditory integration processes. By this account, there are three important elements of visual speech necessary for enhanced detection of speech in noise. First and foremost, when speech is present, visual movement information must be available (as opposed to hypoarticulated and synthetic speech). The rate of change and optic-acoustic relatedness also have an impact (as in Detection Experiments 3 and 4). When speech is absent, visual information has an influence, and the SiN illusion (Detection Experiment 6) can be explained as a perceptual modulation of a noise stimulus by visually driven rhythmic attention. In sum, hyperarticulation augments the AV speech detection advantage, and, whenever speech is perceived in noisy conditions, there is either a response bias to perceive speech or a SiN illusion, or both. DBAT provides a detailed description of these results, with wider-ranging explanatory power than previous theoretical accounts. Predictions are put forward for examining the predictive power of DBAT in future studies.
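The false-detection and response-bias measures referred to above are standard signal detection theory quantities. As a minimal sketch (not code from the thesis; the hit and false-alarm rates are hypothetical), sensitivity d' and criterion c for a yes-no detection task are conventionally computed as follows:

```python
# Standard yes-no signal-detection measures: d' (sensitivity) and
# criterion c (response bias). Negative c indicates a liberal bias,
# i.e., a tendency to report "speech present" on signal-absent trials.
from statistics import NormalDist

def sdt_measures(hit_rate: float, fa_rate: float) -> tuple[float, float]:
    z = NormalDist().inv_cdf  # inverse standard-normal CDF
    z_h, z_f = z(hit_rate), z(fa_rate)
    return z_h - z_f, -0.5 * (z_h + z_f)

d_prime, criterion = sdt_measures(0.80, 0.35)  # hypothetical rates
print(d_prime, criterion)  # criterion < 0 -> bias toward "present"
```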
8

Lees, Nicole C. "Vocalisations with a better view : hyperarticulation augments the auditory-visual advantage for the detection of speech in noise". Thesis, University of Western Sydney, 2007. http://handle.uws.edu.au:8081/1959.7/19576.

Abstract:
Thesis (Ph.D.)--University of Western Sydney, 2007.
A thesis submitted to the University of Western Sydney, College of Arts, in fulfilment of the requirements for the degree of Doctor of Philosophy. Includes bibliography.
9

Schnobrich, Kathleen Marie. "The Relationship between Literacy Readiness and Auditory and Visual Perception in Kindergarteners". Miami University / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=miami1241010453.

10

Erdener, Vahit Doğu. "The effect of auditory, visual and orthographic information on second language acquisition". Thesis, University of Western Sydney, 2002. http://library.uws.edu.au/adt-NUWS/public/adt-NUWS20030408.114825/index.html.

Abstract:
Thesis (MA (Hons)) -- University of Western Sydney, 2002.
"A thesis submitted in partial fulfillment of the requirements for the degree of Masters of Arts (Honours), MARCS Auditory Laboratories & School of Psychology, University of Western Sydney, May 2002". Bibliography: leaves 83-93.
11

McNichol, Melissa Anne. "The effect of audiovisual cues on speech intelligibility in adverse listening conditions". Full-text of dissertation on the Internet (523.43 KB), 2010. http://www.lib.jmu.edu/general/etd/2010/doctorate/mcnichma/mcnichma_doctorate_04-23-2010.pdf.

12

Stevanovic, Bettina. "The effect of learning on pitch and speech perception : influencing perception of Shepard tones and McGurk syllables using classical and operant conditioning principles". Thesis, University of Western Sydney, 2007. http://handle.uws.edu.au:8081/1959.7/33694.

Abstract:
This thesis is concerned with describing and experimentally investigating the nature of perceptual learning. Ecological psychology defines perceptual learning as a process of educating attention to structural properties of stimuli (i.e., invariants) that specify meaning (i.e., affordances) to the perceiver. Although such a definition comprehensively describes what humans learn to perceive, it does not address the question of how learning occurs. It is proposed in this thesis that the principles of classical and operant conditioning can be used to strengthen and expand the ecological account of perceptual learning. The perceptual learning of affordances is described in terms of learning that a stimulus is associated with another stimulus (classical conditioning), and in terms of learning that interacting with a stimulus is associated with certain consequences (operant conditioning). Empirical work in this thesis investigated the effect of conditioning on pitch and speech perception. Experiments 1, 2, and 3 were designed to modify pitch perception in Shepard tones via tone-colour associative training. During training, Shepard tones were paired with coloured circles in such a way that the colour of the circles could be predicted either by the F0 (pitch) or by an F0-irrelevant auditory invariant. Participants were required to identify the colour of the circles that was associated with the tones, and they received corrective feedback. Hypotheses were based on the assumption that F0-relevant/F0-irrelevant conditioning would increase/decrease the accuracy of pitch perception in Shepard tones. Experiment 1 investigated the difference between F0-relevant and F0-irrelevant conditioning in a between-subjects design, and found that pitch perception in the two conditions did not differ. Experiments 2 and 3 investigated the effect of F0-relevant and F0-irrelevant conditioning (respectively) on pitch perception using a within-subjects (pre-test vs. post-test) design. It was found that the accuracy of pitch perception increased after F0-relevant conditioning, and was unaffected by F0-irrelevant conditioning. The differential trends observed in Experiments 2 and 3 suggest that conditioning played some role in influencing pitch perception. However, the question of whether the observed trends were due to the facilitatory effect of F0-relevant conditioning or the inhibitory effect of F0-irrelevant conditioning warrants future investigation. Experiments 4, 5, and 6 were designed to modify the perception of McGurk syllables (i.e., auditory /b/ paired with visual /g/) via consonant-pitch associative training. During training, participants were repeatedly presented with /b/, /d/, and /g/ consonants in falling, flat, and rising pitch contours, respectively. Pitch contour was paired with either the auditory signal (Experiments 4 and 5) or the visual signal (Experiment 6) of the consonant. Participants were required to identify the stop consonants, and they received corrective feedback. The perception of McGurk stimuli was tested before and after training by asking participants to identify the stop consonant in each stimulus as /b/, /d/, or /g/. It was hypothesized that conditioning would increase (1) /b/ responses more in the falling than in the flat/rising contour conditions, (2) /d/ responses more in the flat than in the falling/rising contour conditions, and (3) /g/ responses more in the rising than in the falling/flat contour conditions.
Support for the hypotheses was obtained in Experiments 5 and 6, but only in one response category (i.e., /b/ and /g/ response categories, respectively). It is suggested that the subtlety of the observed conditioning effect could be enhanced by increasing the salience of pitch contour and by reducing the clarity of auditory/visual invariants that specify consonants.
13

Stevanovic, Bettina. "The effect of learning on pitch and speech perception : influencing perception of Shepard tones and McGurk syllables using classical and operant conditioning principles". Thesis, University of Western Sydney, 2007. http://handle.uws.edu.au:8081/1959.7/33694.

Abstract:
Thesis (Ph.D.)--University of Western Sydney, 2007.
A thesis submitted to the University of Western Sydney, College of Arts, School of Psychology in fulfilment of the requirements for the degree of Doctor of Philosophy. Includes bibliography.
14

Pacheco, Vera. "O efeito dos estímulos auditivo e visual na percepção dos marcadores prosódicos lexicais e gráficos usados na escrita do português brasileiro". [s.n.], 2006. http://repositorio.unicamp.br/jspui/handle/REPOSIP/269120.

Abstract:
Advisor: Luiz Carlos Cagliari
Doctoral thesis - Universidade Estadual de Campinas, Instituto de Estudos da Linguagem
According to some theories of speech perception, this process may take place through hearing (Quantal Theory), through vision (Motor Theory), or through the combined action of hearing and vision (the McGurk effect). For the perceptual process to be complete, the message contained in the percept must be decoded, which may occur through top-down or bottom-up access to information at the different levels, from the phonological level to the context level or vice versa. Considering that speech perception draws on visual information, this thesis investigates the action of auditory and visual stimuli in the perception of the graphic resources, or prosodic markers, used in Brazilian Portuguese writing to represent prosodic variation graphically. Among the different types of markers described in the literature, this research investigated those that are written words whose semantic load indicates prosodic variation, hence Lexical Prosodic Markers (LPMs), and those that are graphic marks, Graphic Prosodic Markers (GPMs), in particular the punctuation marks, whose conventionalized meaning has the same effect as the semantic load of the LPMs. LPMs are graphic resources used to indicate, in writing, attitudes of the speaker, while GPMs tend to indicate prosodic variation more directly related to the dialogic process. Since these graphic resources have both a visual and an auditory reality, the aim was to investigate the action of auditory and visual stimuli in the perception of these markers. To this end, an experimental design was drawn up that controlled the target sentences under the scope of the markers (LPMs: gritar, dizer baixo, berrar, sussurrar, dizer rápido, dizer devagar; GPMs: : ! ? . – ,) as well as the contexts in which these sentences appeared. Readings aloud of the texts containing the target sentences were recorded by an announcer, and the recordings, together with a written form of the texts, were presented to eleven participants in individual sessions. To control the action of the auditory and visual stimuli, six experimental conditions were used: two monomodal conditions (auditory and visual) and four bimodal conditions (one with matching auditory and visual information; one with an auditory stimulus without melodic variation; and two mismatch conditions, in which the information from the auditory and visual stimuli conflicted). In the bimodal conditions, the auditory and visual stimuli were presented simultaneously and in synchrony. In the perception test, administered individually, each participant was asked to report aloud, by means of a specific number, the prosodic marker they had observed, and the response time for these tasks was measured. From the participants' responses, three variables were obtained: the percentage of choices of the marker present in the auditory stimulus, the percentage of choices of the marker present in the visual stimulus, and the percentage of choices of a marker different from those present in the two stimuli; a reading rate was also obtained for each participant. The data were submitted to statistical tests of normality, comparison of means, and resampling (bootstrapping). The results show differential participation of the auditory and visual stimuli in the perception of LPMs and GPMs. On the basis of these data, the processes of perception and recognition of prosodic markers are discussed in the light of theories of speech perception and recognition.
Doctorate
Doctor in Linguistics
15

Erdener, Vahit Dogu, University of Western Sydney, College of Arts and School of Psychology. "Development of auditory-visual speech perception in young children". 2007. http://handle.uws.edu.au:8081/1959.7/13783.

Abstract:
Unlike auditory-only speech perception, little is known about the development of auditory-visual speech perception. Recent studies show that pre-linguistic infants perceive auditory-visual speech phonetically in the absence of any phonological experience. In addition, while an increase in visual speech influence over age is observed in English speakers, particularly between six and eight years, this is not the case in Japanese speakers. This thesis aims to investigate the factors that lead to an increase in visual speech influence in English-speaking children aged between 3 and 8 years. The general hypothesis of this thesis is that age-related, language-specific factors will be related to auditory-visual speech perception. Three experiments were conducted. Results show that in linguistically challenging periods, such as school onset and reading acquisition, there is a strong link between auditory-visual and language-specific speech perception, and that this link appears to help children cope with new linguistic challenges. However, this link does not seem to be present in adults or preschool children, for whom auditory-visual speech perception is predictable from auditory speech perception ability alone. Implications of these results in relation to existing models of auditory-visual speech perception and directions for future studies are discussed.
Doctor of Philosophy (PhD)
16

Erdener, Dogu. "Development of auditory-visual speech perception in young children". Thesis, 2007. http://handle.uws.edu.au:8081/1959.7/13783.

Abstract:
Unlike auditory-only speech perception, little is known about the development of auditory-visual speech perception. Recent studies show that pre-linguistic infants perceive auditory-visual speech phonetically in the absence of any phonological experience. In addition, while an increase in visual speech influence over age is observed in English speakers, particularly between six and eight years, this is not the case in Japanese speakers. This thesis aims to investigate the factors that lead to an increase in visual speech influence in English-speaking children aged between 3 and 8 years. The general hypothesis of this thesis is that age-related, language-specific factors will be related to auditory-visual speech perception. Three experiments were conducted. Results show that in linguistically challenging periods, such as school onset and reading acquisition, there is a strong link between auditory-visual and language-specific speech perception, and that this link appears to help children cope with new linguistic challenges. However, this link does not seem to be present in adults or preschool children, for whom auditory-visual speech perception is predictable from auditory speech perception ability alone. Implications of these results in relation to existing models of auditory-visual speech perception and directions for future studies are discussed.
17

Gilbert, Jaimie. "Neural correlates of auditory-visual speech perception in noise". 2009. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3362794.

Abstract:
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2009.
Source: Dissertation Abstracts International, Volume: 70-06, Section: B, page: 3400. Adviser: Charissa Lansing. Includes bibliographical references (leaves 157-170). Available on microfilm from ProQuest Information and Learning.
18

Fitzpatrick, Michael F. "Auditory and auditory-visual speech perception and production in noise in younger and older adults". Thesis, 2014. http://handle.uws.edu.au:8081/1959.7/uws:31936.

Abstract:
The overall aim of the thesis was to investigate spoken communication in adverse conditions using methods that take into account that spoken communication is a highly dynamic and adaptive process, underpinned by interaction and feedback between speech partners. To this end, I first assessed the speech production adaptations of talkers in quiet and in noise, and in different communicative settings, i.e., where the talker and interlocutor were face to face (FTF) or could not see each other (Non-visual) (Chapter 2). Results showed that talkers adapted their speech production to suit the specific communicative environment. Talkers exaggerated their speech productions in noise (Lombard speech) compared to in quiet conditions. Further, in noise, in the FTF condition, talkers exaggerated mouth opening and reduced auditory intensity compared to the Non-visual condition. To determine whether these speech production changes affected speech perception, materials drawn from the production study were tested in a speech-perception-in-noise experiment (Chapter 3). The results showed that speech produced in noise provided an additional visual and auditory-visual intelligibility benefit for the perceiver. Following up this finding, I tested older and younger adults to see whether older adults would also show a Lombard speech benefit (Chapters 4 & 5). It was found that older adults were able to benefit from the auditory and auditory-visual speech production changes talkers made in noise. However, the amount of benefit they received depended on the type of noise (i.e., the degree of energetic or informational masking present in the noise masker), the signal type (i.e., whether the signal is auditory, visual, or auditory-visual), and the type of speech material considered (i.e., vowels or consonants). The results also showed that older adults were significantly poorer at lipreading than younger adults. To investigate a possible cause of the older adults' lipreading problems, I presented time-compressed and time-expanded visual speech stimuli to determine how durational changes affected the lipreading accuracy of older compared to younger adults (Chapter 6). The results showed that older adults were not disproportionately affected by changes in the durational properties of visual speech, suggesting that factors other than the speed of the visual speech signal determine older adults' reduced lipreading capacity. The final experiment followed up several methodological issues concerning testing speech perception in noise. I examined whether the noise type (i.e., SSN or babble), the degree of lead-in noise, and the temporal predictability of the speech signal influenced performance on speech perception in noise (Chapter 7). I found that the degree of energetic and informational masking of speech in noise was affected by the amount of lead-in noise before the onset of the speech signal, but not by the predictability of the target speech signal. Taken together, the research presented in this thesis provides insight into some of the factors that affect how well younger and older adults communicate in adverse conditions.
19

Lees, Nicole C., University of Western Sydney and College of Arts. "Vocalisations with a better view : hyperarticulation augments the auditory-visual advantage for the detection of speech in noise". 2007. http://handle.uws.edu.au:8081/1959.7/19576.

Abstract:
Recent studies have shown that there is a visual influence early in speech processing: visual speech enhances the ability to detect auditory speech in noise. However, identifying exactly how visual speech interacts with auditory processing at such an early stage has been challenging, because this so-called AV speech detection advantage is both highly related to a specific lower-order, signal-based, optic-acoustic relationship between the second formant amplitude and the area of the mouth (F2/Mouth-area), and mediated by higher-order, information-based factors. Previous investigations have either maximised or minimised information-based factors, or minimised signal-based factors, to try to tease out the relative importance of these sources of the advantage, but they have not yet been successful in this endeavour. Maximising signal-based factors had not previously been explored. This avenue was explored in this thesis by manipulating speaking style (hyperarticulated speech was used to maximise signal-based factors, and hypoarticulated speech to minimise them) to examine whether the AV speech detection advantage is modified by these means, and to provide a clearer idea of the primary source of visual influence in the AV detection advantage. Two sets of six studies were conducted. In the first set, three recorded speech styles, hyperarticulated, normal, and hypoarticulated, were extensively analysed in physical (optic and acoustic) and perceptual (visual and auditory) dimensions ahead of stimulus selection for the second set of studies. The analyses indicated that the three styles comprise distinctive categories on the Hyper-Hypo continuum of articulatory effort (Lindblom, 1990). Most relevantly, hyperarticulated speech was more informative, and hypoarticulated speech less informative, than normal speech with regard to signal-based movement factors, both optically and visually. However, the F2/Mouth-area correlation was similarly strong for all speaking styles, thus allowing examination of signal-based visual informativeness in AV speech detection with optic-acoustic association controlled. In the second set of studies, six Detection Experiments incorporating the three speaking styles were designed to examine whether, and if so why, more visually informative (hyperarticulated) speech augmented, and less visually informative (hypoarticulated) speech attenuated, the AV detection advantage relative to normal speech, and to examine visual influence when auditory speech was absent. Detection Experiment 1 used a two-interval, two-alternative (first or second interval, 2I2AFC) detection task, and indicated that hyperarticulation provided an AV detection advantage greater than for normal and hypoarticulated speech, with less of an advantage for hypoarticulated than for normal speech. Detection Experiment 2 used a single-interval, yes-no detection task to assess responses in signal-absent conditions independently of signal-present conditions, as a means of addressing participants' reports that speech was heard when it was not presented in the 2I2AFC task. Hyperarticulation resulted in an AV detection advantage, and for all speaking styles there was a consistent response bias to indicate that speech was present in signal-absent conditions.
To examine whether the AV detection advantage for hyperarticulation was due to visual, auditory, or auditory-visual factors, Detection Experiments 3 and 4 used mismatching AV speaking style combinations (AnormVhyper, AnormVhypo, AhyperVnorm, AhypoVnorm) that were onset-matched or time-aligned, respectively. The results indicated that higher rates of mouth movement can be sufficient for the detection advantage with weak optic-acoustic associations, but, in circumstances where these associations are low, even high rates of movement have little impact on augmenting detection in noise. Furthermore, in Detection Experiment 5, in which visual stimuli consisted only of the mouth movements extracted from the three styles, there was no AV detection advantage; it seems that this is so because extra-oral information is required, perhaps to provide a frame of reference that improves the availability of mouth movement to the perceiver. Detection Experiment 6 used a new 2I-4AFC task and the measures of false detections and response bias to identify whether visual influence in signal-absent conditions is due to response bias or to an illusion of hearing speech in noise (termed here the Speech in Noise, SiN, Illusion). In the event, the SiN illusion occurred for both the hyperarticulated and the normal styles, i.e., styles with reasonable amounts of movement change. For normal speech, the responses in signal-absent conditions were due only to the illusion of hearing speech in noise, whereas for hypoarticulated speech such responses were due only to response bias. For hyperarticulated speech there is evidence for the presence of both types of visual influence in signal-absent conditions. It seems that there is more doubt about the presence of auditory speech for non-normal speech styles. An explanation of past and present results is offered within a new framework, the Dynamic Bimodal Accumulation Theory (DBAT), developed in this thesis to address the limitations of, and conflicts between, previous theoretical positions. DBAT suggests a bottom-up influence of visual speech on the processing of auditory speech; specifically, it is proposed that the rate of change of visual movements guides auditory attention rhythms 'on-line' at corresponding rates, which allows selected samples of the auditory stream to be given prominence. Any patterns contained within these samples then emerge from the course of auditory integration processes. By this account, there are three important elements of visual speech necessary for enhanced detection of speech in noise. First and foremost, when speech is present, visual movement information must be available (as opposed to hypoarticulated and synthetic speech). The rate of change and optic-acoustic relatedness also have an impact (as in Detection Experiments 3 and 4). When speech is absent, visual information has an influence, and the SiN illusion (Detection Experiment 6) can be explained as a perceptual modulation of a noise stimulus by visually driven rhythmic attention. In sum, hyperarticulation augments the AV speech detection advantage, and, whenever speech is perceived in noisy conditions, there is either a response bias to perceive speech or a SiN illusion, or both. DBAT provides a detailed description of these results, with wider-ranging explanatory power than previous theoretical accounts. Predictions are put forward for examining the predictive power of DBAT in future studies.
Doctor of Philosophy (PhD)
20

Hochstrasser, Daniel. "Investigating the effect of visual phonetic cues on the auditory N1 & P2". Thesis, 2017. http://hdl.handle.net/1959.7/uws:44884.

Abstract:
Studies have shown that the N1 and P2 auditory event-related potentials (ERPs) to a speech sound occur earlier and are reduced in amplitude when the talker can be seen (i.e., auditory-visual speech) compared to when the talker cannot be seen (auditory-only speech). An explanation for why seeing the talker changes the brain's response to sound is that visual speech provides information about the upcoming auditory speech event. This information reduces uncertainty about when the sound will occur and about what the event will be (resulting in a smaller N1 and P2, which are markers associated with auditory processing). It has yet to be determined whether form information alone can influence the amplitude or timing of either the N1 or the P2. We tested this by conducting two separate EEG experiments. In Experiment 1, we compared the N1 and P2 peaks of the ERPs to auditory speech when preceded by a visual speech cue (auditory-visual speech) or by a static neutral face. In Experiment 2, we compared the N1/P2 peaks of the ERPs to auditory speech preceded by print cues providing reliable information about their content (written "ba" or "da" shown before these spoken syllables) or by control cues (meaningless printed symbols). The results of Experiment 1 confirmed that the presentation of visual speech produced the expected amplitude suppression of the N1, but the opposite of latency facilitation occurred (auditory-only speech faster than auditory-visual speech). For Experiment 2, no difference was found in the amplitude or timing of the N1 or P2 ERPs to the reliable print versus the control cues. The unexpectedly slower N1 latency to AV speech stimuli found in Experiment 1 may be accounted for by attentional differences induced by the experimental design. The null effect of print cues in Experiment 2 indicates the importance of the temporal relationship between visual and auditory events.
21

Stevanovic, Bettina, University of Western Sydney, College of Arts and School of Psychology. "The effect of learning on pitch and speech perception : influencing perception of Shepard tones and McGurk syllables using classical and operant conditioning principles". 2007. http://handle.uws.edu.au:8081/1959.7/33694.

Abstract:
This thesis is concerned with describing and experimentally investigating the nature of perceptual learning. Ecological psychology defines perceptual learning as a process of educating attention to structural properties of stimuli (i.e., invariants) that specify meaning (i.e., affordances) to the perceiver. Although such a definition comprehensively describes what humans learn to perceive, it does not address the question of how learning occurs. It is proposed in this thesis that the principles of classical and operant conditioning can be used to strengthen and expand the ecological account of perceptual learning. The perceptual learning of affordances is described in terms of learning that a stimulus is associated with another stimulus (classical conditioning), and in terms of learning that interacting with a stimulus is associated with certain consequences (operant conditioning). Empirical work in this thesis investigated the effect of conditioning on pitch and speech perception. Experiments 1, 2, and 3 were designed to modify pitch perception in Shepard tones via tone-colour associative training. During training, Shepard tones were paired with coloured circles in such a way that the colour of the circles could be predicted either by the F0 (pitch) or by an F0-irrelevant auditory invariant. Participants were required to identify the colour of the circles that was associated with the tones, and they received corrective feedback. Hypotheses were based on the assumption that F0-relevant/F0-irrelevant conditioning would increase/decrease the accuracy of pitch perception in Shepard tones. Experiment 1 investigated the difference between F0-relevant and F0-irrelevant conditioning in a between-subjects design, and found that pitch perception in the two conditions did not differ. Experiments 2 and 3 investigated the effect of F0-relevant and F0-irrelevant conditioning (respectively) on pitch perception using a within-subjects (pre-test vs. post-test) design. It was found that the accuracy of pitch perception increased after F0-relevant conditioning, and was unaffected by F0-irrelevant conditioning. The differential trends observed in Experiments 2 and 3 suggest that conditioning played some role in influencing pitch perception. However, the question of whether the observed trends were due to the facilitatory effect of F0-relevant conditioning or the inhibitory effect of F0-irrelevant conditioning warrants future investigation. Experiments 4, 5, and 6 were designed to modify the perception of McGurk syllables (i.e., auditory /b/ paired with visual /g/) via consonant-pitch associative training. During training, participants were repeatedly presented with /b/, /d/, and /g/ consonants in falling, flat, and rising pitch contours, respectively. Pitch contour was paired with either the auditory signal (Experiments 4 and 5) or the visual signal (Experiment 6) of the consonant. Participants were required to identify the stop consonants, and they received corrective feedback. The perception of McGurk stimuli was tested before and after training by asking participants to identify the stop consonant in each stimulus as /b/, /d/, or /g/. It was hypothesized that conditioning would increase (1) /b/ responses more in the falling than in the flat/rising contour conditions, (2) /d/ responses more in the flat than in the falling/rising contour conditions, and (3) /g/ responses more in the rising than in the falling/flat contour conditions.
Support for the hypotheses was obtained in Experiments 5 and 6, but only in one response category (i.e., /b/ and /g/ response categories, respectively). It is suggested that the subtlety of the observed conditioning effect could be enhanced by increasing the salience of pitch contour and by reducing the clarity of auditory/visual invariants that specify consonants.
Doctor of Philosophy (PhD)
22

Lapchak, Marion Cone. "Exploring the effects of age, early-onset otitis media, and articulation errors on the integration of auditory and visual information in speech perception". Diss., 2005. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3188497.

23

Cvejic, Erin. "It's not just what you say, but also how you say it : exploring the auditory and visual properties of speech prosody". Thesis, 2011. http://handle.uws.edu.au:8081/1959.7/511329.

Abstract:
This thesis investigated the production and perception of prosodic cues for focus and phrasing contrasts from auditory and visual speech (i.e., visible face and head movements). This was done by examining the form, perceptibility, and potential functions of the visual correlates of spoken prosody using auditory and motion analysis and perception-based measures. The first part of the investigation (Chapters 2 to 3) consisted of a series of perception experiments conducted to determine the degree to which perceivers were sensitive to the visual realisation of prosody across face areas. Here, participants were presented with a visual cue (either from the upper or lower half of the face) to match (based on prosody) with another visual or auditory cue. Performance was much better than chance even when the task involved matching cues produced by different talkers. The results indicate that perceivers were sensitive to visual prosodic cues, that considerable variability in the form of these could be tolerated, and that different cues conveying information about the same prosodic type could be matched. The second part of the thesis (Chapters 4 to 8) reported on the construction of a multi-talker speech prosody corpus and the analysis and perceptibility of this production data. The corpus consisted of auditory and visual speech recordings of six talkers producing 30 sentences across three prosodic conditions in two interactive settings (face-to-face and auditory-only), with face movements captured using a 3D motion tracking system and characterised using a guided principal components analysis. The analysis consisted of quantifying auditory and visual characteristics of prosodic contrasts separately as well as the relationship between these. Acoustically, the properties of the contrasts corresponded to those typically described in the literature (however, some properties varied systematically as a function of the interactive setting), and were also perceived as conveying the intended contrasts in subsequent perceptual tasks (reported in Chapter 6). Overall, the types of movements used to contrast narrow from broad focused utterances, and echoic questions from statements, involved the use of both articulatory (e.g., jaw and lip movement) and non-articulatory (e.g., eyebrow and rigid head movement) cues. Both the visual and the acoustic properties varied across talkers and interactive settings. The spatial and temporal relationship between auditory and visual signal modalities was highly variable, differing substantially across utterances. The final part of the thesis (Chapters 9 to 10) reported the results of a series of perception experiments using perceptual rating and cross-modal matching tasks on stimuli resynthesised from the motion capture data. These stimuli showed various combinations of visual cues, and when presented in isolation or combined with the auditory signal, these were perceived as conveying the intended prosodic contrast. However, no auditory-visual (AV) benefit was observed in the perceptual ratings, with the presentation of more cues failing to result in better cross-modal matching performance (suggesting there may be limitations in perceivers' ability to process multiple cues). In sum, the thesis showed that perceivers were sensitive to visual prosodic cues despite variability in production, and were able to match different types of cue.
The construction of an AV prosody corpus permitted the characteristics of the auditory and visual prosodic correlates (and their relationship) to be quantified, and allowed for the synthesis of visual cues that perceivers subsequently used to successfully extract prosodic information. In all, the experiments reported in this thesis provide a strong case for the development of well-controlled and measured manipulations of prosody and warrant further examination of the visual cues to prosody.
24

Paris, Tim. "Audiovisual prediction using brain and behaviour measures". Thesis, 2014. http://handle.uws.edu.au:8081/1959.7/uws:32646.

Abstract:
The brain’s ability to generate predictions provides a foundation for efficient processing. When one event reliably follows another, the presence of the first event provides information about aspects of the second. Knowledge of this association can be advantageous; for instance, being able to anticipate what someone will say based on their previous lip movements; however, what is not clear is how the brain uses knowledge about future events to benefit processing. In order to provide an insight into this broad question, this thesis focuses on the specific example of auditory events that have been preceded by visual ones. While this topic may seem limited at first, it contains the essential elements that underpin a more complex study (Figure 1.1). This well-defined relationship describes a simple and controllable experimental setup that enables exploration of some key issues in sensory, perceptual and cognitive processing.