
Dissertations / Theses on the topic 'Speech control'

Consult the top 50 dissertations / theses for your research on the topic 'Speech control.'


1

Wilson, W. R. "Speech motor control." Thesis, University of Essex, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.376738.

2

Campbell, Wilhelm. "Multi-level speech timing control." Thesis, University of Sussex, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.283832.

Abstract:
This thesis describes a model of speech timing that predicts at the syllable level, with sensitivity to rhythmic factors at the foot level, and derives segmental durations by a process of accommodation into the higher-level timing framework. The model is based on analyses of two large databases of British English speech; one illustrating the range of prosodic variation in the language, the other illustrating segmental duration characteristics in various phonetic environments. Designed for a speech synthesis application, the model also has relevance to linguistic and phonetic theory, and shows that phonological specification of prosodic variation is independent of the phonetic realisation of segmental duration. It also shows, using normalisation of phone-specific timing characteristics, that lengthening of segments within the syllable is of three kinds: prominence-related, applying more to onset segments; boundary-related, applying more to coda segments; and rhythm/rate-related, being more uniform across all component segments. In this model, durations are first predicted at the level of the syllable from consideration of the number of component segments, the nature of the rhyme, and the three types of lengthening. The segmental durations are then constrained to sum to this value by determining an appropriate uniform quantile of their individual distributions. Segmental distributions define the range of likely durations each might show under a given set of conditions; their parameters are predicted from broad-class features of place and manner of articulation, factored for position in the syllable, clustering, stress, and finality. Two parameters determine the segmental-duration pdfs, assuming a Gamma distribution, and one parameter determines the quantile within that pdf to predict the duration of any segment in a given prosodic context.
In experimental tests, each level produced durations that closely fitted the data of four speakers of British English, and showed performance rates higher than a comparable model predicting exclusively at the level of the segment.
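The two-stage scheme described above, predicting a syllable-level target and then constraining the segment durations to sum to it by choosing a common quantile of their individual distributions, can be sketched as follows. This is a toy illustration, not the thesis's implementation: it uses Python's NormalDist as a stand-in for the Gamma pdfs the model assumes, and the segment parameters are invented.

```python
from statistics import NormalDist

def segment_durations(syllable_target_ms, segment_dists):
    """Find the common quantile q in (0, 1) such that the segments'
    q-th quantiles sum to the syllable-level target duration."""
    lo, hi = 1e-6, 1.0 - 1e-6
    for _ in range(60):  # bisection: the sum is monotone in q
        q = (lo + hi) / 2.0
        if sum(d.inv_cdf(q) for d in segment_dists) < syllable_target_ms:
            lo = q
        else:
            hi = q
    q = (lo + hi) / 2.0
    return q, [d.inv_cdf(q) for d in segment_dists]

# Hypothetical onset / nucleus / coda duration distributions (ms).
dists = [NormalDist(80, 15), NormalDist(100, 20), NormalDist(60, 12)]
q, durs = segment_durations(260.0, dists)
```

Under this scheme a prominence- or boundary-lengthened syllable simply raises the target, and all component segments stretch together through a higher shared quantile, which is the accommodation behaviour the model describes.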
3

Vousden, Janet. "Serial control of phonology in speech production." Thesis, University of Warwick, 1996. http://wrap.warwick.ac.uk/3026/.

Abstract:
The aim of this thesis is to further our understanding of the processes which control the sequencing of phonemes as we speak: this is an example of what is commonly known as the serial order problem. Such a process is apparent in normal speech and also from the existence of a class of speech errors known as sound movement errors, where sounds are anticipated (spoken too soon), perseverated (repeated again later), or exchanged (the sounds are transposed). I argue that this process is temporally governed, that is, the serial ordering mechanism is restricted to processing sounds that are close together in time. This is in conflict with frame-based accounts (e.g. Dell, 1986; Lapointe & Dell, 1979), serial buffer accounts (Shattuck-Hufnagel, 1979) and associative chaining theories (Wickelgren, 1969). An analysis of sound movement errors from Harley and MacAndrew's (1995) corpus shows how temporal processing bears on the production of speech sounds by the temporal constraint observed in the pattern of errors, and I suggest an appropriate computational model of this process. Specifically, I show how parallel temporal processing in an oscillator-based model can account for the movement of sounds in speech. Similar predictions were made by the model to the pattern of movement errors actually observed in speech error corpora. This has been demonstrated without recourse to an assumption of frame and slot structures. The OSCillator-based Associative REcall (OSCAR) model, on the other hand, is able to account for these effects and other positional effects, providing support for a temporal based theory of serial control.
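The core intuition of an oscillator-based serial-order account is that positions close together in time have similar temporal-context signals, so temporally adjacent sounds are the ones most likely to be confused. A toy oscillator bank illustrates the effect; the frequencies here are arbitrary choices for the sketch, not values from the OSCAR model.

```python
import math

def context(t, freqs=(0.01, 0.02, 0.04)):
    """Temporal context as the state of a bank of slow oscillators:
    one (cos, sin) phase pair per oscillator at time step t."""
    v = []
    for f in freqs:
        v.append(math.cos(2 * math.pi * f * t))
        v.append(math.sin(2 * math.pi * f * t))
    return v

def similarity(a, b):
    """Cosine similarity between two context vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(x * x for x in b))

# Contexts one step apart overlap more than contexts five steps apart,
# so context-cued recall is biased toward swapping nearby sounds.
near = similarity(context(0), context(1))
far = similarity(context(0), context(5))
```

Because retrieval cued by such a context vector favours items whose stored contexts are most similar, movement errors fall off with temporal distance, the constraint observed in the error corpus.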
4

Palivela, Yaswanth. "Speech Assisted Interface for Quadcopter Flight Control." University of Toledo / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1526247041269609.

5

Bond, Rachel Jacqueline. "Cognates, competition and control in bilingual speech production." Thesis, University of New South Wales, Psychology, 2005. http://handle.unsw.edu.au/1959.4/22397.

Abstract:
If an individual speaks more than one language, there are always at least two ways of verbalising any thought to be expressed. The bilingual speaker must then have a means of ensuring that their utterances are produced in the desired language. However, prominent models of speech production are based almost exclusively on monolingual considerations and require substantial modification to account for bilingual production. A particularly important feature to be explained is the way bilinguals control the language of speech production: for instance, preventing interference from the unintended language, and switching from one language to another. One recent model draws a parallel between bilinguals' control of their linguistic system and the control of cognitive tasks more generally. The first two experiments reported in this thesis explore the validity of this model by comparing bilingual language switching with a monolingual switching task, as well as to the broader task-switching literature. Switch costs did not conform to the predictions of the task-set inhibition hypothesis in either experiment, as the 'paradoxical' asymmetry of switch costs was not replicated and some conditions showed benefits, rather than costs, for switching between languages or tasks. Further experiments combined picture naming with negative priming and semantic competitor priming paradigms to examine the role of inhibitory and competitive processes in bilingual lexical selection. Each experiment was also conducted in a parallel monolingual version. Very little negative priming was evident when speaking the second language, but the effects of interlingual cognate status were pronounced. There were some indications of cross-language competition at the level of lexical selection: participants appeared unable to suppress the irrelevant language, even when doing so would make the task easier. 
Across all the experiments, there was no evidence for global inhibition of the language-not-in-use during speech production. Overall results were characterised by a remarkable flexibility in the mechanisms of bilingual control. A striking dissociation emerged between the patterns of results for cognate and non-cognate items, which was reflected throughout the series of experiments and implicates qualitative differences in the way these lexical items are represented and interconnected.
6

Tran, Thao, and Nathalie Tkauc. "Face recognition and speech recognition for access control." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-39776.

Abstract:
This project is a collaboration with the company JayWay in Halmstad. To enter the office today, employees need a tag-key and guests use a doorbell. If someone rings the doorbell, someone on the inside has to open the door manually, which is considered a disturbance during work time. The purpose of the project is to minimize the disturbances in the office. The goal is to develop a system that uses face recognition and speech-to-text to control the lock system for the entrance door. The components used for the project are two Raspberry Pis, a 7-inch LCD touch display, a Raspberry Pi Camera Module V2, an external sound card, a microphone, and a speaker. The whole project was written in Python; Amazon Web Services (AWS) was used for storage and the face recognition, while speech-to-text was provided by Google. The system is divided into three functions, for employees, guests, and deliveries. The employee function has two authentication steps, face recognition and a randomly generated code that needs to be confirmed, to avoid biometric spoofing. The guest function uses the speech-to-text service: the guest states the name of the employee they want to meet, and that employee is then notified. The delivery function informs the specific persons in the office who are responsible for deliveries by sending a notification. Testing shows that the system always matches the right person when using face recognition. It also shows what the threshold for the face recognition can be set to in order to make sure that only authorized people enter the office. Using two-step authentication, the face recognition and the code, makes the system secure and protects it against spoofing. One downside is that the extra step takes time. The speech-to-text is set to Swedish and works quite well for Swedish-speaking persons. However, for a multicultural company it can be hard to use the speech-to-text service. It can also be hard for the service to listen and translate if there is a lot of background noise or if several people speak at the same time.
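The employee flow described above, a face match followed by confirmation of a randomly generated code, can be sketched as follows. The function names, code length, and threshold value are illustrative choices for this sketch, not details taken from the project.

```python
import secrets

def issue_code(n_digits=6):
    """Randomly generated one-time code for the second
    authentication step (anti-spoofing)."""
    return "".join(secrets.choice("0123456789") for _ in range(n_digits))

def unlock(face_confidence, threshold, entered_code, issued_code):
    """Open the door only if the face match clears the confidence
    threshold AND the user confirms the issued code."""
    return face_confidence >= threshold and entered_code == issued_code

code = issue_code()
```

The `secrets` module is used rather than `random` because the code is security-sensitive; a photograph held up to the camera could pass the face match, but not the code confirmation.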
7

Stenbäck, Victoria. "Speech masking speech in everyday communication : The role of inhibitory control and working memory capacity." Doctoral thesis, Linköpings universitet, Avdelningen för neuro- och inflammationsvetenskap, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-133194.

Abstract:
Age affects hearing and cognitive abilities. Older people, with and without hearing impairment (HI), exhibit difficulties in hearing speech in noise. Elderly individuals show greater difficulty in segregating target speech from distracting background noise, especially if the noise is competing speech with meaningful content, so-called informational maskers. Working memory capacity (WMC) has proven to be a crucial factor in comprehending speech in noise, especially for people with hearing loss. In auditory scenes where speech is disrupted by competing speech, high WMC has proven to facilitate the ability to segregate target speech and inhibit responses to irrelevant information. People with low WMC are more prone to be disrupted by competing speech and exhibit more difficulties in hearing target speech in complex listening environments. Furthermore, elderly individuals with a HI experience more difficulties in switching attention between wanted and irrelevant stimuli, and they employ more resources and time to attend to the stimuli than do normally-hearing (NH) younger adults. This thesis investigated the importance of inhibitory control and WMC for speech recognition in noise, and perceived listening effort. Four studies were conducted. In the first study, the aim was to develop a test of inhibitory control for verbal content, and to investigate the relation between inhibitory control and WMC, and how these two abilities related to speech recognition in noise, in young normally-hearing (YNH) individuals. In the second study we aimed to investigate the same relationship as in the first study to further strengthen the validity of the inhibitory test developed, as well as the importance of lexical access. It was also an aim to investigate the influence of age and hearing status on lexical access and WMC, and their respective roles for speech recognition in noise in both YNH and elderly HI (EHI) individuals. 
Studies one and two showed that, for YNH individuals, inhibitory control was related to speech recognition in noise, indicating that inhibitory control can help to predict speech-recognition-in-noise performance. The relationship between WMC and speech recognition in noise in YNH individuals shifted between the studies, suggesting that this relationship is multifaceted and varying. Lexical access was of little importance for YNH individuals, although for EHI individuals both WMC and lexical access were of importance for speech recognition in noise, suggesting that different cognitive abilities were of importance for the YNH and EHI individuals. Study three investigated the relationship between inhibitory control, WMC, speech recognition in noise, and perceived listening effort, in YNH and elderly, normally-hearing-for-their-age (ENH), individuals. In study four the same relationships as in study three were investigated, albeit in EHI individuals. Two speech materials with different characteristics, masked with four background noises, were used. The results in study three showed that less favourable SNRs were needed for informational maskers than for maskers without semantic content. ENH individuals were more susceptible to informational maskers than YNH individuals. In contrast, in study four, more favourable SNRs were needed for informational maskers. In both studies, results showed that speech-recognition-in-noise performance differed depending on the characteristics of the speech material. The studies showed that high WMC, compared to low WMC, was beneficial for speech recognition in noise, especially for informational maskers, and resulted in lower ratings of perceived effort. Varying results were found in studies three and four regarding perceived effort and inhibitory control. In study three, good inhibitory control was associated with lower effort ratings, while in study four, individuals with a HI and good inhibitory control rated effort as higher. 
The results suggest that hearing status, age, and cognitive abilities contribute to the differences in performance between YNH, ENH, and EHI individuals in speech-recognition-in-noise and cognitive tasks. This thesis has, for the first time, demonstrated that a measure of inhibitory control of verbal content is related to speech-recognition-in-noise performance in YNH, ENH and EHI individuals. Results presented in this thesis also show that both WMC and inhibitory control are related to an individual's perception of how effortful a listening task is. It also adds to the literature that WMC is related to speech-recognition-in-noise performance for ENH and EHI individuals, but that this relationship is not as robust in YNH individuals.
Age affects hearing and the cognitive abilities. Older people, with and without hearing impairment, often exhibit difficulties hearing speech in environments with background noise. They show greater difficulty in picking out a target talker, especially if the surrounding sound consists of other speech with meaningful content, so-called informational masking. Working memory has proven to be an important factor in understanding speech in noise, above all for people with hearing impairment. In sound environments where speech is disturbed by other talkers, high working memory capacity is important for supporting the ability to segregate the target talker from the interfering talkers by facilitating the inhibition of irrelevant information. Individuals with lower working memory capacity are more prone to be disturbed by other talkers and have more difficulty perceiving the target talker in complex listening situations. Furthermore, older people with hearing impairment find it harder to switch attention between relevant and irrelevant stimuli, and they devote more resources and time to surrounding stimuli than, for example, younger individuals with normal hearing. This thesis examined the importance of inhibitory control and working memory for speech perception in noise and the experience of listening effort. Four studies were conducted. The aim of the first study was to develop a test of verbal inhibitory control and to examine the relation between inhibitory control and working memory capacity, and their connection to speech perception in noise, in young normally-hearing persons. Study two examined the above relations to further strengthen the validity of the inhibitory-control test, as well as the importance of lexical access. A further aim was to examine the influence of age and hearing on lexical access and working memory capacity, and their respective roles for speech perception in noise in both young normally-hearing and elderly hearing-impaired persons. 
Studies one and two showed that inhibitory control was related to speech perception in noise for young normally-hearing persons, indicating that inhibitory control can help predict the ability to perceive speech in noise. The relation between working memory capacity and speech perception in noise in young normally-hearing persons was not solid, suggesting that the relation is multifaceted and shifting. Lexical access was of lesser importance for young normally-hearing persons, whereas for elderly hearing-impaired persons both working memory capacity and lexical access were important for speech perception in noise. This suggests that different cognitive abilities mattered for speech perception in noise in young normally-hearing and elderly hearing-impaired persons. Study three examined the relation between inhibitory control, working memory capacity, speech perception in noise, and perceived listening effort in young and elderly, normally-hearing-for-their-age, persons. Two speech materials with different characteristics were used, masked with four different background noises. The results showed that less favourable signal-to-noise ratios were reached when informational masking was used compared with noise without semantic content. Elderly normally-hearing persons were more susceptible to informational masking than young normally-hearing persons. High working memory capacity and good inhibitory control were beneficial for speech perception in noise and resulted in less perceived listening effort, compared with persons with lower working memory capacity and poorer inhibitory control. The results suggest that age-related declines in hearing ability and certain cognitive abilities contribute to the differences in performance between young and elderly normally-hearing persons in the ability to perceive speech in noise. Study four examined the same relations as study three, albeit in elderly persons with mild to moderate sensorineural hearing loss. 
The results showed that the ability to perceive speech in noise varied depending on the characteristics of the speech material and on which background noise was used. High working memory capacity and good inhibitory control were advantageous for speech perception, particularly when informational masking was used. Persons with high working memory experienced less listening effort, whereas good inhibitory control was associated with higher perceived listening effort. The present thesis demonstrates, for the first time, that verbal inhibitory control is related to the ability to perceive speech in noise in young and elderly normally-hearing persons and in elderly persons with hearing impairment. The results presented in the thesis show that both working memory capacity and inhibitory control are associated with an individual's experience of how effortful a listening situation is. The thesis also supports earlier research showing that working memory capacity is related to the ability to perceive speech in noise in elderly normally-hearing and elderly hearing-impaired persons, but that this relation is not as solid for young normally-hearing persons.
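The signal-to-noise ratios reported in studies three and four are the standard dB measure of signal power relative to masker power. As a reminder of the quantity being compared across masker types, it can be computed from sample sequences like this (the sample values are toy data, not the thesis materials):

```python
import math

def snr_db(signal, masker):
    """Signal-to-noise ratio in dB from two sample sequences."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_masker = sum(x * x for x in masker) / len(masker)
    return 10.0 * math.log10(p_signal / p_masker)
```

A "less favourable SNR" in the text means the listener still recognised speech at a lower value of this quantity, i.e. with relatively more masker energy present.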
8

Ward, David. "Intrinsic timing, extrinsic timing and stuttered speech." Thesis, University of Reading, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.309521.

9

Mitchell, Douglas Alan. "Control of high-speed cavity flow using plasma actuators." Connect to resource, 2006. http://hdl.handle.net/1811/6439.

Abstract:
Thesis (Honors)--Ohio State University, 2006.
Title from first page of PDF file. Document formatted into pages: contains x, 63 p.; also includes graphics. Includes bibliographical references (p. 50-51). Available online via Ohio State University's Knowledge Bank.
10

Chiam, Ruth. "Speech Motor Control in English-Mandarin Bilinguals who stutter." Thesis, University of Canterbury. Department of Communication Disorders, 2013. http://hdl.handle.net/10092/7793.

Abstract:
Research examining bilinguals who stutter (BWS) is limited; in particular, few studies have examined features of speech motor control in BWS. The present study was designed to examine features of speech motor control in bilingual speakers of Mandarin and English. Speech motor control was examined through acoustic analysis of speaking rate, voice onset time (VOT) and stuttering adaptation. Participants ranged in age from 9 to 27 years. On completion of a language dominance questionnaire, two BWS participants were found to be English-dominant and three were Mandarin-dominant. Each BWS participant was matched to an age- and sex-matched control participant who does not stutter (BWNS). Results for the BWS participants showed more stuttering in the less dominant language based on a measure of percentage of syllables stuttered. All of the BWS participants demonstrated stuttering adaptation and there was no significant difference in the amount of adaptation for Mandarin and English. There was no difference found between BWS and BWNS for speaking rate and VOT. In spite of the similarity between BWS and BWNS, speaking rate in Mandarin appeared to be faster compared to English. These findings suggest that speech motor control in BWS and BWNS is similar, and current application of these findings to the clinical setting is discussed.
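The severity and rate measures used in analyses like this, percentage of syllables stuttered and speaking rate, are simple ratios; a minimal sketch (the function names are mine, not the thesis's):

```python
def percent_syllables_stuttered(n_stuttered, n_syllables):
    """%SS: stuttered syllables as a percentage of all syllables spoken."""
    return 100.0 * n_stuttered / n_syllables

def speaking_rate(n_syllables, duration_s):
    """Speaking rate in syllables per second of speaking time."""
    return n_syllables / duration_s
```

Comparing %SS between each speaker's two languages is what supports the "more stuttering in the less dominant language" finding, while speaking rate and VOT index the underlying motor timing.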
11

Ward, Karen. "A speech act model of air traffic control dialogue /." Full text open access at:, 1992. http://content.ohsu.edu/u?/etd,228.

12

Miura, Takayuki. "Executive control in speech comprehension : bilingual dichotic listening studies." Thesis, University of Edinburgh, 2014. http://hdl.handle.net/1842/9740.

Abstract:
In this dissertation, the traditional dichotic listening paradigm was integrated with the notion of working memory capacity (WMC) to explore the cognitive mechanism of bilingual speech comprehension at the passage level. A bilingual dichotic listening (BDL) task was developed and administered to investigate characteristics of bilingual listening comprehension, which include semantic relatedness, unattended language, ear preference, auditory attentional control, executive control, voluntary note-taking, and language switching. The central concept of the BDL paradigm is that the auditory stimuli are presented in the bilinguals' two languages and their attention is directed to one of their ears while they have to overcome cognitive and linguistic conflicts caused by information in the other ear. Different experimental manipulations were employed in the BDL task to examine the characteristics of bilingual listening comprehension. The bilingual population examined was Japanese-English bilinguals with relatively high second language (L2) proficiency and WMC. Seven experiments and seven cross-experimental comparisons are reported. Experiment 1 employed the BDL task with pairs of passages that had different semantic relationships (i.e., related or unrelated) and were heard in different languages (i.e., L1 or L2). The semantically related passages were found to interfere with comprehension of the attended passage more than the semantically unrelated passages, whether the attended and unattended languages were the same or different. Contrary to the theories of bilingual language control, unattended L1 was found to enhance comprehension of the attended passage, regardless of semantic relationships and language it was heard in. L2 proficiency and WMC served as good predictors of resolution of the cognitive and linguistic conflicts. 
The BDL task is suggested to serve as an experimental paradigm to explore executive control and language control in bilingual speech comprehension. Experiment 2 was conducted to investigate language lateralisation (i.e., ear preference) on bilingual speech comprehension, hence, the participants in Experiment 1 used their preferred ear, whereas participants in Experiment 2 used their non-preferred ear, whether it was left or right, in the BDL task. Comprehension was better through the preferred ear, indicating that there is a favourable ear-to-hemisphere route for understanding bilinguals’ two languages. Most of the participants were found to be left-lateralised (i.e., right-eared) and some to be right-lateralised (i.e., left-eared) presumably depending on their L2 proficiency and WMC. Experiment 3 was concerned with auditory attentional control, and explored whether there would be a right-ear advantage (REA). The participants indicated an REA whether the attended and unattended languages were L1 or L2. When they listened to Japanese in the left ear, they found it more difficult to suppress Japanese in the right ear than English. WMC was not required as much as expected for auditory attentional control probably because the passages in Experiment 3 did not yield as much semantic competition as those in Experiment 1. L2 proficiency was crucial for resolving within- and between-language competition in each ear. Experiments 4, 5, and 6 were replications of Experiments 1, 2 and 3, but these latter experiments considered the effect of note-taking that is commonly performed in everyday listening situations. Note-taking contributed to better performance and clearer understanding of the role of WMC in bilingual speech comprehension. A cross-experimental analysis between Experiments 1, 2, 4, and 5 revealed not only a facilitatory role of note-taking in bilingual listening comprehension in general, but also a hampering role when listening through the preferred ear. 
Experiment 7 addressed the effect of predictability of language switching by presenting L1 and L2 in a systematic order while switching attention between ears and comparing the result with that of Experiment 6 where language switching was unpredictable. The effect of predictability of language switching was different between ears. When language switches were predictable, higher comprehension was observed in the left ear than the right ear, and when language switches were unpredictable, higher comprehension was observed in the right ear than the left ear, thereby suggesting a mechanism of asymmetrical language control. WMC was more related to processing of predictable language switches than that of unpredictable language switches. The dissertation ends with discussions of the implications from the seven BDL experiments and possible applications, along with experimental techniques from other relevant disciplines that might be used in future research to yield additional insight into how bilingual listeners sustain their listening performance in their two languages in the real-life situations.
13

Soni, Maya. "Semantics in speech production." Thesis, University of Manchester, 2011. https://www.research.manchester.ac.uk/portal/en/theses/semantics-in-speech-production(c446ac01-7c32-468a-816b-04993347e135).html.

Abstract:
The semantic system contributes to the process of speech production in two major ways. The basic information is contained within semantic representations, and the semantic control system manipulates that knowledge as required by task and context. This thesis explored the evidence for interactivity between semantic and phonological stages of speech production, and examined the role of semantic control within speech production. The data chapters focussed on patients with semantic aphasia or SA, who all have frontal and/or temporoparietal lesions and are thought to have a specific impairment of semantic control. In a novel development, grammatical class and cueing effects in this patient group were compared with healthy participants under tempo naming conditions, a paradigm which is thought to impair normal semantic control by imposing dual task conditions. A basic picture naming paradigm was used throughout, with the addition of different grammatical classes, correct and misleading phonemic cues, and repetition and semantic priming: all these manipulations could be expected to place differing loads on a semantic control system with either permanent or experimentally induced impairment. It was found that stimuli requiring less controlled processing such as high imageability objects, pictures with simultaneous correct cues or repetition primed pictures were named significantly more accurately than items which needed more controlled processing, such as low imageability actions, pictures with misleading phonemic cues and unprimed pictures. The cueing evidence offered support to interactive models of speech production where phonological activation is able to influence semantic selection. 
The impairment in tasks such as the inhibition of task-irrelevant material seen in SA patients and tempo participants, and the overlap between cortical areas cited in studies looking at both semantic and wider executive control mechanisms suggest that semantic control may be part of a more generalised executive system.
14

Rohani, Mehdiabadi Behrooz. "Power control for mobile radio systems using perceptual speech quality metrics." University of Western Australia. School of Electrical, Electronic and Computer Engineering, 2007. http://theses.library.uwa.edu.au/adt-WU2007.0174.

Abstract:
As the characteristics of mobile radio channels vary over time, transmit power must be controlled accordingly to ensure that the received signal level is within the receiver's sensitivity. As a consequence, modern mobile radio systems employ power control to regulate the received signal level such that it is neither below nor excessively above receiver sensitivity, in order to maintain adequate service quality. In this context, speech quality measurement is an important aspect in the delivery of speech services as it impacts customer satisfaction as well as the usage of precious system resources. A variety of techniques for speech quality measurement has been produced over the last few years as a result of tireless research in the area of perceptual speech quality estimation. These are mainly based on psychoacoustic models of the human auditory system. However, these techniques cannot be directly applied for real-time communication purposes as they typically require a copy of the transmitted and received speech signals for their operation. This thesis presents a novel technique of incorporating perceptual speech quality metrics with power control for mobile radio systems. The technique allows for standardized perceptual speech quality measurement algorithms to be used for in-service measurement of speech quality. The accuracy of the proposed Real-Time Perceptual Speech Quality Measurement (RTPSQM) technique with respect to measuring speech quality is first validated by extensive simulations. On this basis, RTPSQM is applied to power control in the Global System for Mobile (GSM) communication and the Universal Mobile Telecommunication System (UMTS). It is shown by simulations that the use of perceptual-based power control in GSM and UMTS outperforms conventional power control in terms of reducing the transmitter signal power required for providing adequate speech quality. 
This in turn increases system capacity and thus offers better utilization of available system resources. To enable an analytical performance assessment of perceptual speech quality metrics in power control, the mathematical frameworks for conventional and perceptual-based power control are derived. The derivations are performed for Code Division Multiple Access (CDMA) systems and kept as generic as possible. Numerical results are presented which could be used in a system design to readily find the Erlang capacity per cell for either of the considered power control algorithms.
APA, Harvard, Vancouver, ISO, and other styles
15

Shiller, Douglas M. "Understanding speech motor control in the context of orofacial biomechanics." Thesis, McGill University, 2002. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=84435.

Full text
Abstract:
A series of experiments are described which explore the relationship between biomechanical properties and the control of jaw movement in speech. This relationship is documented using kinematic analyses in conjunction with a mathematical model of jaw motion and direct measures of jaw stiffness.
In the first experiment, empirical and modeling studies were carried out to examine whether the nervous system compensates for naturally occurring forces acting on the jaw during speech. As subjects walk or run, loads to the jaw vary with the direction and magnitude of head acceleration. While these loads are large enough to produce a measurable effect on jaw kinematics, variation in jaw position during locomotion is shown to be substantially reduced when locomotion is combined with speech. This reduction in jaw motion is consistent with the idea that in speech, the control of jaw movement is adjusted to offset the effects of head acceleration. Results of simulation studies using a physiologically realistic model of the jaw provide further evidence that subjects compensate for the effects of self-generated loads by adjusting neural control signals.
A second experiment explores the idea that a principal mechanical property of the jaw, its spring-like behavior or stiffness, might influence patterns of kinematic variation in speech movements. A robotic device was used to deliver mechanical perturbations to the jaw in order to quantify stiffness in the mid-sagittal plane. The observed stiffness patterns were non-uniform, with higher stiffness in the protrusion-retraction direction. Consistent with the idea that kinematic patterns reflect directional asymmetries in stiffness, a detailed relationship between jaw kinematic variability and stiffness was observed: kinematic variability was consistently higher under conditions in which jaw stiffness was low. Modeling studies suggested that the pattern of jaw stiffness is significantly determined by jaw geometrical properties and muscle force generating abilities.
A third experiment examines the extent to which subjects are able to alter the three-dimensional pattern of jaw stiffness in a task-dependent manner. Destabilizing loads were applied to the jaw in order to disrupt the ability of subjects to maintain a static jaw posture. Subjects adapted by increasing jaw stiffness in a manner that depended on the magnitude and, to a more limited extent, direction of the destabilizing load. The results support the idea that stiffness properties can be controlled in the jaw, and thus may play a role in regulating mechanical interactions in the orofacial system.
APA, Harvard, Vancouver, ISO, and other styles
16

Schäfer, Dirk. "Context-sensitive speech recognition in the air traffic control simulation." [S.l. : s.n.], 2000. http://deposit.ddb.de/cgi-bin/dokserv?idn=961514280.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Kuffel, Robert F. "Speech recognition software : an alternative to reduce ship control manning /." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2004. http://library.nps.navy.mil/uhtbin/hyperion/04Mar%5FKuffel.pdf.

Full text
Abstract:
Thesis (M.S. in Information Systems and Operations)--Naval Postgraduate School, March 2004.
Thesis advisor(s): Russell Gottfried, Monique P. Fargues. Includes bibliographical references (p. 43-45). Also available online.
APA, Harvard, Vancouver, ISO, and other styles
18

Romsdorfer, Harald. "Polyglot text-to-speech synthesis: text analysis & prosody control." Aachen: Shaker, 2009. http://d-nb.info/993448836/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Ong, Leh Kui. "Source reliant error control for low bit rate speech communications." Thesis, University of Surrey, 1994. http://epubs.surrey.ac.uk/843456/.

Full text
Abstract:
Contemporary and future speech telecommunication systems now utilise low bit rate (LBR) speech coding techniques in efforts to eliminate bandwidth expansion as a disadvantage of digital coding and transmission. These speech coders employ model-based approaches in compressing human speech into a number of parameters, using a well-known process known as linear predictive coding (LPC). However, a major side-effect observed in these coders is that errors in the model parameters have noticeable and undesirable consequences on the synthesised speech quality, and unless they are protected from such corruptions, the level of service quality will deteriorate rapidly. Traditionally, forward error correction (FEC) coding is used to remove these errors, but such codes require substantial redundancy. Therefore, a different perspective on the error control problems and solutions is necessary. In this thesis, emphasis is constantly placed on exploiting the constraints and residual redundancies present in the model parameters. It is also shown that with such source criteria in the LBR speech coders, varying degrees of error protection from channel corruptions are feasible. From these observations, error control requirements and methodologies, using both block- and parameter-orientated aspects, are analysed, devised and implemented. It is evident that, under the unusual circumstances in which LBR speech coders have to operate, the importance and significance of source reliant error control will continue to attract research and commercial interest. The work detailed in this thesis is focused on two LPC-based speech coders. One of the ideas developed for these two coders is an advanced zero redundancy scheme for the LPC parameters which is designed to operate at high channel error rates. Another concept proposed here is the use of source criteria to enhance the decoding capabilities of FEC codes beyond maximum likelihood decoding performance.
Lastly, for practical operation of LBR speech coders, lost frame recovery strategies are viewed as an indispensable part of error control. This topic is scrutinised in this thesis by investigating the behaviour of a specific speech coder under irrecoverable error conditions. In all of the ideas pursued above, the effectiveness of the algorithms formulated here is quantified using both objective and subjective tests. Consequently, the capabilities of the techniques devised in this thesis can be demonstrated, examples of which are: (1) higher speech quality produced under noisy channels, using an improved zero-redundancy algorithm for the LPC filter coefficients; (2) as much as 50% improvement in the residual BER and decoding failures of FEC schemes, through the utilisation of source criteria in LBR speech coders; and (3) acceptable speech quality produced under high frame loss rates (14%), after formulating effective strategies for recovery of speech coder parameters. It is hoped that the material described here provides concepts which can help achieve the ideals of maximum efficiency and quality in LBR speech telecommunications.
APA, Harvard, Vancouver, ISO, and other styles
20

Wang, Yonglian. "Speech Recognition under Stress." Available to subscribers only, 2009. http://proquest.umi.com/pqdweb?did=1968468151&sid=9&Fmt=2&clientId=1509&RQT=309&VName=PQD.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Theron, Karin. "Temporal aspects of speech production in bilingual speakers with neurogenic speech disorders." Diss., Pretoria : [s.n.], 2003. http://upetd.up.ac.za/thesis/available/etd-08072003-152242.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Odlozinski, Lisa M. "An acoustic analysis of speech rate control procedures in Parkinson's disease." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/tape17/PQDD_0004/MQ30738.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Leung, Man-tak, and 梁文德. "The role of proprioceptive and auditory feedback on speech motor control." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B31241967.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Leung, Man-tak. "The role of proprioceptive and auditory feedback on speech motor control." Hong Kong : University of Hong Kong, 2001. http://sunzi.lib.hku.hk/hkuto/record.jsp?B22805503.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Thornton, David Gilbert Juan E. "Talking games an empirical study of speech-based cursor control mechanisms /." Auburn, Ala, 2008. http://repo.lib.auburn.edu/EtdRoot/2008/FALL/Computer_Science_and_Software_Engineering/Dissertation/Thornton_David_31.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Forster, David C. (David Clarke) Carleton University Dissertation Psychology. "Speech-motor control and interhemispheric relations in recovered and persistent stuttering." Ottawa, 1996.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
27

Liu, Xunying. "Discriminative complexity control and linear projections for large vocabulary speech recognition." Thesis, University of Cambridge, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.613815.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Baber, Christopher. "The human factors of automatic speech recognition in control room systems." Thesis, Aston University, 1990. http://publications.aston.ac.uk/10839/.

Full text
Abstract:
This thesis addresses the viability of automatic speech recognition for control room systems; with careful system design, automatic speech recognition (ASR) devices can be a useful means for human computer interaction in specific types of task. These tasks can be defined as complex verbal activities, such as command and control, and can be paired with spatial tasks, such as monitoring, without detriment. It is suggested that ASR use be confined to routine plant operation, as opposed to critical incidents, due to possible problems of stress on the operators' speech. Some solutions to the problems of stress are given. From a series of studies, it is concluded that the interaction be designed to capitalise upon the tendency of operators to use short, succinct, and task specific styles of speech. From studies comparing different types of feedback, it is concluded that operators be given screen based feedback rather than auditory feedback. Feedback will take two forms: the use of the ASR device will require recognition feedback, which will be best supplied using text; the performance of a process control task will require feedback integrated into the mimic display. This latter feedback can be either textual or symbolic, but it is suggested that symbolic feedback will be more beneficial. Related to both interaction style and feedback is the issue of handling recognition errors. These should be corrected by simple command repetition practices, rather than through error handling dialogues. This thesis also addresses some of the problems of user error in ASR use, and provides a number of recommendations for its reduction. Before using the ASR device, new operators will require some form of training. It is shown that a demonstration by an experienced user of the device can lead to performance superior to that achieved with instructions. Thus a relatively cheap and very efficient form of operator training can be supplied by demonstration by experienced ASR operators.
APA, Harvard, Vancouver, ISO, and other styles
29

Spencer, Caroline. "Neural Mechanisms of Intervention in Residual Speech Sound Disorder." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1617106948666706.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Lahey, Michael. "Soft control: Television's relationship to digital micromedia." Thesis, Indiana University, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3607011.

Full text
Abstract:

This dissertation explores the role soft control plays in the relationship between the television industry and short forms of digital media. Following James Beniger and Tiziana Terranova, I define soft control as the purposive movement by the television industry towards shaping audience attention toward predetermined goals through a range of interactions where development happens somewhat autonomously, while being interjected with commands over time. I define such things as media environment design, branding, and data collection as soft control practices. I focus on television as a way to understand how an industry historically patterned around more rigid forms of audience control deals with a digital media environment often cited for its lack of control features. And while there is already a robust discussion on the shifting strategies for the online distribution of shows, there is less of a focus on the increasing importance of shorter forms of digital media to the everyday operation of the television industry. Shorter forms of media include digitally circulated short videos, songs, casual digital games, and even social media, which is itself a platform for the distribution of shorter forms of media. I refer to all these forms of short media as "micromedia" and focus my interest on how various television companies are dealing with media environments saturated with it.

To do this I look at, for instance, how television companies use the data available on Twitter and appropriate the user-generated content of audiences, as well as how standard digital communication interfaces are utilized to more easily retrofit previous audience retention practices into new digital environments. Through the investigation of how television creates and appropriates micromedia as a way to reconfigure practices into the everyday lives of participatory audiences, I argue that we can see soft control elements at work in structuring the industry-audience relationship. These soft control features call into question the emancipatory role attributed to participatory audiences and digital technologies alike. If we think about media forms in their specific contexts, making sure to focus on their intermedial connections and their materiality, we can complicate ideas about what the categories of audience or industrial control mean.

APA, Harvard, Vancouver, ISO, and other styles
31

Clapp, Amanda Louise. "Investigating cognitive control in language switching." Thesis, University of Exeter, 2013. http://hdl.handle.net/10871/14106.

Full text
Abstract:
How do bi/multilinguals switch between languages so effectively that there is no obvious intrusion from the alternatives? One can examine this by comparing language selection with task selection, or language switching with task switching. This is the approach adopted in the first of two strands of research presented in this thesis. In task switching, providing advance warning of the task typically leads to a reduction in the performance ‘switch cost’, suggesting top-down biasing of task selection. It is not clear whether the language switch cost also reduces with preparation, partly because there have been very few attempts to examine preparation for a language switch, and partly because these attempts suffered from non-trivial methodological drawbacks. In Experiments 1-3 I used an optimised picture naming paradigm in which language changed unpredictably and was specified by a language cue presented at different intervals before the picture. Experiment 1, conducted on ‘unbalanced’ bilinguals, revealed some evidence of reduction in the language switch cost for naming times with preparation, but only when cue duration was short. In an attempt to further optimise the paradigm, in Experiment 2 the cue-stimulus interval (which was varied from trial to trial in Experiment 1), was varied over blocks instead. Visual cues were replaced with auditory cues – the latter also enabled a comparison between semantically transparent word cues (the spoken names of the languages) and less transparent cues (fragments of national anthems). Experiment 2 revealed a reduction in switch cost with preparation for naming latencies, but only in the second language; the first language showed the reverse. To examine whether the increase in switch cost with preparation in the first language could be due to unbalanced bilinguals biasing processing towards L2, balanced bilinguals were tested in Experiment 3. 
This revealed a robust reduction in switch cost in naming latencies for both languages, which was driven primarily by the trials with the anthem cues. However, in the error rates the switch cost increased with preparation interval, thus complicating the interpretation of the reduction observed for response times. Experiment 4 investigated whether preparation for a language switch elicits the electrophysiological patterns commonly found during preparation for a task switch – a switch-induced positive polarity Event-Related Potential (ERP) with a posterior scalp distribution. Contrary to a recent report of the absence of the posterior positivity in language switching, it was clearly present in the present EEG data. As in task switching, the amplitude of the posterior positivity predicted performance. The electrophysiological data suggest that preparation for a language switch and preparation for a task switch rely on highly overlapping control mechanisms. The behavioural data suggest that advance control can be effective in language switching, but perhaps not as effective as in task switching. Experiments 1-3 also examined the effect of stimulus associative history (whether the language used on the previous encounter with a given stimulus influenced performance on the current trial). Having previously named a given picture in the same language benefited overall performance, but did not do so more for switches than repeats. Thus, stimulus associative history does not seem to contribute to the language switch cost. The second strand of my research asked whether bilinguals can set themselves independently for speech vs. comprehension. Previous research has examined the cost of switching the language in output tasks and in input tasks. But, it is not clear whether one can apply separate control settings for input and output selection. To investigate this, I used a paradigm that combined switching languages for speech production and comprehension.
My reasoning was that, if there is cross-talk between the control settings for input vs. output, performance in one pathway should benefit if the language selected for the other pathway is the same relative to when it is different: a ‘language match effect’. Conversely, if there is no cross-talk, there should not be a language match effect. In Experiment 5 bilinguals alternated predictably between naming numbers in their first and second language (in runs of 3 trials), whilst also having to semantically categorise spoken words which occasionally (and unpredictably) replaced the numbers. The language of the categorisation ‘probes’ varied over blocks of ~17 naming runs, but was constant within a block. The results showed a clear match effect in the input task (categorisation), but not the output task (naming). To examine the potential role of proficiency, Experiment 6 used the same paradigm to test unbalanced and balanced bilinguals. The pattern of results was qualitatively similar in both groups to that observed in Experiment 5: a language match effect confined to the input task. These results suggest ‘leakage’ from the output control settings into the input control settings.
APA, Harvard, Vancouver, ISO, and other styles
32

Green, Jordan R. "Physiologic development of speech motor control : articulatory coordination of lips and jaw /." Thesis, Connect to this title online; UW restricted, 1998. http://hdl.handle.net/1773/8254.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Spaulding, Tammie J. "Attentional Control in Preschool Children with Specific Language Impairment." Diss., The University of Arizona, 2008. http://hdl.handle.net/10150/194819.

Full text
Abstract:
This research was guided by a theoretical framework positing that children with typical language apply general cognitive resources, such as attention, to facilitate language acquisition, and limitations in these processes may contribute to poor language skills. From this perspective, studying the attentional functioning of children who exhibit difficulty with language would have value for both informing this theory and understanding the nature of the disorder. However, research on the attention of children with specific language impairment (SLI) is limited, as only a few subdomains have been addressed to date. In addition, although school-age children with SLI have been studied, the assessment of attentional functioning in preschool children with this disorder has been minimal. This is likely the result of the limitations inherent in the methods used for evaluating attentional skills at younger ages. The purpose of this research was to extend a method previously used successfully with preschool children to study selected aspects of attentional control including susceptibility to distraction, inhibitory control, and updating skills. The research questions were: (a) Do children with SLI exhibit increased susceptibility to distraction relative to their typically-developing peers, and if so, does it vary according to the type of distracter (visual, nonverbal-auditory, linguistic) presented? (b) Do children with SLI exhibit poor inhibitory control relative to their typically-developing peers? (c) Do children with SLI and their typically-developing peers display evidence of updating? Thirty-one preschool children with SLI and 31 controls participated in two computer tasks designed to assess these mechanisms of attentional control. The susceptibility to distraction task involved resisting distracters presented in different stimulus modalities (visual and auditory-linguistic/nonlinguistic). Inhibition and updating skills were assessed using a stop signal paradigm.
In comparison to typically-developing children, the children with SLI exhibited increased susceptibility to distraction and poor inhibitory control. Unlike the controls, they exhibited no evidence of updating. The results of this investigation will contribute to a long-term goal of addressing how attention may affect language acquisition in children with SLI. In addition, the successful methodology employed in this study may offer an improved procedure for diagnosing attentional difficulties at an early age, regardless of language status.
APA, Harvard, Vancouver, ISO, and other styles
34

Scott, Sarah Jane. "Comparing Speech Movements in Different Types of Noise." BYU ScholarsArchive, 2014. https://scholarsarchive.byu.edu/etd/4226.

Full text
Abstract:
This study examined the impact of several noise conditions on speech articulator movements during a sentence repetition task. Sixty participants in three age groups ranging from 20 to 70 repeated a sentence under five noise conditions. Lower lip movements during production of a target sentence were used to compute the spatiotemporal index (STI). It was hypothesized that STI would be lower (indicating greater stability) in the silent baseline condition. There were changes in speech production under several of the noise conditions. The duration for the 1-talker condition was significantly shorter when compared to the silent condition, which could be due to the impact of the 1-talker noise on the attention of the speaker. The peak velocity of a selected closing gesture increased in all of the noise conditions compared to silence. It could be speculated that the repetitive and predictable nature of the speaking task allowed participants to easily filter out the noise while automatically increasing the velocity of lip movements, and consequently, the rate of speech. The STI in the pink noise and 6-talker conditions was lower than in the silent condition, which may be interpreted to reflect a steadier manner of speech production. This could be due to the fact that in the 6-talker noise condition, the overall effect was more similar to continuous noise, and thus potentially less distracting than hearing a single speaker talking. The count of velocity peaks was unexpectedly lower in the noise conditions compared to speech in silence, suggesting a smoother pattern of articulator movement. The repetitiveness of the task may not require a high level of self-monitoring, resulting in speech output that was more automatic in the noise conditions. With the presentation of noise during a speaking task, vocal intensity increased due to the Lombard effect in all of the noise conditions. 
People communicate in noisy environments every day, and an increased understanding of the effects of noise on speech would have value from both theoretical and clinical perspectives.
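The spatiotemporal index used in this study is a standard measure of movement stability. The sketch below is an illustrative implementation, not code from the thesis, assuming the usual formulation: each repetition of the movement record is linearly time-normalized to a common length and amplitude-normalized (z-scored), and the STI is the sum of the standard deviations across repetitions at each normalized time point.

```python
import numpy as np

def spatiotemporal_index(trajectories, n_points=50):
    """Compute the STI for repeated movement records.

    trajectories: list of 1-D arrays, one per sentence repetition
    (e.g. lower-lip displacement). Identical repetitions yield an
    STI of 0; more variable repetitions yield a larger STI.
    """
    normalized = []
    for traj in trajectories:
        traj = np.asarray(traj, dtype=float)
        # time-normalize by linear interpolation onto a common axis
        t_old = np.linspace(0.0, 1.0, len(traj))
        t_new = np.linspace(0.0, 1.0, n_points)
        resampled = np.interp(t_new, t_old, traj)
        # amplitude-normalize (zero mean, unit variance)
        resampled = (resampled - resampled.mean()) / resampled.std()
        normalized.append(resampled)
    stacked = np.vstack(normalized)
    # sum of across-repetition standard deviations over time points
    return float(np.sum(np.std(stacked, axis=0)))
```

Under this formulation, a lower STI in a given noise condition indicates more consistent lip-movement patterns across repetitions, which is how the pink-noise and 6-talker results above are interpreted.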
APA, Harvard, Vancouver, ISO, and other styles
35

Wang, Yao, Electrical Engineering & Telecommunications, Faculty of Engineering, UNSW. "Single channel speech enhancement based on perceptual temporal masking model." Awarded by: University of New South Wales, Electrical Engineering & Telecommunications, 2007. http://handle.unsw.edu.au/1959.4/40454.

Full text
Abstract:
In most speech communication systems, the presence of background noise causes the quality and intelligibility of speech to degrade, especially when the Signal-to-Noise Ratio (SNR) is low. Numerous speech enhancement techniques have been employed successfully in many applications. However, at low signal-to-noise ratios most of these speech enhancement techniques tend to introduce a perceptually annoying residual noise known as "musical noise". The research presented in this thesis aims to minimize this musical noise and maximize the noise reduction ability of speech enhancement algorithms to improve speech quality in low SNR environments. This thesis proposes two novel speech enhancement algorithms based on Weiner and Kalman filters, and exploit the masking properties of the human auditory system to reduce background noise. The perceptual Wiener filter method uses either temporal or simultaneous masking to adjust the Wiener gain in order to suppress noise below the masking thresholds. The second algorithm involves reshaping the corrupted signal according to the masking threshold in each critical band, followed by Kalman filtering. A comparison of the results from these proposed techniques with those obtained from traditional methods suggests that the proposed algorithms address the problem of noise reduction effectively while decreasing the level of the musical noise. In this thesis, many other existing competitive noise suppression methods have also been discussed and their performance evaluated under different types of noise environments. The performances were evaluated and compared to each other using both objective PESQ measures (ITU-T P.862) and subjective listening tests (ITU-T P.835). The proposed speech enhancement schemes based on the auditory masking model outperformed the other methods that were tested.
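The general principle behind a perceptually adjusted Wiener gain can be sketched as follows. This is a minimal illustration of the idea, not the thesis's actual algorithm: the function name and the simple flooring rule are assumptions. Suppressing noise that is already below the masking threshold buys nothing audible, so the per-bin Wiener gain is floored at the value that leaves the residual noise just at the threshold, reducing over-suppression and hence musical noise.

```python
import numpy as np

def perceptual_wiener_gain(speech_psd, noise_psd, mask_thresh):
    """Per-frequency-bin Wiener gain, relaxed where noise is masked.

    speech_psd, noise_psd: estimated power spectral densities
    mask_thresh: masking threshold per bin (same units as the PSDs)
    """
    speech_psd = np.asarray(speech_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    mask_thresh = np.asarray(mask_thresh, dtype=float)
    # classical Wiener gain per bin
    g = speech_psd / (speech_psd + noise_psd)
    # residual noise after applying gain g is roughly g**2 * noise_psd;
    # floor the gain so the residual sits at (not below) the threshold
    g_floor = np.sqrt(np.minimum(mask_thresh / noise_psd, 1.0))
    return np.maximum(g, g_floor)
```

Where the noise already lies below the masking threshold the floor reaches 1.0 and the signal passes unmodified, avoiding needless speech distortion; where noise is audible, the ordinary Wiener attenuation applies.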
APA, Harvard, Vancouver, ISO, and other styles
36

Romsdorfer, Harald [Verfasser]. "Polyglot Text-to-Speech Synthesis : Text Analysis & Prosody Control / Harald Romsdorfer." Aachen : Shaker, 2009. http://d-nb.info/1156517354/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Steeve, Roger William. "Mandibular motor control during the early development of speech and nonspeech behaviors /." Thesis, Connect to this title online; UW restricted, 2004. http://hdl.handle.net/1773/8220.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Niziolek, Caroline A. "The role of linguistic contrasts in the auditory feedback control of speech." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/62521.

Full text
Abstract:
Thesis (Ph. D. in Speech and Hearing Bioscience and Technology)--Harvard-MIT Division of Health Sciences and Technology, 2010.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 165-180).
Speakers use auditory feedback to monitor their own speech, ensuring that the intended output matches the observed output. By altering the acoustic feedback signal before it reaches the speaker's ear, we can induce auditory errors: differences between what is expected and what is heard. This dissertation investigates the neural mechanisms responsible for the detection and consequent correction of these auditory errors. Linguistic influences on feedback control were assessed in two experiments employing auditory perturbation. In a behavioral experiment, subjects spoke four-word sentences while the fundamental frequency (F0) of the stressed word was perturbed either upwards or downwards, causing the word to sound more or less stressed. Subjects adapted by altering both the F0 and the intensity contrast between stressed and unstressed words, even though intensity remained unperturbed. An integrated model of prosodic control is proposed in which F0 and intensity are modulated together to achieve a stress target. In a second experiment, functional magnetic resonance imaging was used to measure neural responses to speech with and without auditory perturbation. Subjects were found to compensate more for formant shifts that resulted in a phonetic category change than for formant shifts that did not, despite the identical magnitudes of the shifts. Furthermore, the extent of neural activation in superior temporal and inferior frontal regions was greater for cross-category than for within-category shifts, evidence that a stronger cortical error signal accompanies a linguistically-relevant acoustic change. Taken together, these results demonstrate that auditory feedback control is sensitive to linguistic contrasts learned through auditory experience.
by Caroline A. Niziolek.
Ph.D. in Speech and Hearing Bioscience and Technology
APA, Harvard, Vancouver, ISO, and other styles
39

Karlsson, Joakim. "The integration of automatic speech recognition into the air traffic control system." Thesis, Massachusetts Institute of Technology, 1990. http://hdl.handle.net/1721.1/42184.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Obelnicki, Mary Carolyn 1976. "Speech and gesture integration for a game-based command and control environment." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86520.

Full text
Abstract:
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.
Includes bibliographical references (p. 63-66).
by Mary Carolyn Obelnicki.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
41

Nygren, Mårten. "Improved speech communication in a car." Thesis, Linköping University, Department of Electrical Engineering, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-1720.

Full text
Abstract:

In modern cars a lot of effort is put into reducing the background noise level. Despite these efforts it is often difficult for persons in the rear seat(s) to hear the persons in the front seat. This is partly due to the background noise, but also to the geometry and acoustic properties of the passenger compartment.

The aim of this thesis was to implement a speech enhancement system to increase the audibility between the driver and the rear passenger(s). The speech enhancement system should not affect the directivity of the speech or increase the background noise level.

A speech enhancement system has been implemented on a DSP in a test car. A microphone was placed in front of the driver to collect his/her speech. The microphone signal was bandpass filtered to remove the main part of the background noise and to avoid aliasing. The signal was delayed before it was sent out through the rear loudspeakers. The delay ensured that the direct speech from the driver reached the rear passengers before the sound from the rear loudspeakers. This delay was enough to give the sound the right directivity, i.e. making the speech sound as if it came from the driver instead of from the rear loudspeakers.

In the thesis, other methods to reduce background noise and obtain directivity of the sound were evaluated, but not implemented in the test car. The evaluations of the system showed that the audibility was increased, while the background noise level was not noticeably increased. The work was performed at A2 Acoustics AB in Linköping during spring 2003.
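The bandpass-plus-delay scheme described in this abstract can be sketched in a few lines. The following is a minimal illustration (not the thesis's DSP implementation); the passband, filter length, and delay value are plausible assumptions rather than figures from the thesis:

```python
import numpy as np

def bandpass_fir(fs, lowcut, highcut, numtaps=101):
    """Windowed-sinc FIR bandpass coefficients (Hamming window).

    Built as the difference of two ideal low-pass responses, so only
    frequencies between lowcut and highcut (Hz) are passed.
    """
    n = np.arange(numtaps) - (numtaps - 1) / 2

    def sinc_lowpass(fc):
        # Ideal low-pass impulse response with cutoff fc (Hz)
        return 2 * fc / fs * np.sinc(2 * fc / fs * n)

    return (sinc_lowpass(highcut) - sinc_lowpass(lowcut)) * np.hamming(numtaps)

def enhance(mic, fs, lowcut=300.0, highcut=3400.0, delay_ms=15.0):
    """Filter the driver's microphone signal to the speech band, then delay
    the loudspeaker feed so the direct sound from the driver reaches the
    rear passengers first (precedence effect), preserving the perceived
    direction of the speech. All parameter values are illustrative.
    """
    filtered = np.convolve(mic, bandpass_fir(fs, lowcut, highcut), mode="same")
    d = int(round(delay_ms * 1e-3 * fs))
    # Prepend zeros: the loudspeaker output starts d samples late
    return np.concatenate([np.zeros(d), filtered[:-d] if d else filtered])
```

The delay is the key design choice: by the precedence effect, listeners localize a sound at its first-arriving wavefront, so a loudspeaker signal delayed by a few milliseconds reinforces audibility without shifting the apparent source away from the driver.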

APA, Harvard, Vancouver, ISO, and other styles
42

Etter, Nicole M. "The Relationship of Somatosensory Perception and Fine-Force Control in the Adult Human Orofacial System." UKnowledge, 2014. http://uknowledge.uky.edu/rehabsci_etds/19.

Full text
Abstract:
The orofacial area stands apart from other body systems in that it possesses a unique performance anatomy whereby oral musculature inserts directly into the underlying cutaneous skin, allowing for the generation of complex three-dimensional deformations of the orofacial system. This anatomical substrate provides for the tight temporal synchrony between self-generated cutaneous somatosensation and oromotor control during functional behaviors in this region and provides the necessary feedback needed to learn and maintain skilled orofacial behaviors. The Directions into Velocity of Articulators (DIVA) model highlights the importance of the bidirectional relationship between sensation and production in the orofacial region in children learning speech. This relationship has not been as well-established in the adult orofacial system. The purpose of this observational study was to begin assessing the perception-action relationship in healthy adults and to describe how this relationship may be altered as a function of healthy aging. This study was designed to determine the correspondence between orofacial cutaneous perception using vibrotactile detection thresholds (VDT) and low-level static and dynamic force control tasks in three representative age cohorts. Correlational relationships among measures of somatosensory capacity and low-level skilled orofacial force control were determined for 60 adults (19-84 years). Significant correlational relationships were identified using non-parametric Spearman’s correlations with an alpha at 0.1 between the 5 Hz test probe and several 0.5 N low-level force control assessments in the static and slow ramp-and-hold condition. These findings indicate that as vibrotactile detection thresholds increase (labial sensation decreases), ability to maintain a low-level force endpoint decreases. 
Group data were analyzed using non-parametric Kruskal-Wallis tests, which identified significant differences between the 5 Hz test frequency probe and various 0.5 N skilled force assessments for group variables such as age, pure tone hearing assessments, sex, speech usage and smoking history. Future studies will begin the process of modeling this complex multivariate relationship in healthy individuals before moving to a disordered population.
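As a minimal illustration of the non-parametric correlation named in this abstract (not the study's analysis code), Spearman's rho is simply the Pearson correlation of rank-transformed data, with tied values assigned their average rank:

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation coefficient of two equal-length samples."""
    def ranks(v):
        v = np.asarray(v, dtype=float)
        order = np.argsort(v)
        r = np.empty(len(v))
        r[order] = np.arange(1, len(v) + 1)
        # Assign average ranks to tied values
        for val in np.unique(v):
            mask = v == val
            r[mask] = r[mask].mean()
        return r

    rx, ry = ranks(x), ranks(y)
    rx -= rx.mean()
    ry -= ry.mean()
    # Pearson correlation of the centered ranks
    return float(np.sum(rx * ry) / np.sqrt(np.sum(rx**2) * np.sum(ry**2)))
```

Because it operates on ranks rather than raw values, this statistic captures any monotonic relationship (such as "as detection thresholds increase, force control decreases") without assuming normality, which is why it suits threshold and force data like those described above.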
APA, Harvard, Vancouver, ISO, and other styles
43

Xu, Jue. "Adaptations in Speech Processing." Doctoral thesis, Humboldt-Universität zu Berlin, 2021. http://dx.doi.org/10.18452/23030.

Full text
Abstract:
How language perception adapts to constantly incoming information is a key question in mind and brain research. This doctoral thesis aims to contribute to the understanding of adaptation to speaker identity and speech error during speech processing, and to enhance our knowledge about the role of cognitive control in speech processing. For this purpose, event-related brain potentials (ERPs) N400 and P600 in the electroencephalography (EEG) were analyzed. Specifically, the present work addressed the question about adaptation to the speaker’s identity in processing two types of speech errors (Xu, Abdel Rahman, & Sommer, 2019), and explored proactive adaptation initiated by the detection of speech errors (Xu, Abdel Rahman, & Sommer, 2021) and by speaker (dis-)continuity across consecutive sentences in multi-speaker situations (Xu, Abdel Rahman, & Sommer, 2021, in press). Results showed that different speech processing strategies were adapted according to native or non-native speaker identity and two different types of speech errors, reflected in different N400 and P600 effects. In addition, detection of conflict (speech error) and speaker (dis-)continuity across consecutive sentences engage cognitive control to rapidly adapt processing strategies for the following sentence, manifested in hitherto unreported sequential adaptation effects in the P600 amplitude. Based on the DMC model (Braver, 2012; Braver, Gray, & Burgess, 2007) and the monitoring theory of language perception (van de Meerendonk, Indefrey, Chwilla, & Kolk, 2011), I propose that the P600 amplitude manifests not only reactive adaptations triggered by conflict detection, i.e., the classic P600 effect, reflecting reanalysis of speech processing, but also proactive adaptations in monitoring the speech processing, engaging cognitive control mechanisms of attention and memory.
APA, Harvard, Vancouver, ISO, and other styles
44

Hu, Rong. "Enhancement of adaptive de-correlation filtering separation model for robust speech recognition." Diss., Columbia, Mo. : University of Missouri-Columbia, 2007. http://hdl.handle.net/10355/4682.

Full text
Abstract:
Thesis (Ph. D.)--University of Missouri-Columbia, 2007.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on September 25, 2007). Vita. Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles
45

Koliousis, Dimitrios S. "Real-time speech recognition system for robotic control applications using an ear-microphone." Thesis, Monterey, Calif. : Naval Postgraduate School, 2007. http://bosun.nps.edu/uhtbin/hyperion-image.exe/07Jun%5FKoliousis.pdf.

Full text
Abstract:
Thesis (M.S. in Electrical Engineering)--Naval Postgraduate School, June 2007.
Thesis Advisor(s): Monique P. Fargues, Ravi Vaidyanathan, Peter R. Ateshian. "June 2007." Includes bibliographical references (p. 133-136). Also available in print.
APA, Harvard, Vancouver, ISO, and other styles
46

Mills, Timothy Ian Pandachuck. "Speech motor control variables in the production of voicing contrasts and emphatic accent." Thesis, University of Edinburgh, 2009. http://hdl.handle.net/1842/5815.

Full text
Abstract:
This dissertation looks at motor control in speech production. Two specific questions emerging from the speech motor control literature are studied: the question of articulatory versus acoustic motor control targets, and the question of whether prosodic linguistic variables are controlled in the same way as segmental linguistic variables. In the first study, I test the utility of whispered speech as a tool for addressing the question of articulatory or acoustic motor control targets. Research has been done probing both sides of this question. The case for articulatory specifications is developed in depth in the Articulatory Phonology framework of Haskins researchers (e.g. Browman & Goldstein 2000), based on the task-dynamic model of control presented by Saltzman & Kelso (1987). The case for acoustic specifications is developed in the work of Perkell and others (e.g. Perkell, Matthies, Svirsky & Jordan 1993, Guenther, Espy-Wilson, Boyce, Matthies, Zandipour & Perkell 1999, Perkell, Guenther, Lane, Matthies, Perrier, Vick, Wilhelms-Tricarico & Zandipour 2000). It has also been suggested that some productions are governed by articulatory targets while others are governed by acoustic targets (Ladefoged 2005). This study involves two experiments. In the first, I make endoscopic video recordings of the larynx during the production of phonological voicing contrasts in normal and whispered speech. I discovered that the glottal aperture differences between voiced obstruents (e.g., /d/) and voiceless obstruents (e.g., /t/) in normal speech were preserved in whispered speech. Of particular interest was the observation that phonologically voiced obstruents tended to exhibit a narrower glottal aperture in whisper than vowels, which are also phonologically voiced. This suggests that the motor control target of voicing is different for vowels than for voiced obstruents.
A perceptual experiment using the speech material from the endoscopic recordings tested whether listeners could discriminate phonological voicing in whisper, in the absence of non-laryngeal cues such as duration. I found that perceptual discrimination in whisper, while lower than that for normal speech, was significantly above chance. Together, the perceptual and the production data suggest that whispered speech removes neither the acoustic nor the articulatory distinction between phonologically voiced and voiceless segments. Whisper is therefore not a useful tool for probing the question of articulatory versus acoustic motor control targets. In the second study, I look at the multiple parameters contributing to relative prominence, to see whether they are controlled in a qualitatively similar way to the parameters observed in bite block studies to contribute to labial closure or vowel height. I vary prominence by eliciting nuclear accents with a contrastive and a non-contrastive reading. Prominence in this manipulation is found to be signalled by f0 peak, accented syllable duration, and peak amplitude, but not by vowel de-centralization or spectral tilt. I manipulate the contribution of f0 in two ways. The first is by eliciting the contrastive and non-contrastive readings in questions rather than statements. This reduces the f0 difference between the two readings. The second is by eliciting the contrastive and non-contrastive readings in whispered speech, thus removing the acoustic f0 information entirely. In the first manipulation, I find that the contributions of both duration and amplitude to signalling contrast are reduced in parallel with the f0 contribution. This is a qualitatively different behaviour from all other motor control studies; generally, when one variable is manipulated, others either act to compensate or do not react at all.
It would seem, then, that this prosodic variable is controlled in a different manner from other speech motor targets that have been examined. In the whisper manipulation, I find no response in duration or amplitude to the manipulation of f0. This result suggests that, as in the endoscopy study, whisper is perhaps not an effective means of perturbing laryngeal articulations.
APA, Harvard, Vancouver, ISO, and other styles
47

Juster, Joshua. "Speech and gesture understanding in a homeostatic control framework for a robotic chandelier." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/33133.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004.
Includes bibliographical references (leaf 40).
We describe a home lighting robot that uses directional spotlights to create complex lighting scenes. The robot senses its visual environment using a panoramic camera and attempts to maintain its target goal state by adjusting the positions and intensities of its lights. Users can communicate desired changes in the lighting environment through speech and gesture (e.g., "Make it brighter over there"). Information obtained from these two modalities is combined to form a goal, a desired change in the lighting of the scene. This goal is then incorporated into the system's target goal state. When the target goal state and the world are out of alignment, the system formulates a sensorimotor plan that acts on the world to return the system to homeostasis.
by Joshua Juster.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
48

Meekings, S. A. L. "The role of the superior temporal gyrus in auditory feedback control of speech." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/1566746/.

Full text
Abstract:
Modern, biologically plausible models of speech production suggest that the superior temporal gyrus (STG) acts as a feedback monitor during speech production. This thesis investigates the role of the STG during speech production in three groups that have been hypothesized to use auditory feedback in differing ways: typical speakers, people who stammer, and a stroke patient. Because accurate speech production in most conversational settings can be accomplished without recourse to checking auditory feedback, it is necessary to introduce an external 'error', or feedback perturbation, to ensure that feedback control is being used. Here, masking noise was used as an ecologically valid perturbation that reliably prompts vocal adaptation. An activation likelihood estimation meta-analysis showed that feedback perturbation is generally associated with bilateral STG activation. This was supported by a lesion study of a patient with left-sided stroke that suggested a link between temporal cortex infarct and an abnormal response to feedback perturbation. However, a functional magnetic resonance imaging (fMRI) study of typical speakers' behavioural and neural responses to different types of masking noise found that activation in the STG was driven not by the availability of auditory feedback, but by the informational content of the masker. Finally, an fMRI study of people who stutter (whose disfluency is hypothesized to arise from an overreliance on auditory feedback) found that STG activation was greatest in fluency-enhancing conditions, rather than during stuttering. In sum, while there is some evidence that the STG acts as a feedback monitor, this is limited to a subset of situations that involve auditory feedback. It is likely that feedback monitoring is not as central to speech communication as the previous literature might indicate.
It is suggested that the concept of auditory ‘error’ should be reformulated to acknowledge different types of speech goals—acoustic, semantic, or phonemic.
APA, Harvard, Vancouver, ISO, and other styles
49

Fuhrman, Robert. "Vocal effort and within-speaker coordination in speech production : effects on postural control." Thesis, University of British Columbia, 2014. http://hdl.handle.net/2429/51656.

Full text
Abstract:
This thesis probes the joint role of respiration in speech motor control and postural control by examining the effect of increasing the loudness of speech production, or vocal effort, on within-speaker coordination. Specifically, this work tested the dual hypothesis that the functional demands of speech production at increasingly higher levels of vocal effort would result in increasingly rigid coordination across multiple bodily subsystems, and that this entrainment would ultimately affect postural control, resulting in a loss of balance. An interactive spontaneous speech task was used to elicit speech at multiple levels of vocal effort by increasing the intended communication distance. Data from acoustic and kinematic measurement domains, including speech, rigid body motion of the head, 2D motion of the body extracted from video, and postural forces and torques measured at the feet, were collected simultaneously. These data were analyzed using a unique collection of techniques for the analysis of non-stationary time-series, which included methods for assessing cross-domain correspondence, system dimensionality, and fluctuation characteristics. The results of these analyses show convergent evidence for both hypotheses. Coordination among kinematic and acoustic measurement domains both strengthens and simplifies at high levels of vocal effort, and evidence of postural instability was found at the highest levels of vocal effort. Subsystem fluctuation characteristics show a direct relationship to the observed effects on coordination, both in terms of their intrinsic properties and in terms of changes due to increased vocal effort. Although this study did not include a direct measure of respiration, these results highlight the necessity of expanding our understanding of respiration's role in speech motor control, especially insofar as the inevitable crossover between speech and other task domains, such as postural control, is concerned.
The methodology used in this study can be straightforwardly expanded towards these ends, and provides a potentially useful inroad to research in this direction. Even in the absence of a respiratory measure, these results will be of potential interest to clinicians working on the treatment of patients with speech disorders associated with neurological dysfunction, as occur, for example, in Parkinson's disease.
Arts, Faculty of
Linguistics, Department of
Graduate
APA, Harvard, Vancouver, ISO, and other styles
50

Jenkins, Reni L. "The use of the auditory lexical decision task as a method for assessing the relative quality of synthetic speech." Thesis, This resource online, 1992. http://scholar.lib.vt.edu/theses/available/etd-05042010-020234/.

Full text
APA, Harvard, Vancouver, ISO, and other styles