Dissertations / Theses on the topic 'Speech processing systems. Pattern recognition systems'
Alphonso, Issac John. "Network training for continuous speech recognition." Master's thesis, Mississippi State : Mississippi State University, 2003. http://library.msstate.edu/etd/show.asp?etd=etd-10252003-105104.
Combrinck, Hendrik Petrus. "A cost, complexity and performance comparison of two automatic language identification architectures." Pretoria : [s.n.], 2006. http://upetd.up.ac.za/thesis/available/etd-12212006-141335/.
Sundaram, Anand R. K. "Vowel recognition using Kohonen's self-organizing feature maps." Online version of thesis, 1991. http://hdl.handle.net/1850/10710.
Sukittanon, Somsak. "Modulation scale analysis : theory and application for nonstationary signal classification." Thesis, Connect to this title online; UW restricted, 2004. http://hdl.handle.net/1773/5875.
Chen, Xin. "Ensemble methods in large vocabulary continuous speech recognition." Diss., Columbia, Mo. : University of Missouri-Columbia, 2008. http://hdl.handle.net/10355/5797.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on August 28, 2008). Vita. Includes bibliographical references.
Jantan, Adznan Bin. "A comparative study of various analysis techniques for use in speech recognition systems." Thesis, Swansea University, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.292473.
Xue, Jian. "Improvement of decoding engine & phonetic decision tree in acoustic modeling for online large vocabulary conversational speech recognition." Diss., Columbia, Mo. : University of Missouri-Columbia, 2007. http://hdl.handle.net/10355/4821.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on March 4, 2008). Vita. Includes bibliographical references.
Chiou, Greg I. "Active contour models for distinct feature tracking and lipreading." Thesis, Connect to this title online; UW restricted, 1995. http://hdl.handle.net/1773/6023.
Ravindran, Sourabh. "Physiologically Motivated Methods For Audio Pattern Classification." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14066.
Du, Toit A. (Andre). "Automatic classification of spoken South African English variants using a transcription-less speech recognition approach." Thesis, Stellenbosch : Stellenbosch University, 2004. http://hdl.handle.net/10019.1/49866.
Abstract: We present the development of a pattern recognition system capable of classifying different Spoken Variants (SVs) of South African English (SAE) using a transcription-less speech recognition approach. SVs allow us to unify the linguistic concepts of accent and dialect from a pattern recognition viewpoint. The need for the SAE SV classification system arose from the multilinguality requirement for South African speech recognition applications and the costs involved in developing such applications.
Little, M. A. "Biomechanically informed nonlinear speech signal processing." Thesis, University of Oxford, 2007. http://ora.ox.ac.uk/objects/uuid:6f5b84fb-ab0b-42e1-9ac2-5f6acc9c5b80.
Yaman, Sibel. "A multi-objective programming perspective to statistical learning problems." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/26470.
Committee Chair: Chin-Hui Lee; Committee Member: Anthony Yezzi; Committee Member: Evans Harrell; Committee Member: Fred Juang; Committee Member: James H. McClellan. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Hawkins, Mikhel E. "High speed target tracking using Kalman filter and partial window imaging." Thesis, Georgia Institute of Technology, 2002. http://hdl.handle.net/1853/16709.
Wong, Ing Hoo. "Design of a realtime high speed recognizer for unconstrained handprinted alphanumeric characters." Thesis, University of British Columbia, 1985. http://hdl.handle.net/2429/25135.
Faculty of Applied Science; Department of Electrical and Computer Engineering; Graduate.
Theunissen, M. W. (Marthinus Wilhelmus). "Phoneme-based topic spotting on the switchboard corpus." Thesis, Stellenbosch : Stellenbosch University, 2002. http://hdl.handle.net/10019.1/52998.
Abstract: The field of topic spotting in conversational speech deals with the problem of identifying "interesting" conversations or speech extracts contained within large volumes of speech data. Typical applications include the surveillance and screening of messages before they are referred to human operators. Closely related methods can also be used for data mining of multimedia databases, literature searches, language identification, call routing and message prioritisation. The first topic spotting systems used words as the most basic units. However, because of the poor performance of speech recognisers, a large amount of topic-specific hand-transcribed training data is needed. For this reason researchers started concentrating on methods using phonemes instead, because the errors then occur on smaller, and therefore less important, units. Phoneme-based methods consequently make it feasible to use computer-generated transcriptions as training data. Building on word-based methods, a number of phoneme-based systems have emerged. The two most promising ones are the Euclidean Nearest Wrong Neighbours (ENWN) algorithm and the newly developed Stochastic Method for the Automatic Recognition of Topics (SMART). Previous experiments on the Oregon Graduate Institute of Science and Technology's Multi-Language Telephone Speech Corpus suggested that SMART yields a large improvement over ENWN, which had outperformed competing phoneme-based systems in evaluations. However, the small amount of data available for these experiments meant that more rigorous testing was required. In this research, the algorithms were therefore re-implemented to run on the much larger Switchboard Corpus. A substantial improvement of SMART over ENWN was subsequently observed, confirming the earlier result. In addition, an investigation was conducted into the improvement of SMART. This resulted in a new counting strategy with a corresponding improvement in performance.
Morris, Robert W. "Enhancement and recognition of whispered speech." Diss., Georgia Institute of Technology, 2003. http://etd.gatech.edu/theses/available/etd-04082004-180338/unrestricted/morris%5frobert%5fw%5f200312%5fphd.pdf.
Youssif, Roshdy S. "Hybrid intelligent systems for pattern recognition and signal processing." University of Cincinnati / OhioLINK, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1085714219.
Silvestre, Cerdà Joan Albert. "Different Contributions to Cost-Effective Transcription and Translation of Video Lectures." Doctoral thesis, Universitat Politècnica de València, 2016. http://hdl.handle.net/10251/62194.
In recent years, online multimedia repositories have grown rapidly and have become fundamental sources of knowledge, especially in education, where large repositories of educational video lectures have been created to complement or even replace traditional teaching methods. However, most of these lectures are neither transcribed nor translated, because there are no low-cost solutions able to do so with a minimum acceptable quality. Such solutions are clearly needed to make video lectures more accessible to speakers of other languages and to people with hearing impairments. They could also facilitate search and analysis functions such as classification, recommendation or plagiarism detection, as well as advanced educational features such as automatic content summarisation to help students take notes. For this reason, the main aim of this thesis is to develop a low-cost solution capable of transcribing and translating video lectures with a reasonable level of quality. More specifically, we address the integration of state-of-the-art Automatic Speech Recognition and Machine Translation techniques into large repositories of educational video lectures, in order to generate high-quality multilingual subtitles without human intervention and at a reduced computational cost. We also explore the potential benefits of exploiting the information available a priori about these repositories, that is, lecture-specific knowledge such as the speaker, the topic or the slides, to build specialised transcription and translation systems by means of massive adaptation techniques. The solutions proposed in this thesis have been tested in real-life scenarios through numerous objective and subjective evaluations, with very good results. The main legacy of this thesis, The transLectures-UPV Platform, has been publicly released as open-source software and, at the time of writing, is serving automatic transcriptions and translations for several thousand educational video lectures in numerous Spanish and European universities and institutions.
Silvestre Cerdà, JA. (2016). Different Contributions to Cost-Effective Transcription and Translation of Video Lectures [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/62194
Kani, Bijan. "Enhanced logical adaptive systems for image processing and pattern recognition." Thesis, Brunel University, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.358406.
Wilson, Shawn C. "Voice recognition systems : assessment of implementation aboard U.S. naval ships." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2003. http://library.nps.navy.mil/uhtbin/hyperion-image/03Mar%5FWilson.pdf.
Thesis advisor(s): Michael T. McMaster, Kenneth J. Hagan. Includes bibliographical references (p. 47-49). Also available online.
Müller, J. J. "USB telephony interface device for speech recognition applications." Link to the online version, 2005. http://hdl.handle.net/10019/1127.
Jeon, Woojay. "Speech Analysis and Cognition Using Category-Dependent Features in a Model of the Central Auditory System." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14061.
Dobie, Mark Ralph. "Motion analysis in multimedia systems." Thesis, University of Southampton, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.359240.
Styne, Bruce Alan. "Management systems for computer graphics." Thesis, University of Cambridge, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.303247.
Neville, Katrina Lee. "Channel Compensation for Speaker Recognition Systems." RMIT University. Electrical and Computer Engineering, 2007. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080514.093453.
Lai, Yiu Pong. "Maximum likelihood normalization for robust speech recognition." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202003%20LAI.
Includes bibliographical references (leaves 98-103). Also available in electronic version. Access restricted to campus users.
Li, Chak Fai. "Improved polynomial segment model for speech recognition." View abstract or full-text, 2004. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202004%20LI.
Includes bibliographical references (leaves 80-84). Also available in electronic version. Access restricted to campus users.
Wanderley, Juliana Fernandes Camapum. "Colour-based recognition for remote sensing in environmental systems." Thesis, Coventry University, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.266844.
Giles, Paul A. "Iterated function systems and shape representation." Thesis, Durham University, 1990. http://etheses.dur.ac.uk/6188/.
Reynolds, Graham J. "Configurable graphics systems : modelling and specification." Thesis, University of East Anglia, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.293731.
Zahedi, Fariborz. "A systems approach to image segmentation." Thesis, University of Brighton, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.260978.
Adami, André Gustavo. "Modeling prosodic differences for speaker and language recognition." Full text open access at:, 2004. http://content.ohsu.edu/u?/etd,19.
Lee, Spencer Jaehoon Gilbert Juan E. "Post-speech-recognition processing in domain-specific text-corpus-based distributed listening system analysis, interpretation and selection of speech recognition results." Auburn, Ala., 2006. http://repo.lib.auburn.edu/2006%20Summer/Theses/LEE_SPENCER_7.pdf.
Rideout, Robert Martin. "Coded imaging systems for X-ray astronomy." Thesis, University of Birmingham, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.364854.
Tai, Anthony. "Perceptual grouping and knowledge-based vision systems." Thesis, University of Surrey, 1997. http://epubs.surrey.ac.uk/844407/.
Au, Wing Hei. "Improved acoustic model training for speech recognition and verification." View abstract or full-text, 2004. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202004%20AU.
Includes bibliographical references (leaves 81-86). Also available in electronic version. Access restricted to campus users.
Cummings, Kathleen E. "Analysis, synthesis, and recognition of stressed speech." Diss., Georgia Institute of Technology, 1992. http://hdl.handle.net/1853/15673.
Ng, Kwong Tim. "Exploring Chinese linguistic characteristics for speech recognition." View abstract or full-text, 2005. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202005%20NGK.
Liu, Yi. "Pronunciation modeling for spontaneous Mandarin speech recognition." View Abstract or Full-Text, 2002. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202002%20LIU.
Includes bibliographical references (leaves 169-177). Also available in electronic version. Access restricted to campus users.
Al-Darkazali, Mohammed. "Image processing methods to segment speech spectrograms for word level recognition." Thesis, University of Sussex, 2017. http://sro.sussex.ac.uk/id/eprint/71675/.
Van, der Walt Craig. "An investigation into the practical implementation of speech recognition for data capturing." Thesis, Cape Technikon, 1993. http://hdl.handle.net/20.500.11838/1156.
A study into the practical implementation of speech recognition for data capturing within Telkom SA is described. As demand for data capturing increases, a more efficient method of capturing is sought. The technology relating to speech recognition is examined, and practical guidelines for selecting a speech recognition system are described. These guidelines are used to show how commercially available systems can be evaluated. Specific tests on a selected speech recognition system are described, relating to the accuracy and adaptability of the system. The results obtained illustrate why speech recognition systems are at present not advisable for data capturing. The results also demonstrate how the selection of keywords can affect system performance. Areas of further research are highlighted relating to recognition performance and vocabulary selection.
Nanka-Bruce, Oona. "Some computer aided design methods for nonlinear control systems." Thesis, University of Sussex, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.252934.
Chu, Kam Keung. "Feature extraction based on perceptual non-uniform spectral compression for noisy speech recognition." access full-text access abstract and table of contents, 2005. http://libweb.cityu.edu.hk/cgi-bin/ezdb/thesis.pl?mphil-ee-b19887516a.pdf.
"Submitted to Department of Electronic Engineering in partial fulfillment of the requirements for the degree of Master of Philosophy." Includes bibliographical references (leaves 143-147).
Hu, Rusheng. "Statistical optimization of acoustic models for large vocabulary speech recognition." Diss., Columbia, Mo. : University of Missouri-Columbia, 2006. http://hdl.handle.net/10355/4329.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on August 2, 2007). Includes bibliographical references.
Johnson, Joanna. "The effectiveness of voice recognition technology as used by persons with disabilities." Online version, 1998. http://www.uwstout.edu/lib/thesis/1998/1998johnsonj.pdf.
Ma, Chengyuan. "A detection-based pattern recognition framework and its applications." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/33889.
Hobbs, Mike. "Genetic algorithms for spatial data analysis in geographical information systems." Thesis, University of Kent, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.262636.
Wasmeier, Hans. "Development of tests and preprocessing algorithms for evaluation and improvement of speech recognition units." Thesis, University of British Columbia, 1986. http://hdl.handle.net/2429/26750.
Faculty of Applied Science; Department of Electrical and Computer Engineering; Graduate.
Ross, Philip. "Network accessible parallel computing systems, based upon transputers, for image processing strategies." Thesis, University of Aberdeen, 1993. http://digitool.abdn.ac.uk/R?func=search-advanced-go&find_code1=WSN&request1=AAIU059635.
Rao, Ram Raghavendra. "Audio-visual interaction in multimedia." Diss., Georgia Institute of Technology, 1998. http://hdl.handle.net/1853/13349.