Dissertations / Theses on the topic 'Cepstrum'

Consult the top 50 dissertations / theses for your research on the topic 'Cepstrum.'

1

Bryan, Robert A. "Thin-bed resolution from cepstrum analysis." Thesis, Virginia Polytechnic Institute and State University, 1985. http://hdl.handle.net/10919/74514.

Abstract:
A method of cepstrum analysis is developed for the purpose of resolving thin-beds. The method relies on the detection of periodic pulses of the cepstra of reflectivity functions, which are isolated by computing a sub-cepstrum and a sum-cepstrum, and highlighted with a discriminator, where the sub-cepstrum of the functions f₁(t) and f₂(t) is the difference between the cepstra of the two functions, the sum-cepstrum of f₁(t) is the sum of the sub-cepstra of f₁(t) and fₖ(t), k = 2, 3, 4, ..., and the discriminator is the product of the sum-cepstrum and the autocovariance of the sum-cepstrum. The technique requires at least two reflected wavelets generated by the same source. The method was applied to synthetic thin lens models. The method is shown to be sensitive to the ratio of the reflection coefficients at the top and bottom of the thin-bed. Specifically, the resolution depends on the ratio of the reflection coefficients. Optimum resolution is achieved when the reflection coefficients at the top and bottom of the thin-bed are equal in absolute magnitude. In addition, in the noise-free case, the absolute magnitude of the cepstral pulses can be used to determine the absolute magnitude of the ratio of the reflection coefficients. The technique is also sensitive to the sample interval used. The finest sample interval provides the best resolution because it produces the sharpest cepstral pulses and resolves the thinnest beds. The resolution of the method is drastically reduced by random noise, although thin-bed thicknesses are still detectable when the S/N of the synthetic seismic section is 15/1 and the upper frequency of the bandwidth of the noise is 1.1 octaves above the upper frequency of the bandwidth of the source wavelet.
Master of Science
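The definitions in this abstract (sub-cepstrum, sum-cepstrum, discriminator) can be sketched roughly as follows. This is only an illustration, not the thesis's implementation: it assumes the real cepstrum (inverse DFT of the log magnitude spectrum), a biased mean-removed autocovariance, and an element-by-element product at matching lag and quefrency. All function names are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(x, eps=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum (assumed variant)."""
    log_mag = [math.log(abs(s) + eps) for s in dft(x)]
    return [c.real for c in dft(log_mag, inverse=True)]


def sub_cepstrum(f1, f2):
    """Difference between the cepstra of two equal-length traces."""
    return [a - b for a, b in zip(real_cepstrum(f1), real_cepstrum(f2))]


def sum_cepstrum(f1, others):
    """Sum of the sub-cepstra of f1 with each trace f_k in `others`."""
    total = [0.0] * len(f1)
    for fk in others:
        total = [t + s for t, s in zip(total, sub_cepstrum(f1, fk))]
    return total


def autocovariance(x):
    """Biased, mean-removed autocovariance at lags 0 .. n-1 (assumed estimator)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[i] * d[i + lag] for i in range(n - lag)) / n for lag in range(n)]


def discriminator(f1, others):
    """Element-wise product of the sum-cepstrum and its autocovariance (assumed pairing)."""
    s = sum_cepstrum(f1, others)
    return [a * b for a, b in zip(s, autocovariance(s))]
```

In a toy case where f1 is a wavelet plus a scaled copy delayed by d samples and f2 is the wavelet alone, the wavelet's own cepstrum cancels in the sub-cepstrum, leaving pulses at multiples of d.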
2

Gudnason, Jon. "Voice source cepstrum processing for speaker identification." Thesis, Imperial College London, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.439448.

3

Lelakis, Ioannis. "Speaker identification using the two-dimensional cepstrum transform." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1995. http://handle.dtic.mil/100.2/ADA294204.

Abstract:
Thesis (M.S. in Electrical Engineering), Naval Postgraduate School, March 1995. Thesis advisor(s): Monique Fargue, Ralph Hippenstiel. Includes bibliographical references. Also available online.
4

Komadel, Michal. "Mutlimediální diff - audio dokumenty." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2011. http://www.nusl.cz/ntk/nusl-237074.

Abstract:
This work describes the development of a diff tool for audio files containing general sound such as music, speech and other sounds. It presents relevant facts from several sound-related fields, including psychoacoustics, speech recognition and automatic music genre categorisation. It also describes some diff algorithms and the external tools needed to develop the application. Finally, it introduces the design and implementation of the application, the settings used for sound feature extraction, and an evaluation of the results obtained.
5

Marvi, Hossein. "Efficient feature extraction based on two-dimensional cepstrum analysis for speech recognition." Thesis, University of Surrey, 2004. http://epubs.surrey.ac.uk/843940/.

Abstract:
Solving speech recognition problems requires an adequate feature extraction technique that transforms the raw speech signal into a set of feature vectors preserving most of the information in the speech signal. The features should ideally be compact, distinct and well representative of the speech signal. If the feature vectors do not represent the important content of the speech, the system will perform poorly regardless of the pattern recognition techniques applied. Many different feature extraction representations of the speech signal have been suggested and tried for speech recognition. The most popular features currently in use are Mel-frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), which are based on one-dimensional cepstrum analysis. The two-dimensional cepstrum (TDC) is an alternative approach to time-frequency representation of a speech signal that can preserve both the instantaneous and transitional information of the signal. The principal aim of this thesis is to study two-dimensional cepstrum analysis as a feature extraction technique for speech recognition. A novel feature extraction technique, the two-dimensional root cepstrum (TDRC), is also introduced. It has the advantage of an adjustable γ parameter which can be used to optimise the feature extraction process, reducing the dimensions of the feature matrix and giving simple computation. In addition, the Mel TDRC is proposed as a modification of the original TDRC to improve accuracy. It is shown that both the TDC and the TDRC outperform the conventional cepstrum. To preserve both magnitude and phase details of the speech signal simultaneously in a feature matrix, the Hartley transform (HT) is suggested as a substitute for the Fourier transform (FT) in two-dimensional cepstrum analysis.
Experimental results demonstrate the enhanced capability of the HT in two-dimensional root cepstral analysis to improve recognition accuracy. An experimental comparative study of nine feature extraction methods based on cepstral analysis is also carried out.
6

Hubbard, Stephen J. "A cepstrum-based acoustic echo cancellation technique for improving public address system performance." Diss., Georgia Institute of Technology, 1994. http://hdl.handle.net/1853/15617.

7

Mahajan, Mayur. "Development of a speech recognition system using the Mel Frequency Cepstrum Coefficient method." Thesis, California State University, Long Beach, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10141515.

Abstract:
Voice recognition systems have found widespread use in applications such as tele-shopping, tele-banking, information services, home automation, voice message security, and voice call dialing, which allows a driver to make calls safely while driving.
This project presents the development of a high-performance speech recognition system using human voice models. Recognizing the behavior of the human ear, the Mel Frequency Cepstral Coefficient (MFCC) method is used to develop the system capability for feature extraction. Vector quantization optimized by the Linde-Buzo-Gray (LBG) algorithm is used for feature matching. Experimental results show that the system has over 90% success rate in the noise-free case, but the system performance deteriorates in the presence of noise. The system, however, has better recognition ability when the noise signal consists of harmonic components, as compared to a non-stationary, non-harmonic signal.
8

Wei, Coach K. (Coach Kecheng) 1973. "Cepstrum-based deconvolution techniques for ultrasonic pulse-echo imaging of flaws in composite laminates." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/79992.

9

Kupka, Petr. "Detekce alkoholu v řečovém signálu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442414.

Abstract:
The diploma thesis Detection of Alcohol in Speech Signal first describes the effect of alcohol on the human body. The second part deals with ways to obtain parameters that describe the speech signal. The third part provides a brief overview of previous case studies and patents focused on the detection of alcohol in the speech signal. The fourth part presents the author's own collected database of voice recordings and the software application developed for the analysis of intoxicated speech. The final part describes the measured changes in speech signal parameters that indicate alcohol intoxication.
10

Larsson, Alm Kevin. "Automatic Speech Quality Assessment in Unified Communication : A Case Study." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-159794.

Abstract:
Speech as a medium for communication has always been important in its ability to convey our ideas, personality and emotions. It is therefore not strange that Quality of Experience (QoE) becomes central to any business relying on voice communication. Using Unified Communication (UC) systems, users can communicate with each other in several ways using many different devices, making QoE an important aspect for such systems. In this thesis, automatic methods for assessing the speech quality of voice calls in Briteback’s UC application are studied, including a comparison of the researched methods. Three methods are studied, all using a Gaussian Mixture Model (GMM) as a regressor, paired with extraction of Human Factor Cepstral Coefficients (HFCC), Gammatone Frequency Cepstral Coefficients (GFCC) and Modified Mel Frequency Cepstrum Coefficients (MMFCC) features respectively. The method based on HFCC feature extraction shows better performance in general than the two other methods, but all methods show low performance relative to the literature. This most likely stems from implementation errors, showing the difference between theory and practice in the literature, together with the lack of reference implementations. Further work with practical aspects in mind, such as reference implementations or verification tools, can make the field more popular and increase its use in the real world.
11

Šústek, Martin. "Ukázkový systém na rozpoznávání mluvčích." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217682.

Abstract:
This diploma thesis deals with the problem of speaker recognition. The text describes the basic theory of the problem as well as the model and implementation of a speaker recognition system. The system is designed to recognize up to three speakers. The theory covers the calculation of parameters for speaker recognition and the processing of voice. The program is written in Matlab as a standalone application and has a Czech and an English interface.
12

Jongens, Adrian Wynand Doride. "Application of cepstrum techniques and a guard tube to the measurement of the normal incident sound power absorption coefficient of road surfaces in-situ." Master's thesis, University of Cape Town, 1993. http://hdl.handle.net/11427/9639.

Abstract:
Includes bibliographical references.
The work described in this thesis was directed towards studying signal processing techniques that could best be incorporated in an apparatus that was to measure the plane wave sound power absorption coefficient of road surfaces in-situ. Road traffic noise has been identified as the greatest noise pollutant in the industrialised world with the tyre/road interaction being the major source of noise for traffic speeds in excess of 50 km/h. Open pore bitumen asphalt material has been found to present a sound absorbing surface that is able to contribute to the mitigation of road traffic noise. This has generated research into the production of sound absorbing road surface materials which, in turn, has generated a need for an apparatus that is able to measure the sound power absorption coefficient of such materials both in the laboratory and in the field. It was considered that the development of an easily transportable apparatus was needed which would enable a single, non-skilled operator to measure rapidly the normal incident sound power absorption coefficient, over a broad frequency band, of road surfaces, in-situ. This included, for example, the measurement of the absorption coefficient of open-pore asphalt materials developed in the laboratory, the measurement of newly laid surfaces, the comparison of different surfaces, as well as determining the effect of contamination, over time, of the pores of open-pore asphalt by ingress of dust.
13

Hammoudeh, Ismail. "Qualitative nichtlineare Zeitreihenanalyse mit Anwendung auf das Problem der Polbewegung." Phd thesis, [S.l. : s.n.], 2002. http://pub.ub.uni-potsdam.de/2003/0003/hammoud.pdf.

14

Spalt, Taylor B. "Background Noise Reduction in Wind Tunnels using Adaptive Noise Cancellation and Cepstral Echo Removal Techniques for Microphone Array Applications." Thesis, Virginia Tech, 2010. http://hdl.handle.net/10919/34247.

Abstract:
Two experiments were conducted to investigate Adaptive Noise Cancelling and Cepstrum echo removal post-processing techniques on acoustic data from a linear microphone array in an anechoic chamber. A point source speaker driven with white noise was used as the primary signal. The first experiment included a background speaker to provide interference noise at three different Signal-to-Noise Ratios to simulate noise propagating down a wind tunnel circuit. The second experiment contained only the primary source, and the wedges were removed from the floor to simulate reflections found in a wind tunnel environment. The techniques were applicable to both single microphone and array analysis. The Adaptive Noise Cancellation proved successful in its task of removing the background noise from the microphone signals at SNRs as low as -20 dB. The recovered signals were then used for array processing. A simulated reflection case was analyzed with the Cepstral technique. Accurate removal of the reflection effects was achieved in recovering both magnitude and phase of the direct signal. Experimental data resulted in Cepstral features that caused errors in phase accuracy. A simple phase correction procedure was proposed for this data, but in general it appears that the Cepstral technique is not, and would not be, well suited to all experimental data.
Master of Science
15

Vodička, Radek. "Rozpoznávání izolovaných slov." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2014. http://www.nusl.cz/ntk/nusl-220658.

Abstract:
The main purpose of the thesis is to study the processes and methods of isolated word recognition. The theoretical part explains the basic principles. The practical part covers the creation of a program that applies these principles. Hidden Markov Models (HMM) are used for isolated word recognition, and cepstral analysis is used to obtain the decision features.
16

Sanchez, Fabrício Lopes. "Análise cepstral baseada em diferentes famílias transformada wavelet." Universidade de São Paulo, 2008. http://www.teses.usp.br/teses/disponiveis/82/82131/tde-01092010-113906/.

Abstract:
This work presents a comparative study of different wavelet transform families applied to the cepstral analysis of digital human speech signals, with the specific aim of determining their pitch period. It then proposes a differential algorithm for this operation, taking into account computationally important aspects such as performance, algorithm complexity and the platform used, among others. Results obtained with the new technique (based on the wavelet transform) are also presented and compared with the traditional approach (based on the Fourier transform). The implementation was tested in ANSI-standard C++ on Windows XP Professional SP3, Windows Vista Business SP1, Mac OS X Leopard and Linux Mandriva 10.
17

Sannachi, Lakshmanan. "Investigation of anisotropic properties of musculoskeletal tissues by high frequency ultrasound." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2012. http://dx.doi.org/10.18452/16476.

Abstract:
Bone and muscle are the most important tissues in the musculoskeletal system, which gives the body the ability to move. Both tissues have a highly oriented underlying extracellular matrix structure for performing mechanical and biological functions. In this study, the spatial distribution of anisotropic elastic properties and tissue mineralization within a human femoral cortical bone shaft was investigated using scanning acoustic microscopy and synchrotron radiation µCT. The homogenized mesoscopic elastic properties were determined by a combination of porosity and tissue elastic matrix using an asymptotic homogenization model. The impact of tissue mineralization and structural parameters on the microscopic and mesoscopic elastic coefficients was analyzed with respect to the anatomical location on the femoral shaft. A model was developed to estimate the intramuscular fat of porcine musculus longissimus non-invasively using a quantitative ultrasonic device by spectral analysis of ultrasonic echo signals. Muscle-specific acoustic parameters, i.e. attenuation, spectral slope, midband fit, apparent integrated backscatter, and cepstral parameters, were extracted from the measured RF echoes. The impact of muscle composition and structural properties on ultrasonic spectral parameters was analyzed. The ultrasound propagation parameters were affected by the muscle fiber orientation. The most dominant direction dependency was found for the attenuation. The detailed, locally assessed bone data in this study may serve as real-life input for numerical 3D FE simulation models. Moreover, the assessment of changes in local tissue anisotropy may provide new insights into bone remodelling studies.
The tissue-level data and the investigated ultrasound backscattering from muscle tissue can be used in numerical FE simulation models of acoustic backscattering from muscle to further improve diagnostic methods and equipment.
18

Freedman, Joseph Saul. "Using helicopter noise to prevent brownout crashes: an acoustic altimeter." Thesis, Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/34833.

Abstract:
This thesis explores one possible method of preventing helicopter crashes caused by brownout using the noise generated by the helicopter rotor as an altimeter. The hypothesis under consideration is that the helicopter's height, velocity, and obstacle locations with respect to the helicopter, can be determined by comparing incident and reflected rotor noise signals, provided adequate bandwidth and signal to noise ratio. Heights can be determined by measuring the cepstrum of the reflected helicopter noise. The velocity can be determined by measuring small amounts of Doppler distortion using the Mellin-Scale Transform. Height and velocity detection algorithms are developed, optimized for this application, and tested using a microphone array. The algorithms and array are tested using a hemianechoic chamber and outside in Georgia Tech's Burger Bowl. Height and obstacle detection are determined to be feasible with the existing array. Velocity detection and surface mapping are not successfully accomplished.
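As a rough illustration of the height measurement described here, the sketch below finds the strongest cepstral peak in a signal containing a direct sound and one ground echo, then converts the echo delay to a height. It is not the thesis's algorithm: it assumes the real cepstrum, a microphone co-located with the source, a single reflection at normal incidence (extra path of twice the height), and a sound speed of 343 m/s. The function names and the `min_lag` parameter are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def real_cepstrum(x, eps=1e-12):
    """Real cepstrum via a naive O(n^2) DFT: inverse DFT of log|DFT(x)|."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    log_mag = [math.log(abs(s) + eps) for s in spectrum]
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * q / n) for k in range(n)).real / n
        for q in range(n)
    ]


def echo_lag(signal, min_lag):
    """Quefrency (in samples) of the largest cepstral pulse at or above min_lag."""
    c = real_cepstrum(signal)
    return max(range(min_lag, len(signal) // 2), key=lambda q: abs(c[q]))


def height_from_echo(signal, sample_rate, min_lag=2):
    """Height in metres, assuming the echo travels down and back (2 * height)."""
    delay = echo_lag(signal, min_lag) / sample_rate
    return SPEED_OF_SOUND * delay / 2.0
```

The `min_lag` cut-off skips the low-quefrency region, where the source's own spectral envelope dominates. With real rotor noise the peak would be far less clean than in a synthetic test, so this is only a starting point.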
19

Danielson, Hugo, and Schmuck Benjamin von. "Robot Condition Monitoring : A first step in Condition Monitoring for robotic applications." Thesis, Luleå tekniska universitet, Institutionen för teknikvetenskap och matematik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-66011.

Abstract:
The industrial world is in constant demand for faster, cheaper and higher quality manufacturing. Robot utilisation and automation have become a necessary asset to master in order to stay competitive in the global market. With the growing dependency on robots, unexpected downtime and breakdowns can cause devastating loss of revenue. Consequently, an accurate condition-based way of performing robotic maintenance has become increasingly important. At the time of writing, robots are predominantly maintained through time-dependent maintenance. Part replacement is based on statistical models, and maintenance is performed without taking the actual robot condition into consideration. This creates an overall level of uncertainty, and the inability to properly diagnose the robot also leads to superfluous repairs. Because of the costly impact this has on production, a condition-based maintenance approach to robots would yield increased reliability at a lower cost of maintenance. This research focuses on monitoring vibrations in a robot in order to infer wear, and provides a first step in vibration-based Robot Condition Monitoring. The research has been multidisciplinary, with robotics, tribology, mechanical components, signal analysis and diagnosis theory overlapping in several areas throughout the project. The research has provided a vibration baseline and trends of the theoretical bearing defect frequencies for a hypocycloid gearbox installed on an ABB IRB6600 robot. The gearbox was not worn to a level at which severe gearbox degradation was irrefutably detectable and analysable. Accelerometers normally used on wind turbines were used for the project, and are believed to capture bearing-related signals well enough to justify their continued use at the preliminary stages of Robot Condition Monitoring development. A worn RV410F hypocycloid gearbox was dismantled and analysed.
Bearings found inside show high degrees of moisture corrosion and extensive surface wear. These findings played a decisive role in the recommendations for future work. Areas with great potential are condition monitoring through Acoustic Emission and lubrication analysis. Further recommendations include investigating signal analysis techniques such as cepstrum pre-whitening and discrete wavelet transforms.
20

Matuštík, Daniel. "Určování základního hlasového tónu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2013. http://www.nusl.cz/ntk/nusl-219980.

Abstract:
This diploma thesis deals with estimation of the pitch period of the human voice. It lists some methods for pitch period estimation, along with methods for preprocessing the signal and for final processing of the signal after the pitch-determination functions have been applied, within a graphical user interface.
21

Adamec, Michal. "Moderní rozpoznávače řečové aktivity." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217322.

Abstract:
This master's thesis deals with standard speech/pause detection methods: voice activity detectors based on short-time energy, real spectrum, short-time intensity, and combinations of these three detectors. Subsequent parts cover other voice activity detectors based on hidden Markov models, and a detector described in the ITU-T G.729 standard. All of these detectors were implemented in the MATLAB research environment. A user interface was also created for testing the implemented detectors. Finally, the detectors were evaluated using ROC characteristics based on the test results.
22

Smith, Paul Devon. "An Analog Architecture for Auditory Feature Extraction and Recognition." Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/4839.

Abstract:
Speech recognition systems have been implemented using a wide range of signal processing techniques, including neuromorphic/biologically inspired and digital signal processing (DSP) techniques. Neuromorphic/biologically inspired techniques, such as silicon cochlea models, are based on fairly simple yet highly parallel computation and/or computational units, while DSP is based on block transforms and statistical or error minimization methods. Essential to each of these techniques is the first stage of extracting meaningful information from the speech signal, which is known as feature extraction. This can be done using biologically inspired techniques such as silicon cochlea models, or techniques that begin with a model of speech production and then try to separate the vocal tract response from an excitation signal. Even within each of these approaches there are multiple techniques, including cepstrum filtering, which belongs to the class of homomorphic signal processing, and FFT-based predictive approaches. The underlying reality is that multiple techniques have attacked the speech recognition problem, but it is still far from solved. The techniques shown to have the best recognition rates use cepstrum coefficients for feature extraction and Hidden Markov Models for pattern recognition. The presented research develops an analog system based on programmable analog array technology that can perform the initial stages of auditory feature extraction and recognition before passing information to a digital signal processor. The goal is a low-power system that can be fully contained on one or more integrated circuit chips. Results show that it is possible to realize advanced filtering techniques such as cepstrum filtering and vector quantization in analog circuitry.
Previous applications of analog signal processing have focused on vision, cochlea models, anti-aliasing filters and other single-component uses. Furthermore, classic designs have relied heavily on op-amps as a core building block. This research also shows a novel design for a Hidden Markov Model (HMM) decoder using circuits that take advantage of the inherent properties of subthreshold transistors and floating-gate technology to create low-power computational blocks.
23

Pfeifer, Leon. "Automatické rozpoznávání emočních stavů člověka na základě analýzy řečového projevu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217520.

Abstract:
The diploma thesis deals with the analysis of human emotional states. The thesis consists of three parts. The first part characterizes the process of speech generation from a phonetic and psychological point of view. The second part covers the methods and related topics (signal preprocessing, voice activity detector). The fundamental frequency was calculated using the central clipping method; the other methods used are formant frequency analysis and determination of the number of thorns and planes. The third part presents the results of measurements performed with the particular methods. Five different emotional states were scored: neutral, anger, happiness, sadness and surprise. The part ends with a discussion of the results for each method.
24

Cole, David Ross. "Intelligibility enhancement of severely reverberant speech." Thesis, Queensland University of Technology, 1997.

25

Hassanain, Elham. "Novel cepstral techniques applied to speech synthesis." Thesis, University of Surrey, 2006. http://epubs.surrey.ac.uk/842745/.

Abstract:
The aim of this research was to develop an improved analysis and synthesis model for utilization in speech synthesis. Conventionally, linear prediction has been used in speech synthesis but is restricted by the requirement of an all-pole, minimum phase model. Here, cepstral homomorphic deconvolution techniques were used to approach the problem, since there are fewer constraints on the model and some evidence in the literature that shows that cepstral homomorphic deconvolution can give improved performance. Specifically the spectral root cepstrum was developed in an attempt to separate the magnitude and phase spectra. Analysis and synthesis filters were developed on these two data streams independently in an attempt to improve the process. It is shown that independent analysis of the magnitude and phase spectra is preferable to a combined analysis, and so the concept of a phase cepstrum is introduced, and a number of different phase cepstra are defined. Although extremely difficult for many types of signals, phase analysis via a root cepstrum and the Hartley phase cepstrum give encouraging results for a wide range of both minimum and maximum phase signals. Overall, this research has shown that improved synthesis can be achieved with these techniques.
26

Green, Richard C. "Walsh based cepstra for speech coding." Thesis, King's College London (University of London), 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.392848.

27

Riedißer, Wolfgang. "Phasenorientierte Signalanalyse unter besonderer Berücksichtigung des Cepstrums sowie minimalphasiger Systeme und Signale /." Düsseldorf : VDI-Verl, 1999. http://www.gbv.de/dms/bs/toc/270562567.pdf.

28

Cruz, Luís Manuel Monteiro de Sousa. "Aplicação de técnicas de classificação em séries temporais." Master's thesis, Universidade de Aveiro, 2015. http://hdl.handle.net/10773/15900.

Abstract:
Master's degree in Physical Engineering (Mestrado em Engenharia Física). Today, there is a growing interest in learning a second language, whether for professional or personal reasons. This trend is taking hold in an increasingly interconnected world. On the other hand, the democratization of computer technologies makes it possible to think about developing new, more automated and personalized language teaching techniques. This work aimed to study and implement a set of signal processing and time series classification techniques useful for developing methodologies for oral language teaching with automatic feedback. Preliminary results on the performance of these techniques are presented, and the feasibility of this type of approach is assessed.
29

Sarpal, Sanjeev. "Efficient system identification based on root cepstral deconvolution." Thesis, University of Surrey, 2003. http://epubs.surrey.ac.uk/843337/.

Abstract:
This thesis summarizes approximately three years of research on signal modelling for the purposes of system identification. Improvements in signal modelling techniques have been encouraged over the years by society's demand for more efficient ways of accessing information. As a consequence, several modelling/compression techniques in both the time domain and the frequency domain have been developed as possible solutions to these problems. Cepstral deconvolution is a frequency domain modelling technique that has been successfully applied to many diverse fields, such as speech and seismic analysis. Thus far, all assessment of cepstral modelling performance has been empirical, relying on the judgement of the designer. Therefore a novel method for measuring root cepstral pole-zero modelling performance is proposed, by introducing a cost function applied directly to the root cepstral domain. It is, therefore, possible to demonstrate the optimized modelling of a pole-zero model and show that its performance is superior to that of an FIR Wiener filter and LPC. The optimized modelling of speech data is considered by a special form of the developed cost function. It is demonstrated that the modelling performance of the root cepstral method is superior to that of the real (magnitude) cepstrum and LPC. A novel method of model order identification for use with time domain modelling methods based around z-plane root cepstral plots is also developed and discussed. It is demonstrated that the positions of a model or plant's poles and zeros may be determined by visual inspection of the resulting z-plane plot. However, performance in noise was poorer than that of LPC, leading to difficulties when trying to determine the model's order. Finally, an investigation into the poor phase modelling performance of the algorithm when modelling signals comprised of multiple excitations is presented. It is demonstrated that all DFT/FFT based analysis techniques are fundamentally flawed due to discontinuities. As a consequence, a simple pre-filtering algorithm is presented as a possible solution.
30

Busset, Julie. "Inversion acoustique articulatoire à partir de coefficients cepstraux." Phd thesis, Université de Lorraine, 2013. http://tel.archives-ouvertes.fr/tel-00838913.

Abstract:
Acoustic-to-articulatory inversion of speech consists of recovering the shape of the vocal tract from a speech signal. This problem is addressed with an analysis-by-synthesis method based on a physical model of speech production controlled by a small number of parameters describing the shape of the vocal tract: the jaw opening, the shape and position of the tongue, and the position of the lips and the larynx. To approximate the geometry of our speaker, the articulatory model is built from articulatory contours extracted from cineradiographic images showing a sagittal view of the vocal tract. This articulatory synthesizer lets us create a table of pairs, each associating an articulatory vector with the corresponding acoustic vector. We do not use formants (the resonance frequencies of the vocal tract) as the acoustic vector because their extraction is not always reliable, which causes errors during inversion. Cepstral coefficients are used as the acoustic vector instead. In addition, the effect of the source and the mismatch between the speaker's vocal tract and the articulatory model are taken into account explicitly by comparing the natural spectra with those produced by the synthesizer, since both signals are available.
31

Bees, Duncan Charles. "Enhancement of acoustically reverberant speech using cepstral methods." Thesis, McGill University, 1990. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=59819.

Abstract:
Acoustical reverberation has been shown to degrade the intelligibility and naturalness of speech. In this thesis, we discuss the application of cepstral methods to the enhancement of acoustically reverberant speech. We first study previously described cepstral techniques for removal of simple echoes from signals. Our results show that these techniques are not directly applicable to the enhancement of speech of indefinite extent. We next recast these techniques specifically for speech. We propose new segmentation and windowing strategies, in combination with cepstral averaging, to accurately identify the acoustical impulse response. We then consider inverse filtering based on an estimated acoustical impulse response, and find that finite impulse response filters designed according to the least mean squared error criterion provide satisfactory performance. Finally, we synthesize and test an algorithm for enhancement of reverberant speech. Although significant difficulties remain, we feel that our methods offer a substantial contribution to the solution of the reverberant speech enhancement problem.
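Illustrative note: the "simple echo" case mentioned in this abstract can be sketched in a few lines. A single echo adds a periodic ripple to the log spectrum, which shows up as a peak in the real cepstrum at the echo delay. The sketch below is not the thesis's algorithm; the function names, the white-noise test signal and the `min_lag` guard are assumptions made for illustration.

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small floor avoids log(0)
    return np.fft.ifft(log_mag).real

def estimate_echo_delay(x, min_lag=1):
    """Quefrency (in samples) of the largest cepstral peak at or above min_lag."""
    c = real_cepstrum(x)
    half = len(c) // 2  # the real cepstrum is symmetric, so search one half only
    return min_lag + int(np.argmax(c[min_lag:half]))

# Hypothetical test signal: white noise plus one attenuated, delayed copy.
rng = np.random.default_rng(0)
n, delay, gain = 4096, 200, 0.5
source = rng.standard_normal(n)
observed = source.copy()
observed[delay:] += gain * source[:-delay]

print(estimate_echo_delay(observed, min_lag=20))
```

Real speech is far less benign than white noise, since its own pitch structure produces cepstral peaks, which is consistent with the abstract's remark that the simple-echo techniques do not carry over directly.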
32

Chuang, Ping Derg. "Range estimation by cepstral techniques in image processing." Thesis, Imperial College London, 1986. http://hdl.handle.net/10044/1/37973.

33

Bekli, Zeid, and William Ouda. "A performance measurement of a Speaker Verification system based on a variance in data collection for Gaussian Mixture Model and Universal Background Model." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20122.

Abstract:
Voice recognition has become a more focused and researched field in the last century, and new techniques to identify speech have been introduced. A part of voice recognition is speaker verification, which is divided into a front-end and a back-end. The first component is the front-end, or feature extraction, where techniques such as Mel-Frequency Cepstrum Coefficients (MFCC) are used to extract the speaker-specific features of a speech signal; MFCC is mostly used because it is based on the known variation of the human ear's critical frequency bandwidth. The second component is the back-end, which handles the speaker modeling. The back-end is based on the Gaussian Mixture Model (GMM) and Gaussian Mixture Model-Universal Background Model (GMM-UBM) methods for enrollment and verification of the specific speaker. In addition, normalization techniques such as Cepstral Mean Subtraction (CMS) and feature warping are used for robustness against noise and distortion. In this paper, we build a speaker verification system, experiment with varying the amount of training data for the true speaker model, and evaluate the system performance. To further investigate the security of a speaker verification system, two methods (GMM and GMM-UBM) are compared to determine which is more secure depending on the amount of training data available. This research therefore contributes to the questions of how much data is really necessary for a secure system in which the False Positive rate is as close to zero as possible, how the amount of training data affects the False Negative (FN) rate, and how this differs between GMM and GMM-UBM. The results show that an increase in speaker-specific training data increases the performance of the system. However, too much training data proved unnecessary because the performance of the system eventually reaches its highest point, in this case at around 48 min of data; the results also show that the GMM-UBM models trained on 48 to 60 minutes of data outperformed the GMM models.
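Illustrative note: cepstral mean subtraction, named in this abstract as a normalization step, is simple enough to show directly. A fixed linear channel multiplies the spectrum, so it adds a constant vector to every cepstral frame, and subtracting the per-coefficient mean over time removes it. This is a minimal sketch on a made-up feature matrix, not the authors' pipeline; the array shapes and names are assumptions.

```python
import numpy as np

def cepstral_mean_subtraction(features):
    """Subtract the per-coefficient mean over time from a (frames, coeffs) array."""
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)

# Hypothetical 'MFCC' matrix: 5 frames x 3 coefficients.
rng = np.random.default_rng(1)
clean = rng.standard_normal((5, 3))

# A stationary channel appears as a constant offset in the cepstral domain.
channel_offset = np.array([2.0, -1.0, 0.5])

normalized_clean = cepstral_mean_subtraction(clean)
normalized_channel = cepstral_mean_subtraction(clean + channel_offset)

# Both versions agree after normalization, so the channel offset is gone.
print(np.allclose(normalized_clean, normalized_channel))
```

In practice the mean is usually taken per utterance or over a sliding window; which of these the thesis used is not stated in the abstract.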
34

Miyajima, C., Y. Nishiwaki, K. Ozawa, T. Wakita, K. Itou, and K. Takeda. "Cepstral Analysis of Driving Behavioral Signals for Driver Identification." IEEE, 2006. http://hdl.handle.net/2237/9596.

35

Barreira, Ramiro Roque Antunes. "Modelo mel-cepstral generalizado para envoltória espectral de fala." [s.n.], 2010. http://repositorio.unicamp.br/jspui/handle/REPOSIP/259047.

Abstract:
Advisors: Fábio Violaro, Edmilson da Silva Morais. Dissertation (Master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
Mel-Generalized Cepstral (MGC) analysis is an approach to speech spectral envelope estimation that unifies LPC, Mel-LPC, cepstral and mel-cepstral analysis. The functional form of the MGC model varies continuously with two real parameters, α and γ, enabling the model to take on different characteristics. The flexibility of the MGC model, together with its stability and good performance under parameter manipulation, has led to MGC parameters being successfully employed in speech coding and HMM (Hidden Markov Model) speech synthesis. The present study focuses on the mathematical aspects of MGC analysis, treating and proving, at length, the analytical and computational formulations for solving the model. The properties and basic formulation of MGC analysis are treated from the perspective of the mel-generalized logarithmic spectrum. A method for computing MGC and mel-cepstral coefficients that does not require frequency-transformation recursion formulas is proposed. The experiments and analysis concerning the method are at an initial stage and need to be completed in order to identify the trade-off between computational gain and quality of the representation.
Master's degree (Mestre em Engenharia Elétrica), Telecommunications and Telematics.
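Illustrative note: the continuous variation of the model with one of its two parameters (usually written γ in the MGC literature, alongside the frequency-warping parameter α) comes from the generalized logarithm, which equals (w^γ − 1)/γ for γ ≠ 0 and tends to the natural logarithm as γ → 0. In that literature γ = 0 is associated with cepstral analysis and γ = −1 with the all-pole (LPC) case. The sketch below only evaluates this function; it is not the coefficient-estimation method proposed in the dissertation, and the function name is an assumption.

```python
import numpy as np

def generalized_log(w, gamma):
    """Generalized logarithm: (w**gamma - 1) / gamma, or log(w) when gamma == 0."""
    w = np.asarray(w, dtype=float)
    if gamma == 0.0:
        return np.log(w)
    return (w ** gamma - 1.0) / gamma

w = np.array([0.5, 1.0, 2.0, 4.0])

# gamma = 0 gives the ordinary logarithm used in cepstral analysis.
print(generalized_log(w, 0.0))

# A very small gamma approaches the same values, showing the continuity.
print(generalized_log(w, 1e-8))

# gamma = -1 gives 1 - 1/w, the form tied to all-pole modelling.
print(generalized_log(w, -1.0))
```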
36

Alpan, Ali. "Objective assessment of disordered connected speech." Doctoral thesis, Universite Libre de Bruxelles, 2012. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209758.

Abstract:
Within the context of the assessment of laryngeal function, acoustic analysis has an important place because the speech signal may be recorded non-invasively and it forms the base on which the perceptual assessment of voice is founded. Given the limitations of perceptual ratings, researchers have investigated vocal cues of disordered voices that are clinically relevant, summarize properties of speech signals and report on a speaker's phonation in general and voice in particular. Ideally, the acoustic descriptors should also be correlates of auditory-perceptual ratings of voice. Generally speaking, the goal of acoustic analysis is to document quantitatively the degree of severity of a voice disorder and monitor the evolution of the voice of dysphonic speakers.
The first part of this thesis is devoted to the analysis of disordered connected speech. The aim is to investigate vocal cues that are clinically relevant and correlated with auditory-perceptual ratings. Two approaches are investigated. The variogram-based method in the temporal domain is addressed first. The second approach is in the cepstral domain. In particular, the first rahmonic amplitude is used as an acoustic cue to describe voice quality. A multi-dimensional approach combining temporal and spectral aspects is also investigated. The goal is to check whether acoustic cues in both domains report complementary information when predicting perceptual scores.
Both methods are tested first on a corpus of synthetic sound stimuli that has been obtained by means of a synthesizer of disordered voices. The purpose is to learn about the link between the signal properties (fixed by the synthesis parameters) and acoustic cues. In this study, we had the opportunity to use two large natural speech corpora. One of them has been perceptually rated.
The final part of the text is devoted to the automatic classification of voice with regard to perceived voice quality. Many studies have proposed a binary (normal/pathological) classification of voice samples. An automatic categorization according to perceived degrees of hoarseness appears, however, to be more attractive to both clinicians and technologists and more likely to be clinically relevant. Indeed, one way to reduce inter-rater variability of an auditory-perceptual evaluation is to ask several experts to participate and then to average the perceptual scores. However, auditory-perceptual evaluation of a corpus by several judges is a very laborious, time-consuming and costly task. Making this perceptual evaluation task automatic is therefore desirable. The aim of this study is to exploit the support vector machine classifier, which has become a popular classification tool in recent years, to carry out categorization of voices according to perceived degrees of hoarseness.
Doctorate in Engineering Sciences (Doctorat en Sciences de l'ingénieur).
37

Cédric, Peeters. "Advanced signal processing for the identification and diagnosis of the condition of rotating machinery." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSEI107.

Abstract:
This Ph.D. dissertation targets innovative methods for vibration-based condition monitoring of rotating machinery. Condition monitoring can bring substantial benefits from both an economic and a safety point of view. One of the most popular ways to gather information about the state of machine parts is through the analysis of machine vibrations. Most of these vibrations are directly linked to the periodic behavior of subsystems within the machine, such as rotating shafts, gears and rotating electrical fields. This knowledge can be exploited to enable fault-dependent processing schemes. This dissertation investigates how to implement and utilize these processing schemes and details the steps in such a procedure. Typically, the first prerequisite for advanced analysis is the availability of the instantaneous rotation speed. This speed needs to be known since most frequency-based analysis techniques assume stationary behavior. Knowledge of the speed thus allows for compensating speed fluctuations, for example through angular resampling of the vibration signal. While there are hardware-based solutions for speed estimation using angle encoders or tachometers, this thesis investigates the potential of estimating the speed from the vibration signals themselves. After speed estimation and angular resampling, a common next step is to separate the signal into deterministic and stochastic components. The cepstrum editing procedure is examined for its efficacy and applicability. Afterwards, different filtering methods are inspected with the aim of improving the signal-to-noise ratio of the signal content of interest. Existing methods using conventional criteria are investigated together with a novel blind filtering methodology. The final step in the multi-step processing scheme is to search for the potential fault. Statistical indicators can be calculated on the processed time domain signal and tracked over time to check for increases. In many cases, the fault signature exhibits cyclostationary behavior. Therefore this dissertation also examines different cyclostationary analysis techniques. Lastly, the performance of the different processing methods is validated on two experimental vibration data sets of wind turbine gearboxes.
38

Lareau, Jonathan. "Application of shifted delta cepstral features for GMM language identification /." Electronic version of thesis, 2006. https://ritdml.rit.edu/dspace/handle/1850/2686.

39

Darch, Jonathan J. A. "Robust acoustic speech feature prediction from Mel frequency cepstral coefficients." Thesis, University of East Anglia, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445206.

40

Nam, Jean Ok. "Signal recovery using cepstral smoothing in the shallow water environment." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/34093.

Abstract:
Thesis (M.Eng.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994. Includes bibliographical references (leaves 47-48). By Jean Ok Nam.
41

Gedeon, Ibrahim Joseph Carleton University Dissertation Engineering Electrical. "Cepstral analysis : a speech processing strategy for the cochlear implant." Ottawa, 1990.

42

Aslan, Gokhan. "Cepstral Deconvolution Method For Measurement Of Absorption And Scattering Coefficients Of Materials." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/3/12608021/index.pdf.

Abstract:
Several methods have been developed to measure the absorption and scattering coefficients of materials. In this study, a new method based on the cepstral deconvolution technique is proposed. A reverberation room method recently standardized by ISO (ISO 17497-1) is taken as the reference for measurements. Several measurements were conducted in a physically scaled reverberation room, and the results were evaluated with both methods, namely the method given in the standard and the cepstral deconvolution method. The two methods differ in how they estimate the specular parts of the room impulse responses, which are essential for determining scattering coefficients. In the standard method, the specular part is found by synchronous averaging of impulse responses. The cepstral deconvolution method instead uses cepstral analysis to obtain the specular part. Results obtained by the two approaches are compared for five different test materials. Both methods gave almost the same values for the absorption coefficients. On the other hand, cepstral deconvolution gave lower scattering coefficient values than the ISO method.
43

Chen, Jen Kwang, and 陳貞光. "A PRELIMINARY STUDY ON USING CEPSTRUM/MEL-CEPSTRUM FOR SPEECH RECOGNITION THROUGH TELEPHONE SYSTEM." Thesis, 1993. http://ndltd.ncl.edu.tw/handle/09335135352612888419.

44

Liou, Sin-Jyun, and 劉信君. "A VLSI Design for LPC-Cepstrum." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/43582368844753170445.

Abstract:
Master's thesis, National Chi Nan University (國立暨南國際大學), Department of Electrical Engineering, academic year 93.
In digital speech signal processing, the feature coefficients extracted from speech signals play an important role in both speech recognition and speech compression. With a well-designed extraction algorithm for the feature coefficients, the speech recognition rate can be greatly increased and the performance of speech compression can also be improved. In this thesis, the implementation of the LPC-Cepstrum algorithm on a single VLSI chip is examined; such a chip can be widely applied in portable speech recognition systems. The whole LPC-Cepstrum circuit is divided into three modules: the autocorrelation module, the Linear Predictive Coding (LPC) module, and the Cepstrum module. Each module has its own control circuit, which handles the transmission of data in the circuit. To save chip area, resource sharing is adopted in the design of the circuit. The whole circuit has been designed and verified to function correctly. A fixed-point number system is used, and the sampling rate of the speech signal is 8 kHz. Each data sample is coded with 16 bits. The internal operating frequency of the chip is 100 MHz. The simulation results show that the difference between the error rates from the ASIC and from the C code is less than 2%.
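Illustrative note: the three modules named in this abstract (autocorrelation, LPC, cepstrum) correspond to a standard floating-point pipeline that can be sketched as follows. This is a software sketch under assumptions, not the thesis's fixed-point circuit: it assumes the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, uses the Levinson-Durbin recursion for the LPC step (the abstract does not say which solver the chip uses), and omits the gain term c_0. All function names are hypothetical.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a frame (no normalization)."""
    x = np.asarray(x, dtype=float)
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(max_lag + 1)])

def levinson_durbin(r, order):
    """Solve for a[0..order] with a[0] = 1; returns (a, prediction error)."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1/A(z), via the usual recursion."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]

# Check on an exact case: autocorrelation shape of x[n] = 0.9 x[n-1] + e[n].
r = 0.9 ** np.arange(3)
a, err = levinson_durbin(r, order=2)
c = lpc_to_cepstrum(a, n_ceps=3)
print(a)  # close to [1, -0.9, 0]
print(c)  # close to [0.9, 0.9**2 / 2, 0.9**3 / 3]
```

For a real frame, `autocorrelation(frame, order)` would supply `r`; the analytic `r` above is used only so the result can be checked by hand.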
45

ZHUANG, DING-FENG, and 莊鼎豐. "Word recognition based on mel generalized cepstrum." Thesis, 1991. http://ndltd.ncl.edu.tw/handle/61545450640861380805.

46

Huang, Chao-Cyun, and 黃超群. "A hybrid audio watermarking technique in cepstrum domain." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/04970192319920609086.

Abstract:
Master's thesis, National Dong Hwa University (國立東華大學), Department of Computer Science and Information Engineering, academic year 97.
Along with the progress of computer hardware and software, the Internet has become the most popular medium for transmitting various forms of digital multimedia. Since the Internet is open and can be browsed freely, the protection of digital information transmitted over the network has become an important research topic in recent years. Watermarking techniques have therefore received much attention. Embedding a digital watermark in digital media is an important method for protecting intellectual property rights. In this thesis, we propose a hybrid watermarking technique that combines two techniques to embed a binary image into audio data in the cepstrum domain. Moreover, we apply the MPEG-1 psychoacoustic model to make the audio watermark acoustically transparent, and we use error-correcting coding to improve the accuracy of watermark extraction. We embed the digital watermark in songs of various genres in order to test the robustness of our technique. Simulation results demonstrate the robustness of the proposed hybrid audio watermarking technique, especially its resistance to the most popular MP3 attacks. The extracted watermark images are also shown as evidence of the robustness of the proposed technique.
47

Sujatno, Santoso, and 葉丁榮. "Research of MCE-Based Two-Dimension Cepstrum Speech recognition." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/45112937731633895054.

Abstract:
Master's thesis, National Chi Nan University (國立暨南國際大學), Department of Electrical Engineering, academic year 97.
This thesis investigates training models based on Minimum Classification Error (MCE), compares them with other approaches, and uses different enhancement methods to improve the performance of a speech recognition system. In the study, we used the Modified Two-Dimension Cepstrum (MTDC) and a genetic algorithm to convert the speech data into features for speech recognition. When there is a mismatch between the acoustic conditions of the training and application environments of a speech recognition system, the performance of the system is seriously degraded. This thesis therefore employs an MCE-based Two-Dimension Cepstrum (TDC) to enhance speaker features, and then uses Gaussian Mixture Models (GMM) to build the speech models. Next, we used the system to recognize speech. We adopted the Chinese digits (0-9) from 10 speakers (5 males and 5 females), and each speaker uttered each digit 10 times (total files: 11000). We selected 1040 files of each one as the training files and the remainder as the testing files. Finally, we compared and discussed the results, which were tested under several variable background noises from different conditions.
48

Jiang, Min-siou, and 江敏秀. "Research of LDA-Based Two-Dimension Cepstrum Speech Recognition." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/20191114493261457066.

Abstract:
Master's thesis, National Chi Nan University (國立暨南國際大學), Department of Electrical Engineering, academic year 97.
This thesis proposes a new robust speech recognition technique for noisy environments. The feature extraction is based on the Modified Two-Dimension Cepstrum (MTDC), and template matching employs Gaussian Mixture Models (GMM). However, everyday background noise may degrade the performance. Hence, we adopt genetic algorithms (GA), principal component analysis (PCA) and linear discriminant analysis (LDA) to enhance the speech features. Next, we used the system to recognize speech. We adopted the Chinese digits (0-9) from 10 speakers (5 males and 5 females), and each speaker uttered each digit 10 times (total files: 10400). We selected 980 files of each one as the training files and the remainder as the testing files. Finally, we compared and discussed the results, which were tested under several variable background noises from different conditions.
APA, Harvard, Vancouver, ISO, and other styles
49

Hwu, Jiing-Yuan, and 胡景淵. "GA-based Noisy Speech Recognition using Two Dimensional Cepstrum." Thesis, 1997. http://ndltd.ncl.edu.tw/handle/02198349081521111748.

Abstract:
Master's<br>National Chiao Tung University<br>Department of Control Engineering<br>85<br>Many kinds of parameters are used for speech feature extraction, and the two-dimensional cepstrum (TDC) is one of them. It can simultaneously represent several kinds of information contained in the speech waveform: static and dynamic features, as well as global and fine frequency structures. Analysis suggests that the coefficients in the lower-index portion of the TDC matrix are more significant than the others. Hence, an utterance can be represented by a single feature vector formed from a few selected TDC coefficients instead of a sequence of feature vectors, which has the advantages of simple computation and less storage space. However, our experiments show that the TDC is quite sensitive to background noise. To solve this problem, we propose the GA-based M_TDC method in this thesis to improve the performance of the TDC under noisy conditions. In the GA-based M_TDC method, we use a temporal filter to remove noise components in the feature extraction phase, and we apply genetic algorithms (GAs) to find the robust speech parameters in the M_TDC matrix. In experiments with five noise types, the GA-based M_TDC gave better recognition results than the TDC in noisy environments. In Appendix A of this thesis, a combination of the GA-based M_TDC with a neural network is proposed to further improve the recognition rate in noisy environments. In experiments with five noise types, this combination gave better recognition results than the GA-based M_TDC alone in noisy environments. The neural network used in our system is the Self-cOnstructing Neural Fuzzy Inference Network (SONFIN).
50

Liu, Shi-Cheng, and 劉適程. "A BCH Code-Based Robust Audio Watermarking in Cepstrum Domain." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/02341740301266842003.

Abstract:
Master's<br>National Dong Hwa University<br>Department of Computer Science and Information Engineering<br>92<br>In this thesis, we apply cepstrum-domain analysis to audio watermarking. The audio signal undergoes a series of complex-number and logarithmic transformations, and a binary image is embedded in it to serve as a copyright statement. Typical audio recognition systems, the product of long-term research, extract cepstrum-domain feature parameters from audio frames. We additionally adopt the Divergence Method to compare the divergence of the feature values for the purpose of recognition. When cepstrum-domain analysis is applied to audio watermarking, the difference is that the coefficients obtained after the logarithmic transformation must be converted back with an exponential. After the watermark is embedded, the audio signal retains attack-invariant features under typical asynchronous attacks, which enhances the robustness of the watermark. In addition, to resist attacks that attempt to remove the watermark without deforming the waveform, we add the BCH code from communication theory. Experiments show that adding the BCH code improves the security of the watermark. Keywords: digital audio watermarking, cepstrum domain, permutation, BCH code, error-correcting code, asynchronous attack, robustness
