Dissertations / Theses: 'Speech enhancement algorithm'

1

Gagnon, Luc. "A speech enhancement algorithm based upon resonator filterbanks." Thesis, University of Ottawa (Canada), 1991. http://hdl.handle.net/10393/7767.


2

Roy, Sujan K. "Kalman Filtering with Machine Learning Methods for Speech Enhancement." Thesis, Griffith University, 2021. http://hdl.handle.net/10072/404456.


Abstract:

Speech corrupted by background noise (or noisy speech) can reduce the efficiency of communication between man-man and man-machine. A speech enhancement algorithm (SEA) can be used to suppress the embedded background noise and increase the quality and intelligibility of noisy speech. Many applications, such as speech communication systems, hearing aid devices, and speech recognition systems, typically rely upon speech enhancement algorithms for robustness. This dissertation focuses on single-channel speech enhancement using Kalman filtering with machine learning methods. In Kalman filter (KF)-based speech enhancement, each clean speech frame is represented by an auto-regressive (AR) process, whose parameters comprise the linear prediction coefficients (LPCs) and prediction error variance. The LPC parameters and the additive noise variance are used to form the recursive equations of the KF. In augmented KF (AKF), both the clean speech and additive noise LPC parameters are incorporated into an augmented matrix to construct the recursive equations of AKF. Given a frame of noisy speech samples, the KF and AKF give a linear MMSE estimate of the clean speech samples using the recursive equations. Usually, the inaccurate estimates of the parameters introduce bias in the KF and AKF gain, leading to a degradation in speech enhancement performance. The research contributions in this dissertation can be grouped into three focus areas. In the first work, we propose an iterative KF (IT-KF) to offset the bias in KF gain for speech enhancement through utilizing the parameters in real-life noise conditions. In the second work, we jointly incorporate the robustness and sensitivity metrics to offset the bias in the KF and AKF gain - which address speech enhancement in real-life noise conditions. The third focus area consists of the deep neural network (DNN) and whitening filter assisted KF and AKF for speech enhancement. 
Specifically, DNN and whitening filter-based approaches utilize the parameter estimates for the KF and AKF for speech enhancement. However, the whitening filter still produces biased speech LPC estimates for the KF and AKF, resulting in degraded speech. To address this, we propose a DeepLPC framework constructed with the state-of-the-art residual network and temporal convolutional network (ResNet-TCN) to jointly estimate the speech and noise LPC parameters from the noisy speech for the AKF. Recently, the multi-head self-attention network (MHANet) has demonstrated the ability to model the long-term dependencies of noisy speech more efficiently than ResNet-TCN. Therefore, we employ the MHANet within DeepLPC, termed DeepLPC-MHANet, to further improve the speech and noise LPC parameter estimates for the AKF. Finally, we perform a comprehensive study on four different training targets for LPC estimation using ResNet-TCN and MHANet. This study aims to determine which training target and which DNN method produce accurate speech and noise LPC parameters for AKF-based speech enhancement in practice. Objective and subjective scores demonstrate that the proposed methods in this dissertation produce enhanced speech with higher quality and intelligibility than the competing methods in various noise conditions for a wide range of signal-to-noise ratio (SNR) levels.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Eng & Built Env
Science, Environment, Engineering and Technology
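
As a rough illustration of the KF recursion described in the abstract above, the sketch below filters a first-order AR signal in additive white noise. It is not the thesis's method: the AR order is fixed at 1, the parameters `a`, `q`, and `r` are assumed known rather than estimated from noisy speech, the signal is synthetic, and all names are hypothetical.

```python
import random

def kalman_ar1(noisy, a, q, r):
    """Scalar Kalman filter for s[n] = a*s[n-1] + w[n], y[n] = s[n] + v[n].

    a: AR(1) coefficient (stands in for the LPCs), q: prediction error
    variance, r: additive noise variance. All three are assumed known here.
    """
    est, p = 0.0, q / (1.0 - a * a)  # start from the stationary prior
    out = []
    for y in noisy:
        # Predict the next sample from the AR model.
        est_pred = a * est
        p_pred = a * a * p + q
        # Update the prediction with the noisy observation.
        gain = p_pred / (p_pred + r)
        est = est_pred + gain * (y - est_pred)
        p = (1.0 - gain) * p_pred
        out.append(est)
    return out

def mse(x, y):
    """Mean squared error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(x, y)) / len(x)

# Synthetic stand-in for a clean signal plus white noise (not real speech).
rng = random.Random(0)
a, q, r = 0.9, 1.0, 1.0
clean, s = [], 0.0
for _ in range(2000):
    s = a * s + rng.gauss(0.0, q ** 0.5)
    clean.append(s)
noisy = [c + rng.gauss(0.0, r ** 0.5) for c in clean]
enhanced = kalman_ar1(noisy, a, q, r)
```

Comparing `mse(enhanced, clean)` with `mse(noisy, clean)` shows the benefit when the parameters are exact. The bias the abstract discusses arises when `a`, `q`, and `r` have to be estimated from the noisy signal instead.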

3

Shannon, Benjamin J. "Speech Recognition and Enhancement using Autocorrelation Domain Processing." Thesis, Griffith University, 2007. http://hdl.handle.net/10072/365193.


Abstract:

From a young age, humans learn language skills and develop them to the point that they become reflex-like. As a communication modality, speech is efficient, natural and intrinsically understood. By developing spoken language interfaces for machines, the same kinds of benefits can be realised for human-machine interaction. Development of machine-based speech recognition has been in progress for the past 50 years. In this time significant advances have been made, but the performance of current solutions in the presence of ambient acoustic noise is one factor holding the technology back. Contributing to the overall deficiency of the system is the performance of current feature extraction methods. These techniques cannot be described as robust when deployed in the dynamic acoustic environments typically encountered in everyday life. Ambient background noise also affects speech communication between humans. Restoration of a degraded speech signal by a speech enhancement algorithm can help to reduce this effect. Techniques developed for improving the noise robustness of feature extraction algorithms can also find application in speech enhancement algorithms. Contributions made in this thesis are aimed at improving the performance of automatic speech recognition in the presence of ambient acoustic noise and the quality of speech perceived by human listeners in the same conditions. The proposed techniques are based on processing the degraded speech signal in the autocorrelation domain. Based on the differences in the production mechanisms of speech and noise signals, transforming them into the autocorrelation domain provides a favourable representation for noise robust processing. We found that by utilising the higher-lag coefficients of the autocorrelation sequence and discarding the lower-lag coefficients, more noise robust spectral estimates could be made. 
This approach was found to be adept at suppressing particular classes of non-stationary noise that conventional methods fail to handle. We also explored a topic in speech enhancement of phase spectrum estimation and showed positive results. The proposed feature extraction and speech enhancement techniques, while performing very well for some non-stationary noises, were less effective against the stationary cases. This work highlights the autocorrelation domain as a domain for noise robust speech processing in the presence of dynamic ambient noises. With improvements in short-time autocorrelation estimation, it is expected that the performance of the techniques for stationary noises can also be improved.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
Griffith School of Engineering
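
The idea of keeping only the higher-lag autocorrelation coefficients can be sketched as follows. This is a minimal illustration and not the thesis's estimator: the biased autocorrelation estimate, the function names, and the choice of `min_lag` are assumptions made here. The intuition, also stated as an assumption rather than the thesis's derivation, is that broadband noise contributes mostly to the lowest lags.

```python
def autocorr(frame, max_lag):
    """Biased short-time autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]

def higher_lag_autocorr(frame, min_lag, max_lag):
    """Keep lags min_lag..max_lag and discard the lower lags."""
    return autocorr(frame, max_lag)[min_lag:]
```

For example, `higher_lag_autocorr(frame, 2, 3)` drops lags 0 and 1 and returns only the coefficients at lags 2 and 3, which would then feed a spectral estimate.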

4

Freesen, Jessica Stacy. "Evaluation of the telephone speech enhancement algorithm in older adults using individual audiograms." Connect to resource, 2009. http://hdl.handle.net/1811/37214.


5

Andrianakis, Ioannis. "Bayesian algorithms for speech enhancement." Thesis, University of Southampton, 2007. https://eprints.soton.ac.uk/66244/.


Abstract:

The portability of modern voice processing devices allows them to be used in environments where background noise conditions can be adverse. Background noise can deteriorate the quality of speech transmitted through such devices, but speech enhancement algorithms can ameliorate this degradation to some extent. The development of speech enhancement algorithms that improve the quality of noisy speech is the aim of this thesis, which consists of three main parts. In the first part, we propose a framework of algorithms that estimate the clean speech Short Time Fourier Transform (STFT) coefficients. The algorithms are derived from the Bayesian theory of estimation and can be grouped according to i) the STFT representation they estimate ii) the estimator they apply and iii) the speech prior density they assume. Apart from the introduction of algorithms that surpass the performance of similar algorithms that exist in the literature, the compilation of the above framework offers insight on the effect and relative importance of the different components of the algorithms (e.g. prior, estimator) to the quality of the enhanced speech. In the second part of this thesis, we develop methods for the estimation of the power of time varying noise. The main outcome is a method that exploits some similarities between the distribution of the noisy speech spectral amplitude coefficients within a single frequency bin, and the corresponding distribution of the corrupting noise. The above similarities allow the extraction of samples that are more likely to correspond to noise, from a window of past spectral amplitude observations. The extracted samples are then used to produce an estimate of the noise power. In the final part of this thesis, we are concerned with the incorporation of the time and frequency dependencies of speech signals in our estimation model. The theoretical framework on which the modelling is based is provided by Markov Random Fields (MRF’s). 
Initially, we develop a MAP estimator of speech based on the Gaussian MRF prior. We then introduce the Chi MRF, which is employed in the development of an improved speech estimator. Finally, the performance of fixed and adaptive schemes for the estimation of the MRF parameters is investigated.


6

O'Rourke, William Thomas. "Real-world evaluation of mobile phone speech enhancement algorithms." [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000585.


7

Ma, Ning. "Speech enhancement algorithms using Kalman filtering and masking properties of human auditory systems." Thesis, University of Ottawa (Canada), 2005. http://hdl.handle.net/10393/29229.


Abstract:

Speech enhancement algorithms have been employed successfully in many areas such as VoIP, automatic speech recognition and speaker verification. Many approaches are presented in the literature. This thesis focuses on enhancing single channel speech degraded by white noise or colored noise. A Kalman filter algorithm combined with the masking properties of human auditory systems is proposed. The threshold computed from the masking properties is used as a constraint in the Kalman filter to theoretically derive a modified Kalman filter. The derivation gives a theoretical foundation for the feasibility of combining masking properties with a Kalman filter. Some heuristic methods are also proposed for an easier implementation. One algorithm proposes to use the frequency domain masking level as a hard threshold to reshape the Kalman filtered signal. Another algorithm is to use a post-filter concatenated with the Kalman filter, using a threshold where both time-domain and frequency domain masking properties are taken into account. The goal of the masking is to make the energy of the estimate state error smaller than the threshold. To further decrease the computational cost, a wavelet Kalman filter combined with masking thresholds is also introduced. In the above algorithms, the speech model is assumed to be linear. Nonlinear speech models are also considered in the thesis. To address the nonlinear model problem, dual Extended Kalman Filter (EKF) and dual Unscented Kalman Filter (UKF) algorithms are studied. In these cases, both time-domain and frequency domain masking properties are taken into account. The simulation results show that all the proposed methods combining Kalman filter and masking properties can produce promising results from the point of view of PESQ scores. The average PESQ score gains obtained by these proposed methods are from about 0.35 to 0.45. Some informal subjective tests also show that the performance of the proposed methods is promising. 
No voice activity detection is required in the proposed methods.


8

Sabuwala, Adnan H. "Towards a real-time implementation of loudness enhancement algorithms on a Motorola DSP 56600." [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000602.


9

Arioz, Umut. "Developing Subject-specific Frequency Lowering Algorithms With Simulated Hearing Loss For The Enhancement Of Sensorineural Hearing Loss." PhD thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614929/index.pdf.


Abstract:

This thesis addresses the hearing and comprehension problems of people with high-frequency hearing loss. To overcome these problems, two main studies were carried out: developing a hearing loss simulation (HLS) and applying new frequency lowering methods (FLMs). The HLS was developed with suprathreshold effects, and the new FLMs were applied in different combinations. To evaluate the studies, the modified rhyme test (MRT) and the speech intelligibility index (SII) were used as subjective and objective measures, respectively. Before both studies, offline studies were carried out to specify the significant parameters and values to use in the MRT. For the HLS study, twelve hearing-impaired subjects listened to unprocessed sounds and thirty-six normal-hearing subjects listened to simulated sounds. In the evaluation of the HLS, both measures gave similar and consistent results for both unprocessed and simulated sounds. In the FLM study, hearing-impaired subjects were simulated, and normal-hearing subjects listened to frequency-lowered sounds with the specified methods, parameters, and values. All FLMs were compared with the standard hearing aid method (amplification) in five different noisy environments. Across all cases, the FLMs had an 83% success rate in achieving a higher speech intelligibility improvement than amplification. In conclusion, subject-specific FLMs were shown to be necessary to achieve higher intelligibility than amplification alone. Accordingly, a methodology was developed for selecting parameter values for different noisy environments and different audiograms.


10

Al-Ali, Ahmed Kamil Hasan. "Forensic speaker recognition under adverse conditions." Thesis, Queensland University of Technology, 2019. https://eprints.qut.edu.au/130783/1/Ahmed%20Kamil%20Hasan_Al-Ali_Thesis.pdf.


Abstract:

The performance of forensic speaker recognition systems degrades significantly in the presence of environmental noise and reverberant conditions. This research developed new techniques to improve forensic speaker recognition performance under these conditions using fusion feature extraction techniques and speech enhancement based on the independent component analysis algorithm. A range of forensic speaker recognition applications will benefit from the research outcomes including criminal investigations and law enforcement agencies.


11

Wu, Jin-Fu, and 吳金富. "Speech Enhancement Using Subspace Noise Tracking Algorithm." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/p86vku.


Abstract:

碩士
國立臺北科技大學
電機工程系研究所
99
The overall quality and recognition rate of speech signals tend to decrease when they are corrupted by background noise. Speech enhancement is a technique commonly used in speech transmission and speech recognition that recovers the clean speech from noisy speech by using a noise estimator, i.e., a noise tracking algorithm. The more accurate the noise estimator, the more effective the enhancement technique. Our environment contains many kinds of noise with different characteristics, so designing an accurate noise estimator is not an easy task, since the estimator cannot know in advance which noise type it will deal with. In this thesis we propose an effective noise tracking algorithm based on a frequency-domain subspace decomposition method. We analyze the energy of the noise contained in the speech signal and then filter out the noise with a speech enhancement technique. Four speech enhancement techniques incorporating the proposed tracking algorithm are investigated: the spectrum subtraction method (SS), the time-domain Wiener filter (TDWF), the frequency-domain Wiener filter (FDWF), and the subspace method (SM). The well-known minimum statistics (MS) and minima controlled recursive averaging (MCRA) noise tracking algorithms are also included in the experiments for comparison. The experimental results show that, on average, the proposed noise tracking algorithm achieves a higher segmental signal-to-noise ratio improvement (SSNRI) than both the MS and MCRA methods. Taking the SS enhancement method as an example, when the signal-to-noise ratio (SNR) of the test speech is 10 dB, the SSNRI of the proposed tracking algorithm reaches 2.7146 dB, compared with 1.4837 dB for MCRA and 0.3418 dB for MS. As a result, it could provide superior communication quality in noisy environments.
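
The SSNRI figure reported above can be illustrated with a short sketch. This is a common textbook formulation and not necessarily the exact one used in the thesis: the frame length, the clamping range of -10 dB to 35 dB, the non-overlapping frames, and the function names are all assumptions.

```python
import math

def segmental_snr(clean, estimate, frame_len, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB, with each frame clamped to [lo, hi].

    Assumes the signals hold at least one full frame of frame_len samples.
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = estimate[start:start + frame_len]
        signal = sum(x * x for x in s)
        error = sum((x - y) ** 2 for x, y in zip(s, e))
        if signal == 0.0:
            snr = lo  # silent frame: use the lower clamp
        elif error == 0.0:
            snr = hi  # perfect reconstruction: use the upper clamp
        else:
            snr = 10.0 * math.log10(signal / error)
        values.append(min(max(snr, lo), hi))
    return sum(values) / len(values)

def ssnr_improvement(clean, noisy, enhanced, frame_len):
    """SSNRI: segmental SNR of the enhanced signal minus that of the noisy one."""
    return (segmental_snr(clean, enhanced, frame_len)
            - segmental_snr(clean, noisy, frame_len))
```

A positive return value from `ssnr_improvement` means the enhanced signal is closer to the clean reference, frame by frame, than the noisy input was.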


12

Huang, Yu-Hsuan, and 黃裕軒. "Deep Learning Applied to Speech Enhancement Algorithm." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/q2wt6u.


13

Chung, Cheng-Wei, and 鍾丞韋. "Development of an automatic singer identification system using speech enhancement algorithm." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/15863802129236619843.


Abstract:

碩士
國立彰化師範大學
車輛科技研究所
99
This thesis presents a study of an automatic singer identification system using a speech enhancement algorithm and an artificial neural network. The proposed system can be divided into two parts. In the first stage, the representative characteristics are extracted by voice activity detection (VAD) and Mel-frequency cepstral coefficients (MFCC). This reduces the computation dimensions and enhances the classification performance. In the second stage, the amplitude of the energy distribution obtained using VAD and MFCC is taken as the database input to the artificial neural network. The artificial neural network is used to train on the speech signal features. In addition, experiments using different speech enhancement methods compare the recognition rates of the enhanced signals. To recognize the singers effectively, this study uses a generalized regression neural network for training and testing. The experimental results show that the proposed system performs well for automatic singer identification.


14

Chang, Kai-Hsing, and 張凱行. "Algorithm/Hardware Design of a Subspace Tracking Based Speech Enhancement System." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/65074690868306148826.


Abstract:

碩士
國立成功大學
電機工程學系碩博士班
92
In this thesis, we describe a design for signal subspace speech enhancement based on a subspace tracking algorithm. The proposed algorithm combines signal subspace processing with a perceptual filterbank derived from a psycho-acoustic model. The experimental results, obtained by testing on the TAICAR database, show that our approach is better than conventional subspace methods. The low-frequency noise (below 1 kHz) in noisy car environments is suppressed efficiently after applying the perceptual filterbank. For real-time applications, we derive a pipelined VLSI architecture of the subspace tracking algorithm. The data hazard of the subspace tracking algorithm is solved by using a look-ahead method without delayed updating. The convergence rate of our architecture is faster than those of delayed PASTd architectures. To save chip area, a sharing technique for the multiplication units is adopted, which makes the number of multipliers independent of the filter length. This architecture has been realized on an ARM-based ALTERA EPXA10 Development Board at a frequency of 9.7 MHz. Simulation results are presented to validate our algorithm and hardware architectures.


15

Yang, Jui-Cheng, and 楊瑞政. "Speech Enhancement Based on Hybrid Wavelet Thresholding Algorithm for Reducing Colored Noise." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/01329104222916857001.


Abstract:

碩士
崑山科技大學
電機工程研究所
94
A well-known wavelet denoising method is the wavelet shrinkage algorithm proposed by Donoho and Johnstone. It applies a wavelet transform to the degraded signal to produce wavelet coefficients, which are used to evaluate a threshold value for the wavelet shrinkage function (hard thresholding or soft thresholding). These methods were tested only on white noise suppression. We therefore propose an effective method for reducing colored noise. First, we develop two different thresholds from the wavelet-packet coefficients produced by the discrete wavelet-packet transform. We then apply these thresholds to a new hybrid wavelet shrinkage function to suppress the colored noise. Finally, we apply this new wavelet denoising algorithm to enhance speech corrupted by colored noise such as car noise and fan noise. In these applications, the signal-to-noise ratio (SNR) is used to evaluate performance, and the results show that the new wavelet denoising algorithm suppresses colored noise effectively and improves speech quality.
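
For reference, the two classical shrinkage rules named in the abstract can be sketched as below, together with the Donoho-Johnstone universal threshold. The thesis's own hybrid shrinkage function and its two thresholds are not specified here, so this sketch covers only the standard hard and soft rules; the function names are hypothetical.

```python
import math

def universal_threshold(sigma, n):
    """Donoho-Johnstone universal threshold sigma * sqrt(2 * ln(n))."""
    return sigma * math.sqrt(2.0 * math.log(n))

def hard_threshold(c, t):
    """Keep the coefficient unchanged if it exceeds t in magnitude, else zero it."""
    return c if abs(c) > t else 0.0

def soft_threshold(c, t):
    """Zero small coefficients and shrink the rest toward zero by t."""
    if abs(c) <= t:
        return 0.0
    return math.copysign(abs(c) - t, c)
```

Hard thresholding preserves the amplitude of retained coefficients but is discontinuous at the threshold, while soft thresholding is continuous but biases large coefficients toward zero. A hybrid rule typically tries to balance these two behaviours.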


16

Sinha, Pavel. "Algorithm and architecture for simultaneous diagonalization of matrices applied to subspace-based speech enhancement." Thesis, 2008. http://spectrum.library.concordia.ca/975645/1/MR40897.pdf.


Abstract:

This thesis presents an algorithm and architecture for simultaneous diagonalization of matrices. As an example, a subspace-based speech enhancement problem is considered, wherein the covariance matrices of the speech and noise are diagonalized simultaneously. To assess the system performance of the proposed algorithm, objective measurements of speech enhancement are shown in terms of the signal-to-noise ratio and mean bark spectral distortion at various noise levels. In addition, an innovative subband analysis technique for subspace-based time-domain constrained speech enhancement is proposed. The proposed technique analyses the signal in its subbands to build accurate estimates of the covariance matrices of speech and noise, exploiting the inherent slowly varying characteristics of speech and noise signals in narrow bands. The subband approach also decreases the computation time by reducing the order of the matrices to be simultaneously diagonalized. Simulation results indicate that the proposed technique performs well under extremely low signal-to-noise ratio conditions. Further, an architecture is proposed to implement the simultaneous diagonalization scheme. The architecture is implemented on an FPGA primarily to compare the performance measures on hardware and to assess the feasibility of the speech enhancement algorithm in terms of resource utilization, throughput, etc. A Xilinx FPGA is targeted for implementation. The FPGA resource utilization reinforces the practicability of the design. A projection of the design feasibility for an ASIC implementation, in terms of transistor count only, is also included.


17

Huang, Sheng-Wen, and 黃聖彣. "Speech Enhancement Algorithm Based on Power Spectral Density Ratio Applied to Smart Handheld Devices." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/647me2.


18

Huang, Yin-Huan, and 黃尹鐶. "Speech Enhancement Algorithm Based on Modified Generalized Sidelobe Canceller For Far-field Microphone Array Application." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/nca2ta.


19

You, Ming-jhan, and 游明展. "An Improved Spectral Subtractive-Type Algorithm with a Spectral Weighted Filter for Single-Channel Speech Enhancement." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/82603516081518657797.


Abstract:

碩士
國立成功大學
電機工程學系碩博士班
96
This thesis presents a two-step spectral subtraction speech enhancement algorithm to reduce background noise. First, spectral subtractive-type algorithms are used to obtain the subtracted signal by subtracting the noise power spectrum from the power spectrum of the noisy speech signal. Due to the inherent deficiency of conventional subtractive-type algorithms, residual noise still remains in the subtracted signal. To solve this problem, we integrate a spectral weighted filter with the subtractive-type algorithm to further eliminate the residual noise. The adjustment of the spectral weighted filter is based on the current spectrum of the subtracted signal. If the current spectrum of the subtracted signal is residual noise, we set the value of the spectral weighted filter close to zero to suppress it. Otherwise, the value of the spectral weighted filter can be set to one to preserve the speech information, taking the masking effect into account. The effectiveness of the proposed subtractive-type algorithms coupled with additive spectral weighted filters has been validated by SNR improvement tests and improvement rate tests, and compared with conventional subtractive-type algorithms. According to our simulation results, the proposed algorithms outperform the conventional methods in both tests.
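
The two steps described above can be sketched per frequency bin as follows. This is a generic illustration under assumptions, not the thesis's algorithm: the spectral floor of 0.01, the function names, and the way the weights are supplied by the caller are hypothetical, and the rule that decides each weight (residual noise versus speech, with masking considered) is not reproduced here.

```python
def power_spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Step 1: subtract the noise power estimate from each bin.

    Bins that would fall below floor * noisy_power are clamped to that
    value, one common way to avoid negative or near-zero power.
    """
    return [
        max(y - n, floor * y)
        for y, n in zip(noisy_power, noise_power)
    ]

def apply_spectral_weights(subtracted_power, weights):
    """Step 2: scale each bin by a weight in [0, 1].

    A weight near 0 suppresses residual noise; a weight of 1 keeps the bin.
    """
    return [w * p for w, p in zip(weights, subtracted_power)]
```

For instance, a bin judged to hold only residual noise would receive a weight near zero, while a bin dominated by speech would keep a weight of one.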


20

Shen, Yu-Chung, and 沈于中. "Estimation of Noise Magnitude Using Minima-Controlled-Recursive-Averaging Algorithm Adapted by Vowel Harmonic for Speech Enhancement." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/08729934395341655099.


Abstract:

碩士
亞洲大學
資訊傳播學系
102
Accurately estimating the noise magnitude can improve the performance of a speech enhancement system. However, most noise estimators either overestimate or underestimate the noise level. Overestimating the noise causes serious speech distortion. Conversely, underestimating the noise magnitude introduces a large amount of residual noise. Accurate estimation of the noise magnitude is therefore important for speech enhancement. In this study, we employ a minima-controlled-recursive-averaging (MCRA) algorithm adapted by vowel harmonics to estimate the noise level. The speech-presence probability is adapted by the number of robust harmonics, so that a vowel spectrum obtains a speech-presence probability approaching unity. The vowel spectra can thus be well preserved. Consequently, the enhanced speech quality is improved while background noise is efficiently reduced. Experimental results show that the proposed method can accurately estimate the noise magnitude and can improve the performance of the MCRA algorithm.
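
The role of the speech-presence probability in an MCRA-style estimator can be illustrated with the standard recursive-averaging update for a single frequency bin. This is a minimal sketch: the smoothing constant `alpha = 0.95` is an assumed value, the function name is hypothetical, and the harmonic-based adaptation of the probability proposed in the thesis is not modelled, so `speech_prob` is simply supplied by the caller.

```python
def mcra_noise_update(noise_prev, noisy_power, speech_prob, alpha=0.95):
    """One recursive-averaging update of the noise power in a single bin.

    A speech_prob near 1 freezes the estimate so speech is not absorbed
    into the noise; a speech_prob near 0 lets the estimate track the input.
    """
    alpha_eff = alpha + (1.0 - alpha) * speech_prob
    return alpha_eff * noise_prev + (1.0 - alpha_eff) * noisy_power
```

Driving the probability toward unity in vowel frames, as the abstract describes, keeps strong harmonic energy out of the noise estimate and so avoids overestimating the noise there.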


21

Tu, Bing-Hong, and 杜秉鴻. "Speech Enhancement Algorithm Based on Modified Power Spectral Density Difference Applied to Dual Microphone Smart Handheld Devices." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/eq5ww3.


22

Huang, Chung-Han, and 黃重翰. "Noise Reduction Algorithms for Speech Enhancement." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/70999296683649720524.


Abstract:

碩士
國立臺灣大學
電信工程學研究所
99
When clean speech is affected by various types of noise, many methods are available to reduce the noise. However, the more noise is reduced, the more speech distortion the enhanced signal exhibits. Although the Wiener filter minimizes the mean square error (MMSE), it also introduces substantial speech distortion. This thesis modifies the Wiener filter into a tradeoff filter and implements an adaptive tradeoff parameter so that the degree of suppression varies with the SNR. We also modify the Wiener filter with a different gain function to balance noise reduction against speech distortion and to preserve some information of the clean speech. In addition, a new noise estimation method derived from minimum statistics estimation is proposed. All of these improvements reduce speech distortion while maintaining a certain degree of noise reduction. Finally, we compare the results of the improved algorithms with the conventional Wiener filter in white noise, babble noise and vuvuzela noise, based on a speech distortion measure and a noise reduction measure.
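
One common way to turn the Wiener gain into a tradeoff filter is the parametric form sketched below, where a parameter `mu` scales the noise term. Whether the thesis uses exactly this form is not stated in the abstract, so treat the formula, the name `tradeoff_wiener_gain`, and the parameter `mu` as assumptions.

```python
def tradeoff_wiener_gain(snr_prior, mu=1.0):
    """Gain xi / (xi + mu) for an a priori SNR xi on a linear scale.

    mu = 1 gives the classical Wiener gain; mu > 1 suppresses more noise at
    the cost of more speech distortion, and mu < 1 does the opposite.
    """
    return snr_prior / (snr_prior + mu)
```

An adaptive scheme would choose `mu` per frame or per bin from an SNR estimate, for example a larger `mu` at low SNR, so that suppression is strongest where noise dominates.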


23

Chen, Chun-Hung, and 陳俊宏. "Single-channel noise reduction algorithms for speech enhancement." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/98591437384251482710.


Abstract:

碩士
國立交通大學
機械工程學系
98
This paper proposes an optimized speech enhancement algorithm for single-channel noise reduction (NR) and applies the NR algorithm to speech recognition. The optimization process is based on an objective function obtained from a regression model and on the simulated annealing (SA) algorithm, which is well suited to problems with many local optima. The NR algorithm, a minimum mean-square error noise reduction (MMSE-NR) algorithm, employs a time-recursive averaging (TRA) method for noise estimation. Objective tests were undertaken to compare the optimized MMSE-TRA-NR and MMSE-VAD-TRA-NR algorithms with several conventional NR algorithms. White noise and car noise at a signal-to-noise ratio (SNR) of 5 dB are used in these tests. Compared with conventional algorithms, the optimized MMSE-TRA-NR and MMSE-VAD-TRA-NR algorithms proved effective in enhancing noise-corrupted speech signals without compromising the timbral quality. The optimized MMSE-TRA-NR algorithm can also be used in automatic speech recognition (ASR), where the recognition rate is improved by the optimal parameters of the MMSE-TRA-NR algorithm.


24

Παπανικολάου, Παναγιώτης. "Ενίσχυση σημάτων μουσικής υπό το περιβάλλον θορύβου." Thesis, 2010. http://nemertes.lis.upatras.gr/jspui/handle/10889/3833.


Abstract:

Στην παρούσα εργασία επιχειρείται η εφαρμογή αλγορίθμων αποθορυβοποίησης σε σήματα μουσικής και η εξαγωγή συμπερασμάτων σχετικά με την απόδοση αυτών ανά μουσικό είδος. Η κύρια επιδίωξη είναι να αποσαφηνιστούν τα βασικά προβλήματα της ενίσχυσης ήχων και να παρουσιαστούν οι διάφοροι αλγόριθμοι που έχουν αναπτυχθεί για την επίλυση των προβλημάτων αυτών. Αρχικά γίνεται μία σύντομη εισαγωγή στις βασικές έννοιες πάνω στις οποίες δομείται η τεχνολογία ενίσχυσης ομιλίας. Στην συνέχεια εξετάζονται και αναλύονται αντιπροσωπευτικοί αλγόριθμοι από κάθε κατηγορία τεχνικών αποθορυβοποίησης, την κατηγορία φασματικής αφαίρεσης, την κατηγορία στατιστικών μοντέλων και αυτήν του υποχώρου. Για να μπορέσουμε να αξιολογήσουμε την απόδοση των παραπάνω αλγορίθμων χρησιμοποιούμε αντικειμενικές μετρήσεις ποιότητας, τα αποτελέσματα των οποίων μας δίνουν την δυνατότητα να συγκρίνουμε την απόδοση του κάθε αλγορίθμου. Με την χρήση τεσσάρων διαφορετικών μεθόδων αντικειμενικών μετρήσεων διεξάγουμε τα πειράματα εξάγοντας μια σειρά ενδεικτικών τιμών που μας δίνουν την ευχέρεια να συγκρίνουμε είτε τυχόν διαφοροποιήσεις στην απόδοση των αλγορίθμων της ίδιας κατηγορίας είτε διαφοροποιήσεις στο σύνολο των αλγορίθμων. Από την σύγκριση αυτή γίνεται εξαγωγή χρήσιμων συμπερασμάτων σχετικά με τον προσδιορισμό των παραμέτρων κάθε αλγορίθμου αλλά και με την καταλληλότητα του κάθε αλγορίθμου για συγκεκριμένες συνθήκες θορύβου και για συγκεκριμένο μουσικό είδος.
This thesis attempts to apply noise reduction algorithms to music signals and to draw conclusions about the performance of each algorithm for each musical genre. The main aims are to clarify the basic problems of sound enhancement and to present the various algorithms developed for solving these problems. After a brief introduction to the basic concepts of sound enhancement, we examine and analyze various algorithms that have been proposed in the literature for speech enhancement. These algorithms can be divided into three main classes: spectral subtractive algorithms, statistical-model-based algorithms and subspace algorithms. To evaluate the performance of these algorithms we use objective measures of quality, the results of which allow us to compare the performance of each algorithm. Using four different objective measures to conduct the experiments, we obtain a set of values that allow us to make within-class and across-class algorithm comparisons. From these comparisons we can draw conclusions on the choice of parameters for each algorithm and on the suitability of each algorithm for specific noise conditions and music genres.
