Dissertations / Theses on the topic 'Image and speech coding'
Consult the top 50 dissertations / theses for your research on the topic 'Image and speech coding.'
Yan, Ming. "VLSI architectures for speech and image coding applications." Thesis, Queen's University Belfast, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.356855.
So, Stephen. "Efficient Block Quantisation for Image and Speech Coding." Thesis, Griffith University, 2005. http://hdl.handle.net/10072/366625.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Microelectronic Engineering
Savvides, Vasos E. "Perceptual models in speech quality assessment and coding." Thesis, Loughborough University, 1988. https://dspace.lboro.ac.uk/2134/36273.
Lo, Ka-Yiu. "Pitch synchronous speech coding at very low bit rates." Thesis, University of Liverpool, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.321128.
Farsi, Hassan. "Advanced pre-and-post processing techniques for speech coding." Thesis, University of Surrey, 2003. http://epubs.surrey.ac.uk/844491/.
Peng, Yong Kian. "Speech coding based on a pitch synchronous pattern recognition approach." Thesis, University of Ulster, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.245804.
Meh, Chu Chu. "Exploiting spatial and temporal redundancies for vector quantization of speech and images." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54442.
Abboud, Karim. "Wideband CELP speech coding." Thesis, McGill University, 1992. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=56805.
The first approach considers the quantization of Linear Predictive Coding (LPC) parameters and uses a three-way split vector quantization. Both scalar and vector quantization are initially studied; results show that, with adequate codebook training, the second method generates better results while using fewer bits. Nevertheless, the use of vector quantizers remains highly complex in terms of memory and number of computations. A new quantization scheme, split vector quantization (split VQ), is investigated to overcome this complexity problem. Using a new weighted distance measure as a selection criterion for split VQ, the average spectral distortion is significantly reduced to match the results obtained with scalar quantizers.
The second approach introduces a new pitch predictor with an increased temporal resolution for periodicity. This new technique has the advantage of maintaining the same quality obtained with conventional multiple-coefficient predictors at a reduced bit rate. Furthermore, the conventional CELP noise weighting filter is modified to allow more freedom and better accuracy in the modeling of both tilt and formant structures. Throughout this process, different noise weighting schemes are evaluated, and the results show that the new filter contributes greatly to solving the problem of high-frequency distortion.
The final wideband CELP coder is operational at 11.7 kbits/s and generates a high perceptual quality of the reconstructed speech using the fractional pitch predictor and the new perceptual noise weighting filter.
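As an illustration of the split VQ idea described in this abstract, here is a minimal sketch, not the thesis's implementation. It assumes a plain squared Euclidean distance (the thesis uses a weighted distance measure whose form is not given here), and the function names, split sizes, and codebooks are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codeword closest to x (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for i, codeword in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(x, codeword))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def split_vq_encode(x: Vector, splits: Sequence[int],
                    codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Cut x into consecutive sub-vectors and quantize each with its own codebook."""
    indices = []
    start = 0
    for size, codebook in zip(splits, codebooks):
        indices.append(nearest(codebook, x[start:start + size]))
        start += size
    return indices


def split_vq_decode(indices: Sequence[int],
                    codebooks: Sequence[Sequence[Vector]]) -> List[float]:
    """Concatenate the selected codewords to rebuild the full vector."""
    out: List[float] = []
    for index, codebook in zip(indices, codebooks):
        out.extend(codebook[index])
    return out
```

Because each sub-vector is searched in its own small codebook, storage and search cost grow with the sum of the codebook sizes rather than with their product, which is the complexity saving the abstract refers to.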
Streit, Juergen Stefan. "Digital image coding." Thesis, University of Southampton, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.361092.
Sturt, Christian. "Pitch synchronous speech coding techniques." Thesis, University of Surrey, 2003. http://epubs.surrey.ac.uk/843327/.
Kaouri, Hussein Ali. "Speech coding using vector quantisation." Thesis, Queen's University Belfast, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.356934.
Kritzinger, Carl. "Low bit rate speech coding." Thesis, Stellenbosch : University of Stellenbosch, 2006. http://hdl.handle.net/10019.1/2078.
Despite enormous advances in digital communication, the voice is still the primary tool with which people exchange ideas. However, uncompressed digital speech tends to require prohibitively high data rates (upward of 64 kbps), making it impractical for many applications. Speech coding is the process of reducing the data rate of digital voice to manageable levels. Parametric speech coders, or vocoders, utilise a priori information about the mechanism by which speech is produced in order to achieve extremely efficient compression of speech signals (as low as 1 kbps). The greater part of this thesis comprises an investigation into parametric speech coding. This consisted of a review of the mathematical and heuristic tools used in parametric speech coding, as well as the implementation of an accepted standard algorithm for parametric voice coding. In order to examine avenues of improvement for the existing vocoders, we examined some of the mathematical structure underlying parametric speech coding. Following on from this, we developed a novel approach to parametric speech coding which obtained promising results under both objective and subjective evaluation. An additional contribution of this thesis was the comparative subjective evaluation of the effect of parametric speech coding on English and Xhosa speech. We investigated the performance of two different encoding algorithms on the two languages.
Burnett, I. S. "Hybrid techniques for speech coding." Thesis, University of Bath, 1992. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.317353.
Chowdhury, Md Mahbubul Islam. "Image segmentation for coding." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0017/MQ55494.pdf.
VASCONCELLOS, EDMAR DA COSTA. "SUB-BAND IMAGE CODING." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 1994. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=8635@1.
This work addresses the problem of image compression using sub-band coding (SBC). The basic structure, used in the first part of the thesis, is a uniform decomposition of the image into 16 sub-bands, with the aim of reproducing the results of Woods [1]. The components of the 16 sub-bands are quantized and coded, and bits are allocated among the sub-bands so as to minimize the mean-squared error. The quantizers are designed for a Generalized Gaussian distribution, which models the sub-band components. In the coding process, the lowest-frequency sub-band is coded with DPCM, while the remaining sub-bands are coded with PCM. As an innovation, the Lempel-Ziv algorithm is proposed for the lossless coding (compaction) of the quantized sub-bands. Both the Huffman and LZW (a modification of LZA) algorithms are used in the compaction step. The simulation results are presented in terms of rate (bits/pixel) versus peak signal-to-noise ratio and in terms of subjective quality of the reconstructed images. With sub-band decomposition, the Huffman algorithm performs about 2 dB better than LZW. However, the universality of the Lempel-Ziv algorithm is an advantage, which suggests that its performance should be investigated further with implementations different from the one explored in this work.
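The DPCM step applied to the lowest sub-band can be sketched as follows. This is an illustrative sketch only: it assumes a first-order predictor (the previous reconstructed sample) and a uniform quantizer with a hypothetical `step` parameter, neither of which is specified in the abstract, and it works on a one-dimensional sequence for brevity.

```python
from typing import List, Sequence


def dpcm_encode(samples: Sequence[float], step: float) -> List[int]:
    """Quantize the difference between each sample and the prediction.

    The prediction is the previously reconstructed sample, so the encoder
    tracks exactly what the decoder will see and quantization errors do
    not accumulate.
    """
    indices = []
    prediction = 0.0
    for sample in samples:
        level = round((sample - prediction) / step)  # uniform quantizer index
        indices.append(level)
        prediction += level * step  # reconstructed value, used as next prediction
    return indices


def dpcm_decode(indices: Sequence[int], step: float) -> List[float]:
    """Rebuild the samples by accumulating the dequantized differences."""
    out = []
    prediction = 0.0
    for level in indices:
        prediction += level * step
        out.append(prediction)
    return out
```

The index stream produced by such an encoder is what an entropy coder such as Huffman or LZW would then compact.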
Al-Naimi, Khaldoon Taha. "Advanced speech processing and coding techniques." Thesis, University of Surrey, 2002. http://epubs.surrey.ac.uk/843488/.
Zhao, David Yuheng. "Model Based Speech Enhancement and Coding." Doctoral thesis, Stockholm : Kungliga Tekniska högskolan, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4412.
Katugampala, Nilantha N. "Multimode speech coding below 6 kbps." Thesis, University of Surrey, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.365141.
Green, Richard C. "Walsh based cepstra for speech coding." Thesis, King's College London (University of London), 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.392848.
Ooi, James M. 1970. "Application of wavelets to speech coding." Thesis, Massachusetts Institute of Technology, 1993. http://hdl.handle.net/1721.1/12340.
Zolfaghari, Parham Seyed. "Sinusoidal model based segmental speech coding." Thesis, University of Cambridge, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.621177.
Andersson, Tomas. "On error-robust source coding with image coding applications." Licentiate thesis, Stockholm : Department of Signals, Sensors and Systems, Royal Institute of Technology, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4046.
Bergström, Peter. "Eye-movement controlled image coding /." Linköping : Univ, 2003. http://www.bibl.liu.se/liupubl/disp/disp2003/tek831s.pdf.
Silva, Eduardo Antonio Barros da. "Wavelet transforms for image coding." Thesis, University of Essex, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.282495.
Kubrick, Aharon H. "Image coding employing vector quantisation." Thesis, City University London, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.357009.
Morgan, Pamela Sheila. "Medical image coding and segmentation :." Thesis, University of Bristol, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.442206.
Desai, Ujjaval Yogesh. "Coding of segmented image sequences." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/11984.
Includes bibliographical references (leaves 72-74).
by Ujjaval Yogesh.
M.Eng.
Frajka, Tamás. "Image coding subject to constraints /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2003. http://wwwlib.umi.com/cr/ucsd/fullcit?p3090437.
Batri, Nadim. "Robust spectral parameter coding in speech processing." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0005/MQ43996.pdf.
Asenstorfer, John A. "Source-channel coding for CELP speech coders /." Title page, contents and abstract only, 1994. http://web4.library.adelaide.edu.au/theses/09PH/09pha816.pdf.
Soong, Michael. "Predictive split vector quantization for speech coding." Thesis, McGill University, 1994. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=68054.
Summation Product Codes (SPCs) are a family of structured vector quantizers that circumvent the complexity obstacle. The performance of SPC vector quantizers can be traded off against their storage and encoding complexity. Besides the complexity factors, the design algorithm can also affect the performance of the quantizer. The conventional generalized Lloyd's algorithm (GLA) generates sub-optimal codebooks. For particular SPCs such as multistage VQ, the GLA is applied to design the stage codebooks stage-by-stage. Joint design algorithms, on the other hand, update all the stage codebooks simultaneously.
In this thesis, a general formulation and an algorithmic solution to the joint codebook design problem are provided for the SPCs. The key to this algorithm is that every SPC has a reference product codebook which minimizes the overall distortion. This joint design algorithm is tested with a novel SPC, namely "Predictive Split VQ (PSVQ)".
VQ of speech Line Spectral Frequencies (LSFs) using PSVQ is also presented. A result of this work is that PSVQ, designed using the joint codebook design algorithm, requires only 20 bits/frame (20 ms) for transparent coding of 10th-order LSF parameters.
Grass, John. "Quantization of predictor coefficients in speech coding." Thesis, McGill University, 1990. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=60067.
Scalar quantization is the first approach evaluated. Results show that Line Spectral Frequencies require significantly fewer bits than reflection coefficients for comparable performance. The second approach investigated is the use of vector-scalar quantization. In the first stage, vector quantization is performed. The second stage consists of a bank of scalar quantizers which code the vector errors between the original LPC coefficients and the components of the vector of the quantized coefficients.
The approach is to couple the vector and scalar quantization stages. Every codebook vector is compared to the original LPC coefficient vector to produce error vectors. The second innovation in vector-scalar quantization is the incorporation of a small adaptive codebook alongside the large fixed codebook. Frame-to-frame correlation of the LPC coefficients is exploited at no extra cost in bits.
The performance of the vector-scalar quantization using the two new techniques is better than that of the scalar coding techniques currently used in conventional LPC coders.
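The two-stage structure described in this abstract can be sketched as follows. This is a hedged illustration, not the thesis's coder: it shows only the basic uncoupled form (pick the nearest codeword, then scalar-quantize the residual), uses squared Euclidean distance, and assumes a single uniform step size for every component. All names and values are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def vector_scalar_encode(x: Vector, codebook: Sequence[Vector],
                         step: float) -> Tuple[int, List[int]]:
    """Stage 1: choose the nearest codeword. Stage 2: scalar-quantize the error."""
    best_index, best_dist = 0, float("inf")
    for i, codeword in enumerate(codebook):
        dist = sum((a - c) ** 2 for a, c in zip(x, codeword))
        if dist < best_dist:
            best_index, best_dist = i, dist
    # One uniform scalar quantizer per component of the error vector.
    levels = [round((a - c) / step) for a, c in zip(x, codebook[best_index])]
    return best_index, levels


def vector_scalar_decode(index: int, levels: Sequence[int],
                         codebook: Sequence[Vector], step: float) -> List[float]:
    """Add the dequantized errors back onto the selected codeword."""
    return [c + q * step for c, q in zip(codebook[index], levels)]
```

The coupled variant mentioned in the abstract would instead evaluate the second-stage error for every codebook vector before choosing the first-stage index; that search is omitted here.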
Maroun, Nabih. "Toll-quality speech coding at 8 kbs." Thesis, McGill University, 1993. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=56802.
Suddle, Muhammad Riaz. "Speech coding in private and broadcast networks." Thesis, University of Surrey, 1996. http://epubs.surrey.ac.uk/1019/.
Oberhofer, Robert. "Pitch adaptive variable bitrate CELP speech coding." Thesis, University of Ulster, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.264811.
Thorpe, T. F. "Performance bounds for digital coding of speech." Thesis, University of Cambridge, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.234070.
Gant, Nicolas Roland Noel. "The linear predictive coding of mask speech." Thesis, University of Southampton, 1986. https://eprints.soton.ac.uk/52261/.
Deloche, François. "Short time-scale efficient coding of speech." Thesis, Paris, EHESS, 2019. http://www.theses.fr/2019EHES0142.
Cochlear frequency selectivity is known to reflect the overall statistical structure of speech, in line with the hypothesis that low-level sensory processing provides efficient codes for information contained in natural stimuli. Speech signals, however, possess a complex structure, even on short time scales, as a result of the diversity of acoustic factors involved in the generation of speech. This rich structure means that advanced coding schemes based on a nonlinear representation of speech sounds could provide more efficient codes. The first step in finding efficient strategies is to describe the statistical structure of speech at a fine level: at the level of phonemes or, even finer, at the level of acoustic events. In this thesis, I use a parametric approach to explore the fine-grained statistical structure of speech. The goal of this method is to find the sparsest representation of speech sounds among a family of dictionaries of Gabor filters whose frequency selectivity follows different power laws in the high-frequency range 1-8 kHz. I motivate the use of Gabor filters for the search of sparse time-frequency representations of speech signals, and I show that the dictionary method has a formal link with previous work based on Independent Component Analysis (ICA). The acoustic factors that affect the power law associated with the sparsest decomposition can be inferred from the analyses of synthetic and real data. The results suggest that an efficient speech coding strategy is to reduce frequency selectivity with sound intensity level, reflecting the nonlinear behavior of the cochlea.
Hoyle, Robert D. (Robert Douglas) Carleton University Dissertation Engineering Electrical. "Digital speech coding for land mobile radio." Ottawa, 1986.
Mason, Michael. "Hybrid coding of speech and audio signals." Thesis, Queensland University of Technology, 2001.
Chaiyaboonthanit, Thanit. "Image coding using wavelet transform and adaptive block truncation coding /." Online version of thesis, 1991. http://hdl.handle.net/1850/10913.
Leong, Michael. "Representing voiced speech using prototype waveform interpolation for low-rate speech coding." Thesis, McGill University, 1992. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=56796.
In examining the PWI method, it was found that although the method generally works very well, there are occasional sections of the reconstructed voiced speech where audible distortion can be heard, even when the prototypes are not quantized. The research undertaken in this thesis focuses on the fundamental principles behind modelling voiced speech using PWI instead of focusing on bit allocation for encoding the prototypes. Problems in the PWI method are found that may have been overlooked as encoding error if full encoding were implemented.
Kleijn uses PWI to represent voiced sections of the excitation signal, which is the residual obtained after the removal of short-term redundancies by a linear predictive filter. The problem with this method is that when the PWI-reconstructed excitation is passed through the inverse filter to synthesize the speech, undesired effects occur due to the time-varying nature of the filter. The reconstructed speech may have undesired envelope variations which result in audible warble.
This thesis proposes an energy fixup to smooth the synthesized speech envelope when the interpolation procedure fails to provide the smooth linear result that is desired. Further investigation, however, leads to the final proposal in this thesis that PWI should be performed on the clean speech signal instead of the excitation to achieve consistently reliable results for all voiced frames.
Varga, A. P. "Multipulse excited linear predictive analysis in speech coding and constructive speech synthesis." Thesis, University of Cambridge, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.372909.
Accardi, Anthony J. (Anthony Joseph) 1976. "A modular approach to speech enhancement with an application to speech coding." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/9976.
Includes bibliographical references (p. 98-101).
by Anthony J. Accardi.
B.S.
M.Eng.
Greenwood, Andrew Richard. "Articulatory speech synthesis." Thesis, University of Liverpool, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.386773.
Islam, Tamanna. "Interpolation of linear prediction coefficients for speech coding." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape4/PQDD_0034/MQ64229.pdf.
Trinkaus, Trevor R. "Perceptual coding of audio and diverse speech signals." Diss., Georgia Institute of Technology, 1999. http://hdl.handle.net/1853/13883.
Loo, James H. Y. (James Hung Yan). "Intraframe and interframe coding of speech spectral parameters." Thesis, McGill University, 1996. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=24065.
Because speech is quasi-stationary, interframe coding methods such as predictive SVQ (PSVQ) can exploit the correlation between adjacent LSF vectors. Nonlinear PSVQ (NPSVQ) is introduced in which a nonparametric and nonlinear predictor replaces the linear predictor used in PSVQ. Regardless of predictor type, PSVQ garners a performance gain of 5-7 bits/frame over SVQ. By interleaving intraframe SVQ with PSVQ, error propagation is limited to at most one adjacent frame. At an overall bit rate of about 21 bits/frame, NPSVQ can provide similar coding quality as intraframe SVQ at 24 bits/frame (an average gain of 3 bits/frame). The particular form of nonlinear prediction we use incurs virtually no additional encoding computational complexity. Voicing classification is used in classified NPSVQ (CNPSVQ) to obtain an additional average gain of 1 bit/frame for unvoiced frames. Furthermore, switched-adaptive predictive SVQ (SA-PSVQ) provides an improvement of 1 bit/frame over PSVQ, or 6-8 bits/frame over SVQ, but error propagation increases to 3-7 frames. We have verified our comparative performance results using subjective listening tests.
Ramachandran, Ravi P. "Pitch filtering in adaptive predictive coding of speech." Thesis, McGill University, 1986. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=65345.
Roy, Guylain. "Low-rate analysis-by-synthesis wideband speech coding." Thesis, McGill University, 1990. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=59643.
The study consists of three stages. First, aspects of wideband spectral envelope modeling using Line Spectral Frequencies (LSFs) are studied. Then, the underlying coder structure is derived from a basic Residual Excited Linear Predictive (RELP) coder. This structure is enhanced by the addition of a pitch prediction stage, and by the development of full-band and split-band pitch parameter optimization procedures. These procedures are then applied to a Code Excited Linear Prediction (CELP) model. Finally, the performance of the full-band and split-band CELP structures is compared.