Academic literature on the topic 'Image and speech coding'

Author: Grafiati

Published: 19 February 2023

Last updated: 20 February 2023

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Image and speech coding.'

Journal articles on the topic "Image and speech coding"

1

Guglielmo, Mario, Giulio Modena, and Roberto Montagna. "Speech and image coding for digital communications." European Transactions on Telecommunications 2, no. 1 (January 1991): 21–44. http://dx.doi.org/10.1002/ett.4460020106.

2

Zivin, Gail. "Image or neural coding of inner speech and agency?" Behavioral and Brain Sciences 9, no. 3 (September 1986): 534–35. http://dx.doi.org/10.1017/s0140525x00047002.

3

Alsaidi, Ramadhan Abdo Musleh, Hong Li, Yantao Wei, Rokan Khaji, and Yuan Yan Tang. "Hierarchical Sparse Method with Applications in Vision and Speech Recognition." International Journal of Wavelets, Multiresolution and Information Processing 11, no. 02 (March 2013): 1350016. http://dx.doi.org/10.1142/s0219691313500161.

Abstract:

This paper develops a new approach to feature extraction using neural response by combining hierarchical architectures with the sparse coding technique. In the proposed layered model, each layer of the hierarchy uses two components: sparse coding and a pooling operation. The sparse coding is used to solve for increasingly complex sparse feature representations, while the pooling operation, which compares sparse outputs, measures the match between a stored prototype and the input sub-image. The value of the best match is kept and the others are discarded. The proposed model is implemented and tested on two kinds of recognition task: image recognition and speech recognition (on an isolated-word vocabulary). Experimental results with various parameters demonstrate that the proposed scheme extracts more efficient features than other methods.

4

Kim, Seonjae, Dongsan Jun, Byung-Gyu Kim, Seungkwon Beack, Misuk Lee, and Taejin Lee. "Two-Dimensional Audio Compression Method Using Video Coding Schemes." Electronics 10, no. 9 (May 6, 2021): 1094. http://dx.doi.org/10.3390/electronics10091094.

Abstract:

As video compression is one of the core technologies that enables seamless media streaming within the available network bandwidth, it is crucial to employ media codecs to support powerful coding performance and higher visual quality. Versatile Video Coding (VVC) is the latest video coding standard developed by the Joint Video Experts Team (JVET) that can compress original data hundreds of times in the image or video; the latest audio coding standard, Unified Speech and Audio Coding (USAC), achieves a compression rate of about 20 times for audio or speech data. In this paper, we propose a pre-processing method to generate a two-dimensional (2D) audio signal as an input of a VVC encoder, and investigate the applicability to 2D audio compression using the video coding scheme. To evaluate the coding performance, we measure both signal-to-noise ratio (SNR) and bits per sample (bps). The experimental result shows the possibility of researching 2D audio encoding using video coding schemes.
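The abstract does not spell out the pre-processing step, so the following is only a minimal sketch of the general idea: pack a one-dimensional sample stream into a two-dimensional array that an image or video encoder could accept, then score a decoded signal with the two measures the abstract names. The row-major packing, the zero padding, and the names `to_2d`, `snr_db` and `bits_per_sample` are assumptions made for illustration, not the authors' method.

```python
import math


def to_2d(samples, width, pad=0):
    """Pack a 1D sample list into rows of `width` samples (row-major).

    The last row is padded with `pad`; this layout is an assumption.
    """
    rows = []
    for start in range(0, len(samples), width):
        row = list(samples[start:start + width])
        row += [pad] * (width - len(row))
        rows.append(row)
    return rows


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB between original and decoded samples."""
    signal = sum(s * s for s in original)
    noise = sum((s - d) ** 2 for s, d in zip(original, decoded))
    if noise == 0:
        return float("inf")  # identical signals
    return 10.0 * math.log10(signal / noise)


def bits_per_sample(total_bits, num_samples):
    """Average number of coded bits spent on each audio sample."""
    return total_bits / num_samples
```

Here `to_2d([1, 2, 3, 4, 5], 2)` yields three rows, the last padded with one zero; a real encoder would also need the samples mapped to its supported bit depth, which this sketch leaves out.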

5

Exarchakis, Georgios, and Jörg Lücke. "Discrete Sparse Coding." Neural Computation 29, no. 11 (November 2017): 2979–3013. http://dx.doi.org/10.1162/neco_a_01015.

Abstract:

Sparse coding algorithms with continuous latent variables have been the subject of a large number of studies. However, discrete latent spaces for sparse coding have been largely ignored. In this work, we study sparse coding with latents described by discrete instead of continuous prior distributions. We consider the general case in which the latents (while being sparse) can take on any value of a finite set of possible values and in which we learn the prior probability of any value from data. This approach can be applied to any data generated by discrete causes, and it can be applied as an approximation of continuous causes. As the prior probabilities are learned, the approach then allows for estimating the prior shape without assuming specific functional forms. To efficiently train the parameters of our probabilistic generative model, we apply a truncated expectation-maximization approach (expectation truncation) that we modify to work with a general discrete prior. We evaluate the performance of the algorithm by applying it to a variety of tasks: (1) we use artificial data to verify that the algorithm can recover the generating parameters from a random initialization, (2) use image patches of natural images and discuss the role of the prior for the extraction of image components, (3) use extracellular recordings of neurons to present a novel method of analysis for spiking neurons that includes an intuitive discretization strategy, and (4) apply the algorithm on the task of encoding audio waveforms of human speech. The diverse set of numerical experiments presented in this letter suggests that discrete sparse coding algorithms can scale efficiently to work with realistic data sets and provide novel statistical quantities to describe the structure of the data.

6

Das, B. K., R. N. Mahapatra, and B. N. Chatterji. "Performance Modeling of Discrete Cosine Transform for Star Graph Connected Multiprocessors." Journal of Circuits, Systems and Computers 06, no. 06 (December 1996): 635–48. http://dx.doi.org/10.1142/s0218126696000443.

Abstract:

The Discrete Cosine Transform algorithm has attracted research attention for its ability to address application-based problems in signal and image processing such as speech coding, image coding, filtering, cepstral analysis, topographic classification, progressive image transmission, and data compression. It has major applications in pattern recognition and image processing. In this paper, a Cooley-Tukey approach is proposed for computing the Discrete Cosine Transform, and the necessary mathematical formulations are developed for Star Graph connected multiprocessors. The signal flow graph of the algorithm is designed for mapping onto the Star Graph. The modeling results are derived in terms of computation time, speedup and efficiency.
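For reference, the transform discussed in this abstract can be written directly from its definition. The sketch below is a plain O(N²) orthonormal DCT-II and its inverse in pure Python. It is not the Cooley-Tukey formulation the paper proposes and it says nothing about the Star Graph mapping; the function names `dct_ii` and `idct_ii` are illustrative.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a real sequence, computed from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        # k = 0 uses sqrt(1/n); all other coefficients use sqrt(2/n).
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def idct_ii(coeffs):
    """Inverse of dct_ii (an orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        acc = 0.0
        for k in range(n):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            acc += scale * coeffs[k] * math.cos(
                math.pi * (2 * i + 1) * k / (2 * n))
        out.append(acc)
    return out
```

A constant block such as `[1.0, 1.0, 1.0, 1.0]` transforms to a single non-zero coefficient, which is the energy compaction that makes the DCT attractive for speech and image coding.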

7

Hussain, Y., and N. Farvardin. "Variable-rate finite-state vector quantization and applications to speech and image coding." IEEE Transactions on Speech and Audio Processing 1, no. 1 (1993): 25–38. http://dx.doi.org/10.1109/89.221365.

8

Shnain, Ahlam H. "Key Generation for Image Scrambling Using Voiceprint." Diyala Journal of Engineering Sciences 6, no. 3 (September 1, 2013): 1–16. http://dx.doi.org/10.24237/djes.2013.06301.

Abstract:

This paper presents a new algorithm for scrambling a color image using a voiceprint and linear predictive coding (LPC). The speech signal passes through a pre-processing stage that includes sampling and segmentation into frames. All frames are windowed with a rectangular window and fed to a linear predictor, which is used to obtain the coefficients of the pth-order all-pole vocal tract model and predicts the current sample of the speech signal from a linear combination of past samples. The Levinson-Durbin (L-D) procedure is applied to each speech frame to find the LP coefficients, reflection coefficients and prediction error. To scramble the color image, a key is generated manually from the LPC coefficients: the coefficients are sorted in ascending order and each coefficient is compared with all pixels of the color image. When an LPC coefficient is similar to a pixel, the pixel is replaced by that coefficient. The pixels are thus sent in a random sequence, and the color image is scrambled using the voiceprint (LPC) coefficients. Descrambling follows the reverse procedure. The scrambling process is simulated using MATLAB version 7.06.324 (R2008a). Tests with different speech signals and color images show good results in terms of SNR and correlation.
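The Levinson-Durbin step named in this abstract can be sketched in a few lines. This is a minimal pure-Python version of the textbook recursion, assuming the autocorrelation method on a rectangular-windowed frame. The function names and the sign convention (prediction as a positive weighted sum of past samples) are choices made here, not details taken from the paper, and the image-scrambling stage is not reproduced.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] for one rectangular-windowed frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, k, err) where a[j-1] weights s[n-j] in the prediction
    s_hat[n] = sum_j a[j-1] * s[n-j], k holds the reflection
    coefficients, and err is the final prediction error energy.
    Assumes r[0] > 0 (a silent frame would divide by zero).
    """
    a = [0.0] * (order + 1)
    k = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k[i] = acc / err
        new_a = a[:]
        new_a[i] = k[i]
        for j in range(1, i):
            new_a[j] = a[j] - k[i] * a[i - j]
        a = new_a
        err *= 1.0 - k[i] ** 2
    return a[1:], k[1:], err
```

For the autocorrelation sequence `[1.0, 0.5, 0.25]`, which matches a first-order model, a second-order fit returns a first coefficient of 0.5 and a second coefficient of zero.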

9

McGuire, David, Thomas N. Garavan, James Cunningham, and Greg Duffy. "The use of imagery in the campaign speeches of Barack Hussein Obama and John McCain during the 2008 US Presidential Election." Leadership & Organization Development Journal 37, no. 4 (June 6, 2016): 430–49. http://dx.doi.org/10.1108/lodj-07-2014-0136.

Abstract:

Purpose – The use of imagery in leadership speeches is becoming increasingly important in shaping the beliefs and actions of followers. The purpose of this paper is to investigate the use of speech imagery and linguistic features employed during the 2008 US Presidential Election campaign. Design/methodology/approach – The authors analysed a total of 264 speeches (160 speeches from Obama and 104 speeches from McCain) delivered throughout the 2008 US Presidential Election and identified 15 speech images used by the two candidates. Both descriptive coding and axial coding approaches were applied to the data, and speech images common to both candidates were further subjected to the linguistic inquiry methodology of Pennebaker et al. (2003). Findings – The analysis revealed a number of important differences, with Obama using inclusive language and nurturing communitarian values, whereas McCain focused on personal actions and strict, conservative individualistic values. The use of more inclusive language by Obama was found to be significant in three of the five speech images common to both candidates. Research limitations/implications – The research acknowledges the difficulty of measuring the effectiveness of speech images without taking into account wider factors such as tone of voice, facial expression and level of conviction. It also recognises the heavy use of speechwriters by presidential candidates whilst on the campaign trail, but argues that candidates still exert a strong influence through instructions to speechwriters and that speeches should reflect the candidate’s values and beliefs. Originality/value – The research findings contribute to the emerging stream of leadership research that addresses language content issues surrounding and embedded in the leadership process. The research argues that leaders’ speeches provide a fertile ground for conducting research and for examining the evolving relationship between leaders and followers.

10

Vinar, Olga. "Means of speech characteristics of the stage image in the context of decoding the signal space contemporary performances." National Academy of Managerial Staff of Culture and Arts Herald, no. 2 (September 17, 2021): 311–16. http://dx.doi.org/10.32461/2226-3209.2.2021.240110.

Abstract:

The purpose of the article is to identify the features of speech characteristics in the context of the disclosure of the plot-content aspect of the performance of postmodern aesthetics. Methodology. A typological and systematic method is used to study the creative mechanisms of the actor's creation of the speech characteristic of the image in the context of the peculiarities of the aesthetics of postmodernism; a cognitive method, thanks to which theoretical positions in the field of language psychology and speech therapy are extrapolated to the field of theatrical art; a method of theoretical generalization, etc. Scientific novelty. The influence of postmodernist theater tendencies on the process of the actor's voice work on the creation of the image – the development and implementation of speech features of the character; the peculiarities of the process of decoding the sign system of a modern production on the basis of the interpretation of the speech characteristic of the images created by the actors are analyzed. Conclusions. The verbal characteristic of the image contributes to the actor's representation of certain internal characteristics of the character, his emotional state, deep reaction to events and/or actions of other characters, changes in lifestyle and worldview, etc., and in the context of postmodern aesthetics, when stage texts "double coding" – a complex phenomenon of postmodernism, artistic and aesthetic means of which in theatrical art is a synthesis of different languages and codes of literature and philosophy in a holistic hypertext of performance, contribute to the understanding and comprehension of semantic aspects of theatrical production. The actor's use of elements of speech characteristic contributes to the expansion of his professional speech competence, the diversity of speech sound of the stage word. 
The components of speech characteristics can act as expressive means that reveal important aspects of the character; the necessary form of revealing the internal content of the stage image; an important element of the psychophysical structure of the role; an artistic technique that enhances the expressive possibilities of the stage word, and, accordingly, helps to form verbal symbolic means of expressing the artistic meanings of the performance. In our opinion, for the organic process of forming the stage image in general and the speech characteristics of the character in particular, the actor must not only expand his attention, observation, and ability to understand and comprehend the psychological and social causes of human behavior, but also deepen knowledge of psycholinguistics and speech therapy. Because only the organic combination of these factors contributes to the design of optimal stage speech, which corresponds to the concept of each specific performance.

Dissertations / Theses on the topic "Image and speech coding"

1

Yan, Ming. "VLSI architectures for speech and image coding applications." Thesis, Queen's University Belfast, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.356855.

2

So, Stephen. "Efficient Block Quantisation for Image and Speech Coding." Thesis, Griffith University, 2005. http://hdl.handle.net/10072/366625.

Abstract:

Signal coding or compression has played a significant role in the success of digital communications and multimedia. The use of signal coding pervades many aspects of our digital lifestyle, a lifestyle that has seen widespread demand for applications like third generation mobile telephony, portable music players, Internet-based video conferencing, digital television, etc. The issues that arise, when dealing with the transmission and storage of digital media, are the limited bandwidth of communication channels, the limited capacity of storage devices, and the limited processing ability of the encoding and decoding devices. The aim of signal coding is therefore to represent digital media, such as speech, music, images, and video, as efficiently as possible. Coding efficiency encompasses rate-distortion (for lossy coding), computational complexity, and static memory requirements. The fundamental operation in lossy signal coding is quantisation. Its rate-distortion efficiency is influenced by the properties of the signal source, such as statistical dependencies and its probability density function. Vector quantisers are known to theoretically achieve the lowest distortion, at a given rate and dimension, of any quantisation scheme, though their computational complexity and memory requirements grow exponentially with rate and dimension. Structurally constrained vector quantisers, such as product code vector quantisers, alleviate these complexity issues, though this is achieved at the cost of degraded rate-distortion performance. Block quantisers or transform coders, which are a special case of product code vector quantisation, possess both low computational and memory requirements, as well as the ability to scale to any bitrate, which is termed bitrate scalability. However, the prerequisite for optimal block quantisation, namely a purely Gaussian data source with uniform correlation, is rarely ever met with real-world signals. 
The Gaussian mixture model-based block quantiser, which was originally developed for line spectral frequency (LSF) quantisation for speech coding, overcomes these problems of source mismatch and non-stationarity by estimating the source using a GMM. The split vector quantiser, which was also successfully applied to LSF quantisation in the speech coding literature, is a product code vector quantiser that overcomes the complexity problem of unconstrained vector quantisers, by partitioning vectors into sub-vectors and quantising each one independently. The complexity can be significantly reduced via more vector splitting, though this inevitably leads to an accompanying degradation in the rate-distortion efficiency. This is because the structural constraint of vector splitting causes losses in several properties of vector quantisers, which are termed 'advantages'. This dissertation makes several contributions to the area of block and vector quantisation, more specifically to the GMM-based block quantiser and split vector quantiser, which aim to improve their rate-distortion and computational efficiency. These new quantisation schemes are evaluated and compared with existing and popular schemes in the areas of lossy image coding, LSF quantisation in narrowband speech coding, LSF and immittance spectral pair (ISP) quantisation in wideband speech coding, and Mel frequency-warped cepstral coefficient (MFCC) quantisation in distributed speech recognition. These contributions are summarised below. A novel technique for encoding fractional bits in a fixed-rate GMM-based block quantiser scheme is presented. In the GMM-based block quantiser, fractional bitrates are often assigned to each of the cluster block quantisers. This new encoding technique leads to better utilisation of the bit budget by allowing the use of, and providing for the encoding of, quantiser levels in a fixed-rate framework. 
The algorithm is based on a generalised positional number system and has a low complexity. A lower complexity GMM-based block quantiser, that replaces the KLT with the discrete cosine transform (DCT), is proposed for image coding. Due to its source independent nature and amenability to efficient implementation, the DCT allows a fast GMM-based block quantiser to be realised that achieves rate-distortion performance comparable to that of the KLT-based scheme in the block quantisation of images. Transform image coding often suffers from block artifacts at relatively low bitrates. We propose a scheme that minimises the block artifacts of block quantisation by pre-processing the image using the discrete wavelet transform, extracting vectors via a tree structure that exploits spatial self-similarity, and quantising these vectors using the GMM-based block quantiser. Visual examination shows that block artifacts are considerably reduced by the wavelet pre-processing step. The multi-frame GMM-based block quantiser is a modified scheme that exploits memory across successive frames or vectors. Its main advantages over the memoryless scheme in the application of LSF and ISP quantisation, are better rate-distortion and computational efficiency, through the exploitation of correlation across multiple frames and mean squared error selection criterion, respectively. The multi-frame GMM-based block quantiser is also evaluated for the quantisation of MFCC feature vectors for distributed speech recognition and is shown to be superior to all quantisation schemes considered. A new product code vector quantiser, called the switched split vector quantiser (SSVQ), is proposed for speech LSF and ISP quantisation. SSVQ is a hybrid scheme, combining a switch vector quantiser with several split vector quantisers. It aims to overcome the losses of rate-distortion efficiency in split vector quantisers, by exploiting full vector dependencies before the vector splitting. 
It is shown that the SSVQ alleviates the losses in two of the three vector quantiser 'advantages'. The SSVQ also has a remarkably low computational complexity, though this is achieved at the cost of an increase in memory requirements.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Microelectronic Engineering
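The abstract above says the fractional-bit encoder is based on a generalised positional number system. One plausible reading, sketched here as an assumption rather than as the thesis's actual algorithm, is mixed-radix packing: when component i has `levels[i]` quantiser levels (not necessarily a power of two), the component indices are treated as digits of a single integer, so the codeword needs only enough bits for the product of the level counts. The names `pack_indices` and `unpack_indices` are hypothetical.

```python
def pack_indices(indices, levels):
    """Pack per-component indices into one integer (mixed radix).

    indices[i] must satisfy 0 <= indices[i] < levels[i].
    """
    code = 0
    for idx, n in zip(indices, levels):
        if not 0 <= idx < n:
            raise ValueError("index out of range for its level count")
        code = code * n + idx
    return code


def unpack_indices(code, levels):
    """Invert pack_indices for the same list of level counts."""
    indices = []
    for n in reversed(levels):
        code, idx = divmod(code, n)
        indices.append(idx)
    return indices[::-1]
```

With level counts of 3, 5 and 6 there are 90 joint combinations, which fit in 7 bits, whereas giving each component its own whole-bit field would take 2 + 3 + 3 = 8 bits.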
3

Savvides, Vasos E. "Perceptual models in speech quality assessment and coding." Thesis, Loughborough University, 1988. https://dspace.lboro.ac.uk/2134/36273.

Abstract:

The ever-increasing demand for good communications/toll quality speech has created a renewed interest in the perceptual impact of rate compression. Two general areas are investigated in this work, namely speech quality assessment and speech coding. In the field of speech quality assessment, a model is developed which simulates the processing stages of the peripheral auditory system. At the output of the model a "running" auditory spectrum is obtained. This represents the auditory (spectral) equivalent of any acoustic sound such as speech. Auditory spectra from coded speech segments serve as inputs to a second model. This model simulates the information centre in the brain which performs the speech quality assessment.

4

Lo, Ka-Yiu. "Pitch synchronous speech coding at very low bit rates." Thesis, University of Liverpool, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.321128.

5

Farsi, Hassan. "Advanced pre-and-post processing techniques for speech coding." Thesis, University of Surrey, 2003. http://epubs.surrey.ac.uk/844491/.

Abstract:

Advances in digital technology in the last decade have motivated the development of very efficient and high quality speech compression algorithms. While in the early low bit rate coding systems, the main target was the production of intelligible speech at low bit rates, expansion of new applications such as mobile satellite systems increased the demand for reducing the transmission bandwidth and achieving higher speech quality. This resulted in the development of efficient parametric models of the speech production system. These models were the basis of powerful speech compression algorithms such as CELP, MBE, MELP and WI. The performance of a speech coder not only depends on the speech production model employed but also on the accurate estimation of speech parameters. Periodicity, also known as pitch, is one of the speech parameters that greatly affect the synthesised speech quality. Thus, the subject of pitch determination has attracted much research in the area of low bit rate coding. In these studies it is assumed that for a short segment of speech, called a frame, the pitch is fixed or smoothly evolving. The pitch estimation algorithms generally fail to determine irregular variations, which can occur at onset and offset speech segments. In order to overcome this problem, a novel preprocessing method, which detects irregular pitch variations and modifies the speech signal so as to improve the accuracy of the pitch estimation, is proposed. This method results in more regular speech while maintaining perceptual speech quality. The perceptual quality of the synthesised speech may also be improved using postfiltering techniques. Conventional postfiltering methods generally consider the enhancement of the whole speech spectrum. This may result in the broadening of the first formant, which leads to the increase of quantisation noise for this formant. A new postfiltering technique, which is based on factorising the linear prediction synthesis filter, is proposed. 
This provides more control over the formant bandwidth and attenuation of spectral speech valleys. Key words: Pitch smoothing, speech pre-processor, postfiltering.

6

Peng, Yong Kian. "Speech coding based on a pitch synchronous pattern recognition approach." Thesis, University of Ulster, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.245804.

7

Meh, Chu Chu. "Exploiting spatial and temporal redundancies for vector quantization of speech and images." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54442.

Abstract:

The objective of the proposed research is to compress data such as speech, audio, and images using a new re-ordering vector quantization approach that exploits the transition probability between consecutive code vectors in a signal. Vector quantization is the process of encoding blocks of samples from a data sequence by replacing every input vector with one from a dictionary of reproduction vectors. Shannon’s rate-distortion theory states that signals encoded as blocks of samples have a better rate-distortion performance relative to when encoded on a sample-to-sample basis. As such, vector quantization achieves a lower coding rate for a given distortion relative to scalar quantization for any given signal. Vector quantization does not take advantage of the inter-vector correlation between successive input vectors in data sequences. It has been demonstrated that real signals have significant inter-vector correlation. This correlation has led to vector quantization approaches that encode input vectors based on previously encoded vectors. Some methods have been proposed in the literature to exploit the dependence between successive code vectors. Predictive vector quantization, dynamic codebook re-ordering, and finite-state vector quantization are examples of vector quantization schemes that use inter-vector correlation. Predictive vector quantization and finite-state vector quantization predict the reproduction vector for a given input vector by using past input vectors. Dynamic codebook re-ordering vector quantization has the same reproduction vectors as standard vector quantization. The dynamic codebook re-ordering algorithm is based on the concept of re-ordering indices whereby existing reproduction vectors are assigned new channel indices according to a structure that orders the reproduction vectors in an order of increasing dissimilarity. 
Hence, an input vector encoded in the standard vector quantization method is transmitted through a channel with new indices such that 0 is assigned to the closest reproduction vector to the past reproduction vector. Larger index values are assigned to reproduction vectors that have larger distances from the previous reproduction vector. Dynamic codebook re-ordering assumes that the reproduction vectors of two successive vectors of real signals are typically close to each other according to a distance metric. Sometimes, two successively encoded vectors may have relatively larger distances from each other. Our likelihood codebook re-ordering vector quantization algorithm exploits the structure within a signal by exploiting the non-uniformity in the reproduction vector transition probability in a data sequence. Input vectors that have higher probability of transition from prior reproduction vectors are assigned indices of smaller values. The code vectors that are more likely to follow a given vector are assigned indices closer to 0 while the less likely are assigned indices of higher value. This re-ordering provides the reproduction dictionary a structure suitable for entropy coding such as Huffman and arithmetic coding. Since such transitions are common in real signals, it is expected that our proposed algorithm, when combined with entropy coding algorithms such as binary arithmetic and Huffman coding, will result in lower bit rates for the same distortion as a standard vector quantization algorithm. The re-ordering vector quantization approach on quantized indices can be useful in speech, image, and audio transmission. By applying our re-ordering approach to these data types, we expect to achieve lower coding rates for a given distortion or perceptual quality. This reduced coding rate makes our proposed algorithm useful for transmission and storage of larger image and speech streams for their respective communication channels. 
The use of truncation on the likelihood codebook re-ordering scheme results in much lower compression rates without significantly distorting the perceptual quality of the signals. Today, texts and other multimedia signals may benefit from this additional layer of likelihood re-ordering compression.
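The distance-based scheme that this abstract contrasts with its own proposal, dynamic codebook re-ordering, can be sketched compactly. The following is a minimal illustration assuming squared Euclidean distance, a fixed starting codeword known to both encoder and decoder, and ties broken by codebook position; the function names are hypothetical. The likelihood-based re-ordering proposed in the dissertation would replace the distance ranking with a ranking by learned transition probability, which is not shown here.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(codebook, x):
    """Position of the codeword closest to x (standard VQ search)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))


def reorder(codebook, prev):
    """Codebook positions sorted by distance from codebook[prev].

    sorted() is stable, so encoder and decoder break ties identically.
    """
    return sorted(range(len(codebook)),
                  key=lambda i: sq_dist(codebook[i], codebook[prev]))


def encode(codebook, vectors, start=0):
    """Return channel indices; small values mean 'close to previous'."""
    prev, out = start, []
    for x in vectors:
        cur = nearest(codebook, x)
        out.append(reorder(codebook, prev).index(cur))
        prev = cur
    return out


def decode(codebook, channel_indices, start=0):
    """Rebuild the reproduction vectors from channel indices."""
    prev, out = start, []
    for c in channel_indices:
        cur = reorder(codebook, prev)[c]
        out.append(codebook[cur])
        prev = cur
    return out
```

When successive inputs fall near each other, the emitted indices cluster around 0, which is the skewed distribution an entropy coder such as Huffman or arithmetic coding can exploit.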

8

Abboud, Karim. "Wideband CELP speech coding." Thesis, McGill University, 1992. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=56805.

Abstract:

The purpose of this thesis is to study the coding of wideband speech and to improve on previous Code-Excited Linear Prediction (CELP) coders in terms of speech quality and bit rate. To accomplish this task, improved coding techniques are introduced and the operating bit rate is reduced while maintaining and even enhancing the speech quality.
The first approach considers the quantization of Linear Predictive Coding (LPC) parameters and uses a three-way split vector quantization. Both scalar and vector quantization are initially studied; results show that, with adequate codebook training, the second method generates better results while using fewer bits. Nevertheless, the use of vector quantizers remains highly complex in terms of memory and number of computations. A new quantization scheme, split vector quantization (split VQ), is investigated to overcome this complexity problem. Using a new weighted distance measure as a selection criterion for split VQ, the average spectral distortion is significantly reduced to match the results obtained with scalar quantizers.
The second approach introduces a new pitch predictor with an increased temporal resolution for periodicity. This new technique has the advantage of maintaining the same quality obtained with conventional multiple coefficient predictors at a reduced bit rate. Furthermore, the conventional CELP noise weighting filter is modified to allow more freedom and better accuracy in the modeling of both tilt and formant structures. Throughout this process, different noise weighting schemes are evaluated and the results show that the new filter contributes greatly to solving the problem of high frequency distortion.
The final wideband CELP coder is operational at 11.7 kbits/s and generates a high perceptual quality of the reconstructed speech using the fractional pitch predictor and the new perceptual noise weighting filter.
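The split vector quantization described in this abstract can be illustrated with a short sketch. It assumes pre-trained codebooks, one per sub-vector, and a plain squared-error search; the thesis uses a new weighted distance measure and a three-way split, neither of which is reproduced here, and the function names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def split_vq_encode(x, split_sizes, codebooks):
    """Quantise each sub-vector of x with its own codebook.

    split_sizes[i] is the length of sub-vector i and codebooks[i]
    is the list of codewords for that sub-vector.
    """
    indices, pos = [], 0
    for size, cb in zip(split_sizes, codebooks):
        sub = x[pos:pos + size]
        indices.append(min(range(len(cb)),
                           key=lambda i: sq_dist(cb[i], sub)))
        pos += size
    return indices


def split_vq_decode(indices, codebooks):
    """Concatenate the selected codewords back into one vector."""
    out = []
    for i, cb in zip(indices, codebooks):
        out.extend(cb[i])
    return out
```

The complexity saving comes from the search size: two codebooks of 2^b1 and 2^b2 entries are searched separately, rather than one codebook of 2^(b1+b2) entries, at the cost of ignoring dependencies between the sub-vectors.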

9

Streit, Juergen Stefan. "Digital image coding." Thesis, University of Southampton, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.361092.

10

Sturt, Christian. "Pitch synchronous speech coding techniques." Thesis, University of Surrey, 2003. http://epubs.surrey.ac.uk/843327/.


Abstract:

Efficient source coding techniques are necessary to make optimal use of the limited bandwidth available in mobile phone networks. Most current mobile telephone communication systems compress the speech waveform by using speech coders based on the Code Excited Linear Prediction (CELP) model. Such coders give high quality speech at bit rates of 8 kbps and above. Below 8 kbps, the quality of the coded speech degrades rapidly. At rates of 6 kbps and below, parametric speech coders offer better speech quality. These coders reduce the required bit rate by transmitting certain characteristics of the speech waveform to the decoder, rather than attempting to code the waveform itself. The disadvantage of parametric coders is that the maximum achievable quality is limited by assumptions made during the coding of the speech signal. The aim of the research presented is to investigate and eliminate the factors that limit the speech quality of parametric coders. A new pitch synchronous coding model is proposed that operates on individual pitch cycle waveforms of speech rather than longer, fixed length frames as used in classic techniques. In order to implement a pitch synchronous coder, new pitch cycle detection algorithms have been proposed. Pitch synchronous parameter analysis was investigated and several new techniques have been developed. A novel pitch synchronous split-band voicing estimator has been proposed that utilises only the phase of the speech harmonics rather than the periodicity used in traditional techniques. Fixed rate quantisation of pitch synchronous speech parameters has been investigated and a joint quantisation/interpolation scheme has been proposed. This scheme has been applied to the quantisation of the pitch synchronous parameters and has been shown to outperform traditional quantisation techniques. 
A comparison of a reference parametric coder with its pitch synchronous counterpart has shown that the pitch synchronous paradigm eliminates some of the main factors that limit the speech quality in parametric coders. It is expected that this will lead to the development of speech coders that can produce speech of higher quality than current parametric coders operating at the same bit rate. Key words: Speech Coding, Pitch Synchronous, Sinusoidal Coding, Split-Band LPC Coding.
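
As a rough illustration of the distinction the abstract draws between fixed-length frames and pitch-synchronous analysis, the following sketch segments a signal both ways. The function names and the toy data are hypothetical, and the pitch marks are simply assumed to be given; the thesis's pitch cycle detection algorithms are not reproduced here.

```python
from typing import List, Sequence


def fixed_frames(signal: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Cut the signal into consecutive frames of identical length.

    Trailing samples that do not fill a whole frame are dropped.
    """
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def pitch_cycles(signal: Sequence[float], pitch_marks: Sequence[int]) -> List[Sequence[float]]:
    """Cut the signal at the given pitch marks, one segment per pitch cycle.

    pitch_marks is assumed to be an increasing list of sample indices that
    mark cycle boundaries, as produced by some pitch cycle detector.
    """
    return [signal[a:b] for a, b in zip(pitch_marks, pitch_marks[1:])]


if __name__ == "__main__":
    samples = list(range(20))   # stand-in for 20 speech samples
    marks = [0, 6, 13, 20]      # hypothetical cycle boundaries
    print([len(f) for f in fixed_frames(samples, 8)])
    print([len(c) for c in pitch_cycles(samples, marks)])
```

Fixed frames all share one length regardless of the signal, whereas the pitch-synchronous segments follow the cycle boundaries and so vary in length as the pitch period changes.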


Books on the topic "Image and speech coding"

1

Metkar, Shilpa. Motion Estimation Techniques for Digital Video Coding. India: Springer India, 2013.


2

IEEE South African Symposium on Communications and Signal Processing (1994 : University of Stellenbosch). Proceedings of the 1994 IEEE South African Symposium on Communications and Signal Processing, COMSIG-94, Tuesday, October 4, 1994, University of Stellenbosch, Stellenbosch. [South Africa]: IEEE South Africa Section, 1994.


3

Bäckström, Tom. Speech Coding. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-50204-5.


4

Woods, John W., ed. Subband Image Coding. Boston, MA: Springer US, 1991. http://dx.doi.org/10.1007/978-1-4757-2119-5.


5

Ogunfunmi, Tokunbo. Principles of speech coding. Boca Raton: Taylor & Francis, 2010.


6

Ayuso, Antonio J. Rubio, and Juan M. López Soler, eds. Speech Recognition and Coding. Berlin, Heidelberg: Springer Berlin Heidelberg, 1995. http://dx.doi.org/10.1007/978-3-642-57745-1.


7

Atal, Bishnu S., Vladimir Cuperman, and Allen Gersho, eds. Advances in Speech Coding. Boston, MA: Springer US, 1991. http://dx.doi.org/10.1007/978-1-4615-3266-8.


8

Atal, Bishnu S., Vladimir Cuperman, and Allen Gersho. Advances in speech coding. Edited by IEEE Workshop on Speech Coding for Telecommunications (1989 : Vancouver, B.C.). New York: Springer, 1991.


9

Atal, Bishnu S., Vladimir Cuperman, Allen Gersho, and IEEE Workshop on Speech Coding for Telecommunications (1989 : Vancouver, B.C.), eds. Advances in speech coding. Boston: Kluwer Academic Publishers, 1991.


10

Li, Feng. Interference Cancellation Using Space-Time Processing and Precoding Design. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013.


Book chapters on the topic "Image and speech coding"

1

Pearlman, William A. "Set Partition Embedded Block (SPECK) Coding." In Wavelet Image Compression, 23–45. Cham: Springer International Publishing, 2013. http://dx.doi.org/10.1007/978-3-031-02248-7_5.


2

Weidong, Liu, Feng Guiliang, and Li Zhonghua. "Optimization on Wavelet SPECK Image Coding Algorithm." In Information and Business Intelligence, 693–99. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-29084-8_107.


3

Gersho, Allen. "Speech Coding." In The Kluwer International Series in Engineering and Computer Science, 73–100. Boston, MA: Springer US, 1992. http://dx.doi.org/10.1007/978-1-4757-2148-5_3.


4

Owens, F. J. "Speech Coding." In Signal Processing of Speech, 122–37. London: Macmillan Education UK, 1993. http://dx.doi.org/10.1007/978-1-349-22599-6_6.


5

Moreau, Nicolas. "Speech Coding." In Tools for Signal Compression, 101–22. Hoboken, NJ USA: John Wiley & Sons, Inc., 2013. http://dx.doi.org/10.1002/9781118616611.ch6.


6

Alencar, Marcelo S., and Valdemar C. da Rocha. "Speech Coding." In Communication Systems, 97–134. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-12067-1_3.


7

Macario, R. C. V. "Speech Coding." In Cellular Radio, 158–71. London: Macmillan Education UK, 1997. http://dx.doi.org/10.1007/978-1-349-14433-4_7.


8

Alencar, Marcelo S., and Valdemar C. da Rocha. "Speech Coding." In Communication Systems, 89–128. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-25462-9_3.


9

Ma, Yide, Kun Zhan, and Zhaobin Wang. "Image Coding." In Applications of Pulse-Coupled Neural Networks, 43–59. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-13745-7_4.


10

Xu, Xiang, Xingkun Wu, and Feng Lin. "Image Coding." In Cellular Image Classification, 89–103. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-47629-2_5.


Conference papers on the topic "Image and speech coding"

1

Wong, Chian-Hong, Heng-Siong Lim, and Alan Wee-Chiat Tan. "Warped linear predictive speech coding." In 2011 IEEE International Conference on Signal and Image Processing Applications (ICSIPA). IEEE, 2011. http://dx.doi.org/10.1109/icsipa.2011.6144160.


2

Zeng, B., Y. Neuvo, and A. N. Venetsanopoulos. "Interpolative BTC image coding." In [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, 1992. http://dx.doi.org/10.1109/icassp.1992.226168.


3

Maalouf, Aldo, and Mohamed-Chaker Larabi. "Bandelet-based stereo image coding." In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2010. http://dx.doi.org/10.1109/icassp.2010.5495084.


4

Anand, K., V. Srivastava, and S. Bhattacharya. "A novel method of speech coding using Transform Coding And Polynomial approximation." In National Conference on Signal and Image Processing Applications. IET, 2009. http://dx.doi.org/10.1049/ic.2009.0178.


5

Silsbee, P. L., and A. C. Bovik. "Adaptive visual pattern image coding." In [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing. IEEE, 1991. http://dx.doi.org/10.1109/icassp.1991.150975.


6

Soleymani, M. R., S. D. Morgera, and R. Quesnel. "Image coding for noisy channels." In [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing. IEEE, 1991. http://dx.doi.org/10.1109/icassp.1991.150980.


7

Joseph, S. M., and A. P. Babu. "Continuous speech coding using coiflets wavelet." In 2013 International Conference on Signal Processing, Image Processing, and Pattern Recognition (ICSIPR). IEEE, 2013. http://dx.doi.org/10.1109/icsipr.2013.6497933.


8

Zhou, Z., and A. N. Venetsanopoulos. "Morphological methods in image coding." In [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, 1992. http://dx.doi.org/10.1109/icassp.1992.226171.


9

Hu, Yang, and William A. Pearlman. "Differential-SPIHT for image sequence coding." In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2010. http://dx.doi.org/10.1109/icassp.2010.5495263.


10

Chen, Homer H., Wu Chou, Barry G. Haskell, and Tsuhan Chen. "Speech recognition for acoustic-assisted video coding and animation." In Visual Communications and Image Processing '95, edited by Lance T. Wu. SPIE, 1995. http://dx.doi.org/10.1117/12.206731.


Reports on the topic "Image and speech coding"

1

Hogden, J. An articulatorily constrained, maximum entropy approach to speech recognition and speech coding. Office of Scientific and Technical Information (OSTI), December 1996. http://dx.doi.org/10.2172/432946.


2

Phoha, Shashi, and Mendel Schmiedekamp. Semantic Source Coding for Flexible Lossy Image Compression. Fort Belvoir, VA: Defense Technical Information Center, March 2007. http://dx.doi.org/10.21236/ada464658.


3

St. George, Brett A. Speech Coding and Phoneme Classification Using a Back-Propagation Neural Network. Fort Belvoir, VA: Defense Technical Information Center, May 1997. http://dx.doi.org/10.21236/ada418472.


4

Hogden, J. Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding. Office of Scientific and Technical Information (OSTI), November 1996. http://dx.doi.org/10.2172/431136.


5

Klein, Stanley, and Amnon Silverstein. Spatio-Temporal Masking in Human Vision and Its Application to Image Coding. Fort Belvoir, VA: Defense Technical Information Center, October 1995. http://dx.doi.org/10.21236/ada300556.


6

Yesha, Yaacov. Channel coding for code excited linear prediction (CELP) encoded speech in mobile radio applications. Gaithersburg, MD: National Institute of Standards and Technology, 1994. http://dx.doi.org/10.6028/nist.ir.5503.


7

Fu, Chi Yung, L. I. Petrich, and M. Lee. Image and video compression/decompression based on human visual perception system and transform coding. Office of Scientific and Technical Information (OSTI), February 1997. http://dx.doi.org/10.2172/489146.


8

Бережна, Маргарита Василівна. Maleficent: from the Matriarch to the Scorned Woman (Psycholinguistic Image). Baltija Publishing, 2021. http://dx.doi.org/10.31812/123456789/5766.


Abstract:

The aim of the research is to identify the elements of the psycholinguistic image of the leading character in the dark fantasy adventure film Maleficent directed by Robert Stromberg (2014). The task consists of two stages. At the first, I identify the psychological characteristics of the character to determine to which of the archetypes Maleficent belongs. As the basis, I take the classification of film archetypes by V. Schmidt. At the second stage, I distinguish the speech peculiarities of the character that reflect her psychological image. This paper explores 98 of Maleficent’s dialogue turns in the film. According to V. Schmidt’s classification, Maleficent belongs first to the Matriarch archetype and later in the plot to the Scorned Woman archetype. These archetypes are representations of the powerful goddess of marriage and fertility Hera, being respectively her heroic and villainous embodiments. There are several crucial characteristics revealed by speech elements.


9

Бережна, Маргарита Василівна. Psycholinguistic Image of Joy (in the Computer-Animated Film Inside Out). Psycholinguistics in a Modern World, 2021. http://dx.doi.org/10.31812/123456789/5827.


Abstract:

The paper focuses on the correlation between the psychological archetype of a film character and the linguistic elements composing their speech. The Nurturer archetype is represented in the film Inside Out by the personified emotion Joy. Joy is depicted as an anthropomorphic female character whose purpose is to keep her host, a young girl named Riley, happy. As the Nurturer, Joy is completely focused on Riley’s happiness, which is expressed by the lexico-semantic group ‘happy’, positive evaluative tokens, exclamatory sentences, promissive speech acts, and repetitions. She needs the feeling of connectedness with other members of her family, which is revealed by the lexico-semantic groups ‘support’ and ‘help’. She is ready to sacrifice everything to save the girl in her care, which is demonstrated by modal verbs, the frequent word combination ‘for Riley’, and directives.


10

Washington Nichols, Bruno, and Pedro Chapaval Pimentel. Impeachment e imagem pública: uma análise do discurso vazado de Michel Temer / Impeachment and public image: an analysis of Michel Temer’s leaked speech. Revista Internacional de Relaciones Públicas, June 2017. http://dx.doi.org/10.5783/rirp-13-2017-04-41-60.
