
Journal articles on the topic 'Image and speech coding'


Consult the top 50 journal articles for your research on the topic 'Image and speech coding.'


1

Guglielmo, Mario, Giulio Modena, and Roberto Montagna. "Speech and image coding for digital communications." European Transactions on Telecommunications 2, no. 1 (January 1991): 21–44. http://dx.doi.org/10.1002/ett.4460020106.

2

Zivin, Gail. "Image or neural coding of inner speech and agency?" Behavioral and Brain Sciences 9, no. 3 (September 1986): 534–35. http://dx.doi.org/10.1017/s0140525x00047002.

3

ALSAIDI, RAMADHAN ABDO MUSLEH, HONG LI, YANTAO WEI, ROKAN KHAJI, and YUAN YAN TANG. "HIERARCHICAL SPARSE METHOD WITH APPLICATIONS IN VISION AND SPEECH RECOGNITION." International Journal of Wavelets, Multiresolution and Information Processing 11, no. 02 (March 2013): 1350016. http://dx.doi.org/10.1142/s0219691313500161.

Abstract:
This paper develops a new approach to feature extraction based on neural responses, combining hierarchical architectures with the sparse coding technique. At each layer of the proposed model, two components are used: sparse coding and a pooling operation. Sparse coding computes increasingly complex sparse feature representations, while the pooling operation compares the sparse outputs to measure the match between a stored prototype and the input sub-image; only the best-matching value is kept and the others are discarded. The proposed model is implemented and tested on two recognition tasks: image recognition and speech recognition (on an isolated-word vocabulary). Experimental results with various parameters demonstrate that the proposed scheme extracts more effective features than other methods.
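The layered sparse-coding-plus-pooling scheme this abstract describes can be sketched in a few lines. This is an illustrative stand-in rather than the authors' implementation: the dictionary is random, and a soft-threshold projection replaces a full sparse solver.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_code(patch, D, lam=0.1):
    # Project onto the dictionary and soft-threshold: a crude stand-in
    # for a full sparse solver (e.g., OMP or LARS).
    a = D @ patch
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def pool_best_match(codes):
    # Pooling as in the abstract: compare sparse outputs across sub-image
    # positions and keep only the best-matching value per prototype.
    return np.max(np.abs(codes), axis=0)

# One hierarchy layer on random data: 16-dim patches, 8 dictionary atoms.
D = rng.standard_normal((8, 16))
D /= np.linalg.norm(D, axis=1, keepdims=True)
patches = rng.standard_normal((10, 16))            # 10 sub-image positions
codes = np.stack([sparse_code(p, D) for p in patches])
layer_output = pool_best_match(codes)              # one feature per atom
print(layer_output.shape)
```

Stacking several such layers, each feeding the pooled features of the previous one, gives the hierarchical architecture the paper studies.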
4

Kim, Seonjae, Dongsan Jun, Byung-Gyu Kim, Seungkwon Beack, Misuk Lee, and Taejin Lee. "Two-Dimensional Audio Compression Method Using Video Coding Schemes." Electronics 10, no. 9 (May 6, 2021): 1094. http://dx.doi.org/10.3390/electronics10091094.

Abstract:
As video compression is one of the core technologies that enables seamless media streaming within the available network bandwidth, it is crucial to employ media codecs that support powerful coding performance and high visual quality. Versatile Video Coding (VVC) is the latest video coding standard developed by the Joint Video Experts Team (JVET); it can compress original image or video data by a factor of several hundred, while the latest audio coding standard, Unified Speech and Audio Coding (USAC), achieves a compression ratio of about 20 for audio or speech data. In this paper, we propose a pre-processing method that converts an audio signal into a two-dimensional (2D) representation suitable as input to a VVC encoder, and we investigate the applicability of video coding schemes to 2D audio compression. To evaluate coding performance, we measure both signal-to-noise ratio (SNR) and bits per sample (bps). The experimental results show the feasibility of 2D audio encoding using video coding schemes.
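The paper's actual pre-processing is not reproduced here, but the packing-and-measurement loop it implies is easy to sketch. In this minimal stand-in, 8-bit uniform quantization plays the role of the VVC encoder, and the 2-D width of 256 is an arbitrary choice:

```python
import numpy as np

def audio_to_2d(x, width=256):
    # Pack a 1-D audio signal into a 2-D array (one row per frame),
    # zero-padding the tail to a multiple of `width`.
    pad = (-len(x)) % width
    return np.pad(x, (0, pad)).reshape(-1, width)

def snr_db(original, decoded):
    # Signal-to-noise ratio of a decoded signal, in dB.
    noise = original - decoded
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))

# Toy example: a 1 kHz tone, with 8-bit quantization standing in for VVC.
fs = 16000
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 1000 * t)

img = audio_to_2d(x)                      # 2-D input for an image/video encoder
decoded = np.round((img + 1) * 127.5) / 127.5 - 1
print(img.shape, round(float(snr_db(img.ravel(), decoded.ravel())), 1), "dB at 8 bits/sample")
```

A real experiment would replace the quantizer with an actual VVC encode/decode cycle and sweep the rate to trade bps against SNR.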
5

Exarchakis, Georgios, and Jörg Lücke. "Discrete Sparse Coding." Neural Computation 29, no. 11 (November 2017): 2979–3013. http://dx.doi.org/10.1162/neco_a_01015.

Abstract:
Sparse coding algorithms with continuous latent variables have been the subject of a large number of studies. However, discrete latent spaces for sparse coding have been largely ignored. In this work, we study sparse coding with latents described by discrete instead of continuous prior distributions. We consider the general case in which the latents (while being sparse) can take on any value of a finite set of possible values and in which we learn the prior probability of any value from data. This approach can be applied to any data generated by discrete causes, and it can be applied as an approximation of continuous causes. As the prior probabilities are learned, the approach then allows for estimating the prior shape without assuming specific functional forms. To efficiently train the parameters of our probabilistic generative model, we apply a truncated expectation-maximization approach (expectation truncation) that we modify to work with a general discrete prior. We evaluate the performance of the algorithm by applying it to a variety of tasks: (1) we use artificial data to verify that the algorithm can recover the generating parameters from a random initialization, (2) use image patches of natural images and discuss the role of the prior for the extraction of image components, (3) use extracellular recordings of neurons to present a novel method of analysis for spiking neurons that includes an intuitive discretization strategy, and (4) apply the algorithm on the task of encoding audio waveforms of human speech. The diverse set of numerical experiments presented in this letter suggests that discrete sparse coding algorithms can scale efficiently to work with realistic data sets and provide novel statistical quantities to describe the structure of the data.
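The generative side of the model is easy to make concrete: latents drawn from a finite value set with a zero-heavy (sparse) prior, mixed linearly and corrupted with Gaussian noise. The sizes and probabilities below are illustrative, and the learning step (truncated EM) is not shown:

```python
import numpy as np

rng = np.random.default_rng(1)

# Discrete sparse prior: each latent takes a value from a finite set,
# with zero given high probability (sparsity). The paper learns these
# probabilities from data; here they are fixed for illustration.
values = np.array([0.0, -1.0, 1.0, -2.0, 2.0])
probs = np.array([0.8, 0.05, 0.05, 0.05, 0.05])

H, D = 6, 10                       # latent and observed dimensions (illustrative)
W = rng.standard_normal((D, H))    # generating dictionary
sigma = 0.1                        # observation noise level

def sample(n):
    # Draw discrete sparse latents, mix linearly, add Gaussian noise.
    s = rng.choice(values, size=(n, H), p=probs)
    x = s @ W.T + sigma * rng.standard_normal((n, D))
    return s, x

s, x = sample(1000)
print(x.shape, round(float(np.mean(s == 0.0)), 2))  # sparsity near 0.8
```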
6

DAS, B. K., R. N. MAHAPATRA, and B. N. CHATTERJI. "PERFORMANCE MODELING OF DISCRETE COSINE TRANSFORM FOR STAR GRAPH CONNECTED MULTIPROCESSORS." Journal of Circuits, Systems and Computers 06, no. 06 (December 1996): 635–48. http://dx.doi.org/10.1142/s0218126696000443.

Abstract:
The discrete cosine transform (DCT) has attracted considerable research attention for its ability to address application problems in signal and image processing such as speech coding, image coding, filtering, cepstral analysis, topographic classification, progressive image transmission, and data compression, with major applications in pattern recognition and image processing. In this paper, a Cooley–Tukey approach is proposed for computing the DCT, and the necessary mathematical formulations are developed for star-graph-connected multiprocessors. The signal flow graph of the algorithm is designed for mapping onto the star graph. Modeling results are derived in terms of computation time, speedup, and efficiency.
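For reference, the transform being mapped is the standard DCT-II. A direct O(N²) orthonormal implementation follows; the paper's contribution, the fast Cooley–Tukey-style mapping onto the star graph, is not reproduced here:

```python
import numpy as np

def dct_ii(x):
    # Orthonormal DCT-II in direct O(N^2) form.
    N = len(x)
    n = np.arange(N)
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    X = C @ x
    X[0] *= np.sqrt(1.0 / N)
    X[1:] *= np.sqrt(2.0 / N)
    return X

def idct_ii(X):
    # Inverse of the orthonormal DCT-II above (its matrix transpose).
    N = len(X)
    n = np.arange(N)
    C = np.cos(np.pi * (2 * n[:, None] + 1) * n[None, :] / (2 * N))
    w = np.full(N, np.sqrt(2.0 / N))
    w[0] = np.sqrt(1.0 / N)
    return C @ (w * X)

x = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(idct_ii(dct_ii(x)), x)   # the pair is a perfect inverse
```

Because the transform is orthonormal, it preserves energy (Parseval), which is what makes it suitable for the compression applications the abstract lists.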
7

Hussain, Y., and N. Farvardin. "Variable-rate finite-state vector quantization and applications to speech and image coding." IEEE Transactions on Speech and Audio Processing 1, no. 1 (1993): 25–38. http://dx.doi.org/10.1109/89.221365.

8

Ahlam H. Shnain. "Key Generation for Image Scrambling Using Voiceprint." Diyala Journal of Engineering Sciences 6, no. 3 (September 1, 2013): 1–16. http://dx.doi.org/10.24237/djes.2013.06301.

Abstract:
This paper presents a new algorithm for scrambling a color image using a voiceprint and linear predictive coding (LPC). The speech signal passes through a pre-processing stage of sampling and segmentation into frames. Each frame is windowed with a rectangular window and fed to a linear predictor, which estimates the coefficients of a pth-order all-pole vocal-tract model and predicts the current speech sample from a linear combination of past samples. The Levinson–Durbin (L-D) procedure is applied to each frame to find the LPC coefficients, reflection coefficients, and prediction error. To scramble the color image, a key is generated from the LPC coefficients: the coefficients are sorted in ascending order and each is compared with the pixels of the color image; when a coefficient matches a pixel, the pixel is replaced by that coefficient. The pixels are thus sent in a random sequence, and the color image is scrambled using the voiceprint (LPC) coefficients. Descrambling follows the reverse procedure. The scrambling process is simulated in MATLAB 7.06.324 (R2008a). Tests with different speech signals and color images show good SNR and correlation results.
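The Levinson–Durbin step this abstract relies on is standard and can be sketched directly; the image-scrambling key generation itself is not reproduced here, and the demo signal and model order are illustrative:

```python
import numpy as np

def levinson_durbin(r, order):
    # Levinson-Durbin recursion: from autocorrelation r, compute the
    # all-pole predictor a, reflection coefficients k, and error E.
    a = np.zeros(order + 1)
    a[0] = 1.0
    E = r[0]
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i-1:0:-1])
        k = -acc / E
        ks.append(k)
        a[1:i+1] = a[1:i+1] + k * a[i-1::-1][:i]   # order update
        E *= (1.0 - k * k)
    return a, np.array(ks), E

# Demo: recover the coefficients of a known 2nd-order all-pole signal.
rng = np.random.default_rng(0)
e = rng.standard_normal(20000)
x = np.zeros_like(e)
for n in range(2, len(x)):
    x[n] = 0.9 * x[n-1] - 0.5 * x[n-2] + e[n]
r = np.array([np.dot(x[:len(x)-m], x[m:]) for m in range(3)]) / len(x)
a, ks, E = levinson_durbin(r, 2)
print(np.round(a, 2))   # close to [1, -0.9, 0.5]
```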
9

McGuire, David, Thomas N. Garavan, James Cunningham, and Greg Duffy. "The use of imagery in the campaign speeches of Barack Hussein Obama and John McCain during the 2008 US Presidential Election." Leadership & Organization Development Journal 37, no. 4 (June 6, 2016): 430–49. http://dx.doi.org/10.1108/lodj-07-2014-0136.

Abstract:
Purpose – The use of imagery in leadership speeches is becoming increasingly important in shaping the beliefs and actions of followers. The purpose of this paper is to investigate the speech imagery and linguistic features employed during the 2008 US Presidential Election campaign. Design/methodology/approach – The authors analysed a total of 264 speeches (160 by Obama and 104 by McCain) delivered throughout the 2008 US Presidential Election and identified 15 speech images used by the two candidates. Both descriptive and axial coding approaches were applied to the data, and the speech images common to both candidates were further subjected to Pennebaker et al.'s (2003) linguistic inquiry methodology. Findings – The analysis revealed a number of important differences, with Obama using inclusive language and nurturing communitarian values, whereas McCain focused on personal actions and strict, conservative individualistic values. Obama's more inclusive language was found to be significant in three of the five speech images common to both candidates. Research limitations/implications – The research acknowledges the difficulty of measuring the effectiveness of speech images without taking into account wider factors such as tone of voice, facial expression and level of conviction. It also recognises presidential candidates' heavy use of speechwriters whilst on the campaign trail, but argues that candidates still exert a strong influence through instructions to speechwriters and that speeches should reflect the candidate's values and beliefs. Originality/value – The findings contribute to the emerging stream of leadership research that addresses language content issues surrounding and embedded in the leadership process. The research argues that leaders' speeches provide fertile ground for examining the evolving relationship between leaders and followers.
10

Vinar, Olga. "Means of speech characteristics of the stage image in the context of decoding the signal space contemporary performances." National Academy of Managerial Staff of Culture and Arts Herald, no. 2 (September 17, 2021): 311–16. http://dx.doi.org/10.32461/2226-3209.2.2021.240110.

Abstract:
The purpose of the article is to identify the features of speech characteristics in the context of disclosing the plot-content aspect of performances of postmodern aesthetics. Methodology. A typological and systematic method is used to study the creative mechanisms by which an actor creates the speech characteristic of an image within the aesthetics of postmodernism; a cognitive method, through which theoretical positions from language psychology and speech therapy are extrapolated to theatrical art; a method of theoretical generalization; and others. Scientific novelty. The article analyzes the influence of postmodernist theater tendencies on the actor's vocal work in creating an image, that is, the development and implementation of the character's speech features, and the process of decoding the sign system of a modern production through interpretation of the speech characteristics of the images created by the actors. Conclusions. The verbal characteristic of an image helps the actor convey internal features of the character: emotional state, deep reactions to events and/or the actions of other characters, changes in lifestyle and worldview, and so on. In the context of postmodern aesthetics, where stage texts employ "double coding", a complex phenomenon of postmodernism whose artistic and aesthetic means in theatrical art synthesize different languages and codes of literature and philosophy into the holistic hypertext of a performance, these characteristics contribute to understanding the semantic aspects of a theatrical production. The actor's use of elements of speech characteristics expands professional speech competence and diversifies the vocal sound of the stage word.
The components of speech characteristics can act as expressive means that reveal important aspects of the character; as a necessary form of revealing the internal content of the stage image; as an important element of the psychophysical structure of the role; and as an artistic technique that enhances the expressive possibilities of the stage word and thus helps to form verbal symbolic means of expressing the artistic meanings of the performance. In our opinion, for the organic formation of the stage image in general and the character's speech characteristics in particular, the actor must not only sharpen attention, observation, and the ability to understand the psychological and social causes of human behavior, but also deepen knowledge of psycholinguistics and speech therapy, because only the organic combination of these factors produces optimal stage speech that corresponds to the concept of each specific performance.
11

Jamal, Marwa, and Tariq A. Hassan. "Speech Coding Using Discrete Cosine Transform and Chaotic Map." Ingénierie des systèmes d information 27, no. 4 (August 31, 2022): 673–77. http://dx.doi.org/10.18280/isi.270419.

Abstract:
Multimedia data has recently grown exponentially, saturating daily life. Various modalities, including images, text, and video, play important roles in many areas. A key problem in using large-scale data, however, is the cost of processing and massive storage, so efficient communication and economical storage require effective data compression techniques. Speech coding, the process of converting voice signals into a more compressed form, is a central problem in digital speech processing. In this work, we demonstrate that a DCT combined with a chaotic system and run-length coding can implement very-low-bit-rate speech coding with high reconstruction quality. Experimental results show a compression ratio of about 13% when implemented on the LibriSpeech dataset.
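A minimal sketch of the DCT + thresholding + run-length pipeline the abstract names. The chaotic-map stage is omitted (in the paper it drives the coding; here a fixed quantile threshold stands in), and the frame length and quantizer step are illustrative:

```python
import numpy as np

def run_length_encode(sym):
    # Collapse a symbol sequence into (value, count) pairs.
    out = []
    for s in sym:
        if out and out[-1][0] == s:
            out[-1][1] += 1
        else:
            out.append([s, 1])
    return out

def toy_speech_coder(frame, keep=0.1):
    # DCT the frame, keep only the largest coefficients, quantize,
    # and run-length code the (mostly zero) result.
    N = len(frame)
    n = np.arange(N)
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C[0] *= np.sqrt(1.0 / N)
    C[1:] *= np.sqrt(2.0 / N)
    X = C @ frame
    X[np.abs(X) < np.quantile(np.abs(X), 1 - keep)] = 0.0
    q = np.round(X * 32).astype(int)          # coarse uniform quantization
    return run_length_encode(q.tolist())

frame = np.sin(2 * np.pi * 5 * np.arange(64) / 64)
code = toy_speech_coder(frame)
print(len(code), "runs for", len(frame), "samples")
```

The long zero runs produced by thresholding are exactly what makes run-length coding effective here.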
12

KITTISUWAN, PICHID, THITIPORN CHANWIMALUANG, SANPARITH MARUKATAT, and WIDHYAKORN ASDORNWISED. "IMAGE AND AUDIO-SPEECH DENOISING BASED ON HIGHER-ORDER STATISTICAL MODELING OF WAVELET COEFFICIENTS AND LOCAL VARIANCE ESTIMATION." International Journal of Wavelets, Multiresolution and Information Processing 08, no. 06 (November 2010): 987–1017. http://dx.doi.org/10.1142/s0219691310003808.

Abstract:
This paper is concerned first with wavelet-based image denoising using a Bayesian technique. In a conventional denoising process, the parameters of the probability density function (PDF) are usually calculated from the first few moments, the mean and variance. In the first part of our work, a new image denoising algorithm based on Pearson Type VII random vectors is proposed. This PDF is used because it allows higher-order moments to be incorporated into the probabilistic model of the noiseless wavelet coefficients. One of the cruxes of Bayesian image denoising algorithms is estimating the variance of the clean image. Here, a maximum a posteriori (MAP) approach is employed both for noiseless wavelet-coefficient estimation and for local observed-variance acquisition. For the local observed-variance estimation, the choice of noisy wavelet-coefficient model, either a Laplacian or a Gaussian distribution, is based on the corrupting noise power, and a Gamma distribution is used as a prior for the variance. Our selection of prior is motivated by analytical and computational tractability. In our experiments, the proposed method gives promising denoising results with moderate complexity. Finally, our image denoising method can be extended simply to audio/speech processing by forming a matrix representation whose rows are time segments of digital speech waveforms. In this way, our image denoising methods can be exploited to improve the performance of various audio/speech tasks, e.g., denoising enhancement of voice activity detection to capture voiced speech, which is needed for speech coding and voice conversion applications. Moreover, one voice abnormality detection task, oropharyngeal dysphagia classification, also requires a denoising method to improve signal quality in elderly patients. We provide simple speech examples to demonstrate the prospects of our techniques.
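The MAP flavor of such denoisers is easy to illustrate: with a Laplacian prior on clean coefficients and Gaussian noise, the MAP estimate reduces to soft thresholding. The sketch below substitutes a one-level Haar transform and a 1-D signal for the paper's Pearson Type VII model and local variance estimation:

```python
import numpy as np

def haar_1level(x):
    # One level of the orthonormal Haar wavelet transform.
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation (low-pass)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail (high-pass)
    return a, d

def inv_haar_1level(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def map_soft_threshold(d, sigma, b):
    # MAP estimate of clean coefficients under a Laplacian prior (scale b)
    # and Gaussian noise (std sigma): soft thresholding at sigma^2 / b.
    t = sigma ** 2 / b
    return np.sign(d) * np.maximum(np.abs(d) - t, 0.0)

rng = np.random.default_rng(0)
clean = np.repeat([0.0, 4.0, -2.0, 1.0], 64)        # piecewise-constant signal
noisy = clean + 0.5 * rng.standard_normal(clean.size)
a, d = haar_1level(noisy)
denoised = inv_haar_1level(a, map_soft_threshold(d, 0.5, 1.0))
err_noisy = float(np.mean((noisy - clean) ** 2))
err_den = float(np.mean((denoised - clean) ** 2))
print(round(err_noisy, 3), "->", round(err_den, 3))
```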
13

Kubanek, Mariusz, Janusz Bobulski, and Joanna Kulawik. "A Method of Speech Coding for Speech Recognition Using a Convolutional Neural Network." Symmetry 11, no. 9 (September 19, 2019): 1185. http://dx.doi.org/10.3390/sym11091185.

Abstract:
This work presents a new approach to speech recognition based on specific coding of the time and frequency characteristics of speech. Convolutional neural networks were chosen because they show high resistance to cross-spectral distortions and to differences in vocal-tract length. Until now, two convolution layers were used: time convolution and frequency convolution. The novel idea is to weave together three separate convolution layers: traditional time convolution and two different frequency convolutions (mel-frequency cepstral coefficient (MFCC) convolution and spectrum convolution). This approach takes into account more of the detail contained in the tested signal. The idea is to create patterns for sounds in the form of RGB (Red, Green, Blue) images. Experiments were carried out on isolated words and continuous speech with the proposed neural network structure, and a method for dividing continuous speech into syllables is proposed. The method can also be used for symmetrical stereo sound.
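The idea of presenting a sound to a CNN as an RGB image can be sketched as follows. The three channels here are simple spectrogram variants chosen purely for illustration; the paper's actual channels are time, MFCC, and spectrum representations:

```python
import numpy as np

def frame_signal(x, win=128, hop=64):
    # Overlapping analysis frames of the waveform.
    n = 1 + (len(x) - win) // hop
    return np.stack([x[i * hop : i * hop + win] for i in range(n)])

def rgb_sound_image(x):
    # Stack three time-frequency views into one H x W x 3 "RGB image".
    frames = frame_signal(x) * np.hanning(128)
    spec = np.abs(np.fft.rfft(frames, axis=1))[:, :64].T   # freq x time
    logspec = np.log1p(spec)
    delta = np.gradient(logspec, axis=1)                   # temporal change
    def norm(c):
        return (c - c.min()) / (c.max() - c.min() + 1e-9)
    return np.stack([norm(logspec), norm(delta), norm(spec)], axis=-1)

x = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
img = rgb_sound_image(x)
print(img.shape)   # (freq bins, frames, 3), ready for an image CNN
```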
14

Agrawal, S. K., and O. P. Sahu. "Two-Channel Quadrature Mirror Filter Bank: An Overview." ISRN Signal Processing 2013 (September 3, 2013): 1–10. http://dx.doi.org/10.1155/2013/815619.

Abstract:
During the last two decades, there has been substantial progress in multirate digital filters and filter banks. This includes the design of quadrature mirror filters (QMF). A two-channel QMF bank is extensively used in many signal processing fields such as subband coding of speech signal, image processing, antenna systems, design of wavelet bases, and biomedical engineering and in digital audio industry. Therefore, new efficient design techniques are being proposed by several authors in this area. This paper presents an overview of analysis and design techniques of the two-channel QMF bank. Application in the area of subband coding and future research trends are also discussed.
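A two-channel QMF bank is easy to demonstrate with the length-2 (Haar) prototype, the one QMF for which FIR perfect reconstruction is exact; the subband-coding application the abstract mentions is mimicked by spending fewer bits on the high band:

```python
import numpy as np

# Two-channel QMF bank with the length-2 (Haar) prototype:
# h1[n] = (-1)^n h0[n], giving alias cancellation and, for Haar,
# exact perfect reconstruction.

def qmf_analysis(x):
    lo = (x[0::2] + x[1::2]) / np.sqrt(2)   # low band, half rate
    hi = (x[0::2] - x[1::2]) / np.sqrt(2)   # high band, half rate
    return lo, hi

def qmf_synthesis(lo, hi):
    x = np.empty(2 * len(lo))
    x[0::2] = (lo + hi) / np.sqrt(2)
    x[1::2] = (lo - hi) / np.sqrt(2)
    return x

def quantize(c, bits):
    # Uniform quantizer: subband coding spends fewer bits on the high band.
    scale = (2.0 ** (bits - 1)) / (np.max(np.abs(c)) + 1e-12)
    return np.round(c * scale) / scale

rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(256)) * 0.1   # smooth, low-band-heavy signal
lo, hi = qmf_analysis(x)
assert np.allclose(qmf_synthesis(lo, hi), x)    # perfect reconstruction
coded = qmf_synthesis(quantize(lo, 8), quantize(hi, 3))
print("subband-coded MSE:", round(float(np.mean((coded - x) ** 2)), 6))
```

Longer prototype filters (the usual design problem surveyed in this paper) improve band separation at the cost of only near-perfect reconstruction.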
15

Hueber, Thomas, Eric Tatulli, Laurent Girin, and Jean-Luc Schwartz. "Evaluating the Potential Gain of Auditory and Audiovisual Speech-Predictive Coding Using Deep Learning." Neural Computation 32, no. 3 (March 2020): 596–625. http://dx.doi.org/10.1162/neco_a_01264.

Abstract:
Sensory processing is increasingly conceived in a predictive framework in which neurons would constantly process the error signal resulting from the comparison of expected and observed stimuli. Surprisingly, few data exist on the accuracy of predictions that can be computed in real sensory scenes. Here, we focus on the sensory processing of auditory and audiovisual speech. We propose a set of computational models based on artificial neural networks (mixing deep feedforward and convolutional networks), which are trained to predict future audio observations from present and past audio or audiovisual observations (i.e., including lip movements). Those predictions exploit purely local phonetic regularities with no explicit call to higher linguistic levels. Experiments are conducted on the multispeaker LibriSpeech audio speech database (around 100 hours) and on the NTCD-TIMIT audiovisual speech database (around 7 hours). They appear to be efficient in a short temporal range (25–50 ms), predicting 50% to 75% of the variance of the incoming stimulus, which could result in potentially saving up to three-quarters of the processing power. Then they quickly decrease and almost vanish after 250 ms. Adding information on the lips slightly improves predictions, with a 5% to 10% increase in explained variance. Interestingly the visual gain vanishes more slowly, and the gain is maximum for a delay of 75 ms between image and predicted sound.
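The prediction task itself can be illustrated with a linear stand-in for the paper's deep networks: fit a least-squares predictor of a sample `horizon` steps ahead from a short context, and report explained variance. The toy signal and window sizes are illustrative:

```python
import numpy as np

def make_xy(x, context=16, horizon=4):
    # Sliding windows: predict x[t + horizon] from the previous `context` samples.
    rows = len(x) - context - horizon
    A = np.stack([x[i:i + context] for i in range(rows)])
    y = x[context + horizon - 1 : context + horizon - 1 + rows]
    return A, y

rng = np.random.default_rng(0)
# Speech-like toy signal: a slowly amplitude-modulated tone plus noise.
t = np.arange(4000)
x = np.sin(2 * np.pi * 0.02 * t) * (1 + 0.3 * np.sin(2 * np.pi * 0.001 * t))
x = x + 0.1 * rng.standard_normal(len(t))

A, y = make_xy(x)
w, *_ = np.linalg.lstsq(A, y, rcond=None)       # least-squares linear predictor
pred = A @ w
ev = 1.0 - float(np.var(y - pred) / np.var(y))  # explained variance (in-sample)
print(round(ev, 2))
```

In the paper the predictor is a deep (and for the audiovisual case, convolutional) network, the inputs include lip images, and explained variance is evaluated on held-out speech.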
16

Stoll, Chloé, Helen Rodger, Junpeng Lao, Anne-Raphaëlle Richoz, Olivier Pascalis, Matthew Dye, and Roberto Caldara. "Quantifying Facial Expression Intensity and Signal Use in Deaf Signers." Journal of Deaf Studies and Deaf Education 24, no. 4 (July 4, 2019): 346–55. http://dx.doi.org/10.1093/deafed/enz023.

Abstract:
We live in a world of rich dynamic multisensory signals. Hearing individuals rapidly and effectively integrate multimodal signals to decode biologically relevant facial expressions of emotion. Yet, it remains unclear how facial expressions are decoded by deaf adults in the absence of an auditory sensory channel. We thus compared early and profoundly deaf signers (n = 46) with hearing nonsigners (n = 48) on a psychophysical task designed to quantify their recognition performance for the six basic facial expressions of emotion. Using neutral-to-expression image morphs and noise-to-full signal images, we quantified the intensity and signal levels required by observers to achieve expression recognition. Using Bayesian modeling, we found that deaf observers require more signal and intensity to recognize disgust, while reaching comparable performance for the remaining expressions. Our results provide a robust benchmark for the intensity and signal use in deafness and novel insights into the differential coding of facial expressions of emotion between hearing and deaf individuals.
17

Lee, Sung-Tae, and Jong-Ho Bae. "Investigation of Deep Spiking Neural Networks Utilizing Gated Schottky Diode as Synaptic Devices." Micromachines 13, no. 11 (October 22, 2022): 1800. http://dx.doi.org/10.3390/mi13111800.

Abstract:
Deep learning produces remarkable performance in various applications such as image classification and speech recognition. However, state-of-the-art deep neural networks require a large number of weights and enormous computation power, which creates an efficiency bottleneck for edge-device applications. To resolve these problems, deep spiking neural networks (DSNNs) have been proposed, given specialized synapse and neuron hardware. In this work, a hardware neuromorphic system for DSNNs with gated Schottky diodes was investigated. Gated Schottky diodes have a near-linear conductance response, which makes it easy to implement quantized weights in synaptic devices. Based on modeling of the synaptic devices, two-layer fully connected neural networks are trained by off-chip learning. Adapting each neuron's threshold is proposed to reduce the accuracy degradation caused by the conversion from analog neural networks (ANNs) to event-driven DSNNs. Using left-justified rate coding as the input encoding method enables low-latency classification. The effect of device variation and noisy images on the classification accuracy is investigated. The time-to-first-spike (TTFS) scheme can significantly reduce power consumption by reducing the number of firing spikes compared to a max-firing scheme.
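The two input-encoding ideas at the end of this abstract can be sketched in a few lines; the mapping constants are illustrative, not the paper's:

```python
import numpy as np

def rate_code(pixels, T=100):
    # Rate coding: spike count proportional to intensity over T steps.
    return (pixels * T).astype(int)

def ttfs_code(pixels, T=100):
    # Time-to-first-spike: one spike per pixel, brighter pixels fire
    # earlier, so far fewer spikes are emitted than with rate coding.
    return np.round((1.0 - pixels) * (T - 1)).astype(int)

pixels = np.array([0.0, 0.25, 0.5, 1.0])
counts = rate_code(pixels)      # spikes per neuron under rate coding
times = ttfs_code(pixels)       # first-spike time per neuron under TTFS
print(counts.sum(), "spikes (rate) vs", len(pixels), "spikes (TTFS)")
```

The spike-count gap is the source of the power savings the abstract attributes to the TTFS scheme.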
18

Tabassum, Shafia, Muddasir Hussain, and Darakhshan M. Saleem. "Audible Tool for Color Identification Using Arduino UNO." Sir Syed Research Journal of Engineering & Technology 1, no. 1 (December 19, 2018): 4. http://dx.doi.org/10.33317/ssurj.v1i1.40.

Abstract:
This hardware tool is designed to assist and train visually impaired people in identifying surrounding colored objects. Much research has been conducted worldwide on designing advanced, reliable tools and techniques to support blind individuals. This paper describes the design of a handheld device that assists visually impaired people with color identification via generated speech. Vision is one of the five senses essential for a human being to live independently: individuals acquire external information through visual coding in the eye, where light rays stimulate vision receptors to form an image that the brain then identifies. Unfortunately, accidents or age-related visual disorders can restrict the ability to see the external environment. A visually impaired person can identify the texture, shape, and size of an object by touch, but the sense of touch does not permit color identification. The proposed wearable aid focuses mainly on individuals who are unable to perceive colors due to a visual disorder. A hardware device is presented that generates speech for the detected color. An RGB sensor captures the color information of the object, and an Arduino UNO board, programmed with Arduino software 1.6.3, processes the signal and generates clear audible speech naming the object's color. The device is low-power, battery-operated, portable, safe, and easy to handle.
19

Tabassum, Shafia, Muddasir Hussain, and Darakhshan M. Saleem. "Audible Tool for Color Identification Using Arduino UNO." Sir Syed University Research Journal of Engineering & Technology 7, no. 1 (December 19, 2018): 4. http://dx.doi.org/10.33317/ssurj.v7i1.40.

20

Mirhosseini, Seyyed-Abdolhamid, and Mahdieh Noori. "Discursive portrayal of Islam as “a part of America’s story” in Obama’s presidential speeches." Journal of Language and Politics 18, no. 6 (August 9, 2019): 915–37. http://dx.doi.org/10.1075/jlp.18023.mir.

Abstract:
This article investigates the image portrayed of Islam and Muslims in official speeches of the former US President, Barack Obama during his two terms in office. Applying qualitative data coding procedures and based on a Critical Discourse Studies (CDS) approach, we examine 377 speeches delivered in the period of 2009–2016 within the macro context of US involvements in contemporary international politics to uncover the discursive image of Islam and Islamic attributes projected and subtly reproduced over time by Obama during his presidency. The outcome comprises four major themes shaped around the notions of America’s fundamental values; Dialogue with Muslim communities; Defining good Islam; and Defining bad Muslims. Through a detailed discussion of the discursive construction of these themes and specifically referring to their lexical highlights, we illustrate aspects of Islam-related issues in the view of an American president.
21

Dennison, Stephen R., Tanvi Thakkar, Alan Kan, Mahan Azadpour, Mario A. Svirsky, and Ruth Y. Litovsky. "ITD sensitivity provided to bilateral cochlear implant listeners by a mixed-rate, real-time capable processing strategy." Journal of the Acoustical Society of America 150, no. 4 (October 2021): A338—A339. http://dx.doi.org/10.1121/10.0008506.

Abstract:
Bilateral cochlear implant (CI) listeners have limited access to interaural time differences (ITDs) at low frequencies in part because clinical processors do not coordinate the timing of stimulation across the ears. Further, clinical strategies, which are optimized for good speech reception, operate at stimulation rates that are too high [∼1000 pulses per second (pps)] to enable ITD sensitivity from the timing of pulses. Our newly developed, real-time capable sound coding strategy, delivered using a bilaterally synchronized research processor, enables binaural coordination and pulse-timed ITD encoding. We employ a “mixed-rate” stimulation approach by maintaining the capacity to provide speech or envelope ITDs on high-rate channels while simultaneously providing ITDs in the pulse timing on low-rate (100 pps) channels. We hypothesized that a mixed-rate strategy encoding both high-rate envelope ITDs and pulse ITDs would yield better ITD sensitivity than a strategy encoding pulse ITDs or high-rate envelope ITDs only. This hypothesis was tested by measuring the perceived range of lateralized auditory images in bilateral CI listeners. Preliminary results indicate listeners may have ITD sensitivity with all conditions, though the mixed-rate strategy can provide better ITD sensitivity in some listeners. This demonstrates the potential of the mixed-rate strategy for improving binaural hearing outcomes.
APA, Harvard, Vancouver, ISO, and other styles
22

Middlebrooks, John C., and Julie Arenberg Bierer. "Auditory Cortical Images of Cochlear-Implant Stimuli: Coding of Stimulus Channel and Current Level." Journal of Neurophysiology 87, no. 1 (January 1, 2002): 493–507. http://dx.doi.org/10.1152/jn.00211.2001.

Full text
Abstract:
This study quantified the accuracy with which populations of neurons in the auditory cortex can represent aspects of electrical cochlear stimuli presented through a cochlear implant. We tested the accuracy of coding of the place of stimulation (i.e., identification of the active stimulation channel) and of the stimulus current level. Physiological data came from the companion study, which recorded spike activity of neurons simultaneously from 16 sites along the tonotopic axis of the guinea pig's auditory cortex. In that study, cochlear electrical stimuli were presented to acutely deafened animals through a 6-electrode animal version of the 22-electrode Nucleus banded electrode array (Cochlear). Cochlear electrode configurations consisted of monopolar (MP), bipolar (BP + N) with N inactive electrodes between the active and return electrodes (0 ≤ N ≤ 3), tripolar (TP) with one active electrode and two flanking return electrodes, and common ground (CG) with one active electrode and as many as five return electrodes. In the present analysis, an artificial neural network was trained to recognize spatiotemporal patterns of cortical activity in response to single presentations of particular stimuli and, thereby, to identify those stimuli. The accuracy of pair-wise discrimination of stimulation channels or of current levels was represented by the discrimination index, d′, where d′ = 1 was taken as threshold. In many cases, the threshold for discrimination of place of cochlear stimulation was <0.75 mm, and the threshold for discrimination of current levels was <1 dB. Cochlear electrode configurations varied in the accuracy with which they signaled to the auditory cortex the place of cochlear stimulation. The BP + N and TP configurations provided considerably greater sensitivity to place of stimulation than did the MP configuration. 
The TP configuration maintained accurate signaling of place of stimulation up to the highest current levels, whereas sensitivity was degraded at high current levels in BP + N configurations. Electrode configurations also varied in the dynamic range over which they signaled stimulus current level. Dynamic ranges were widest for the BP + 0 configuration and narrowest for the TP configuration. That is, the configuration that showed the most accurate signaling of cochlear place of stimulation (TP) showed the most restricted dynamic range for signaling of current level. These results suggest that the choice of the optimal electrode configuration for use by human cochlear-prosthesis users would depend on the particular demands of the speech-processing strategy that is to be employed.
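The discrimination index d′ used in this study can be computed from two Gaussian-distributed response measures as the difference of means scaled by the pooled standard deviation. The sketch below uses hypothetical means and variances, not data from the study; only the threshold convention d′ = 1 is taken from the abstract.

```python
import math

def d_prime(mean_a, var_a, mean_b, var_b):
    """Discrimination index d' for two response distributions:
    absolute difference of means over the root of the mean variance."""
    return abs(mean_a - mean_b) / math.sqrt((var_a + var_b) / 2.0)

# Hypothetical cortical response projections for two stimulation channels:
d = d_prime(mean_a=3.0, var_a=1.0, mean_b=1.5, var_b=1.0)
print(d)           # 1.5
print(d >= 1.0)    # d' = 1 was taken as the discrimination threshold
```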
APA, Harvard, Vancouver, ISO, and other styles
23

Rao, P. Srinivasa, Ch ChandraLekha, B. Srinivasa Reddy, G. Lakshmi Deepthi, and K. Bhargav. "Speech Coding." International Journal of Future Generation Communication and Networking 7, no. 5 (October 31, 2014): 229–38. http://dx.doi.org/10.14257/ijfgcn.2014.7.5.19.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Nikoriak, Nataliia. "Text on the Culturological Border: the Cinemanovel “The Red. Without a Front Line” by A. Kokotiukha." Pitannâ lìteraturoznavstva, no. 102 (December 28, 2020): 164–94. http://dx.doi.org/10.31861/pytlit2020.102.164.

Full text
Abstract:
The article under study reveals the terminological polymodality of the concept of “cinemanovel”: a screened novel; a film genre; an original narrative work that tends toward a screenplay; or a literary text written on the basis of a film and its screenplay (film “novelization”). An overview of the modern theoretical and practical discourse on the cinemanovel genre is presented in the paper. It has been emphasized that some researchers try to trace the origins of this genre by analyzing samples in a comparative and intermedial way, while others focus on clarifying the specifics of individual novels, concluding on the synthetic and hybrid nature of this genre. In this aspect, the cinemanovel-prequel by A. Kokotiukha “The Red. Without a Front Line” (2019) has been analyzed. This text, based on a film screenplay, appears to be a rather complex construct that acquires a double coding – cinematic and literary – hence the genre of the novel (as a product of the synthesis of two arts) contains the key features of both. On the one hand, it preserves the cinematic codes that pass from the screenplay: fragmentation, word visualization, documentalism, eventfulness, editing, alternation of angles and plans, time reduction, dialogues, character formation in action, characterization through speech, conciseness of phrases in certain scenes to create the effect of maximum tension, image condensation, and accumulation of internal tension in the episode. On the other hand (as a result of the so-called “novelization”), the text acquires genre features of the novel: the scale of the narration (although fragmented and condensed) and the description of the characters’ lives in line with historical events, with the disclosure of their psychology and inner world. Finally, the work is also marked architectonically: the author connects his cinemanovels by means of a plot, the main character, and a general artistic idea.
APA, Harvard, Vancouver, ISO, and other styles
25

Yatsuzuka, Yohtaro. "Speech coding system." Journal of the Acoustical Society of America 90, no. 1 (July 1991): 626. http://dx.doi.org/10.1121/1.402304.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Sasaki, Seishi. "Speech coding circuit." Journal of the Acoustical Society of America 97, no. 1 (January 1995): 736. http://dx.doi.org/10.1121/1.413056.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Jagtap, S. K., M. S. Mulye, and M. D. Uplane. "Speech Coding Techniques." Procedia Computer Science 49 (2015): 253–63. http://dx.doi.org/10.1016/j.procs.2015.04.251.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Alsaka, Y. A. "Contractive speech coding." Electronics Letters 28, no. 14 (1992): 1358. http://dx.doi.org/10.1049/el:19920863.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Amano, Fumio. "Speech coding apparatus." Journal of the Acoustical Society of America 95, no. 2 (February 1994): 1183. http://dx.doi.org/10.1121/1.408451.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Goldberg, Randy G. "Perpetual speech coding." Journal of the Acoustical Society of America 96, no. 4 (October 1994): 2595. http://dx.doi.org/10.1121/1.410069.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Taniguchi, Tomohiko, and Mark A. Johnson. "Speech coding system." Journal of the Acoustical Society of America 96, no. 3 (September 1994): 1949. http://dx.doi.org/10.1121/1.410190.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Taniguchi, Tomohiko. "Speech coding apparatus using multimode coding." Journal of the Acoustical Society of America 95, no. 5 (May 1994): 2795. http://dx.doi.org/10.1121/1.409797.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Firos, A., and Utpal Bhattacharjee. "Impact of Semantic Coding of Emotional Speech on Speech Coding Performance." International Journal of Computer Trends and Technology 57, no. 1 (March 25, 2018): 6–10. http://dx.doi.org/10.14445/22312803/ijctt-v57p102.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Herrera, Abel, and Arturo Gardida. "Speech coding for fast speech commands recognition." Journal of the Acoustical Society of America 112, no. 5 (November 2002): 2305. http://dx.doi.org/10.1121/1.4779282.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Avery, James M., and Elmer A. Hoyer. "Clipped speech‐linear predictive coding speech processor." Journal of the Acoustical Society of America 84, no. 5 (November 1988): 1966. http://dx.doi.org/10.1121/1.397069.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Guibé, G., H. T. How, and L. Hanzo. "Speech spectral quantizers for wideband speech coding." European Transactions on Telecommunications 12, no. 6 (November 2001): 535–45. http://dx.doi.org/10.1002/ett.4460120609.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Koh, Soo N., and Costas Xydeas. "Frequency domain speech coding." Journal of the Acoustical Society of America 93, no. 1 (January 1993): 592–93. http://dx.doi.org/10.1121/1.405579.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Amarendra Babu, Akella. "Robust Speech Coding Algorithm." International Journal on Cryptography and Information Security 2, no. 1 (March 31, 2012): 59–67. http://dx.doi.org/10.5121/ijcis.2012.2106.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Boyd, Ivan. "Method of speech coding." Journal of the Acoustical Society of America 90, no. 6 (December 1991): 3391. http://dx.doi.org/10.1121/1.401352.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Litwin, L. R. "Speech coding with wavelets." IEEE Potentials 17, no. 2 (April 1998): 38–41. http://dx.doi.org/10.1109/45.666646.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Nishiguchi, Masayuki. "Speech efficient coding method." Journal of the Acoustical Society of America 102, no. 6 (1997): 3251. http://dx.doi.org/10.1121/1.419570.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Boyd, I. "Speech coding for telecommunications." Electronics & Communications Engineering Journal 4, no. 5 (1992): 273. http://dx.doi.org/10.1049/ecej:19920048.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Serizawa, Masahiro. "Speech pitch coding system." Journal of the Acoustical Society of America 103, no. 3 (March 1998): 1248. http://dx.doi.org/10.1121/1.423206.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Cuperman, Vladimir, and Allen Gersho. "Low delay speech coding." Speech Communication 12, no. 2 (June 1993): 193–204. http://dx.doi.org/10.1016/s0167-6393(05)80011-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Litwin, L. R. "Speech coding with wavelets." Computer Standards & Interfaces 20, no. 6-7 (March 1999): 447. http://dx.doi.org/10.1016/s0920-5489(99)90931-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Jayant, N. S., J. D. Johnston, and Y. Shoham. "Coding of wideband speech." Speech Communication 11, no. 2-3 (June 1992): 127–38. http://dx.doi.org/10.1016/0167-6393(92)90007-t.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Vanyagina, M. "Mnemonics techniques for teaching English lexis." Pedagogy and Psychology of Education, no. 3, 2019 (2019): 71–85. http://dx.doi.org/10.31862/2500-297x-2019-3-71-85.

Full text
Abstract:
The article deals with the efficiency of applying mnemonics to studying the English language. Mnemonics is one of the active training technologies that rely on lexical-semantic links and associative thinking. Since ancient times, scholars have studied the properties of memory and offered ways of simplifying the storage of information by means of various techniques. Modern psychologists and teachers agree that coding information through images and associations accelerates memorization. The article considers various mnemonic methods that help perceive and reproduce the necessary educational information, including foreign words and phrases: associations, mnemo-rhymes, the chain method, and mnemo-cards. Examples of applying mnemonics in the author's practice of teaching a foreign language to cadets and post-graduate military students are given. A pedagogical experiment on using mnemo-cards to teach English to cadets of higher military educational institutions, aimed at intensifying the educational process, is described. A mnemo-card is a structured schematic visualization of the basic elements of information on paper or an electronic medium. It contains the word, the name of its part of speech, a transcription, translation variants, a picture associated with the word, and a mnemo-rhyme. Mnemo-cards are repeatedly shown and pronounced during lessons and self-study. As a result of the pedagogical experiment, the groups of cadets using mnemo-cards showed a higher percentage of retention of English-language lexis, which allows a conclusion to be drawn about the expediency and efficiency of the offered technique.
APA, Harvard, Vancouver, ISO, and other styles
48

Ponghiran, Wachirawit, and Kaushik Roy. "Spiking Neural Networks with Improved Inherent Recurrence Dynamics for Sequential Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 8001–8. http://dx.doi.org/10.1609/aaai.v36i7.20771.

Full text
Abstract:
Spiking neural networks (SNNs) with leaky integrate and fire (LIF) neurons, can be operated in an event-driven manner and have internal states to retain information over time, providing opportunities for energy-efficient neuromorphic computing, especially on edge devices. Note, however, many representative works on SNNs do not fully demonstrate the usefulness of their inherent recurrence (membrane potential retaining information about the past) for sequential learning. Most of the works train SNNs to recognize static images by artificially expanded input representation in time through rate coding. We show that SNNs can be trained for practical sequential tasks by proposing modifications to a network of LIF neurons that enable internal states to learn long sequences and make their inherent recurrence resilient to the vanishing gradient problem. We then develop a training scheme to train the proposed SNNs with improved inherent recurrence dynamics. Our training scheme allows spiking neurons to produce multi-bit outputs (as opposed to binary spikes) which help mitigate the mismatch between a derivative of spiking neurons' activation function and a surrogate derivative used to overcome spiking neurons' non-differentiability. Our experimental results indicate that the proposed SNN architecture on TIMIT and LibriSpeech 100h speech recognition dataset yields accuracy comparable to that of LSTMs (within 1.10% and 0.36%, respectively), but with 2x fewer parameters than LSTMs. The sparse SNN outputs also lead to 10.13x and 11.14x savings in multiplication operations compared to GRUs, which are generally considered as a lightweight alternative to LSTMs, on TIMIT and LibriSpeech 100h datasets, respectively.
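The leaky integrate-and-fire dynamics the abstract refers to, in which the membrane potential carries state from one time step to the next (the "inherent recurrence"), can be sketched as follows. The decay factor, threshold, and soft-reset rule here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def lif_step(v, x, beta=0.9, threshold=1.0):
    """One leaky integrate-and-fire update: the membrane potential v is the
    neuron's internal state, decayed by beta, integrated with input x,
    and reduced by the threshold after a spike (soft reset)."""
    v = beta * v + x                          # leaky integration
    spike = (v >= threshold).astype(float)    # binary spike output
    v = v - spike * threshold                 # soft reset on spiking units
    return v, spike

# Drive a single neuron with a constant input and count spikes over 20 steps.
v = np.zeros(1)
spikes = 0
for _ in range(20):
    v, s = lif_step(v, x=0.3)
    spikes += int(s[0])
print(spikes)    # 5 (the neuron settles into spiking every fourth step)
```

The retained potential between steps is what lets such a network integrate information over a sequence, which the authors exploit for speech recognition instead of the rate-coded static-image setting.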
APA, Harvard, Vancouver, ISO, and other styles
49

Järvinen, Kari. "Digital speech processing: Speech coding, synthesis, and recognition." Signal Processing 30, no. 1 (January 1993): 133–34. http://dx.doi.org/10.1016/0165-1684(93)90056-g.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

McAulay, R. J., and T. F. Quatieri. "Speech coding based on a sinusoidal speech model." Journal of the Acoustical Society of America 95, no. 5 (May 1994): 2806–7. http://dx.doi.org/10.1121/1.409725.

Full text
APA, Harvard, Vancouver, ISO, and other styles