Journal articles on the topic 'Visemes'

Consult the top 50 journal articles for your research on the topic 'Visemes.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Chelali, Fatma Zohra, and Amar Djeradi. "Primary Research on Arabic Visemes, Analysis in Space and Frequency Domain." International Journal of Mobile Computing and Multimedia Communications 3, no. 4 (October 2011): 1–19. http://dx.doi.org/10.4018/jmcmc.2011100101.

Abstract:
Visemes are the unique facial positions required to produce phonemes, the smallest phonetic units distinguished by the speakers of a particular language. Each language has multiple phonemes and visemes, and each viseme can correspond to multiple phonemes. However, the current literature on viseme research indicates that the mapping between phonemes and visemes is many-to-one: many phonemes look alike visually and hence fall into the same visemic category. To evaluate the performance of the proposed method, the authors collected a large number of visual speech signals from five Algerian speakers, male and female, recorded at different times pronouncing the 28 Arabic phonemes. For each frame, the lip area is manually located with a rectangle of size proportional to 120×160, centred on the mouth, and converted to grayscale. Finally, the mean and standard deviation of the lip-area pixel values are computed over 20 images for each phoneme sequence to classify the visemes. Pitch analysis is also investigated to show how pitch varies for each viseme.
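For illustration, the mean and standard deviation features described in this abstract amount to a few lines of NumPy; the sketch below assumes pre-cropped grayscale lip images, and the helper name and data layout are hypothetical:

```python
import numpy as np

def lip_region_features(frames):
    """Compute per-sequence viseme features from grayscale lip-area frames.

    frames: iterable of 2-D uint8 arrays (e.g., 20 cropped 120x160 lip images
    for one phoneme sequence), already converted to grayscale.
    Returns the mean and standard deviation of all pixel values in the sequence.
    """
    stack = np.stack([np.asarray(f, dtype=np.float64) for f in frames])
    return stack.mean(), stack.std()

# Hypothetical usage: one (mean, std) feature pair per phoneme sequence.
# sequences = {"ba": [...20 lip crops...], "ta": [...20 lip crops...], ...}
# features = {ph: lip_region_features(frames) for ph, frames in sequences.items()}
```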
2

Bear, Helen L., and Richard Harvey. "Alternative Visual Units for an Optimized Phoneme-Based Lipreading System." Applied Sciences 9, no. 18 (September 15, 2019): 3870. http://dx.doi.org/10.3390/app9183870.

Abstract:
Lipreading is understanding speech from observed lip movements. An observed series of lip motions is an ordered sequence of visual lip gestures. These gestures are commonly known, but as yet not formally defined, as 'visemes'. In this article, we describe a structured approach which allows us to create speaker-dependent visemes with a fixed number of visemes within each set. We create sets of visemes for sizes two to 45. Each set of visemes is based upon clustering phonemes, thus each set has a unique phoneme-to-viseme mapping. We first present an experiment using these maps and the Resource Management Audio-Visual (RMAV) dataset which shows the effect of changing the viseme map size in speaker-dependent machine lipreading and demonstrates that word recognition with phoneme classifiers is possible. Furthermore, we show that there are intermediate units between visemes and phonemes which are better still. Second, we present a novel two-pass training scheme for phoneme classifiers. This approach uses our new intermediary visual units from our first experiment as classifiers in the first pass; then, using the phoneme-to-viseme maps, we retrain these into phoneme classifiers. This method significantly improves on previous lipreading results with RMAV speakers.
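For readers unfamiliar with phoneme-to-viseme maps, the sketch below shows how such a many-to-one map is applied to a phoneme transcription; the groupings are invented for illustration and are not the RMAV-derived maps from the paper:

```python
# Illustrative many-to-one phoneme-to-viseme map (placeholder groupings).
PHONEME_TO_VISEME = {
    "p": "V1", "b": "V1", "m": "V1",
    "f": "V2", "v": "V2",
    "t": "V3", "d": "V3", "s": "V3", "z": "V3",
    "k": "V4", "g": "V4",
}

def phonemes_to_visemes(phoneme_sequence):
    """Collapse a phoneme transcription into its viseme transcription."""
    return [PHONEME_TO_VISEME.get(p, "V?") for p in phoneme_sequence]

print(phonemes_to_visemes(["b", "t", "k"]))  # ['V1', 'V3', 'V4']
```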
3

Owens, Elmer, and Barbara Blazek. "Visemes Observed by Hearing-Impaired and Normal-Hearing Adult Viewers." Journal of Speech, Language, and Hearing Research 28, no. 3 (September 1985): 381–93. http://dx.doi.org/10.1044/jshr.2803.381.

Abstract:
A series of VCV nonsense syllables formed with 23 consonants and the vowels //, /i/, /u/, and // was presented on videotape without sound to 5 hearing-impaired adults and 5 adults with normal hearing. The two-fold purpose was (a) to determine whether the two groups would perform the same in their identification of visemes and (b) to observe whether the identification of visemes is influenced by vowel context. There were no differences between the two groups either with respect to the overall percentage of items correct or to the visemes identified. Noticeable differences occurred in viseme identification between the /u/ context and the other 3 vowel contexts; visemes with // differed slightly from those with // and /i/; and there were no differences in viseme identification for // and /i/ contexts. Findings were in general agreement with other studies with respect to the visemes identified, provided it is acknowledged that changes can occur depending on variables such as talkers, stimuli, recording and viewing conditions, training procedures, and statistical criteria. A composite grouping consists of /p,b,m/; /f,v/; /θ,ð/; /w,r/; /tʃ,dʒ,ʃ,ʒ/; and /t,d,s,k,n,g,l/.
4

Fenghour, Souheil, Daqing Chen, Kun Guo, Bo Li, and Perry Xiao. "An Effective Conversion of Visemes to Words for High-Performance Automatic Lipreading." Sensors 21, no. 23 (November 26, 2021): 7890. http://dx.doi.org/10.3390/s21237890.

Abstract:
As an alternative approach, viseme-based lipreading systems have demonstrated promising performance results in decoding videos of people uttering entire sentences. However, the overall performance of such systems has been significantly affected by the efficiency of the conversion of visemes to words during the lipreading process. As shown in the literature, the issue has become a bottleneck of such systems where the system’s performance can decrease dramatically from a high classification accuracy of visemes (e.g., over 90%) to a comparatively very low classification accuracy of words (e.g., only just over 60%). The underlying cause of this phenomenon is that roughly half of the words in the English language are homophemes, i.e., a set of visemes can map to multiple words, e.g., “time” and “some”. In this paper, aiming to tackle this issue, a deep learning network model with an Attention based Gated Recurrent Unit is proposed for efficient viseme-to-word conversion and compared against three other approaches. The proposed approach features strong robustness, high efficiency, and short execution time. The approach has been verified with analysis and practical experiments of predicting sentences from benchmark LRS2 and LRS3 datasets. The main contributions of the paper are as follows: (1) A model is developed, which is effective in converting visemes to words, discriminating between homopheme words, and is robust to incorrectly classified visemes; (2) the model proposed uses a few parameters and, therefore, little overhead and time are required to train and execute; and (3) an improved performance in predicting spoken sentences from the LRS2 dataset with an attained word accuracy rate of 79.6%—an improvement of 15.0% compared with the state-of-the-art approaches.
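The paper's exact architecture is not reproduced here; the following PyTorch sketch is only a generic GRU-with-attention sequence classifier of the kind discussed, with placeholder vocabulary sizes and dimensions:

```python
import torch
import torch.nn as nn

class VisemeToWord(nn.Module):
    """Generic GRU encoder with dot-product attention that maps a viseme
    sequence to a word class. All hyperparameters are illustrative only."""

    def __init__(self, n_visemes=14, n_words=5000, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_visemes, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_words)

    def forward(self, viseme_ids):           # (batch, seq_len) int64
        h = self.embed(viseme_ids)            # (batch, seq_len, emb)
        states, last = self.gru(h)             # states: (batch, seq_len, hidden)
        query = last[-1].unsqueeze(1)          # (batch, 1, hidden)
        scores = torch.bmm(query, states.transpose(1, 2))   # (batch, 1, seq_len)
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights, states).squeeze(1)      # (batch, hidden)
        return self.out(context)               # word logits

# Hypothetical usage: batch of 2 viseme sequences of length 6.
model = VisemeToWord()
logits = model(torch.randint(0, 14, (2, 6)))
print(logits.shape)  # torch.Size([2, 5000])
```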
5

Preminger, Jill E., Hwei-Bing Lin, Michel Payen, and Harry Levitt. "Selective Visual Masking in Speechreading." Journal of Speech, Language, and Hearing Research 41, no. 3 (June 1998): 564–75. http://dx.doi.org/10.1044/jslhr.4103.564.

Abstract:
Using digital video technology, selected aspects of a face can be masked by identifying the pixels that represent them and then adjusting the gray levels to effectively eliminate that facial aspect. In groups of young adults with normal vision and hearing, consonant-viseme recognition was measured for closed sets of vowel-consonant-vowel disyllables. In the first experiment, viseme recognition was measured while the tongue and teeth were masked and while the entire mouth was masked. The results showed that masking the tongue and teeth had little effect on viseme recognition, and when the entire mouth was masked, participants continued to identify consonant visemes with 70% or greater accuracy in the /a/ and // vowel contexts. In the second experiment, viseme recognition was measured when the upper part of the face and the mouth were masked and when the lower part of the face and the mouth were masked. The results showed that when the mouth and the upper part of the face were masked, performance was poor, but information was available to identify the consonant-viseme /f/. When the mouth and the lower part of the face were masked, viseme recognition was quite poor, but information was available to discriminate the consonant-viseme /p/ from other consonant visemes.
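The digital masking procedure described here is conceptually simple; a minimal sketch, assuming the pixels of the facial aspect have already been identified as a boolean mask and using a uniform gray fill level (an assumption, not the study's exact manipulation):

```python
import numpy as np

def mask_region(frame, region_mask, fill_level=128):
    """Hide a facial aspect by setting its pixels to a uniform gray level.

    frame: 2-D grayscale image (uint8).
    region_mask: boolean array of the same shape marking the pixels of the
    aspect to hide (e.g., the mouth, or the tongue and teeth).
    """
    out = frame.copy()
    out[region_mask] = fill_level
    return out
```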
6

De Martino, José Mario, Léo Pini Magalhães, and Fábio Violaro. "Facial animation based on context-dependent visemes." Computers & Graphics 30, no. 6 (December 2006): 971–80. http://dx.doi.org/10.1016/j.cag.2006.08.017.

7

Lalonde, Kaylah, and Grace A. Dwyer. "Visual phonemic knowledge and audiovisual speech-in-noise perception in school-age children." Journal of the Acoustical Society of America 153, no. 3_supplement (March 1, 2023): A337. http://dx.doi.org/10.1121/10.0019067.

Abstract:
Our mental representations of speech sounds include information about the visible articulatory gestures that accompany different speech sounds. We call this visual phonemic knowledge. This study examined the development of school-age children’s visual phonemic knowledge and their ability to use visual phonemic knowledge to supplement audiovisual speech processing. Sixty-two children (5–16 years) and 18 adults (19–35 years) completed auditory-only, visual-only, and audiovisual tests of consonant-vowel syllable repetition. Auditory-only and audiovisual conditions were presented in steady-state, speech-spectrum noise at individually set SNRs. Consonant confusions were analyzed to define visemes (clusters of phonemes that are visually confusable with one another but visually distinct from other phonemes) evident in adults’ responses to visual-only consonants and to compute the proportion of errors in each participant and modality that were within adults' visemes. Children were less accurate than adults at visual-only consonant identification. However, children as young as 5 years of age demonstrated some visual phonemic knowledge. Comparison of error patterns across conditions indicated that children used visual phonemic knowledge during audiovisual speech-in-noise recognition. Details regarding the order of acquisition of visemes will be discussed.
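The viseme-definition step (grouping consonants that are visually confused with one another) can be illustrated with hierarchical clustering of a confusion matrix; the matrix below is random placeholder data rather than the study's responses:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

consonants = ["p", "b", "m", "f", "v", "t", "d", "s"]
rng = np.random.default_rng(0)
confusion = rng.random((8, 8))                 # placeholder response proportions
confusion = (confusion + confusion.T) / 2      # symmetrise
np.fill_diagonal(confusion, 1.0)

# Convert confusability (similarity) to a distance and cluster.
distance = 1.0 - confusion
np.fill_diagonal(distance, 0.0)
tree = linkage(squareform(distance, checks=False), method="average")
labels = fcluster(tree, t=4, criterion="maxclust")   # ask for 4 viseme groups

visemes = {}
for c, lab in zip(consonants, labels):
    visemes.setdefault(lab, []).append(c)
print(visemes)   # e.g. {1: ['p', 'b', ...], 2: [...], ...}
```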
8

Lazalde, Oscar Martinez, Steve Maddock, and Michael Meredith. "A Constraint-Based Approach to Visual Speech for a Mexican-Spanish Talking Head." International Journal of Computer Games Technology 2008 (2008): 1–7. http://dx.doi.org/10.1155/2008/412056.

Abstract:
A common approach to produce visual speech is to interpolate the parameters describing a sequence of mouth shapes, known as visemes, where a viseme corresponds to a phoneme in an utterance. The interpolation process must consider the issue of context-dependent shape, or coarticulation, in order to produce realistic-looking speech. We describe an approach to such pose-based interpolation that deals with coarticulation using a constraint-based technique. This is demonstrated using a Mexican-Spanish talking head, which can vary its speed of talking and produce coarticulation effects.
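As background to the pose-based interpolation being described, a minimal linear interpolation between two viseme parameter vectors is sketched below; the constraint-based coarticulation handling of the paper is not reproduced:

```python
import numpy as np

def interpolate_visemes(pose_a, pose_b, n_frames):
    """Linearly interpolate mouth-shape parameter vectors between two visemes.

    pose_a, pose_b: 1-D arrays of mouth/facial parameters for the two visemes.
    Returns an (n_frames, n_params) array of in-between poses.
    """
    t = np.linspace(0.0, 1.0, n_frames)[:, None]
    return (1.0 - t) * np.asarray(pose_a) + t * np.asarray(pose_b)

# Hypothetical 3-parameter mouth poses for two visemes.
frames = interpolate_visemes([0.0, 1.0, 0.2], [0.8, 0.1, 0.5], n_frames=5)
print(frames.shape)  # (5, 3)
```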
9

Thangthai, Ausdang, Ben Milner, and Sarah Taylor. "Synthesising visual speech using dynamic visemes and deep learning architectures." Computer Speech & Language 55 (May 2019): 101–19. http://dx.doi.org/10.1016/j.csl.2018.11.003.

10

Henton, Caroline. "Beyond visemes: Using disemes in synthetic speech with facial animation." Journal of the Acoustical Society of America 95, no. 5 (May 1994): 3010. http://dx.doi.org/10.1121/1.408830.

11

SHAIKH, AYAZ A., DINESH K. KUMAR, and JAYAVARDHANA GUBBI. "VISUAL SPEECH RECOGNITION USING OPTICAL FLOW AND SUPPORT VECTOR MACHINES." International Journal of Computational Intelligence and Applications 10, no. 02 (June 2011): 167–87. http://dx.doi.org/10.1142/s1469026811003045.

Abstract:
A lip-reading technique that identifies visemes from visual data only, without evaluating the corresponding acoustic signals, is presented. The technique is based on the vertical components of optical flow (OF) analysis, which are classified using support vector machines (SVM). The OF is decomposed into multiple non-overlapping fixed-scale blocks, and statistical features of each block are computed for successive video frames of an utterance. The technique performs automatic temporal segmentation of the utterances (i.e., determining the start and the end of an utterance) by a pair-wise pixel comparison method, which evaluates the differences in intensity of corresponding pixels in two successive frames. The experiments were conducted on a database of 14 visemes taken from seven subjects, and the accuracy was tested using five- and ten-fold cross-validation for binary and multiclass SVMs, respectively, to determine the impact of subject variations. Unlike other systems in the literature, the results indicate that the proposed method is more robust to inter-subject variations, with high sensitivity and specificity for 12 out of 14 visemes. Potential applications of such a system include human-computer interfaces (HCI) for mobility-impaired users, lip-reading mobile phones, in-vehicle systems, and improvement of speech-based computer control in noisy environments.
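A rough sketch of this pipeline using OpenCV's dense optical flow and scikit-learn's SVM is given below; the block size, feature set, and training data are placeholders rather than the authors' settings:

```python
import cv2
import numpy as np
from sklearn.svm import SVC

def vertical_flow_features(prev_gray, next_gray, block=16):
    """Statistical features of the vertical optical-flow component per block."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    vy = flow[..., 1]                     # vertical component of the flow
    h, w = vy.shape
    feats = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            patch = vy[y:y + block, x:x + block]
            feats.extend([patch.mean(), patch.std()])
    return np.array(feats)

clf = SVC(kernel="rbf")
# Hypothetical training: one feature vector per frame pair, with viseme labels.
# X = np.vstack([vertical_flow_features(f0, f1) for f0, f1 in frame_pairs])
# clf.fit(X, viseme_labels)
```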
12

Hirayama, Makoto J. "Computer graphics facial animation of Japanese speech using three‐dimensional dynamic visemes." Journal of the Acoustical Society of America 120, no. 5 (November 2006): 3353. http://dx.doi.org/10.1121/1.4781430.

13

Quigley, Rita, and Al Yonovitz. "Signal detection of lipreading visemes using two dimensional and three dimensional images." Journal of the Acoustical Society of America 134, no. 5 (November 2013): 4204. http://dx.doi.org/10.1121/1.4831430.

14

Jia, Xi Bin, and Mei Xia Zheng. "Video Based Visual Speech Feature Model Construction." Applied Mechanics and Materials 182-183 (June 2012): 1367–71. http://dx.doi.org/10.4028/www.scientific.net/amm.182-183.1367.

Abstract:
This paper proposes a solution for constructing a Chinese visual speech feature model based on HMMs. We propose and discuss three representation models of visual speech: lip geometric features, lip motion features, and lip texture features. The model that combines local LBP and global DCT texture information performs better than any single feature; likewise, the model that combines local LBP and geometric information outperforms a single feature. By computing viseme recognition rates for each model, the paper shows that an HMM, which describes the dynamics of speech, coupled with the combined feature describing global and local texture, is the best model.
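To make the combined-feature idea concrete, the sketch below concatenates a local LBP histogram with low-frequency global DCT coefficients for one lip image, assuming scikit-image and SciPy; the parameters are illustrative only:

```python
import numpy as np
from scipy.fft import dctn
from skimage.feature import local_binary_pattern

def lbp_dct_features(lip_gray, lbp_points=8, lbp_radius=1, dct_keep=8):
    """Concatenate a local LBP histogram with low-frequency global DCT terms."""
    lbp = local_binary_pattern(lip_gray, lbp_points, lbp_radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=lbp_points + 2,
                           range=(0, lbp_points + 2), density=True)
    coeffs = dctn(lip_gray.astype(float), norm="ortho")[:dct_keep, :dct_keep]
    return np.concatenate([hist, coeffs.ravel()])

# Hypothetical usage on a 120x160 grayscale lip crop.
features = lbp_dct_features(np.random.randint(0, 256, (120, 160)))
print(features.shape)  # (74,) = 10 LBP bins + 64 DCT coefficients
```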
15

Pattamadilok, Chotiga, and Marc Sato. "How are visemes and graphemes integrated with speech sounds during spoken word recognition? ERP evidence for supra-additive responses during audiovisual compared to auditory speech processing." Brain and Language 225 (February 2022): 105058. http://dx.doi.org/10.1016/j.bandl.2021.105058.

16

Srivastava, Tanmay, Prerna Khanna, Shijia Pan, Phuc Nguyen, and Shubham Jain. "MuteIt." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, no. 3 (September 6, 2022): 1–26. http://dx.doi.org/10.1145/3550281.

Abstract:
In this paper, we present MuteIt, an ear-worn system for recognizing unvoiced human commands. MuteIt presents an intuitive alternative to voice-based interactions that can be unreliable in noisy environments, disruptive to those around us, and compromise our privacy. We propose a twin-IMU set up to track the user's jaw motion and cancel motion artifacts caused by head and body movements. MuteIt processes jaw motion during word articulation to break each word signal into its constituent syllables, and further each syllable into phonemes (vowels, visemes, and plosives). Recognizing unvoiced commands by only tracking jaw motion is challenging. As a secondary articulator, jaw motion is not distinctive enough for unvoiced speech recognition. MuteIt combines IMU data with the anatomy of jaw movement as well as principles from linguistics, to model the task of word recognition as an estimation problem. Rather than employing machine learning to train a word classifier, we reconstruct each word as a sequence of phonemes using a bi-directional particle filter, enabling the system to be easily scaled to a large set of words. We validate MuteIt for 20 subjects with diverse speech accents to recognize 100 common command words. MuteIt achieves a mean word recognition accuracy of 94.8% in noise-free conditions. When compared with common voice assistants, MuteIt outperforms them in noisy acoustic environments, achieving higher than 90% recognition accuracy. Even in the presence of motion artifacts, such as head movement, walking, and riding in a moving vehicle, MuteIt achieves mean word recognition accuracy of 91% over all scenarios.
17

Plass, John, David Brang, Satoru Suzuki, and Marcia Grabowecky. "Vision perceptually restores auditory spectral dynamics in speech." Proceedings of the National Academy of Sciences 117, no. 29 (July 6, 2020): 16920–27. http://dx.doi.org/10.1073/pnas.2002887117.

Abstract:
Visual speech facilitates auditory speech perception, but the visual cues responsible for these benefits and the information they provide remain unclear. Low-level models emphasize basic temporal cues provided by mouth movements, but these impoverished signals may not fully account for the richness of auditory information provided by visual speech. High-level models posit interactions among abstract categorical (i.e., phonemes/visemes) or amodal (e.g., articulatory) speech representations, but require lossy remapping of speech signals onto abstracted representations. Because visible articulators shape the spectral content of speech, we hypothesized that the perceptual system might exploit natural correlations between midlevel visual (oral deformations) and auditory speech features (frequency modulations) to extract detailed spectrotemporal information from visual speech without employing high-level abstractions. Consistent with this hypothesis, we found that the time–frequency dynamics of oral resonances (formants) could be predicted with unexpectedly high precision from the changing shape of the mouth during speech. When isolated from other speech cues, speech-based shape deformations improved perceptual sensitivity for corresponding frequency modulations, suggesting that listeners could exploit this cross-modal correspondence to facilitate perception. To test whether this type of correspondence could improve speech comprehension, we selectively degraded the spectral or temporal dimensions of auditory sentence spectrograms to assess how well visual speech facilitated comprehension under each degradation condition. Visual speech produced drastically larger enhancements during spectral degradation, suggesting a condition-specific facilitation effect driven by cross-modal recovery of auditory speech spectra. The perceptual system may therefore use audiovisual correlations rooted in oral acoustics to extract detailed spectrotemporal information from visual speech.
18

Meredith, R., S. D. G. Stephens, and G. E. Jones. "Investigations on viseme groups in Welsh." Clinical Linguistics & Phonetics 4, no. 3 (January 1990): 253–65. http://dx.doi.org/10.3109/02699209008985487.

19

Gorman, Benjamin M. "Reducing viseme confusion in speech-reading." ACM SIGACCESS Accessibility and Computing, no. 114 (March 16, 2016): 36–43. http://dx.doi.org/10.1145/2904092.2904100.

20

Varshney, Priyanka, Omar Farooq, and Prashant Upadhyaya. "Hindi viseme recognition using subspace DCT features." International Journal of Applied Pattern Recognition 1, no. 3 (2014): 257. http://dx.doi.org/10.1504/ijapr.2014.065768.

21

Mishra, A. N., Mahesh Chandra, Astik Biswas, and S. N. Sharan. "Hindi phoneme-viseme recognition from continuous speech." International Journal of Signal and Imaging Systems Engineering 6, no. 3 (2013): 164. http://dx.doi.org/10.1504/ijsise.2013.054793.

22

van Son, Nic, Tirtsa M. I. Huiskamp, Arjan J. Bosman, and Guido F. Smoorenburg. "Viseme classifications of Dutch consonants and vowels." Journal of the Acoustical Society of America 96, no. 3 (September 1994): 1341–55. http://dx.doi.org/10.1121/1.411324.

23

Kalberer, Gregor A., Pascal Müller, and Luc Van Gool. "Visual speech, a trajectory in viseme space." International Journal of Imaging Systems and Technology 13, no. 1 (2003): 74–84. http://dx.doi.org/10.1002/ima.10044.

24

Rachman, Anung, Risanuri Hidayat, and Hanung Adi Nugroho. "Improving Phoneme to Viseme Mapping for Indonesian Language." IJITEE (International Journal of Information Technology and Electrical Engineering) 4, no. 1 (September 9, 2020): 1. http://dx.doi.org/10.22146/ijitee.47577.

Abstract:
Lip-synchronization technology for animation can run automatically through a phoneme-to-viseme map. Because the complexity of the facial muscles makes the shape of the mouth vary greatly, phoneme-to-viseme mapping always poses challenging problems. One of them is the allophone vowel problem: their resemblance leads many researchers to cluster them into one class. This paper discusses the treatment of vowel allophones as a variable of the phoneme-to-viseme map. As the proposed method, vowel-allophone pre-processing is carried out through formant-frequency feature extraction, and the allophones are then compared with a t-test to determine the significance of their difference. The results of the pre-processing are then used as the initial reference data when building phoneme-to-viseme maps. This research was conducted on maps and allophones of the Indonesian language. The maps built in this way are then compared with other maps using an HMM, in terms of word correctness and accuracy. The results show that viseme mapping preceded by allophone pre-processing yields more accurate map performance than the other maps.
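The allophone pre-processing described above reduces to comparing formant-frequency features of two vowel allophones with a t-test; the sketch below uses placeholder formant values, since formant extraction itself requires a speech-analysis tool:

```python
import numpy as np
from scipy.stats import ttest_ind

# Placeholder F1 measurements (Hz) for two hypothetical vowel allophones.
allophone_a_f1 = np.array([310, 325, 298, 330, 315])
allophone_b_f1 = np.array([420, 435, 410, 440, 428])

t_stat, p_value = ttest_ind(allophone_a_f1, allophone_b_f1, equal_var=False)
if p_value < 0.05:
    print("Allophones differ significantly; keep separate viseme-map entries.")
else:
    print("No significant difference; the allophones can share one viseme class.")
```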
25

Dave, Namrata, and Narendra M. Patel. "Phoneme and Viseme based Approach for Lip Synchronization." International Journal of Signal Processing, Image Processing and Pattern Recognition 7, no. 3 (June 30, 2014): 385–94. http://dx.doi.org/10.14257/ijsip.2014.7.3.31.

26

Bibish Kumar, K. T., R. K. Sunil Kumar, E. P. A. Sandesh, S. Sourabh, and V. L. Lajish. "Viseme set identification from Malayalam phonemes and allophones." International Journal of Speech Technology 22, no. 4 (November 4, 2019): 1149–66. http://dx.doi.org/10.1007/s10772-019-09655-0.

27

Voldman, Daniele, and Marc Cluet. "L'architecture du IIIe Reich. Origines intellectuelles et visees ideologiques." Vingtième Siècle. Revue d'histoire, no. 19 (July 1988): 125. http://dx.doi.org/10.2307/3769787.

28

Mayer, Connor, Jennifer Abel, Adriano Barbosa, Alexis Black, and Eric Vatikiotis‐Bateson. "The labial viseme reconsidered: Evidence from production and perception." Journal of the Acoustical Society of America 129, no. 4 (April 2011): 2456. http://dx.doi.org/10.1121/1.3588075.

29

Xue, Jianxia, Abeer Alwan, Edward T. Auer, and Lynne E. Bernstein. "On audio‐visual synchronization for viseme‐based speech synthesis." Journal of the Acoustical Society of America 116, no. 4 (October 2004): 2480–81. http://dx.doi.org/10.1121/1.4784907.

30

Arifin, Arifin, Surya Sumpeno, Mochamad Hariadi, and Hanny Haryanto. "A Text-to-Audiovisual Synthesizer for Indonesian by Morphing Viseme." International Review on Computers and Software (IRECOS) 10, no. 11 (November 30, 2015): 1149. http://dx.doi.org/10.15866/irecos.v10i11.7833.

31

Bement, Linda, Josara Wallber, Carol DeFilippo, Joseph Bochner, and Wayne Garrison. "A New Protocol for Assessing Viseme Perception in Sentence Context." Ear and Hearing 9, no. 1 (February 1988): 33–40. http://dx.doi.org/10.1097/00003446-198802000-00014.

32

Richie, Carolyn. "The effects of speechreading training on viseme categories for vowels." Journal of the Acoustical Society of America 118, no. 3 (September 2005): 1963. http://dx.doi.org/10.1121/1.4809122.

33

Li, H., and C. J. Tang. "Dynamic Chinese viseme model based on phones and control function." Electronics Letters 47, no. 2 (2011): 144. http://dx.doi.org/10.1049/el.2010.2570.

34

Basu, Sankar. "Speech driven lip synthesis using viseme based hidden Markov models." Journal of the Acoustical Society of America 112, no. 6 (2002): 2520. http://dx.doi.org/10.1121/1.1536507.

35

Jachimski, Dawid, Andrzej Czyzewski, and Tomasz Ciszewski. "A comparative study of English viseme recognition methods and algorithms." Multimedia Tools and Applications 77, no. 13 (October 7, 2017): 16495–532. http://dx.doi.org/10.1007/s11042-017-5217-5.

36

Filler, Robert, and Gary L. Cantrell. "Polyfluoroaromatics. The interrelated chemistry of barrelenes, anthracenes and vis-à-visenes." Journal of Fluorine Chemistry 29, no. 1-2 (August 1985): 112. http://dx.doi.org/10.1016/s0022-1139(00)83348-5.

37

Choi, Jaehee, Keonseok Yoon, Hyesoo Ryu, and Hyunsook Jang. "Analysis of Korean Viseme System in Korean Standard Monosyllabic Word Lists." Communication Sciences & Disorders 22, no. 3 (September 30, 2017): 615–28. http://dx.doi.org/10.12963/csd.17390.

38

Roslan, Rosniza, Nursuriati Jamil, Noraini Seman, and Syafiqa Ain Alfida Abdul Rahim. "Mouth Segmentation of Viseme Using Biased Normalized Cuts and Mathematical Morphology." Advanced Science Letters 23, no. 5 (May 1, 2017): 4202–5. http://dx.doi.org/10.1166/asl.2017.8297.

39

Bear, Helen L., and Richard Harvey. "Phoneme-to-viseme mappings: the good, the bad, and the ugly." Speech Communication 95 (December 2017): 40–67. http://dx.doi.org/10.1016/j.specom.2017.07.001.

40

Aghaahmadi, Mohammad, Mohammad Mahdi Dehshibi, Azam Bastanfard, and Mahmood Fazlali. "Clustering Persian viseme using phoneme subspace for developing visual speech application." Multimedia Tools and Applications 65, no. 3 (June 21, 2012): 521–41. http://dx.doi.org/10.1007/s11042-012-1128-7.

41

Aripin, _., and Hanny Haryanto. "A Realistic Visual Speech Synthesis for Indonesian Using A Combination of Morphing Viseme and Syllable Concatenation Approach to Support Pronunciation Learning." International Journal of Emerging Technologies in Learning (iJET) 13, no. 08 (August 30, 2018): 19. http://dx.doi.org/10.3991/ijet.v13i08.8084.

Abstract:
This study aims to build a realistic visual speech synthesis system for Indonesian so that it can be used to learn Indonesian pronunciation. In this study, we used a combination of the morphing-viseme and syllable-concatenation methods. The morphing-viseme method deforms one viseme into another so that the animation of the mouth shape looks smoother; it is used to create the animated transitions between visemes. The syllable-concatenation method is used to assemble visemes according to certain syllable patterns. We built a syllable-based voice database as a basis for synchronization between syllables, speech, and viseme models. The method proposed in this study consists of several stages: forming the Indonesian viseme models, designing the facial animation character, developing the speech database, performing the synchronization process, and subjective testing of the resulting application. Subjective tests were conducted with 30 respondents, who assessed the suitability and natural movement of the mouth when uttering Indonesian texts. The MOS (Mean Opinion Score) method is used to calculate the average of the respondents' scores. The MOS results for the synchronization and naturalness criteria are 4.283 and 4.107 on a scale of 1 to 5. These results show that the synthesized visual speech is well synchronized and natural. Therefore, the system can display visualizations of phoneme pronunciation to support learning Indonesian pronunciation.
42

Setyati, Endang, Mauridhi Hery Purnomo, Surya Sumpeno, and Joan Santoso. "HIDDEN MARKOV MODELS BASED INDONESIAN VISEME MODEL FOR NATURAL SPEECH WITH AFFECTION." Kursor 8, no. 3 (December 13, 2016): 102. http://dx.doi.org/10.28961/kursor.v8i3.61.

Abstract:
In communication from text input, a viseme (visual phoneme) is derived from a group of phonemes that have similar visual appearances. The hidden Markov model (HMM) is a popular mathematical approach for sequence classification such as speech recognition. For speech emotion recognition, an HMM is trained for each emotion, and an unknown sample is classified according to the model that best describes the derived feature sequence. With the Viterbi algorithm, an HMM is used to find the most probable sequence of hidden states given the observations. In this work, in the first stage, we define an Indonesian viseme set and the associated mouth shapes, i.e., the text-input segmentation system. In the second stage, we define a choice of affection type as an input to the system. In the last stage, we experimentally use trigram HMMs to generate the viseme sequence used for synchronized mouth shapes and lip movements. The whole system is interconnected in sequence. The final system produces a viseme sequence for natural speech of Indonesian sentences with affection. We show through various experiments that the proposed approach yields about an 82.19% relative improvement in classification accuracy.
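As a reference point for the HMM machinery mentioned in this abstract, a compact NumPy Viterbi decoder is sketched below; it is the generic textbook version, not the trigram-HMM setup of the paper:

```python
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Most probable hidden-state path for an observation sequence.

    obs: sequence of observation indices; start_p: (S,), trans_p: (S, S),
    emit_p: (S, O) probabilities for S hidden states and O observation symbols.
    """
    S, T = len(start_p), len(obs)
    dp = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    dp[0] = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for t in range(1, T):
        scores = dp[t - 1][:, None] + np.log(trans_p)   # scores[i, j]
        back[t] = scores.argmax(axis=0)
        dp[t] = scores.max(axis=0) + np.log(emit_p[:, obs[t]])
    path = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Tiny example: 2 hidden viseme states, 3 observation symbols.
print(viterbi([0, 2, 1],
              start_p=np.array([0.6, 0.4]),
              trans_p=np.array([[0.7, 0.3], [0.4, 0.6]]),
              emit_p=np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])))
```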
43

Dent, L. J. "Speechreading the Nonvisible Consonant Viseme Group with and without Single-Channel Stimulation." Annals of Otology, Rhinology & Laryngology 96, no. 1_suppl (January 1987): 132. http://dx.doi.org/10.1177/00034894870960s170.

44

Andriyanto, Pyepit Rinekso, Joan San, and Endang Setyati. "Identifikasi Viseme Untuk Fonem Bahasa Madura Berbasis Clustering Berdasarkan Facial Landmark Point." J-INTECH 11, no. 1 (July 5, 2023): 73–82. http://dx.doi.org/10.32664/j-intech.v11i1.835.

Abstract:
The most effective form of language for communication is spoken language. When speaking, humans move their mouth and lips to pronounce particular words. These mouth and lip movement patterns describe a viseme (visual phoneme), i.e., a group of phonemes that look visually almost identical. Madurese is a unique language with several distinctive characteristics. Besides having speech levels, Madurese has aspirated phonemes, i.e., words pronounced with an exhaled breath, such as /bh/, /dh/, /Dh/, /gh/ and /jh/, which do not exist in other languages. This study addresses the identification of viseme classes for Madurese phonemes using clustering based on facial landmark points. From 47 Madurese phonemes, 9 Madurese visemes were obtained through K-Means clustering. The clustering process uses feature extraction based on facial landmark points, from which distances are computed for every feature. The features used are geometric features. The Madurese viseme model is used to build 2D mouth animations for pronouncing Madurese words or sentences from text input. The benefit of this research lies in learning the pronunciation of Madurese words and sentences, because Madurese spelling and pronunciation differ.
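The clustering step described in this abstract (grouping 47 phonemes into 9 visemes from geometric landmark features) can be sketched with scikit-learn's K-Means; the feature matrix below is random placeholder data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Placeholder: one geometric feature vector (e.g., lip landmark distances)
# per phoneme -- 47 phonemes x 8 features.
phoneme_features = rng.random((47, 8))

kmeans = KMeans(n_clusters=9, n_init=10, random_state=0).fit(phoneme_features)
for viseme_id in range(9):
    members = np.where(kmeans.labels_ == viseme_id)[0]
    print(f"viseme {viseme_id}: phoneme indices {members.tolist()}")
```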
45

Gischler, Eberhard, and Dieter Korn. "Goniatites from Upper Visean sediments on top of the Iberg Reef, Upper Harz Mountains." Neues Jahrbuch für Geologie und Paläontologie - Abhandlungen 185, no. 3 (October 7, 1992): 271–88. http://dx.doi.org/10.1127/njgpa/185/1992/271.

46

Filep, Klára, Eszter László, and Melinda Székely. "A női viselet funkciói a mai széki társadalomban." Erdélyi Múzeum 84, no. 2 (2022): 102–19. http://dx.doi.org/10.36373/em-2022-2-7.

Abstract:
The study serves a dual purpose: on the one hand, it aims to present the rules governing the present-day use of the Szék folk costume, which are known to the villagers. Our other aim is to point out the functions with which the costume appears in today's Szék society. Local historical memory commemorates the Tatar raid of 1717 every year on St. Bartholomew's Day. The people of Szék attribute the prevalence of black and red in their costume to this event. The study also gives a brief, indicative historical overview of how the costume has changed, recording the more important modifications that have affected Szék garments over time. We focus primarily on presenting the pieces of the Szék women's costume, and with the photographs and table in the Appendix we seek to illustrate the rules governing the use of the Szék costume.
47

Vitevitch, Michael S., and Lorin Lachs. "Using network science to examine audio-visual speech perception with a multi-layer graph." PLOS ONE 19, no. 3 (March 29, 2024): e0300926. http://dx.doi.org/10.1371/journal.pone.0300926.

Abstract:
To examine visual speech perception (i.e., lip-reading), we created a multi-layer network (the AV-net) that contained: (1) an auditory layer with nodes representing phonological word-forms and edges connecting words that were phonologically related, and (2) a visual layer with nodes representing the viseme representations of words and edges connecting viseme representations that differed by a single viseme (and additional edges to connect related nodes in the two layers). The results of several computer simulations (in which activation diffused across the network to simulate word identification) are reported and compared to the performance of human participants who identified the same words in a condition in which audio and visual information were both presented (Simulation 1), in an audio-only presentation condition (Simulation 2), and a visual-only presentation condition (Simulation 3). Another simulation (Simulation 4) examined the influence of phonological information on visual speech perception by comparing performance in the multi-layer AV-net to a single-layer network that contained only a visual layer with nodes representing the viseme representations of words and edges connecting viseme representations that differed by a single viseme. We also report the results of several analyses of the errors made by human participants in the visual-only presentation condition. The results of our analyses have implications for future research and training of lip-reading, and for the development of automatic lip-reading devices and software for individuals with certain developmental or acquired disorders or for listeners with normal hearing in noisy conditions.
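The visual layer described above, in which nodes carry viseme transcriptions of words and edges connect transcriptions differing by a single viseme, can be sketched with NetworkX; the tiny viseme lexicon below is invented for illustration:

```python
import networkx as nx

# Hypothetical viseme transcriptions: word -> tuple of viseme symbols.
viseme_forms = {
    "bat": ("V1", "V5", "V3"),
    "mat": ("V1", "V5", "V3"),   # identical string to "bat" (homophemes)
    "fat": ("V2", "V5", "V3"),   # differs from "bat"/"mat" by one viseme
    "ban": ("V1", "V5", "V4"),   # differs from "bat"/"mat" by one viseme
}

def differ_by_one(a, b):
    """True when two equal-length viseme strings differ in exactly one slot."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

visual_layer = nx.Graph()
visual_layer.add_nodes_from(viseme_forms)
words = list(viseme_forms)
for i, w1 in enumerate(words):
    for w2 in words[i + 1:]:
        if differ_by_one(viseme_forms[w1], viseme_forms[w2]):
            visual_layer.add_edge(w1, w2)

# "bat" and "mat" each connect to "fat" and "ban"; the identical pair gets no
# differ-by-one edge here.
print(sorted(visual_layer.edges()))
```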
48

Loh, Ngiik Hoon. "Development of Real-Time Lip Sync Animation Framework Based On Viseme Human Speech." Archives of Design Research 112, no. 4 (November 30, 2014): 19. http://dx.doi.org/10.15187/adr.2014.11.112.4.19.

49

Point, S., and C. V. Fourboul. "Le codage a visee theorique." Recherche et Applications en Marketing 21, no. 4 (December 1, 2006): 61–78. http://dx.doi.org/10.1177/076737010602100404.

50

Mahdi, Walid, Salah Werda, and Abdelmajid Ben Hamadou. "A hybrid approach for automatic lip localization and viseme classification to enhance visual speech recognition." Integrated Computer-Aided Engineering 15, no. 3 (May 12, 2008): 253–66. http://dx.doi.org/10.3233/ica-2008-15305.
