To see the other types of publications on this topic, follow the link: Estill voice training systems.

Journal articles on the topic 'Estill voice training systems'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Estill voice training systems.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Steinhauer, Kimberly M., and Mary McDonald Klimek. "Vocal Traditions: Estill Voice Training®." Voice and Speech Review 13, no. 3 (April 21, 2019): 354–59. http://dx.doi.org/10.1080/23268263.2019.1605707.

2

Fantini, Marco, Franco Fussi, Erika Crosetti, and Giovanni Succo. "Estill Voice Training and voice quality control in contemporary commercial singing: an exploratory study." Logopedics Phoniatrics Vocology 42, no. 4 (September 30, 2016): 146–52. http://dx.doi.org/10.1080/14015439.2016.1237543.

3

Grillo, Elizabeth U. "Functional Voice Assessment and Therapy Methods Supported by Telepractice, VoiceEvalU8, and Estill Voice Training." Seminars in Speech and Language 42, no. 01 (January 2021): 041–53. http://dx.doi.org/10.1055/s-0040-1722753.

Abstract:
Functional assessment and therapy methods are necessary for a client-centered approach that addresses the client's vocal needs across all environments. The purpose of this article is to present the approach with the intent to encourage discussion and implementation among educators, clinicians, researchers, and students. The functional approach is defined and its importance is described within the context of the World Health Organization's International Classification of Functioning, Disability, and Health with support provided by synchronous and asynchronous telepractice, the VoiceEvalU8 app, server, and web portal, and a framework that defines voice qualities (e.g., resonance, twang, loud, and others) by the anatomy and physiology of the voice production system (i.e., Estill Figures for Voice). Case scenarios are presented to highlight application of the functional voice approach.
4

Caleo, Susan Bamford. "Many Doors: The Histories and Philosophies of Roy Hart Voice Work and Estill Voice Training." Voice and Speech Review 13, no. 2 (October 30, 2018): 188–200. http://dx.doi.org/10.1080/23268263.2018.1534931.

5

Grillo, Elizabeth U. "A Nonrandomized Trial for Student Teachers of an In-Person and Telepractice Global Voice Prevention and Therapy Model With Estill Voice Training Assessed by the VoiceEvalU8 App." American Journal of Speech-Language Pathology 30, no. 2 (March 26, 2021): 566–83. http://dx.doi.org/10.1044/2020_ajslp-20-00200.

Abstract:
Purpose: This study investigated the effects of the in-person and telepractice Global Voice Prevention and Therapy Model (GVPTM) treatment conditions and a control condition with vocally healthy student teachers. Method: In this single-blinded, nonrandomized trial, 82 participants completed all aspects of the study. Estill Voice Training was used as the stimulability component of the GVPTM to train multiple new voices meeting all the vocal needs of the student teachers. Outcomes were assessed using acoustic, perceptual, and aerodynamic measures captured by the VoiceEvalU8 app at pre and post in fall and during student teaching in spring. Results: Significant improvements were achieved for several acoustic and perceptual measures in the treatment conditions, but not in the control condition. The in-person and telepractice conditions produced similar results. The all-voiced phrase and connected speech were more successful in demonstrating voice change for some of the perturbation measures as compared to sustained /a/. Conclusions: The treatment conditions were successful in improving the participants' voices for fundamental frequency and some acoustic perturbation measures while maintaining the improvements during student teaching. In addition, the treatment conditions were successful in decreasing the negative impact of voice-related quality of life and vocal fatigue during student teaching. Future research should address the effectiveness of the various components of the GVPTM, the application of the GVPTM with patients with voice disorders, the relevance of defining auditory–perceptual terms by the anatomy and physiology of the voice production system (i.e., Estill Voice Training), and the continued use of the VoiceEvalU8 app for clinical voice investigations. Supplemental Material: https://doi.org/10.23641/asha.13626824
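
For readers who want a concrete feel for the acoustic measures named above, the sketch below estimates mean fundamental frequency and a frame-based approximation of local jitter from a sustained /a/ recording. It is a minimal illustration using librosa's pYIN tracker; the file name and thresholds are placeholders, and this is not the VoiceEvalU8 implementation.

```python
import numpy as np
import librosa

# Minimal sketch: mean F0 and an approximate local jitter from a sustained /a/.
# "sustained_a.wav" is an illustrative file name.
y, sr = librosa.load("sustained_a.wav", sr=16000)
f0, voiced, _ = librosa.pyin(y, fmin=65, fmax=600, sr=sr)
f0 = f0[voiced & ~np.isnan(f0)]            # keep voiced, defined frames only

periods = 1.0 / f0                          # frame-level pitch periods (seconds)
mean_f0 = f0.mean()
# Frame-based approximation of cycle-to-cycle (local) jitter, in percent.
jitter_local = np.mean(np.abs(np.diff(periods))) / periods.mean() * 100

print(f"mean F0: {mean_f0:.1f} Hz, local jitter: {jitter_local:.2f} %")
```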
6

Erro, Daniel, Asunción Moreno, and Antonio Bonafonte. "INCA Algorithm for Training Voice Conversion Systems From Nonparallel Corpora." IEEE Transactions on Audio, Speech, and Language Processing 18, no. 5 (July 2010): 944–53. http://dx.doi.org/10.1109/tasl.2009.2038669.

7

Douma, Peter. "Methods and apparatus for training and operating voice recognition systems." Journal of the Acoustical Society of America 102, no. 3 (September 1997): 1285. http://dx.doi.org/10.1121/1.420032.

8

Zotova, Viktoriia. "THE DEVELOPMENT OF CREATIVE ACTIVITY OF TEENAGERS IN THE PROCESS OF VOCAL EDUCATION." Academic Notes Series Pedagogical Science 1, no. 195 (2021): 178–82. http://dx.doi.org/10.36550/2415-7988-2021-1-195-178-182.

Abstract:
This article reveals the peculiarities of the everyday approach to the vocal education of teenage students, refers to contemporary trends in children's vocal education, and considers its transformation in the context of current realities. We have studied theoretical approaches that address the issue of our research from the perspective of new scientific achievements in international vocal pedagogy and have highlighted the most effective ones for teaching vocals to teenagers. We have analyzed current guidelines for organizing vocal practice with the use of interactive techniques and described how they can develop teenagers' creative activity, especially in the sphere of art. Teenage students have a remarkable need to show their skills, especially if they have already achieved a high level of competence and are already high school students. The biggest authority for teens is the approval of peers, and moments of success are essential for such students' self-esteem and further professional development, particularly at this age. On the other hand, our research views creative activity as the basic component of an established education process in which the teenager is the key actor. Interactive techniques are an effective tool for organizing such a process. We have appealed to the experience of acclaimed international schools such as EVT (Estill Voice Training) of the American professor Jo Estill, which draws on the creative potential of the teacher and student, thus enhancing and reinforcing the results of the acquired knowledge. Nowadays the traditional offline forms of delivering lessons are becoming secondary and outdated in the age of online education; they are considered less effective, less relevant, and hardly able to stand the test of time and the global social-economic crisis. Yet owing to the crisis, new ways of developing creative thinking are appearing, which fosters the unleashing of the creative powers of the teacher and, in turn, of the teenage students, who tend to follow the teacher as a professional authority. The core aim of our work was to study current and contemporary tendencies in developing teenagers' creative activity in mastering vocals, wherein lies the timeliness of the issue.
9

Sandage, Mary J., and David D. Pascoe. "Translating Exercise Science Into Voice Care." Perspectives on Voice and Voice Disorders 20, no. 3 (November 2010): 84–89. http://dx.doi.org/10.1044/vvd20.3.84.

Abstract:
The basic principles of exercise training for skeletal muscle adaptations have been applied to voice training for some time. To date, the use of the basic principles of muscle training for designing a voice rehabilitation program or advising voice clients about the role of voice rest and modified voice use following surgical intervention has not been well developed. Voice training is a complex process of skill acquisition through application of motor learning principles and the concurrent coordinated use of many physiologic systems. However, the translation of exercise science literature to voice training and recovery needs to be undertaken with caution, because the function and performance of laryngeal skeletal muscle can be different from those of skeletal muscles used for other types of movement. This discussion will be confined to the basic adaptations of the muscle tissue itself. A brief review of basic principles of muscle training as understood for skeletal muscle will be followed by a more extensive discussion of the neurologic, metabolic, and physiologic adaptations of muscle training and detraining. Translation of this body of literature will be considered in the contexts of post-surgical voice recovery, voice rehabilitation, and maintenance of professional voice requirements.
10

Kolesau, Aliaksei, and Dmitrij Šešok. "Voice Activation for Low-Resource Languages." Applied Sciences 11, no. 14 (July 7, 2021): 6298. http://dx.doi.org/10.3390/app11146298.

Abstract:
Voice activation systems are used to find a pre-defined word or phrase in the audio stream. Industry solutions, such as “OK, Google” for Android devices, are trained with millions of samples. In this work, we propose and investigate several ways to train a voice activation system when the in-domain data set is small. We compare self-training, exemplar pre-training, fine-tuning a model pre-trained on another domain, joint training on both an out-of-domain high-resource and a target low-resource data set, and unsupervised pre-training. In our experiments, the unsupervised pre-training and the joint-training with a high-resource data set from another domain significantly outperform a strong baseline of fine-tuning a model trained on another data set. We obtain 7–25% relative improvement depending on the model architecture. Additionally, we improve the best test accuracy on the Lithuanian data set from 90.77% to 93.85%.
11

Martin, David P., and Virginia I. Wolfe. "Effects of Perceptual Training Based upon Synthesized Voice Signals." Perceptual and Motor Skills 83, no. 3_suppl (December 1996): 1291–98. http://dx.doi.org/10.2466/pms.1996.83.3f.1291.

Abstract:
28 undergraduate students participated in a perceptual voice experiment to assess the effects of training utilizing synthesized voice signals. An instructional strategy based upon synthesized examples of a three-part classification system: “breathy,” “rough,” and “hoarse,” was employed. Training samples were synthesized with varying amounts of jitter (cycle-to-cycle deviation in pitch period) and harmonic-to-noise ratios to represent these qualities. Before training, listeners categorized 60 pathological voices into “breathy,” “rough,” and “hoarse,” largely on the basis of fundamental frequency. After training, categorizations were influenced by harmonic-to-noise ratios as well as fundamental frequency, suggesting that listeners were more aware of spectral differences in pathological voices associated with commonly occurring laryngeal conditions. 40% of the pathological voice samples remained unclassified following training.
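
The training stimuli described above vary jitter and harmonic-to-noise ratio in synthesized signals. As a rough illustration of how such stimuli can be generated, here is a minimal NumPy sketch; the function name, parameter values, and the simple harmonic-plus-noise model are illustrative assumptions, not the synthesis procedure used in the study.

```python
import numpy as np

def synthesize_vowel(f0=120.0, jitter_pct=1.0, hnr_db=20.0,
                     duration=1.0, sr=16000, n_harmonics=10):
    """Crude vowel-like stimulus with controllable cycle-to-cycle jitter
    and harmonics-to-noise ratio (HNR)."""
    rng = np.random.default_rng(0)
    # Draw pitch periods one cycle at a time, perturbing each by the jitter amount.
    periods, total = [], 0.0
    while total < duration:
        p = (1.0 / f0) * (1.0 + rng.normal(0.0, jitter_pct / 100.0))
        periods.append(p)
        total += p
    # Build the periodic (harmonic) component cycle by cycle.
    cycles = []
    for p in periods:
        n = max(int(p * sr), 1)
        t = np.arange(n) / n                       # one cycle, 0..1
        cycles.append(sum(np.sin(2 * np.pi * k * t)
                          for k in range(1, n_harmonics + 1)))
    harmonic = np.concatenate(cycles)
    harmonic /= np.max(np.abs(harmonic))
    # Add noise scaled to reach the requested HNR (in dB).
    noise = rng.normal(0.0, 1.0, harmonic.shape)
    noise *= np.sqrt(np.mean(harmonic ** 2) / 10 ** (hnr_db / 10)) / np.std(noise)
    return (harmonic + noise).astype(np.float32)

# e.g. a "rough" stimulus (high jitter) versus a "breathy" one (low HNR)
rough = synthesize_vowel(jitter_pct=3.0, hnr_db=25.0)
breathy = synthesize_vowel(jitter_pct=0.3, hnr_db=5.0)
```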
12

Ninh Khánh, Duy. "Evaluation of speaker-dependent and average-voice Vietnamese statistical speech synthesis systems." Journal of Science and Technology Issue on Information and Communications Technology 17, no. 12.1 (December 31, 2019): 11. http://dx.doi.org/10.31130/jst-ud2019-035e.

Abstract:
This paper describes the development and evaluation of a Vietnamese statistical speech synthesis system using the average voice approach. Although speaker-dependent systems have been applied extensively, no average voice based system has been developed for Vietnamese so far. We have collected speech data from several Vietnamese native speakers and employed state-of-the-art speech analysis, model training and speaker adaptation techniques to develop the system. Besides, we have performed perceptual experiments to compare the quality of speaker-adapted (SA) voices built on the average voice model and speaker-dependent (SD) voices built on SD models, and to confirm the effects of contextual features including word boundary (WB) and part-of-speech (POS) on the quality of synthetic speech. Evaluation results show that SA voices have significantly higher naturalness than SD voices when the same limited contextual feature set excluding WB and POS is used. In addition, SA voices trained with limited contextual features excluding WB and POS still have better quality than SD voices trained with full contextual features including WB and POS. These results show the robustness of the average voice method over the speaker-dependent approach for Vietnamese statistical speech synthesis.
13

Mittal, Vikas, and R. K. Sharma. "Deep Learning Approach for Voice Pathology Detection and Classification." International Journal of Healthcare Information Systems and Informatics 16, no. 4 (October 2021): 1–30. http://dx.doi.org/10.4018/ijhisi.20211001.oa28.

Abstract:
A non-invasive and robust voice pathology detection and classification architecture is proposed in the current manuscript. In place of the conventional feature-based machine learning techniques, a new architecture is proposed herein which initially performs deep learning-based filtering of the input voice signal, followed by a decision-level fusion of deep learning and a non-parametric learner. The efficacy of the proposed technique is verified by performing a comparative study with very recent work on the same dataset but based on different training algorithms. The proposed architecture has five different stages. The results are recorded in terms of nine (9) different classification score indices, which are: mean average precision, sensitivity, specificity, F1 score, accuracy, error, false-positive rate, Matthews Correlation Coefficient, and Cohen's Kappa index. The experimental results have shown that the use of a machine learning classifier can get at most 96.12% accuracy, while the proposed technique achieved the highest accuracy of 99.14% in comparison to other techniques.
14

Rudinsky, Jan, and Ebba Thora Hvannberg. "Transferability of Voice Communication in Games to Virtual Teams Training for Crisis Management." International Journal of Sociotechnology and Knowledge Development 9, no. 1 (January 2017): 1–25. http://dx.doi.org/10.4018/ijskd.2017010101.

Abstract:
A crisis is an emergency event that can lead to multiple injuries and damage to property or environment. Proper training of crisis management personnel is vital for reducing the impact of a major incident. In search for knowledge on how best to implement communication for virtual environments for training, communication in online games was studied. Findings on voice communication in online games were researched and formulated as a set of statements. By asking participants in an empirical study of crisis management, the statements were either confirmed or refuted. Results show that multiplayer games are highly similar to the requirements for crisis management training in virtual environments. Approximately two-thirds of the statements proved coherent in both domains. The practical significance of this work lies in the provision of design implications for a virtual environment for crisis management training. Thus, this paper contributes to demonstrating the transferability between these domains. Finally, the paper reflects the results in theories of communication and engagement.
15

Schueller, Marianne, Donald Fucci, and Z. S. Bond. "Perceptual Judgment of Voice Pitch during Pitch-Matching Tasks." Perceptual and Motor Skills 94, no. 3 (June 2002): 967–74. http://dx.doi.org/10.2466/pms.2002.94.3.967.

Abstract:
This study investigated the perceptual judgment of voice pitch. 24 individuals were assigned to two groups to assess whether there is a difference in perceptual judgment of voice during pitch-matching tasks. Group I, Naïve listeners, had no previous experience in anatomy, physiology, or voice pitch-evaluation methods. Group II, Experienced listeners, were master's level speech-language pathologists having completed academic training in evaluation of voice. Both groups listened to identical stimuli, which required matching audiotaped voice-pitch samples of a male and female voice to a note on an electronic keyboard. The experiment included two tasks. The first task assessed pitch range, which required matching of the lowest and highest voice pitch of both a male and female speaker singing /a/ to a note on a keyboard. The second task assessed habitual pitch, which required matching of the voice pitch of a word spoken by a male and female speaker to a note on a keyboard. A one-way analysis of variance indicated a significant difference between groups occurred for only one of four conditions measured, perceptual judgment of the female pitch range. No differences between groups were found in the perceptual judgments of the male pitch range or during perceptual judgment of the female or male habitual pitch, suggesting that the skill possessed by speech-language pathology students is no different from that of inexperienced listeners.
16

Moreno, L. C., and P. B. Lopes. "The Voice Biometrics Based on Pitch Replication." International Journal for Innovation Education and Research 6, no. 10 (October 31, 2018): 351–58. http://dx.doi.org/10.31686/ijier.vol6.iss10.1201.

Abstract:
Authentication and security in automated systems have become essential nowadays, and many techniques have been proposed towards this end. One of these alternatives is biometrics, in which human body characteristics are used to authenticate the system user. The objective of this article is to present a method of text-independent speaker identification through the replication of pitch characteristics. Pitch is an important speech feature and is used in a variety of applications, including voice biometrics. The proposed method of speaker identification is based on short segments of speech, namely, three seconds for training and three seconds for the speaker determination. From these segments, pitch characteristics are extracted and used in the proposed replication method to identify the speaker.
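
As a simplified illustration of identifying a speaker from pitch statistics extracted from roughly three-second segments, the sketch below uses librosa's YIN tracker and nearest-profile matching. The feature set and matching rule are stand-ins for the replication method described in the abstract, and the file paths are illustrative.

```python
import numpy as np
import librosa

def pitch_profile(path, sr=16000, offset=0.0, duration=3.0):
    """Summarize a ~3-second segment by simple pitch statistics."""
    y, _ = librosa.load(path, sr=sr, offset=offset, duration=duration)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)
    return np.array([np.median(f0), np.std(f0), np.ptp(f0)])

def identify(enrolled, test_path):
    """enrolled: dict {speaker: wav_path}; returns the closest speaker."""
    test = pitch_profile(test_path)
    return min(enrolled,
               key=lambda spk: np.linalg.norm(pitch_profile(enrolled[spk]) - test))
```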
17

Espín López, Juan Manuel, Alberto Huertas Celdrán, Javier G. Marín-Blázquez, Francisco Esquembre, and Gregorio Martínez Pérez. "S3: An AI-Enabled User Continuous Authentication for Smartphones Based on Sensors, Statistics and Speaker Information." Sensors 21, no. 11 (May 28, 2021): 3765. http://dx.doi.org/10.3390/s21113765.

Abstract:
Continuous authentication systems have been proposed as a promising solution to authenticate users in smartphones in a non-intrusive way. However, current systems have important weaknesses related to the amount of data or time needed to build precise user profiles, together with high rates of false alerts. Voice is a powerful dimension for identifying subjects but its suitability and importance have not been deeply analyzed regarding its inclusion in continuous authentication systems. This work presents the S3 platform, an artificial intelligence-enabled continuous authentication system that combines data from sensors, applications statistics and voice to authenticate users in smartphones. Experiments have tested the relevance of each kind of data, explored different strategies to combine them, and determined how many days of training are needed to obtain good enough profiles. Results showed that voice is much more relevant than sensors and applications statistics when building a precise authenticating system, and the combination of individual models was the best strategy. Finally, the S3 platform reached a good performance with only five days of use available for training the users’ profiles. As an additional contribution, a dataset with 21 volunteers interacting freely with their smartphones for more than sixty days has been created and made available to the community.
18

Carlin, Margaret F., Richard D. Saniga, and Nancy Dennis. "Relationship between Academic Placement and Perception of Abuse of the Voice." Perceptual and Motor Skills 71, no. 1 (August 1990): 299–304. http://dx.doi.org/10.2466/pms.1990.71.1.299.

Abstract:
Informal polling of public school speech-language pathologists indicated that special education teachers referred more children for disorders of voice than did regular classroom educators. This study evaluated the effect of academic placement (regular or special education settings) upon children's and their teachers' ratings of abuse of the voice. Analysis showed the two groups of teachers' criteria for judging abusive vocal behaviors differed while the children's ratings from each setting did not differ. The special educators appeared to perceive their students' vocal behavior as more abusive possibly due to environmental constraints, training or the social affective interactions of their students.
19

Ghai, Bhavya, Buvana Ramanan, and Klaus Mueller. "Does Speech Enhancement of Publicly Available Data Help Build Robust Speech Recognition Systems? (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 10 (April 3, 2020): 13793–94. http://dx.doi.org/10.1609/aaai.v34i10.7168.

Abstract:
Automatic speech recognition (ASR) systems play a key role in many commercial products, including voice assistants. Typically, they require large amounts of high-quality speech data for training, which gives an undue advantage to large organizations that have tons of private data. We investigated whether speech data obtained from publicly available sources can be further enhanced to train better speech recognition models. We begin with noisy/contaminated speech data, apply speech enhancement to produce a 'cleaned' version, and use both versions to train the ASR model. We have found that using speech enhancement gives a 9.5% better word error rate than training on just the original noisy data and 9% better than training on just the ground-truth 'clean' data. Its performance is also comparable to the ideal-case scenario of training on the noisy data and its ground-truth 'clean' version.
20

Anh, Mai Ngoc, and Duong Xuan Bien. "Voice Recognition and Inverse Kinematics Control for a Redundant Manipulator Based on a Multilayer Artificial Intelligence Network." Journal of Robotics 2021 (June 29, 2021): 1–10. http://dx.doi.org/10.1155/2021/5805232.

Abstract:
This study presents the construction of a Vietnamese voice recognition module and inverse kinematics control of a redundant manipulator by using artificial intelligence algorithms. The first deep learning model is built to recognize and convert voice information into input signals of the inverse kinematics problem of a 6-degrees-of-freedom robotic manipulator. The inverse kinematics problem is solved based on the construction and training. The second deep learning model is built using the data determined from the mathematical model of the system’s geometrical structure, the limits of joint variables, and the workspace. The deep learning models are built in the PYTHON language. The efficient operation of the built deep learning networks demonstrates the reliability of the artificial intelligence algorithms and the applicability of the Vietnamese voice recognition module for various tasks.
21

Blakeslee, Sarah D. M. "Special Considerations When Working With the Pediatric Vocal Performer." Perspectives on Voice and Voice Disorders 23, no. 1 (March 2013): 22–27. http://dx.doi.org/10.1044/vvd23.1.22.

Abstract:
Providing care to a performer with vocal injury requires an understanding of the physiology of the singing voice and appreciation for the complexities of the life of a performer. However, when the performer who presents to the clinic is a child or teenager, there are additional challenges, and providing appropriate care requires a special understanding of the changes that take place during vocal development across childhood and adolescence. This includes the physical changes that occur in the respiratory, phonatory, and resonatory systems; the effects of these changes on the singing voice, especially in regard to puberty; and additional challenges one faces when working with a performer who may have minimal training and whose instrument is continuing to develop. A thorough evaluation is necessary before recommending voice therapy. Voice therapy with pediatric vocal performers is similar to voice therapy with adults, but may have a larger focus on education regarding normal anatomy and physiology of the vocal mechanism, vocal hygiene, vocal warm-ups, and basic singing technique.
22

Fan, Ziqi, Yuanbo Wu, Changwei Zhou, Xiaojun Zhang, and Zhi Tao. "Class-Imbalanced Voice Pathology Detection and Classification Using Fuzzy Cluster Oversampling Method." Applied Sciences 11, no. 8 (April 12, 2021): 3450. http://dx.doi.org/10.3390/app11083450.

Abstract:
The Massachusetts Eye and Ear Infirmary (MEEI) database is an international-standard training database for voice pathology detection (VPD) systems. However, there is a class-imbalanced distribution in normal and pathological voice samples and different types of pathological voice samples in the MEEI database. This study aimed to develop a VPD system that uses the fuzzy clustering synthetic minority oversampling technique algorithm (FC-SMOTE) to automatically detect and classify four types of pathological voices in a multi-class imbalanced database. The proposed FC-SMOTE algorithm processes the initial class-imbalanced dataset. A set of machine learning models was evaluated and validated using the resulting class-balanced dataset as an input. The effectiveness of the VPD system with FC-SMOTE was further verified by an external validation set and another pathological voice database (Saarbruecken Voice Database (SVD)). The experimental results show that, in the multi-classification of pathological voice for the class-imbalanced dataset, the method we propose can significantly improve the diagnostic accuracy. Meanwhile, FC-SMOTE outperforms the traditional imbalanced data oversampling algorithms, and it is preferred for imbalanced voice diagnosis in practical applications.
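
The study's FC-SMOTE combines fuzzy clustering with SMOTE-style oversampling. The sketch below shows the general oversample-then-classify pattern using the standard SMOTE from imbalanced-learn as a stand-in (not the fuzzy-cluster variant), with random placeholder features in place of real acoustic data.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# X: acoustic feature vectors (e.g., MFCC statistics), y: pathology labels.
# Random placeholders standing in for a class-imbalanced voice database.
rng = np.random.default_rng(42)
X = rng.normal(size=(600, 20))
y = np.r_[np.zeros(500, int), np.ones(60, int), np.full(40, 2)]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample the minority (pathological) classes before training.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_bal, y_bal)
print(classification_report(y_te, clf.predict(X_te)))
```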
23

Rudramurthy, M. S., V. Kamakshi Prasad, and R. Kumaraswamy. "Speaker Verification Under Degraded Conditions Using Empirical Mode Decomposition Based Voice Activity Detection Algorithm." Journal of Intelligent Systems 23, no. 4 (December 1, 2014): 359–78. http://dx.doi.org/10.1515/jisys-2013-0085.

Abstract:
The performance of most of the state-of-the-art speaker recognition (SR) systems deteriorates under degraded conditions, owing to mismatch between the training and testing sessions. This study focuses on the front end of the speaker verification (SV) system to reduce the mismatch between training and testing. An adaptive voice activity detection (VAD) algorithm using zero-frequency filter assisted peaking resonator (ZFFPR) was integrated into the front end of the SV system. The performance of this proposed SV system was studied under degraded conditions with 50 selected speakers from the NIST 2003 database. The degraded condition was simulated by adding different types of noises to the original speech utterances. The different types of noises were chosen from the NOISEX-92 database to simulate degraded conditions at signal-to-noise ratio levels from 0 to 20 dB. In this study, widely used 39-dimension Mel frequency cepstral coefficient (MFCC; i.e., 13-dimension MFCCs augmented with 13-dimension velocity and 13-dimension acceleration coefficients) features were used, and Gaussian mixture model–universal background model was used for speaker modeling. The proposed system’s performance was studied against the energy-based VAD used as the front end of the SV system. The proposed SV system showed some encouraging results when EMD-based VAD was used at its front end.
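
A compressed sketch of the classical recipe this study builds on (a VAD front end, 39-dimension MFCC features, and Gaussian mixture speaker models) is given below. The energy threshold stands in for the zero-frequency-filter/EMD-based VAD, a per-speaker GMM stands in for full GMM-UBM modeling, and the dictionary of enrollment files is an assumed input format.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_features(path, sr=16000, n_mfcc=13):
    """39-dim MFCC + delta + delta-delta features from the higher-energy
    frames of a recording (a crude stand-in for an EMD-based VAD)."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    feats = np.vstack([mfcc,
                       librosa.feature.delta(mfcc),
                       librosa.feature.delta(mfcc, order=2)])
    energy = librosa.feature.rms(y=y)[0]
    n = min(feats.shape[1], energy.shape[0])
    keep = energy[:n] > 0.5 * np.median(energy)   # simple energy threshold
    return feats[:, :n][:, keep].T                 # frames x 39

def enroll(train_files):
    """train_files: dict {speaker: wav_path}; trains one GMM per speaker."""
    return {spk: GaussianMixture(n_components=16, covariance_type="diag",
                                 random_state=0).fit(mfcc_features(path))
            for spk, path in train_files.items()}

def identify(models, test_path):
    """Pick the speaker whose model gives the highest average log-likelihood."""
    feats = mfcc_features(test_path)
    return max(models, key=lambda spk: models[spk].score(feats))
```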
24

Bassich, Celia J., and Christy L. Ludlow. "The Use of Perceptual Methods by New Clinicians for Assessing Voice Quality." Journal of Speech and Hearing Disorders 51, no. 2 (May 1986): 125–33. http://dx.doi.org/10.1044/jshd.5102.125.

Abstract:
The purpose of this study was to determine the validity and reliability of using perceptual ratings for assessing voice quality in patients with vocal fold nodules or polyps. A 13-dimension perceptual rating system was modeled after systems currently in clinical use. To meet the criterion of 80% mean interjudge reliability, eight hours of training were required for four previously inexperienced listeners. Extended vowel phonations of patients and controls were then rated blindly by the same listeners. Interjudge reliability was greater than .90 for three dimensions judged in the pathological phonations, while intrajudge test-retest agreement was less than 75% on five dimensions. Validity was demonstrated with 100% correct assignment to group by computing a discriminant function employing all dimensions. Despite the extensive training procedures used, our reliability data were not comparable to those reported when highly experienced judges have been used, suggesting that the task of perceptually rating voice quality is difficult and requires extensive professional experience.
25

Bakir, Cigdem. "Speech recognition system for Turkish language with hybrid method." Global Journal of Computer Sciences: Theory and Research 7, no. 1 (November 27, 2017): 48–57. http://dx.doi.org/10.18844/gjcs.v7i1.2699.

Abstract:
Currently, technological developments are accompanied by a number of associated problems, with security taking first place among them. In particular, biometric systems such as authentication constitute a significant fraction of the security problem, because sound recordings connected with various crimes must be analysed for forensic purposes. Authentication systems necessitate transmission, design, and classification of biometric data in a secure manner. The aim of this study is to implement an automatic voice and speech recognition system using the wavelet transform, taking Turkish sound forms and properties into consideration. Approximately 3740 Turkish voice samples of words and clauses of differing lengths were collected from 25 males and 25 females. The features of these voice samples were obtained using Mel-frequency cepstral coefficients (MFCCs), Mel-frequency discrete wavelet coefficients (MFDWCs), and linear prediction cepstral coefficients (LPCCs). The resulting feature vectors were trained with k-means, an artificial neural network (ANN), and a hybrid model. The hybrid model was formed by combining k-means clustering with an ANN: in the first phase, k-means partitioned the voice feature vectors into subsets; in the second phase, training and test sets were formed from these sub-clusters. Training on more suitable data obtained by clustering thus increased the accuracy. In the test phase, the owner of a given voice sample was identified by comparison with the trained voice samples. The results and performance of the algorithms used for classification are also demonstrated in a comparative manner. Keywords: speech recognition, hybrid model, k-means, artificial neural network (ANN).
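
One plausible way to wire a k-means + ANN hybrid of the kind described above is sketched below with scikit-learn: cluster the feature vectors, train one small network per cluster, and route test samples to the network of their nearest cluster. The class name, cluster count, and network size are illustrative assumptions, and X and y are expected as NumPy arrays.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

class KMeansANNHybrid:
    """Cluster the feature space with k-means, then train one small ANN per
    cluster; a test sample is routed to the ANN of its nearest cluster."""
    def __init__(self, n_clusters=4):
        self.kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
        self.experts = {}

    def fit(self, X, y):
        clusters = self.kmeans.fit_predict(X)
        for c in np.unique(clusters):
            idx = clusters == c
            if np.unique(y[idx]).size < 2:
                self.experts[c] = y[idx][0]      # degenerate cluster: one class only
            else:
                self.experts[c] = MLPClassifier(hidden_layer_sizes=(64,),
                                                max_iter=500,
                                                random_state=0).fit(X[idx], y[idx])
        return self

    def predict(self, X):
        clusters = self.kmeans.predict(X)
        preds = []
        for c, x in zip(clusters, X):
            expert = self.experts[c]
            preds.append(expert.predict(x.reshape(1, -1))[0]
                         if hasattr(expert, "predict") else expert)
        return np.array(preds)
```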
26

Krause, Michael, Meinard Müller, and Christof Weiß. "Singing Voice Detection in Opera Recordings: A Case Study on Robustness and Generalization." Electronics 10, no. 10 (May 20, 2021): 1214. http://dx.doi.org/10.3390/electronics10101214.

Abstract:
Automatically detecting the presence of singing in music audio recordings is a central task within music information retrieval. While modern machine-learning systems produce high-quality results on this task, the reported experiments are usually limited to popular music and the trained systems often overfit to confounding factors. In this paper, we aim to gain a deeper understanding of such machine-learning methods and investigate their robustness in a challenging opera scenario. To this end, we compare two state-of-the-art methods for singing voice detection based on supervised learning: A traditional approach relying on hand-crafted features with a random forest classifier, as well as a deep-learning approach relying on convolutional neural networks. To evaluate these algorithms, we make use of a cross-version dataset comprising 16 recorded performances (versions) of Richard Wagner’s four-opera cycle Der Ring des Nibelungen. This scenario allows us to systematically investigate generalization to unseen versions, musical works, or both. In particular, we study the trained systems’ robustness depending on the acoustic and musical variety, as well as the overall size of the training dataset. Our experiments show that both systems can robustly detect singing voice in opera recordings even when trained on relatively small datasets with little variety.
27

Borisov, V. E., V. A. Borsoev, and A. A. Bondarenko. "Development of advanced voice-supported simulators with the function of automated estimation of air traffic controllers skills." Civil Aviation High Technologies 23, no. 6 (December 31, 2020): 8–19. http://dx.doi.org/10.26467/2079-0619-2020-23-6-8-19.

Abstract:
According to the World Health Organization, the number of potential pathogens worldwide is very high, which increases the likelihood of a new pandemic. The impact of the new coronavirus infection (Covid-19) on all spheres of human activity, including the air transport industry, has shown that it is necessary to take into account how the industry can function under the new conditions. The research considered the possibility of using automated modular training systems for training air traffic controllers in remote-access mode. Well-known simulators do not implement a justified instrumental procedure for measuring acquired air traffic services skills; the assessment of their development is carried out by the instructor, who reacts to the student's actions on the basis of his experience. It is difficult for the instructor to track the development of a student's individual skills, and he has to rely on his own experience. To simulate the controller-pilot loop, pseudo-pilots are involved, manually changing the flight parameters of the aircraft and simulating R/T communication. Well-known simulators also do not allow independent training. As a result, a conceptual design was formed and a promising simulator with automated training and voice support was developed. The effectiveness of the proposed solutions was tested in comparison with the traditional approach to simulator training, and it was found that after using the special simulator, students' mistakes decreased. Subsequently, the simulator was used for practical training of students in distance learning under pandemic (Covid-19) conditions. The project demonstrated its viability and the ability to conduct remote training of air traffic controllers after appropriate refinement of the promising simulator.
28

Hayden, Eugene, Kang Wang, Chengjie Wu, and Shi Cao. "Augmented Reality Procedure Assistance System for Operator Training and Simulation." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 64, no. 1 (December 2020): 1176–80. http://dx.doi.org/10.1177/1071181320641281.

Abstract:
This study explores the design, implementation, and evaluation of an Augmented Reality (AR) prototype that assists novice operators in performing procedural tasks in simulator environments. The prototype uses an optical see-through head-mounted display (OST HMD) in conjunction with a simulator display to supplement sequences of interactive visual and attention-guiding cues to the operator’s field of view. We used a 2x2 within-subject design to test two conditions (with/without AR cues); each condition had a voice assistant and two procedural tasks (preflight and landing). An experiment examined twenty-six novice operators. The results demonstrated that augmented reality had benefits in terms of improved situation awareness and accuracy; however, it yielded a longer task completion time by creating a speed-accuracy trade-off effect in favour of accuracy. No significant effect on mental workload was found. The results suggest that augmented reality systems have the potential to be used by a wider audience of operators.
29

Mitsuhara, Hiroyuki, Keisuke Iguchi, and Masami Shishibori. "Using Digital Game, Augmented Reality, and Head Mounted Displays for Immediate-Action Commander Training." International Journal of Emerging Technologies in Learning (iJET) 12, no. 02 (February 28, 2017): 101. http://dx.doi.org/10.3991/ijet.v12i02.6303.

Abstract:
Disaster education focusing on how we should take immediate actions after disasters strike is essential to protect our lives. However, children find it difficult to understand such disaster education. Instead of relying on disaster education for children, adults should properly instruct them to take immediate actions in the event of a disaster. We refer to such adults as Immediate-Action Commanders (IACers) and attach importance to technology-enhanced IACer training programs with high situational and audio-visual realities. To realize such programs, we focused on digital games, augmented reality (AR), and head-mounted displays (HMDs). We prototyped three AR systems that superimpose interactive virtual objects onto HMDs’ real-time vision or a trainee’s actual view based on interactive fictional scenarios. In addition, the systems are designed to realize voice-based interactions between the virtual objects (i.e., virtual children) and the trainee. According to a brief comparative survey, the AR system equipped with a smartphone-based binocular opaque HMD (Google Cardboard) is the most promising practical system for technology-enhanced IACer training programs.
30

Simon, Ellen, and Torsten Leuschner. "Laryngeal Systems in Dutch, English, and German: A Contrastive Phonological Study on Second and Third Language Acquisition." Journal of Germanic Linguistics 22, no. 4 (December 2010): 403–24. http://dx.doi.org/10.1017/s1470542710000127.

Abstract:
Although Dutch, English, and German all have a phonological contrast between voiced and voiceless plosives, they differ in the way these stops are realized. While English and German contrast voiceless aspirated with phonetically voiceless stops, Dutch has a contrast between voiceless unaspirated and prevoiced stops. This study compares these three laryngeal stop systems and examines the acquisition of the English and German systems by a group of native speakers of Dutch. The analysis reveals that both trained and untrained participants transferred prevoicing from Dutch into English and German but acquired aspiration and thus showed a “mixed” laryngeal system in both their L2 (English) and their L3 (German). Since even untrained participants produced voiceless stops in the target Voice Onset Time range, pronunciation training has only a moderate effect.*
31

Gutiérrez-Muñoz, Michelle, Astryd González-Salazar, and Marvin Coto-Jiménez. "Evaluation of Mixed Deep Neural Networks for Reverberant Speech Enhancement." Biomimetics 5, no. 1 (December 20, 2019): 1. http://dx.doi.org/10.3390/biomimetics5010001.

Abstract:
Speech signals are degraded in real-life environments, as a product of background noise or other factors. The processing of such signals for voice recognition and voice analysis systems presents important challenges. One of the conditions that make adverse quality difficult to handle in those systems is reverberation, produced by sound wave reflections that travel from the source to the microphone in multiple directions. To enhance signals in such adverse conditions, several deep learning-based methods have been proposed and proven to be effective. Recently, recurrent neural networks, especially those with long short-term memory (LSTM), have presented surprising results in tasks related to time-dependent processing of signals, such as speech. One of the most challenging aspects of LSTM networks is the high computational cost of the training procedure, which has limited extended experimentation in several cases. In this work, we present a proposal to evaluate the hybrid models of neural networks to learn different reverberation conditions without any previous information. The results show that some combinations of LSTM and perceptron layers produce good results in comparison to those from pure LSTM networks, given a fixed number of layers. The evaluation was made based on quality measurements of the signal’s spectrum, the training time of the networks, and statistical validation of results. In total, 120 artificial neural networks of eight different types were trained and compared. The results help to affirm the fact that hybrid networks represent an important solution for speech signal enhancement, given that reduction in training time is on the order of 30%, in processes that can normally take several days or weeks, depending on the amount of data. The results also present advantages in efficiency, but without a significant drop in quality.
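
A minimal Keras sketch of one such hybrid configuration is shown below: an LSTM layer followed by a per-frame dense (perceptron) layer mapping reverberant spectral frames to clean ones, with a pure-LSTM variant for comparison. Layer sizes and the feature dimension are illustrative and do not reproduce the 120 architectures evaluated in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_enhancer(n_features=257, hybrid=True):
    """Map reverberant log-spectral frames to clean ones. With hybrid=True the
    second recurrent layer is replaced by a per-frame dense (perceptron) layer."""
    inputs = tf.keras.Input(shape=(None, n_features))      # (time, features)
    x = layers.LSTM(256, return_sequences=True)(inputs)
    if hybrid:
        x = layers.TimeDistributed(layers.Dense(256, activation="relu"))(x)
    else:
        x = layers.LSTM(256, return_sequences=True)(x)
    outputs = layers.TimeDistributed(layers.Dense(n_features))(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model

# model.fit(reverberant_frames, clean_frames, epochs=...) trains the mapping;
# the hybrid stack is typically cheaper to train than a second LSTM layer.
```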
32

Kochan, Thomas A. "Adapting Industrial Relations to Serve Knowledge-based Economies." Journal of Industrial Relations 48, no. 1 (February 2006): 7–20. http://dx.doi.org/10.1177/0022185606059311.

Abstract:
The central challenge facing industrial relations today is how to adapt its policies, institutions, practices, and research to serve the needs of the workforce and society in a global, knowledge-based economy. The field of industrial relations rose to prominence in the 20th century because it helped workers and employers adapt to their growing industrial economies. Today, we observe that many of the institutions and policies developed for the industrial era are in decline. A similar transformation of policies, institutions, and practices will be needed to help workers, families, communities, and societies adapt to the requirements of a knowledge-based, global economy. This will require renewed commitment to universal and life-long education and training, broad diffusion of knowledge-based work systems in organizations, more transparency and more direct worker voice in corporate governance structures and processes, flexible labor market policies that support mobility and portability of benefits across jobs and movement in and out of full time work as women and men move through different stages of their careers and family lives, and new institutions for worker voice and representation at work and in society. Given the global nature of economic activity, these reforms cannot be limited to single national systems; they must be part of a broader international consensus and coordinated effort to build transnational systems for managing cross border flows of human capital, jobs, knowledge, and value.
33

Hoanca, Bogdan, and Richard Whitney. "Taking a Byte of Telephony Costs." Journal of Cases on Information Technology 12, no. 4 (October 2010): 18–34. http://dx.doi.org/10.4018/jcit.2010100102.

Abstract:
In 2006, the University of Alaska Anchorage (UAA) upgraded the telephone system at its main campus in Anchorage from a traditional private branch exchange (PBX) architecture to a Voice over Internet Protocol (VoIP) system. This case describes the organizational decisions that led to the change; the scope and the process of upgrading; and the current status of the new VoIP system. The actual migration to VoIP was completed less than a year after the start of the project. The transition process went smoothly. User satisfaction with the performance of the VoIP system is very high. Based on extensive interviews with decision makers and the technical personnel involved, this case also describes financial considerations (including “creative” ways to stretch a limited budget), outsourcing considerations, training related issues, as well as lessons learned.
34

Rudramurthy, M. S., Nilabh Kumar Pathak, V. Kamakshi Prasad, and R. Kumaraswamy. "Speaker Identification Using Empirical Mode Decomposition-Based Voice Activity Detection Algorithm under Realistic Conditions." Journal of Intelligent Systems 23, no. 4 (December 1, 2014): 405–21. http://dx.doi.org/10.1515/jisys-2013-0089.

Abstract:
Speaker recognition (SR) under mismatched conditions is a challenging task. Speech signal is nonlinear and nonstationary, and therefore, difficult to analyze under realistic conditions. Also, in real conditions, the nature of the noise present in speech data is not known a priori. In such cases, the performance of speaker identification (SI) or speaker verification (SV) degrades considerably under realistic conditions. Any SR system uses a voice activity detector (VAD) as the front-end subsystem of the whole system. The performance of most VADs deteriorates at the front end of the SR task or system under degraded conditions or in realistic conditions where noise plays a major role. Recently, speech data analysis and processing using Norden E. Huang’s empirical mode decomposition (EMD) combined with Hilbert transform, commonly referred to as Hilbert–Huang transform (HHT), has become an emerging trend. EMD is an a posteriori, adaptive, data analysis tool used in time domain that is widely accepted by the research community. Recently, speech data analysis and speech data processing for speech recognition and SR tasks using EMD have been increasing. EMD-based VAD has become an important adaptive subsystem of the SR system that mostly mitigates the effect of mismatch between the training and the testing phase. Recently, we have developed a VAD algorithm using a zero-frequency filter-assisted peaking resonator (ZFFPR) and EMD. In this article, the efficacy of an EMD-based VAD algorithm is studied at the front end of a text-independent language-independent SI task for the speaker’s data collected in three languages at five different places, such as home, street, laboratory, college campus, and restaurant, under realistic conditions using EDIROL-R09 HR, a 24-bit wav/MP3 recorder. The performance of this proposed SI task is compared against the traditional energy-based VAD in terms of percentage identification rate. In both cases, widely accepted Mel frequency cepstral coefficients are computed by employing frame processing (20-ms frame size and 10-ms frame shift) from the extracted voiced speech regions using the respective VAD techniques from the realistic speech utterances, and are used as a feature vector for speaker modeling using popular Gaussian mixture models. The experimental results showed that the proposed SI task with the VAD algorithm using ZFFPR and EMD at its front end performs better than the SI task with short-term energy-based VAD when used at its front end, and is somewhat encouraging.
35

Marasek, Krzysztof, Danijel Koržinek, and Łukasz Brocki. "System for Automatic Transcription of Sessions of the Polish Senate." Archives of Acoustics 39, no. 4 (March 1, 2015): 501–9. http://dx.doi.org/10.2478/aoa-2014-0054.

Abstract:
This paper describes research behind a Large-Vocabulary Continuous Speech Recognition (LVCSR) system for the transcription of Senate speeches for the Polish language. The system utilizes several components: a phonetic transcription system, language and acoustic model training systems, a Voice Activity Detector (VAD), an LVCSR decoder, and a subtitle generator and presentation system. Some of the modules relied on already available tools and some had to be made from scratch, but the authors ensured that they used the most advanced techniques available to them at the time. Finally, several experiments were performed to compare the performance of both more modern and more conventional technologies.
36

Cao, Qianyu, and Hanmei Hao. "Optimization of Intelligent English Pronunciation Training System Based on Android Platform." Complexity 2021 (March 26, 2021): 1–11. http://dx.doi.org/10.1155/2021/5537101.

Abstract:
Spoken English, as a language tool, is an important and essential part of English learning. For nonnative English learners, effective and meaningful voice feedback is very important. At present, most traditional recognition and error-correction systems for oral English training are still at the theoretical stage, and the corresponding high-end experimental prototypes suffer from being large and complex. Traditional speech recognition technology is also imperfect in recognition ability and accuracy, relies too heavily on recognizing speech content, and is easily affected by noisy environments. Based on this, this paper develops and designs a spoken-English pronunciation training assistant system for the Android smartphone platform. Building on an in-depth study and analysis of spoken-English speech correction algorithms and speech feedback mechanisms, the paper proposes a lip-motion judgment algorithm based on ultrasonic detection, which assists the traditional speech recognition algorithm in a double feedback judgment. In the feedback mechanism of intelligent speech training, a double-benchmark scoring mechanism is introduced to comprehensively evaluate the trainer's speech and correct the speaker's speech in time. The experimental results show that the speech accuracy of the system reaches 85%, which improves the level of oral English trainers to a certain extent.
37

Torres, Adalberto, David E. Milov, Daniela Melendez, Joseph Negron, John J. Zhao, and Stephen T. Lawless. "A new approach to alarm management: mitigating failure-prone systems." Journal of Hospital Administration 3, no. 6 (October 9, 2014): 79. http://dx.doi.org/10.5430/jha.v3n6p79.

Abstract:
Alarm management that effectively reduces alarm fatigue and improves patient safety has yet to be convincingly demonstrated. The leaders of our newly constructed children’s hospital envisioned and created a hospital department dedicated to tackling this daunting task. The Clinical Logistics Center (CLC) is the hospital’s hub where all of its monitoring technology is integrated and tracked twenty-four hours a day, seven days a week by trained paramedics. Redundancy has been added to the alarm management process through automatic escalation of alarms from bedside staff to CLC staff in a timely manner. The paramedic alerting the bedside staff to true alarms based on good signal quality and confirmed by direct visual confirmation of the patient through bedside cameras distinguishes true alarms from nuisance/false alarms in real time. Communication between CLC and bedside staff occurs primarily via smartphone texts to avoid disruption of clinical activities. The paramedics also continuously monitor physiologic variables for early indicators of clinical deterioration, which leads to early interventions through mechanisms such as rapid response team activation. Hands-free voice communication via room intercoms facilitates CLC logistical support of the bedside staff during acute clinical crises/resuscitations. Standard work is maintained through protocol-driven process steps and serial training of both bedside and CLC staff. This innovative approach to prioritize alarms for the bedside staff is a promising solution to improving alarm management.
38

Torre, Iván G., Mónica Romero, and Aitor Álvarez. "Improving Aphasic Speech Recognition by Using Novel Semi-Supervised Learning Methods on AphasiaBank for English and Spanish." Applied Sciences 11, no. 19 (September 24, 2021): 8872. http://dx.doi.org/10.3390/app11198872.

Abstract:
Automatic speech recognition in patients with aphasia is a challenging task for which studies have been published in a few languages. Reasonably, the systems reported in the literature within this field show significantly lower performance than those focused on transcribing non-pathological clean speech. It is mainly due to the difficulty of recognizing a more unintelligible voice, as well as due to the scarcity of annotated aphasic data. This work is mainly focused on applying novel semi-supervised learning methods to the AphasiaBank dataset in order to deal with these two major issues, reporting improvements for the English language and providing the first benchmark for the Spanish language for which less than one hour of transcribed aphasic speech was used for training. In addition, the influence of reinforcing the training and decoding processes with out-of-domain acoustic and text data is described by using different strategies and configurations to fine-tune the hyperparameters and the final recognition systems. The interesting results obtained encourage extending this technological approach to other languages and scenarios where the scarcity of annotated data to train recognition models is a challenging reality.
39

Saba, Farhad, and David Twitchell. "Integrated Services Digital Networks: How it Can Be Used for Distance Education." Journal of Educational Technology Systems 17, no. 1 (September 1988): 15–25. http://dx.doi.org/10.2190/4axa-cgn6-dm21-1ap7.

Abstract:
Integrated Services Digital Networks (ISDN) is an emerging telecommunications technology that will affect how distance education is designed, developed, and presented. Combined with integrated desktop workstations, it will put voice, text, and video telecommunications at the fingertips of educators and learners. Appropriate use of ISDN for distance education and training depends on the development of a conceptual scheme to integrate models from several closely related areas such as integrated information systems, educational broadcasting, computer assisted instruction, and distance education. In a project funded by Northern Telecom Inc., faculty and students of the Department of Educational Technology at San Diego State University are presently involved in evaluating an integrated telecommunications system and developing an integrated model of distance education.
APA, Harvard, Vancouver, ISO, and other styles
40

Ou, Soobin, Huijin Park, and Jongwoo Lee. "Implementation of an Obstacle Recognition System for the Blind." Applied Sciences 10, no. 1 (December 30, 2019): 282. http://dx.doi.org/10.3390/app10010282.

Full text
Abstract:
The blind encounter commuting risks, such as failing to recognize and avoid obstacles while walking, yet protective support systems are lacking. Acoustic signals at crosswalk lights are activated by button or remote control; however, these signals are difficult to operate and not always available (i.e., broken). Bollards are posts installed for pedestrian safety, but they can create dangerous situations because blind pedestrians cannot see them. We therefore propose an obstacle recognition system to assist the blind in walking safely outdoors; the system recognizes two obstacle types (crosswalk lights and bollards), trained with the Google Object Detection application programming interface (API) based on TensorFlow, and guides the user accordingly. Recognition results are relayed to the user through real-time voice guidance. The single shot multibox detector (SSD) MobileNet and Faster region-based convolutional neural network (Faster R-CNN) models were applied to evaluate the obstacle recognition system; the latter model demonstrated better performance. Crosswalk-light recognition performed better during the day than at night and was further analyzed to determine whether the user could cross, while bollard locations were analyzed to guide the user by voice.
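The detect-then-speak loop described above can be sketched as follows: run an SSD MobileNet detector on a camera frame and announce any confident detections by synthesized voice. This is a rough illustration only; the TF Hub model handle and the class-id mapping for "crosswalk light" and "bollard" are assumptions (the paper fine-tuned its own model on custom classes with the TensorFlow Object Detection API).

```python
# Sketch: object detection on one frame followed by spoken announcements.
# The model handle below is a stock COCO-trained detector; CLASS_NAMES is a
# hypothetical mapping standing in for the paper's custom-trained classes.
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import pyttsx3
from PIL import Image

detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")
tts = pyttsx3.init()
CLASS_NAMES = {1: "crosswalk light", 2: "bollard"}  # hypothetical mapping

def announce_obstacles(image_path, score_threshold=0.5):
    frame = np.array(Image.open(image_path).convert("RGB"))       # uint8 HxWx3
    result = detector(tf.expand_dims(frame, axis=0))              # batch of 1
    classes = result["detection_classes"][0].numpy().astype(int)
    scores = result["detection_scores"][0].numpy()
    for cls, score in zip(classes, scores):
        name = CLASS_NAMES.get(cls)
        if name and score >= score_threshold:
            tts.say(f"{name} ahead")
    tts.runAndWait()

announce_obstacles("street_frame.jpg")  # placeholder image path
```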
APA, Harvard, Vancouver, ISO, and other styles
41

Mohammed, Rawia A., Nidaa F. Hassan, and Akbas E. Ali. "Arabic Speaker Identification System Using Multi Features." Engineering and Technology Journal 38, no. 5A (May 25, 2020): 769–78. http://dx.doi.org/10.30684/etj.v38i5a.408.

Full text
Abstract:
The performance of Speaker Identification Systems (SIS) has improved thanks to recent developments in speech processing methods; however, text-independent speaker identification in the Arabic language still requires improvement. In spite of tremendous progress in applied technology for SIS, it remains largely limited to English and some other languages. This paper aims to design an efficient text-independent SIS for the Arabic language. The proposed system uses speech-signal features for speaker identification and includes two phases. The first phase is training, in which a corpus of reference recordings is built to serve as the reference for comparing and identifying speakers in the second phase. The second phase is testing, which performs the actual speaker identification. Features are extracted using Mel Frequency Cepstral Coefficients (MFCC) together with calculations of voice frequency and voice fundamental frequency. Three machine-learning classification techniques are used: K-nearest neighbors, Sequential Minimal Optimization, and Logistic Model Tree. K-nearest neighbors is the best classification technique, giving the highest precision of 94.8%.
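A minimal version of this pipeline is sketched below: per-utterance MFCC statistics classified with K-nearest neighbors. The file paths, labels, and feature summary are placeholders; the paper additionally uses voice-frequency and fundamental-frequency features and compares further classifiers.

```python
# Minimal text-independent speaker-identification sketch:
# MFCC features summarised per utterance, classified with K-nearest neighbors.
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

def utterance_features(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    # Mean and standard deviation over time give a fixed-length vector.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

paths = ["spk1_a.wav", "spk1_b.wav", "spk2_a.wav", "spk2_b.wav"]  # placeholders
labels = ["spk1", "spk1", "spk2", "spk2"]
X = np.stack([utterance_features(p) for p in paths])

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, stratify=labels)
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)
print("accuracy:", knn.score(X_test, y_test))
```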
APA, Harvard, Vancouver, ISO, and other styles
42

Kaizer, Betânia Mafra, Carlos Eduardo Sanches da Silva, Thaís Zerbini, and Anderson Paulo Paiva. "E-learning training in work corporations: a review on instructional planning." European Journal of Training and Development 44, no. 8/9 (May 29, 2020): 761–81. http://dx.doi.org/10.1108/ejtd-03-2020-0042.

Full text
Abstract:
Purpose: The purpose of this study is a bibliometric and descriptive review of the literature on the instructional planning of training offered in the e-learning modality in work corporations, to identify methodologies and experiences that can serve as a model for professionals planning e-learning training in the corporate context. Design/methodology/approach: The timeline from 2010 to 2020 was adopted. Data were extracted from five databases and compiled in the software Zotero. Based on defined criteria, 260 productions were identified. The interrelation and metric presentation of the data from these studies were done in the software VosViewer. Subsequently, only free-access papers were selected, resulting in 64 publications. From these, the authors chose six empirical studies for a descriptive analysis based on specific criteria. Findings: The range of hardware and software platforms has stimulated the use of virtual reality (VR), augmented reality (AR) and artificial intelligence (AI) resources in corporate training. The use of management tools such as voice of customer (VOC) and quality function deployment (QFD) can support those responsible for instructional planning. The literature presented important elements that should be considered for the proper planning of e-learning training: for the learner, feedback, control of the self-learning process, and classification of cultural profiles when participants are geographically distant; for training management, content and delivery mode of instruction. Originality/value: The authors selected six empirical studies that presented models, systems or experiences of training planning to support decisions in this area. This study contributes to the area of T&D by showing an updated context of practices for the implementation of training systems adopted in several countries. The authors present quantitative indicators of scientific production using two additional software tools to support the bibliometric review, namely Zotero and VosViewer. This study used five databases and a research equation to systematically present the current panorama of research on training planning from the perspective of management and organizational psychology.
APA, Harvard, Vancouver, ISO, and other styles
43

Kaizer, Betânia Mafra, Carlos Eduardo Sanches Silva, Anderson Paulo de Pavia, and Thaís Zerbini. "E-learning training in work corporations: a review on instructional planning." European Journal of Training and Development 44, no. 6/7 (May 11, 2020): 615–36. http://dx.doi.org/10.1108/ejtd-08-2019-0149.

Full text
Abstract:
Purpose: The main purpose of this work is a bibliometric and descriptive review of the literature on the instructional planning of training offered in the e-learning modality in work corporations, to identify methodologies and experiences that can serve as a model for professionals planning e-learning training in the corporate context. Design/methodology/approach: The timeline from 2010 to 2020 was adopted. Data were extracted from five databases and compiled in the software Zotero. Based on defined criteria, 260 productions were identified. The interrelation and metric presentation of the data from these studies were done in the software VosViewer. Subsequently, only free-access papers were selected, resulting in 64 publications. From these, we chose six empirical studies for a descriptive analysis based on specific criteria. Findings: The range of hardware and software platforms has stimulated the use of virtual reality (VR), augmented reality (AR) and artificial intelligence (AI) resources in corporate training. The use of management tools such as Voice of Customer (VOC) and Quality Function Deployment (QFD) can support those responsible for instructional planning. The literature presented important elements that should be considered for the proper planning of e-learning training: for the learner, feedback, control of the self-learning process, and classification of cultural profiles when participants are geographically distant; for training management, content and delivery mode of instruction. Originality/value: We selected six empirical studies that presented models, systems or experiences of training planning to support decisions in this area. This article contributes to the area of T&D by showing an updated context of practices for the implementation of training systems adopted in several countries. We present quantitative indicators of scientific production using two additional software tools to support the bibliometric review: Zotero and VosViewer. This article used five databases and a research equation to systematically present the current panorama of research on training planning from the perspective of management and organizational psychology.
APA, Harvard, Vancouver, ISO, and other styles
44

Mairittha, Tittaya, Nattaya Mairittha, and Sozo Inoue. "Automatic Labeled Dialogue Generation for Nursing Record Systems." Journal of Personalized Medicine 10, no. 3 (July 16, 2020): 62. http://dx.doi.org/10.3390/jpm10030062.

Full text
Abstract:
The integration of digital voice assistants in nursing residences is becoming increasingly important to facilitate nursing productivity with documentation. A key idea behind such a system is training natural language understanding (NLU) modules that enable the machine to classify the purpose of a user utterance (intent) and extract pieces of valuable information present in the utterance (entities). One of the main obstacles when creating robust NLU is the lack of sufficient labeled data, which generally relies on human labeling. This process is cost-intensive and time-consuming, particularly in the high-level nursing-care domain, which requires abstract knowledge. In this paper, we propose an automatic dialogue-labeling framework for NLU tasks, specifically for nursing record systems. First, we apply data augmentation techniques to create a collection of variant sample utterances. Individual evaluation of these utterances shows strong results for both fluency and accuracy. We also investigate the possibility of applying deep generative models to our augmented dataset; a preliminary character-based model based on long short-term memory (LSTM) obtains an accuracy of 90% and generates varied, reasonable texts with BLEU scores of 0.76. Second, we introduce an approach to intent and entity labeling that uses feature embeddings and semantic-similarity-based clustering. We also empirically evaluate different embedding methods for learning representations that are most suitable for our data and clustering tasks. Experimental results show that fastText embeddings perform strongly for both intent labeling and entity labeling, achieving scores of 0.79 and 0.78 (F1) and 0.67 and 0.61 (silhouette), respectively.
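The embedding-plus-clustering idea can be illustrated with a small sketch: embed each utterance with fastText, cluster the embeddings, and assign an intent label per cluster rather than per utterance. The toy utterances, cluster count, and hyperparameters below are illustrative only and do not reproduce the paper's setup.

```python
# Sketch of semi-automatic intent labeling: fastText utterance embeddings,
# K-means clustering, silhouette score as a clustering-quality check.
import numpy as np
from gensim.models import FastText
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

utterances = [
    "record blood pressure for room 12",
    "log the patient's blood pressure",
    "what medication is scheduled tonight",
    "show tonight's medication schedule",
]
tokens = [u.split() for u in utterances]

ft = FastText(sentences=tokens, vector_size=50, window=3, min_count=1, epochs=100)
vectors = np.stack([np.mean([ft.wv[w] for w in toks], axis=0) for toks in tokens])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)
print("cluster ids:", kmeans.labels_)
print("silhouette:", silhouette_score(vectors, kmeans.labels_))
# Each cluster is then reviewed once and mapped to an intent label
# (e.g., "record_vitals", "query_medication"), labeling many utterances at once.
```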
APA, Harvard, Vancouver, ISO, and other styles
45

Truong, Do Quoc, Pham Ngoc Phuong, Tran Hoang Tung, and Luong Chi Mai. "DEVELOPMENT OF HIGH-PERFORMANCE AND LARGE-SCALE VIETNAMESE AUTOMATIC SPEECH RECOGNITION SYSTEMS." Journal of Computer Science and Cybernetics 34, no. 4 (January 30, 2019): 335–48. http://dx.doi.org/10.15625/1813-9663/34/4/13165.

Full text
Abstract:
Automatic Speech Recognition (ASR) systems convert human speech into the corresponding transcription automatically. They have a wide range of applications, such as controlling robots, call-center analytics, and voice chatbots. Recent studies on ASR for English have achieved performance that surpasses human ability; the systems were trained on large amounts of training data and performed well in many environments. With regard to Vietnamese, there have been many studies on improving the performance of existing ASR systems; however, many of them were conducted on small-scale data, which does not reflect realistic scenarios. Although the corpora used to train these systems were carefully designed to maintain phonetic balance, efforts to collect them at a large scale are still limited, and existing works have evaluated only a certain accent of Vietnam. In this paper, we first describe our efforts in collecting a large data set that covers all three major accents of Vietnam, located in the Northern, Central, and Southern regions. We then detail our ASR system development procedure, utilizing the collected data set and evaluating different model architectures to find the best structure for Vietnamese. In the VLSP 2018 challenge, our system achieved the best performance with 6.5% WER; on our internal test set of more than 10 hours of speech collected in real environments, the system also performs well, with 11% WER.
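For reference, the word error rate (WER) figures quoted above are computed as the word-level edit distance between reference and hypothesis, divided by the number of reference words. A minimal implementation (example strings are illustrative, not from the corpus):

```python
# Word error rate: Levenshtein distance over words / number of reference words.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits to turn the first i reference words into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("toi di hoc buoi sang", "toi di hoc sang"))  # one deletion -> 0.2 (20% WER)
```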
APA, Harvard, Vancouver, ISO, and other styles
46

Brown, David J., Andrew J. R. Simpson, and Michael J. Proulx. "Visual Objects in the Auditory System in Sensory Substitution: How Much Information Do We Need?" Multisensory Research 27, no. 5-6 (2014): 337–57. http://dx.doi.org/10.1163/22134808-00002462.

Full text
Abstract:
Sensory substitution devices such as The vOICe convert visual imagery into auditory soundscapes and can provide a basic ‘visual’ percept to those with visual impairment. However, it is not known whether technical or perceptual limits dominate the practical efficacy of such systems. By manipulating the resolution of sonified images and asking naïve sighted participants to identify visual objects through a six-alternative forced-choice procedure (6AFC) we demonstrate a ‘ceiling effect’ at 8 × 8 pixels, in both visual and tactile conditions, that is well below the theoretical limits of the technology. We discuss our results in the context of auditory neural limits on the representation of ‘auditory’ objects in a cortical hierarchy and how perceptual training may be used to circumvent these limitations.
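A vOICe-style sonification at the 8 x 8 resolution discussed above can be sketched in a few lines: the (already downsampled) image is scanned column by column from left to right, each row maps to a fixed pitch, and pixel brightness sets the amplitude. The frequency range, column duration, and toy input below are illustrative choices, not the parameters of The vOICe itself.

```python
# Minimal image-to-soundscape sketch: columns become time, rows become pitch,
# brightness becomes loudness. Writes the result to a WAV file.
import numpy as np
from scipy.io import wavfile

def sonify(image, sr=22050, col_dur=0.125, f_lo=300.0, f_hi=3000.0):
    img = np.asarray(image, dtype=float)
    img = img / (img.max() or 1.0)                    # normalise brightness
    n_rows, n_cols = img.shape
    freqs = np.geomspace(f_lo, f_hi, n_rows)[::-1]    # top row = highest pitch
    t = np.arange(int(sr * col_dur)) / sr
    cols = []
    for c in range(n_cols):
        tone = sum(img[r, c] * np.sin(2 * np.pi * freqs[r] * t) for r in range(n_rows))
        cols.append(tone / n_rows)
    signal = np.concatenate(cols)
    wavfile.write("soundscape.wav", sr, np.int16(signal * 32767))

# Toy 8 x 8 "image": a bright diagonal produces a pitch sweep over time.
sonify(np.eye(8) * 255)
```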
APA, Harvard, Vancouver, ISO, and other styles
47

Cronin, Séan, Bridget Kane, and Gavin Doherty. "A Qualitative Analysis of the Needs and Experiences of Hospital-based Clinicians when Accessing Medical Imaging." Journal of Digital Imaging 34, no. 2 (April 2021): 385–96. http://dx.doi.org/10.1007/s10278-021-00446-1.

Full text
Abstract:
As digital imaging is now a common and essential tool in the clinical workflow, it is important to understand the experiences of clinicians with medical imaging systems in order to guide future development. The objective of this paper was to explore health professionals’ experiences, practices and preferences when using Picture Archiving and Communication Systems (PACS), to identify shortcomings in the existing technology and inform future developments. Semi-structured interviews are reported with 35 hospital-based healthcare professionals (3 interns, 11 senior health officers, 6 specialist registrars, 6 consultants, 2 clinical specialists, 5 radiographers, 1 sonographer, 1 radiation safety officer). Data collection took place between February 2019 and December 2020 and all data are analyzed thematically. A majority of clinicians report using PACS frequently (6+ times per day), both through dedicated PACS workstations, and through general-purpose desktop computers. Most clinicians report using basic features of PACS to view imaging and reports, and also to compare current with previous imaging, noting that they rarely use more advanced features, such as measuring. Usability is seen as a problem, including issues related to data privacy. More sustained training would help clinicians gain more value from PACS, particularly less experienced users. While the majority of clinicians report being unconcerned about sterility when accessing digital imaging, clinicians were open to the possibility of touchless operation using voice, and the ability to execute multiple commands with a single voice command would be welcomed.
APA, Harvard, Vancouver, ISO, and other styles
48

Almghairbi, Dalal Salem, Takawira C. Marufu, and Iain K. Moppett. "Conflict resolution in anaesthesia: systematic review." BMJ Simulation and Technology Enhanced Learning 5, no. 1 (July 21, 2018): 1–7. http://dx.doi.org/10.1136/bmjstel-2017-000264.

Full text
Abstract:
Background: Conflict is a significant and recurrent problem in most modern healthcare systems. Given its ubiquity, effective techniques to manage or resolve conflict safely are required. Objective: This review focuses on conflict resolution interventions for improvement of patient safety through understanding and applying/teaching conflict resolution skills that critically depend on communication and improvement of staff members’ ability to voice their concerns. Methods: We used the Population-Intervention-Comparator-Outcome model to outline our methodology. Relevant English language sources for both published and unpublished papers up to February 2018 were sourced across five electronic databases: the Cochrane Library, EMBASE, MEDLINE, SCOPUS and Web of Science. Results: After removal of duplicates, 1485 studies were screened. Six articles met the inclusion criteria with a total sample size of 286 healthcare worker participants. Three training programmes were identified among the included studies: (A) crisis resource management training; (B) the Team Strategies and Tools to Enhance Performance and Patient Safety (TeamSTEPPS) training; and (C) the two-challenge rule (a component of TeamSTEPPS), plus two studies manipulating wider team behaviours. Outcomes reported included participant reaction and observer rating of conflict resolution, speaking up or advocacy-inquiry behaviours. Study results were inconsistent in showing benefits of interventions. Conclusion: The evidence for training to improve conflict resolution in the clinical environment is sparse. Novel methods that seek to influence wider team behaviours may complement traditional interventions directed at individuals.
APA, Harvard, Vancouver, ISO, and other styles
49

Venkatesh, Satvik, David Moffat, and Eduardo Reck Miranda. "Investigating the Effects of Training Set Synthesis for Audio Segmentation of Radio Broadcast." Electronics 10, no. 7 (March 31, 2021): 827. http://dx.doi.org/10.3390/electronics10070827.

Full text
Abstract:
Music and speech detection provides valuable information about the nature of content in broadcast audio. It helps detect acoustic regions that contain speech, voice over music, music only, or silence. In recent years, there have been developments in machine learning algorithms to accomplish this task. However, broadcast audio is generally well mixed and copyrighted, which makes it challenging to share across research groups. In this study, we address the challenges encountered in automatically synthesising data that resembles a radio broadcast. First, we compare state-of-the-art neural network architectures such as CNN, GRU, LSTM, TCN, and CRNN. Second, we investigate how audio ducking of background music impacts the precision and recall of the machine learning algorithm. Third, we examine how the quantity of synthetic training data affects the results. Finally, we evaluate the effectiveness of synthesised, real-world, and combined approaches for training models, to understand whether the synthetic data provides any additional value. Among the network architectures, CRNN performed best. Results also show that the minimum level of audio ducking preferred by the machine learning algorithm was similar to that preferred by human listeners. After testing our model on in-house and public datasets, we observe that our proposed synthesis technique outperforms real-world data in some cases and serves as a promising alternative.
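The core synthesis idea, mixing speech over background music with a ducking gain while generating frame labels alongside the audio, can be illustrated with a short sketch. The sine tone and noise below are placeholders for real music and speech clips; the ducking level and label scheme are illustrative, not the paper's exact configuration.

```python
# Sketch of synthesising one labeled training example for music/speech
# segmentation: music is attenuated ("ducked") while speech is present,
# and frame labels are produced alongside the mix.
import numpy as np

sr = 16000
music = 0.3 * np.sin(2 * np.pi * 220 * np.arange(10 * sr) / sr)   # 10 s "music"
speech = 0.3 * np.random.randn(4 * sr)                            # 4 s "speech"

def mix_with_ducking(music, speech, start_s, duck_db=-12.0, sr=16000):
    mix = music.copy()
    start = int(start_s * sr)
    end = start + len(speech)
    gain = 10 ** (duck_db / 20.0)              # e.g. -12 dB -> gain of ~0.25
    mix[start:end] = gain * mix[start:end] + speech
    labels = np.zeros(len(music), dtype=int)   # 0 = music only
    labels[start:end] = 1                      # 1 = speech over music
    return mix, labels

mix, labels = mix_with_ducking(music, speech, start_s=3.0)
# Varying duck_db across synthesised examples is one way to study how the
# ducking level affects precision and recall, as in the experiments above.
```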
APA, Harvard, Vancouver, ISO, and other styles
50

Loui, Psyche. "A Dual-Stream Neuroanatomy of Singing." Music Perception 32, no. 3 (February 1, 2015): 232–41. http://dx.doi.org/10.1525/mp.2015.32.3.232.

Full text
Abstract:
Singing requires effortless and efficient use of auditory and motor systems that center around the perception and production of the human voice. Although perception and production are usually tightly coupled functions, occasional mismatches between the two systems inform us of dissociable pathways in the brain systems that enable singing. Here I review the literature on perception and production in the auditory modality, and propose a dual-stream neuroanatomical model that subserves singing. I will discuss studies surrounding the neural functions of feedforward, feedback, and efference systems that control vocal monitoring, as well as the white matter pathways that connect frontal and temporal regions that are involved in perception and production. I will also consider disruptions of the perception-production network that are evident in tone-deaf individuals and poor pitch singers. Finally, by comparing expert singers against other musicians and nonmusicians, I will evaluate the possibility that singing training might offer rehabilitation from these disruptions through neuroplasticity of the perception-production network. Taken together, the best available evidence supports a model of dorsal and ventral pathways in auditory-motor integration that enables singing and is shared with language, music, speech, and human interactions in the auditory environment.
APA, Harvard, Vancouver, ISO, and other styles