Journal articles on the topic 'Speech Communication. Engineering, Electronics and Electrical'

Consult the top 50 journal articles for your research on the topic 'Speech Communication. Engineering, Electronics and Electrical.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Ifukube, Tooru. "Exciting Challenge of Biomedical Engineering. Auditory Disorder and Speech Communication." Journal of the Institute of Electrical Engineers of Japan 119, no. 11 (1999): 679–81. http://dx.doi.org/10.1541/ieejjournal.119.679.

2

Newell, A. F. "Speech communication technology—lessons from the disabled." Electronics and Power 32, no. 9 (1986): 661. http://dx.doi.org/10.1049/ep.1986.0389.

3

Zhang, Yu, Ming Dai, Yiman Hua, and Gonghuan Du. "Hyperchaotic synchronisation scheme for digital speech communication." Electronics Letters 35, no. 24 (1999): 2087. http://dx.doi.org/10.1049/el:19991411.

4

Moller, Sebastian, and Richard Heusdens. "Objective Estimation of Speech Quality for Communication Systems." Proceedings of the IEEE 101, no. 9 (September 2013): 1955–67. http://dx.doi.org/10.1109/jproc.2013.2241374.

5

Allen, J. "A perspective on man-machine communication by speech." Proceedings of the IEEE 73, no. 11 (1985): 1541–50. http://dx.doi.org/10.1109/proc.1985.13339.

6

Weng, Zhenzi, and Zhijin Qin. "Semantic Communication Systems for Speech Transmission." IEEE Journal on Selected Areas in Communications 39, no. 8 (August 2021): 2434–44. http://dx.doi.org/10.1109/jsac.2021.3087240.

7

Karjalainen, Matti. "Speech communication, human and machine." Signal Processing 15, no. 2 (September 1988): 217–18. http://dx.doi.org/10.1016/0165-1684(88)90074-6.

8

Rahmani, M., N. Yousefian, and A. Akbari. "Energy-based speech enhancement technique for hands-free communication." Electronics Letters 45, no. 1 (2009): 85. http://dx.doi.org/10.1049/el:20092177.

9

Deng, Zongyuan, Xi Shao, Zhen Yang, and Baoyu Zheng. "A novel covert speech communication system and its implementation." Journal of Electronics (China) 25, no. 6 (November 2008): 737–45. http://dx.doi.org/10.1007/s11767-007-0099-8.

10

Kawaguchi, Nobuo, Kazuya Takeda, and Fumitada Itakura. "Multimedia Corpus of In-Car Speech Communication." Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 36, no. 2/3 (February 2004): 153–59. http://dx.doi.org/10.1023/b:vlsi.0000015094.60008.dc.

11

Lopez-Ludena, Veronica, Ruben San-Segundo, Raquel Martin, David Sanchez, and Adolfo Garcia. "Evaluating a Speech Communication System for Deaf People." IEEE Latin America Transactions 9, no. 4 (July 2011): 565–70. http://dx.doi.org/10.1109/tla.2011.5993744.

12

Zaharia, M. H. "Securing Communication in Ambient Networks for Speech Therapy Systems." Advances in Electrical and Computer Engineering 7, no. 2 (2007): 41–44. http://dx.doi.org/10.4316/aece.2007.02010.

13

Beritelli, Francesco, Salvatore Casale, and Salvatore Serrano. "A low-complexity speech-pause detection algorithm for communication in noisy environments." European Transactions on Telecommunications 15, no. 1 (January 2004): 33–38. http://dx.doi.org/10.1002/ett.943.

14

Lavie, A., F. Pianesi, and L. Levin. "The NESPOLE! System for multilingual speech communication over the Internet." IEEE Transactions on Audio, Speech and Language Processing 14, no. 5 (September 2006): 1664–73. http://dx.doi.org/10.1109/tsa.2005.858520.

15

Parvez, Shahid, and M. Elshafei-Ahmed. "A speech coder for PC multimedia net-to-net communication." International Journal of Communication Systems 14, no. 7 (2001): 679–94. http://dx.doi.org/10.1002/dac.499.

16

Wu, Zhijun, Haixin Duan, and Xing Li. "A novel approach of secure communication based on the technique of speech information hiding." Journal of Electronics (China) 23, no. 2 (March 2006): 304–9. http://dx.doi.org/10.1007/s11767-005-0071-4.

17

Li, Zhenyu, and Simin Li. "Research of solution for real-time speech communication over mobile Ad hoc network." Journal of Electronic Measurement and Instrument 2009, no. 5 (December 9, 2009): 40–45. http://dx.doi.org/10.3724/sp.j.1187.2009.05040.

18

S, Siva Priyanka, and Kishore Kumar T. "Signed Convex Combination of Fast Convergence Algorithm to Generalized Sidelobe Canceller Beamformer for Multi-Channel Speech Enhancement." Traitement du Signal 38, no. 3 (June 30, 2021): 785–95. http://dx.doi.org/10.18280/ts.380325.

Abstract:
In speech communication applications such as teleconferencing and mobile telephony, real-world noises degrade the desired speech quality and intelligibility. For these applications, in the case of multichannel speech enhancement, adaptive beamforming algorithms play a major role compared to fixed beamforming algorithms. Among the adaptive beamformers, Generalized Sidelobe Canceller (GSC) beamforming with the Least Mean Square (LMS) algorithm has the lowest complexity but provides poor noise reduction, whereas GSC beamforming with the Combined LMS (CLMS) algorithm has better noise-reduction performance but high computational complexity. To achieve a tradeoff between noise reduction and computational complexity in real-time noisy conditions, a Signed Convex Combination of Fast Convergence (SCCFC) algorithm based on GSC beamforming for multi-channel speech enhancement is proposed. The SCCFC algorithm is implemented as a signed convex combination of two Fast Convergence Normalized Least Mean Square (FCNLMS) adaptive filters with different step sizes. This improves the overall performance of the GSC beamformer in real-time noisy conditions and reduces the computational complexity compared to existing GSC algorithms. The performance of the proposed multi-channel speech enhancement system is evaluated using standard speech-processing performance metrics. The simulation results demonstrate the superiority of the proposed GSC-SCCFC beamformer over traditional methods.
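
The convex-combination idea is easy to see in code. Below is a minimal sketch, assuming a simple adaptive-noise-cancellation setting: two NLMS filters with different step sizes run in parallel, and a sigmoid-mapped mixing parameter is adapted with a sign-based update. It illustrates the general principle only; the paper's actual SCCFC algorithm, its FCNLMS filters, and the full GSC beamformer structure are not reproduced here.

```python
import numpy as np

def nlms_step(w, u, d, mu, eps=1e-8):
    """One NLMS update; returns updated weights and the a-priori error."""
    e = d - w @ u
    return w + mu * e * u / (eps + u @ u), e

def signed_convex_combo(x, d, L=16, mu_fast=0.5, mu_slow=0.05, mu_a=0.5):
    """Convex combination of a fast and a slow NLMS filter.
    x: reference (noise) input, d: primary (speech + noise) input.
    Returns the combined error signal, i.e. the enhanced output."""
    w1, w2 = np.zeros(L), np.zeros(L)
    a = 0.0                                  # mixing: lambda = sigmoid(a)
    e_out = np.zeros(len(x))
    for n in range(L, len(x)):
        u = x[n - L:n][::-1]                 # regressor, newest sample first
        w1, e1 = nlms_step(w1, u, d[n], mu_fast)
        w2, e2 = nlms_step(w2, u, d[n], mu_slow)
        lam = 1.0 / (1.0 + np.exp(-a))
        e = lam * e1 + (1.0 - lam) * e2
        # sign-based gradient update of the mixing parameter
        # (the low-complexity "signed" variant the abstract alludes to)
        a -= mu_a * np.sign(e) * (e1 - e2) * lam * (1.0 - lam)
        a = np.clip(a, -4.0, 4.0)            # keep the sigmoid from saturating
        e_out[n] = e
    return e_out
```

The fast filter dominates during transients, the slow one in steady state; the combination inherits the better behaviour of each at roughly the cost of two LMS-class filters.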
19

Bouis, D. "First announcement of the European Speech Communication Association (ESCA)." Signal Processing 16, no. 2 (February 1989): 192. http://dx.doi.org/10.1016/0165-1684(89)90104-7.

20

Heracleous, Panikos, and Denis Beautemps. "Cued Speech: A visual communication mode for the deaf society." IEICE Electronics Express 7, no. 4 (2010): 234–39. http://dx.doi.org/10.1587/elex.7.234.

21

Borgstrom, Bengt J., Mihaela van der Schaar, and Abeer Alwan. "Rate Allocation for Noncollaborative Multiuser Speech Communication Systems Based on Bargaining Theory." IEEE Transactions on Audio, Speech and Language Processing 15, no. 4 (May 2007): 1156–66. http://dx.doi.org/10.1109/tasl.2007.894533.

22

Ishi, Carlos Toshinori, Shigeki Matsuda, Takayuki Kanda, Takatoshi Jitsuhiro, Hiroshi Ishiguro, Satoshi Nakamura, and Norihiro Hagita. "A Robust Speech Recognition System for Communication Robots in Noisy Environments." IEEE Transactions on Robotics 24, no. 3 (June 2008): 759–63. http://dx.doi.org/10.1109/tro.2008.919305.

23

Hanson, Donald, and Peggy Power. "Electronic Scanners with Speech Output - A Communication System for the Physically Handicapped And Mentally Retarded." IEEE Micro 5, no. 2 (April 1985): 20–52. http://dx.doi.org/10.1109/mm.1985.304452.

24

Miyanaga, Yoshikazu, Wataru Takahashi, and Shingo Yoshizawa. "A Robust Speech Communication into Smart Info-Media System." IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E96.A, no. 11 (2013): 2074–80. http://dx.doi.org/10.1587/transfun.e96.a.2074.

25

Saha, Sagor, Farhan Hossain Shakal, and Mufrath Mahmood. "Visual, navigation and communication aid for visually impaired person." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 2 (April 1, 2021): 1276. http://dx.doi.org/10.11591/ijece.v11i2.pp1276-1283.

Abstract:
The loss of vision restrains visually impaired people from performing their daily tasks, impeding their free movement and making them dependent on others. Until recently, few technologies addressed their situation; with the advent of computer vision and artificial intelligence, it has improved to a great extent. The proposed design implements a wearable device capable of many functions. It provides visual assistance by recognizing objects and identifying chosen faces. The device runs a pre-trained model to classify common objects, from household items to automobiles. Optical character recognition and Google Translate are used to read text from images and to convert the user's speech to text, respectively. The user can also search for a topic of interest by spoken command. Additionally, ultrasonic sensors fixed at three positions sense obstacles during navigation. An attached display helps in communication with deaf persons, and GPS and GSM modules aid in tracking the user. All these features are driven by voice commands passed through the microphone of any earphone. Visual input is received through the camera, and computation is performed on a Raspberry Pi board. The device proved effective during testing and validation.
26

Nozawa, Takayuki, Mizuki Uchiyama, Keigo Honda, Tamio Nakano, and Yoshihiro Miyake. "Speech Discrimination in Real-World Group Communication Using Audio-Motion Multimodal Sensing." Sensors 20, no. 10 (May 22, 2020): 2948. http://dx.doi.org/10.3390/s20102948.

Abstract:
Speech discrimination that determines whether a participant is speaking at a given moment is essential in investigating human verbal communication. Specifically, in dynamic real-world situations where multiple people participate in, and form, groups in the same space, simultaneous speakers render speech discrimination that is solely based on audio sensing difficult. In this study, we focused on physical activity during speech, and hypothesized that combining audio and physical motion data acquired by wearable sensors can improve speech discrimination. Thus, utterance and physical activity data of students in a university participatory class were recorded, using smartphones worn around their neck. First, we tested the temporal relationship between manually identified utterances and physical motions and confirmed that physical activities in wide-frequency ranges co-occurred with utterances. Second, we trained and tested classifiers for each participant and found a higher performance with the audio-motion classifier (average accuracy 92.2%) than both the audio-only (80.4%) and motion-only (87.8%) classifiers. Finally, we tested inter-individual classification and obtained a higher performance with the audio-motion combined classifier (83.2%) than the audio-only (67.7%) and motion-only (71.9%) classifiers. These results show that audio-motion multimodal sensing using widely available smartphones can provide effective utterance discrimination in dynamic group communications.
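
A toy sketch of the audio-motion fusion idea, using synthetic stand-ins for the neck-worn smartphone signals: per-window audio features (log-energy, zero-crossing rate) are concatenated with acceleration-magnitude statistics and fed to a classifier. The random forest is chosen here purely for illustration; the paper does not specify this model, and real data would replace the noise below.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def frame_features(audio, accel, sr_audio=16000, sr_accel=50, win_s=1.0):
    """Per-window features: audio log-energy and zero-crossing rate,
    plus mean/std of the acceleration magnitude (accel is (n, 3))."""
    n_win = int(min(len(audio) / sr_audio, len(accel) / sr_accel) / win_s)
    feats = []
    for i in range(n_win):
        a = audio[int(i * win_s * sr_audio):int((i + 1) * win_s * sr_audio)]
        m = accel[int(i * win_s * sr_accel):int((i + 1) * win_s * sr_accel)]
        mag = np.linalg.norm(m, axis=1)
        feats.append([np.log(np.mean(a**2) + 1e-10),
                      np.mean(np.abs(np.diff(np.sign(a)))) / 2,
                      mag.mean(), mag.std()])
    return np.array(feats)

# toy data standing in for one minute of neck-worn smartphone recordings
rng = np.random.default_rng(0)
audio = rng.normal(size=16000 * 60)
accel = rng.normal(size=(50 * 60, 3))
X = frame_features(audio, accel)
y = rng.integers(0, 2, size=len(X))        # 1 = speaking, 0 = not speaking
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xtr, ytr)
print("accuracy:", clf.score(Xte, yte))
```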
27

Deng, Li, Kuansan Wang, and Wu Chou. "Speech technology and systems in human-machine communication [From the Guest Editors]." IEEE Signal Processing Magazine 22, no. 5 (September 2005): 12–14. http://dx.doi.org/10.1109/msp.2005.1511818.

28

Westerlund, Nils, Mattias Dahl, and Ingvar Claesson. "Speech enhancement for personal communication using an adaptive gain equalizer." Signal Processing 85, no. 6 (June 2005): 1089–101. http://dx.doi.org/10.1016/j.sigpro.2005.01.004.

29

Zhuang, X. D., H. Zhu, and N. E. Mastorakis. "Signal Processing: New Stochastic Feature of Unvoiced Pronunciation for Whisper Speech Modeling and Synthesis." International Journal of Circuits, Systems and Signal Processing 14 (January 15, 2021): 1162–75. http://dx.doi.org/10.46300/9106.2020.14.144.

Abstract:
Whisper is an indispensable mode of speech communication, especially for private conversation or human-machine interaction in public places such as libraries and hospitals. Whispering is unvoiced pronunciation, and voiceless sound is usually considered a noise-like signal. However, unvoiced sound has unique acoustic features and can carry enough information for effective communication. Although it is a significant form of communication, there is currently much less research on whisper signals than on normal speech and voiced pronunciation. Our work extends the study of unvoiced pronunciation signals by introducing a novel signal feature, which is further applied to unvoiced signal modeling and whisper sound synthesis. The amplitude statistics of each frequency component are studied individually, based on which a new feature, the "consistent standard deviation coefficient," is revealed for the amplitude spectrum of unvoiced pronunciation. A synthesis method for unvoiced pronunciation is proposed based on the new feature, implemented by STFT with artificially generated short-time spectra with random amplitude and phase. The synthesis results have auditory quality identical to the original pronunciation and autocorrelation similar to that of the original signal, which proves the effectiveness of the proposed stochastic model of the short-time spectrum of unvoiced pronunciation.
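
A minimal sketch of the synthesis idea just described, assuming an illustrative value for the "consistent standard deviation coefficient" c (the paper's estimated value is not reproduced here): per-bin amplitudes are drawn with standard deviation proportional to that bin's mean amplitude, phases are uniform random, and the signal is rebuilt by inverse STFT.

```python
import numpy as np
from scipy.signal import stft, istft

def synthesize_unvoiced(x, fs, c=0.5, n_frames=200, nperseg=256):
    """Resynthesize an unvoiced sound from per-frequency amplitude statistics.
    c models the 'consistent standard deviation coefficient': in every bin
    the amplitude std is assumed to be c times the amplitude mean."""
    _, _, Z = stft(x, fs=fs, nperseg=nperseg)
    mean_amp = np.abs(Z).mean(axis=1)              # per-bin mean amplitude
    n_bins = len(mean_amp)
    rng = np.random.default_rng(0)
    amp = rng.normal(mean_amp[:, None], c * mean_amp[:, None],
                     size=(n_bins, n_frames)).clip(min=0.0)
    phase = rng.uniform(-np.pi, np.pi, size=(n_bins, n_frames))
    _, y = istft(amp * np.exp(1j * phase), fs=fs, nperseg=nperseg)
    return y

fs = 16000
x = np.random.default_rng(1).normal(size=fs)  # stand-in for a recorded /s/
y = synthesize_unvoiced(x, fs)
```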
30

Dash, Debadatta, Paul Ferrari, Satwik Dutta, and Jun Wang. "NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals." Sensors 20, no. 8 (April 16, 2020): 2248. http://dx.doi.org/10.3390/s20082248.

Abstract:
Neural speech decoding-driven brain-computer interface (BCI) or speech-BCI is a novel paradigm for exploring communication restoration for locked-in (fully paralyzed but aware) patients. Speech-BCIs aim to map a direct transformation from neural signals to text or speech, which has the potential for a higher communication rate than the current BCIs. Although recent progress has demonstrated the potential of speech-BCIs from either invasive or non-invasive neural signals, the majority of the systems developed so far still assume knowing the onset and offset of the speech utterances within the continuous neural recordings. This lack of real-time voice/speech activity detection (VAD) is a current obstacle for future applications of neural speech decoding wherein BCI users can have a continuous conversation with other speakers. To address this issue, in this study, we attempted to automatically detect the voice/speech activity directly from the neural signals recorded using magnetoencephalography (MEG). First, we classified the whole segments of pre-speech, speech, and post-speech in the neural signals using a support vector machine (SVM). Second, for continuous prediction, we used a long short-term memory-recurrent neural network (LSTM-RNN) to efficiently decode the voice activity at each time point via its sequential pattern-learning mechanism. Experimental results demonstrated the possibility of real-time VAD directly from the non-invasive neural signals with about 88% accuracy.
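
A sketch of the second stage (frame-wise LSTM voice-activity detection), assuming 204 MEG channels and toy tensors; the paper's actual architecture, preprocessing, and hyper-parameters may differ.

```python
import torch
import torch.nn as nn

class NeuroVADLSTM(nn.Module):
    """Frame-wise voice-activity detector: one logit per time step."""
    def __init__(self, n_channels=204, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                    # x: (batch, time, channels)
        h, _ = self.lstm(x)
        return self.head(h).squeeze(-1)      # logits, (batch, time)

model = NeuroVADLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

# toy batch standing in for band-passed, z-scored MEG epochs
x = torch.randn(8, 300, 204)                 # 8 trials, 300 frames, 204 sensors
y = (torch.rand(8, 300) > 0.5).float()       # 1 = speech frame, 0 = non-speech
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```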
31

Dymarski, Przemysław. "Enhancement of Ground-to-Aircraft Communication Using Audio Watermarking." Journal of Telecommunications and Information Technology 1 (March 29, 2019): 93–102. http://dx.doi.org/10.26636/jtit.2019.128418.

Abstract:
This paper presents research on improving the intelligibility of spoken messages transmitted to aircraft from a ground station. The proposed solution is based on the selective calling (SELCAL) system and the audio watermarking technique. The most important elements of a spoken message (commands, numerical values) are transmitted as a watermark embedded in the speech signal and are displayed to the cockpit crew. The synchronization signal is embedded in SELCAL duo-tones. The proposed system is resistant to resampling and channel noise (at SNR > 25 dB).
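
For flavour, a generic additive spread-spectrum embedder/detector of the kind commonly used in audio watermarking. This is not the paper's scheme (which embeds commands in the speech signal with SELCAL-based synchronization); it only illustrates the basic mechanism of hiding bits in audio and recovering them by correlation.

```python
import numpy as np

def embed_bits(speech, bits, frame_len=400, alpha=0.02, seed=7):
    """Embed one bit per frame by adding a +/-1 pseudo-random carrier,
    scaled by alpha; detection correlates each frame with the carrier."""
    rng = np.random.default_rng(seed)
    carrier = rng.choice([-1.0, 1.0], size=frame_len)
    out = speech.copy()
    for i, b in enumerate(bits):
        s = i * frame_len
        out[s:s + frame_len] += alpha * (1 if b else -1) * carrier
    return out, carrier

def detect_bits(signal, n_bits, carrier, frame_len=400):
    return [int(signal[i * frame_len:(i + 1) * frame_len] @ carrier > 0)
            for i in range(n_bits)]

x = np.random.default_rng(0).normal(scale=0.1, size=8000)  # stand-in speech
bits = [1, 0, 1, 1, 0, 0, 1, 0]
wm, carrier = embed_bits(x, bits)
print(detect_bits(wm, len(bits), carrier))   # recovers the embedded bits
```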
32

Thepie Fapi, Emmanuel Rossignol, Dominique Pastor, Christophe Beaugeant, and Hervé Taddei. "Acoustic Echo Cancellation Embedded in Smart Transcoding Algorithm between 3GPP AMR-NB Modes." Journal of Electrical and Computer Engineering 2010 (2010): 1–5. http://dx.doi.org/10.1155/2010/902569.

Abstract:
Acoustic Echo Cancellation (AEC) is a necessary feature for mobile devices when the acoustic coupling between the microphone and the loudspeaker affects the communication quality and intelligibility. When implemented inside the network, decoding is required to access the corrupted signal. The AEC performance is strongly degraded by nonlinearity introduced by speech codecs. The Echo Return Loss Enhancement (ERLE) can be less than 10 dB for low bit rate speech codecs. We propose in this paper a coded domain AEC integrated in a smart transcoding strategy which directly modifies the Code Excited Linear Prediction (CELP) parameters. The proposed system addresses simultaneously problems due to network interoperability and network voice quality enhancement. The ERLE performance of this new approach during transcoding between Adaptive Multirate-NarrowBand (AMR-NB) modes is above 45 dB as required in Global System for Mobile Communications (GSM) specifications.
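
ERLE, the figure of merit quoted above, is simply the power ratio in dB between the microphone signal (echo present) and the residual after cancellation. A minimal computation:

```python
import numpy as np

def erle_db(mic, error, eps=1e-12):
    """Echo Return Loss Enhancement in dB: power of the echo-bearing
    microphone signal over the power of the post-canceller residual."""
    return 10.0 * np.log10((np.mean(mic**2) + eps) / (np.mean(error**2) + eps))

# an echo attenuated by a factor of ~180 in amplitude gives ~45 dB:
rng = np.random.default_rng(0)
echo = rng.normal(size=16000)
residual = echo / 180.0
print(round(erle_db(echo, residual), 1))   # ~45.1 dB
```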
33

Vieira, Samuel Terra, Renata Lopes Rosa, and Demóstenes Zegarra Rodríguez. "A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations." Journal of communications software and systems 16, no. 2 (June 4, 2020): 180–87. http://dx.doi.org/10.24138/jcomss.v16i2.1032.

Abstract:
Many factors can affect users' quality of experience (QoE) in speech communication services. Impairment factors appear due to physical phenomena that occur in the transmission channels of wireless and wired networks. Monitoring users' QoE is important for service providers. In this context, a non-intrusive speech quality classifier based on the Tree Convolutional Neural Network (Tree-CNN) is proposed. The Tree-CNN is an adaptive network structure composed of hierarchical CNN models, and its main advantage is decreasing the training time, which is very relevant for speech quality assessment methods. In the training phase of the proposed classifier model, impaired speech signals caused by wired- and wireless-network degradation are used as input. In the network scenario, different modulation schemes and channel degradation intensities, such as packet loss rate, signal-to-noise ratio, and maximum Doppler shift frequency, are implemented. Experimental results demonstrate that the proposed model achieves a significant reduction in training time, 25% relative to another implementation based on a DRBM. The accuracy reached by the Tree-CNN model is almost 95% for each quality class. Performance assessment results show that the proposed Tree-CNN-based classifier outperforms both the standardized algorithm described in ITU-T Rec. P.563 and the speech quality assessment method ViSQOL.
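
A schematic sketch of a hierarchical ("tree of CNNs") classifier, with invented node roles (good vs. degraded at the root, degradation type at a leaf). The actual Tree-CNN growth procedure, structure, and class set in the paper differ; this only shows how a root decision can route a spectrogram to a refining child network.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """One tree node: a small CNN over (1, freq, time) log-spectrograms."""
    def __init__(self, n_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, n_out))

    def forward(self, x):
        return self.net(x)

# root decides "good" vs "degraded"; a child CNN refines the degraded branch
root = SmallCNN(2)
degraded_leaf = SmallCNN(3)   # e.g. packet-loss / low-SNR / Doppler classes

def tree_predict(x):
    coarse = root(x).argmax(1)
    fine = degraded_leaf(x).argmax(1)
    # final label: 0 = good quality, 1..3 = a degradation class
    return torch.where(coarse == 0, torch.zeros_like(fine), fine + 1)

x = torch.randn(4, 1, 64, 64)   # toy log-spectrogram batch
print(tree_predict(x))
```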
34

Barnidge, Matthew, Bumsoo Kim, Lindsey A. Sherrill, Žiga Luknar, and Jiehua Zhang. "Perceived exposure to and avoidance of hate speech in various communication settings." Telematics and Informatics 44 (November 2019): 101263. http://dx.doi.org/10.1016/j.tele.2019.101263.

35

Rogowski, Adam, Krzysztof Bieliszczuk, and Jerzy Rapcewicz. "Integration of Industrially-Oriented Human-Robot Speech Communication and Vision-Based Object Recognition." Sensors 20, no. 24 (December 18, 2020): 7287. http://dx.doi.org/10.3390/s20247287.

Abstract:
This paper presents a novel method for integration of industrially-oriented human-robot speech communication and vision-based object recognition. Such integration is necessary to provide context for task-oriented voice commands. Context-based speech communication is easier, the commands are shorter, hence their recognition rate is higher. In recent years, significant research was devoted to integration of speech and gesture recognition. However, little attention was paid to vision-based identification of objects in industrial environment (like workpieces or tools) represented by general terms used in voice commands. There are no reports on any methods facilitating the abovementioned integration. Image and speech recognition systems usually operate on different data structures, describing reality on different levels of abstraction, hence development of context-based voice control systems is a laborious and time-consuming task. The aim of our research was to solve this problem. The core of our method is extension of Voice Command Description (VCD) format describing syntax and semantics of task-oriented commands, as well as its integration with Flexible Editable Contour Templates (FECT) used for classification of contours derived from image recognition systems. To the best of our knowledge, it is the first solution that facilitates development of customized vision-based voice control applications for industrial robots.
36

Bernard, A., and A. Alwan. "Low-bitrate distributed speech recognition for packet-based and wireless communication." IEEE Transactions on Speech and Audio Processing 10, no. 8 (November 2002): 570–79. http://dx.doi.org/10.1109/tsa.2002.808141.

37

Cox, Richard, Simao De Campos Neto, Claude Lamblin, and Mostafa Sherif. "ITU-T coders for wideband, superwideband, and fullband speech communication [Series Editorial]." IEEE Communications Magazine 47, no. 10 (October 2009): 106–9. http://dx.doi.org/10.1109/mcom.2009.5273816.

38

Kushner, W. M., S. M. Harton, R. J. Novorita, and M. J. McLaughlin. "The acoustic properties of SCBA equipment and its effects on speech communication." IEEE Communications Magazine 44, no. 1 (January 2006): 66–72. http://dx.doi.org/10.1109/mcom.2006.1580934.

39

Yousaf, Kanwal, Zahid Mehmood, Tanzila Saba, Amjad Rehman, Muhammad Rashid, Muhammad Altaf, and Zhang Shuguang. "A Novel Technique for Speech Recognition and Visualization Based Mobile Application to Support Two-Way Communication between Deaf-Mute and Normal Peoples." Wireless Communications and Mobile Computing 2018 (May 24, 2018): 1–12. http://dx.doi.org/10.1155/2018/1013234.

Abstract:
Mobile technology is growing incredibly fast, yet there has been little technological development and improvement for Deaf-mute people. Existing mobile applications use sign language as the only option for communicating with them. Before our article, no application (app) that uses the disrupted speech of Deaf-mutes for the purpose of social connectivity existed in the mobile market. The proposed application, named vocalizer to mute (V2M), uses automatic speech recognition (ASR) to recognize the speech of a Deaf-mute and convert it into a form of speech recognizable by a normal person. In this work, mel-frequency cepstral coefficient (MFCC) features are extracted for each training and testing sample of Deaf-mute speech. The hidden Markov model toolkit (HTK) is used for the speech recognition process. The application is also integrated with a 3D avatar for visualization support. The avatar is responsible for performing sign language on behalf of a person with no awareness of Deaf-mute culture. The prototype application was piloted in a social welfare institute for Deaf-mute children. Participants were 15 children aged between 7 and 13 years. The experimental results show an accuracy of 97.9% for the proposed application. The quantitative and qualitative analysis of results also revealed that face-to-face socialization of Deaf-mutes is improved by the intervention of mobile technology. The participants also suggested that the proposed mobile application can act as a voice for them and that they can socialize with friends and family by using this app.
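
The MFCC front end mentioned above is standard. The paper uses HTK; as a stand-in, the usual 39-dimensional MFCC-plus-deltas features can be sketched with librosa (the noise signal below is a placeholder for a real recorded utterance):

```python
import numpy as np
import librosa

sr = 16000
# stand-in for a recorded utterance; replace with librosa.load("file.wav", sr=sr)
y = np.random.default_rng(0).normal(size=sr).astype(np.float32)

# 13 MFCCs per 25 ms frame with a 10 ms hop, plus HTK-style delta features
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=400, hop_length=160)
feat = np.vstack([mfcc,
                  librosa.feature.delta(mfcc),
                  librosa.feature.delta(mfcc, order=2)])
print(feat.shape)   # (39, n_frames)
```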
40

Shiomi, Masahiro, Takahiro Hirano, Mitsuhiko Kimoto, Takamasa Iio, and Katsunori Shimohara. "Gaze-Height and Speech-Timing Effects on Feeling Robot-Initiated Touches." Journal of Robotics and Mechatronics 32, no. 1 (February 20, 2020): 68–75. http://dx.doi.org/10.20965/jrm.2020.p0068.

Abstract:
This paper reports the effects of communication cues on robot-initiated touch interactions at close distance by focusing on two factors: gaze-height for making eye contact and speech timing before and after touches. Although both factors are essential to achieve acceptable touches in human-human touch interaction, their effectiveness remains unknown in human-robot touch interaction contexts. To investigate the effects of these factors, we conducted an experiment whose results showed that being touched with before-touch timing is preferred to being touched with after-touch timing, although gaze-height did not significantly improve the feelings of robot-initiated touch.
41

Lee, Wookey, Jessica Jiwon Seong, Busra Ozlu, Bong Sup Shim, Azizbek Marakhimov, and Suan Lee. "Biosignal Sensors and Deep Learning-Based Speech Recognition: A Review." Sensors 21, no. 4 (February 17, 2021): 1399. http://dx.doi.org/10.3390/s21041399.

Abstract:
Voice is one of the essential mechanisms for communicating and expressing one's intentions as a human being. There are several causes of voice inability, including disease, accident, vocal abuse, medical surgery, ageing, and environmental pollution, and the risk of voice loss continues to increase. Novel approaches to speech recognition and production need to be developed, because voice loss seriously undermines quality of life and can lead to isolation from society. In this review, we survey mouth interface technologies, which are mouth-mounted devices for speech recognition, production, and volitional control, and the corresponding research to develop artificial mouth technologies based on various sensors, including electromyography (EMG), electroencephalography (EEG), electropalatography (EPG), electromagnetic articulography (EMA), permanent magnet articulography (PMA), gyros, images, and 3-axial magnetic sensors, especially with deep learning techniques. We examine in particular various deep learning technologies related to voice recognition, including visual speech recognition and silent speech interfaces, analyze their workflows, and systematize them into a taxonomy. Finally, we discuss methods to solve the communication problems of people with speaking disabilities and future research with respect to deep learning components.
42

HimaBindu, Gottumukkala, Gondi Lakshmeeswari, Giddaluru Lalitha, and Pedalanka P. S. Subhashini. "Recognition Using DNN with Bacterial Foraging Optimization Using MFCC Coefficients." Journal Européen des Systèmes Automatisés 54, no. 2 (April 27, 2021): 283–87. http://dx.doi.org/10.18280/jesa.540210.

Abstract:
Speech is an important mode of communication for people. For a long time, researchers have been working hard to develop conversational machines that communicate using speech technology. Voice recognition is part of the science of signal processing. Speech recognition is becoming more successful at providing user authentication, and user recognition is becoming more popular nowadays for providing security by authenticating users. With the rising importance of automated information processing and telecommunications, the usefulness of recognizing an individual from the features of the user's voice is increasing. In this paper, the three stages of speech recognition processing are defined as pre-processing, feature extraction, and decoding. Speech comprehension has been significantly enhanced by using foreign languages. Automatic Speech Recognition (ASR) aims to translate speech to text. Speaker recognition is the method of recognizing an individual through his or her voice signals. The new speaker initially claims an identity for speaker authentication, and then the stated model is used for identification. The identity claim is approved when the match is above a predefined threshold. The speech used for these tasks may be either text-dependent or text-independent. The article uses the Bacterial Foraging Optimization (BFO) algorithm for accurate speech recognition through a Mel Frequency Cepstral Coefficients (MFCC) model using a DNN. Speech recognition efficiency is compared to that of the conventional system.
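
A minimal sketch of the chemotaxis core of Bacterial Foraging Optimization (tumble, then swim while the cost improves). The reproduction and elimination-dispersal phases of full BFO, and its coupling to the DNN/MFCC recognizer in the paper, are omitted; the objective function below is a hypothetical stand-in.

```python
import numpy as np

def bfo_minimize(f, dim, n_bact=20, n_chem=50, step=0.1, swim_len=4, seed=0):
    """Minimal bacterial-foraging loop: each bacterium tumbles in a random
    direction and keeps swimming in it while the cost keeps improving."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1, 1, size=(n_bact, dim))
    cost = np.array([f(p) for p in pos])
    for _ in range(n_chem):
        for i in range(n_bact):
            d = rng.normal(size=dim)
            d /= np.linalg.norm(d)             # tumble: random unit direction
            for _ in range(swim_len):          # swim while improving
                cand = pos[i] + step * d
                c = f(cand)
                if c < cost[i]:
                    pos[i], cost[i] = cand, c
                else:
                    break
    best = cost.argmin()
    return pos[best], cost[best]

# toy use: tune two hypothetical recognizer hyper-parameters
obj = lambda p: (p[0] - 0.3)**2 + (p[1] + 0.7)**2
print(bfo_minimize(obj, dim=2))
```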
43

Lahouti, Farshad, Amir K. Khandani, and Aladdin Saleh. "Robust Transmission of Multistage Vector Quantized Sources Over Noisy Communication Channels—Applications to MELP Speech Codec." IEEE Transactions on Vehicular Technology 55, no. 6 (November 2006): 1805–11. http://dx.doi.org/10.1109/tvt.2006.878722.

44

Pitas, C. N., D. E. Charilas, A. D. Panagopoulos, and P. Constantinou. "Adaptive neuro-fuzzy inference models for speech and video quality prediction in real-world mobile communication networks." IEEE Wireless Communications 20, no. 3 (June 2013): 80–88. http://dx.doi.org/10.1109/mwc.2013.6549286.

45

Baothman, Fatmah Abdulrahman. "An Intelligent Big Data Management System Using Haar Algorithm-Based Nao Agent Multisensory Communication." Wireless Communications and Mobile Computing 2021 (July 14, 2021): 1–15. http://dx.doi.org/10.1155/2021/9977751.

Abstract:
Artificial intelligence (AI) is progressively changing techniques of teaching and learning. In the past, the objective was to provide an intelligent tutoring system, without intervention from a human teacher, to enhance skills, control, knowledge construction, and intellectual engagement. This paper proposes a definition of AI focusing on enhancing the humanoid agent Nao's learning capabilities and interactions. The aim is to increase Nao's intelligence using big data by activating multisensory perception, such as visual and auditory stimuli modules and speech-related stimuli, as well as various movements. The method is to develop a toolkit enabling Arabic speech recognition and implementing the Haar algorithm for robust image recognition, to improve the capabilities of Nao during interactions with a child in a mixed reality system using big data. The experiment design and testing processes were conducted by implementing an AI design principle, namely the three-constituent principle. Four experiments were conducted to boost Nao's intelligence level, using 100 children and different environments: class, lab, home, and mixed reality with a Leap Motion Controller (LMC). An objective function and an operational time cost function are developed to improve Nao's learning experience in different environments, accomplishing the best result of 4.2 seconds for each number recognition. The experimental results showed an increase in Nao's intelligence from the 3-year-old to the 7-year-old level, compared with a child's intelligence in learning simple mathematics, with the best communication achieving a kappa ratio of 90.8%, a corpus exceeding 390,000 segments, and a 93% success rate when activating both the auditory and vision modules of the agent Nao. The developed toolkit, using Arabic speech recognition and the Haar algorithm in a mixed reality system with big data, enabled Nao to achieve a 94% learning success rate at a distance of 0.09 m; when using the LMC in mixed reality, hand-sign gestures recorded the highest accuracy of 98.50% using the Haar algorithm. The work shows that the current approach enabled Nao to gradually achieve a higher learning success rate as the environment changed and multisensory perception increased. This paper also proposes a cutting-edge research direction for fostering child-robot education in real time.
46

Mayor, Vicente, Rafael Estepa, Antonio Estepa, and German Madinabeitia. "Deploying a Reliable UAV-Aided Communication Service in Disaster Areas." Wireless Communications and Mobile Computing 2019 (April 8, 2019): 1–20. http://dx.doi.org/10.1155/2019/7521513.

Abstract:
When telecommunication infrastructure is damaged by natural disasters, creating a network that can handle voice channels can be vital for search and rescue missions. Unmanned Aerial Vehicles (UAVs) equipped with WiFi access points can be rapidly deployed to provide wireless coverage to ground users. This WiFi access network can in turn be used to provide a reliable communication service for search and rescue missions. We formulate a new problem for optimal UAV deployment which considers not only WiFi coverage but also the MAC sublayer (i.e., quality of service). Our goal is to dispatch the minimum number of UAVs to provision a WiFi network that enables reliable VoIP communications in disaster scenarios. Among valid solutions, we choose the one that minimizes energy expenditure at the user's WiFi interface card, in order to extend ground users' smartphone battery life as much as possible. Solutions are found using well-known heuristics such as K-means clustering and genetic algorithms. Via numerical results, we show that the IEEE 802.11 standard revision has a decisive impact on the number of UAVs required to cover large areas, and that the user's average energy expenditure (attributable to communications) can be reduced by limiting the maximum altitude of the drones or by increasing the VoIP speech quality.
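
One of the heuristics named above, K-means placement, can be sketched as follows: increase the number of UAVs until every ground user falls within an assumed coverage radius of some cluster centre. The QoS/MAC-layer constraints and energy objective of the paper are ignored in this toy version; the radius and user positions are invented.

```python
import numpy as np
from sklearn.cluster import KMeans

def min_uavs_for_coverage(users, radius, k_max=30):
    """Smallest k such that k-means centres cover every user within
    `radius` (a crude stand-in for WiFi cell planning)."""
    for k in range(1, k_max + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(users)
        dist = np.linalg.norm(users - km.cluster_centers_[km.labels_], axis=1)
        if dist.max() <= radius:
            return k, km.cluster_centers_
    return None

rng = np.random.default_rng(0)
users = rng.uniform(0, 2000, size=(120, 2))   # ground users in a 2x2 km area
k, centres = min_uavs_for_coverage(users, radius=500.0)
print("UAVs needed:", k)
```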
47

Ghosh, Hiranmay, Sunil Kumar Kopparapu, Tanushyam Chattopadhyay, Ashish Khare, Sujal Subhash Wattamwar, Amarendra Gorai, and Meghna Pandharipande. "Multimodal Indexing of Multilingual News Video." International Journal of Digital Multimedia Broadcasting 2010 (2010): 1–18. http://dx.doi.org/10.1155/2010/486487.

Abstract:
The problems associated with automatic analysis of news telecasts are more severe in a country like India, where there are many national and regional language channels, besides English. In this paper, we present a framework for multimodal analysis of multilingual news telecasts, which can be augmented with tools and techniques for specific news analytics tasks. Further, we focus on a set of techniques for automatic indexing of the news stories based on keywords spotted in speech as well as on the visuals of contemporary and domain interest. English keywords are derived from RSS feed and converted to Indian language equivalents for detection in speech and on ticker texts. Restricting the keyword list to a manageable number results in drastic improvement in indexing performance. We present illustrative examples and detailed experimental results to substantiate our claim.
48

Petruk, Oksana. "Textbook of the Ukrainian Language as a Means of Formation of Strategies of Speech Communication." Problems of Modern Textbook, no. 26 (2021): 165–75. http://dx.doi.org/10.32405/2411-1309-2021-26-165-175.

49

Choi, Seoyeon, Yoosun Hwang, Joonchul Shin, Jung-Sik Yang, and Hyo-Il Jung. "Relationship analysis of speech communication between salivary cortisol levels and personal characteristics using the Smartphone Linked Stress Measurement (SLSM)." BioChip Journal 11, no. 2 (January 4, 2017): 101–7. http://dx.doi.org/10.1007/s13206-016-1202-8.

50

Kadota, Kazuo. "Development of Communication Robot for STEM Education by Using Digital Fabrication." Journal of Robotics and Mechatronics 29, no. 6 (December 20, 2017): 944–51. http://dx.doi.org/10.20965/jrm.2017.p0944.

Abstract:
This paper describes the development of a communication robot for STEM education, in which digital fabrication equipment such as a 3D printer and laser cutter are used. Specifically, although STEM education programs are active in several countries outside of Japan, they are not yet officially adopted in the curricula for Japanese elementary and junior high schools; however, a few undertakings exist outside schools. Meanwhile, the new curriculum guidelines announced in March 2017 by the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) recognize the need for cross-subject activities and require elementary schools to introduce education on programmatic thinking. This suggests that STEM education-related activities will be introduced in Japanese school education in the near future and that educational programs that utilize robots will become increasingly active. Furthermore, the availability of technologies, such as speech recognition, artificial intelligence, and IoT, makes it highly likely that communication robots will be adopted in a variety of school situations. This study reviews the author’s development of a communication robot based on the use of digital fabrication technology within the context of STEM education; teaching plans are proposed, premised on the use of the STEM robot within the framework of the new curriculum guidelines that will be adopted by elementary and junior high schools in Japan from FY2020.