Journal articles on the topic 'Visual recognition system'

Consult the top 50 journal articles for your research on the topic 'Visual recognition system.'

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.

1. Laszlo, Sarah, and Elizabeth Sacchi. "Individual differences in involvement of the visual object recognition system during visual word recognition." Brain and Language 145-146 (June 2015): 42–52. http://dx.doi.org/10.1016/j.bandl.2015.03.009.

2. Gornostal, Alexandr, and Yaroslaw Dorogyy. "Development of audio-visual speech recognition system." ScienceRise 12, no. 1 (December 30, 2017): 42–47. http://dx.doi.org/10.15587/2313-8416.2017.118212.

3. Khosla, Deepak, David J. Huber, and Christopher Kanan. "A neuromorphic system for visual object recognition." Biologically Inspired Cognitive Architectures 8 (April 2014): 33–45. http://dx.doi.org/10.1016/j.bica.2014.02.001.

4. Asakura, Toshiyuki, and Yasutomi Iida. "Intelligent Visual Recognition System in Harvest Robot." Proceedings of the JSME Annual Meeting 2002.1 (2002): 195–96. http://dx.doi.org/10.1299/jsmemecjo.2002.1.0_195.

5. Wong, Yee Wan, Kah Phooi Seng, and Li-Minn Ang. "Audio-Visual Recognition System in Compression Domain." IEEE Transactions on Circuits and Systems for Video Technology 21, no. 5 (May 2011): 637–46. http://dx.doi.org/10.1109/tcsvt.2011.2129670.

6. Cao, Jiangtao, Naoyuki Kubota, Ping Li, and Honghai Liu. "The Visual-Audio Integrated Recognition Method for User Authentication System of Partner Robots." International Journal of Humanoid Robotics 8, no. 4 (December 2011): 691–705. http://dx.doi.org/10.1142/s0219843611002678.

Abstract:
Several noncontact biometric modalities have been used in user authentication systems for partner robots, such as visual recognition methods and speech recognition. However, visual recognition methods are sensitive to light noise, and speech recognition systems are perturbed by the acoustic environment and sound noise. Inspired by the human capability of compensating visual information (looking) with audio information (hearing), a visual-audio integration method is proposed to deal with the disturbance of light noise and to improve recognition accuracy. Combined with PCA-based and 2DPCA-based face recognition, a two-stage speaker recognition algorithm is used to extract useful personal identity information from speech signals. Using the statistical properties of the visual background noise, the visual-audio integration method draws the final decision. The proposed method is evaluated on the public visual-audio dataset VidTIMIT and on a partner robot authentication system. The results verify that the visual-audio integration method obtains satisfactory recognition results with strong robustness.
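The abstract does not spell out the integration rule, so the following is only a minimal sketch of noise-aware, score-level audio-visual fusion in that spirit; the function and the weighting scheme are illustrative assumptions, not the authors' method.

```python
import numpy as np

def fuse_scores(face_scores, speaker_scores, visual_noise):
    """Combine per-identity match scores from a face recognizer and a
    speaker recognizer. The visual stream is down-weighted as the
    estimated visual (light) noise grows -- a crude stand-in for the
    paper's statistics-based integration."""
    w = 1.0 / (1.0 + visual_noise)          # visual weight in (0, 1]
    fused = w * face_scores + (1.0 - w) * speaker_scores
    return int(np.argmax(fused))            # index of the accepted identity

# Example: 3 enrolled users, normalized scores per modality.
face = np.array([0.70, 0.20, 0.10])
voice = np.array([0.30, 0.60, 0.10])
print(fuse_scores(face, voice, visual_noise=4.0))  # noisy video -> trusts voice
```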
7. Malowany, Dan, and Hugo Guterman. "Biologically Inspired Visual System Architecture for Object Recognition in Autonomous Systems." Algorithms 13, no. 7 (July 11, 2020): 167. http://dx.doi.org/10.3390/a13070167.

Abstract:
Computer vision is currently one of the most exciting and rapidly evolving fields of science, which affects numerous industries. Research and development breakthroughs, mainly in the field of convolutional neural networks (CNNs), opened the way to unprecedented sensitivity and precision in object detection and recognition tasks. Nevertheless, findings in recent years on the sensitivity of neural networks to additive noise, light conditions, and the wholeness of the training dataset indicate that this technology still lacks the robustness needed for the autonomous robotic industry. In an attempt to bring computer vision algorithms closer to the capabilities of a human operator, the mechanisms of the human visual system were analyzed in this work. Recent studies show that the mechanisms behind the recognition process in the human brain include continuous generation of predictions based on prior knowledge of the world. These predictions enable rapid generation of contextual hypotheses that bias the outcome of the recognition process. This mechanism is especially advantageous in situations of uncertainty, when visual input is ambiguous. In addition, the human visual system continuously updates its knowledge about the world based on the gaps between its prediction and the visual feedback. CNNs are feed-forward in nature and lack such top-down contextual attenuation mechanisms. As a result, although they process massive amounts of visual information during their operation, the information is not transformed into knowledge that can be used to generate contextual predictions and improve their performance. In this work, an architecture was designed that aims to integrate the concepts behind the top-down prediction and learning processes of the human visual system with state-of-the-art bottom-up object recognition models, e.g., deep CNNs. The work focuses on two mechanisms of the human visual system: anticipation-driven perception and reinforcement-driven learning. Imitating these top-down mechanisms, together with the state-of-the-art bottom-up feed-forward algorithms, resulted in an accurate, robust, and continuously improving target recognition model.
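One simple reading of the anticipation-driven mechanism described above is a contextual prior multiplied into a bottom-up classifier's class posteriors. The sketch below is that textbook Bayesian combination, not the paper's actual architecture; all names and numbers are hypothetical.

```python
import numpy as np

def bias_with_context(cnn_probs, context_prior, class_prior):
    """Rescale bottom-up CNN class probabilities p(c|x) with a top-down
    contextual prior p(c|context), assuming both were trained against
    the same base rate p(c):  p(c|x,context) ~ p(c|x) * p(c|context) / p(c)."""
    posterior = cnn_probs * context_prior / class_prior
    return posterior / posterior.sum()

# Ambiguous bottom-up evidence between "dog" and "wolf"; a "living room"
# context makes "dog" far more plausible.
p_x = np.array([0.48, 0.46, 0.06])       # dog, wolf, cat (from the CNN)
p_ctx = np.array([0.60, 0.05, 0.35])     # context: living room
p_c = np.array([1/3, 1/3, 1/3])          # uniform base rate
print(bias_with_context(p_x, p_ctx, p_c))
```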
8. Stork, David G. "Neural network acoustic and visual speech recognition system." Journal of the Acoustical Society of America 102, no. 3 (September 1997): 1282. http://dx.doi.org/10.1121/1.420021.

9. Jiao, Chenlei, Binbin Lian, Zhe Wang, Yimin Song, and Tao Sun. "Visual–tactile object recognition of a soft gripper based on faster Region-based Convolutional Neural Network and machining learning algorithm." International Journal of Advanced Robotic Systems 17, no. 5 (September 1, 2020): 172988142094872. http://dx.doi.org/10.1177/1729881420948727.

Abstract:
Object recognition is a prerequisite for a soft gripper to successfully grasp an unknown object. Visual and tactile recognition are two commonly used methods in a grasping system. Visual recognition is limited when the size and weight of the objects are involved, whereas the efficiency of tactile recognition is a problem. A visual–tactile recognition method is proposed in this article to overcome the disadvantages of both methods. The design and fabrication of a soft gripper incorporating the visual and tactile sensors are presented, where the Kinect v2 is adopted for visual information, and bending and pressure sensors are embedded in the soft fingers for tactile information. The proposed method is divided into three steps: initial recognition by vision, detailed recognition by touch, and data-fusion decision making. Experiments show that visual–tactile recognition yields the best results, and the average recognition accuracy on daily objects is highest with the proposed method. The feasibility of visual–tactile recognition is thus verified.
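A hedged sketch of the three-step scheme the abstract outlines (vision shortlists, touch scores, fusion decides), assuming both sensors yield class-probability vectors over the same labels; the product-rule fusion is one common choice, not necessarily the article's.

```python
import numpy as np

def visual_tactile_recognize(p_visual, p_tactile, top_k=3):
    """Cascade loosely following the abstract: vision proposes a
    shortlist (step 1), touch scores it (step 2), and a product-rule
    fusion makes the final decision (step 3)."""
    shortlist = np.argsort(p_visual)[::-1][:top_k]        # step 1: vision
    fused = p_visual[shortlist] * p_tactile[shortlist]    # steps 2-3
    return int(shortlist[np.argmax(fused)])

p_vis = np.array([0.40, 0.35, 0.15, 0.10])   # e.g. bottle, can, sponge, cup
p_tac = np.array([0.10, 0.60, 0.25, 0.05])   # touch separates soft vs rigid
print(visual_tactile_recognize(p_vis, p_tac))  # -> 1 (can)
```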
10. Stringa, Luigi. "A Visual Model for Pattern Recognition." International Journal of Neural Systems 3, supp. 1 (January 1992): 31–39. http://dx.doi.org/10.1142/s0129065792000358.

Abstract:
A general model for an optical recognition system capable of simultaneous recognition of patterns at different resolution levels is outlined. The model is based on two hierarchic stages of processing networks and presents interesting analogies with the human visual system. Illustrative applications and preliminary experimental results are also briefly discussed.
11. Mikherskii, R. M. "Application of an Artificial Immune System for Visual Pattern Recognition." Computer Optics 42, no. 1 (March 30, 2018): 113–17. http://dx.doi.org/10.18287/2412-6179-2018-42-1-113-117.

Abstract:
The suitability of artificial immune systems for recognizing visual patterns is discussed. A new algorithm and software implementation of an artificial immune system have been proposed based on which real-time pattern recognition can be done using a Web camera. It has been shown experimentally that this system can be successfully used to recognize both human faces and any other objects. An issue of using an artificial immune system in high-performance parallel computing systems is discussed. The advantages of the developed artificial immune system include the ability to teach the system a new image in a fast manner at any moment during run-time. These advantages open up a possibility of creating artificial intelligence systems for real-time learning.
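Since the paper's algorithm is not reproduced here, the following toy recognizer only illustrates the immune metaphor the abstract highlights (stored "memory cells", affinity matching, and teaching a new image at any moment during run-time); the affinity function and threshold are arbitrary choices of ours.

```python
import numpy as np

class ImmuneRecognizer:
    """Toy immune-inspired recognizer: stored feature vectors act as
    memory cells (antibodies); an input (antigen) is recognized by its
    highest-affinity cell."""
    def __init__(self):
        self.cells, self.labels = [], []

    def teach(self, features, label):          # run-time learning
        self.cells.append(np.asarray(features, dtype=float))
        self.labels.append(label)

    def recognize(self, features, threshold=0.5):
        x = np.asarray(features, dtype=float)
        affinities = [1.0 / (1.0 + np.linalg.norm(x - c)) for c in self.cells]
        best = int(np.argmax(affinities))
        return self.labels[best] if affinities[best] >= threshold else None

ais = ImmuneRecognizer()
ais.teach([0.9, 0.1], "face_A")
ais.teach([0.1, 0.9], "face_B")
print(ais.recognize([0.8, 0.2]))   # -> face_A
```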
12. Gomez, Pablo, and Sarah Silins. "Visual word recognition models should also be constrained by knowledge about the visual system." Behavioral and Brain Sciences 35, no. 5 (August 29, 2012): 287. http://dx.doi.org/10.1017/s0140525x12000179.

Abstract:
Frost's article advocates for universal models of reading and critiques recent models that concentrate on what has been described as "cracking the orthographic code." Although the challenge to develop models that can account for word recognition beyond Indo-European languages is welcome, we argue that reading models should also be constrained by general principles of visual processing and object recognition.
13. Attamimi, Muhammad, Takaya Araki, Tomoaki Nakamura, and Takayuki Nagai. "Visual Recognition System for Cleaning Tasks by Humanoid Robots." International Journal of Advanced Robotic Systems 10, no. 11 (January 2013): 384. http://dx.doi.org/10.5772/56629.

14. Allam, Fatima Zohra. "Modeling of Biometric Recognition Based on Human Visual System." International Journal of Advanced Trends in Computer Science and Engineering 9, no. 1.2 (April 25, 2020): 198–204. http://dx.doi.org/10.30534/ijatcse/2020/2991.22020.

15. Wallis, Guy, and Edmund T. Rolls. "Invariant Face and Object Recognition in the Visual System." Progress in Neurobiology 51, no. 2 (February 1997): 167–94. http://dx.doi.org/10.1016/s0301-0082(96)00054-8.

16. Tacchetti, Andrea, Leyla Isik, and Tomaso Poggio. "Invariant representations for action recognition in the visual system." Journal of Vision 15, no. 12 (September 1, 2015): 558. http://dx.doi.org/10.1167/15.12.558.

17. Connell, Jonathan H. "Audio-only backoff in audio-visual speech recognition system." Journal of the Acoustical Society of America 125, no. 6 (2009): 4109. http://dx.doi.org/10.1121/1.3155497.

18. Izuka, Yusuke, Katsutoshi Otsubo, Takuya Kawamura, and Hironao Yamada. "Study on a Visual Recognition System Using Fovea Lens." Proceedings of Conference of Tokai Branch 2017.66 (2017): 221. http://dx.doi.org/10.1299/jsmetokai.2017.66.221.

19. Noguchi, Daisuke, Katsutoshi Otsubo, Takuya Kawamura, and Hironao Yamada. "Study on a Visual Recognition System Using Fovea Lens." Proceedings of Conference of Tokai Branch 2019.68 (2019): 322. http://dx.doi.org/10.1299/jsmetokai.2019.68.322.

20. Zhang, Ying, Yige Guo, Jingrong Hu, Qiongwen Liang, Jinjian Jiang, Changxing Shao, and Yong Wang. "A Teaching Evaluation System Based on Visual Recognition Technology." IOP Conference Series: Materials Science and Engineering 782 (April 15, 2020): 032101. http://dx.doi.org/10.1088/1757-899x/782/3/032101.

21. Svakha, Dmytro Mykolaiovych, and Anton Yuriiovych Varfolomieiev. "The System of Automatic Visual Recognition of Meter Indications." Microsystems, Electronics and Acoustics 23, no. 6 (December 28, 2018): 22–28. http://dx.doi.org/10.20535/2523-4455.2018.23.6.149298.

22. Inoue, Makoto, and Hironao Yamada. "267 Study on Visual Recognition System Using Fovea Lens." Proceedings of Conference of Tokai Branch 2009.58 (2009): 141–42. http://dx.doi.org/10.1299/jsmetokai.2009.58.141.

23. Elliffe, M. C. M., E. T. Rolls, and S. M. Stringer. "Invariant recognition of feature combinations in the visual system." Biological Cybernetics 86, no. 1 (January 1, 2002): 59–71. http://dx.doi.org/10.1007/s004220100284.

24. Handa, Anand, Rashi Agarwal, and Narendra Kohli. "Audio-Visual Emotion Recognition System Using Multi-Modal Features." International Journal of Cognitive Informatics and Natural Intelligence 15, no. 4 (October 2021): 1–14. http://dx.doi.org/10.4018/ijcini.20211001.oa34.

Abstract:
Due to the highly variant face geometry and appearances, Facial Expression Recognition (FER) is still a challenging problem. CNNs can characterize 2-D signals; therefore, for emotion recognition in a video, the authors propose a feature selection model in the AlexNet architecture to extract and filter facial features automatically. Similarly, for emotion recognition in audio, the authors use a deep LSTM-RNN. Finally, they propose a probabilistic model for the fusion of the audio and visual models using facial features and the speech of a subject. The model combines all the extracted features and uses them to train linear SVM (Support Vector Machine) classifiers. The proposed model outperforms the other existing models and achieves state-of-the-art performance for the audio, visual, and fusion models. The model classifies the seven known facial expressions, namely anger, happy, surprise, fear, disgust, sad, and neutral, on the eNTERFACE'05 dataset with an overall accuracy of 76.61%.
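A minimal sketch of the fusion-then-SVM stage described above, with random arrays standing in for the AlexNet and LSTM features; the feature dimensions are assumptions chosen only to illustrate the shapes involved.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Stand-ins for features the paper extracts with AlexNet (visual) and a
# deep LSTM-RNN (audio); here they are random, for shape illustration only.
rng = np.random.default_rng(0)
n_clips = 200
visual_feats = rng.normal(size=(n_clips, 4096))   # e.g. an fc-layer size
audio_feats = rng.normal(size=(n_clips, 256))     # e.g. a last LSTM state
labels = rng.integers(0, 7, size=n_clips)         # 7 basic emotions

# Fusion by concatenation, then a linear SVM, as the abstract outlines.
fused = np.concatenate([visual_feats, audio_feats], axis=1)
clf = LinearSVC(C=1.0, max_iter=5000).fit(fused, labels)
print(clf.predict(fused[:5]))
```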
25. Nishijo, Hisao, and Taketoshi Ono. "Recognition of faces and predators in the innate recognition system: a role of the extrageniculate visual system." Higher Brain Function Research 34, no. 3 (2014): 281–88. http://dx.doi.org/10.2496/hbfr.34.281.

26. Yang, Jing Bao, Gao Wei Zhang, Chun Lei Song, and Zhong Hong Shen. "Intelligent Vehicle Navigation System Based on Multi-Visual Cognition Information." Advanced Materials Research 671-674 (March 2013): 2893–98. http://dx.doi.org/10.4028/www.scientific.net/amr.671-674.2893.

Abstract:
The visual cognition system is an important research component of an intelligent vehicle control system. This paper investigates traffic-light recognition, lane-line identification, and traffic-sign recognition, which together constitute the vehicle's intelligent multi-visual cognition system. An intelligent vehicle navigation system based on the fusion of multi-visual cognition information was also built, and the algorithms were tested on a real unmanned vehicle with good results.
27. Gorbenko, Anna. "Automatic Generation of Modules of Visual Recognition." Applied Mechanics and Materials 416-417 (September 2013): 748–52. http://dx.doi.org/10.4028/www.scientific.net/amm.416-417.748.

Abstract:
We consider the problem of automatic generation of visual recognition modules. In particular, we consider a self-learning algorithm for visual recognition and a system for automatic generation that is based on some biological observations.
28. Okada, Kei, Mitsuharu Kojima, and Masayuki Inaba. "Object Recognition with Multi Visual Cue Integration for Shared Knowledge-based Action Recognition System." Journal of the Robotics Society of Japan 26, no. 6 (2008): 537–45. http://dx.doi.org/10.7210/jrsj.26.537.

29. Ivanko, D., and D. Ryumin. "A Novel Task-Oriented Approach toward Automated Lip-Reading System Implementation." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIV-2/W1-2021 (April 15, 2021): 85–89. http://dx.doi.org/10.5194/isprs-archives-xliv-2-w1-2021-85-2021.

Abstract:
Visual information plays a key role in automatic speech recognition (ASR) when audio is corrupted by background noise or even inaccessible. Speech recognition using visual information is called lip-reading. The initial idea of visual speech recognition comes from human experience: we are able to recognize spoken words by observing a speaker's face with limited or no access to the sound of the voice. Based on the conducted experimental evaluations, as well as on an analysis of the research field, we propose a novel task-oriented approach towards practical lip-reading system implementation. Its main purpose is to serve as a roadmap for researchers who need to build a reliable visual speech recognition system for their task. To a rough approximation, the task of lip-reading can be divided into two parts, depending on the complexity of the problem: first, recognizing isolated words, numbers, or small phrases (e.g., telephone numbers with a strict grammar, or keywords); and second, recognizing continuous speech (phrases or sentences). All these stages are disclosed in detail in this paper. Based on the proposed approach, we implemented from scratch automatic visual speech recognition systems of three different architectures: GMM-CHMM, DNN-HMM, and purely end-to-end. The methodology, tools, step-by-step development, and all necessary parameters are described in detail. It is worth noting that such systems were created for Russian speech recognition for the first time.
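For the isolated-word end of the spectrum the authors describe, a per-word GMM/HMM classifier is the classical baseline. The sketch below, built on the hmmlearn package with synthetic lip-feature sequences, shows the shape of such a system; it is an illustration, not the paper's implementation.

```python
import numpy as np
from hmmlearn import hmm   # pip install hmmlearn

def train_word_models(examples):
    """Isolated-word visual speech recognition in the GMM/HMM style:
    one HMM per word, trained on lip-region feature sequences, each
    of shape (frames, features)."""
    models = {}
    for word, seqs in examples.items():
        X = np.concatenate(seqs)
        lengths = [len(s) for s in seqs]
        m = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
        models[word] = m.fit(X, lengths)
    return models

def recognize(models, seq):
    # Pick the word whose model gives the sequence the highest log-likelihood.
    return max(models, key=lambda w: models[w].score(seq))

rng = np.random.default_rng(1)
data = {w: [rng.normal(loc=i, size=(30, 8)) for _ in range(5)]
        for i, w in enumerate(["yes", "no"])}
models = train_word_models(data)
print(recognize(models, rng.normal(loc=1, size=(30, 8))))  # -> "no"
```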
30. Radjenović-Mrčarica, J., Ž. Mrčarica, H. Detter, and V. Litovski. "Neural Network Visual Recognition for Automation of the Microelectromechanical Systems Assembly." International Journal of Neural Systems 8, no. 1 (February 1997): 69–79. http://dx.doi.org/10.1142/s0129065797000100.

Abstract:
A neural network visual recognition system is developed, intended for the automation of the microsystem assembly process. The recognition of microparts is insensitive to their position. This feature is enabled by the calculation of moment properties of the image during preprocessing. The system takes grey-level images and produces a recognition code as its output. A feed-forward neural network is used for recognition, trained by a combination of the standard backpropagation and resilient propagation rules. The system's performance is satisfactory with respect to recognition accuracy and recognition time.
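The position insensitivity credited to moment preprocessing can be reproduced with central moments, which are translation-invariant by construction. A minimal sketch (the feature set and moment orders here are arbitrary choices, not the paper's):

```python
import numpy as np

def central_moments(img, max_order=2):
    """Translation-invariant image description via central moments:
    moments are taken about the image centroid, so shifting the
    object in the frame leaves the features unchanged."""
    img = img.astype(float)
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    m00 = img.sum()
    cx, cy = (img * xs).sum() / m00, (img * ys).sum() / m00
    feats = []
    for p in range(max_order + 1):
        for q in range(max_order + 1):
            if 1 <= p + q <= max_order:
                feats.append(((xs - cx) ** p * (ys - cy) ** q * img).sum())
    return np.array(feats)

# The same blob shifted in the frame yields identical features.
a = np.zeros((16, 16)); a[2:6, 2:6] = 1
b = np.zeros((16, 16)); b[9:13, 8:12] = 1
print(np.allclose(central_moments(a), central_moments(b)))  # True
```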
31. Oleksii, A. V., and Ya. I. Kornaha. "Intellectual System of Recognition of Visual Keys for Mobile Applications." Scientific Notes of Taurida National V.I. Vernadsky University, Series: Technical Sciences 6, no. 1 (2019): 123–27. http://dx.doi.org/10.32838/2663-5941/2019.6-1/22.

32. Zhang, Liangguo. "A Medium Vocabulary Visual Recognition System for Chinese Sign Language." Journal of Computer Research and Development 43, no. 3 (2006): 476. http://dx.doi.org/10.1360/crad20060316.

33. Lago-Fernández, Luis F., Manuel A. Sánchez-Montañés, and Eduardo Sánchez. "A visual system for invariant recognition in animated image sequences." Neurocomputing 52-54 (June 2003): 631–36. http://dx.doi.org/10.1016/s0925-2312(02)00845-7.

34. Pantrigo, J. J., A. Sánchez, and J. Mira. "Representation spaces in a visual-based human action recognition system." Neurocomputing 72, no. 4-6 (January 2009): 901–15. http://dx.doi.org/10.1016/j.neucom.2008.06.017.

35. Stringer, S. M., and E. T. Rolls. "Position invariant recognition in the visual system with cluttered environments." Neural Networks 13, no. 3 (April 2000): 305–15. http://dx.doi.org/10.1016/s0893-6080(00)00017-4.

36. Suzuki, Kenji, Katsutoshi Otsubo, Takuya Kawamura, and Hironao Yamada. "236 Study on a Visual Recognition System Using Fovea Lens." Proceedings of Conference of Tokai Branch 2015.64 (2015): 236-1–236-2. http://dx.doi.org/10.1299/jsmetokai.2015.64._236-1_.

37. Paulin, Hebsibah, R. S. Milton, S. JanakiRaman, and K. Chandraprabha. "Audio–Visual (Multimodal) Speech Recognition System Using Deep Neural Network." Journal of Testing and Evaluation 47, no. 6 (May 23, 2019): 20180505. http://dx.doi.org/10.1520/jte20180505.

38. He, Zhiliang, Juntao Xiong, Zhiheng Mai, Pengfei Zhong, and Linyue Tang. "Research of Face Recognition System Based on Visual Intelligent Monitoring." International Journal of Multimedia and Ubiquitous Engineering 11, no. 5 (May 31, 2016): 171–82. http://dx.doi.org/10.14257/ijmue.2016.11.5.16.

39. Min, Byung-Woo, Ho-Sub Yoon, Jung Soh, Takeshi Ohashi, and Toshiaki Ejima. "Visual Recognition of Static/Dynamic Gesture: Gesture-Driven Editing System." Journal of Visual Languages & Computing 10, no. 3 (June 1999): 291–309. http://dx.doi.org/10.1006/jvlc.1999.0117.

40. Jin, Junwei, Yanting Li, Tiejun Yang, Liang Zhao, Junwei Duan, and C. L. Philip Chen. "Discriminative group-sparsity constrained broad learning system for visual recognition." Information Sciences 576 (October 2021): 800–818. http://dx.doi.org/10.1016/j.ins.2021.06.008.

41. Sieroff, Eric, Alexander Pollatsek, and Michael I. Posner. "Recognition of visual letter strings following injury to the posterior visual spatial attention system." Cognitive Neuropsychology 5, no. 4 (July 1988): 427–49. http://dx.doi.org/10.1080/02643298808253268.

42. Monk, Andrew F. "Theoretical Note: Coordinate Systems in Visual Word Recognition." Quarterly Journal of Experimental Psychology Section A 37, no. 4 (November 1985): 613–25. http://dx.doi.org/10.1080/14640748508400922.

Abstract:
Marr and Nishihara (1978) have made certain recommendations about how representations postulated in a theory of visual information processing should be specified. Using this scheme, the paper discusses representations that might be postulated in a model of visual word recognition. A representation is specified in terms of a set of primitives (e.g., word identities or visual features) in combination with a coordinate system. The coordinate systems considered are retinal, spatial (e.g., position on the page), word-centred (position in the word), and sentence-centred (position in the sentence). Various combinations of primitives and coordinate systems are considered, along with how to decide which combinations are actually generated in the process of fluent reading. A tentative model is put forward in which a single processing stage, which starts anew after each saccade, generates a representation with word identities as its primitives and sentence-centred coordinates. Evidence to support such a model, which has no intermediate representation with spatial coordinates, is briefly reviewed.
43. Nakadai, Kazuhiro, and Tomoaki Koiwa. "Psychologically-Inspired Audio-Visual Speech Recognition Using Coarse Speech Recognition and Missing Feature Theory." Journal of Robotics and Mechatronics 29, no. 1 (February 20, 2017): 105–13. http://dx.doi.org/10.20965/jrm.2017.p0105.

Abstract:
[Figure: System architecture of AVSR based on missing feature theory and P-V grouping] Audio-visual speech recognition (AVSR) is a promising approach to improving the noise robustness of speech recognition in the real world. For AVSR, the auditory and visual units are the phoneme and viseme, respectively. However, these are often misclassified in the real world because of noisy input. To solve this problem, we propose two psychologically-inspired approaches. One is audio-visual integration based on missing feature theory (MFT) to cope with missing or unreliable audio and visual features for recognition. The other is phoneme and viseme grouping based on coarse-to-fine recognition. Preliminary experiments show that these two approaches are effective for audio-visual speech recognition. Integration based on MFT with an appropriate weight improves recognition performance even at −5 dB, a noise level at which most speech recognition systems do not work properly. Phoneme and viseme grouping further improved the AVSR performance, particularly at a low signal-to-noise ratio. (This work is an extension of the authors' publication: Tomoaki Koiwa et al., "Coarse speech recognition by audio-visual integration based on missing feature theory," IROS 2007, pp. 1751–1756, 2007.)
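One way to picture MFT-based integration is as masking out unreliable features before mixing the two streams' log-likelihoods with a weight. The sketch below is a schematic of that idea, not the paper's system; the shapes, masks, and weighting are assumptions.

```python
import numpy as np

def mft_stream_loglik(audio_ll, visual_ll, audio_mask, visual_mask, w=0.5):
    """Missing-feature-style combination: per-frame log-likelihoods of
    unreliable features are masked out before the streams are mixed
    with an audio weight w. Shapes: (frames, features)."""
    a = np.where(audio_mask, audio_ll, 0.0).sum(axis=1)    # drop noisy bins
    v = np.where(visual_mask, visual_ll, 0.0).sum(axis=1)
    return w * a + (1.0 - w) * v                            # per-frame score

frames, bins = 4, 6
rng = np.random.default_rng(2)
a_ll, v_ll = rng.normal(size=(frames, bins)), rng.normal(size=(frames, bins))
a_mask = rng.random((frames, bins)) > 0.5    # True = reliable feature
v_mask = np.ones((frames, bins), dtype=bool)
print(mft_stream_loglik(a_ll, v_ll, a_mask, v_mask, w=0.3))
```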
44. Silva Filho, P., E. H. Shiguemori, and O. Saotome. "UAV Visual Autolocalizaton Based on Automatic Landmark Recognition." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-2/W3 (August 18, 2017): 89–94. http://dx.doi.org/10.5194/isprs-annals-iv-2-w3-89-2017.

Abstract:
Deploying an autonomous unmanned aerial vehicle in GPS-denied areas is a widely discussed problem in the scientific community. Several approaches are being developed, but the main strategies considered so far are computer-vision-based navigation systems. This work presents a new real-time computer-vision position estimator for UAV navigation. The estimator uses images captured during flight to recognize specific, well-known landmarks in order to estimate the latitude and longitude of the aircraft. The method was tested in a simulated environment, using a dataset of real aerial images obtained in previous flights with synchronized images, GPS, and IMU data. The position estimated at each landmark recognition was compatible with the GPS data, indicating that the developed method can be used as an alternative navigation system.
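In the simplest nadir-camera, flat-earth case, the geometry behind such an estimator reduces to offsetting the landmark's known coordinates by the pixel displacement times the ground resolution. The following sketch uses that simplification; all parameter names and values are illustrative, not from the paper.

```python
import numpy as np

def estimate_position(landmark_latlon, landmark_px, image_center_px,
                      ground_res_m):
    """If a known landmark is recognized at pixel landmark_px, a
    nadir-pointing camera (and so the aircraft) sits over the point
    offset from the landmark by the pixel displacement scaled by the
    ground resolution. North-aligned, flat-earth simplification."""
    dx_px = image_center_px[0] - landmark_px[0]   # +x: landmark west of us
    dy_px = image_center_px[1] - landmark_px[1]   # image y grows southward
    east_m, north_m = dx_px * ground_res_m, -dy_px * ground_res_m
    lat, lon = landmark_latlon
    dlat = north_m / 111_320.0                    # metres per degree latitude
    dlon = east_m / (111_320.0 * np.cos(np.radians(lat)))
    return lat + dlat, lon + dlon

print(estimate_position((-23.21, -45.86), landmark_px=(300, 260),
                        image_center_px=(320, 240), ground_res_m=0.5))
```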
45. S., Manisha, Nafisa H. Saida, Nandita Gopal, and Roshni P. Anand. "Bimodal Emotion Recognition Using Machine Learning." International Journal of Engineering and Advanced Technology 10, no. 4 (April 30, 2021): 189–94. http://dx.doi.org/10.35940/ijeat.d2451.0410421.

Abstract:
The predominant communication channel for conveying relevant and high-impact information is the emotion embedded in our communications. Researchers have tried to exploit these emotions in recent years for human-robot interaction (HRI) and human-computer interaction (HCI). Emotion recognition through speech or through facial expression alone is termed single-mode emotion recognition. The accuracy of single-mode emotion recognition is improved by the proposed bimodal method, which combines the speech and face modalities and recognizes emotions using a Convolutional Neural Network (CNN) model. The proposed bimodal emotion recognition system contains three major parts: processing of audio, processing of video, and fusion of the data for detecting a person's emotion. The fusion of visual information and audio data obtained from two different channels enhances the emotion recognition rate by providing complementary data. The proposed method aims to classify seven basic emotions (anger, disgust, fear, happy, neutral, sad, surprise) from an input video; audio and image frames are taken from the video input to predict the final emotion. The dataset used is RAVDESS, an audio-visual dataset uniquely suited to the study of multi-modal emotion expression and perception, which contains audio-visual, visual-only, and audio-only subsets; the audio-visual subset is used for bimodal emotion detection.
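The abstract does not detail the fusion rule, so the sketch below shows one plausible decision-level reading: averaging the two modality classifiers' posteriors with a trust weight (the constant alpha and the probability values are assumptions).

```python
import numpy as np

EMOTIONS = ["anger", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def late_fusion(p_audio, p_video, alpha=0.4):
    """Decision-level variant of the bimodal idea: weighted average of
    the audio and video classifiers' class posteriors, with alpha as
    the trust placed in the audio stream."""
    p = alpha * np.asarray(p_audio) + (1 - alpha) * np.asarray(p_video)
    return EMOTIONS[int(np.argmax(p))]

p_a = [0.05, 0.05, 0.10, 0.55, 0.10, 0.05, 0.10]
p_v = [0.10, 0.05, 0.05, 0.40, 0.20, 0.10, 0.10]
print(late_fusion(p_a, p_v))   # -> "happy"
```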
46. Craievich, D., B. Barnett, and A. C. Bovik. "A stereo visual pattern image coding system." Image and Vision Computing 18, no. 1 (December 1999): 21–37. http://dx.doi.org/10.1016/s0262-8856(99)00011-6.

47. Isik, Leyla, Ethan M. Meyers, Joel Z. Leibo, and Tomaso Poggio. "The dynamics of invariant object recognition in the human visual system." Journal of Neurophysiology 111, no. 1 (January 1, 2014): 91–102. http://dx.doi.org/10.1152/jn.00394.2013.

Abstract:
The human visual system can rapidly recognize objects despite transformations that alter their appearance. The precise timing of when the brain computes neural representations that are invariant to particular transformations, however, has not been mapped in humans. Here we employ magnetoencephalography decoding analysis to measure the dynamics of size- and position-invariant visual information development in the ventral visual stream. With this method we can read out the identity of objects beginning as early as 60 ms. Size- and position-invariant visual information appear around 125 ms and 150 ms, respectively, and both develop in stages, with invariance to smaller transformations arising before invariance to larger transformations. Additionally, the magnetoencephalography sensor activity localizes to neural sources that are in the most posterior occipital regions at the early decoding times and then move temporally as invariant information develops. These results provide previously unknown latencies for key stages of human-invariant object recognition, as well as new and compelling evidence for a feed-forward hierarchical model of invariant object recognition where invariance increases at each successive visual area along the ventral stream.
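The decoding analysis behind latencies like these trains and tests a classifier independently at each time sample; information is considered "present" once accuracy exceeds chance. A minimal scikit-learn sketch with synthetic data standing in for MEG recordings (array shapes and the classifier choice are assumptions):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def decode_over_time(X, y, cv=5):
    """Per-timepoint decoding: at each time sample, cross-validate a
    classifier on the sensor pattern and record mean accuracy.
    X: (trials, sensors, timepoints), y: (trials,) object labels."""
    accs = []
    for t in range(X.shape[2]):
        accs.append(cross_val_score(SVC(kernel="linear"),
                                    X[:, :, t], y, cv=cv).mean())
    return np.array(accs)

rng = np.random.default_rng(3)
X = rng.normal(size=(60, 30, 10))
y = np.repeat([0, 1], 30)
X[y == 1, :, 6:] += 1.0          # signal appears at "timepoint" 6
print(decode_over_time(X, y).round(2))  # accuracy rises from chance at t=6
```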
48. Huang, Kaiqi, and Tieniu Tan. "Vs-star: A visual interpretation system for visual surveillance." Pattern Recognition Letters 31, no. 14 (October 2010): 2265–85. http://dx.doi.org/10.1016/j.patrec.2010.05.029.

49. Young, A. W., D. J. Hellawell, S. Wright, and H. D. Ellis. "Reduplication of Visual Stimuli." Behavioural Neurology 7, no. 3-4 (1994): 135–42. http://dx.doi.org/10.1155/1994/249590.

Abstract:
Investigation of P.T., a man who experienced reduplicative delusions, revealed significant impairments on tests of recognition memory for faces and understanding of emotional facial expressions. On formal tests of his recognition abilities, P.T. showed reduplication to familiar faces, buildings, and written names, but not to familiar voices. Reduplication may therefore have been a genuinely visual problem in P.T.'s case, since it was not found to auditory stimuli. This is consistent with hypotheses which propose that the basis of reduplication can lie in part in malfunction of the visual system.
50. Zhao, Lan, and Tao Zeng. "Target Recognition Application of Real-Time Optical Information Processing System." Applied Mechanics and Materials 536-537 (April 2014): 197–200. http://dx.doi.org/10.4028/www.scientific.net/amm.536-537.197.

Abstract:
This paper focuses on the visual tracking algorithm of an optical imaging surveillance and tracking system. Tracking is cast as a sparse representation problem within a particle filter framework, which can effectively handle noise, occlusion, background interference, and complex situations such as illumination changes. Digital morphological methods are used to detect occluded areas and to decide whether the current result should be added to the template set, thereby controlling template updates and effectively preventing drift of the tracking results.
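In a particle filter tracker, the observation model turns each candidate patch's agreement with an appearance template into a weight. The sketch below uses a plain template residual where the paper uses sparse representation, so it is only a schematic of that weighting step (sigma and the patch sizes are arbitrary):

```python
import numpy as np

def particle_weights(patches, template, sigma=0.1):
    """Observation model for a template-based particle filter:
    particles whose image patch matches the appearance template
    receive exponentially higher (normalized) weight."""
    errs = np.array([np.mean((p - template) ** 2) for p in patches])
    w = np.exp(-errs / (2 * sigma ** 2))
    return w / w.sum()

rng = np.random.default_rng(4)
template = rng.random((8, 8))
patches = [template + rng.normal(scale=s, size=(8, 8)) for s in (0.05, 0.3, 0.6)]
print(particle_weights(patches, template).round(3))  # first particle dominates
```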