Journal articles on the topic 'Descripteur audio'

Author: Grafiati

Published: 4 June 2021

Last updated: 5 February 2022

Consult the top 18 journal articles for your research on the topic 'Descripteur audio.'

1

Li, Francis F. "Soft-Computing Audio Classification as a Pre-Processor for Automated Content Descriptor Generation." International Journal of Computer and Communication Engineering 3, no. 2 (2014): 101–4. http://dx.doi.org/10.7763/ijcce.2014.v3.300.

2

Wątrobiński, Damian. "Dylemat audiodeskryptora w procesie przekładu audiowizualnego." Investigationes Linguisticae 39 (May 31, 2019): 140–50. http://dx.doi.org/10.14746/il.2018.39.11.

Abstract:

The aim of this article is to point out the dilemma concerning the transmission of emotions which faces the audio descriptor when creating audio description for painting reproductions. It will indicate the importance of visual forms, dependencies between the translator, translation and emotions, as well as different approaches to the creation of audio description, with particular emphasis on the transfer of emotions. The theoretical considerations will be supplemented by a qualitative analysis of two emotionally colored audio descriptions.

3

Moore, Austin. "Dynamic Range Compression and the Semantic Descriptor Aggressive." Applied Sciences 10, no. 7 (March 30, 2020): 2350. http://dx.doi.org/10.3390/app10072350.

Abstract:

In popular music productions, the lead vocal is often the main focus of the mix and engineers will work to impart creative colouration onto this source. This paper conducts listening experiments to test if there is a correlation between perceived distortion and the descriptor “aggressive”, which is often used to describe the sonic signature of Universal Audio 1176, a much-used dynamic range compressor in professional music production. The results from this study show compression settings that impart audible distortion are perceived as aggressive by the listener, and there is a strong correlation between the subjective listener scores for distorted and aggressive. Additionally, it was shown there is a strong correlation between compression settings rated with high aggressive scores and the audio feature roughness.

4

Bloit, Julien, Nicolas Rasamimanana, and Frédéric Bevilacqua. "Modeling and segmentation of audio descriptor profiles with segmental models." Pattern Recognition Letters 31, no. 12 (September 2010): 1507–13. http://dx.doi.org/10.1016/j.patrec.2009.11.003.

5

Peng, Yu Qing, Wei Liu, Cui Cui Zhao, and Tie Jun Li. "Detection of Violent Video with Audio-Visual Features Based on MPEG-7." Applied Mechanics and Materials 411-414 (September 2013): 1002–7. http://dx.doi.org/10.4028/www.scientific.net/amm.411-414.1002.

Abstract:

To address the lack of an effective way to detect violent video on the network, a new method using MPEG-7 audio and visual features is put forward. For feature extraction, the method selects targeted features covering audio, color, space, time, and motion. Some MPEG-7 descriptors were added or improved: an instantaneous audio feature was added, the motion intensity descriptor was customized, and a new method for extracting the dominant color of a video was proposed. A BP neural network optimized by a GA was used to fuse the features. Experiments show that the selected features are representative and discriminative and reduce data redundancy, that the neural-network fusion model is more robust, and that fusing audio and visual features improves the recall and precision of video detection.

6

Wu, Pingping, Hong Liu, Xiaofei Li, Ting Fan, and Xuewu Zhang. "A Novel Lip Descriptor for Audio-Visual Keyword Spotting Based on Adaptive Decision Fusion." IEEE Transactions on Multimedia 18, no. 3 (March 2016): 326–38. http://dx.doi.org/10.1109/tmm.2016.2520091.

7

Xie, Zhibing, and Ling Guan. "Multimodal Information Fusion of Audio Emotion Recognition Based on Kernel Entropy Component Analysis." International Journal of Semantic Computing 07, no. 01 (March 2013): 25–42. http://dx.doi.org/10.1142/s1793351x13400023.

Abstract:

This paper focuses on the application of novel information theoretic tools in the area of information fusion. Feature transformation and fusion is critical for the performance of information fusion, however, the majority of the existing works depend on second order statistics, which is only optimal for Gaussian-like distribution. In this paper, the integration of information fusion techniques and kernel entropy component analysis provides a new information theoretic tool. The fusion of features is realized using descriptor of information entropy and is optimized by entropy estimation. A novel multimodal information fusion strategy of audio emotion recognition based on kernel entropy component analysis (KECA) has been presented. The effectiveness of the proposed solution is evaluated through experimentation on two audiovisual emotion databases. Experimental results show that the proposed solution outperforms the existing methods, especially when the dimension of feature space is substantially reduced. The proposed method offers general theoretical analysis which gives us an approach to implement information theory into multimedia research.

8

Nanni, Loris, Sheryl Brahnam, Alessandra Lumini, and Gianluca Maguolo. "Animal Sound Classification Using Dissimilarity Spaces." Applied Sciences 10, no. 23 (November 30, 2020): 8578. http://dx.doi.org/10.3390/app10238578.

Abstract:

The classifier system proposed in this work combines the dissimilarity spaces produced by a set of Siamese neural networks (SNNs) designed using four different backbones with different clustering techniques for training SVMs for automated animal audio classification. The system is evaluated on two animal audio datasets: one for cat and another for bird vocalizations. The proposed approach uses clustering methods to determine a set of centroids (in both a supervised and unsupervised fashion) from the spectrograms in the dataset. Such centroids are exploited to generate the dissimilarity space through the Siamese networks. In addition to feeding the SNNs with spectrograms, experiments process the spectrograms using the heterogeneous auto-similarities of characteristics. Once the similarity spaces are computed, each pattern is “projected” into the space to obtain a vector space representation; this descriptor is then coupled to a support vector machine (SVM) to classify a spectrogram by its dissimilarity vector. Results demonstrate that the proposed approach performs competitively (without ad-hoc optimization of the clustering methods) on both animal vocalization datasets. To further demonstrate the power of the proposed system, the best standalone approach is also evaluated on the challenging Dataset for Environmental Sound Classification (ESC50) dataset.

9

Castro, F. M., M. J. Marín-Jiménez, N. Guil Mata, and R. Muñoz-Salinas. "Fisher Motion Descriptor for Multiview Gait Recognition." International Journal of Pattern Recognition and Artificial Intelligence 31, no. 01 (January 2017): 1756002. http://dx.doi.org/10.1142/s021800141756002x.

Abstract:

The goal of this paper is to identify individuals by analyzing their gait. Instead of using binary silhouettes as input data (as done in many previous works) we propose and evaluate the use of motion descriptors based on densely sampled short-term trajectories. We take advantage of state-of-the-art people detectors to define custom spatial configurations of the descriptors around the target person, obtaining a rich representation of the gait motion. The local motion features (described by the Divergence-Curl-Shear descriptor [M. Jain, H. Jegou and P. Bouthemy, Better exploiting motion for better action recognition, in Proc. IEEE Conf. Computer Vision Pattern Recognition (CVPR) (2013), pp. 2555–2562]) extracted on the different spatial areas of the person are combined into a single high-level gait descriptor by using the Fisher Vector encoding [F. Perronnin, J. Sánchez and T. Mensink, Improving the Fisher kernel for large-scale image classification, in Proc. European Conf. Computer Vision (ECCV) (2010), pp. 143–156]. The proposed approach, coined Pyramidal Fisher Motion, is experimentally validated on the ‘CASIA’ dataset (parts B and C) [S. Yu, D. Tan and T. Tan, A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition, in Proc. Int. Conf. Pattern Recognition, Vol. 4 (2006), pp. 441–444], the ‘TUM GAID’ dataset [M. Hofmann, J. Geiger, S. Bachmann, B. Schuller and G. Rigoll, The TUM Gait from Audio, Image and Depth (GAID) database: Multimodal recognition of subjects and traits, J. Vis. Commun. Image Represent. 25(1) (2014) 195–206], the ‘CMU MoBo’ dataset [R. Gross and J. Shi, The CMU Motion of Body (MoBo) database, Technical Report CMU-RI-TR-01-18, Robotics Institute (2001)] and the recent ‘AVA Multiview Gait’ dataset [D. López-Fernández, F. Madrid-Cuevas, A. Carmona-Poyato, M. Marín-Jiménez and R. Muñoz-Salinas, The AVA multi-view dataset for gait recognition, in Activity Monitoring by Multiple Distributed Sensing, Lecture Notes in Computer Science (Springer, 2014), pp. 26–39]. The results show that this new approach achieves state-of-the-art results in gait recognition, allowing walking people to be recognized from diverse viewpoints on single and multiple camera setups, while wearing different clothes, carrying bags, walking at diverse speeds and not limited to straight walking paths.

10

Yang, Ming Liang, and Wei Ping Ding. "The Exploration of Evaluation Method about the Driving Electromotor Acoustic Comfort of the Pure Electric Vehicles." Applied Mechanics and Materials 224 (November 2012): 113–18. http://dx.doi.org/10.4028/www.scientific.net/amm.224.113.

Abstract:

The driving electromotor noise of a pure electric bus was taken as the evaluation object in this paper. The noise signals were recorded on two channels and synthesized in stereo to simulate human hearing, then processed into a series of noise samples for subjective testing, generated by progressively attenuating the noise sound pressure level in 3 dB steps. The author then investigated listeners’ subjective feelings of comfort or discomfort for the various noise samples through high-fidelity audio playback, described those feelings with a ‘descriptor’, and quantified them with scores at the same time. On this basis, the correlation between subjective feelings of acoustic comfort and discomfort was revealed, and the sets of noise samples corresponding to a feeling of comfort were identified. From these results, an evaluation method for electromotor acoustic comfort was established.

11

Xie, Zhibing, and Ling Guan. "Multimodal Information Fusion of Audiovisual Emotion Recognition Using Novel Information Theoretic Tools." International Journal of Multimedia Data Engineering and Management 4, no. 4 (October 2013): 1–14. http://dx.doi.org/10.4018/ijmdem.2013100101.

Abstract:

This paper aims at providing general theoretical analysis for the issue of multimodal information fusion and implementing novel information theoretic tools in multimedia application. The most essential issues for information fusion include feature transformation and reduction of feature dimensionality. Most previous solutions are largely based on the second order statistics, which is only optimal for Gaussian-like distribution, while in this paper we describe kernel entropy component analysis (KECA) which utilizes descriptor of information entropy and achieves improved performance by entropy estimation. The authors present a new solution based on the integration of information fusion theory and information theoretic tools in this paper. The proposed method has been applied to audiovisual emotion recognition. Information fusion has been implemented for audio and video channels at feature level and decision level. Experimental results demonstrate that the proposed algorithm achieves improved performance in comparison with the existing methods, especially when the dimension of feature space is substantially reduced.

12

Marín-Reyes, Pedro A., Itziar Irigoien, Basilio Sierra, Javier Lorenzo-Navarro, Modesto Castrillón-Santana, and Concepción Arenas. "ILRA: Novelty Detection in Face-Based Intervener Re-Identification." Symmetry 11, no. 9 (September 11, 2019): 1154. http://dx.doi.org/10.3390/sym11091154.

Abstract:

Transparency laws facilitate citizens to monitor the activities of political representatives. In this sense, automatic or manual diarization of parliamentary sessions is required, the latter being time consuming. In the present work, this problem is addressed as a person re-identification problem. Re-identification is defined as the process of matching individuals under different camera views. This paper, in particular, deals with open world person re-identification scenarios, where the captured probe in one camera is not always present in the gallery collected in another one, i.e., determining whether the probe belongs to a novel identity or not. This procedure is mandatory before matching the identity. In most cases, novelty detection is tackled applying a threshold founded in a linear separation of the identities. We propose a threshold-less approach to solve the novelty detection problem, which is based on a one-class classifier and therefore it does not need any user defined threshold. Unlike other approaches that combine audio-visual features, an Isometric LogRatio transformation of a posteriori (ILRA) probabilities is applied to local and deep computed descriptors extracted from the face, which exhibits symmetry and can be exploited in the re-identification process unlike audio streams. These features are used to train the one-class classifier to detect the novelty of the individual. The proposal is evaluated in real parliamentary session recordings that exhibit challenging variations in terms of pose and location of the interveners. The experimental evaluation explores different configuration sets where our system achieves significant improvement on the given scenario, obtaining an average F measure of 71.29% for online analyzed videos. In addition, ILRA performs better than face descriptors used in recent face-based closed world recognition approaches, achieving an average improvement of 1.6% with respect to a deep descriptor.

13

Yu, Guiping. "Emotion Monitoring for Preschool Children Based on Face Recognition and Emotion Recognition Algorithms." Complexity 2021 (March 2, 2021): 1–12. http://dx.doi.org/10.1155/2021/6654455.

Abstract:

In this paper, we study the face recognition and emotion recognition algorithms to monitor the emotions of preschool children. For previous emotion recognition focusing on faces, we propose to obtain more comprehensive information from faces, gestures, and contexts. Using the deep learning approach, we design a more lightweight network structure to reduce the number of parameters and save computational resources. There are not only innovations in applications, but also algorithmic enhancements. And face annotation is performed on the dataset, while a hierarchical sampling method is designed to alleviate the data imbalance phenomenon that exists in the dataset. A new feature descriptor, called “oriented gradient histogram from three orthogonal planes,” is proposed to characterize facial appearance variations. A new efficient geometric feature is also proposed to capture facial contour variations, and the role of audio methods in emotion recognition is explored. Multifeature fusion can be used to optimally combine different features. The experimental results show that the method is very effective compared to other recent methods in dealing with facial expression recognition problems about videos in both laboratory-controlled environments and outdoor environments. The method performed experiments on expression detection in a facial expression database. The experimental results are compared with data from previous studies and demonstrate the effectiveness of the proposed new method.

14

Brown, Scott A. W. "Free Trade, Yes; Ideology, Not So Much: The UK’s Shifting China Policy 2010-16." British Journal of Chinese Studies 8, no. 1 (April 3, 2019): 92–126. http://dx.doi.org/10.51661/bjocs.v8i1.21.

Abstract:

Fox and Godement’s (2009) Power Audit of EU-China Relations grouped the EU’s member states into four categories based on their national approaches to relations with, as well as their preferences for, the EU’s policies towards China. In this typology, the UK, at the time governed by New Labour, was designated an “Ideological Free Trader”, seeking to facilitate greater free trade while continuing to assert its ideological position, namely in the areas of democracy and human rights. Since the Conservative Party took the reins of power in 2010 (in coalition with the Liberal Democrats until 2015), China’s prominence on the UK’s foreign policy agenda has arguably increased. This paper examines the direction of the UK’s China policy since 2010, and asks whether the label “Ideological Free Trader” remains applicable. Through qualitative analysis of the evolving policy approach, it argues that while early policy stances appeared consistent with the descriptor, the emphasis on free trade has grown considerably whilst the normative (ideological) dimension has diminished. Consequently, the UK should be redefined as an “Accommodating Free Trader” (an amalgamation of two of Fox and Godement’s original groups: “Accommodating Mercantilist” and “Ideological Free Trader”). At the time of publication, the journal operated under its old name. When quoting, please cite the article under the name British Journal of Chinese Studies. The pdf of the article still reflects the old journal name; issue number and page range are consistent.

15

Hartley, S., A. Cropp, and J. Fisher. "163 Frailty, Doesn’t That Mean Birdlike? Research Into Attitudes and Understanding of Frailty in Undergraduate and Postgraduate Trainees." Age and Ageing 50, Supplement_1 (March 2021): i12–i42. http://dx.doi.org/10.1093/ageing/afab030.124.

Abstract:

Introduction: Frailty is an increasingly recognised concept, with 25–50% of over 85’s estimated to be frail [1]. Previous research considered medical students’ attitudes towards older people [2], yet despite the correlation between frailty and increased age [3], little is known about attitudes of healthcare professionals towards frailty. We researched attitudes towards, and understanding of, frailty in undergraduate and postgraduate trainees, with a view to guiding future educational interventions. Method: Approval was granted by Northumbria Healthcare NHS Foundation Trust (NHCT) Research and Development department, Newcastle University’s Research Management Group and the HRA. 3 cohorts were recruited; 3rd year Newcastle University MBBS Students, 5th year Newcastle University MBBS Students and Foundation Year 2 and Core Medical Trainees working for NHCT (junior doctors). Data was collected during scheduled teaching at NHCT and individuals were invited to participate via email prior to this. Those not participating were still required to attend the teaching. Participants provided written consent. Within each cohort, small group discussions around frailty and Comprehensive Geriatric Assessment (CGA) were prompted using open questions (e.g. “what does frailty mean to you?”), during which participants anonymously submitted phrases to an online word-cloud generator. Discussions were audio recorded and transcribed. Transcriptions and word-clouds were analysed using Simple Content Analysis. The over-arching themes within each cohort were identified and compared with other cohorts. Interpretations were reviewed by an independent researcher to enhance rigour. Results: Each cohort associated frailty with older age and weakness, and often used it as a byword for complexity. Frailty was described as an abstract construct composed of personal experiences rather than an objectively defined descriptor. All associated it with negative emotions. Cohorts differed in their approach, with 3rd year students primarily focussed on defining frailty, whereas junior doctors prioritised the clinical challenges it presented. Junior doctors demonstrated limited understanding of CGA whilst undergraduate students were almost universally ignorant of it. Conclusions: The lack of understanding around frailty and CGA is concerning given its high prevalence. The identification of negative emotions increases this concern. To challenge this, focussed educational interventions addressing understanding and attitudes ought to be developed for tomorrow’s doctors. References: [1] Clegg et al. Lancet 2013; 381: 752–62. [2] Samra et al. Age Ageing 2017; 46: 911–9. [3] Romero-Ortuno et al. Age Ageing 2012; 41: 684–9.

16

"Speaker Diarization based on Black-Hole Entropy Fuzzy Clustering using Cepstral Features." International Journal of Engineering and Advanced Technology 9, no. 4 (April 30, 2020): 1055–61. http://dx.doi.org/10.35940/ijeat.d7832.049420.

Abstract:

Speaker diarization is the process of identifying the speakers in an audio sequence. This paper proposes a speaker diarization method using Black-hole entropy fuzzy clustering (BHEFC) and multiple kernel weighted Mel frequency cepstral coefficient (MKMFCC) parameterization. Initially, the MKMFCC descriptor extracts the cepstral features from the input audio signal. These features are then used by the BHEFC to cluster the speakers into groups. The feature parameterization uses both the high- and low-energy frames of the audio signal for speaker indexing, which results in accurate separation of the speakers. The performance of the proposed speaker diarization system is evaluated using measures such as F-measure, diarization error rate, and false alarm rate. Compared with the existing methods, the proposed MKMFCC with BHEFC obtained a minimum diarization error rate of 0.2447, a maximum F-measure of 0.8526 and a minimum false alarm rate of 0.4299 when changing the wavelength, and a minimum diarization error rate of 0.2447, a maximum F-measure of 0.8526 and a minimum false alarm rate of 0.4298 when changing the frame length.

17

Karim, Abdul Amir Abdullah, and Rafal Ali Sameer. "Static and Dynamic Video Summarization." Iraqi Journal of Science, July 19, 2019, 1627–38. http://dx.doi.org/10.24996/ijs.2019.60.7.23.

Abstract:

A video consists of a large number of frames synchronized with audio, so storing it requires more space, delivering it is slower, and processing it is computationally expensive. Video summarization conveys the information in an entire video in a minimal amount of time. This paper proposes static and dynamic video summarization methods. The proposed static method comprises several steps: extracting frames from the video, selecting keyframes, extracting and describing features, matching feature descriptors against a bag of visual words, and finally saving the frames whose features match. The proposed dynamic method, in general terms, extracts the audio from the video, computes audio features as the average of the samples in each window, and finds the highest average, which marks the portion of the video with the loudest sound. The experimental results for the static method show no redundancy among the selected representative keyframes, and the subjective evaluation confirms the importance of the selected keyframes. The experimental results for the dynamic method show that all the goal segments were extracted to form the video summary. Both methods were applied to football (soccer) videos.
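
The loudest-window step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `loudest_window`, the non-overlapping windows, and the use of the mean absolute amplitude (a plain average of signed audio samples would sit near zero) are all assumptions made for illustration.

```python
def loudest_window(samples, window_size):
    """Return (start_index, mean_abs_amplitude) of the loudest window.

    Hypothetical helper: splits `samples` into non-overlapping windows of
    `window_size` samples and scores each by its mean absolute amplitude.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if len(samples) < window_size:
        raise ValueError("need at least one full window of samples")

    best_start, best_mean = 0, float("-inf")
    # Step through the signal one full window at a time.
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        mean_abs = sum(abs(s) for s in window) / window_size
        if mean_abs > best_mean:
            best_start, best_mean = start, mean_abs
    return best_start, best_mean


# Toy signal: quiet, loud, quiet (assumed amplitudes in [-1, 1]).
signal = [0.1, -0.1, 0.2, -0.2, 0.9, -0.8, 1.0, -0.9, 0.0, 0.1, 0.0, -0.1]
start, level = loudest_window(signal, window_size=4)
print(start, round(level, 3))
```

With a known sample rate, dividing `start` by that rate would give the timestamp at which a summary segment could be cut; that mapping is likewise an assumption, not something the abstract specifies.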

18

"User Satisfaction Detection System for Smart Healthcare using Multimedia." International Journal of Innovative Technology and Exploring Engineering 8, no. 10 (August 10, 2019): 3763–66. http://dx.doi.org/10.35940/ijitee.j9969.0881019.

Abstract:

Emotion plays a critical role in effectively conveying one’s convictions and intentions. As a result, emotion identification has become the focus of several recent studies. Patient monitoring models are becoming significant in patient care and can provide caregivers and clinicians with helpful feedback on health issues. In this work, a patient satisfaction recognition framework is proposed that uses image frames extracted from a recorded audio-visual dataset. The images are processed with techniques such as Local Binary Pattern (LBP), which is a visual descriptor. The proposed framework extracts features from the images and then applies a Support Vector Machine (SVM) for classification. Three distinct emotions are detected: whether the patient is happy, sad or neutral. The result of such an analysis can be used by a group of analysts, including doctors, healthcare experts and system experts, to improve the smart healthcare system in steps. The reliability of the information provided by such a system makes such upgrades more meaningful.
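
For readers unfamiliar with the Local Binary Pattern descriptor named in this abstract, the following Python sketch shows the basic 3×3 variant. It is an illustration under stated assumptions, not the paper's pipeline: the function names `lbp_code` and `lbp_histogram`, the clockwise neighbour ordering, and the `>=` comparison are choices made here, and the SVM classification stage is omitted.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code for the centre pixel of a 3x3 patch.

    Each neighbour contributes a 1 bit when it is >= the centre value.
    Neighbours are visited clockwise from the top-left (assumed ordering).
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# Toy grayscale patch (hypothetical values).
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch))  # 241
```

A histogram such as the one returned by `lbp_histogram` is the kind of fixed-length feature vector that could then be passed to a classifier.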
