Journal articles on the topic 'Automated audio captioning'
Consult the top 30 journal articles for your research on the topic 'Automated audio captioning.'
Bokhove, Christian, and Christopher Downey. "Automated generation of ‘good enough’ transcripts as a first step to transcription of audio-recorded data." Methodological Innovations 11, no. 2 (2018): 205979911879074. http://dx.doi.org/10.1177/2059799118790743.
P. Jayanth, K. Lakshmi Sree, K. Karthik Kumar Reddy, G. Om Prakash, and G. Reddy Prasad. "Vision-to-Voice: AI for generating Description & Audio of Visual Content." International Research Journal of Innovations in Engineering and Technology 09, Special Issue ICCIS (2025): 206–13. https://doi.org/10.47001/irjiet/2025.iccis-202533.
Sejal Pawar, Shruti Mulay, Jivani Suryawanshi, Vaishnavi Walgude, and Prof. K. V. Patil. "Enhancing Traffic Scene and Understanding Through Image Captioning and Audio." International Research Journal on Advanced Engineering and Management (IRJAEM) 6, no. 07 (2024): 2423–29. http://dx.doi.org/10.47392/irjaem.2024.0349.
Haapaniemi, Riku, Annamaria Mesaros, Manu Harju, Irene Martín Morató, and Maija Hirvonen. "Primerjava semiotične konceptualizacije prevoda z besedilom, ki ga tvori UI." STRIDON: Journal of Studies in Translation and Interpreting 4, no. 1 (2024): 25–51. http://dx.doi.org/10.4312/stridon.4.1.25-51.
Kala, Sankalp, and Prof Sridhar Ranganathan. "Deep Learning Based Lipreading for Video Captioning." Engineering and Technology Journal 9, no. 05 (2024): 3935–46. https://doi.org/10.5281/zenodo.11120548.
Koenecke, Allison, Andrew Nam, Emily Lake, et al. "Racial disparities in automated speech recognition." Proceedings of the National Academy of Sciences 117, no. 14 (2020): 7684–89. http://dx.doi.org/10.1073/pnas.1915768117.
Mirzaei, Maryam Sadat, Kourosh Meshgi, Yuya Akita, and Tatsuya Kawahara. "Partial and synchronized captioning: A new tool to assist learners in developing second language listening skill." ReCALL 29, no. 2 (2017): 178–99. http://dx.doi.org/10.1017/s0958344017000039.
Guo, Rundong. "Advancing real-time close captioning: blind source separation and transcription for hearing impairments." Applied and Computational Engineering 30, no. 1 (2024): 125–30. http://dx.doi.org/10.54254/2755-2721/30/20230084.
Prabhala, Jagat Chaitanya, Venkatnareshbabu K, and Ragoju Ravi. "Optimizing Similarity Threshold for Abstract Similarity Metric in Speech Diarization Systems: A Mathematical Formulation." Applied Mathematics and Sciences An International Journal (MathSJ) 10, no. 1/2 (2023): 1–10. http://dx.doi.org/10.5121/mathsj.2023.10201.
Nam, Somang, and Deborah Fels. "Simulation of Subjective Closed Captioning Quality Assessment Using Prediction Models." International Journal of Semantic Computing 13, no. 01 (2019): 45–65. http://dx.doi.org/10.1142/s1793351x19400038.
Zhang, Ruijing. "A Comparative Analysis of LSTM and Transformer-based Automatic Speech Recognition Techniques." Transactions on Computer Science and Intelligent Systems Research 5 (August 12, 2024): 272–76. http://dx.doi.org/10.62051/zq6v0d49.
Gotmare, Abhay, Gandharva Thite, and Laxmi Bewoor. "A multimodal machine learning approach to generate news articles from geo-tagged images." International Journal of Electrical and Computer Engineering (IJECE) 14, no. 3 (2024): 3434. http://dx.doi.org/10.11591/ijece.v14i3.pp3434-3442.
Verma, Dr Neeta. "Assistive Vision Technology using Deep Learning Techniques." International Journal for Research in Applied Science and Engineering Technology 9, no. VII (2021): 2695–704. http://dx.doi.org/10.22214/ijraset.2021.36815.
Gotmare, Abhay, Gandharva Thite, and Laxmi Bewoor. "A multimodal machine learning approach to generate news articles from geo-tagged images." International Journal of Electrical and Computer Engineering (IJECE) 14, no. 3 (2024): 3434–42. https://doi.org/10.11591/ijece.v14i3.pp3434-3442.
Eren, Aysegul Ozkaya, and Mustafa Sert. "Automated Audio Captioning with Topic Modeling." IEEE Access, 2023, 1. http://dx.doi.org/10.1109/access.2023.3235733.
Xiao, Feiyang, Jian Guan, Qiaoxi Zhu, and Wenwu Wang. "Graph Attention for Automated Audio Captioning." IEEE Signal Processing Letters, 2023, 1–5. http://dx.doi.org/10.1109/lsp.2023.3266114.
Mei, Xinhao, Xubo Liu, Mark D. Plumbley, and Wenwu Wang. "Automated audio captioning: an overview of recent progress and new challenges." EURASIP Journal on Audio, Speech, and Music Processing 2022, no. 1 (2022). http://dx.doi.org/10.1186/s13636-022-00259-2.
Bagheri, Majid Haji, Emma Gu, Asif Abdullah Khan, et al. "Machine Learning‐Enabled Triboelectric Nanogenerator for Continuous Sound Monitoring and Captioning." Advanced Sensor Research, January 8, 2025. https://doi.org/10.1002/adsr.202400156.
Won, Hyejin, Baekseung Kim, Il-Youp Kwak, and Changwon Lim. "Using various pre-trained models for audio feature extraction in automated audio captioning." Expert Systems with Applications, June 2023, 120664. http://dx.doi.org/10.1016/j.eswa.2023.120664.
Poongodi, M., Mounir Hamdi, and Huihui Wang. "Image and audio caps: automated captioning of background sounds and images using deep learning." Multimedia Systems, February 26, 2022. http://dx.doi.org/10.1007/s00530-022-00902-0.
Kala, Sankalp, and Prof Sridhar Ranganathan. "Deep Learning Based Lipreading for Video Captioning." Engineering and Technology Journal 09, no. 05 (2024). http://dx.doi.org/10.47191/etj/v9i05.08.
Gencyilmaz, Izel Zeynep, and Kürşat Mustafa Karaoğlan. "Optimizing Speech to Text Conversion in Turkish: An Analysis of Machine Learning Approaches." Bitlis Eren Üniversitesi Fen Bilimleri Dergisi, March 20, 2024. http://dx.doi.org/10.17798/bitlisfen.1434925.
Lucia-Mulas, Maria Jose, Pablo Revuelta-Sanz, Belen Ruiz-Mezcua, and Israel Gonzalez-Carrasco. "Automatic music emotion classification model for movie soundtrack subtitling based on neuroscientific premises." Applied Intelligence, September 1, 2023. http://dx.doi.org/10.1007/s10489-023-04967-w.
Bochner, Joseph, Mark Indelicato, and Pralhad Konnur. "Effects of Sound Quality on the Accuracy of Telephone Captions Produced by Automatic Speech Recognition: A Preliminary Investigation." American Journal of Audiology, December 14, 2022, 1–8. http://dx.doi.org/10.1044/2022_aja-22-00102.
Starr, Kim Linda, Sabine Braun, and Jaleh Delfani. "Taking a Cue From the Human." Journal of Audiovisual Translation 3, no. 2 (2020). http://dx.doi.org/10.47476/jat.v3i2.2020.138.
Kuhn, Korbinian, Verena Kersken, Benedikt Reuter, Niklas Egger, and Gottfried Zimmermann. "Measuring the Accuracy of Automatic Speech Recognition Solutions." ACM Transactions on Accessible Computing, December 8, 2023. http://dx.doi.org/10.1145/3636513.
Hekanaho, Laura, Maija Hirvonen, and Tuomas Virtanen. "Language-based machine perception: linguistic perspectives on the compilation of captioning datasets." Digital Scholarship in the Humanities, June 21, 2024. http://dx.doi.org/10.1093/llc/fqae029.
Man, Xin, Jie Shao, Feiyu Chen, Mingxing Zhang, and Heng Tao Shen. "TEVL: Trilinear Encoder for Video-Language Representation Learning." ACM Transactions on Multimedia Computing, Communications, and Applications, February 24, 2023. http://dx.doi.org/10.1145/3585388.
Ellis, Katie, Mike Kent, and Gwyneth Peaty. "Captioned Recorded Lectures as a Mainstream Learning Tool." M/C Journal 20, no. 3 (2017). http://dx.doi.org/10.5204/mcj.1262.
Burwell, Catherine. "New(s) Readers: Multimodal Meaning-Making in AJ+ Captioned Video." M/C Journal 20, no. 3 (2017). http://dx.doi.org/10.5204/mcj.1241.