Academic literature on the topic 'Captioning models'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Captioning models.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Captioning models"

1

Sayyed, Rizwan, Akash Satpute, Tushar Varkhede, Prasad Zore, and Priya Surana. "Comparing Image Captioning Techniques using Deep Learning Models." Journal of Web Development and Web Designing 8, no. 1 (2023): 13–21. http://dx.doi.org/10.46610/jowdwd.2023.v08i01.002.

Abstract:
Websites today generate tremendous amounts of data and this data needs to be processed effectively by the creators. One such important factor is processing images on the website and generating effective data from it through techniques like Web Scraping. This process can be done through techniques like Image Captioning. Image captioning is a powerful process that involves generating descriptive image captions. The ability to generate detailed and accurate descriptions of images is extremely valuable in many different fields, particularly in machine learning-based applications and systems. Image
2

Ondeng, Oscar, Heywood Ouma, and Peter Akuon. "A Review of Transformer-Based Approaches for Image Captioning." Applied Sciences 13, no. 19 (2023): 11103. http://dx.doi.org/10.3390/app131911103.

Abstract:
Visual understanding is a research area that bridges the gap between computer vision and natural language processing. Image captioning is a visual understanding task in which natural language descriptions of images are automatically generated using vision-language models. The transformer architecture was initially developed in the context of natural language processing and quickly found application in the domain of computer vision. Its recent application to the task of image captioning has resulted in markedly improved performance. In this paper, we briefly look at the transformer architecture
3

Muzaffar, Rimsha, Syed Yasser Arafat, Junaid Rashid, Jungeun Kim, and Usman Naseem. "UICD: A new dataset and approach for urdu image captioning." PLOS One 20, no. 6 (2025): e0320701. https://doi.org/10.1371/journal.pone.0320701.

Abstract:
Advancements in deep learning have revolutionized numerous real-world applications, including image recognition, visual question answering, and image captioning. Among these, image captioning has emerged as a critical area of research, with substantial progress achieved in Arabic, Chinese, Uyghur, Hindi, and predominantly English. However, despite Urdu being a morphologically rich and widely spoken language, research in Urdu image captioning remains underexplored due to a lack of resources. This study creates a new Urdu Image Captioning Dataset (UCID) called UC-23-RY to fill in the gaps in Urd
4

Kavila, Selvani Deepthi, Moni Sushma Deep Kavila, Kanaka Raghu Sreerama, et al. "Image Captioning with Convolutional Neural Networks and Autoencoder-Transformer Model." International Journal of Experimental Research and Review 46 (December 30, 2024): 297–304. https://doi.org/10.52756/ijerr.2024.v46.023.

Abstract:
This study deals with emerging machine learning technologies, deep learning, and Transformers with autoencode-decode mechanisms for image captioning. This study is important to provide in-depth and detailed information about methodologies, algorithms and procedures involved in the task of captioning images. In this study, exploration and implementation of the most efficient technologies to produce relevant captions is done. This research aims to achieve a detailed understanding of image captioning using Transformers and convolutional neural networks, which can be achieved using various availab
5

Fudholi, Dhomas Hatta, Umar Abdul Aziz Al-Faruq, Royan Abida N. Nayoan, and Annisa Zahra. "A study on attention-based deep learning architecture model for image captioning." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 23–34. https://doi.org/10.11591/ijai.v13.i1.pp23-34.

Abstract:
Image captioning has been widely studied due to its ability in a visual scene understanding. Automatic visual scene understanding is useful for remote monitoring system and visually impaired people. Attention-based models, including transformer, are the current state-of-the-art architectures used in developing image captioning model. This study examines the works in the development of image captioning model, especially models that are developed based on attention mechanism. The architecture, the dataset, and the evaluation metrics analysis are done to the collected works. A general flow of ima

6

Shetty, Nikshep, and Yongmin Li. "Detailed Image Captioning and Hashtag Generation." Future Internet 16, no. 12 (2024): 444. http://dx.doi.org/10.3390/fi16120444.

Abstract:
This article presents CapFlow, an integrated approach to detailed image captioning and hashtag generation. Based on a thorough performance evaluation, the image captioning model utilizes a fine-tuned vision-language model with Low-Rank Adaptation (LoRA), while the hashtag generation employs the keyword extraction method. We evaluated the state-of-the-art image captioning models using both traditional metrics (BLEU, METEOR, ROUGE-L, and CIDEr) and the specialized CAPTURE metric for detailed captions. The hashtag generation models were assessed using precision, recall, and F1-score. The proposed

7

Kumar, Ravi, Dinesh Kumar, and Ahmad Saeed. "Image Captioning Using Deep Learning Models." Computer Science and Engineering 14, no. 6 (2024): 162–68. http://dx.doi.org/10.5923/j.computer.20241406.06.

8

Mayank and Naveen Kumar Gondhi. "Comparative Assessment of Image Captioning Models." Journal of Computational and Theoretical Nanoscience 17, no. 1 (2020): 473–78. http://dx.doi.org/10.1166/jctn.2020.8693.

Abstract:
Image Captioning is the combination of Computer Vision and Natural Language Processing (NLP) in which simple sentences have been automatically generated describing the content of the image. This paper presents the comparative analysis of different models used for the generation of descriptive English captions for a given image. Feature extractions of the images are done using Convolutional Neural Networks (CNN). These features are then passed onto Recurrent Neural Networks (RNN) or Long Short-term Memory (LSTM) to generate captions in English language. The evaluation metrics used to appraise

9

Al-Malla, Muhammad Abdelhadie, Assef Jafar, and Nada Ghneim. "Pre-trained CNNs as Feature-Extraction Modules for Image Captioning." ELCVIA Electronic Letters on Computer Vision and Image Analysis 21, no. 1 (2022): 1–16. http://dx.doi.org/10.5565/rev/elcvia.1436.

Abstract:
In this work, we present a thorough experimental study about feature extraction using Convolutional Neural Networks (CNNs) for the task of image captioning in the context of deep learning. We perform a set of 72 experiments on 12 image classification CNNs pre-trained on the ImageNet [29] dataset. The features are extracted from the last layer after removing the fully connected layer and fed into the captioning model. We use a unified captioning model with a fixed vocabulary size across all the experiments to study the effect of changing the CNN feature extractor on image captioning quality. The sco
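Several of the abstracts above describe the classic two-stage captioning pipeline: a CNN encodes the image into a feature vector, and an RNN or LSTM decodes that vector into words. The following is a minimal, illustrative sketch of that flow only; the vocabulary, dimensions, and random (untrained) weights are all hypothetical stand-ins, not any cited paper's implementation:

```python
import numpy as np

# Toy sketch of a CNN -> RNN captioning pipeline. All shapes, the vocabulary,
# and the weights are hypothetical; a real system would use a pre-trained CNN
# encoder and a trained LSTM decoder.
rng = np.random.default_rng(0)
vocab = ["<start>", "<end>", "a", "dog", "runs"]
feat_dim, hid_dim, vocab_size = 8, 8, len(vocab)

# Stand-in for CNN image features (e.g., the last pooling layer's output).
image_features = rng.normal(size=feat_dim)

# Randomly initialised decoder weights (untrained, illustration only).
W_h = rng.normal(size=(hid_dim, hid_dim))     # hidden-to-hidden
W_x = rng.normal(size=(hid_dim, vocab_size))  # word-to-hidden (one-hot input)
W_o = rng.normal(size=(vocab_size, hid_dim))  # hidden-to-vocab logits

def greedy_caption(features, max_len=5):
    """Greedy decoding: initialise the hidden state from the image features,
    then emit the argmax word at every step until <end> or max_len."""
    h = np.tanh(features)              # hidden state seeded by CNN features
    word = vocab.index("<start>")
    caption = []
    for _ in range(max_len):
        x = np.eye(vocab_size)[word]   # one-hot encoding of the previous word
        h = np.tanh(W_h @ h + W_x @ x) # simple RNN step (no LSTM gates)
        word = int(np.argmax(W_o @ h)) # most likely next word
        if vocab[word] == "<end>":
            break
        caption.append(vocab[word])
    return caption

print(greedy_caption(image_features))
```

In practice the random weights would be learned by maximising the likelihood of reference captions, and the simple RNN step would be replaced by an LSTM or an attention/transformer decoder, as the surveyed works compare.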

Dissertations / Theses on the topic "Captioning models"

1

Hoxha, Genc. "Image Captioning for Remote Sensing Image Analysis." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/351752.

Abstract:
Image Captioning (IC) aims to generate a coherent and comprehensive textual description that summarizes the complex content of an image. It is a combination of computer vision and natural language processing techniques to encode the visual features of an image and translate them into a sentence. In the context of remote sensing (RS) analysis, IC has been emerging as a new research area of high interest since it not only recognizes the objects within an image but also describes their attributes and relationships. In this thesis, we propose several IC methods for RS image analysis. We focus on t
2

Gennari, Riccardo. "End-to-end Deep Metric Learning con Vision-Language Model per il Fashion Image Captioning." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25772/.

Abstract:
Image captioning is a machine learning task that consists of generating a caption describing the characteristics of an input image. This can be applied, for example, to describe in detail the products for sale on an e-commerce site, improving the website's accessibility and enabling more informed purchasing for customers with visual impairments. Generating accurate descriptions for online fashion items is important not only for improving customers' shopping experiences, but also for increasing online sales. Beyond
3

Xu, Kelvin. "Exploring Attention Based Model for Captioning Images." Thesis, 2017. http://hdl.handle.net/1866/20194.


Book chapters on the topic "Captioning models"

1

Lim, Jian Han. "Ownership Protection for Image Captioning Models." In Digital Watermarking for Machine Learning Model. Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-7554-7_8.

2

Kalimuthu, Marimuthu, Aditya Mogadala, Marius Mosbach, and Dietrich Klakow. "Fusion Models for Improved Image Captioning." In Pattern Recognition. ICPR International Workshops and Challenges. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-68780-9_32.

3

Rodríguez-Juan, Javier, David Ortiz-Perez, Jose Garcia-Rodriguez, David Tomás, and Grzegorz J. Nalepa. "Indoor Scenes Video Captioning." In 18th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2023). Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-42536-3_15.

4

Zanfir, Mihai, Elisabeta Marinoiu, and Cristian Sminchisescu. "Spatio-Temporal Attention Models for Grounded Video Captioning." In Computer Vision – ACCV 2016. Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-54190-7_7.

5

Hendricks, Lisa Anne, Kaylee Burns, Kate Saenko, Trevor Darrell, and Anna Rohrbach. "Women Also Snowboard: Overcoming Bias in Captioning Models." In Computer Vision – ECCV 2018. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-01219-9_47.

6

Gallardo García, Rafael, Beatriz Beltrán Martínez, Carlos Hernández Gracidas, and Darnes Vilariño Ayala. "Towards Multilingual Image Captioning Models that Can Read." In Advances in Soft Computing. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-89820-5_2.

7

Belludi, Vinayaka A., Sumeet S. Inamdar, Kaushik Mallibhat, and Nalini C. Iyer. "VisionCraft: Advanced Image Captioning with Pre-trained Models." In Lecture Notes in Networks and Systems. Springer Nature Singapore, 2025. https://doi.org/10.1007/978-981-97-8865-1_2.

8

Nisha, Shailesh D. Kamble, and Himanshu Mittal. "Image Captioning Using Deep Learning Models: A Comprehensive Overview." In Communications in Computer and Information Science. Springer Nature Switzerland, 2025. https://doi.org/10.1007/978-3-031-91340-2_27.

9

Singh, Priya, and Ashish Verma. "Empirical Validation of Deep Learning Based on Image Captioning Models." In Proceedings of the 7th International Conference on Advance Computing and Intelligent Engineering. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-99-5015-7_38.

10

Gaurav and Pratistha Mathur. "Empirical Study of Image Captioning Models Using Various Deep Learning Encoders." In Lecture Notes in Electrical Engineering. Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-0047-3_27.


Conference papers on the topic "Captioning models"

1

Zatserkovnyi, Rostyslav, Zoriana Novosad, and Roksoliana Zatserkovna. "Large Language Models for Inclusive Image Captioning." In 2025 29th International Conference on Information Technology (IT). IEEE, 2025. https://doi.org/10.1109/it64745.2025.10930278.

2

Mahmoud, Anas, Mostafa Elhoushi, Amro Abbas, et al. "Sieve: Multimodal Dataset Pruning Using Image Captioning Models." In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2024. http://dx.doi.org/10.1109/cvpr52733.2024.02116.

3

Vo, Hiep, Shui Yu, and Xi Zheng. "Prompt Engineering Adversarial Attack Against Image Captioning Models." In 2024 17th International Conference on Security of Information and Networks (SIN). IEEE, 2024. https://doi.org/10.1109/sin63213.2024.10871522.

4

Tejesh, Vambara, and Supriya M. "Enhanced Image Captioning Using CNN and BLIP Models." In 2025 International Conference on Multi-Agent Systems for Collaborative Intelligence (ICMSCI). IEEE, 2025. https://doi.org/10.1109/icmsci62561.2025.10894526.

5

Sayed, Abdelrahman M., Mohamed K. Elhadad, Gouda I. Salama, and Aiman Mousa. "Improving Arabic Image Captioning with Vision-Language Models." In 2025 15th International Conference on Electrical Engineering (ICEENG). IEEE, 2025. https://doi.org/10.1109/iceeng64546.2025.11031299.

6

Srivastava, Swati. "Review of Recent Datasets Used in Image Captioning Models." In 2024 International Conference on Sustainable Communication Networks and Application (ICSCNA). IEEE, 2024. https://doi.org/10.1109/icscna63714.2024.10864233.

7

Kaushik, Priyanka, Patel Saileshchandra Rameshchandra, Mitali Kol, Ritushree Narayan, Abhishek Kumar Gupta, and R. Rajalakshmi. "Enhancing Image Captioning Accuracy through Hybrid Deep Learning Models." In 2024 1st International Conference on Advances in Computing, Communication and Networking (ICAC2N). IEEE, 2024. https://doi.org/10.1109/icac2n63387.2024.10895741.

8

Maaz, Ahmad, Shaheer Abbas, Raja Hashim Ali, Iftikhar Ahmed, and Talha Ali Khan. "VGG Models in Image Captioning: Which Architecture Delivers Better Descriptions?" In 2024 18th International Conference on Open Source Systems and Technologies (ICOSST). IEEE, 2024. https://doi.org/10.1109/icosst64562.2024.10871137.

9

Gamidi, Rohan, M. Hemasri, Tejaswi Muppala, Vinitha Chowdary A, Aiswariya Milan K, and Suja Palaniswamy. "Enhancing Underwater Image Captioning Using Transformer Models and Augmented Terrestrial Datasets." In 2025 International Conference on Pervasive Computational Technologies (ICPCT). IEEE, 2025. https://doi.org/10.1109/icpct64145.2025.10940912.

10

Sharma, Pankaj, and Sanjiv Sharma. "Synergistic Fusion of CNN and BiLSTM Models for Enhanced Video Captioning." In 2024 IEEE International Conference on Intelligent Signal Processing and Effective Communication Technologies (INSPECT). IEEE, 2024. https://doi.org/10.1109/inspect63485.2024.10896231.
