Journal articles on the topic 'The Caption'

Author: Grafiati

Published: 7 June 2025

Last updated: 2 August 2025

Consult the top 50 journal articles for your research on the topic 'The Caption.'

1

Lai, Hongling, Dianjian Wang, and Xiancai Ou. "The Effects of Different Caption Modes on Chinese English Learners' Content and Vocabulary Comprehension." International Journal of Computer-Assisted Language Learning and Teaching 11, no. 4 (2021): 54–68. http://dx.doi.org/10.4018/ijcallt.2021100104.

Abstract:

This empirical study investigates the effects of different caption modes on the content and vocabulary comprehension by Chinese English learners with different levels of English proficiency. The results show that the full captioned group performed better on content comprehension than the keyword group, while no significant difference was found on vocabulary comprehension between the two captioned groups. For the beginning-level learners, the full captioned groups did better both in content and vocabulary comprehension than the keyword caption group; meanwhile, for the advanced learners, both f

2

Butler, Janine. "The Visual Experience of Accessing Captioned Television and Digital Videos." Television & New Media 21, no. 7 (2019): 679–96. http://dx.doi.org/10.1177/1527476418824805.

Abstract:

The increase in video-based communication has made different caption styles more apparent to audiences, including hearing viewers who watch social media videos with colorful open captions. To explore how viewers respond to a variety of caption styles, this article shares findings from three focus group discussions with twenty deaf and hard-of-hearing college students. This article begins by discussing the accessibility of captioned television and digital media and how captions can influence the viewing experience. This article then analyzes deaf and hard-of-hearing focus group participants’ st

3

Cárdenas, Monica, and Daniela Rocio Ramirez Orellana. "Progressive Reduction of Captions in Language Learning." Journal of Information Technology Education: Innovations in Practice 23 (2024): 002. http://dx.doi.org/10.28945/5263.

Abstract:

Aim/Purpose: This exploratory qualitative case study examines the perceptions of high-school learners of English regarding a pedagogical intervention involving progressive reduction of captions (full, sentence-level, keyword captions, and no-captions) in enhancing language learning. Background: Recognizing the limitations of caption usage in fostering independent listening comprehension in non-captioned environments, this research builds upon and extends the foundational work of Vanderplank (2016), who highlighted the necessity of a comprehensive blend of tasks, strategies, focused viewing, an

4

Muehlbradt, Annika, and Shaun K. Kane. "What's in an ALT Tag? Exploring Caption Content Priorities through Collaborative Captioning." ACM Transactions on Accessible Computing 15, no. 1 (2022): 1–32. http://dx.doi.org/10.1145/3507659.

Abstract:

Evaluating the quality of accessible image captions with human raters is difficult, as it may be difficult for a visually impaired user to know how comprehensive a caption is, whereas a sighted assistant may not know what information a user will need from a caption. To explore how image captioners and caption consumers assess caption content, we conducted a series of collaborative captioning sessions in which six pairs, consisting of a blind person and their sighted partner, worked together to discuss, create, and evaluate image captions. By making captioning a collaborative task, we were able

5

Hsu, Hui-Tzu. "Incidental professional vocabulary acquisition of EFL business learners: Effect of captioned video with glosses as a multimedia annotation." JALT CALL Journal 14, no. 2 (2018): 119–42. http://dx.doi.org/10.29140/jaltcall.v14n2.j227.

Abstract:

Use of captioned video in classrooms has gained considerable attention in second and foreign language learning. However, the effect of captioned video embedded with glosses on incidental vocabulary enhancement has not been explored. This study aims to examine the effect of video captions with glosses on EFL students’ incidental business vocabulary acquisition; 50 students from a college of management served as participants. A pretest was adopted to ensure participants lacked familiarity with the target vocabulary. All participants watched three video clips presented in three

6

Li, Yan. "Listen or Read? The Impact of Proficiency and Visual Complexity on Learners’ Reliance on Captions." Behavioral Sciences 15, no. 4 (2025): 542. https://doi.org/10.3390/bs15040542.

Abstract:

This study investigates how Chinese EFL (English as a foreign language) learners of low- and high-proficiency levels allocate attention between captions and audio while watching videos, and how visual complexity (single- vs. multi-speaker content) influences caption reliance. The study employed a novel paused transcription method to assess real-time processing. A total of 64 participants (31 low-proficiency [A1–A2] and 33 high-proficiency [C1–C2] learners) viewed single- and multi-speaker videos with English captions. Misleading captions were inserted to objectively measure reliance on caption

7

Hsu, Ching-Kun. "Learning motivation and adaptive video caption filtering for EFL learners using handheld devices." ReCALL 27, no. 1 (2014): 84–103. http://dx.doi.org/10.1017/s0958344014000214.

Abstract:

The aim of this study was to provide adaptive assistance to improve the listening comprehension of eleventh grade students. This study developed a video-based language learning system for handheld devices, using three levels of caption filtering adapted to student needs. Elementary level captioning excluded 220 English sight words (see Section 1 for definition), but provided captions and Chinese translations for the remaining words. Intermediate level excluded 1000 high frequency English words, but provided captions for the remaining words, and 2200 high frequency English words were ex

8

Suh, Hyesun, Jiyeon Kim, Jinsoo So, and Jongjin Jung. "A core region captioning framework for automatic video understanding in story video contents." International Journal of Engineering Business Management 14 (January 2022): 184797902210781. http://dx.doi.org/10.1177/18479790221078130.

Abstract:

Due to the rapid increase in images and image data, research examining the visual analysis of such unstructured data has recently come to be actively conducted. One of the representative image caption models, the DenseCap model, extracts various regions in an image and generates region-level captions. However, since the existing DenseCap model does not consider priority for region captions, it is difficult to identify relatively significant region captions that best describe the image. There has also been a lack of research into captioning focusing on the core areas for story content, such as im

9

Li, Hongxiang, Meng Cao, Xuxin Cheng, Yaowei Li, Zhihong Zhu, and Yuexian Zou. "Exploiting Auxiliary Caption for Video Grounding." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (2024): 18508–16. http://dx.doi.org/10.1609/aaai.v38i17.29812.

Abstract:

Video grounding aims to locate a moment of interest matching the given query sentence from an untrimmed video. Previous works ignore the sparsity dilemma in video annotations, which fails to provide the context information between potential events and query sentences in the dataset. In this paper, we contend that exploiting easily available captions which describe general actions, i.e., auxiliary captions defined in our paper, will significantly boost the performance. To this end, we propose an Auxiliary Caption Network (ACNet) for video grounding. Specifically, we first introduce dense video

10

Yang, Jie Chi, and Peichin Chang. "Captions and reduced forms instruction: The impact on EFL students’ listening comprehension." ReCALL 26, no. 1 (2013): 44–61. http://dx.doi.org/10.1017/s0958344013000219.

Abstract:

For many EFL learners, listening poses a grave challenge. The difficulty in segmenting a stream of speech and limited capacity in short-term memory are common weaknesses for language learners. Specifically, reduced forms, which frequently appear in authentic informal conversations, compound the challenges in listening comprehension. Numerous interventions have been implemented to assist EFL language learners, and of these, the application of captions has been found highly effective in promoting learning. Few studies have examined how different modes of captions may enhance listening co

11

Gotmare, Yugant. "A Comparative Study of Feature Extraction Models for Image Caption Generation." International Journal for Research in Applied Science and Engineering Technology 12, no. 4 (2024): 4821–28. http://dx.doi.org/10.22214/ijraset.2024.61114.

Abstract:

Image caption generation is a challenging task in the field of computer vision and natural language processing. This study presents a comparative analysis of various feature extraction models for image caption generation. The goal is to evaluate the performance and effectiveness of different models in capturing visual features and generating accurate and contextually relevant captions. The feature extraction models considered in this study include ResNet (Residual Neural Network), DenseNet, VGG (Visual Geometry Group), InceptionNet, and XceptionNet. We conduct extensive exp

12

Nabagata, Saha, V. Akhila Y., and Radha Krishna P. "An Improved Image Captioning Using Emotions." Journal of Innovation Sciences and Sustainable Technologies 1, no. 2 (2021): 91–118. https://doi.org/10.0608/JISST.2022944590.

Abstract:

Image captioning has been a challenging area for generating captions that closely resemble how humans would caption a particular image. The state-of-the-art exists in factual captions to caption a given image that contains inanimate objects. However, captioning images with humans using facial expressions remains a field that has not been tinkered into. This paper proposes a novel method that realizes this task. The emotion recognized on the human subject present in the image is concatenated along with image features and fed to an image captioning model. The caption generated is more relevant an

13

Lavanya, K., B. Jayamala, C. Jeyasri, and A. Sakthivel. "Automatic Audio and Image Caption Generation with Deep Learning." Shanlax International Journal of Arts, Science and Humanities 11, S3-July (2024): 34–39. http://dx.doi.org/10.34293/sijash.v11is3-july.7916.

Abstract:

We present a novel approach to image caption generation tailored specifically for visually impaired individuals. The proposed system employs advanced computer vision algorithms to analyze images and generate descriptive textual captions. Furthermore, it integrates seamless text-to-speech conversion functionality, allowing for the automatic transformation of these captions into spoken audio, thereby enabling access to visual content for individuals with visual impairments. The goal of this project is to generate descriptive captions for a given photograph or image. We achieve this by employing Convolution

14

Thakre, Priyanka V., Ashish S. Sambare, and Namrata S. Khade. "Enhance Approach for Auto Caption Generation on Different News Images Dataset Using Fuzzy Logic." International Journal of Engineering Sciences & Research Technology 5, no. 7 (2016): 597–602. https://doi.org/10.5281/zenodo.57047.

Abstract:

Today, many search engines retrieve images without analysing their content, simply by matching user queries against the image’s file name and format, user-annotated tags, captions, and, generally, text surrounding the image. The retrieved images may also come with accompanying textual data. We introduce the task of automatic caption generation for news images. The task fuses insights from computer vision and natural language processing and holds promise for various multimedia applications, such as image retrieval, development

15

Nani Prihatmi, Tutut, Rini Anjarwati, and Puji Rahayu. "The Use of English on Instagram Captions: A Case Study in Camera Indonesia Photography Community." EDUTEC: Journal of Education and Technology 5, no. 1 (2021): 154–60. http://dx.doi.org/10.29062/edu.v5i1.238.

Abstract:

Along with Instagram's growing popularity as the primary medium for sharing photos and videos, the world of photography is also expanding at a rapid pace. The Instagram caption is always dropped every time people upload on social media. Photo captions often act as a result of critical thinking and a way to communicate ideas. This study used a qualitative descriptive approach to collect and describe information about using English captions on Instagram photos from the perspectives of Camera Indonesia photography community members: frequency, the purpose for dropping captions in English, its effe

16

Fauzi Rahman, Ummul Qura, Nini Ibrahim. "Survei Intensitas Menulis Caption di Instagram Mahasiswa Pendidikan Bahasa dan Sastra Indonesia FKIP UHAMKA (Caption Writing Intensity Survey on Instagram by Pendidikan Bahasa dan Sastra Indonesia’s Students at FKIP UHAMKA)." Jurnal Bahasa, Sastra dan Pembelajarannya 12, no. 1 (2022): 1. http://dx.doi.org/10.20527/jbsp.v12i1.13041.

Abstract:

Caption Writing Intensity Survey on Instagram by Pendidikan Bahasa dan Sastra Indonesia’s Students at FKIP UHAMKA. In their daily life, students often use Instagram, both in reading other people's statuses and in writing captions for themselves. Therefore, this study aims to determine the intensity of writing Instagram captions of the students of Indonesian Language and Literature Education at FKIP UHAMKA. This research method uses a quantitative survey approach with a descriptive survey through a Cross Sectional Study research design. The results show that on the indicator of appreciatio

17

Verma, Neeta. "Assistive Vision Technology using Deep Learning Techniques." International Journal for Research in Applied Science and Engineering Technology 9, no. VII (2021): 2695–704. http://dx.doi.org/10.22214/ijraset.2021.36815.

Abstract:

One of the most important functions of the human visual system is automatic captioning. Caption generation is one of the more interesting and focused areas of AI, with numerous challenges to overcome. If there is an application that automatically captions the scenes in which a person is present and converts the caption into a clear message, people will benefit from it in a variety of ways. In this, we offer a deep learning model that detects things or features in images automatically, produces descriptions for the images, and transforms the descriptions to audio for louder readout. The model u

18

Lu, Yifan, Ziqi Zhang, Chunfeng Yuan, et al. "Set Prediction Guided by Semantic Concepts for Diverse Video Captioning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 4 (2024): 3909–17. http://dx.doi.org/10.1609/aaai.v38i4.28183.

Abstract:

Diverse video captioning aims to generate a set of sentences to describe the given video in various aspects. Mainstream methods are trained with independent pairs of a video and a caption from its ground-truth set without exploiting the intra-set relationship, resulting in low diversity of generated captions. Different from them, we formulate diverse captioning into a semantic-concept-guided set prediction (SCG-SP) problem by fitting the predicted caption set to the ground-truth set, where the set-level relationship is fully captured. Specifically, our set prediction consists of two synergisti

19

Li, Franklin Mingzhe, Cheng Lu, Zhicong Lu, Patrick Carrington, and Khai N. Truong. "An Exploration of Captioning Practices and Challenges of Individual Content Creators on YouTube for People with Hearing Impairments." Proceedings of the ACM on Human-Computer Interaction 6, CSCW1 (2022): 1–26. http://dx.doi.org/10.1145/3512922.

Abstract:

Deaf and Hard-of-Hearing (DHH) audiences have long complained about caption qualities for many online videos created by individual content creators on video-sharing platforms (e.g., YouTube). However, there is a lack of exploration of the practices, challenges, and perceptions of online video captions from the perspectives of both individual content creators and DHH audiences. In this work, we first explore DHH audiences' feedback on and reactions to YouTube video captions through interviews with 13 DHH individuals, and uncover DHH audiences' experiences, challenges, and perceptions on watching videos cr

20

V, Anagha, and K. S. Kuppusamy. "A Survey on Machine Learning Techniques for Video Caption Accessibility to Assist Children with Learning Disabilities." International Journal for Research in Applied Science and Engineering Technology 11, no. 1 (2023): 1555–66. http://dx.doi.org/10.22214/ijraset.2022.48867.

Abstract:

On the World Wide Web, videos have become a speedy and effective information distribution method. Caretakers observed that YouTube was the most popular medium for kids during the COVID-19 epidemic, with more than 78% of kids watching it. To make a video accessible to children with learning impairments, captions are considered an assistive technology tool. Those with learning disabilities are not able to comprehend text content quickly within a video timeframe. We explored the temporal dimension of the caption of a frame in the video. The main difficulties with video caption accessibi

21

Iwamura, Kiyohiko, Jun Younes Louhi Kasahara, Alessandro Moro, Atsushi Yamashita, and Hajime Asama. "Image Captioning Using Motion-CNN with Object Detection." Sensors 21, no. 4 (2021): 1270. http://dx.doi.org/10.3390/s21041270.

Abstract:

Automatic image captioning has many important applications, such as the depiction of visual contents for visually impaired people or the indexing of images on the internet. Recently, deep learning-based image captioning models have been researched extensively. For caption generation, they learn the relation between image features and words included in the captions. However, image features might not be relevant for certain words such as verbs. Therefore, our earlier reported method included the use of motion features along with image features for generating captions including verbs. However, al

22

Birthi, Sakshi. "ReCap Pro: Caption Correction using Meta Learning." Journal of Information Systems Engineering and Management 10, no. 30s (2025): 686–95. https://doi.org/10.52783/jisem.v10i30s.4891.

Abstract:

This article presents ReCap Pro, a framework that corrects auto-generated captions by dealing with the possible errors in nouns and verbs in the caption. While caption correction has been attempted earlier, it is observed that it has never been tried as a meta-learning-based approach. The work described in this article offers few-shot learning enabling faster learning with fewer samples of images, solving one of the critical limitations of the traditional data-intensive caption generation models. An object detection model trained using Reptile Meta-Learning is employed to detect the correct no

23

Mas, Farrah Chaiya, and Muhammad Yunus Anis. "The Effect of Applying Arabic Translation Techniques on the Translation Quality Assessment of Al Jazeera Captions on TikTok Social Media." Arabiyatuna: Jurnal Bahasa Arab 8, no. 2 (2024): 509–36. https://doi.org/10.29240/jba.v8i2.10703.

Abstract:

The purpose of this study is to analyze the level of accuracy of Arabic caption translation on TikTok @aljazeera social media from the perspective of techniques and types of captions used to ensure whether the results of caption translation with the source language, namely Arabic on TikTok @aljazeera social media can be well received by readers in the target language, namely Indonesian. TikTok is one of the popular social media platforms today where the Qatari news account, @aljazeera, shares news in Arabic with captions to explain the uploaded videos. The data in this study are captions on Ti

24

Kapadnis, Trushna, Anuja Modhave, Akanksha Narwade, Umesh Wagh, and Deepali Sale. "Automatic Intelligence Caption Generator." International Journal of Scientific Research in Engineering and Management 07, no. 11 (2023): 1–11. http://dx.doi.org/10.55041/ijsrem26789.

Abstract:

An Image Caption Generator is a sophisticated AI system that combines computer vision and natural language processing to automatically create descriptive textual captions for images. This technology utilizes deep learning, particularly Convolutional Neural Networks (CNNs), to analyze and extract meaningful visual features from the input image. These features capture details about the objects, scenes, and elements within the image. Subsequently, a natural language processing model, often built on Recurrent Neural Networks (RNNs) or Transformers, processes these visual features and generates coh

25

Leveridge, Aubrey Neil, and Jie Chi Yang. "Testing learner reliance on caption supports in second language listening comprehension multimedia environments." ReCALL 25, no. 2 (2013): 199–214. http://dx.doi.org/10.1017/s0958344013000074.

Abstract:

Listening comprehension in a second language (L2) is a complex and particularly challenging task for learners. Because of this, L2 learners and instructors alike employ different learning supports as assistance. Captions in multimedia instruction readily provide support and thus have been an ever-increasing focus of many studies. However, captions must eventually be removed, as the goal of language learning is participation in the target language where captions are not typically available. Consequently, this creates a dilemma particularly for language instructors as to the usage of cap

26

Hu, Xiaowei, Xi Yin, Kevin Lin, et al. "VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (2021): 1575–83. http://dx.doi.org/10.1609/aaai.v35i2.16249.

Abstract:

It is highly desirable yet challenging to generate image captions that can describe novel objects which are unseen in caption-labeled training data, a capability that is evaluated in the novel object captioning challenge (nocaps). In this challenge, no additional image-caption training data, other than COCO Captions, is allowed for model training. Thus, conventional Vision-Language Pre-training (VLP) methods cannot be applied. This paper presents VIsual VOcabulary pre-training (VIVO) that performs pre-training in the absence of caption annotations. By breaking the dependency of paired image-ca

27

Jeon, Minseong, Jaepil Ko, and Kyungjoo Cheoi. "Enhancing Surveillance Systems: Integration of Object, Behavior, and Space Information in Captions for Advanced Risk Assessment." Sensors 24, no. 1 (2024): 292. http://dx.doi.org/10.3390/s24010292.

Abstract:

This paper presents a novel approach to risk assessment by incorporating image captioning as a fundamental component to enhance the effectiveness of surveillance systems. The proposed surveillance system utilizes image captioning to generate descriptive captions that portray the relationship between objects, actions, and space elements within the observed scene. Subsequently, it evaluates the risk level based on the content of these captions. After defining the risk levels to be detected in the surveillance system, we constructed a dataset consisting of [Image-Caption-Danger Score]. Our datase

28

Reddy, Paidimarla Naveen. "Image Captioning Using Deep Learning." International Journal for Research in Applied Science and Engineering Technology 11, no. 6 (2023): 107–12. http://dx.doi.org/10.22214/ijraset.2023.52822.

Abstract:

Automatically generating a description or caption of an image in natural-language sentences is a very challenging task. It requires both methods from computer vision to understand the content of the image and a language model from the field of natural language processing to turn the understanding of the image into words in the right order. In addition, we have discussed how this model can be implemented on the web and made accessible to end users. Our project aims to implement an image caption generator that responds to the c

29

Ji, Wanting, and Ruili Wang. "A Multi-instance Multi-label Dual Learning Approach for Video Captioning." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 2s (2021): 1–18. http://dx.doi.org/10.1145/3446792.

Abstract:

Video captioning is a challenging task in the field of multimedia processing, which aims to generate informative natural language descriptions/captions to describe video contents. Previous video captioning approaches mainly focused on capturing visual information in videos using an encoder-decoder structure to generate video captions. Recently, a new encoder-decoder-reconstructor structure was proposed for video captioning, which captured the information in both videos and captions. Based on this, this article proposes a novel multi-instance multi-label dual learning approach (MIMLDL) to gener

30

Pandit, Ashwini. "Image Caption Generator." International Journal of Scientific Research in Engineering and Management 08, no. 05 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem34949.

Abstract:

An Image Caption Generator is a sophisticated AI system that combines computer vision and natural language processing to automatically create descriptive textual captions for images. This technology utilizes deep learning, particularly Convolutional Neural Networks (CNNs), to analyze and extract meaningful visual features from the input image. These features capture details about the objects, scenes, and elements within the image. Subsequently, a natural language processing model, often built on Recurrent Neural Networks (RNNs) or Transformers, processes these visual features and generates coh

31

Kwon, Hyun, and Sanghyun Lee. "Toward Backdoor Attacks for Image Captioning Model in Deep Neural Networks." Security and Communication Networks 2022 (August 16, 2022): 1–10. http://dx.doi.org/10.1155/2022/1525052.

Abstract:

Deep neural networks perform well in image recognition, speech recognition, and text recognition fields. The image caption model provides captions for images by generating text after image recognition. After extracting features from the original image, this model generates a representation vector and provides captions for the image by generating text through a recursive neural network. However, this image caption model has weaknesses in the backdoor sample. In this paper, we propose a method for generating backdoor samples for image caption models. By adding a specific trigger to the original

32

Fitri, Elsa Amelia, Burhanudin Rais, and Asni Tri Hartati. "Revealing the Trends: Students' Motivations for Choosing English Captions on Instagram." PRIMACY Journal of English Education and Literacy 2, no. 2 (2023): 126–34. http://dx.doi.org/10.33592/primacy.v2i2.4167.

Abstract:

Instagram is now a worldwide photo and video-sharing social media platform, and English is increasingly used in captions by non-English speaking Instagram users. Against this background, this study aimed to determine why students at Universitas Kapuas use English in their Instagram captions. This study involved 22 students of Universitas Kapuas. The research design was qualitative descriptive. Data were collected through an interview consisting of five questions. The researchers also used observation to triangulate the data. The results of this study stated that students use English captions o

33

Mohamad Nezami, Omid, Mark Dras, Stephen Wan, and Cecile Paris. "Image Captioning using Facial Expression and Attention." Journal of Artificial Intelligence Research 68 (August 6, 2020): 661–89. http://dx.doi.org/10.1613/jair.1.12025.

Abstract:

Benefiting from advances in machine vision and natural language processing techniques, current image captioning systems are able to generate detailed visual descriptions. For the most part, these descriptions represent an objective characterisation of the image, although some models do incorporate subjective aspects related to the observer’s view of the image, such as sentiment; current models, however, usually do not consider the emotional content of images during the caption generation process. This paper addresses this issue by proposing novel image captioning models which use facial expres

34

Shailesh, Sangle, Kabra Palak, Gharat Mihir, and Jha Dhiraj. "Image caption extraction to aid visual learning." i-manager’s Journal on Image Processing 10, no. 2 (2023): 14. http://dx.doi.org/10.26634/jip.10.2.19404.

Abstract:

An image caption generator is essential for social media enthusiasts or visually impaired individuals. It can be used as a plugin in popular social media platforms to recommend suitable captions or to assist visually impaired people in comprehending the image content on the web, thereby eliminating ambiguity in image meaning and ensuring accurate knowledge acquisition. This research describes an image caption generator that utilizes a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) model to generate natural language descriptions of images. The CNN was employed to extract

35

Feng, Junlong, and Jianping Zhao. "Context-Fused Guidance for Image Captioning Using Sequence-Level Training." Computational Intelligence and Neuroscience 2022 (January 5, 2022): 1–9. http://dx.doi.org/10.1155/2022/9743123.

Abstract:

Recent image captioning models based on the encoder-decoder framework have achieved remarkable success in humanlike sentence generation. However, an explicit separation between encoder and decoder creates a disconnection between the image and the sentence. This usually leads to a rough image description: the generated caption contains only the main instances and neglects additional objects and scenes, which reduces the consistency between the caption and the image. To address this issue, we propose an image captioning system with context-fused guidance in this paper. It incorporates regional and g

36

Panicker, Megha J., Vikas Upadhayay, Gunjan Sethi, and Vrinda Mathur. "Image Caption Generator." International Journal of Innovative Technology and Exploring Engineering 10, no. 3 (2021): 87–92. http://dx.doi.org/10.35940/ijitee.c8383.0110321.

Abstract:

In the modern era, image captioning has become one of the most widely required tools. Moreover, there are built-in applications that generate a caption for a given image, all with the help of deep neural network models. The process of generating a description of an image is called image captioning. It requires recognizing the important objects, their attributes, and the relationships among the objects in an image. It generates syntactically and semantically correct sentences. In this paper, we present a deep learning model to describe images and generate capti

37

Megha, J. Panicker, Upadhayay Vikas, Sethi Gunjan, and Mathur Vrinda. "Image Caption Generator." International Journal of Innovative Technology and Exploring Engineering (IJITEE) 10, no. 3 (2021): 87–92. https://doi.org/10.35940/ijitee.C8383.0110321.

Abstract:

In the modern era, image captioning has become one of the most widely required tools. Moreover, there are built-in applications that generate a caption for a given image, all with the help of deep neural network models. The process of generating a description of an image is called image captioning. It requires recognizing the important objects, their attributes, and the relationships among the objects in an image. It generates syntactically and semantically correct sentences. In this paper, we present a deep learning model to describe images and generate capti

38

Jaiswal, B. Uthkarsh. "IMAGE CAPTIONING FOR VISUALLY IMPAIRED." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 06 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem35798.

Abstract:

Image captioning has always been a great source of help for the visually impaired by generating captions for a given image. But limiting it to captions is not that helpful for the visually challenged. In this project, we tried to give voice to our generated captions using text-to-speech (TTS), which is more impactful and practical. To accomplish caption generation and to implement the deep learning architecture, we used TensorFlow and Keras. It is challenging to generate captions that have the right linguistic properties because it requires a sophisticated level of image un

39

Nadhifah, Elok, and M. Bayu Firmansyah. "Fungsi Historis dalam Caption Instagram Vania Winola (Kajian Pragmatik Siber)." Multiverse: Open Multidisciplinary Journal 2, no. 3 (2023): 324–26. http://dx.doi.org/10.57251/multiverse.v2i3.1163.

Abstract:

This research aims to determine the historical function of cyber pragmatics in Vania Winola's Instagram captions so that followers on her social media understand the meaning of what is written in the captions. A caption is a short text that accompanies an image and appears below the upload. Cyber pragmatics is a branch of pragmatics that studies the meaning of speech in certain situations on social media. Therefore, this research was conducted to describe the function of cyber pragmatics in Vania Winola's Instagram captions. This research uses a cyber pragmatics approach using virtual ethnographic

40

Jiang, Zutao, Guansong Lu, Xiaodan Liang, et al. "3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 1 (2023): 1051–59. http://dx.doi.org/10.1609/aaai.v37i1.25186.

Abstract:

Text-guided 3D object generation aims to generate 3D objects described by user-defined captions, which paves a flexible way to visualize what we imagined. Although some works have been devoted to solving this challenging task, these works either utilize some explicit 3D representations (e.g., mesh), which lack texture and require post-processing for rendering photo-realistic views; or require individual time-consuming optimization for every single case. Here, we make the first attempt to achieve generic text-guided cross-category 3D object generation via a new 3D-TOGO model, which integrates a

41

Seo, Paul Hongsuck, Piyush Sharma, Tomer Levinboim, Bohyung Han, and Radu Soricut. "Reinforcing an Image Caption Generator Using Off-Line Human Feedback." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 03 (2020): 2693–700. http://dx.doi.org/10.1609/aaai.v34i03.5655.

Abstract:

Human ratings are currently the most accurate way to assess the quality of an image captioning model, yet most often the only used outcome of an expensive human rating evaluation is a few overall statistics over the evaluation dataset. In this paper, we show that the signal from instance-level human caption ratings can be leveraged to improve captioning models, even when the amount of caption ratings is several orders of magnitude less than the caption training data. We employ a policy gradient method to maximize the human ratings as rewards in an off-policy reinforcement learning setting, whe

42

Wardani, Sri. "Crafting Effective Captions: Strategies for Enhancing Students’ Engagement and Writing Skill." TEFLA Journal (Teaching English as Foreign Language and Applied Linguistic Journal) 5, no. 1 (2023): 28–34. http://dx.doi.org/10.35747/tefla.v5i1.682.

Abstract:

This research aims to investigate students' proficiency in writing skills through the utilization of photographs as writing prompts (captions). The study focuses on analyzing the outcomes of students' caption writing using a four-step writing process, namely drafting, revising, editing, and publishing. The assessment involved a caption writing test to evaluate students' progress, supplemented by a questionnaire to gather additional data. The participants consisted of 32 students from XII MIPA 4 of SMA N 1 Malang. Following one cycle of the teaching and learning process, the findings of this re

43

Fei, Zhengcong. "Partially Non-Autoregressive Image Captioning." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (2021): 1309–16. http://dx.doi.org/10.1609/aaai.v35i2.16219.

Abstract:

Current state-of-the-art image captioning systems usually generate descriptions autoregressively, i.e., every forward step conditions on the given image and previously produced words. This sequential nature causes an unavoidable decoding latency. Non-autoregressive image captioning, on the other hand, predicts the entire sentence simultaneously and accelerates the inference process significantly. However, it removes the dependence between words in a caption and commonly suffers from repeated or missing words. To make a better trade-off between speed and quality, we introduce a partially non-autoregre

44

Huang, Xuefei, Ka-Hou Chan, Wei Ke, and Hao Sheng. "Parallel Dense Video Caption Generation with Multi-Modal Features." Mathematics 11, no. 17 (2023): 3685. http://dx.doi.org/10.3390/math11173685.

Abstract:

The task of dense video captioning is to generate detailed natural-language descriptions for an original video, which requires deep analysis and mining of semantic captions to identify events in the video. Existing methods typically follow a localisation-then-captioning sequence within given frame sequences, resulting in caption generation that is highly dependent on which objects have been detected. This work proposes a parallel-based dense video captioning method that can simultaneously address the mutual constraint between event proposals and captions. Additionally, a deformable Transformer

45

Prastyo, Yanuar Dwi, Zulkarnaen Zulkarnaen, and Siti Farhana. "THE EFFECT OF USING INSTAGRAM CAPTION ON STUDENTS’ VOCABULARY MASTERY AT TWELFTH GRADE OF SMA AL-AZHAR 3 BANDAR LAMPUNG." Journal of English Educational Study (JEES) 5, no. 2 (2022): 122–31. http://dx.doi.org/10.31932/jees.v5i2.1600.

Abstract:

This research is motivated by the growing number of Instagram users in Indonesia who share content for learning English independently, specifically vocabulary learning through Instagram captions. The objectives of the research were (1) to gain an overall picture of students’ vocabulary mastery in learning vocabulary through Instagram captions, and (2) to find out students’ interest in learning vocabulary through Instagram. This study used a mixed-method design. Data were collected using vocabulary tests and interviews. The result showed that there was a statistically signi

46

Li, Yuepeng. "Image Caption using VGG model and LSTM." Applied and Computational Engineering 48, no. 1 (2024): 68–77. http://dx.doi.org/10.54254/2755-2721/48/20241175.

Abstract:

Deep convolutional networks and recurrent neural networks have recently gained significant popularity in image captioning tasks. The performance and architecture of such models remain open topics. We constructed the model using a new method to enhance its performance and accuracy. In our model, we use the pretrained CNN model VGG (Visual Geometry Group) to extract image features, and we learn caption sentence features using a bidirectional LSTM (Long Short-Term Memory), which can better understand the meaning of sentences in the text. Then we combine the ima

47

Jaiswal, Sushma, Harikumar Pallthadka, and Rajesh P. Chinhewadi. "Exploring a Spectrum of Deep Learning Models for Automated Image Captioning: A Comprehensive Survey." International Journal of Scientific Methods in Engineering and Management 02, no. 03 (2024): 33–46. http://dx.doi.org/10.58599/ijsmem.2024.2304.

Abstract:

Automatic caption generation from images has emerged as a fundamental and challenging problem at the intersection of computer vision and natural language processing. This paper presents a comprehensive survey of the techniques, methodologies, and advancements in the field of automatic caption generation from images. The primary objective is to provide an extensive review of the state-of-the-art models, evaluation metrics, datasets, and applications associated with this domain. The survey begins by elucidating the underlying principles of image feature extraction and caption generation. Various

48

Deny, Deny. "THE TRANSLATION SHIFT AND ACCURACY ANALYSIS OF MUSEUM MACAN’S CAPTION." Lire Journal (Journal of Linguistics and Literature) 2, no. 2 (2018): 83–90. http://dx.doi.org/10.33019/lire.v2i2.32.

Abstract:

Museum MACAN is the first museum that exhibits modern art. Museum MACAN posts Indonesian-English captions on its Instagram account to promote its collection. This study investigates the translation shifts that occur in Museum MACAN’s Instagram captions and the accuracy of their translation. Translation shifts are used to produce accurate translations. This study uses Catford’s theory of translation shift (1964) and assesses accuracy based on the ATA Rubric Assessment. The data were twenty-five of Museum MACAN’s Instagram captions. The collected data were analysed in order to find the translation shift

49

Razida, Razida. "Increasing Students' Participation in Literacy Movement by Creating Instagram Caption." Journal of English as a Foreign Language Education (JEFLE) 1, no. 2 (2024): 56. http://dx.doi.org/10.26418/jefle.v1i2.43755.

Abstract:

Due to students’ low interest in reading, the government has addressed the issue by including a literacy movement program in the national curriculum. However, many students, if not all, prefer being on their phones to reading a book. Therefore, instead of discouraging students from using their phones, the researcher takes advantage of learning through them. In this case, the researcher suggests a way to support the literacy movement by asking students to use their mobile phones as a useful medium for study by creating Instagram captions. Students upload photos on a daily basis using English captions to d

50

Ranasinghe, Damsara, Randil Pushpananda, and Ruvan Weerasinghe. "Image Caption Generator for Sinhala Using Deep Learning." International Journal on Advances in ICT for Emerging Regions (ICTer) 16, no. 2 (2023): 40–46. http://dx.doi.org/10.4038/icter.v16i2.7266.

Abstract:

In this study, for the image caption generation in the Sinhala language, we have implemented a Recurrent Neural Network based model consisting of an InceptionV3 model as an image feature extraction model and a Long Short Term Memory network for the language model by referring to the literature. The different variations of Sinhala versions of the Flickr8K and MS COCO datasets have been constructed and used to train experimental models. Evaluation of the generated captions has been done using both automated and manual approaches. The model trained on the MS COCO dataset with Google translated Si
