Academic literature on the topic 'Visual question generation'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Visual question generation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
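To illustrate what style switching involves, here is a minimal, hypothetical sketch (not the site's actual implementation) that renders the first reference below from structured metadata into simplified APA- and MLA-style strings; field names and formatting rules are assumptions:

```python
# Hypothetical sketch: render structured reference metadata in two
# citation styles. The patterns below are simplified approximations.

def apa(ref):
    # Simplified APA journal-article pattern:
    # Authors (Year). Title. Journal, Volume(Issue), Pages. DOI
    return (f"{ref['authors_apa']} ({ref['year']}). {ref['title']}. "
            f"{ref['journal']}, {ref['volume']}({ref['issue']}), "
            f"{ref['pages']}. {ref['doi']}")

def mla(ref):
    # Simplified MLA journal-article pattern:
    # Authors. "Title." Journal, vol. V, no. N, Year, pp. Pages, DOI.
    return (f"{ref['authors_mla']} \"{ref['title']}.\" {ref['journal']}, "
            f"vol. {ref['volume']}, no. {ref['issue']}, {ref['year']}, "
            f"pp. {ref['pages']}, {ref['doi']}.")

ref = {
    "authors_apa": "Patil, C., & Patwardhan, M.",
    "authors_mla": "Patil, Charulata, and Manasi Patwardhan.",
    "title": "Visual question generation",
    "journal": "ACM Computing Surveys",
    "volume": 53, "issue": 3, "year": 2020, "pages": "1–22",
    "doi": "http://dx.doi.org/10.1145/3383465",
}

print(apa(ref))
print(mla(ref))
```

The same metadata record drives every style, which is why one button can emit APA, MLA, Harvard, Chicago, or Vancouver on demand.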

Journal articles on the topic "Visual question generation"

1

Patil, Charulata, and Manasi Patwardhan. "Visual Question Generation." ACM Computing Surveys 53, no. 3 (2020): 1–22. http://dx.doi.org/10.1145/3383465.

2

Liu, Hongfei, Jiali Chen, Wenhao Fang, Jiayuan Xie, and Yi Cai. "Category-Guided Visual Question Generation (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (2023): 16262–63. http://dx.doi.org/10.1609/aaai.v37i13.26991.

Abstract:
Visual question generation aims to generate high-quality questions related to images. Generating questions based only on images can reduce labor costs and thus be easily applied. However, existing methods tend to generate similar, general questions that fail to ask about the specific content of each image scene. In this paper, we propose a category-guided visual question generation model that can generate questions with multiple categories that focus on different objects in an image. Specifically, our model first selects the appropriate question category based on the objects in the …
3

Xie, Jiayuan, Mengqiu Cheng, Xinting Zhang, et al. "Explicitly Guided Difficulty-Controllable Visual Question Generation." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 24 (2025): 25552–60. https://doi.org/10.1609/aaai.v39i24.34745.

Abstract:
Visual question generation (VQG) aims to generate questions from images automatically. While existing studies primarily focus on the quality of generated questions, such as fluency and relevance, the difficulty of the questions is also a crucial factor in assessing their quality. Question difficulty directly impacts the effectiveness of VQG systems in applications like education and human-computer interaction, where appropriately challenging questions can stimulate learning interest and improve interaction experiences. However, accurately defining and controlling question difficulty is a chall…
4

Mi, Li, Syrielle Montariol, Javiera Castillo Navarro, Xianjie Dai, Antoine Bosselut, and Devis Tuia. "ConVQG: Contrastive Visual Question Generation with Multimodal Guidance." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 5 (2024): 4207–15. http://dx.doi.org/10.1609/aaai.v38i5.28216.

Abstract:
Asking questions about visual environments is a crucial way for intelligent agents to understand rich multi-faceted scenes, raising the importance of Visual Question Generation (VQG) systems. Apart from being grounded to the image, existing VQG systems can use textual constraints, such as expected answers or knowledge triplets, to generate focused questions. These constraints allow VQG systems to specify the question content or leverage external commonsense knowledge that cannot be obtained from the image content only. However, generating focused questions using textual constraints while enfo…
5

Sarrouti, Mourad, Asma Ben Abacha, and Dina Demner-Fushman. "Goal-Driven Visual Question Generation from Radiology Images." Information 12, no. 8 (2021): 334. http://dx.doi.org/10.3390/info12080334.

Abstract:
Visual Question Generation (VQG) from images is a rising research topic in both fields of natural language processing and computer vision. Although there are some recent efforts towards generating questions from images in the open domain, the VQG task in the medical domain has not been well-studied so far due to the lack of labeled data. In this paper, we introduce a goal-driven VQG approach for radiology images called VQGRaD that generates questions targeting specific image aspects such as modality and abnormality. In particular, we study generating natural language questions based on the vis…
6

Pang, Wei, and Xiaojie Wang. "Visual Dialogue State Tracking for Question Generation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 11831–38. http://dx.doi.org/10.1609/aaai.v34i07.6856.

Abstract:
GuessWhat?! is a visual dialogue task between a guesser and an oracle. The guesser aims to locate an object supposed by the oracle oneself in an image by asking a sequence of Yes/No questions. Asking proper questions with the progress of dialogue is vital for achieving a successful final guess. As a result, the progress of dialogue should be properly represented and tracked. Previous models for question generation pay less attention to the representation and tracking of dialogue states, and therefore are prone to asking low-quality questions such as repeated questions. This paper proposes visual …
7

Srinivas, Dr Rhea. "Visual Question Answering." International Scientific Journal of Engineering and Management 04, no. 04 (2025): 1–7. https://doi.org/10.55041/isjem03029.

Abstract:
Vision-Language Pre-Training (VLP) significantly improves performance for a variety of multimodal tasks. However, existing models are often specialized in understanding or generation, which limits their versatility. Furthermore, trust in text data for large, loud web text remains the optimal approach for monitoring. To address these challenges, we propose VLX, a uniform VLP framework that distinguishes both vision languages and generation tasks. VLX introduces a new type of data optimization strategy. This strategy allows the generator to create high-quality synthetic training data …
8

Kamala, M. "Visual Question Generation from Remote Sensing Images Using Gemini API." International Journal for Research in Applied Science and Engineering Technology 12, no. 3 (2024): 2924–29. http://dx.doi.org/10.22214/ijraset.2024.59537.

Abstract:
Visual question generation from remote sensing images plays a vital role in understanding and extracting information from aerial and satellite imagery. The approach utilizes Bidirectional Encoder Representations from Transformers (BERT) for extracting valuable insights from remote sensing images, together with the Gemini Application Programming Interface (API) and Convolutional Neural Networks (CNNs). First, the proposed methodology employs a CNN to extract high-level features from remote sensing images, capturing spatial data and generating questions. Similarly, the …
9

Kachare, Atul, Mukesh Kalla, and Ashutosh Gupta. "Visual Question Generation Answering (VQG-VQA) using Machine Learning Models." WSEAS TRANSACTIONS ON SYSTEMS 22 (June 28, 2023): 663–70. http://dx.doi.org/10.37394/23202.2023.22.67.

Abstract:
The presented automated visual question-answer system generates graphics-based question-answer pairs. The system consists of Visual Query Generation (VQG) and Visual Question Answering (VQA) modules. VQG generates questions based on visual cues, and VQA provides matching answers to the VQG module. The VQG system generates questions using an LSTM and the VGG19 model, training parameters and predicting the words with the highest probability for output. VQA uses the VGG-19 convolutional neural network for image encoding, embedding, and a multilayer perceptron for high-quality responses. The proposed system reduces the …
10

Sandhya, Vidyashankar, Vahi Rakshit, Karkhanis Yash, and Srinivasa Gowri. "Vis Quelle: Visual Question-based Elementary Learning Companion a system to Facilitate Learning Word-Object Associations." International Journal of Innovative Technology and Exploring Engineering (IJITEE) 11, no. 1 (2021): 41–49. https://doi.org/10.35940/ijitee.A9599.1111121.

Abstract:
We present an automated, visual question answering based companion – Vis Quelle – to facilitate elementary learning of word-object associations. In particular, we attempt to harness the power of machine learning models for object recognition and the understanding of combined processing of images and text data from visual-question answering to provide variety and nuance in the images associated with letters or words presented to the elementary learner. We incorporate elements such as gamification to motivate the learner by recording scores, errors, etc., to track the learner's progress …

Dissertations / Theses on the topic "Visual question generation"

1

Bordes, Patrick. "Deep Multimodal Learning for Joint Textual and Visual Reasoning." Electronic Thesis or Diss., Sorbonne université, 2020. http://www.theses.fr/2020SORUS370.

Abstract:
Over the last decade, the evolution of deep learning techniques, combined with a significant increase in multimodal data, has sparked growing interest in the research community in the joint understanding of language and vision. The challenge at the heart of multimodal machine learning is the semantic gap between language and vision: whereas vision faithfully represents reality and conveys low-level semantics, language carries high-level reasoning. On the one hand, language can improve the performance of …
2

Chowdhury, Muhammad Iqbal Hasan. "Question-answering on image/video content." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205096/1/Muhammad%20Iqbal%20Hasan_Chowdhury_Thesis.pdf.

Abstract:
This thesis explores a computer's ability to understand multimodal data where the correspondence between image/video content and natural language text are utilised to answer open-ended natural language questions through question-answering tasks. Static image data consisting of both indoor and outdoor scenes, where complex textual questions are arbitrarily posed to a machine to generate correct answers, was examined. Dynamic videos consisting of both single-camera and multi-camera settings for the exploration of more challenging and unconstrained question-answering tasks were also considered. …
3

Testoni, Alberto. "Asking Strategic and Informative Questions in Visual Dialogue Games: Strengths and Weaknesses of Neural Generative Models." Doctoral thesis, Università degli studi di Trento, 2023. https://hdl.handle.net/11572/370672.

Abstract:
Gathering information by asking questions about the surrounding world is a hallmark of human intelligence. Modelling this feature in Natural Language Generation systems represents a central challenge for effective and reliable conversational agents. The evaluation of these systems plays a crucial role in understanding the strengths and weaknesses of current neural architectures. In the scientific community, there is an open debate about what makes generated dialogues sound natural and human-like, and there is no agreement on what measures to use to track progress. In the first part of the thesis …
4

Wei, Min-Chia, and 魏敏家. "Evaluation of Visual Question Generation With Captions." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/65t4uu.

Abstract:
Master's thesis, National Taiwan University, Graduate Institute of Computer Science and Information Engineering, academic year 106 (2017). Over the last few years, there has been much research in the vision-and-language community, covering popular topics such as image captioning, video transcription, question answering about images or videos, Image-Grounded Conversation (IGC), and Visual Question Generation (VQG). In this thesis, we focus on question generation about images. Because of the popularity of images on social media, people often upload an image with some description; we think that image captions may help Artificial Intelligence (AI) learn to ask more natural …
5

Anderson, Peter James. "Vision and Language Learning: From Image Captioning and Visual Question Answering towards Embodied Agents." Phd thesis, 2018. http://hdl.handle.net/1885/164018.

Abstract:
Each time we ask for an object, describe a scene, follow directions or read a document containing images or figures, we are converting information between visual and linguistic representations. Indeed, for many tasks it is essential to reason jointly over visual and linguistic information. People do this with ease, typically without even noticing. Intelligent systems that perform useful tasks in unstructured situations, and interact with people, will also require this ability. In this thesis, we focus on the joint modelling of visual and linguistic …

Books on the topic "Visual question generation"

1

Dadyan, Eduard. Modern programming technologies. The C# language. Volume 1. For novice users. INFRA-M Academic Publishing LLC, 2021. http://dx.doi.org/10.12737/1196552.

Abstract:
Volume 1 of the textbook is addressed to novice users who want to learn the popular object-oriented programming language C#. The tutorial provides complete information about the C# language and the .NET platform. Basic data types, variables, functions, and arrays are considered. Working with dates and enumerations is shown. The elements and constructs of the language are described: classes, interfaces, assemblies, manifests, namespaces, collections, generalizations, delegates, events, etc. It provides information about Windows processes and threads, as well as examples of organizing work in multit…
2

Calabretta-Sajder, Ryan, ed. Pasolini’s Lasting Impressions. Fairleigh Dickinson University Press, 2018. https://doi.org/10.5040/9781683935322.

Abstract:
Noted as a 'civil poet' by Alberto Moravia, Pier Paolo Pasolini was a creative and philosophical genius whose works challenged generations of Western Europeans and Americans to reconsider not only issues regarding the self, but also various social concerns. Pasolini's works touched and continue to inspire students, scholars, and intellectuals alike to question the status quo. This collection of thirteen articles and two interviews evidences the ongoing discourse around Pasolini's lasting impressions on the new generation. Pasolini's Lasting Impressions: Death, Eros and Literary Enterprise in …
3

Nowell Smith, David. W. S. Graham. Oxford University Press, 2022. http://dx.doi.org/10.1093/oso/9780192842909.001.0001.

Abstract:
Only in recent years has W. S. Graham come to be recognised as one of the great poets of the twentieth century. On the peripheries of UK poetry culture during his lifetime, he in many ways appears to us today as exemplary of the poetics of the mid-century: his extension of modernist explorations of rhythm and diction; his interweaving of linguistic and geographic places; his dialogue with the plastic arts; and the tensions that run through his work, between philosophical seriousness and play, between solitude and sociality, regionalism and cosmopolitanism, between the heft and evanescence of p…
4

Buchner, Helmut. Evoked potentials. Oxford University Press, 2016. http://dx.doi.org/10.1093/med/9780199688395.003.0015.

Abstract:
Evoked potentials (EPs) occur in the peripheral and the central nervous system. The low amplitude signals are extracted from noise by averaging multiple time epochs time-locked to a sensory stimulus. The mechanisms of generation, the techniques for stimulation and recording are established. Clinical applications provide robust information to various questions. The importance of EPs is to measure precisely the conduction times within the stimulated sensory system. Visual evoked potentials to a pattern reversal checker board stimulus are commonly used to evaluate the optic nerve. Auditory evoked …
5

Fox, Kieran C. R. Neural Origins of Self-Generated Thought. Edited by Kalina Christoff and Kieran C. R. Fox. Oxford University Press, 2018. http://dx.doi.org/10.1093/oxfordhb/9780190464745.013.1.

Abstract:
Functional magnetic resonance imaging (fMRI) has begun to narrow down the neural correlates of self-generated forms of thought, with current evidence pointing toward central roles for the default, frontoparietal, and visual networks. Recent work has linked the arising of thoughts more specifically to default network activity, but the limited temporal resolution of fMRI has precluded more detailed conclusions about where in the brain self-created mental content is generated and how this is achieved. This chapter argues that the unparalleled spatiotemporal resolution of intracranial electrophysiology …
6

Brantingham, Patricia L., Paul J. Brantingham, Justin Song, and Valerie Spicer. Advances in Visualization for Theory Testing in Environmental Criminology. Edited by Gerben J. N. Bruinsma and Shane D. Johnson. Oxford University Press, 2018. http://dx.doi.org/10.1093/oxfordhb/9780190279707.013.37.

Abstract:
This chapter discusses advances in visualization for environmental criminology. The environment within which people move has many dimensions that influence or constrain decisions and actions by individuals and by groups. This complexity creates a challenge for theoreticians and researchers in presenting their research results in a way that conveys the dynamic spatiotemporal aspects of crime and actions by offenders in a clearly understandable way. There is an increasing need in environmental criminology to use scientific visualization to convey research results. A visual image can describe und…
7

Gover, K. E. Art and Authority. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780198768692.001.0001.

Abstract:
Art and Authority is a philosophical essay on artistic authority and freedom: its sources, nature, and limits. It draws upon real-world cases and controversies in contemporary visual art and connects them to significant theories in the philosophical literature on art and aesthetics. Artworks, it is widely agreed, are the products of intentional human activity. And yet they are different from other kinds of artifacts; for one thing, they are meaningful. It is often presumed that artworks are an extension of their makers’ personality in ways that other kinds of artifacts are not. This is clear f…
8

Campbell, Kenneth L. Western Civilization in a Global Context: Prehistory to the Enlightenment. Bloomsbury Publishing Plc, 2015. http://dx.doi.org/10.5040/9781474275491.

Abstract:
Western Civilization in a Global Context is a source collection that introduces a comparative element to the study of Western civilization, offering students an opportunity to explore non-Western perspectives. An interesting and provocative set of readings is included, from a range of primary sources, including official documents, historical writings, literary sources, letters, speeches, and interviews, as well as visual sources. These different sources are carefully selected with a view to generating class discussion and providing students with a sense of the different approaches historians might …
9

Contreras, Ayana. Energy Never Dies. University of Illinois Press, 2021. http://dx.doi.org/10.5622/illinois/9780252044069.001.0001.

Abstract:
Black Chicago in the post–civil rights era was constantly refreshed by an influx of newcomers from the American South via the Great Migration. Chicago was a beacon, disseminating a fresh, powerful definition of Black identity primarily through music, art, entrepreneurship, and mass media. This book uses ruminations on oft-undervalued found ephemeral materials (like a fan club pamphlet or a creamy-white Curtis Mayfield record) and a variety of in-depth original and archival interviews to unearth tales of the aspiration, will, courage, and imagination born in Black Chicago. It also questions …

Book chapters on the topic "Visual question generation"

1

Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Visual Question Generation." In Visual Question Answering. Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_13.

2

Chen, Feng, Jiayuan Xie, Yi Cai, Tao Wang, and Qing Li. "Difficulty-Controllable Visual Question Generation." In Web and Big Data. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-85896-4_26.

3

Xu, Feifei, Yingchen Zhou, Zheng Zhong, and Guangzhen Li. "Object Category-Based Visual Dialog for Effective Question Generation." In Computational Visual Media. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2092-7_16.

4

Liu, Xinyu, Chenchen Jing, Mingliang Zhai, Yuwei Wu, and Yunde Jia. "Visual-Guided Reasoning Path Generation for Visual Question Answering." In Lecture Notes in Computer Science. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-8487-5_12.

5

Zhang, Junjie, Qi Wu, Chunhua Shen, Jian Zhang, Jianfeng Lu, and Anton van den Hengel. "Goal-Oriented Visual Question Generation via Intermediate Rewards." In Computer Vision – ECCV 2018. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-01228-1_12.

6

Nahar, Shrey, Shreya Naik, Niti Shah, Saumya Shah, and Lakshmi Kurup. "Automated Question Generation and Answer Verification Using Visual Data." In Studies in Computational Intelligence. Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-38445-6_8.

7

Chai, Zi, Xiaojun Wan, Soyeon Caren Han, and Josiah Poon. "Visual Question Generation Under Multi-granularity Cross-Modal Interaction." In MultiMedia Modeling. Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-27077-2_20.

8

Uehara, Kohei, Antonio Tejero-De-Pablos, Yoshitaka Ushiku, and Tatsuya Harada. "Visual Question Generation for Class Acquisition of Unknown Objects." In Computer Vision – ECCV 2018. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-01258-8_30.

9

Salewski, Leonard, A. Sophia Koepke, Hendrik P. A. Lensch, and Zeynep Akata. "CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations." In xxAI - Beyond Explainable AI. Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-04083-2_5.

Abstract:
Providing explanations in the context of Visual Question Answering (VQA) presents a fundamental problem in machine learning. To obtain detailed insights into the process of generating natural language explanations for VQA, we introduce the large-scale CLEVR-X dataset that extends the CLEVR dataset with natural language explanations. For each image-question pair in the CLEVR dataset, CLEVR-X contains multiple structured textual explanations which are derived from the original scene graphs. By construction, the CLEVR-X explanations are correct and describe the reasoning and visual information that is necessary to answer a given question. We conducted a user study to confirm that the ground-truth explanations in our proposed dataset are indeed complete and relevant. We present baseline results for generating natural language explanations in the context of VQA using two state-of-the-art frameworks on the CLEVR-X dataset. Furthermore, we provide a detailed analysis of the explanation generation quality for different question and answer types. Additionally, we study the influence of using different numbers of ground-truth explanations on the convergence of natural language generation (NLG) metrics. The CLEVR-X dataset is publicly available at https://github.com/ExplainableML/CLEVR-X.
10

Xu, Feifei, Yingchen Zhou, Zheng Zhong, Guangzhen Li, and Wang Zhou. "Enhancing Relevance and Efficiency in Visual Question Generation Through Redundant Object Filtering." In Communications in Computer and Information Science. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-8749-4_6.


Conference papers on the topic "Visual question generation"

1

Padaria, Ali Asgar, Dhyan Patel, Rajesh Gupta, Nilesh Kumar Jadav, Sudeep Tanwar, and Deepak Garg. "Visual Question Generation Framework for Chess Game State Identification." In 2024 IEEE International Conference on Contemporary Computing and Communications (InC4). IEEE, 2024. http://dx.doi.org/10.1109/inc460750.2024.10649325.

2

Li, Siran, Li Mi, Javiera Castillo-Navarro, and Devis Tuia. "Knowledge-Aware Visual Question Generation for Remote Sensing Images." In IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2024. http://dx.doi.org/10.1109/igarss53475.2024.10642766.

3

Adjali, Omar, Olivier Ferret, Sahar Ghannay, and Hervé Le Borgne. "Multi-Level Information Retrieval Augmented Generation for Knowledge-based Visual Question Answering." In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.emnlp-main.922.

4

Hu, Zhanghao, and Frank Keller. "Causal and Temporal Inference in Visual Question Generation by Utilizing Pre-trained Models." In Proceedings of the 3rd Workshop on Advances in Language and Vision Research (ALVR). Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.alvr-1.12.

5

Das, Deepayan, Davide Talon, Massimiliano Mancini, Yiming Wang, and Elisa Ricci. "One VLM to Keep it Learning: Generation and Balancing for Data-free Continual Visual Question Answering." In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, 2025. https://doi.org/10.1109/wacv61041.2025.00550.

6

Shen, Ruoyue, Nakamasa Inoue, and Koichi Shinoda. "Pyramid Coder: Hierarchical Code Generator for Compositional Visual Question Answering." In 2024 IEEE International Conference on Image Processing (ICIP). IEEE, 2024. http://dx.doi.org/10.1109/icip51287.2024.10648180.

7

Yan, Quan, Junwen Duan, and Jianxin Wang. "Multi-modal Concept Alignment Pre-training for Generative Medical Visual Question Answering." In Findings of the Association for Computational Linguistics ACL 2024. Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.findings-acl.319.

8

Ding, Wenjian, Yao Zhang, Jun Wang, Adam Jatowt, and Zhenglu Yang. "Exploring Union and Intersection of Visual Regions for Generating Questions, Answers, and Distractors." In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.emnlp-main.88.

9

Vedd, Nihir, Zixu Wang, Marek Rei, Yishu Miao, and Lucia Specia. "Guiding Visual Question Generation." In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2022. http://dx.doi.org/10.18653/v1/2022.naacl-main.118.

10

Bi, Chao, Shuhui Wang, Zhe Xue, Shengbo Chen, and Qingming Huang. "Inferential Visual Question Generation." In MM '22: The 30th ACM International Conference on Multimedia. ACM, 2022. http://dx.doi.org/10.1145/3503161.3548055.
