Journal articles on the topic 'Retrieval Augmented Generation (RAG)'

Consult the top 50 journal articles for your research on the topic 'Retrieval Augmented Generation (RAG).'


1

Mishra, Ankit, and Aniket Gupta. "Retrieval Augmented Generation (RAG) Model." International Journal of Research Publication and Reviews 6, no. 6 (January 2025): 4690–93. https://doi.org/10.55248/gengpi.6.0125.0635.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Liu, Yicheng. "Retrieval-Augmented Generation: Methods, Applications and Challenges." Applied and Computational Engineering 142, no. 1 (April 24, 2025): 99–108. https://doi.org/10.54254/2755-2721/2025.kl22312.

Abstract:
Retrieval-Augmented Generation (RAG) has proven to be a promising approach: it addresses the limitations that purely generative models face in knowledge-intensive tasks owing to their reliance on static, pre-trained knowledge. RAG addresses these challenges by integrating a retrieval mechanism with a generative model, enabling dynamic access to external knowledge sources during the generation process. This paper presents a comprehensive study of the RAG framework, focusing on its architecture, training strategies, and applications. The framework combines a dense passage retriever (DPR) with a sequence-to-sequence generator (GPT-3.5-turbo), jointly optimized in an end-to-end manner to retrieve and utilize relevant knowledge effectively. This paper evaluates RAG on MS MARCO, demonstrating its superiority over state-of-the-art purely generative models and traditional retrieval-based systems. Experimental results show that RAG achieves significant improvements in factual accuracy, relevance, and interpretability, as measured by metrics such as term frequency-inverse document frequency (TF-IDF), BERTScore, and Q-BLEU-1.
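The retrieve-then-generate loop this abstract describes can be sketched in a few lines. This is only a toy illustration, not the paper's system: the dense retriever (DPR) is stood in for by simple term overlap, the generator call is omitted, and all document strings are illustrative.

```python
def retrieve(query, corpus, k=2):
    """Rank documents by term overlap with the query (a stand-in for a
    dense retriever such as DPR, which would compare embeddings instead)."""
    q_terms = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Concatenate retrieved passages with the question; the resulting
    prompt would then be fed to a generator such as GPT-3.5-turbo."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "The Eiffel Tower is located in Paris.",
    "BM25 is a ranking function used by search engines.",
    "RAG combines retrieval with text generation.",
]
question = "Where is the Eiffel Tower located?"
passages = retrieve(question, corpus)
prompt = build_prompt(question, passages)
```

In the paper's end-to-end setup, retriever and generator are optimized jointly; here the two stages are only wired together to show the data flow.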
3

Long, Xinwei, Zhiyuan Ma, Ermo Hua, Kaiyan Zhang, Biqing Qi, and Bowen Zhou. "Retrieval-Augmented Visual Question Answering via Built-in Autoregressive Search Engines." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 23 (April 11, 2025): 24723–31. https://doi.org/10.1609/aaai.v39i23.34653.

Abstract:
Retrieval-augmented generation (RAG) has emerged to address the knowledge-intensive visual question answering (VQA) task. Current methods mainly employ separate retrieval and generation modules to acquire external knowledge and generate answers, respectively. We propose ReAuSE, an alternative to previous RAG models for the knowledge-based VQA task, which seamlessly integrates a knowledge retriever into the generative multi-modal large language model, serving as a built-in search engine. Specifically, our model functions both as a generative retriever and an accurate answer generator. It not only retrieves documents from the knowledge base by producing an identifier for each document, but also answers visual questions based on the retrieved documents. Furthermore, we propose a reinforced retrieval calibration module driven by relevance feedback to improve retrieval performance and align with the preferences for accurate answer generation. Extensive experiments on two representative datasets, OKVQA and A-OKVQA, demonstrate significant improvements ranging from 2.9% to 9.6% across all evaluation metrics when compared to strong baselines.
4

Han, Binglan, Teo Susnjak, and Anuradha Mathrani. "Automating Systematic Literature Reviews with Retrieval-Augmented Generation: A Comprehensive Overview." Applied Sciences 14, no. 19 (October 9, 2024): 9103. http://dx.doi.org/10.3390/app14199103.

Abstract:
This study examines Retrieval-Augmented Generation (RAG) in large language models (LLMs) and its significant application to undertaking systematic literature reviews (SLRs). RAG-based LLMs can potentially automate tasks like data extraction, summarization, and trend identification. However, while LLMs are exceptionally proficient in generating human-like text and interpreting complex linguistic nuances, their dependence on static, pre-trained knowledge can result in inaccuracies and hallucinations. RAG mitigates these limitations by integrating LLMs' generative capabilities with the precision of real-time information retrieval. We review in detail the three key processes of the RAG framework: retrieval, augmentation, and generation. We then discuss applications of RAG-based LLMs to SLR automation and highlight future research topics, including integration of domain-specific LLMs, multimodal data processing and generation, and utilization of multiple retrieval sources. We propose a framework of RAG-based LLMs for automating SLRs, which covers four stages of the SLR process: literature search, literature screening, data extraction, and information synthesis. Future research aims to optimize the interaction between LLM selection, training strategies, RAG techniques, and prompt engineering to implement the proposed framework, with particular emphasis on retrieving information from individual scientific papers and integrating these data to produce outputs addressing current status, existing gaps, and emerging trends.
5

Choi, Yein, Sungwoo Kim, Yipene Cedric Francois Bassole, and Yunsick Sung. "Enhanced Retrieval-Augmented Generation Using Low-Rank Adaptation." Applied Sciences 15, no. 8 (April 17, 2025): 4425. https://doi.org/10.3390/app15084425.

Abstract:
Recent advancements in retrieval-augmented generation (RAG) have substantially enhanced the efficiency of information retrieval. However, traditional RAG-based systems still encounter challenges such as high latency in output decision making, inaccurate retrieval of road traffic-related laws and regulations, and considerable processing overhead in large-scale searches. This study presents an innovative application of RAG technology for processing road traffic-related laws and regulations, particularly in the context of unmanned systems like autonomous driving. Our approach integrates embedding generation using a LoRA-enhanced BERT base uncased model and an optimized retrieval strategy that combines maximal marginal similarity score thresholding with contextual compression retrieval. The proposed system achieves improved retrieval accuracy while reducing processing overhead. Leveraging road traffic-related regulatory datasets, the LoRA-enhanced model demonstrated remarkable performance gains over traditional RAG methods. Specifically, our model reduced the number of trainable parameters by 13.6% and lowered computational costs by 18.7%. Performance evaluations using BLEU, CIDEr, and SPICE scores revealed a 4.36% increase in BLEU-4, a 6.83% improvement in CIDEr, and a 5.46% improvement in SPICE, confirming greater structural accuracy in regulatory text generation. Additionally, our method achieved an 8.5% improvement in retrieval accuracy across key metrics, outperforming baseline RAG systems. These contributions pave the way for more efficient and reliable traffic regulation processing, enabling better decision making in autonomous systems.
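The "maximal marginal similarity" selection the abstract mentions reads like classic maximal marginal relevance (MMR); under that assumption, a minimal sketch follows. The greedy loop trades off a candidate's relevance to the query against its redundancy with documents already chosen; the vectors and the lambda weight below are illustrative, not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def mmr_select(query_vec, doc_vecs, k=2, lam=0.3):
    """Greedily pick k document indices, scoring each candidate as
    lam * relevance(query) - (1 - lam) * max-similarity(already selected)."""
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

docs = [[1.0, 0.2], [1.0, 0.1], [0.2, 1.0]]  # second vector nearly duplicates the first
picks = mmr_select([1.0, 0.2], docs)
```

With lam weighted toward diversity, the near-duplicate of the first pick is passed over in favor of the less similar third document.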
6

Grabuloski, Marko, Aleksandar Karadimce, and Anis Sefidanoski. "Enhancing Language Models with Retrieval-Augmented Generation: A Comparative Study on Performance." WSEAS TRANSACTIONS ON INFORMATION SCIENCE AND APPLICATIONS 22 (April 2, 2025): 272–97. https://doi.org/10.37394/23209.2025.22.23.

Abstract:
Retrieval-Augmented Generation (RAG) is a powerful technique that enhances the capabilities of Large Language Models (LLMs) by integrating information retrieval with text generation. By accessing and incorporating relevant external knowledge, RAG systems address the limitations of traditional LLMs, such as memory constraints and the inability to access up-to-date information. This research explores the implementation and evaluation of RAG systems, focusing on their potential to improve the accuracy and relevance of LLM responses. It investigates the impact of different LLM types (causal, question-answering, conversational) and retrieval-augmentation strategies (sentence-level, paragraph-level) on the performance of RAG systems. We conducted experiments using various open-source LLMs and a custom-built RAG system to assess the effectiveness of different approaches. The findings indicate that RAG systems can significantly enhance the performance of LLMs, especially for complex questions that require access to diverse information sources. T5 conversational models, in particular, demonstrate strong performance in synthesis-based tasks, effectively combining information from multiple retrieved documents. However, causal and question-answering models may struggle with complex reasoning and synthesis, even with RAG augmentation.
7

Chen, Jiawei, Hongyu Lin, Xianpei Han, and Le Sun. "Benchmarking Large Language Models in Retrieval-Augmented Generation." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 16 (March 24, 2024): 17754–62. http://dx.doi.org/10.1609/aaai.v38i16.29728.

Abstract:
Retrieval-Augmented Generation (RAG) is a promising approach for mitigating the hallucination of large language models (LLMs). However, existing research lacks rigorous evaluation of the impact of retrieval-augmented generation on different large language models, which makes it challenging to identify the potential bottlenecks in the capabilities of RAG for different LLMs. In this paper, we systematically investigate the impact of Retrieval-Augmented Generation on large language models. We analyze the performance of different large language models in 4 fundamental abilities required for RAG, including noise robustness, negative rejection, information integration, and counterfactual robustness. To this end, we establish Retrieval-Augmented Generation Benchmark (RGB), a new corpus for RAG evaluation in both English and Chinese. RGB divides the instances within the benchmark into 4 separate testbeds based on the aforementioned fundamental abilities required to resolve the case. Then we evaluate 6 representative LLMs on RGB to diagnose the challenges of current LLMs when applying RAG. Evaluation reveals that while LLMs exhibit a certain degree of noise robustness, they still struggle significantly in terms of negative rejection, information integration, and dealing with false information. The aforementioned assessment outcomes indicate that there is still a considerable journey ahead to effectively apply RAG to LLMs.
8

Dong, Guanting, Xiaoshuai Song, Yutao Zhu, Runqi Qiao, Zhicheng Dou, and Ji-Rong Wen. "Toward Verifiable Instruction-Following Alignment for Retrieval Augmented Generation." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 22 (April 11, 2025): 23796–804. https://doi.org/10.1609/aaai.v39i22.34551.

Abstract:
Following natural instructions is crucial for the effective application of Retrieval-Augmented Generation (RAG) systems. Despite recent advancements in Large Language Models (LLMs), research on assessing and improving instruction-following (IF) alignment within the RAG domain remains limited. To address this issue, we propose VIF-RAG, an automated, scalable, and verifiable synthetic pipeline for instruction-following alignment in RAG systems. We start by manually crafting a minimal set of atomic instructions and scale it to over 100k samples through automated processes. To further bridge the gap in instruction-following auto-evaluation for RAG systems, we introduce FollowRAG Benchmark, which includes approximately 3K test samples, covering 22 categories of general instruction constraints and four knowledge-intensive QA datasets. Due to its robust pipeline design, FollowRAG can seamlessly integrate with different RAG benchmarks. Using FollowRAG and eight widely used IF and foundational abilities benchmarks for LLMs, we demonstrate that VIF-RAG markedly enhances LLM performance across a broad range of general instruction constraints while effectively leveraging its capabilities in RAG scenarios. Further analysis offers practical insights for achieving IF alignment in RAG systems.
9

Shaji, Edwin Alex, Jerishab M., Leya Thomas, M. Viraj Prabhu, and Chinchu M. Pillai. "Survey on Speech Recognition and Retrieval-Augmented Generation." International Journal of Advances in Engineering and Management 06, no. 12 (December 2024): 75–81. https://doi.org/10.35629/5252-06127581.

Abstract:
Automatic speech recognition (ASR) and retrieval-augmented generation (RAG) systems have seen remarkable progress in handling multilingualism, noise robustness, real-time transcription, and knowledge-intensive tasks. The survey reviews 12 key papers that contribute to advancements in ASR and RAG, covering approaches like end-to-end multilingual models, noise-reduction techniques, and real-time speech processing. It also examines RAG systems that enhance generative models by integrating retrieval mechanisms for improved accuracy in tasks like question answering and summarization. By categorizing the papers into themes, this survey highlights key methodologies, compares their performance, and identifies future directions for improving ASR and RAG technologies in handling real-world challenges.
10

Mahajan, Vaibhav Fanindra. "Retrieval-augmented generation: The technical foundation of intelligent AI Chatbots." World Journal of Advanced Research and Reviews 26, no. 1 (April 30, 2025): 4093–99. https://doi.org/10.30574/wjarr.2025.26.1.1571.

Abstract:
Retrieval-Augmented Generation (RAG) has emerged as a transformative approach in conversational AI technology, addressing fundamental limitations of traditional chatbot systems. This technical article explores the architecture, mechanisms, and advantages of RAG implementations. Traditional AI chatbots suffer from outdated knowledge bases, hallucination tendencies, and limited context awareness, constraints that RAG effectively overcomes by combining dynamic information retrieval with sophisticated text generation capabilities. The RAG framework operates through a multi-stage process encompassing query processing, information retrieval, contextualization, response generation, and delivery. This hybrid architecture yields substantial improvements in factual accuracy, knowledge recency, system transparency, and operational efficiency. The article further examines critical implementation considerations including vector database selection, embedding model optimization, document chunking strategies, retrieval algorithm configuration, and prompt engineering techniques. Looking toward future developments, the article highlights promising directions including multi-modal capabilities, hybrid retrieval methodologies, adaptive retrieval systems, and enterprise knowledge integration. It demonstrates how RAG represents a significant advancement in creating more intelligent, reliable, and context-aware AI conversational systems.
11

Yang, Yihe, Xiaoming Li, Hongwei Jin, and Kun Huang. "Advancing Structured Query Processing in Retrieval-Augmented Generation with Generative Semantic Integration." Frontiers in Computing and Intelligent Systems 9, no. 3 (September 27, 2024): 64–71. http://dx.doi.org/10.54097/z309gx59.

Abstract:
Retrieval-Augmented Generation (RAG) has become a pivotal approach in enhancing language models by incorporating external knowledge during the text generation process. However, traditional RAG systems often face challenges in processing structured queries, leading to suboptimal integration of retrieved information. In this paper, we introduce a novel method called Generative Semantic Integration (GSI), which advances structured query processing within RAG frameworks. GSI leverages generative models to semantically integrate structured queries with retrieved data, enabling more coherent and contextually relevant responses. Our experiments on benchmark datasets demonstrate that GSI significantly improves the performance of RAG systems in structured query understanding and response generation, outperforming existing baseline models.
12

Zhu, Jia, Hanghui Guo, Weijie Shi, Zhangze Chen, and Pasquale De Meo. "RaDIO: Real-Time Hallucination Detection with Contextual Index Optimized Query Formulation for Dynamic Retrieval Augmented Generation." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 24 (April 11, 2025): 26129–37. https://doi.org/10.1609/aaai.v39i24.34809.

Abstract:
The Dynamic Retrieval Augmented Generation (RAG) paradigm actively decides when and what to retrieve during the text generation process of Large Language Models (LLMs). However, current dynamic RAG methods fall short in both aspects: identifying the optimal moment to activate the retrieval module and crafting the appropriate query once retrieval is triggered. To overcome these limitations, we introduce an approach, namely, RaDIO, Real-Time Hallucination Detection with Contextual Index Optimized query formulation for dynamic RAG. The approach is specifically designed to make decisions on when and what to retrieve based on the LLM’s real-time information needs during the text generation process. We evaluate RaDIO along with existing methods comprehensively over several knowledge-intensive generation datasets. Experimental results show that RaDIO achieves superior performance on all tasks, demonstrating the effectiveness of our work.
13

Rackauckas, Zackary. "Rag-Fusion: A New Take on Retrieval Augmented Generation." International Journal on Natural Language Computing 13, no. 1 (February 28, 2024): 37–47. http://dx.doi.org/10.5121/ijnlc.2024.13103.

Abstract:
Infineon has identified a need for engineers, account managers, and customers to rapidly obtain product information. This problem is traditionally addressed with retrieval-augmented generation (RAG) chatbots, but in this study, I evaluated the use of the newly popularized RAG-Fusion method. RAG-Fusion combines RAG and reciprocal rank fusion (RRF) by generating multiple queries, reranking them with reciprocal scores and fusing the documents and scores. Through manually evaluating answers on accuracy, relevance, and comprehensiveness, I found that RAG-Fusion was able to provide accurate and comprehensive answers due to the generated queries contextualizing the original query from various perspectives. However, some answers strayed off topic when the generated queries' relevance to the original query was insufficient. This research marks significant progress in artificial intelligence (AI) and natural language processing (NLP) applications and demonstrates transformations in a global and multi-industry context.
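Reciprocal rank fusion itself is compact enough to show in full: each generated query produces its own ranking, and a document's fused score is the sum of 1 / (k + rank) over every ranking it appears in, with k = 60 the commonly used constant. A minimal sketch (the document IDs are illustrative, not Infineon data):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each appearance at 1-based position
    `rank` contributes 1 / (k + rank) to the document's fused score."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Rankings produced by three generated variants of one user query.
fused = reciprocal_rank_fusion([
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d3", "d2", "d1"],
])
```

Note that "d2" is never ranked first by any single query, yet its consistently high placement wins the fused ranking; this is the behavior that lets the generated queries contextualize the original one.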
14

Lau, David, Ganthan Narayana Samy, Dr Fiza Abdul Rahim, Nurazean Maarop, Mahiswaran Selvananthan, Mazlan Ali, and Sundresan Perumal. "Multimodal RAG Analysis of Product Datasheet." Open International Journal of Informatics 12, no. 2 (December 27, 2024): 1–12. https://doi.org/10.11113/oiji2024.12n2.309.

Abstract:
Large language models such as ChatGPT serve as multipurpose chatbots that can provide information across diverse disciplines. However, to generate timely and accurate responses, retrieval-augmented generation methods have been devised to enhance the output of these models. The release of vision models has paved the way for practitioners to perform multimodal retrieval-augmented generation on documents that commonly consist of a combination of text, images, and tables. Hence, this method is explored to analyze a product datasheet and match it against the minimum specifications required by potential clients. It is demonstrated that multimodal retrieval-augmented generation performed better than basic retrieval-augmented generation, which did not specifically consider the information contained in images and tables. While the performance of this method still lagged behind the commercially available GPT-4o, information is not exchanged with any external parties, which can address privacy issues concerning highly sensitive information. Incorporating the best practices highlighted in other studies can potentially improve the output of this method toward matching or exceeding the performance of commercially available tools.
15

Nandagopal, Siddharth. "Securing Retrieval-Augmented Generation Pipelines: A Comprehensive Framework." Journal of Computer Science and Technology Studies 7, no. 1 (January 12, 2025): 17–29. https://doi.org/10.32996/jcsts.2025.7.1.2.

Abstract:
Retrieval-Augmented Generation (RAG) has significantly enhanced the capabilities of Large Language Models (LLMs) by enabling them to access and incorporate external knowledge sources, thereby improving response accuracy and relevance. However, the security of RAG pipelines remains a paramount concern as these systems become integral to various critical applications. This paper introduces a comprehensive framework designed to secure RAG pipelines through the integration of advanced encryption techniques, zero-trust architecture, and structured guardrails. The framework employs symmetric and asymmetric encryption to protect data at rest and in transit, ensuring confidentiality and integrity throughout the data lifecycle. Adopting zero-trust principles, the framework mandates continuous verification of all entities within the data flow, effectively mitigating unauthorized access and lateral movement risks. Additionally, the implementation of guardrails, such as immutable system prompts and salted sequence tagging, fortifies the system against prompt injection and other malicious attacks. A detailed lifecycle security continuum is presented, illustrating the application of these security measures from data ingestion to decommissioning. Case studies across healthcare, finance, retail, and education sectors demonstrate the framework’s effectiveness in maintaining high performance and scalability without compromising security. This work provides a foundational model for future research and practical implementation, emphasizing the necessity of robust security protocols in the deployment of RAG-based applications.
16

Ajay Mukund, S., and K. S. Easwarakumar. "Optimizing Legal Text Summarization Through Dynamic Retrieval-Augmented Generation and Domain-Specific Adaptation." Symmetry 17, no. 5 (April 23, 2025): 633. https://doi.org/10.3390/sym17050633.

Abstract:
Legal text summarization presents distinct challenges due to the intricate and domain-specific nature of legal language. This paper introduces a novel framework integrating dynamic Retrieval-Augmented Generation (RAG) with domain-specific adaptation to enhance the accuracy and contextual relevance of legal document summaries. The proposed Dynamic Legal RAG system achieves a vital form of symmetry between information retrieval and content generation, ensuring that retrieved legal knowledge is both comprehensive and precise. Using the BM25 retriever with top-3 chunk selection, the system optimizes relevance and efficiency, minimizing redundancy while maximizing legally pertinent content. A key design feature is the compression ratio constraint (0.05 to 0.5), maintaining structural symmetry between the original judgment and its summary by balancing representation and information density. Extensive evaluations establish BM25 as the most effective retriever, striking an optimal balance between precision and recall. A comparative analysis of transformer-based (decoder-only) models (DeepSeek-7B, LLaMA 2-7B, and LLaMA 3.1-8B) demonstrates that LLaMA 3.1-8B, enriched with Legal Named Entity Recognition (NER) and the Dynamic RAG system, achieves superior performance with a BERTScore of 0.89. This study lays a strong foundation for future research in hybrid retrieval models, adaptive chunking strategies, and legal-specific evaluation metrics, with practical implications for case law analysis and automated legal drafting.
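The BM25 ranking at the heart of this retrieval stage is standard; the following is a minimal sketch of the Okapi BM25 score, not the authors' implementation (tokenization, parameter values, and the sample chunks are simplified and illustrative).

```python
import math

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document (a list of tokens) against the query terms
    using the standard Okapi BM25 formula with smoothed IDF."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n  # average document length
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            tf = doc.count(term)
            df = sum(1 for d in docs if term in d)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            score += idf * tf * (k1 + 1) / (
                tf + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return scores

chunks = [
    ["the", "appeal", "was", "dismissed"],
    ["weather", "report", "for", "today"],
    ["appeal", "court", "appeal", "granted"],
]
scores = bm25_scores(["appeal"], chunks)
```

Top-3 chunk selection, as used in the paper, is then just taking the three highest-scoring chunks.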
17

Yao, Chengyuan, and Satoshi Fujita. "Adaptive Control of Retrieval-Augmented Generation for Large Language Models Through Reflective Tags." Electronics 13, no. 23 (November 25, 2024): 4643. http://dx.doi.org/10.3390/electronics13234643.

Abstract:
While retrieval-augmented generation (RAG) enhances large language models (LLMs), it also introduces challenges that can impact accuracy and performance. In practice, RAG can obscure the intrinsic strengths of LLMs. Firstly, LLMs may become too reliant on external retrieval, underutilizing their own knowledge and reasoning, which can diminish responsiveness. Secondly, RAG may introduce irrelevant or low-quality data, adding noise that disrupts generation, especially with complex tasks. This paper proposes an RAG framework that uses reflective tags to manage retrieval, evaluating documents in parallel and applying the chain-of-thought (CoT) technique for step-by-step generation. The model selects the highest quality content for final output. The key contributions are as follows: (1) reducing hallucinations by focusing on high-scoring documents; (2) improving real-time performance through efficient retrieval; and (3) mitigating negative effects by filtering out irrelevant information using parallel generation and reflective tagging. These innovations aim to optimize RAG for more reliable, high-quality results.
18

Xu, Kehan, Kun Zhang, Jingyuan Li, Wei Huang, and Yuanzhuo Wang. "CRP-RAG: A Retrieval-Augmented Generation Framework for Supporting Complex Logical Reasoning and Knowledge Planning." Electronics 14, no. 1 (December 26, 2024): 47. https://doi.org/10.3390/electronics14010047.

Abstract:
The Retrieval-Augmented Generation (RAG) framework enhances Large Language Models (LLMs) by retrieving relevant knowledge to broaden their knowledge boundaries and mitigate factual hallucinations stemming from knowledge gaps. However, the RAG Framework faces challenges in effective knowledge retrieval and utilization; invalid or misused knowledge will interfere with LLM generation, reducing reasoning efficiency and answer quality. Existing RAG methods address these issues by decomposing and expanding queries, introducing special knowledge structures, and using reasoning process evaluation and feedback. However, the linear reasoning structures limit complex thought transformations and reasoning based on intricate queries. Additionally, knowledge retrieval and utilization are decoupled from reasoning and answer generation, hindering effective knowledge support during answer generation. To address these limitations, we propose the CRP-RAG framework, which employs reasoning graphs to model complex query reasoning processes more comprehensively and accurately. CRP-RAG guides knowledge retrieval, aggregation, and evaluation through reasoning graphs, dynamically adjusting the reasoning path based on evaluation results and selecting knowledge-sufficiency paths for answer generation. CRP-RAG outperforms the best LLM and RAG baselines by 2.46 in open-domain QA, 7.43 in multi-hop reasoning, and 4.2 in factual verification. Experiments also show the superior factual consistency and robustness of CRP-RAG over existing RAG methods. Extensive analyses confirm its accurate and fact-faithful reasoning and answer generation for complex queries.
19

Pokhrel, Sangita, Bina K C, and Prashant Bikram Shah. "A Practical Application of Retrieval-Augmented Generation for Website-Based Chatbots: Combining Web Scraping, Vectorization, and Semantic Search." Journal of Trends in Computer Science and Smart Technology 6, no. 4 (January 2025): 424–42. https://doi.org/10.36548/jtcsst.2024.4.007.

Abstract:
The Retrieval-Augmented Generation (RAG) model significantly enhances the capabilities of large language models (LLMs) by integrating information retrieval with text generation, which is particularly relevant for applications requiring context-aware responses based on dynamic data sources. This research study presents a practical implementation of a RAG model personalized for a chatbot that answers user inquiries from various specific websites. The methodology encompasses several key steps: web scraping using BeautifulSoup to extract relevant content, text processing to segment this content into manageable chunks, and vectorization to create embeddings for efficient semantic search. By employing a semantic search approach, the system retrieves the most relevant document segments based on user queries. The OpenAI API is then utilized to generate contextually appropriate responses from the retrieved information. Key results highlight the system's effectiveness in providing accurate and relevant answers, with evaluation metrics centered on response quality, retrieval efficiency, and user satisfaction. This research contributes a comprehensive integration of scraping, vectorization, and semantic search technologies into a cohesive chatbot application, offering valuable insights into the practical implementation of RAG models.
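The "manageable chunks" step in pipelines like this is typically a sliding word window over the scraped text, with some overlap so sentences that straddle a boundary survive whole in at least one chunk. A minimal sketch; the window and overlap sizes are illustrative, not taken from the study:

```python
def chunk_text(text, size=50, overlap=10):
    """Split text into windows of `size` words, with `overlap` words
    shared between consecutive chunks; each chunk is later embedded
    and indexed for semantic search."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

sample = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(sample)
```

At query time, the query embedding is compared against the stored chunk embeddings, and the top-matching chunks are passed to the generator as context.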
20

Yoon, Yeochan, and Sookyun Kim. "Trends and Prospects of Retrieval-Augmented Generation (RAG) for Generative AI." Journal of Korean Association of Computer Education 28, no. 2 (February 28, 2025): 69–80. https://doi.org/10.32431/kace.2025.28.2.007.

21

Zhang, Wan, and Jing Zhang. "Hallucination Mitigation for Retrieval-Augmented Large Language Models: A Review." Mathematics 13, no. 5 (March 4, 2025): 856. https://doi.org/10.3390/math13050856.

Abstract:
Retrieval-augmented generation (RAG) leverages the strengths of information retrieval and generative models to enhance the handling of real-time and domain-specific knowledge. Despite its advantages, limitations within RAG components may cause hallucinations, or more precisely termed confabulations in generated outputs, driving extensive research to address these limitations and mitigate hallucinations. This review focuses on hallucination in retrieval-augmented large language models (LLMs). We first examine the causes of hallucinations from different sub-tasks in the retrieval and generation phases. Then, we provide a comprehensive overview of corresponding hallucination mitigation techniques, offering a targeted and complete framework for addressing hallucinations in retrieval-augmented LLMs. We also investigate methods to reduce the impact of hallucination through detection and correction. Finally, we discuss promising future research directions for mitigating hallucinations in retrieval-augmented LLMs.
APA, Harvard, Vancouver, ISO, and other styles
22

Nakhod, O. "Using retrieval-augmented generation to elevate low-code developer skills." Artificial Intelligence 28, AI.2023.28(3) (November 30, 2023): 126–30. http://dx.doi.org/10.15407/jai2023.03.126.

Full text
Abstract:
This article proposes applying retrieval-augmented generation (RAG) to improve the skills of low-code developers by augmenting large language models with up-to-date domain-specific knowledge. As low-code development requires combining multiple systems into a final product, developers must consult several sources of documentation and various articles, videos, and forum threads. Such a process may be time-consuming, prompting the use of an LLM for an authoritative answer. However, LLMs often lack knowledge of low-code platforms, leading to hallucinations and superficial responses. RAG grounds LLM outputs in relevant retrieved information, suggesting that it may be applied effectively in low-code development. Heterogeneous data sources concerning low-code systems are converted to a text representation, split into logical chunks, and stored in a vector database. At query time, cosine similarity is used to retrieve the top-K documents, which are concatenated with the user query, and the resulting text is used as a prompt to an LLM. The results support the hypothesis that RAG models outperform standard LLMs in knowledge retrieval in this domain.
APA, Harvard, Vancouver, ISO, and other styles
23

Gresha Bhatia. "Intelligent Railways: Leveraging Retrieval-Augmented Generation for Smarter Systems." Communications on Applied Nonlinear Analysis 32, no. 3s (November 29, 2024): 91–103. https://doi.org/10.52783/cana.v32.2553.

Full text
Abstract:
In an era that demands faster, more secure, and more convenient travel, there is a need for systems that provide real-time updates. This paper presents a technical study on the integration of Retrieval-Augmented Generation (RAG) systems within railway operations, emphasizing their potential to enhance decision-making, service delivery, and passenger engagement. The study explores how RAG systems can streamline processes by providing accurate, context-aware responses to inquiries across various railway services, including ticketing, scheduling, and customer support. The findings highlight key challenges such as data security, infrastructure limitations, and the necessity for specialized training, while also emphasizing the operational benefits, including improved efficiency and greater accessibility of services. The methodology encompasses a comprehensive review of existing RAG implementations in transportation, followed by the design and analysis of a prototype system specifically tailored to railway needs, utilizing domain-specific datasets and natural language queries. This study offers valuable insights into the feasibility and scalability of RAG systems for enhancing the efficiency and responsiveness of railway operations.
APA, Harvard, Vancouver, ISO, and other styles
24

Gu, Jiafeng. "A Research of Challenges and Solutions in Retrieval Augmented Generation (RAG) Systems." Highlights in Science, Engineering and Technology 124 (February 18, 2025): 132–38. https://doi.org/10.54097/364hex16.

Full text
Abstract:
Retrieval-Augmented Generation (RAG) systems represent a significant innovation in the field of Natural Language Processing (NLP), ingeniously integrating Large Language Models (LLMs) with dynamic external knowledge retrieval. This amalgamation not only enhances the models' responsiveness to real-world knowledge but also addresses the limitations of conventional generative models in terms of knowledge update velocity and factual accuracy. This review examines the challenges faced by RAG systems and their solutions. It delves into the central architecture of RAG systems, encompassing retrieval components, generative components, and knowledge bases, with a particular focus on recent advancements that have expanded the boundaries of performance and functionality. The study critically analyzes major challenges such as retrieval efficiency and dynamic knowledge management. This paper evaluates various advanced solutions proposed in recent literature, comparing their efficacy and discussing the trade-offs involved. Ultimately, this paper aims to provide researchers, developers, and users of RAG systems with a comprehensive perspective, fostering ongoing innovation and the expansion of applications in this domain.
APA, Harvard, Vancouver, ISO, and other styles
25

Zhu, Xishi, Xiaoming Guo, Shengting Cao, Shenglin Li, and Jiaqi Gong. "StructuGraphRAG: Structured Document-Informed Knowledge Graphs for Retrieval-Augmented Generation." Proceedings of the AAAI Symposium Series 4, no. 1 (November 8, 2024): 242–51. http://dx.doi.org/10.1609/aaaiss.v4i1.31798.

Full text
Abstract:
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating external data sources beyond their training sets and querying predefined knowledge bases to generate accurate, context-rich responses. Most RAG implementations use vector similarity searches, but the effectiveness of this approach and the representation of knowledge bases remain underexplored. Emerging research suggests knowledge graphs as a promising solution. Therefore, this paper presents StructuGraphRAG, which leverages document structures to inform the extraction process and constructs knowledge graphs to enhance RAG for social science research, specifically using NSDUH datasets. Our method parses document structures to extract entities and relationships, constructing comprehensive and relevant knowledge graphs. Experimental results show that StructuGraphRAG outperforms traditional RAG methods in accuracy, comprehensiveness, and contextual relevance. This approach provides a robust tool for social science researchers, facilitating precise analysis of social determinants of health and justice, and underscores the potential of structured document-informed knowledge graph construction in AI and social science research.
APA, Harvard, Vancouver, ISO, and other styles
26

Siriwardhana, Shamane, Rivindu Weerasekera, Elliott Wen, Tharindu Kaluarachchi, Rajib Rana, and Suranga Nanayakkara. "Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering." Transactions of the Association for Computational Linguistics 11 (2023): 1–17. http://dx.doi.org/10.1162/tacl_a_00530.

Full text
Abstract:
Retrieval Augmented Generation (RAG) is a recent advancement in Open-Domain Question Answering (ODQA). RAG has only been trained and explored with a Wikipedia-based external knowledge base and is not optimized for use in other specialized domains such as healthcare and news. In this paper, we evaluate the impact of joint training of the retriever and generator components of RAG for the task of domain adaptation in ODQA. We propose RAG-end2end, an extension to RAG that can adapt to a domain-specific knowledge base by updating all components of the external knowledge base during training. In addition, we introduce an auxiliary training signal to inject more domain-specific knowledge. This auxiliary signal forces RAG-end2end to reconstruct a given sentence by accessing the relevant information from the external knowledge base. Our novel contribution is that, unlike RAG, RAG-end2end does joint training of the retriever and generator for the end QA task and domain adaptation. We evaluate our approach with datasets from three domains: COVID-19, News, and Conversations, and achieve significant performance improvements compared to the original RAG model. Our work has been open-sourced through the HuggingFace Transformers library, attesting to our work's credibility and technical consistency.
APA, Harvard, Vancouver, ISO, and other styles
27

Swacha, Jakub, and Michał Gracel. "Retrieval-Augmented Generation (RAG) Chatbots for Education: A Survey of Applications." Applied Sciences 15, no. 8 (April 11, 2025): 4234. https://doi.org/10.3390/app15084234.

Full text
Abstract:
Retrieval-Augmented Generation (RAG) overcomes the main barrier to the adoption of LLM-based chatbots in education: hallucinations. The uncomplicated architecture of RAG chatbots makes it relatively easy to implement chatbots that serve specific purposes and are thus capable of addressing various needs in the educational domain. Five years after the introduction of RAG, it is time to assess the progress attained in its adoption in education. This paper identifies 47 papers dedicated to RAG chatbots' uses for various kinds of educational purposes, which are analyzed in terms of their character, the target of the support provided by the chatbots, the thematic scope of the knowledge accessible via the chatbots, the underlying large language model, and the character of their evaluation.
APA, Harvard, Vancouver, ISO, and other styles
28

Deepak, M., A. Anusha, P. Phanivighnesh, and Dr G. Sreenivasulu. "Langchain-Chat with My PDF." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 03 (March 14, 2025): 1–9. https://doi.org/10.55041/ijsrem42403.

Full text
Abstract:
This paper presents a state-of-the-art system that is intended to facilitate natural language interaction with PDF documents. Leveraging the powerful Retrieval-Augmented Generation (RAG) algorithm, the solution seamlessly integrates information retrieval and generative language models to generate precise and context-sensitive responses. The operation starts when a user uploads a PDF file. The system proceeds to process the file, breaking it down into bite-sized text chunks that are kept organized for easy retrieval. Upon the submission of a query by a user, the RAG algorithm locates the most applicable parts of the document and uses a generative language model to build an understandable and accurate response. By combining the LangChain platform with the RAG approach, this paper introduces an effective tool for extracting precise information from long PDF documents efficiently. Its capacity to provide relevant and accurate responses makes it particularly valuable in education, research, and documentation professions where immediate access to accurate information is paramount. Key Words: Retrieval-Augmented Generation (RAG), LangChain Framework, PDF Document Querying, Information Retrieval, Generative Language Models
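The chunking step described above, breaking a PDF's text into "bite-sized" pieces kept in order for retrieval, can be sketched with a simplified character-based splitter with overlap. The chunk size and overlap values below are illustrative defaults, not LangChain's actual splitter parameters.

```python
def split_with_overlap(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping character windows, so that sentences cut at
    a chunk boundary still appear intact in the neighbouring chunk.
    A hypothetical simplification of LangChain-style text splitters."""
    chunks: list[str] = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
        # stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each consecutive pair of chunks shares `overlap` characters, which is what keeps context intact across boundaries when chunks are later embedded and retrieved independently.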
APA, Harvard, Vancouver, ISO, and other styles
29

Jeong, Minbyul, Jiwoong Sohn, Mujeen Sung, and Jaewoo Kang. "Improving medical reasoning through retrieval and self-reflection with retrieval-augmented large language models." Bioinformatics 40, Supplement_1 (June 28, 2024): i119—i129. http://dx.doi.org/10.1093/bioinformatics/btae238.

Full text
Abstract:
Recent proprietary large language models (LLMs), such as GPT-4, have achieved a milestone in tackling diverse challenges in the biomedical domain, ranging from multiple-choice questions to long-form generations. To address challenges that still cannot be handled with the encoded knowledge of LLMs, various retrieval-augmented generation (RAG) methods have been developed by searching documents from the knowledge corpus and appending them unconditionally or selectively to the input of LLMs for generation. However, when applying existing methods to different domain-specific problems, poor generalization becomes apparent, leading to fetching incorrect documents or making inaccurate judgments. In this paper, we introduce Self-BioRAG, a framework reliable for biomedical text that specializes in generating explanations, retrieving domain-specific documents, and self-reflecting on generated responses. We utilize 84k filtered biomedical instruction sets to train Self-BioRAG so that it can assess its generated explanations with customized reflective tokens. Our work proves that domain-specific components, such as a retriever, a domain-related document corpus, and instruction sets, are necessary for adhering to domain-related instructions. Using three major medical question-answering benchmark datasets, experimental results of Self-BioRAG demonstrate significant performance gains by achieving a 7.2% absolute improvement on average over the state-of-the-art open-foundation model with a parameter size of 7B or less. Similarly, Self-BioRAG outperforms RAG by 8% Rouge-1 score in generating more proficient answers on two long-form question-answering benchmarks on average. Overall, we analyze that Self-BioRAG finds the clues in the question, retrieves relevant documents if needed, and understands how to answer with information from retrieved documents and encoded knowledge as a medical expert does. We release our data and code for training our framework components and model weights (7B and 13B) to enhance capabilities in biomedical and clinical domains. Availability and implementation: Self-BioRAG is available at https://github.com/dmis-lab/self-biorag.
APA, Harvard, Vancouver, ISO, and other styles
30

Radeva, Irina, Ivan Popchev, Lyubka Doukovska, and Miroslava Dimitrova. "Web Application for Retrieval-Augmented Generation: Implementation and Testing." Electronics 13, no. 7 (April 4, 2024): 1361. http://dx.doi.org/10.3390/electronics13071361.

Full text
Abstract:
The purpose of this paper is to explore the implementation of retrieval-augmented generation (RAG) technology with open-source large language models (LLMs). A dedicated web-based application, PaSSER, was developed, integrating RAG with Mistral:7b, Llama2:7b, and Orca2:7b models. Various software instruments were used in the application’s development. PaSSER employs a set of evaluation metrics, including METEOR, ROUGE, BLEU, perplexity, cosine similarity, Pearson correlation, and F1 score, to assess LLMs’ performance, particularly within the smart agriculture domain. The paper presents the results and analyses of two tests. One test assessed the performance of LLMs across different hardware configurations, while the other determined which model delivered the most accurate and contextually relevant responses within RAG. The paper discusses the integration of blockchain with LLMs to manage and store assessment results within a blockchain environment. The tests revealed that GPUs are essential for fast text generation, even for 7b models. Orca2:7b on Mac M1 was the fastest, and Mistral:7b had superior performance on the 446 question–answer dataset. The discussion is on technical and hardware considerations affecting LLMs’ performance. The conclusion outlines future developments in leveraging other LLMs, fine-tuning approaches, and further integration with blockchain and IPFS.
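Among the evaluation metrics listed (METEOR, ROUGE, BLEU, perplexity, cosine similarity, Pearson correlation, F1), the token-overlap F1 score is simple enough to sketch directly. This simplified version splits on whitespace and ignores the tokenization and normalization details real QA benchmarks apply; it is a generic illustration, not PaSSER's evaluation code.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    # multiset intersection counts each shared token at most as often
    # as it appears in both strings
    common = Counter(pred) & Counter(ref)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A perfect match scores 1.0, a disjoint answer 0.0, and partial answers fall in between, which is why F1 complements exact-match metrics when evaluating RAG responses.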
APA, Harvard, Vancouver, ISO, and other styles
31

Dhami, Aatishkumar, and Lagan Goel. "Optimizing retrieval augmented generation pipelines for domain specific applications." International Journal of Research in Modern Engineering & Emerging Technology 13, no. 3 (March 2025): 55–72. https://doi.org/10.63345/ijrmeet.org.v13.i3.4.

Full text
Abstract:
Retrieval Augmented Generation (RAG) pipelines have emerged as a transformative approach in integrating external knowledge into generative models. However, tailoring these systems to domain-specific applications presents unique challenges, including the handling of specialized vocabularies and intricate contextual nuances. This paper introduces a novel optimization framework for RAG pipelines, emphasizing adaptive retrieval strategies, customized knowledge bases, and fine-tuned generative components. By incorporating domain-tailored filtering mechanisms and dynamically adjusting retrieval parameters, our approach significantly enhances the accuracy and relevance of generated outputs. Extensive experiments across various specialized fields, such as legal analysis and medical documentation, demonstrate notable improvements in precision and recall, affirming the framework’s effectiveness. The proposed methodology not only bridges the gap between general-purpose language models and domain-specific needs but also lays a foundation for more context-aware and reliable AI-driven applications in specialized industries.
APA, Harvard, Vancouver, ISO, and other styles
32

Tanyildiz, Derya, Serkan Ayvaz, and Mehmet Fatih Amasyali. "Enhancing Retrieval-Augmented Generation Accuracy with Dynamic Chunking and Optimized Vector Search." Orclever Proceedings of Research and Development 5, no. 1 (December 31, 2024): 215–25. https://doi.org/10.56038/oprd.v5i1.516.

Full text
Abstract:
Retrieval-Augmented Generation (RAG) architectures depend on the integration of efficient retrieval and ranking mechanisms to enhance response accuracy and relevance. This study investigates a novel approach to improving the response performance of RAG systems, leveraging dynamic chunking for contextual coherence, Sentence-Transformers (all-mpnet-base-v2) for high-quality embeddings, and cross-encoder-based re-ranking for retrieval refinement. Our evaluation utilizes RAGAS metrics to assess key performance metrics, including faithfulness, relevancy, correctness, and context precision. Empirical evaluations highlighted the significant impact of index choice on the performance. Our proposed approach integrates the FAISS HNSW index with re-ranking, resulting in a balanced architecture that improves response fidelity without compromising efficiency. These insights underscore the importance of advanced indexing and retrieval techniques in bridging the gap between large-scale language models and domain-specific information needs. The findings provide a robust framework for future research in optimizing RAG systems, particularly in scenarios requiring high-context preservation and precision.
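The retrieve-then-rerank flow this abstract evaluates can be sketched as a two-stage function: a cheap first stage selects candidates, then a more expensive scorer reorders them. In the sketch below, lexical overlap stands in for the FAISS HNSW vector index and a caller-supplied scoring function stands in for the cross-encoder; both stand-ins are assumptions for illustration, not the paper's components.

```python
from typing import Callable


def first_stage(query: str, docs: list[str], n: int = 10) -> list[str]:
    """Cheap candidate generation; a stand-in for an ANN index such as FAISS HNSW."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:n]


def rerank(query: str, candidates: list[str],
           cross_score: Callable[[str, str], float], k: int = 3) -> list[str]:
    """Reorder the candidates with a pairwise scorer; in a real system,
    cross_score would be a cross-encoder over (query, document) pairs."""
    return sorted(candidates, key=lambda d: cross_score(query, d), reverse=True)[:k]
```

The design point the paper stresses holds in miniature here: the first stage trades accuracy for speed over the whole corpus, while the reranker spends its budget only on the short candidate list.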
APA, Harvard, Vancouver, ISO, and other styles
33

Bazzi, Wafaa. "The Wonders of RAG: Streamlining Knowledge with Advanced Techniques Systematic Literature Review Report." Journal of Neurology Research Reviews & Reports 7, no. 3 (March 31, 2025): 1–4. https://doi.org/10.47363/jnrrr/2025(7)175.

Full text
Abstract:
The Retrieval-Augmented Generation (RAG) framework enhances Large Language Model (LLM) performance by incorporating external knowledge through information retrieval, addressing inherent limitations in standard LLMs.
APA, Harvard, Vancouver, ISO, and other styles
34

Iaroshev, Ivan, Ramalingam Pillai, Leandro Vaglietti, and Thomas Hanne. "Evaluating Retrieval-Augmented Generation Models for Financial Report Question and Answering." Applied Sciences 14, no. 20 (October 12, 2024): 9318. http://dx.doi.org/10.3390/app14209318.

Full text
Abstract:
This study explores the application of retrieval-augmented generation (RAG) to improve the accuracy and reliability of large language models (LLMs) in the context of financial report analysis. The focus is on enabling private investors to make informed decisions by enhancing the question-and-answering capabilities regarding the half-yearly or quarterly financial reports of banks. The study adopts a Design Science Research (DSR) methodology to develop and evaluate an RAG system tailored for this use case. The study conducts a series of experiments to explore models in which different RAG components are used. The aim is to enhance context relevance, answer faithfulness, and answer relevance. The results indicate that model one (OpenAI ADA and OpenAI GPT-4) achieved the highest performance, showing robust accuracy and relevance in response. Model three (MiniLM Embedder and OpenAI GPT-4) scored significantly lower, indicating the importance of high-quality components. The evaluation also revealed that well-structured reports result in better RAG performance than less coherent reports. Qualitative questions received higher scores than the quantitative ones, demonstrating the RAG’s proficiency in handling descriptive data. In conclusion, a tailored RAG can aid investors in providing accurate and contextually relevant information from financial reports, thereby enhancing decision making.
APA, Harvard, Vancouver, ISO, and other styles
35

Zhu, Yutao, Zhaoheng Huang, Zhicheng Dou, and Ji-Rong Wen. "One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 24 (April 11, 2025): 26166–74. https://doi.org/10.1609/aaai.v39i24.34813.

Full text
Abstract:
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs) for generating more factual, accurate, and up-to-date content. Existing methods either optimize prompts to guide LLMs in leveraging retrieved information or directly fine-tune LLMs to adapt to RAG scenarios. Although fine-tuning can yield better performance, it often compromises the LLMs' general generation capabilities by modifying their parameters. This limitation poses challenges in practical applications, especially when LLMs are already deployed, as parameter adjustments may affect their original functionality. To address this, we propose a novel method that involves learning scalable and pluggable virtual tokens for RAG. By maintaining the LLMs' original parameters and fine-tuning only the embeddings of these pluggable tokens, our approach not only enhances LLMs' performance but also preserves their general generation capabilities. Furthermore, we design several training strategies to improve the scalability, flexibility, and generalizability of our method. Comprehensive experiments across 12 question-answering tasks demonstrate the superiority of our approach.
APA, Harvard, Vancouver, ISO, and other styles
36

Sarat Kiran. "Hybrid Retrieval-Augmented Generation (RAG) Systems with Embedding Vector Databases." International Journal of Scientific Research in Computer Science, Engineering and Information Technology 11, no. 2 (March 28, 2025): 2694–702. https://doi.org/10.32628/cseit25112702.

Full text
Abstract:
This article explores the integration of embedding vector databases into Retrieval-Augmented Generation (RAG) systems to enhance the capabilities of large language models. The article explores how hybrid retrieval strategies combining dense vector search with traditional keyword-based methods can address the limitations of standalone LLMs, particularly regarding knowledge cutoff, hallucinations, and access to domain-specific information. The article presents a comprehensive framework covering theoretical foundations, methodological approaches, implementation considerations, and experimental results across multiple domains. By leveraging vector embeddings for semantic search alongside traditional retrieval techniques, the proposed system demonstrates significant improvements in accuracy, relevance, and factual correctness while maintaining reasonable query response times. The article provides valuable insights for enterprise-scale deployments of RAG systems across various application domains including healthcare, legal, technical support, and financial services.
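A common way to combine dense vector search with keyword-based retrieval, as the hybrid strategy above describes, is to fuse the two ranked lists. The sketch below uses reciprocal rank fusion, a generic fusion method offered here as an illustration; the article does not specify that this is its fusion rule.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists (e.g., one from dense vector search,
    one from BM25-style keyword search) into a single ranking.
    The constant k dampens the influence of low-ranked results."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

Documents that appear near the top of both lists accumulate the highest fused score, which is why hybrid retrieval tends to be more robust than either method alone.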
APA, Harvard, Vancouver, ISO, and other styles
37

Arora, Kartik. "Real-Time Retrieval-Augmented Generation in Healthcare Using LLMs for Audio-based Querying and Retrieval." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 01 (January 28, 2025): 1–9. https://doi.org/10.55041/ijsrem41141.

Full text
Abstract:
Healthcare information systems are critical for efficient decision-making and patient care. This paper presents a real-time audio-query-driven retrieval-augmented generation (RAG) system using a large language model (LLM) that interfaces with a healthcare database and a vector database. The system allows users to make real-time audio queries, retrieves the necessary information using LLM tools, and delivers audio responses. Comparative performance evaluations between traditional healthcare information systems and our proposed LLM-based approach show significant improvements in query throughput, cost-efficiency, and response time. Key Words: RAG, LLM, healthcare database, real-time audio input, vector database, AI in healthcare
APA, Harvard, Vancouver, ISO, and other styles
38

Liu, Shuxin. "Exploring Optimal Prefetching Sizes in RaLMSpec to Enhance Retrieval-Augmented Generation Efficiency." Highlights in Science, Engineering and Technology 138 (May 11, 2025): 24–31. https://doi.org/10.54097/hhff9g78.

Full text
Abstract:
Retrieval-augmented generation (RAG) frameworks like RaLMSpec enhance language model performance by integrating external knowledge. A key method for accelerating RaLMSpec is prefetching, which determines the number of documents to retrieve in advance to balance retrieval speed and cache utilization. This study introduces and tests both static and dynamic prefetching strategies to optimize performance in RaLMSpec. Static prefetching uses fixed sizes, while dynamic prefetching adjusts based on real-time factors including task complexity, cache hit rates, and retrieval latency. Experiments across multiple datasets, retrievers, and language models demonstrate that dynamic prefetching significantly reduces latency by 18% on average, outperforming static strategies. Dynamic prefetching adapts to varying task demands, providing a better balance between retrieval and caching efficiency. Among static strategies, a prefetch size of 64 offers the best trade-off between latency reduction and cache utilization. The results highlight that dynamic prefetching is optimal for environments with fluctuating task complexity, while static prefetching with a size of 64 is effective for predictable tasks. This study provides valuable insights for improving RAG system efficiency and suggests future directions, including machine learning-based adaptations and hardware optimizations.
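The static-versus-dynamic distinction can be sketched as a policy that adjusts the prefetch window from runtime signals. The thresholds and doubling/halving rule below are purely illustrative assumptions, not the policy evaluated in the paper.

```python
def next_prefetch_size(current: int, hit_rate: float, latency_ms: float,
                       low: int = 8, high: int = 128) -> int:
    """Hypothetical dynamic prefetching policy: grow the window when the cache
    misses often and retrieval is cheap; shrink it when hits are already high
    or retrieval has become slow. Clamp to [low, high]."""
    if hit_rate < 0.5 and latency_ms < 50:
        current *= 2          # cache is cold but retrieval is cheap: fetch more
    elif hit_rate > 0.8 or latency_ms > 200:
        current //= 2         # cache suffices or retrieval is expensive: fetch less
    return max(low, min(high, current))
```

A static strategy would simply return a constant (the paper finds 64 to be the best fixed size); the dynamic variant recomputes the window each round from the observed signals.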
APA, Harvard, Vancouver, ISO, and other styles
39

Ngangmeni, Joëd, and Danda B. Rawat. "Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy." AI 6, no. 3 (March 1, 2025): 47. https://doi.org/10.3390/ai6030047.

Full text
Abstract:
Background: Both early researchers, such as new graduate students, and experienced researchers face the challenge of sifting through vast amounts of literature to find their needle in a haystack. This process can be time-consuming, tedious, or frustratingly unproductive. Methods: Using only abstracts and titles of research articles, we compare three retrieval methods—Bibliographic Indexing/Databasing (BI/D), Retrieval-Augmented Generation (RAG), and Graph Retrieval-Augmented Generation (GraphRAG)—which reportedly offer promising solutions to these common challenges. We assess their performance using two sets of Large Language Model (LLM)-generated queries: one set of queries with context and the other set without context. Our study evaluates six sub-models—four from Light Retrieval-Augmented Generation (LightRAG) and two from Microsoft’s Graph Retrieval-Augmented Generation (MGRAG). We examine these sub-models across four key criteria—comprehensiveness, diversity, empowerment, and directness—as well as the overall combination of these factors. Results: After three separate experiments, we observe that MGRAG has a slight advantage over LightRAG, naïve RAG, and BI/D for answering queries that require a semantic understanding of our data pool. The results (displayed in grouped bar charts) provide clear and accessible comparisons to help researchers quickly make informed decisions on which method best suits their needs. Conclusions: Supplementing BI/D with RAG or GraphRAG pipelines would positively impact the way both beginners and experienced researchers find and parse through volumes of potentially relevant information.
APA, Harvard, Vancouver, ISO, and other styles
40

Tohir, Herdian, Nita Merlina, and Muhammad Haris. "UTILIZING RETRIEVAL-AUGMENTED GENERATION IN LARGE LANGUAGE MODELS TO ENHANCE INDONESIAN LANGUAGE NLP." JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer) 10, no. 2 (November 19, 2024): 352–60. http://dx.doi.org/10.33480/jitk.v10i2.5916.

Full text
Abstract:
Enhancing Large Language Models (LLMs) such as ChatGPT with Retrieval-Augmented Generation (RAG) techniques is of pressing importance for the development of natural language translation technology and dialogue systems. LLMs often encounter obstacles when addressing specialized requests that require information outside the training data. This study aims to discuss the use of Retrieval-Augmented Generation (RAG) in large-scale language models to improve the performance of Natural Language Processing (NLP) in Indonesian, which has so far been poorly supported by high-quality data, and to overcome the limitations of traditional language models in understanding Indonesian context. The method used is a combination of retrieval capabilities (external information search) with generation (text generation), where the model leverages broader and more structured base data through the retrieval process to produce more accurate and relevant text. The data used includes an Indonesian corpus of the 30-juz translation of the Quran into Indonesian. The results of the trial show that the RAG approach significantly improves the performance of the model in various NLP tasks, including token usage optimization, text classification, and context understanding, by increasing the accuracy and relevance of the results.
APA, Harvard, Vancouver, ISO, and other styles
41

Bora, Arunabh, and Heriberto Cuayáhuitl. "Systematic Analysis of Retrieval-Augmented Generation-Based LLMs for Medical Chatbot Applications." Machine Learning and Knowledge Extraction 6, no. 4 (October 18, 2024): 2355–74. http://dx.doi.org/10.3390/make6040116.

Full text
Abstract:
Artificial Intelligence (AI) has the potential to revolutionise the medical and healthcare sectors. AI and related technologies could significantly address some supply-and-demand challenges in the healthcare system, such as medical AI assistants, chatbots and robots. This paper focuses on tailoring LLMs to medical data utilising a Retrieval-Augmented Generation (RAG) database to evaluate their performance in a computationally resource-constrained environment. Existing studies primarily focus on fine-tuning LLMs on medical data, but this paper combines RAG and fine-tuned models and compares them against base models using RAG or only fine-tuning. Open-source LLMs (Flan-T5-Large, LLaMA-2-7B, and Mistral-7B) are fine-tuned using the medical datasets Meadow-MedQA and MedMCQA. Experiments are reported for response generation and multiple-choice question answering. The latter uses two distinct methodologies: Type A, as standard question answering via direct choice selection; and Type B, as language generation and probability confidence score generation of choices available. Results in the medical domain revealed that Fine-tuning and RAG are crucial for improved performance, and that methodology Type A outperforms Type B.
APA, Harvard, Vancouver, ISO, and other styles
42

Yang, Chi Bok, and Yang Sok Kim. "Implementation of Retrieval Augmented Generation (RAG) Model Using LLM: A RapidMiner-Based Approach." Korean Institute of Smart Media 14, no. 2 (February 28, 2025): 34–42. https://doi.org/10.30693/smj.2025.14.2.34.

Full text
Abstract:
Generative AI technology, driven by Large Language Models (LLMs), is being increasingly utilized to overcome existing limitations. Retrieval-Augmented Generation (RAG) has emerged as an effective approach to reduce hallucination in LLMs by leveraging up-to-date and domain-specific knowledge beyond training data. However, most studies propose programming-based implementations. This research introduces a GUI-based RAG framework using RapidMiner, to construct RAG systems without programming proficiency. The methodology includes storing and retrieving embeddings with the Qdrant vector database and generating question-and-answer pairs via the OpenAI API. Practical demonstrations confirm the system’s effectiveness in real-world scenarios, offering a simpler and more efficient method for developing generative AI services with LLMs.
APA, Harvard, Vancouver, ISO, and other styles
43

Chandra, Prudhvi. "ENHANCING INFORMATION RETRIEVAL WITH RETRIEVAL-AUGMENTED GENERATION (RAG) FOR IMPROVED CONVERSATIONAL AI." INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING AND TECHNOLOGY 16, no. 1 (February 14, 2025): 3344–57. https://doi.org/10.34218/ijcet_16_01_233.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Yilma, Girma M., Jose A. Ayala-Romero, Andres Garcia-Saavedra, and Xavier Costa-Perez. "TelecomRAG: Taming Telecom Standards with Retrieval Augmented Generation and LLMs." ACM SIGCOMM Computer Communication Review 54, no. 3 (July 30, 2024): 18–23. https://doi.org/10.1145/3711992.3711996.

Full text
Abstract:
Large Language Models (LLMs) have immense potential to transform the telecommunications industry. They could help professionals understand complex standards, generate code, and accelerate development. However, traditional LLMs struggle with the precision and source verification essential for telecom work. To address this, specialized LLM-based solutions tailored to telecommunication standards are needed. This Editorial Note showcases how Retrieval-Augmented Generation (RAG) can offer a way to create precise, factual answers. In particular, we show how to build a Telecommunication Standards Assistant that provides accurate, detailed, and verifiable responses. We show a usage example of this framework using 3GPP Release 16 and Release 18 specification documents. We believe that the application of RAG can bring significant value to the telecommunications field.
APA, Harvard, Vancouver, ISO, and other styles
45

Huang, Jie, Mo Wang, Yunpeng Cui, Juan Liu, Li Chen, Ting Wang, Huan Li, and Jinming Wu. "Layered Query Retrieval: An Adaptive Framework for Retrieval-Augmented Generation in Complex Question Answering for Large Language Models." Applied Sciences 14, no. 23 (November 27, 2024): 11014. http://dx.doi.org/10.3390/app142311014.

Full text
Abstract:
Retrieval-augmented generation (RAG) addresses the problem of knowledge cutoff and overcomes the inherent limitations of pre-trained language models by retrieving relevant information in real time. However, challenges related to efficiency and accuracy persist in current RAG strategies. A key issue is how to dynamically select appropriate methods for user queries of varying complexity. This study introduces a novel adaptive retrieval-augmented generation framework termed Layered Query Retrieval (LQR). The LQR framework focuses on query complexity classification, retrieval strategies, and relevance analysis, utilizing a custom-built training dataset to train smaller models that aid the large language model (LLM) in efficiently retrieving relevant information. A central technique in LQR is a semantic rule-based approach to distinguish between different levels of multi-hop queries. The process begins by parsing the user’s query for keywords, followed by a keyword-based document retrieval. Subsequently, we employ a natural language inference (NLI) model to assess whether the retrieved document is relevant to the query. We validated our approach on multiple single-hop and multi-hop datasets, demonstrating significant improvements in both accuracy and efficiency compared to existing single-step, multi-step, and adaptive methods. Our method exhibits high accuracy and efficiency, particularly on the HotpotQA dataset, where it outperforms the Adaptive-RAG method by improving accuracy by 9.4% and the F1 score by 16.14%. The proposed approach carefully balances retrieval efficiency with the accuracy of the LLM’s responses.
APA, Harvard, Vancouver, ISO, and other styles
46

Kwon, Mincheol, Jimin Bang, Seyoung Hwang, Junghoon Jang, and Woosin Lee. "A Dynamic-Selection-Based, Retrieval-Augmented Generation Framework: Enhancing Multi-Document Question-Answering for Commercial Applications." Electronics 14, no. 4 (February 8, 2025): 659. https://doi.org/10.3390/electronics14040659.

Full text
Abstract:
Commercial multi-document question-answering (QA) applications require a high multi-document retrieval performance, while simultaneously minimizing Application Programming Interface (API) usage costs of large language models (LLMs) and system complexity. To address this need, we designed the Dynamic-Selection-based, Retrieval-Augmented Generation (DS-RAG) framework, which consists of two key modules: an Entity-Preserving Question Decomposition (EPQD) module that effectively decomposes questions while preserving the entities of the original user’s question to reduce unnecessary retrieval and enhance performance, and a Dynamic Input Context Selection (DICS) module that optimizes the LLM input context based on the content of the user’s question, thereby minimizing API usage. We evaluated the proposed framework on a newly constructed dataset containing questions that require up to four multi-document retrievals. Experimental results demonstrated the new framework’s superior performance in terms of retrieval quality, input context optimization, and final answer generation compared to existing approaches. Consequently, the DS-RAG framework can be leveraged to develop domain-specific commercial QA applications in the future.
APA, Harvard, Vancouver, ISO, and other styles
47

Petroni, Fabio, Federico Siciliano, Fabrizio Silvestri, and Giovanni Trappolini. "Report on the 1st Workshop on Information Retrieval's Role in RAG Systems (IR-RAG 2024) at SIGIR 2024." ACM SIGIR Forum 58, no. 2 (December 2024): 1–12. https://doi.org/10.1145/3722449.3722463.

Full text
Abstract:
Retrieval-Augmented Generation (RAG) systems have become a transformative component of artificial intelligence, combining the strengths of information retrieval (IR) and generative models to tackle complex problems in diverse domains. Despite their rapid adoption and proven potential, the role of IR within RAG frameworks remains under-explored, with research often prioritising advances in generative techniques. This imbalance has left a critical gap in understanding how robust retrieval mechanisms can optimise the overall performance and reliability of RAG systems. The 1st Workshop on Information Retrieval's Role in RAG Systems (IR-RAG), held at SIGIR 2024, addressed this gap by focusing on the fundamental principles of information retrieval within the RAG paradigm. The workshop provided a dedicated platform for researchers, practitioners, and experts to share insights, foster discussions, and present innovative research highlighting the centrality of IR in RAG frameworks. Through keynote talks, oral and poster presentations, and collaborative breakout sessions, the workshop highlighted both the challenges and opportunities in refining retrieval methodologies to support the generative components of RAG systems. This event has set the stage for advancing research on IR's pivotal role in shaping the future of RAG systems. The proceedings of the workshop, published in CEUR Workshop Proceedings, are available at https://ceur-ws.org/Vol-3784/. Date: 18 July 2024. Website: https://coda.io/@rstless-group/ir-rag-sigir24.
APA, Harvard, Vancouver, ISO, and other styles
48

Pal Singh Shobhit, Jatin. "Automating Threat Intelligence Analysis with Retrieval Augmented Generation (RAG) for Enhanced Cybersecurity Posture." International Journal of Science and Research (IJSR) 13, no. 5 (May 5, 2024): 251–55. http://dx.doi.org/10.21275/sr24502103758.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Iapăscurtă, Victor, Sergey Kronin, and Ion Fiodorov. "RETRIEVAL-AUGMENTED GENERATION USING DOMAIN-SPECIFIC TEXT: A PILOT STUDY." JOURNAL OF ENGINEERING SCIENCE 31, no. 2 (July 15, 2024): 48–59. http://dx.doi.org/10.52326/jes.utm.2024.31(2).05.

Full text
Abstract:
The natural language processing (NLP) field has witnessed remarkable advancements with the advent of large language models (LLMs) like GPT, Gemini, Claude, etc. These models are trained on vast amounts of text data, allowing them to generate human-like responses for various tasks. However, despite their impressive capabilities, LLMs have limitations in their ability to incorporate and reason over external knowledge that is not in their training data. This limitation is particularly evident in the case of domain-specific knowledge. This situation has given rise to the concept of retrieval augmented generation (RAG), an approach that combines the generative power of LLMs with the ability to retrieve and integrate relevant information from external knowledge sources. This research attempts to use RAG as a module in an application designed to answer questions concerning a specific domain, namely social philosophy/philosophy of management, using a published book from the respective domain as an external source. The paper analyzes the application's output, draws conclusions, and outlines future directions for improving the accuracy of the output.
APA, Harvard, Vancouver, ISO, and other styles
50

Toukmaji, Christopher, and Allison Tee. "Retrieval-Augmented Generation and LLM Agents for Biomimicry Design Solutions." Proceedings of the AAAI Symposium Series 3, no. 1 (May 20, 2024): 273–78. http://dx.doi.org/10.1609/aaaiss.v3i1.31210.

Full text
Abstract:
We present BIDARA, a Bio-Inspired Design And Research Assistant, to address the complexity of biomimicry, the practice of designing modern-day engineering solutions inspired by biological phenomena. Large Language Models (LLMs) have been shown to act as sufficient general-purpose task solvers, but they often hallucinate and fail in regimes that require domain-specific and up-to-date knowledge. We integrate Retrieval-Augmented Generation (RAG) and Reasoning-and-Action agents to aid LLMs in avoiding hallucination and utilizing updated knowledge during generation of biomimetic design solutions. We find that incorporating RAG increases the feasibility of the design solutions in both prompting and agent settings, and we use these findings to guide our ongoing work. To the best of our knowledge, this is the first work that integrates and evaluates Retrieval-Augmented Generation within LLM-generated biomimetic design solutions.
APA, Harvard, Vancouver, ISO, and other styles