Dissertations / Theses on the topic 'Large Language Models (LLM)'
Consult the top 18 dissertations / theses for your research on the topic 'Large Language Models (LLM).'
Naqvi, Syed Muhammad Raza. "Exploration des LLM et de l'XAI sémantique pour les capacités des robots industriels et les connaissances communes en matière de fabrication." Electronic Thesis or Diss., Université de Toulouse (2023-....), 2025. http://www.theses.fr/2025TLSEP014.
In Industry 4.0, advanced manufacturing is vital in shaping future factories, enabling enhanced planning, scheduling, and control. The ability to adapt production lines swiftly in response to customer demands or unexpected situations is essential to the future of manufacturing. While AI is emerging as a solution, industries still rely on human expertise due to trust issues and a lack of transparency in AI decisions. Explainable AI (XAI) integrating commonsense knowledge related to manufacturing is crucial for making AI decisions understandable and trustworthy. Within this context, we propose the semantic XAI (S-XAI) framework, an integrated solution combining machine specifications with manufacturing commonsense knowledge (MCSK) to provide explainable and transparent decision-making. The focus is on providing real-time machine capabilities to ensure precise decision-making while simultaneously explaining the decision-making process to all involved stakeholders. Accordingly, the first objective was formalizing machine specifications, including capabilities, capacities, functions, quality, and process characteristics, focusing on robotics. To do so, we created a Robot Capability Ontology (RCO) formalizing all relevant aspects of machine specifications, such as Capability, Capacity, Function, Quality, and Process Characteristics. On top of this formalization, the RCO allows manufacturing stakeholders to capture robotic capabilities described in specification manuals (advertised capabilities) and compare them with real-world performance (operational capabilities). The RCO is based on the Machine Service Description Language, a domain reference ontology created for manufacturing services, and is aligned with the Basic Formal Ontology, Industrial Foundry Ontology, Information Artifact Ontology, and Relations Ontology. The second objective was the formalization of MCSK.
We introduce MCSK and present a methodology for identifying it, starting with recognizing different commonsense knowledge (CSK) patterns in manufacturing and aligning them with manufacturing concepts. Extracting MCSK in a usable form is challenging, so our approach uses LLMs to structure MCSK into natural-language (NL) statements that facilitate rule-based reasoning, thereby enhancing decision-making capabilities. The third and final objective is to propose an S-XAI framework utilizing RCO and MCSK to assess whether existing machines can perform specific tasks and to generate understandable NL explanations. This was achieved by integrating the RCO, which provides operational capabilities like repeatability and precision, with MCSK, which outlines the process requirements. By utilizing MCSK-based semantic reasoning, the S-XAI system provides NL explanations that detail each reasoning step and outcome. In the S-XAI framework, a neural network (NN) predicts the operational capabilities of robots, while symbolic AI incorporates these predictions within an MCSK-based reasoning system grounded in the RCO. This hybrid setup draws on the strengths of each AI system and ensures that predictions support a transparent decision-making process. Additionally, S-XAI enhances the interpretability of NN predictions through XAI techniques such as LIME, SHAP, and PDP, clarifying NN predictions and enabling detailed insights for better calibration and proactive management, ultimately fostering a resilient and informed manufacturing environment.
Labeau, Matthieu. "Neural language models : Dealing with large vocabularies." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS313/document.
This work investigates practical methods to ease training and improve the performance of neural language models with large vocabularies. The main limitation of neural language models is their computational cost, which grows linearly with the size of the vocabulary. Despite several training tricks, the most straightforward way to limit computation time is to limit the vocabulary size, which is not a satisfactory solution for numerous tasks. Most of the existing methods used to train large-vocabulary language models revolve around avoiding the computation of the partition function, which ensures that output scores are normalized into a probability distribution. Here, we focus on sampling-based approaches, including importance sampling and noise-contrastive estimation. These methods allow an approximate computation of the partition function. After examining the mechanism of self-normalization in noise-contrastive estimation, we first propose to improve its efficiency with solutions that are adapted to the inner workings of the method and experimentally show that they considerably ease training. Our second contribution is to expand on a generalization of several sampling-based objectives as Bregman divergences, in order to experiment with new objectives. We use Beta divergences to derive a set of objectives of which noise-contrastive estimation is a particular case. Finally, we aim at improving performance on full-vocabulary language models by augmenting output word representations with subwords. We experiment on a Czech dataset and show that using character-based representations besides word embeddings for output representations gives better results. We also show that reducing the size of the output look-up table improves results even more.
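The partition function mentioned in this abstract can be illustrated with a short sketch. This is not the thesis's implementation; it is a generic importance-sampling estimate of the softmax normalizer, and the function names (`exact_partition`, `sampled_partition`), the proposal distribution, and the toy scores are all assumptions made for illustration.

```python
import math
import random


def exact_partition(scores):
    """Exact softmax normalizer Z = sum_w exp(s_w); cost is linear in vocabulary size."""
    return sum(math.exp(s) for s in scores)


def sampled_partition(scores, proposal, k, rng):
    """Importance-sampling estimate of Z from k draws.

    `proposal` is assumed to be a normalized probability distribution over
    word indices (hypothetical choice: unigram frequencies, uniform, etc.).
    The estimator averages exp(s_w) / q(w) over samples w drawn from q.
    """
    indices = rng.choices(range(len(scores)), weights=proposal, k=k)
    return sum(math.exp(scores[i]) / proposal[i] for i in indices) / k


# Toy example: three "words" with arbitrary output scores.
scores = [0.5, -1.0, 2.0]
z = exact_partition(scores)

# If the proposal equals the softmax itself, every sample contributes exactly Z,
# so the estimate has zero variance (an idealized case, unavailable in practice).
ideal_proposal = [math.exp(s) / z for s in scores]
estimate = sampled_partition(scores, ideal_proposal, k=5, rng=random.Random(0))
```

In practice the proposal would be a cheap distribution rather than the softmax itself, so the estimate is noisy and only the `k` sampled scores need to be computed instead of one score per vocabulary word.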
Schaeffer, Marion. "Towards efficient Knowledge Graph-based Retrieval Augmented Generation for conversational agents." Electronic Thesis or Diss., Normandie, 2025. http://www.theses.fr/2025NORMIR06.
Conversational agents have become widespread in recent years. Today, they have transcended their initial purpose of simulating a conversation with a computer program and are now valuable tools for accessing information and carrying out various tasks, from customer service to personal assistance. With the rise of text-generative models and Large Language Models (LLMs), the capabilities of conversational agents have increased tenfold. However, they are now subject to hallucinations, producing false information. A popular technique to limit the risk of hallucinations is Retrieval Augmented Generation (RAG), which injects knowledge into a text generation process. Such injected knowledge can be drawn from Knowledge Graphs (KGs), which are structured machine-readable knowledge representations. Therefore, we explore Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) to build trusted conversational agents. We demonstrate our approach on a real-world use case for citizen support by building conversational agents for disability management in cities. We first present a history of conversational agents, introducing the approaches implemented over the years and the evaluation techniques. We then define KGs and ontologies, and explore construction and evaluation techniques. As we could not find a directly exploitable KG, our first contribution introduces the Ontology Learning Applied Framework (OLAF). This modular system is built for automated and repeatable KG construction from unstructured text. OLAF integrates linguistic, statistical, and LLM-based techniques to generate Minimum Viable Ontologies for specific domains. Applied to real-world datasets, OLAF demonstrates robust performance through gold-standard evaluations and task-specific Competency Questions. We detail the construction process for a KG about disability management in a French city.
We then propose an architecture for KG-RAG systems to enhance information retrieval by aligning user queries with KG structures through entity linking, graph queries, and LLM-based retrieval approaches. We demonstrate our architecture on different use cases, which we evaluate using criteria such as performance, human preference, and environmental impact. While user preferences favour Text-RAG, KG-RAG's reduced computational footprint underscores its potential for sustainable AI practices. Finally, we identify the retriever as the critical part of the architecture. Therefore, we tackle the retrieval task in our architecture by exploring embeddings in various contexts, i.e. improving entity linking (EL) and retrieval, and providing a caching system. We also propose mechanisms for handling multi-turn conversations. This work establishes a comprehensive framework for KG-RAG systems, combining the semantic depth of KGs with the generative capabilities of LLMs to deliver accurate, contextual, and sustainable conversational agents. Contributions include OLAF for scalable KG construction, a robust KG-RAG pipeline, and embedding-based enhancements for retrieval and interaction quality. By addressing conversational agents' industrial challenges, such as scalability, retrieval precision, and conversational coherence, this research lays the foundation for deploying KG-RAG systems in diverse and specialised domains.
Zervakis, Georgios. "Enriching large language models with semantic lexicons and analogies." Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0039.
Recent advances in deep learning and neural networks have made it possible to address complex natural language processing tasks, which find application in a plethora of real-world problems ranging from smart assistants in mobile devices to the prediction of cancer. Nonetheless, modern systems based on these frameworks exhibit various limitations that may compromise their performance and trustworthiness, render them unfair towards minorities, or subject them to privacy leakage. It is our belief that integrating symbolic knowledge and reasoning into the deep learning framework is a necessary step towards addressing the aforementioned limitations. For example, lexical resources can enrich deep neural networks with semantic or syntactic knowledge, and logical rules can provide learning and reasoning mechanisms. Therefore, the scope of this thesis is to develop and evaluate ways of integrating different types of symbolic knowledge and reasoning into a widely used language model, Bidirectional Encoder Representations from Transformers (BERT). In a first stage, we consider retrofitting, a simple and popular technique for refining distributional word embeddings based on relations coming from a semantic lexicon. Inspired by this technique, we present two methods for incorporating this knowledge into BERT contextualized embeddings. We evaluate these methods on three biomedical datasets for relation extraction and one movie review dataset for sentiment analysis, and show that they do not substantially impact the performance for these tasks. Furthermore, we conduct a qualitative analysis to provide further insights on this negative result. In a second stage, we integrate analogical reasoning with BERT as a means to improve its performance on the target sense verification task, and make it more robust. To do so, we reformulate target sense verification as an analogy detection task.
We present a hybrid model that combines BERT, which encodes the input data into quadruples, with a convolutional neural classifier that decides whether they constitute valid analogies. We test our system on a benchmark dataset, and show that it can outperform existing approaches. Our empirical study shows the importance of the input encoding for BERT, and how this dependence is alleviated by integrating the axiomatic properties of analogies during training, while preserving performance and improving robustness.
Chadha, Vikrampal. "Simulation of large-scale system-level models." Thesis, This resource online, 1994. http://scholar.lib.vt.edu/theses/available/etd-12162009-020334/.
Bughio, Kulsoom Saima. "IoMT security: A semantic framework for vulnerability detection in remote patient monitoring." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2024. https://ro.ecu.edu.au/theses/2841.
Hittner, Brian Edward. "Rendering large-scale terrain models and positioning objects in relation to 3D terrain." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2003. http://library.nps.navy.mil/uhtbin/hyperion-image/03Dec%5FHittner.pdf.
Thesis advisor(s): Don Brutzman, Curt Blais. Includes bibliographical references (p. 117-118). Also available online.
Kropff, Emilio. "Statistical and dynamical properties of large cortical network models: insights into semantic memory and language." Doctoral thesis, SISSA, 2007. http://hdl.handle.net/20.500.11767/4639.
Zhao, Ying. "Effective Authorship Attribution in Large Document Collections." RMIT University. Computer Science and Information Technology, 2008. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080730.162501.
Pan, Bi-Yu. "Hierarchical test generation for VHDL behavioral models." Thesis, This resource online, 1992. http://scholar.lib.vt.edu/theses/available/etd-09052009-040449/.
West, James F. "An examination of the application of design metrics to the development of testing strategies in large-scale SDL models." Virtual Press, 2000. http://liblink.bsu.edu/uhtbin/catkey/1191725.
Department of Computer Science
Kapoor, Shekhar. "Process level test generation for VHDL behavioral models." Thesis, This resource online, 1994. http://scholar.lib.vt.edu/theses/available/etd-05022009-040753/.
Narayanaswamy, Sathyanarayanan. "Development of VHDL behavioral models with back annotated timing." Thesis, This resource online, 1994. http://scholar.lib.vt.edu/theses/available/etd-06112009-063442/.
Uzelac, Lawrence Stevan. "A Multiple Coupled Microstrip Transmission Line Model for High-Speed VLSI Interconnect Simulation." PDXScholar, 1991. https://pdxscholar.library.pdx.edu/open_access_etds/4526.
Pontes, Miranda James William. "Federation of heterogeneous models with machine learning-assisted model views." Electronic Thesis or Diss., Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2025. http://www.theses.fr/2025IMTA0454.
Model-driven engineering (MDE) promotes models as a key element in addressing the increasing complexity of the software systems’ lifecycle. Engineering systems with MDE involves various models representing different system aspects. This heterogeneity requires model federation capabilities to integrate viewpoints specific to multiple domains. Model View solutions address this challenge but still lack automation support. This thesis explores the integration of Machine Learning (ML), notably Graph Neural Networks (GNNs) and Large Language Models (LLMs), in order to improve the definition and building of such views. The proposed solution introduces a twofold approach within the EMF Views technical solution. This made it possible to partially automate the definition of model views at design time, and to dynamically compute inter-model links at runtime. Our results indicate that the application of Deep Learning (DL) techniques, in this particular MDE context, already makes it possible to achieve a first relevant level of automation. More globally, this research effort contributes to the ongoing development of more intelligent MDE solutions.
Menad, Safaa. "Enrichissement et alignement sémantique d'ontologies biomédicales par modèles de langue." Electronic Thesis or Diss., Normandie, 2024. http://www.theses.fr/2024NORMR104.
The first part of this thesis addresses the design of siamese neural models trained for semantic similarity between biomedical texts and their application to NLP tasks on biomedical documents. The training of these models was performed by embedding the titles and abstracts from the PubMed corpus along with the MeSH thesaurus into a common space. In the second part, we use these models to align and enrich the terminologies of UMLS (Unified Medical Language System) and automate the integration of new relationships between similar concepts, particularly from diseases (DOID), drugs (DRON), and symptoms. These enriched relationships enhance the usability of these ontologies, thereby facilitating their application in various clinical and scientific domains. Additionally, we propose validation approaches using resources such as LLMs, OpenFDA, the UMLS Metathesaurus, and the UMLS semantic network, supplemented by manual validation from domain experts.
Nyberg, Jakob. "Response Generation Using Large-scale Pre-trained Language Models." Thesis, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-415323.
Hwang, Chien-Yo, and 黃健祐. "Analyzing Properties of Smoothing Issues for Language Models in Large Mandarin Corpus." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/75029464702391160845.
國立中興大學 (National Chung Hsing University)
資訊網路多媒體研究所 (Institute of Networking and Multimedia)
100
Smoothing is a fundamental and important topic in language modeling. Many applications, such as speech recognition, machine translation, input methods, and Chinese character conversion, rely on it heavily. In this thesis, we discuss the properties and entropies of smoothing methods. Because of the problem of data sparseness, smoothing methods are employed to estimate the probability of each event in language models. We cover several well-known smoothing methods: the Additive Discount method, the Good-Turing method, and the Witten-Bell method. Existing smoothing techniques have solved the data sparseness problem effectively but have not further analyzed the reasonableness of the frequency distribution of the events that occur. We therefore analyze smoothing methods from a statistical point of view. We propose a set of properties to analyze the statistical behaviors of these smoothing methods. Furthermore, we present two new smoothing methods which comply with nearly all the properties. Finally, we implement the language models using a large Mandarin corpus and discuss how to evaluate language models by cross-entropy and perplexity. We then discuss some related problems of the cut-off issues proposed by Katz.
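As a rough illustration of the ideas named in this abstract, the sketch below applies additive smoothing to a unigram model and scores held-out text by cross-entropy and perplexity. It is not the thesis's code: the function names, the `delta` parameter, the toy corpus, and the restriction to unigrams are assumptions made to keep the example small.

```python
import math
from collections import Counter


def additive_unigram_model(tokens, vocab, delta=1.0):
    """Additive (add-delta) smoothing: P(w) = (c(w) + delta) / (N + delta * |V|).

    Every word in `vocab` receives non-zero probability, including unseen ones.
    """
    counts = Counter(tokens)
    denom = len(tokens) + delta * len(vocab)
    return {w: (counts[w] + delta) / denom for w in vocab}


def cross_entropy(model, tokens):
    """Average negative log2-probability per token of held-out text."""
    return -sum(math.log2(model[w]) for w in tokens) / len(tokens)


def perplexity(model, tokens):
    """Perplexity is 2 raised to the cross-entropy (in bits)."""
    return 2.0 ** cross_entropy(model, tokens)


# Toy data: "d" is in the vocabulary but never appears in training.
train = "a b a c a b".split()
vocab = {"a", "b", "c", "d"}
model = additive_unigram_model(train, vocab, delta=1.0)

# Without smoothing, P("d") would be 0 and the perplexity of any text
# containing "d" would be infinite; with add-one smoothing it stays finite.
held_out = ["a", "d", "b"]
ppl = perplexity(model, held_out)
```

Methods such as Good-Turing and Witten-Bell redistribute probability mass differently, but they can be compared on held-out text with the same two evaluation functions.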