Journal articles on the topic 'Extensive Language Models'

Consult the top 50 journal articles for your research on the topic 'Extensive Language Models.'

You can also download the full text of each publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles across a wide variety of disciplines and organise your bibliography correctly.

1

Ayres‐Bennett, Wendy. "Researching Language Standards and Standard Languages: Theories, Models and Methods." Transactions of the Philological Society 122, no. 3 (2024): 496–503. https://doi.org/10.1111/1467-968x.12298.

Abstract:
The title of Jim Adams's rich and interesting paper clearly states the key question at the heart of his analysis: ‘Was classical (late republican) Latin a “standard language”?’. In this article, I contextualise some of his answers to this and other related questions he raises by situating them in the context of theoretical discussions of standardisation and recent explorations of its processes and outcomes. In recent years there has been extensive research on linguistic standardisation. This research has broadened the scope of consideration from the now stock examples of modern Western European languages to minoritized languages, multilingual situations and stateless languages, raising the question of whether traditional models and conceptions of standardisation often associated with the creation of a nation-state can equally be applied to other contexts, including Latin during the late republican and early imperial periods.
2

Nguyen, Van Viet, Thi Minh Hue Luong, The Vinh Nguyen, et al. "Revolutionizing education: An extensive analysis of large language models integration." International Research Journal of Science, Technology, Education, and Management 4, no. 4 (2024): 10–21. https://doi.org/10.5281/zenodo.14744029.

Abstract:
Large Language Models have garnered significant attention from companies, universities, and research groups in recent times, driven by the abundance of data available for their training. However, little research has been conducted in the field of education, leaving a huge gap that needs to be filled. Therefore, the purpose of this article is to provide an overview of the use of these new areas of artificial intelligence in the field of education. We use the PRISMA method to gather data and analyze the relevant contents in detail, covering articles published between January 2019 and 2024. Results from 54 reviewed publications indicated that the use of LLMs in education has increased significantly since 2022 and that the arXiv preprint repository is the most common venue for sharing researchers' ideas. The application of LLMs can support the achievement of learning objectives, enhance the quality and accuracy of assessments, and contribute to improving the educational environment as well as the practical application of various subjects. Seven limitations are identified and discussed, opening several avenues for a future research agenda.
3

Chhun, Cyril, Fabian M. Suchanek, and Chloé Clavel. "Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation." Transactions of the Association for Computational Linguistics 12 (2024): 1122–42. http://dx.doi.org/10.1162/tacl_a_00689.

Abstract:
Storytelling is an integral part of human experience and plays a crucial role in social interactions. Thus, Automatic Story Evaluation (ASE) and Generation (ASG) could benefit society in multiple ways, but they are challenging tasks which require high-level human abilities such as creativity, reasoning, and deep understanding. Meanwhile, Large Language Models (LLMs) now achieve state-of-the-art performance on many NLP tasks. In this paper, we study whether LLMs can be used as substitutes for human annotators for ASE. We perform an extensive analysis of the correlations between LLM ratings, other automatic measures, and human annotations, and we explore the influence of prompting on the results and the explainability of LLM behaviour. Most notably, we find that LLMs outperform current automatic measures for system-level evaluation but still struggle at providing satisfactory explanations for their answers.
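
As a gloss on the setup the abstract describes, an LLM-as-annotator loop for ASE can be sketched in a few lines. The client, model name, prompt wording, and single-digit reply format below are illustrative assumptions, not the paper's protocol, which uses more elaborate prompts and several correlation measures.

```python
# Minimal LLM-as-annotator sketch: ask a chat model for a 1-5 story rating,
# then correlate LLM ratings with human annotations.
from openai import OpenAI
from scipy.stats import kendalltau

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def rate_story(story: str) -> int:
    prompt = ("Rate the following story from 1 (poor) to 5 (excellent) "
              "for overall quality. Reply with a single digit.\n\n" + story)
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical choice of judge model
        messages=[{"role": "user", "content": prompt}],
    )
    return int(reply.choices[0].message.content.strip()[0])

# Given parallel lists `stories` and `human_scores`:
# llm_scores = [rate_story(s) for s in stories]
# tau, p = kendalltau(llm_scores, human_scores)  # rank agreement with humans
```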
4

Zhang, Hanyu, Xiting Wang, Chengao Li, Xiang Ao, and Qing He. "Controlling Large Language Models Through Concept Activation Vectors." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 24 (2025): 25851–59. https://doi.org/10.1609/aaai.v39i24.34778.

Abstract:
As large language models (LLMs) are widely deployed across various domains, the ability to control their generated outputs has become more critical. This control involves aligning LLM outputs with human values and ethical principles or customizing LLMs on specific topics or styles for individual users. Existing controlled generation methods either require significant computational resources and extensive trial-and-error or provide only coarse-grained control. In this paper, we propose Generation with Concept Activation Vector (GCAV), a lightweight model control framework that ensures accurate control without requiring resource-intensive fine-tuning. Specifically, GCAV first trains a concept activation vector for each specified concept to be controlled, such as toxicity. During inference, GCAV steers the concept vector in LLMs, for example, by removing the toxicity concept vector from the activation layers. Control experiments from different perspectives, including toxicity reduction, sentiment control, linguistic style, and topic control, demonstrate that our framework achieves state-of-the-art performance with granular control, allowing for fine-grained adjustments of both the steering layers and the steering magnitudes for individual samples.
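
To make the steering step concrete, the following is a minimal sketch of concept-vector steering in the spirit of GCAV, assuming a PyTorch decoder whose blocks accept forward hooks. The mean-difference concept vector, the fixed layer index, and the scaling factor are simplifications; the paper trains and calibrates these per concept and per sample.

```python
import torch

def concept_vector(pos_acts: torch.Tensor, neg_acts: torch.Tensor) -> torch.Tensor:
    """Direction separating concept-positive from concept-negative activations:
    here just a unit-normalised difference of means (a simplifying assumption)."""
    v = pos_acts.mean(dim=0) - neg_acts.mean(dim=0)
    return v / v.norm()

def steering_hook(vec: torch.Tensor, alpha: float = 1.0):
    """Forward hook that removes the projection onto the concept direction
    (e.g., toxicity) from a layer's hidden states."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        proj = (hidden @ vec).unsqueeze(-1) * vec      # per-token projection
        steered = hidden - alpha * proj
        return (steered, *output[1:]) if isinstance(output, tuple) else steered
    return hook

# Usage with a Hugging Face-style decoder (layer index is illustrative):
# handle = model.model.layers[12].register_forward_hook(steering_hook(v))
# ... model.generate(...) ...
# handle.remove()
```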
5

Firoozi, Tahereh, Okan Bulut, and Mark Gierl. "Language Models in Automated Essay Scoring: Insights for the Turkish Language." International Journal of Assessment Tools in Education 10, Special Issue (2023): 53–67. http://dx.doi.org/10.21449/ijate.1394194.

Abstract:
The proliferation of large language models represents a paradigm shift in the landscape of automated essay scoring (AES) systems, fundamentally elevating their accuracy and efficacy. This study presents an extensive examination of large language models, with a particular emphasis on the transformative influence of transformer-based models, such as BERT, mBERT, LaBSE, and GPT, in augmenting the accuracy of multilingual AES systems. The exploration of these advancements within the context of the Turkish language serves as a compelling illustration of the potential for harnessing large language models to elevate AES performance in low-resource linguistic environments. Our study provides valuable insights for the ongoing discourse on the intersection of artificial intelligence and educational assessment.
6

Zhou, Hao, Zhijun Wang, Shujian Huang, et al. "MoE-LPR: Multilingual Extension of Large Language Models Through Mixture-of-Experts with Language Priors Routing." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 24 (2025): 26092–100. https://doi.org/10.1609/aaai.v39i24.34805.

Abstract:
Large Language Models (LLMs) are often English-centric due to the disproportionate distribution of languages in their pre-training data. Enhancing non-English language capabilities through post-pretraining often results in catastrophic forgetting of high-resource languages. Previous methods either achieve good expansion with severe forgetting or slight forgetting with poor expansion, indicating the challenge of balancing language expansion while preventing forgetting. In this paper, we propose a method called MoE-LPR (Mixture-of-Experts with Language Priors Routing) to alleviate this problem. MoE-LPR employs a two-stage training approach to enhance the multilingual capability. First, the model is post-pretrained into a Mixture-of-Experts (MoE) architecture by upcycling, where all the original parameters are frozen and new experts are added. In this stage, we focus on improving the ability in the expanded languages, without using any original-language data. Then, the model reviews the knowledge of the original languages with replay data amounting to less than 1% of the post-pretraining data, where we incorporate language priors routing to better recover the abilities of the original languages. Evaluations on multiple benchmarks show that MoE-LPR outperforms other post-pretraining methods. Freezing the original parameters preserves original-language knowledge, while adding new experts preserves the learning ability. Reviewing with LPR enables effective utilization of multilingual knowledge within the parameters. Additionally, the MoE architecture maintains the same inference overhead while increasing the total number of model parameters. Extensive experiments demonstrate MoE-LPR's effectiveness in improving the expanded languages and preserving original-language proficiency with superior scalability.
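
A toy rendering of the language-priors routing component is given below; the per-expert prior (e.g., derived from a token's language ID) and the single-layer gate are assumptions for illustration rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

class LanguagePriorRouter(torch.nn.Module):
    """Toy MoE router whose logits are biased by a per-expert language prior."""

    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.gate = torch.nn.Linear(d_model, n_experts)

    def forward(self, hidden, lang_prior=None):
        logits = self.gate(hidden)                          # (batch, seq, n_experts)
        if lang_prior is not None:                          # bias routing toward the
            logits = logits + torch.log(lang_prior + 1e-9)  # experts for this language
        return F.softmax(logits, dim=-1)                    # routing weights

router = LanguagePriorRouter(d_model=16, n_experts=4)
tokens = torch.randn(2, 5, 16)
prior = torch.tensor([0.7, 0.1, 0.1, 0.1])   # e.g., favour the frozen original expert
print(router(tokens, prior).shape)           # torch.Size([2, 5, 4])
```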
7

Zhang, Junyang, Mu Yuan, Ruiguang Zhong, et al. "A-VL: Adaptive Attention for Large Vision-Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 21 (2025): 22461–69. https://doi.org/10.1609/aaai.v39i21.34403.

Abstract:
The Large Vision-Language Model (LVLM) integrates computer vision and natural language processing techniques, offering substantial application potential. However, these models demand extensive resources during inference. Adaptive attention techniques can dynamically reduce computational redundancy and thus improve efficiency. Although current adaptive attention methods significantly reduce the memory requirements of Transformer-based language models, they are not tailored for LVLMs. We observe that LVLMs generate responses from both remote image tokens and local text tokens, and different modalities have different attention patterns. This observation inspires us to manage the attention for each modality separately. Specifically, for visual input, we store the cache of potentially useful information but only compute the most critical parts. For language input, we care more about local information. Based on our observation and analysis of vision-language attention patterns, we develop A-VL, a plug-and-play adaptive attention tailored for LVLM inference. Extensive evaluations on three vision-language tasks and five datasets show the effectiveness of our designs. Our approach A-VL outperforms existing adaptive attention methods in reducing memory usage and computational load without compromising performance.
8

Mozafari, Marzieh, Khouloud Mnassri, Reza Farahbakhsh, and Noel Crespi. "Offensive language detection in low resource languages: A use case of Persian language." PLOS ONE 19, no. 6 (2024): e0304166. http://dx.doi.org/10.1371/journal.pone.0304166.

Abstract:
THIS ARTICLE USES WORDS OR LANGUAGE THAT IS CONSIDERED PROFANE, VULGAR, OR OFFENSIVE BY SOME READERS. Different types of abusive content such as offensive language, hate speech, and aggression have become prevalent in social media, and many efforts have been dedicated to automatically detecting this phenomenon in resource-rich languages such as English. Low-resource languages, especially those spoken in Asian countries, have received far less attention, mainly due to the comparative lack of annotated data related to offensive language. To reduce the vulnerability among social media users from these regions, it is crucial to address the problem of offensive language in such low-resource languages. Hence, we present a new corpus of Persian offensive language consisting of 6,000 micro-blog posts, randomly sampled from 520,000 posts on X (Twitter), to support offensive language detection in Persian as a low-resource language. We introduce a method for creating the corpus and annotating it according to the annotation practices of recent benchmark datasets in other languages, categorizing both the offensive language and the target of the offense. We perform extensive experiments at the different levels of annotation with a number of classical machine learning (ML), deep learning (DL), and transformer-based neural networks, including monolingual and multilingual pre-trained language models. Furthermore, we propose an ensemble model integrating the aforementioned models to boost the performance of our offensive language detection task. Initial results on single models indicate that SVMs trained on character or word n-grams are the best-performing models, alongside the monolingual transformer-based pre-trained language model ParsBERT, in identifying offensive vs. non-offensive content, targeted vs. untargeted offense, and offense towards an individual or a group. In addition, the stacking ensemble model outperforms the single models by a substantial margin, obtaining a respective 5% macro-F1 improvement across the three levels of annotation.
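
For reference, one of the strongest single models reported, an SVM over character n-grams, has a compact scikit-learn form. The two placeholder documents below stand in for the 6,000-post Persian corpus, which is not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["you are an idiot", "have a nice day"]  # placeholder documents
labels = [1, 0]                                   # 1 = offensive, 0 = non-offensive

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5)),  # character n-grams
    LinearSVC(),
)
model.fit(texts, labels)
print(model.predict(["what an idiot"]))  # expected: [1]
```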
9

Blinn, Andrew, Xiang Li, June Hyung Kim, and Cyrus Omar. "Statically Contextualizing Large Language Models with Typed Holes." Proceedings of the ACM on Programming Languages 8, OOPSLA2 (2024): 468–98. http://dx.doi.org/10.1145/3689728.

Abstract:
Large language models (LLMs) have reshaped the landscape of program synthesis. However, contemporary LLM-based code completion systems often hallucinate broken code because they lack appropriate code context, particularly when working with definitions that are neither in the training data nor near the cursor. This paper demonstrates that tighter integration with the type and binding structure of the programming language in use, as exposed by its language server, can help address this contextualization problem in a token-efficient manner. In short, we contend that AIs need IDEs, too! In particular, we integrate LLM code generation into the Hazel live program sketching environment. The Hazel Language Server is able to identify the type and typing context of the hole that the programmer is filling, with Hazel's total syntax and type error correction ensuring that a meaningful program sketch is available whenever the developer requests a completion. This allows the system to prompt the LLM with codebase-wide contextual information that is not lexically local to the cursor, nor necessarily in the same file, but that is likely to be semantically local to the developer's goal. Completions synthesized by the LLM are then iteratively refined via further dialog with the language server, which provides error localization and error messages. To evaluate these techniques, we introduce MVUBench, a dataset of model-view-update (MVU) web applications with accompanying unit tests that have been written from scratch to avoid data contamination, and that can easily be ported to new languages because they do not have large external library dependencies. These applications serve as challenge problems due to their extensive reliance on application-specific data structures. Through an ablation study, we examine the impact of contextualization with type definitions, function headers, and error messages, individually and in combination. We find that contextualization with type definitions is particularly impactful. After introducing our ideas in the context of Hazel, a low-resource language, we duplicate our techniques and port MVUBench to TypeScript in order to validate the applicability of these methods to higher-resource mainstream languages. Finally, we outline ChatLSP, a conservative extension to the Language Server Protocol (LSP) that language servers can implement to expose capabilities that AI code completion systems of various designs can use to incorporate static context when generating prompts for an LLM.
10

Penner, Regina V. "Large Language Models: A Socio-Philosophical Essay." Galactica Media: Journal of Media Studies 6, no. 3 (2024): 83–100. http://dx.doi.org/10.46539/gmd.v6i3.502.

Abstract:
Neural networks have filled the information space. On the one hand, this testifies to the scientific and technological advancement of contemporary society (perhaps AGI is already waiting for us outside the door). On the other hand, everyday discourse is full of debates about what work is left for humans once neural networks have been created. However, a holistic understanding of the neural network requires a movement from the mythotechnological framework to the phenomenon itself and a questioning of its social role. The key aim of the paper is to return, through observing the range of functions of current LLMs, to the classic question of whether a machine can think. At the same time, another question remains: are humans ready to accept the social subjectivity of machines?
11

He, Jiaming, Guanyu Hou, Xinyue Jia, et al. "Data Stealing Attacks against Large Language Models via Backdooring." Electronics 13, no. 14 (2024): 2858. http://dx.doi.org/10.3390/electronics13142858.

Abstract:
Large language models (LLMs) have gained immense attention and are being increasingly applied in various domains. However, this technological leap forward poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks by introducing an adaptive method to extract private training data from pre-trained LLMs via backdooring. Our method mainly focuses on the scenario of model customization and is conducted in two phases, backdoor training and backdoor activation, which allow for the extraction of private information without prior knowledge of the model's architecture or training data. During the model customization stage, attackers inject the backdoor into the pre-trained LLM by poisoning a small ratio of the training dataset. During the inference stage, attackers can extract private information from the third-party knowledge database by incorporating the pre-defined backdoor trigger. Our method leverages the customization process of LLMs, injecting a stealthy backdoor that can be triggered after deployment to retrieve private data. Extensive experiments demonstrate the effectiveness of our stealing attack, which achieves a notable success rate across popular LLM architectures, as well as its stealthiness during normal inference.
12

Dong, Yanling, and Xiaolan Zhou. "Advancements in AI-driven multilingual comprehension for social robot interactions: An extensive review." Electronic Research Archive 31, no. 11 (2023): 6600–6633. http://dx.doi.org/10.3934/era.2023334.

Abstract:
In the digital era, human-robot interaction is rapidly expanding, emphasizing the need for social robots to fluently understand and communicate in multiple languages. It is not merely about decoding words but about establishing connections and building trust. However, many current social robots are limited to popular languages, serving in fields like language teaching, healthcare and companionship. This review examines the AI-driven language abilities in social robots, providing a detailed overview of their applications and the challenges faced, from nuanced linguistic understanding to data quality and cultural adaptability. Last, we discuss the future of integrating advanced language models in robots to move beyond basic interactions and towards deeper emotional connections. Through this endeavor, we hope to provide a beacon for researchers, steering them towards a path where linguistic adeptness in robots is seamlessly melded with their capacity for genuine emotional engagement.
13

Wang, Runze, Mingqi Yang, and Yanming Shen. "Bridging Molecular Graphs and Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 20 (2025): 21234–42. https://doi.org/10.1609/aaai.v39i20.35422.

Abstract:
While Large Language Models (LLMs) have shown exceptional generalization capabilities, their ability to process graph data, such as molecular structures, remains limited. To bridge this gap, this paper proposes Graph2Token, an efficient solution that aligns graph tokens to LLM tokens. The key idea is to represent a graph token with the LLM token vocabulary, without fine-tuning the LLM backbone. To achieve this goal, we first construct a molecule-text paired dataset from multiple sources, including CHEBI and HMDB, to train a graph structure encoder, which reduces the distance between graph and text representations in the feature space. Then, we propose a novel alignment strategy that associates a graph token with LLM tokens. To further unleash the potential of LLMs, we collect molecular IUPAC name identifiers, which are incorporated into the LLM prompts. By aligning molecular graphs as special tokens, we can activate LLMs' generalization ability for molecular few-shot learning. Extensive experiments on molecular classification and regression tasks demonstrate the effectiveness of our proposed Graph2Token.
14

Yan, Mengyi, Yaoshu Wang, Yue Wang, Xiaoye Miao, and Jianxin Li. "GIDCL: A Graph-Enhanced Interpretable Data Cleaning Framework with Large Language Models." Proceedings of the ACM on Management of Data 2, no. 6 (2024): 1–29. https://doi.org/10.1145/3698811.

Abstract:
Data quality is critical across many applications. The utility of data is undermined by various errors, making rigorous data cleaning a necessity. Traditional data cleaning systems depend heavily on predefined rules and constraints, which necessitate significant domain knowledge and manual effort. Moreover, while configuration-free approaches and deep learning methods have been explored, they struggle with complex error patterns, lack interpretability, and require extensive feature engineering or labeled data. This paper introduces GIDCL (Graph-enhanced Interpretable Data Cleaning with Large language models), a pioneering framework that harnesses the capabilities of Large Language Models (LLMs) alongside Graph Neural Networks (GNNs) to address the challenges of traditional and machine learning-based data cleaning methods. By converting relational tables into graph structures, GIDCL utilizes GNNs to effectively capture and leverage structural correlations among data, enhancing the model's ability to understand and rectify complex dependencies and errors. The framework's creator-critic workflow innovatively employs LLMs to automatically generate interpretable data cleaning rules and tailor feature engineering with minimal labeled data. This process includes the iterative refinement of error detection and correction models through few-shot learning, significantly reducing the need for extensive manual configuration. GIDCL not only improves the precision and efficiency of data cleaning but also enhances its interpretability, making it accessible and practical for non-expert users. Our extensive experiments demonstrate that GIDCL significantly outperforms existing methods, improving F1-scores by 10% on average while requiring only 20 labeled tuples.
15

Tian, Yijun, Huan Song, Zichen Wang, et al. "Graph Neural Prompting with Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (2024): 19080–88. http://dx.doi.org/10.1609/aaai.v38i17.29875.

Abstract:
Large language models (LLMs) have shown remarkable generalization capability with exceptional performance in various language modeling tasks. However, they still exhibit inherent limitations in precisely capturing and returning grounded knowledge. While existing work has explored utilizing knowledge graphs (KGs) to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost. Therefore, how to enhance pre-trained LLMs using grounded knowledge, e.g., retrieval-augmented generation, remains an open question. In this work, we propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs. GNP encompasses various designs, including a standard graph neural network encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings. Code is available at https://github.com/meettyj/GNP.
16

Kuznetsov, Alexey Valer'evich. "Beyond Topic Modeling: Analyzing Historical Text with Large Language Models." Историческая информатика, no. 4 (April 2024): 47–65. https://doi.org/10.7256/2585-7797.2024.4.72560.

Abstract:
The article explores the potential of large language models in thematic analysis of historical texts, exemplified by the 1849 diary of Vologda gymnasium student Kirill Antonovich Berezkin. This rich source illuminates the everyday life, worldview, and social interactions of a young individual in mid-19th century provincial Russia. The diary offers a multifaceted narrative, capturing cultural events, political contexts, and personal introspections. By meticulously analyzing this text, researchers can reconstruct not just an individual's experiences, but also gain profound insights into the social, cultural, and educational landscape of the era. Employing the Gemini 1.5 Pro model, renowned for processing extensive textual data, the study conducted a comprehensive analysis. The research methodology involved examining the diary both holistically and through monthly segmentation, enabling the identification of nuanced content aspects. The novelty of the approach lies in applying modern large language models to a Russian historical document. The results demonstrated the model's remarkable capability to identify key themes, successfully isolating eight major thematic areas that reflect the gymnasium student's life. Utilizing parallel prompting with a monthly text breakdown revealed specific themes and subtleties that a comprehensive review might have overlooked. The study ultimately validates the effectiveness of large language models in historical source analysis, presenting promising opportunities for automating topic modeling and uncovering hidden patterns in extensive textual datasets. However, the inherently stochastic nature of these models necessitates multiple analyses, careful result interpretation, and critical comparison with traditional historical research methodologies.
17

Yang, Xiaohao, He Zhao, Dinh Phung, Wray Buntine, and Lan Du. "LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models." Transactions of the Association for Computational Linguistics 13 (2025): 357–75. https://doi.org/10.1162/tacl_a_00744.

Abstract:
Topic modeling has been a widely used tool for unsupervised text analysis. However, comprehensive evaluations of a topic model remain challenging. Existing evaluation methods are either less comparable across different models (e.g., perplexity) or focus on only one specific aspect of a model (e.g., topic quality or document representation quality) at a time, which is insufficient to reflect the overall model performance. In this paper, we propose WALM (Word Agreement with Language Model), a new evaluation method for topic modeling that considers the semantic quality of document representations and topics in a joint manner, leveraging the power of Large Language Models (LLMs). With extensive experiments involving different types of topic models, WALM is shown to align with human judgment and can serve as a complementary evaluation method to the existing ones, bringing a new perspective to topic modeling. Our software package is available at https://github.com/Xiaohao-Yang/Topic_Model_Evaluation.
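
The core of a word-agreement score is easy to state. The set-overlap version below is an illustrative simplification, assuming top-k keyword lists from the topic model and from an LLM prompted on the same document; the paper also studies softer, embedding-based variants.

```python
def word_agreement(topic_words: list[str], llm_words: list[str], k: int = 10) -> float:
    """Fraction of the top-k topic-model keywords that the LLM also produced."""
    top, llm = set(topic_words[:k]), set(llm_words[:k])
    return len(top & llm) / k

# Toy example for one finance-themed document:
print(word_agreement(["stock", "market", "trade"],
                     ["stock", "price", "trade"], k=3))  # -> 0.666...
```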
18

Gyurjyan, Mikayel K., and Andranik Hayrapetyan. "Approach and Challenges of Training an Armenian Version of BERT Language Model." Mathematical Problems of Computer Science 62 (December 1, 2024): 59–71. https://doi.org/10.51408/1963-0121.

Abstract:
Training and deploying BERT models for specific languages, especially low-resource ones, presents a unique set of challenges. These challenges stem from the inherent data scarcity associated with languages like Armenian, the computational demands of training BERT models, often requiring extensive resources, and the inefficiencies in hosting and maintaining models for languages with limited digital traffic. In this research, we introduce a novel methodology that leverages the Armenian Wikipedia as a primary data source, aiming to optimize the performance of BERT for the Armenian language. Our approach demonstrates that, with strategic preprocessing and transfer learning techniques, it's possible to achieve performance metrics that rival those of models trained on more abundant datasets. Furthermore, we explore the potential of fine-tuning pre-trained multilingual BERT models, revealing that they can serve as robust starting points for training models for low-resource but significant languages like Armenian.
19

Sagar, Kumar, Ahamed Sakib, H. V. Sanjana, Khan K. A. Farhan, and M. I. Shilpa. "Comprehensive Survey on Kannada Language Speech to English Language Translation and Voice Cloning System." Journal of Advanced Research in Artificial Intelligence & It's Applications 2, no. 2 (2025): 18–29. https://doi.org/10.5281/zenodo.15123506.

Abstract:
India is a culturally rich country with diverse languages, with over 22 official languages and countless dialects spoken across the country. However, this linguistic diversity often acts as a communication barrier, hindering interactions between individuals who speak different languages. To address this challenge and revolutionize communication, there is an increasing interest in using Artificial Intelligence (AI) for language translation. This research explores the application of AI in language translation, with a specific focus on converting local languages into a universal language. Two AI models, namely VALL-E X and ELLA-V, play an important role in this project. These models are trained on extensive multilingual speech data and are designed to overcome communication gaps and achieve zero-shot cross-lingual speech synthesis. The proposed approach takes advantage of recent advances in text-to-speech synthesis. With the development of voice cloning techniques and synthesized speech quality approaching human equivalency, the industry has seen huge developments over the years. This research introduces a novel approach to addressing language barriers, proposing solutions with the help of VALL-E X. These AI models aim to deliver high-quality zero-shot cross-lingual voice synthesis using data gathered from large multilingual speech samples. By doing this, the study hopes to overcome current communication breakdowns and support smooth information transfer across various linguistic contexts.
20

Xu, Canwen, Zexue He, Zhankui He, and Julian McAuley. "Leashing the Inner Demons: Self-Detoxification for Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (2022): 11530–37. http://dx.doi.org/10.1609/aaai.v36i10.21406.

Abstract:
Language models (LMs) can reproduce (or amplify) toxic language seen during training, which poses a risk to their practical application. In this paper, we conduct extensive experiments to study this phenomenon. We analyze the impact of prompts, decoding strategies and training corpora on the output toxicity. Based on our findings, we propose a simple yet effective unsupervised method for language models to "detoxify" themselves without an additional large corpus or external discriminator. Compared to a supervised baseline, our proposed method shows better toxicity reduction with good generation quality in the generated content under multiple settings. Warning: some examples shown in the paper may contain uncensored offensive content.
21

Clark, Timothy, and Martin G. Hicks. "Models of necessity." Beilstein Journal of Organic Chemistry 16 (July 13, 2020): 1649–61. http://dx.doi.org/10.3762/bjoc.16.137.

Abstract:
The way chemists represent chemical structures as two-dimensional sketches made up of atoms and bonds, simplifying the complex three-dimensional molecules comprising nuclei and electrons of the quantum mechanical description, is the everyday language of chemistry. This language uses models, particularly of bonding, that are not contained in the quantum mechanical description of chemical systems, but has been used to derive machine-readable formats for storing and manipulating chemical structures in digital computers. This language is fuzzy and varies from chemist to chemist but has been astonishingly successful and perhaps contributes with its fuzziness to the success of chemistry. It is this creative imagination of chemical structures that has been fundamental to the cognition of chemistry and has allowed thought experiments to take place. Within the everyday language, the model nature of these concepts is not always clear to practicing chemists, so that controversial discussions about the merits of alternative models often arise. However, the extensive use of artificial intelligence (AI) and machine learning (ML) in chemistry, with the aim of being able to make reliable predictions, will require that these models be extended to cover all relevant properties and characteristics of chemical systems. This, in turn, imposes conditions such as completeness, compactness, computational efficiency and non-redundancy on the extensions to the almost universal Lewis and VSEPR bonding models. Thus, AI and ML are likely to be important in rationalizing, extending and standardizing chemical bonding models. This will not affect the everyday language of chemistry but may help to understand the unique basis of chemical language.
22

Suzuki, Jun, Heiga Zen, and Hideto Kazawa. "Extracting representative subset from extensive text data for training pre-trained language models." Information Processing & Management 60, no. 3 (2023): 103249. http://dx.doi.org/10.1016/j.ipm.2022.103249.

23

Liu, Jiaxiang, Tianxiang Hu, Jiawei Du, Ruiyuan Zhang, Joey Tianyi Zhou, and Zuozhu Liu. "KPL: Training-Free Medical Knowledge Mining of Vision-Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 18 (2025): 18852–60. https://doi.org/10.1609/aaai.v39i18.34075.

Abstract:
Visual Language Models such as CLIP excel in image recognition due to extensive image-text pre-training. However, applying the CLIP inference in zero-shot classification, particularly for medical image diagnosis, faces challenges due to: 1) the inadequacy of representing image classes solely with single category names; 2) the modal gap between the visual and text spaces generated by CLIP encoders. Despite attempts to enrich disease descriptions with large language models, the lack of class-specific knowledge often leads to poor performance. In addition, empirical evidence suggests that existing proxy learning methods for zero-shot image classification on natural image datasets exhibit instability when applied to medical datasets. To tackle these challenges, we introduce the Knowledge Proxy Learning (KPL) to mine knowledge from CLIP. KPL is designed to leverage CLIP's multimodal understandings for medical image classification through Text Proxy Optimization and Multimodal Proxy Learning. Specifically, KPL retrieves image-relevant knowledge descriptions from the constructed knowledge-enhanced base to enrich semantic text proxies. It then harnesses input images and these descriptions, encoded via CLIP, to stably generate multimodal proxies that boost the zero-shot classification performance. Extensive experiments conducted on both medical and natural image datasets demonstrate that KPL enables effective zero-shot image classification, outperforming all baselines. These findings highlight the great potential in this paradigm of mining knowledge from CLIP for medical image classification and broader areas.
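
As background, the zero-shot CLIP baseline that KPL enriches can be sketched as follows. The disease descriptions are invented for illustration, and KPL's knowledge retrieval, text-proxy optimisation, and multimodal proxy learning are omitted.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Knowledge-enriched class descriptions instead of bare category names.
descriptions = [
    "a chest X-ray showing lung opacities consistent with pneumonia",
    "a chest X-ray of clear, healthy lungs",
]

image = Image.open("xray.png")  # hypothetical input image
inputs = processor(text=descriptions, images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(probs)  # probabilities over [pneumonia, normal]
```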
24

Li, Zihao, Yucheng Shi, Zirui Liu, et al. "Language Ranker: A Metric for Quantifying LLM Performance Across High and Low-Resource Languages." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 27 (2025): 28186–94. https://doi.org/10.1609/aaai.v39i27.35038.

Abstract:
The development of Large Language Models (LLMs) relies on extensive text corpora, which are often unevenly distributed across languages. This imbalance results in LLMs performing significantly better on high-resource languages like English, German, and French, while their capabilities in low-resource languages remain inadequate. Currently, there is a lack of quantitative methods to evaluate the performance of LLMs in these low-resource languages. To address this gap, we propose the Language Ranker, an intrinsic metric designed to benchmark and rank languages based on LLM performance using internal representations. By comparing the LLM's internal representation of various languages against a baseline derived from English, we can assess the model's multilingual capabilities in a robust and language-agnostic manner. Our analysis reveals that high-resource languages exhibit higher similarity scores with English, demonstrating superior performance, while low-resource languages show lower similarity scores, underscoring the effectiveness of our metric in assessing language-specific capabilities. Besides, the experiments show that there is a strong correlation between the LLM’s performance in different languages and the proportion of those languages in its pre-training corpus. These insights underscore the efficacy of the Language Ranker as a tool for evaluating LLM performance across different languages, particularly those with limited resources.
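
A rough sketch of the idea follows: score each language by the similarity of its mean hidden representation to an English baseline. The choice of model, sentences, pooling, and layer are all assumptions for illustration; the paper works with the internal representations of the LLM under evaluation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

name = "bert-base-multilingual-cased"  # stand-in for the LLM being ranked
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

def mean_repr(text: str) -> torch.Tensor:
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.last_hidden_state.mean(dim=1).squeeze(0)  # mean-pooled hidden state

baseline = mean_repr("The cat sat on the mat.")
for lang, sent in {"de": "Die Katze saß auf der Matte.",
                   "fr": "Le chat était assis sur le tapis."}.items():
    score = torch.cosine_similarity(baseline, mean_repr(sent), dim=0)
    print(lang, round(score.item(), 3))  # higher -> closer to the English baseline
```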
25

Hou, Yuteng. "Using Language Models to Augment Data in Stock Prediction." Advances in Economics and Management Research 8, no. 1 (2023): 298. http://dx.doi.org/10.56028/aemr.8.1.298.2023.

Abstract:
This paper delves into the innovative application of language models as a means of enhancing data augmentation techniques in the context of Sentiment Analysis for Event-Driven Stock Prediction. In recent years, the integration of natural language processing and machine learning has led to significant advancements in sentiment analysis, enabling the extraction of valuable insights from textual data for enhancing stock prediction accuracy. In this work, we incorporate the T-5 language model to enrich the training dataset with semantically diverse and contextually relevant textual variations. Through extensive experiments, we demonstrate the effectiveness of using T-5 for data augmentation in the task of Sentiment Analysis for Event-Driven Stock Prediction.
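
A sketch of the augmentation step: generate paraphrases of a financial headline to enlarge the sentiment training set. The community checkpoint named below is an assumption; the abstract states only that a T-5 model is used.

```python
from transformers import pipeline

# Hypothetical paraphrase checkpoint; substitute whichever T5 weights you use.
augment = pipeline("text2text-generation", model="ramsrigouthamg/t5_paraphraser")

seed = "paraphrase: Company X shares surge after a strong earnings report."
for out in augment(seed, num_return_sequences=3, do_sample=True, max_length=40):
    print(out["generated_text"])  # paraphrases to add to the training set
```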
26

Chu, Zhibo, Zichong Wang, and Wenbin Zhang. "Fairness in Large Language Models: A Taxonomic Survey." ACM SIGKDD Explorations Newsletter 26, no. 1 (2024): 34–48. http://dx.doi.org/10.1145/3682112.3682117.

Abstract:
Large Language Models (LLMs) have demonstrated remarkable success across various domains. However, despite their promising performance in numerous real-world applications, most of these algorithms lack fairness considerations. Consequently, they may lead to discriminatory outcomes against certain communities, particularly marginalized populations, prompting extensive study in fair LLMs. On the other hand, fairness in LLMs, in contrast to fairness in traditional machine learning, entails exclusive backgrounds, taxonomies, and fulfillment techniques. To this end, this survey presents a comprehensive overview of recent advances in the existing literature concerning fair LLMs. Specifically, a brief introduction to LLMs is provided, followed by an analysis of factors contributing to bias in LLMs. Additionally, the concept of fairness in LLMs is discussed categorically, summarizing metrics for evaluating bias in LLMs and existing algorithms for promoting fairness. Furthermore, resources for evaluating bias in LLMs, including toolkits and datasets, are summarized. Finally, existing research challenges and open questions are discussed.
27

Shen, Huawen, Gengluo Li, Jinwen Zhong, and Yu Zhou. "LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 7 (2025): 6805–13. https://doi.org/10.1609/aaai.v39i7.32730.

Abstract:
Visual Information Extraction (VIE) plays a crucial role in the comprehension of semi-structured documents, and several pre-trained models have been developed to enhance performance. However, most of these works are monolingual (usually English). Due to the extremely unbalanced quantity and quality of pre-training corpora between English and other languages, few works can extend to non-English scenarios. In this paper, we conduct systematic experiments to show that the vision and layout modalities hold invariance among images in different languages. If language bias is decoupled from document images, a vision-layout-based model can achieve impressive cross-lingual generalization. Accordingly, we present a simple but effective multilingual training paradigm, LDP (Language Decoupled Pre-training), for better utilization of monolingual pre-training data. Our proposed model, LDM (Language Decoupled Model), is first pre-trained on language-independent data, where the language knowledge is decoupled by a diffusion model, and then fine-tuned on the downstream languages. Extensive experiments show that LDM outperforms all SOTA multilingual pre-trained models and also maintains competitiveness on downstream monolingual/English benchmarks.
28

Xu, Wei, and Xiaodong Jin. "Dynamic Layer Skipping for Large Language Models on Natural Language Understanding Tasks and Machine Translation Using Reinforcement Learning." Frontiers in Computing and Intelligent Systems 9, no. 3 (2024): 1–10. http://dx.doi.org/10.54097/wy0g8m89.

Abstract:
Large Language Models (LLMs) demonstrate remarkable proficiency in various natural language processing (NLP) tasks. However, their extensive size, resulting from the inclusion of billions of parameters across multiple layers, presents significant challenges regarding storage, training, and inference. Traditional methodologies such as model pruning and distillation are employed to decrease the size of these models, but these techniques often result in a compromise on performance retention. In this work, we propose a novel framework that uses dynamic layer skipping for different samples to accelerate the inference speed of LLMs. First, we add an adapter layer at each transformer layer to predict whether to skip the next layer or not, and we propose layer skip pretraining to recover the model’s performance. Second, we propose using reinforcement learning (RL) to optimize the model and design several strategies to stabilize the training. Extensive experiments on four natural language understanding (NLU) datasets and three machine translation datasets and ablation studies show that our method achieves SOTA performance among layer skipping methods on LLMs.
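
The control flow at inference time reduces to a gated loop over the transformer stack, as in the sketch below. In the paper, each gate comes from a per-layer adapter trained with reinforcement learning; here it is a fixed probability, purely for illustration.

```python
import torch

def forward_with_skips(layers, hidden, gates):
    """Run a layer stack, executing layer i with probability gates[i]."""
    for layer, gate in zip(layers, gates):
        if torch.bernoulli(gate):   # sampled execute/skip decision per sample
            hidden = layer(hidden)
        # on a skip, `hidden` passes through unchanged
    return hidden

layers = torch.nn.ModuleList(torch.nn.Linear(8, 8) for _ in range(4))
gates = torch.tensor([0.9, 0.5, 0.5, 0.9])  # middle layers skipped more often
print(forward_with_skips(layers, torch.randn(1, 8), gates).shape)
```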
29

Tyler, Andrea. "Usage-Based Approaches to Language and Their Applications to Second Language Learning." Annual Review of Applied Linguistics 30 (March 2010): 270–91. http://dx.doi.org/10.1017/s0267190510000140.

Abstract:
Over the past 20 years, many in the field of second language learning and pedagogy have become familiar with models of language that emphasize its communicative nature. These models are often referred to as usage-based because they emphasize the notion that actual language use is a primary shaper of linguistic form. Supporters of these models also argue that making meaning, that is, the use to which language is put, is central to how language is configured. Usage-based models share several other underlying assumptions as well. While these usage models have a number of ideas in common, several distinct approaches have emerged. They often use similar terms, such as cognition and metaphor, but the precise interpretations can vary from model to model. The overall result is that without extensive reading, it is not always clear just how these models differ and what unique insights each offers. This article attempts to address this situation by examining three major usage-based models: systemic functional linguistics, discourse functionalism, and cognitive linguistics. First, the common, underlying tenets shared by the three models are discussed. Second, an overview of the unique tenets and concerns of each approach is presented in order to distinguish key differences among them. Within the discussion of each approach, I also discuss various attempts to apply the model to issues in second language learning.
30

Alarie, A., and C. Morisset. "Extensive Online Shock Model Database." Revista Mexicana de Astronomía y Astrofísica 55, no. 2 (2019): 377–92. http://dx.doi.org/10.22201/ia.01851101p.2019.55.02.21.

Abstract:
We present a new database of fully radiative shock models calculated with the shock and photoionization code MAPPINGS V. The database architecture is built to contain diverse shock grids comprising multiple shock parameters, and it is easily accessible through the MySQL protocol. Intensities of spectral lines from the infrared to X-rays are stored along with other useful outputs such as the ionic fractions, temperatures, integrated densities, etc. A web page was created in order to explore the database interactively as it evolves with time. Examples of its usage are given using the Python language.
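
Since the abstract advertises MySQL access and Python examples, a query might look like the sketch below; the host, credentials, and table/column names are hypothetical, so consult the project's web page for the real schema.

```python
import pymysql

# Hypothetical connection details and schema, for illustration only.
conn = pymysql.connect(host="shockdb.example.org", user="guest",
                       password="", database="shock_models")
with conn.cursor() as cur:
    cur.execute("SELECT shock_velocity, line_intensity FROM shock_grids "
                "WHERE magnetic_field = %s AND line_id = %s", (1.0, "OIII_5007"))
    for velocity, intensity in cur.fetchall():
        print(velocity, intensity)
conn.close()
```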
31

Suvirat, Kerdkiat, Detphop Tanasanchonnakul, Sawrawit Chairat, and Sitthichok Chaichulee. "Leveraging Language Models for Inpatient Diagnosis Coding." Applied Sciences 13, no. 16 (2023): 9450. http://dx.doi.org/10.3390/app13169450.

Abstract:
Medical coding plays an essential role in medical billing, health resource planning, clinical research and quality assessment. Automated coding systems offer promising solutions to streamline the coding process, improve accuracy and reduce the burden on medical coders. To date, there has been limited research focusing on inpatient diagnosis coding using an extensive comprehensive dataset and encompassing the full ICD-10 code sets. In this study, we investigate the use of language models for coding inpatient diagnoses and examine their performance using an institutional dataset comprising 230,645 inpatient admissions and 8677 diagnosis codes spanning over a six-year period. A total of three language models, including two general-purpose models and a domain-specific model, were evaluated and compared. The results show competitive performance among the models, with the domain-specific model achieving the highest micro-averaged F1 score of 0.7821 and the highest mean average precision of 0.8097. Model performance varied by disease and condition, with diagnosis codes with larger sample sizes producing better results. The rarity of certain diseases and conditions posed challenges to accurate coding. The results also indicated the potential difficulties of the model with long clinical documents. Our models demonstrated the ability to capture relevant associations between diagnoses. This study advances the understanding of language models for inpatient diagnosis coding and provides insights into the extent to which the models can be used.
32

Balaskas, George, Homer Papadopoulos, Dimitra Pappa, Quentin Loisel, and Sebastien Chastin. "A Framework for Domain-Specific Dataset Creation and Adaptation of Large Language Models." Computers 14, no. 5 (2025): 172. https://doi.org/10.3390/computers14050172.

Abstract:
This paper introduces a novel framework for addressing domain adaptation challenges in large language models (LLMs), emphasising privacy-preserving synthetic data generation and efficient fine-tuning. The proposed framework employs a multi-stage approach that includes document ingestion, relevance assessment, and automated dataset creation. This process reduces the need for extensive technical expertise while safeguarding data privacy. We evaluate the framework’s performance on domain-specific tasks in fields such as biobanking and public health, demonstrating that models fine-tuned using our method achieve results comparable to larger proprietary models. Crucially, these models maintain their general instruction-following capabilities, even when adapted to specialised domains, as shown through experiments with 7B and 8B parameter LLMs. Key components of the framework include continuous pre-training, supervised fine-tuning (SFT), and reinforcement learning methods such as direct preference optimisation (DPO), which together provide a flexible and configurable solution for deploying LLMs. The framework supports both local models and API-based solutions, making it scalable and accessible. By enabling privacy-preserving, domain-specific adaptation without requiring extensive expertise, this framework represents a significant step forward in the deployment of LLMs for specialised applications. The framework significantly lowers the barrier to domain adaptation for small- and medium-sized enterprises (SMEs), enabling them to utilise the power of LLMs without requiring extensive resources or technical expertise.
33

Gernaey, K. V., C. Rosen, D. J. Batstone, and J. Alex. "Efficient modelling necessitates standards for model documentation and exchange." Water Science and Technology 53, no. 1 (2006): 277–85. http://dx.doi.org/10.2166/wst.2006.030.

Abstract:
In this paper, problems related to simulation model documentation and model exchange between users are discussed. Complex simulation models have gained popularity in the environmental field, but require extensive documentation to allow independent implementation. The existence of different simulation platforms puts high demands on the quality of the original documentation. Recent experiences from cross-platform implementations with the ASM2d and ADM1 models reveal that error-free model documentation is difficult to obtain, and as a consequence, considerable time is spent on searching for documentation and implementation errors of various sources. As such, the list of errors and coding pitfalls provided for ASM2d and ADM1 in this paper is vital information for any future implementation of both models. The time needed to obtain an error-free model implementation can be significantly reduced if a standard language for model documentation and exchange is adopted. The extensible markup language (XML) and languages based on this format may provide a remedy to the problem of platform independent model documentation and exchange. In this paper the possibility to apply this to environmental models is discussed, whereas the practical model implementation examples corroborate the necessity for a standardised approach.
34

Parthiban, Dwarak Govind, Yongyi Mao, and Diana Inkpen. "On the Softmax Bottleneck of Recurrent Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 15 (2021): 13640–47. http://dx.doi.org/10.1609/aaai.v35i15.17608.

Abstract:
Recent research has pointed to a limitation of word-level neural language models with softmax outputs. This limitation, known as the softmax bottleneck, refers to the inability of these models to produce high-rank log probability (log P) matrices. Various solutions have been proposed to break this bottleneck, including Mixture of Softmaxes, SigSoftmax, and Linear Monotonic Softmax with Piecewise Linear Increasing Functions. They were reported to offer better performance in terms of perplexity on test data. A natural perception from these results is a strong positive correlation between the rank of the log P matrix and the model's performance. In this work, we show via an extensive empirical study that such a correlation is fairly weak and that the high rank of the log P matrix is neither necessary nor sufficient for better test perplexity. Although our results are empirical, they are established in part via the construction of a rich family of models, which we call Generalized SigSoftmax. They are able to create diverse ranks for the log P matrices. We also present an investigation as to why the proposed solutions achieve better performance.
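
The bottleneck itself admits a short numerical demonstration: with hidden size d, the log-probability matrix log P = log_softmax(HWᵀ) has rank at most d + 1, however large the vocabulary. The toy sizes below are arbitrary.

```python
import torch

vocab, d, contexts = 50, 8, 40
W = torch.randn(vocab, d)                  # output word embeddings
H = torch.randn(contexts, d)               # context (hidden) vectors
logP = torch.log_softmax(H @ W.T, dim=-1)  # rows are log-probability vectors
print(torch.linalg.matrix_rank(logP))      # at most d + 1 = 9, far below vocab
```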
35

Zheng, Hong, Zhenhui Xu, Qihong Pan, Zhenzhen Zhao, and Xiangjie Kong. "Plugging Small Models in Large Language Models for POI Recommendation in Smart Tourism." Algorithms 18, no. 7 (2025): 376. https://doi.org/10.3390/a18070376.

Abstract:
Point-of-interest (POI) recommendation is a crucial task in location-based social networks, especially for enhancing personalized travel experiences in smart tourism. Recently, large language models (LLMs) have demonstrated significant potential in this domain. Unlike classical deep learning-based methods, which focus on capturing various user preferences, LLM-based approaches can further analyze candidate POIs using common sense and provide corresponding reasons. However, existing methods often fail to fully capture user preferences due to limited contextual inputs and insufficient incorporation of cooperative signals. Additionally, most methods inadequately address target temporal information, which is essential for planning travel itineraries. To address these limitations, we propose PSLM4ST, a novel framework that enables synergistic interaction between LLMs and a lightweight temporal knowledge graph reasoning model. This plugin model enhances the input to LLMs by making adjustments and additions, guiding them to focus on reasoning processes related to fine-grained preferences and temporal information. Extensive experiments on three real-world datasets demonstrate the efficacy of PSLM4ST.
36

Gao, Shiqi, Tianxiang Gong, Zijie Lin, Runhua Xu, Haoyi Zhou, and Jianxin Li. "FLUE: Streamlined Uncertainty Estimation for Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 16 (2025): 16745–53. https://doi.org/10.1609/aaai.v39i16.33840.

Abstract:
Uncertainty estimation is essential for practical applications such as decision-making, risk assessment, and human-AI collaboration. However, uncertainty estimation in open-ended question-answering (QA) tasks presents unique challenges. The output space for open-ended QA is vast and discrete, and the autoregressive nature of LLMs, combined with the rapid increase in model parameters, makes inference sampling significantly costly. An ideal uncertainty estimation method for LLMs should meet two criteria: 1) incur no additional inference cost and 2) capture the semantic dependencies of token-level uncertainty within sequences. We propose a promising solution that converts redundancy in the extensive parameters of LLMs into randomness in order to quantify knowledge uncertainty. We can obtain token-level Monte Carlo samples without multiple inferences by introducing randomness during a single forward pass. We theoretically analyze the FLUE sampling method and employ a post-processing method to learn the state transitions from token uncertainty to sequence uncertainty. In open-ended question-answering tasks, we demonstrate that FLUE achieves competitive performance in estimating the uncertainty of generated sentences without adding extra inference overhead.
APA, Harvard, Vancouver, ISO, and other styles
37

Unanue, Inigo Jauregi, Gholamreza Haffari, and Massimo Piccardi. "T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification." Transactions of the Association for Computational Linguistics 11 (2023): 1147–61. http://dx.doi.org/10.1162/tacl_a_00593.

Full text
Abstract:
Cross-lingual text classification leverages text classifiers trained in a high-resource language to perform text classification in other languages with no or minimal fine-tuning (zero/few-shot cross-lingual transfer). Nowadays, cross-lingual text classifiers are typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest. However, the performance of these models varies significantly across languages and classification tasks, suggesting that the superposition of the language modelling and classification tasks is not always effective. For this reason, in this paper we propose revisiting the classic “translate-and-test” pipeline to neatly separate the translation and classification stages. The proposed approach couples 1) a neural machine translator translating from the target language into a high-resource language with 2) a text classifier trained in the high-resource language; crucially, the translator generates “soft” translations to permit end-to-end backpropagation when fine-tuning the pipeline. Extensive experiments over three cross-lingual text classification datasets (XNLI, MLDoc, and MultiEURLEX) show that the proposed approach significantly improves performance over a competitive baseline.
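The differentiable coupling described above hinges on the translator emitting "soft" token distributions instead of discrete tokens; a minimal sketch of that idea follows, where the translator, classifier, and the assumption that the classifier accepts precomputed input embeddings (as HuggingFace models do via inputs_embeds) are all placeholders.

```python
import torch
import torch.nn as nn

class SoftTranslateAndTest(nn.Module):
    """Translate-and-test with end-to-end gradients (illustrative sketch).

    Rather than argmax-decoding the translation, the classifier consumes the
    expectation of its input embeddings under the translator's softmax, so
    the classification loss can backpropagate through the translator."""
    def __init__(self, translator: nn.Module, classifier: nn.Module,
                 classifier_embeddings: nn.Embedding):
        super().__init__()
        self.translator = translator
        self.classifier = classifier
        self.emb = classifier_embeddings

    def forward(self, src_tokens: torch.Tensor):
        logits = self.translator(src_tokens)          # (batch, tgt_len, vocab)
        soft = torch.softmax(logits, dim=-1)
        soft_embeds = soft @ self.emb.weight          # expected target embeddings
        return self.classifier(inputs_embeds=soft_embeds)
```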
APA, Harvard, Vancouver, ISO, and other styles
38

Yue, Pengfei, Hailiang Tang, Wanyu Li, Wenxiao Zhang, and Bingjie Yan. "MLKGC: Large Language Models for Knowledge Graph Completion Under Multimodal Augmentation." Mathematics 13, no. 9 (2025): 1463. https://doi.org/10.3390/math13091463.

Full text
Abstract:
Knowledge graph completion (KGC) is a critical task for addressing the incompleteness of knowledge graphs and supporting downstream applications. However, it faces significant challenges, including insufficient structured information and uneven entity distribution. Although existing methods have alleviated these issues to some extent, they often rely heavily on extensive training and fine-tuning, which results in low efficiency. To tackle these challenges, we introduce our MLKGC framework, a novel approach that combines large language models (LLMs) with multi-modal modules (MMs). LLMs leverage their advanced language understanding and reasoning abilities to enrich the contextual information for KGC, while MMs integrate multi-modal data, such as audio and images, to bridge knowledge gaps. This integration augments the capability of the model to address long-tail entities, enhances its reasoning processes, and facilitates more robust information integration through the incorporation of diverse inputs. By harnessing the synergy between LLMs and MMs, our approach reduces dependence on traditional text-based training and fine-tuning, providing a more efficient and accurate solution for KGC tasks. It also offers greater flexibility in addressing complex relationships and diverse entities. Extensive experiments on multiple benchmark KGC datasets demonstrate that MLKGC effectively leverages the strengths of both LLMs and multi-modal data, achieving superior performance in link-prediction tasks.
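The abstract stays at the framework level; as a concrete illustration of how an LLM can be prompted to complete a triple once multimodal context has been gathered, consider the sketch below, in which every field name and the prompt wording are hypothetical.

```python
def build_kgc_prompt(head: str, relation: str, context_facts, image_caption=None) -> str:
    """Assemble an LLM prompt for tail-entity prediction (illustrative only)."""
    lines = [f"Known facts about {head}:"]
    lines += [f"- {h} {r} {t}" for h, r, t in context_facts]
    if image_caption:  # multimodal signal, e.g. produced by an image captioner
        lines.append(f"Image description: {image_caption}")
    lines.append(f"Question: {head} {relation} ?")
    lines.append("Answer with a single entity name.")
    return "\n".join(lines)

prompt = build_kgc_prompt(
    "Marie Curie", "award received",
    [("Marie Curie", "field of work", "physics")],
    image_caption="A portrait of a scientist in a laboratory.",
)
```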
APA, Harvard, Vancouver, ISO, and other styles
39

Meng, GuangHao, Sunan He, Jinpeng Wang, et al. "EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 6 (2025): 6126–34. https://doi.org/10.1609/aaai.v39i6.32655.

Full text
Abstract:
Vision-language retrieval (VLR) has attracted significant attention in both academia and industry, which involves using text (or images) as queries to retrieve corresponding images (or text). However, existing methods often neglect the rich visual semantics knowledge of entities, thus leading to incorrect retrieval results. To address this problem, we propose the Entity Visual Description enhanced CLIP (EvdCLIP), designed to leverage the visual knowledge of entities to enrich queries. Specifically, since humans recognize entities through visual cues, we employ a large language model (LLM) to generate Entity Visual Descriptions (EVDs) as alignment cues to complement textual data. These EVDs are then integrated into raw queries to create visually-rich, EVD-enhanced queries. Furthermore, recognizing that EVD-enhanced queries may introduce noise or low-quality expansions, we develop a novel, trainable EVD-aware Rewriter (EaRW) for vision-language retrieval tasks. EaRW utilizes EVD knowledge and the generative capabilities of the language model to effectively rewrite queries. With our specialized training strategy, EaRW can generate high-quality and low-noise EVD-enhanced queries. Extensive quantitative and qualitative experiments on image-text retrieval benchmarks validate the superiority of EvdCLIP on vision-language retrieval tasks.
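To make the query-enrichment step concrete, here is a minimal sketch of generating Entity Visual Descriptions with an LLM and appending them to the raw query; the prompt wording, the `llm` callable, and the plain concatenation are assumptions, and the paper's trainable rewriter (EaRW) is not modeled here.

```python
def evd_enhanced_query(query: str, entities: list[str], llm) -> str:
    """Enrich a retrieval query with LLM-generated visual descriptions.

    `llm` is any callable mapping a prompt string to generated text."""
    descriptions = []
    for entity in entities:
        prompt = (f"Describe the distinctive visual appearance of '{entity}' "
                  f"in one short sentence.")
        descriptions.append(llm(prompt).strip())
    # EVD-enhanced query: original text plus visual cues for each entity.
    return query + " " + " ".join(descriptions)

# evd_enhanced_query("a photo of a shiba inu", ["shiba inu"], llm=my_llm)
```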
APA, Harvard, Vancouver, ISO, and other styles
40

Carlo, Michele, and Osamu Takeuchi. "High-accuracy, privacy-compliant multilingual sentiment categorization on consumer-grade hardware: A monte carlo evaluation of locally deployed large language models." Digital Applied Linguistics 3 (March 11, 2025): 102585. https://doi.org/10.29140/dal.v3.102585.

Full text
Abstract:
This study presents a comprehensive evaluation of multilingual sentiment categorization performance using locally deployed large language models (LLMs) on consumer-grade hardware, focusing on GDPR-compliant implementation scenarios. Through extensive Monte Carlo validation involving 947,700 classifications over 702 iterations, we demonstrate significant performance capabilities across English, Italian, and Japanese languages while operating within consumer hardware constraints. Using lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half on a Python-based llama-cpp framework on consumer NVIDIA GPU hardware, English achieved 96.3% accuracy (95% CI: 0.963–0.964), with Italian and Japanese showing strong performance at 92.2% (95% CI: 0.921–0.922) and 90.7% (95% CI: 0.906–0.908) respectively. Notably, our analysis demonstrates that plurality voting can achieve extremely high confidence levels across all languages, suggesting an efficient approach to improving classification reliability without requiring extensive computational resources.
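Plurality voting over repeated classifications, as used in the study, is simple to reproduce; the sketch below assumes only some stochastic `classify` callable (e.g. one sampled LLM completion per call) and is not tied to the paper's exact setup.

```python
from collections import Counter

def plurality_vote(classify, text: str, n_runs: int = 7):
    """Run a stochastic classifier several times and keep the modal label.

    Returns the winning label and its vote share, a cheap proxy for
    classification confidence."""
    votes = Counter(classify(text) for _ in range(n_runs))
    label, count = votes.most_common(1)[0]
    return label, count / n_runs

# label, confidence = plurality_vote(my_sentiment_classifier, "Che bel film!")
```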
APA, Harvard, Vancouver, ISO, and other styles
41

Jiang, Zhengbao, Frank F. Xu, Jun Araki, and Graham Neubig. "How Can We Know What Language Models Know?" Transactions of the Association for Computational Linguistics 8 (July 2020): 423–38. http://dx.doi.org/10.1162/tacl_a_00324.

Full text
Abstract:
Recent work has presented intriguing results examining the knowledge contained in language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a __ by profession”. These prompts are usually manually created and quite possibly sub-optimal; another prompt such as “Obama worked as a __” may result in more accurately predicting the correct profession. Because of this, given an inappropriate prompt, we might fail to retrieve facts that the LM does know, and thus any given prompt provides only a lower-bound estimate of the knowledge contained in an LM. In this paper, we attempt to more accurately estimate the knowledge contained in LMs by automatically discovering better prompts to use in this querying process. Specifically, we propose mining-based and paraphrasing-based methods to automatically generate high-quality and diverse prompts, as well as ensemble methods to combine answers from different prompts. Extensive experiments on the LAMA benchmark for extracting relational knowledge from LMs demonstrate that our methods can improve accuracy from 31.1% to 39.6%, providing a tighter lower bound on what LMs know. We have released the code and the resulting LM Prompt And Query Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
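The querying-and-ensembling procedure is easy to sketch with a masked LM; the snippet below averages the mask-token distributions of two prompts uniformly, whereas the paper also explores mined prompts and learned ensemble weights.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")

prompts = [
    "Obama is a [MASK] by profession.",
    "Obama worked as a [MASK].",
]

def mask_distribution(prompt: str) -> torch.Tensor:
    """Probability distribution over the vocabulary at the [MASK] position."""
    inputs = tokenizer(prompt, return_tensors="pt")
    pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, pos]
    return torch.softmax(logits, dim=-1)

# Uniform ensemble over prompts; the best single prompt may differ per relation.
avg = torch.stack([mask_distribution(p) for p in prompts]).mean(dim=0)
print(tokenizer.decode(avg.argmax().item()))
```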
APA, Harvard, Vancouver, ISO, and other styles
42

Yu, Qing, Mikihiro Tanaka, and Kent Fujiwara. "ReMoGPT: Part-Level Retrieval-Augmented Motion-Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 9 (2025): 9635–43. https://doi.org/10.1609/aaai.v39i9.33044.

Full text
Abstract:
Generation of 3D human motion holds significant importance in the creative industry. While recent notable advances have been made in generating common motions, existing methods struggle to generate diverse and rare motions due to the complexity of motions and limited training data. This work introduces ReMoGPT, a unified motion-language generative model that solves a wide range of motion-related tasks by incorporating a multi-modal retrieval mechanism into the generation process, addressing the key limitations of existing models: diversity and generalizability. We propose to focus on body-part-level motion features to enable fine-grained text-motion retrieval and to locate suitable references from the database for generation. The motion-language generative model is then trained with prompt-based question-and-answer tasks designed for different motion-relevant problems. We incorporate the retrieved samples into the prompt and perform instruction tuning of the motion-language model so that it learns from task feedback and produces promising results with the help of fine-grained multi-modal retrieval. Extensive experiments validate the efficacy of ReMoGPT, showcasing its superiority over existing state-of-the-art methods. The framework performs well on multiple motion tasks, including motion retrieval, generation, and captioning.
APA, Harvard, Vancouver, ISO, and other styles
43

Kustwar, Jitendra Singh, Gourav Kumar Shrivastava, and Nikhil Chaurasia. "Showcasing Retrieval and Language Models for Information-Rich Natural Language Processing (NLP)." International Journal on Advances in Engineering, Technology and Science (IJAETS) 5, no. 1 (2024): 45–48. https://doi.org/10.5281/zenodo.10711142.

Full text
Abstract:
In the realm of Natural Language Processing (NLP), the integration of retrieval and language models has become paramount for handling information-rich content effectively. This paper presents a comprehensive exploration and showcase of advanced techniques for combining retrieval and language models to enhance the capabilities of information-intensive NLP systems. The primary objective is to bridge the gap between knowledge retrieval and contextual understanding, enabling applications to seamlessly navigate extensive knowledge bases. The paper begins by surveying state-of-the-art retrieval models, delving into their strengths and limitations in extracting relevant information from large datasets. Subsequently, it explores the landscape of language models, including transformer-based architectures such as BERT and GPT, focusing on their abilities to capture intricate linguistic nuances and semantic relationships within the context of information-rich tasks. Our approach involves the careful composition of these models, emphasizing the synergy between retrieved knowledge and contextual understanding. The proposed models aim not only to retrieve relevant information but also to comprehend and integrate it seamlessly into the context of natural language understanding. To demonstrate the efficacy of the showcased models, we present practical applications across diverse domains, including healthcare, legal, and scientific literature analysis. We evaluate the models using rigorous metrics, assessing their performance in terms of accuracy, precision, and recall. The language model is constructed using advanced deep learning methods, specifically recurrent neural networks (RNNs) and transformer architectures, and is trained on large datasets to learn intricate patterns, semantic structures, and contextual nuances within the language. Keywords: Natural Language Processing (NLP), Machine Learning (ML), Deep Learning (DL), recurrent neural networks (RNNs).
APA, Harvard, Vancouver, ISO, and other styles
44

Li, Quan. "Adapter Based on Pre-Trained Language Models for Classification of Medical Text." Journal of Electronic Research and Application 8, no. 3 (2024): 129–34. http://dx.doi.org/10.26689/jera.v8i3.7219.

Full text
Abstract:
We present an approach to classify medical text at a sentence level automatically. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
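The adapter design the paper builds on is not detailed in the abstract; the sketch below shows the standard bottleneck adapter (a Houlsby-style residual module) with illustrative dimensions, to make the "minimizing the number of trainable parameters" point concrete.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter inserted after a frozen transformer sub-layer.

    Only the two small projections are trained, which keeps the trainable
    parameter count far below full fine-tuning."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(self.act(self.down(h)))   # residual connection

out = Adapter(hidden_dim=768)(torch.randn(2, 128, 768))
```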
APA, Harvard, Vancouver, ISO, and other styles
45

Neveditsin, Nikita, Pawan Lingras, and Vijay Mago. "Clinical insights: A comprehensive review of language models in medicine." PLOS Digital Health 4, no. 5 (2025): e0000800. https://doi.org/10.1371/journal.pdig.0000800.

Full text
Abstract:
This paper explores the advancements and applications of language models in healthcare, focusing on their clinical use cases. It examines the evolution from early encoder-based systems requiring extensive fine-tuning to state-of-the-art large language and multimodal models capable of integrating text and visual data through in-context learning. The analysis emphasizes locally deployable models, which enhance data privacy and operational autonomy, and their applications in tasks such as text generation, classification, information extraction, and conversational systems. The paper also highlights a structured organization of tasks and a tiered ethical approach, providing a valuable resource for researchers and practitioners, while discussing key challenges related to ethics, evaluation, and implementation.
APA, Harvard, Vancouver, ISO, and other styles
46

Ma, Jian, Yonglin Deng, Chen Chen, Nanyang Du, Haonan Lu, and Zhenyu Yang. "GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 6 (2025): 5955–63. https://doi.org/10.1609/aaai.v39i6.32636.

Full text
Abstract:
Posters serve an essential function in marketing and advertising by improving visual communication and brand visibility, thus contributing significantly to industrial design. With the latest developments in controllable text-to-image (T2I) diffusion models, research interest has surged in text rendering within synthesized images. Although text rendering accuracy has seen advancements, automatic poster generation remains a relatively untapped area. This paper presents an automatic poster generation framework featuring text rendering capabilities through the use of LLMs. Our framework employs a triple-cross attention mechanism based on alignment learning to achieve precise text placement within detailed contextual backgrounds. Moreover, it supports adjustable fonts, varying image resolutions, and poster rendering with textual prompts in both English and Chinese. Additionally, we present a comprehensive bilingual image-text dataset, GlyphDraw-3M, comprising 3 million image-text pairs, each with OCR annotations and resolutions exceeding 1024. Our method utilizes the SDXL architecture, and extensive experiments confirm its ability to generate posters with intricate and context-rich backgrounds.
APA, Harvard, Vancouver, ISO, and other styles
47

Lu, Haifeng, Jiuyi Chen, Feng Liang, Mingkui Tan, Runhao Zeng, and Xiping Hu. "Understanding Emotional Body Expressions via Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 2 (2025): 1447–55. https://doi.org/10.1609/aaai.v39i2.32135.

Full text
Abstract:
Emotion recognition based on body movements is vital in human-computer interaction. However, existing emotion recognition methods predominantly focus on enhancing classification accuracy, often neglecting the provision of textual explanations to justify their classifications. In this paper, we propose an Emotion-Action Interpreter powered by a Large Language Model (EAI-LLM), which not only recognizes emotions but also generates textual explanations by treating 3D body movement data as unique input tokens within large language models (LLMs). Specifically, we propose a multi-granularity skeleton tokenizer designed for LLMs, which separately extracts spatio-temporal tokens and semantic tokens from the skeleton data. This approach allows LLMs to generate more nuanced classification descriptions while maintaining robust classification performance. Furthermore, we treat the skeleton sequence as a specific language and propose a unified skeleton token module. This module leverages the extensive background knowledge and language processing capabilities of LLMs to address the challenges of joint training on heterogeneous datasets, thereby significantly enhancing recognition accuracy on individual datasets. Experimental results demonstrate that our model achieves recognition accuracy comparable to existing methods. More importantly, with the support of background knowledge from LLMs, our model can generate detailed emotion descriptions based on classification results, even when trained on a limited amount of labeled skeleton data.
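The abstract does not describe the tokenizer's internals; one common way to turn continuous skeleton frames into discrete tokens an LLM can consume is vector quantization against a learned codebook, sketched below with invented shapes as a stand-in for, not a reproduction of, the paper's multi-granularity design.

```python
import torch

def quantize_skeleton(frames: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Map continuous skeleton frames to discrete codebook indices.

    frames:   (seq_len, feat_dim) flattened joint coordinates per frame
    codebook: (n_codes, feat_dim) learned code vectors
    Returns one token id per frame, usable as extra LLM input tokens."""
    dists = torch.cdist(frames, codebook)      # (seq_len, n_codes)
    return dists.argmin(dim=-1)                # nearest code per frame

tokens = quantize_skeleton(torch.randn(30, 75), torch.randn(512, 75))
```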
APA, Harvard, Vancouver, ISO, and other styles
48

Tran, Hanh Thi-Hong, Carlos-Emiliano González-Gallardo, Antoine Doucet, and Senja Pollak. "LlamATE." Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication 31, no. 1 (2025): 5–36. https://doi.org/10.1075/term.00082.tra.

Full text
Abstract:
Over the past decades, automatic term or terminology extraction (ATE), a natural language processing (NLP) task that aims to identify domain-specific terms by producing a list of candidate terms, has remained challenging because term definitions depend strongly on the domain. Leveraging advances in large-scale language models (LLMs), we propose LlamATE, a framework to verify the impact of domain specificity on ATE when using in-context learning prompts in open-source LLM-based chat models, namely Llama-2-Chat. We evaluate how well LLM-based chat models (e.g., trained with reinforcement learning from human feedback (RLHF)) perform with different levels of domain-related information in the dominant language of NLP research (English) and in other European languages (French, Slovene) from the ACTER datasets, i.e., with in-domain and cross-domain demonstrations, with and without explicit domain enunciation. Furthermore, we examine the potential of cross-lingual and cross-domain prompting to reduce the need for extensive data annotation in the target domain and language. The results demonstrate the potential of implicit in-domain learning, where examples from the target domain serve as demonstrations without the domain of each example being named, and of cross-lingual learning, where knowledge transfers from the dominant language to European languages less represented in the LLMs' pre-training data. LlamATE also offers a valuable compromise by reducing the need for extensive data annotation, making it suitable for real-world applications where labeled corpora are scarce. The source code is publicly available at the following link: https://github.com/honghanhh/terminology2024.
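The "implicit in-domain learning" setting described above, demonstrations drawn from the target domain without the domain being named, can be sketched as a plain few-shot prompt; the example texts and formatting below are illustrative, not taken from ACTER.

```python
def build_ate_prompt(examples, target_text: str) -> str:
    """Few-shot term-extraction prompt with unnamed in-domain demonstrations."""
    parts = []
    for text, terms in examples:
        parts.append(f"Text: {text}\nTerms: {', '.join(terms)}")
    parts.append(f"Text: {target_text}\nTerms:")
    return "\n\n".join(parts)

prompt = build_ate_prompt(
    [("Heart failure with reduced ejection fraction is common.",
      ["heart failure", "ejection fraction"])],
    "Patients with atrial fibrillation often receive anticoagulants.",
)
```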
APA, Harvard, Vancouver, ISO, and other styles
49

Ordoñez-Briceño, Karla, José R. Hilera, Luis De-Marcos, and Rodrigo Saraguro-Bravo. "Generating Accessible Webpages from Models." Computers 14, no. 6 (2025): 213. https://doi.org/10.3390/computers14060213.

Full text
Abstract:
Despite significant efforts to promote web accessibility through the adoption of various standards and tools, the web remains inaccessible to many users. One of the main barriers is the limited knowledge of accessibility issues among website designers. This gap in expertise results in the development of websites that fail to meet accessibility standards, hindering access for people with diverse abilities and needs. In response to this challenge, this paper presents the ACG WebAcc prototype, which enables the automatic generation of accessible HTML code using a model-driven development (MDD) approach. The tool takes as input a Unified Modeling Language (UML) model, with a specific profile, and incorporates predefined Object Constraint Language (OCL) rules to ensure compliance with accessibility guidelines. By automating this process, ACG WebAcc reduces the need for extensive knowledge of accessibility standards, making it easier for designers to create accessible websites.
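To illustrate the kind of constraint-checked code generation described, here is a toy generator that enforces an accessibility rule analogous to an OCL constraint (every image must declare alternative text, per WCAG 1.1.1); the model fields are hypothetical and do not reflect ACG WebAcc's actual metamodel.

```python
def render_image(node: dict) -> str:
    """Emit HTML for an 'image' model element, refusing inaccessible output."""
    if not node.get("alt"):
        raise ValueError(f"Model violation: image '{node.get('src')}' lacks alt text")
    return f'<img src="{node["src"]}" alt="{node["alt"]}">'

print(render_image({"src": "logo.png", "alt": "University logo"}))
```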
APA, Harvard, Vancouver, ISO, and other styles
50

Khaliq, Ayesha, Sikander Bakht Abbasi, Arslan Ilyas, Saim Masood Shaikh, and Syed Ashar Ali. "Optimizing Academic Queries With Retrieval-Augmented Large Language Models." Migration Letters 21, S10 (2024): 1274–83. https://doi.org/10.59670/ml.v21is10.11858.

Full text
Abstract:
This research investigates the application of Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) methods to enhance the management of academic queries, providing advantages for both students and educators. The study aims to improve the precision and pertinence of generated answers by combining LLMs with multi-source RAG systems. The model employs PDF datasets of various sizes and incorporates vector database support to streamline storage and retrieval, thereby boosting the model's capacity to handle extensive datasets. To process and produce comprehensive responses, the research utilizes the LLaMA model, which features a parameter range from 7 billion to 65 billion. The study addresses challenges such as textual imperfections in data retrieval, which can significantly impact the model's output. To ensure its robustness, the proposed model is evaluated using a diverse set of academic inquiries. Beyond answering course-related questions, the model also supports international students by providing information on scholarship opportunities and admission guidelines. This research contributes to the advancement of academic research tools by merging retrieval-augmented techniques with sophisticated LLMs, paving the way for future studies in education and generative AI.
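A retrieve-then-generate loop of the kind described can be sketched in a few lines; the `embed` and `llm` callables, the brute-force cosine search (standing in for the vector database), and the prompt template are all illustrative assumptions.

```python
import numpy as np

def answer(query: str, chunks: list[str], embed, llm, k: int = 3) -> str:
    """Minimal retrieval-augmented generation loop (illustrative).

    embed: callable mapping text -> 1-D vector; llm: callable mapping prompt -> text."""
    doc_vecs = np.stack([embed(c) for c in chunks])
    q = embed(query)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-9)
    context = "\n".join(chunks[i] for i in np.argsort(sims)[::-1][:k])
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)
```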
APA, Harvard, Vancouver, ISO, and other styles