Journal articles on the topic "Pretrained language model"

Check out the 50 best scholarly journal articles on the topic "Pretrained language model".

An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, whenever these details are available in the record's metadata.

Browse journal articles from a wide range of disciplines and compile an accurate bibliography.

1

Lee, Chanhee, Kisu Yang, Taesun Whang, Chanjun Park, Andrew Matteson, and Heuiseok Lim. "Exploring the Data Efficiency of Cross-Lingual Post-Training in Pretrained Language Models." Applied Sciences 11, no. 5 (2021): 1974. http://dx.doi.org/10.3390/app11051974.

Abstract:
Language model pretraining is an effective method for improving the performance of downstream natural language processing tasks. Even though language modeling is unsupervised and thus collecting data for it is relatively less expensive, it is still a challenging process for languages with limited resources. This results in great technological disparity between high- and low-resource languages for numerous downstream natural language processing tasks. In this paper, we aim to make this technology more accessible by enabling data efficient training of pretrained language models. It is achieved b
2

De Coster, Mathieu, and Joni Dambre. "Leveraging Frozen Pretrained Written Language Models for Neural Sign Language Translation." Information 13, no. 5 (2022): 220. http://dx.doi.org/10.3390/info13050220.

Abstract:
We consider neural sign language translation: machine translation from signed to written languages using encoder–decoder neural networks. Translating sign language videos to written language text is especially complex because of the difference in modality between source and target language and, consequently, the required video processing. At the same time, sign languages are low-resource languages, their datasets dwarfed by those available for written languages. Recent advances in written language processing and success stories of transfer learning raise the question of how pretrained written
3

Kuwana, Ayato, Atsushi Oba, Ranto Sawai, and Incheon Paik. "Automatic Taxonomy Classification by Pretrained Language Model." Electronics 10, no. 21 (2021): 2656. http://dx.doi.org/10.3390/electronics10212656.

Abstract:
In recent years, automatic ontology generation has received significant attention in information science as a means of systemizing vast amounts of online data. As our initial attempt of ontology generation with a neural network, we proposed a recurrent neural network-based method. However, updating the architecture is possible because of the development in natural language processing (NLP). By contrast, the transfer learning of language models trained by a large, unlabeled corpus has yielded a breakthrough in NLP. Inspired by these achievements, we propose a novel workflow for ontology generat
4

Lee, Eunchan, Changhyeon Lee, and Sangtae Ahn. "Comparative Study of Multiclass Text Classification in Research Proposals Using Pretrained Language Models." Applied Sciences 12, no. 9 (2022): 4522. http://dx.doi.org/10.3390/app12094522.

Abstract:
Recently, transformer-based pretrained language models have demonstrated stellar performance in natural language understanding (NLU) tasks. For example, bidirectional encoder representations from transformers (BERT) have achieved outstanding performance through masked self-supervised pretraining and transformer-based modeling. However, the original BERT may only be effective for English-based NLU tasks, whereas its effectiveness for other languages such as Korean is limited. Thus, the applicability of BERT-based language models pretrained in languages other than English to NLU tasks based on t
5

Wang, Canjun, Zhao Li, Tong Chen, Ruishuang Wang, and Zhengyu Ju. "Research on the Application of Prompt Learning Pretrained Language Model in Machine Translation Task with Reinforcement Learning." Electronics 12, no. 16 (2023): 3391. http://dx.doi.org/10.3390/electronics12163391.

Abstract:
With the continuous advancement of deep learning technology, pretrained language models have emerged as crucial tools for natural language processing tasks. However, optimization of pretrained language models is essential for specific tasks such as machine translation. This paper presents a novel approach that integrates reinforcement learning with prompt learning to enhance the performance of pretrained language models in machine translation tasks. In our methodology, a “prompt” string is incorporated into the input of the pretrained language model, to guide the generation of an output that a
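To make the prompt-based setup described in this abstract concrete, the sketch below simply prepends a textual prompt to the source sentence of a pretrained sequence-to-sequence model. The Hugging Face transformers library, the t5-small checkpoint, and the prompt template are illustrative assumptions, and the paper's reinforcement-learning component is omitted.

```python
# Minimal sketch (not the paper's exact setup): steering a pretrained
# seq2seq language model with a "prompt" string prepended to the input.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-small"                     # assumed stand-in for the pretrained LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = "translate English to German: "    # the guiding prompt string
source = "The weather is nice today."

inputs = tokenizer(prompt + source, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```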
6

Chen, Zhi, Yuncong Liu, Lu Chen, Su Zhu, Mengyue Wu, and Kai Yu. "OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue." Transactions of the Association for Computational Linguistics 11 (2023): 68–84. http://dx.doi.org/10.1162/tacl_a_00534.

Abstract:
Abstract This paper presents an ontology-aware pretrained language model (OPAL) for end-to-end task-oriented dialogue (TOD). Unlike chit-chat dialogue models, task-oriented dialogue models fulfill at least two task-specific modules: Dialogue state tracker (DST) and response generator (RG). The dialogue state consists of the domain-slot-value triples, which are regarded as the user’s constraints to search the domain-related databases. The large-scale task-oriented dialogue data with the annotated structured dialogue state usually are inaccessible. It prevents the development of the pretrained l
7

Xu, Canwen, and Julian McAuley. "A Survey on Model Compression and Acceleration for Pretrained Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (2023): 10566–75. http://dx.doi.org/10.1609/aaai.v37i9.26255.

Abstract:
Despite achieving state-of-the-art performance on many NLP tasks, the high energy cost and long inference delay prevent Transformer-based pretrained language models (PLMs) from seeing broader adoption including for edge and mobile computing. Efficient NLP research aims to comprehensively consider computation, time and carbon emission for the entire life-cycle of NLP, including data preparation, model training and inference. In this survey, we focus on the inference stage and review the current state of model compression and acceleration for pretrained language models, including benchmarks, met
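As a taste of one technique such surveys cover, the sketch below applies post-training dynamic quantization to the linear layers of a pretrained encoder with PyTorch and compares checkpoint sizes. The bert-base-uncased checkpoint is an arbitrary example for illustration, not a model benchmarked in the paper.

```python
# One compression technique from the survey's scope: dynamic int8
# quantization of a Transformer encoder's linear layers.
import os
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Compare serialized checkpoint sizes of the fp32 and quantized models.
for tag, m in [("fp32", model), ("int8-dynamic", quantized)]:
    torch.save(m.state_dict(), f"{tag}.pt")
    print(tag, round(os.path.getsize(f"{tag}.pt") / 1e6, 1), "MB")
```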
8

Gu, Yang, and Yanke Hu. "Extractive Summarization with Very Deep Pretrained Language Model." International Journal of Artificial Intelligence & Applications 10, no. 02 (2019): 27–32. http://dx.doi.org/10.5121/ijaia.2019.10203.

9

Qi, Xianglong, Yang Gao, Ruibin Wang, Minghua Zhao, Shengjia Cui, and Mohsen Mortazavi. "Learning High-Order Semantic Representation for Intent Classification and Slot Filling on Low-Resource Language via Hypergraph." Mathematical Problems in Engineering 2022 (September 16, 2022): 1–16. http://dx.doi.org/10.1155/2022/8407713.

Abstract:
Representation of language is the first and critical task for Natural Language Understanding (NLU) in a dialogue system. Pretraining, embedding model, and fine-tuning for intent classification and slot-filling are popular and well-performing approaches but are time consuming and inefficient for low-resource languages. Concretely, the out-of-vocabulary and transferring to different languages are two tough challenges for multilingual pretrained and cross-lingual transferring models. Furthermore, quality-proved parallel data are necessary for the current frameworks. Stepping over these challenges
10

Won, Hyun-Sik, Min-Ji Kim, Dohyun Kim, Hee-Soo Kim, and Kang-Min Kim. "University Student Dropout Prediction Using Pretrained Language Models." Applied Sciences 13, no. 12 (2023): 7073. http://dx.doi.org/10.3390/app13127073.

Abstract:
Predicting student dropout from universities is an imperative but challenging task. Numerous data-driven approaches that utilize both student demographic information (e.g., gender, nationality, and high school graduation year) and academic information (e.g., GPA, participation in activities, and course evaluations) have shown meaningful results. Recently, pretrained language models have achieved very successful results in understanding the tasks associated with structured data as well as textual data. In this paper, we propose a novel student dropout prediction framework based on demographic a
11

Elazar, Yanai, Nora Kassner, Shauli Ravfogel, et al. "Measuring and Improving Consistency in Pretrained Language Models." Transactions of the Association for Computational Linguistics 9 (2021): 1012–31. http://dx.doi.org/10.1162/tacl_a_00410.

Abstract:
Abstract Consistency of a model—that is, the invariance of its behavior under meaning-preserving alternations in its input—is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel🤘, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel🤘, we show that the consistency of all PLMs we experiment with is poor— though with high variance between relations. Our ana
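The consistency probe described here can be illustrated in a few lines: query a masked language model with two paraphrases of the same cloze-style fact and check whether the top predictions agree. The model and the paraphrase pair below are illustrative stand-ins, not items taken from ParaRel.

```python
# Sketch of a factual-consistency check across paraphrased cloze queries.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-cased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-cased")

def top_prediction(template: str) -> str:
    # Replace the [Y] slot with the mask token and take the argmax filler.
    inputs = tok(template.replace("[Y]", tok.mask_token), return_tensors="pt")
    with torch.no_grad():
        logits = mlm(**inputs).logits
    mask_pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero()[0, 0]
    return tok.decode([int(logits[0, mask_pos].argmax())]).strip()

p1 = top_prediction("Albert Einstein was born in [Y].")
p2 = top_prediction("The birthplace of Albert Einstein is [Y].")
print(p1, p2, "consistent" if p1 == p2 else "inconsistent")
```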
12

Edman, Lukas, Gabriele Sarti, Antonio Toral, Gertjan van Noord, and Arianna Bisazza. "Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation." Transactions of the Association for Computational Linguistics 12 (2024): 392–410. http://dx.doi.org/10.1162/tacl_a_00651.

Abstract:
Abstract Pretrained character-level and byte-level language models have been shown to be competitive with popular subword models across a range of Natural Language Processing tasks. However, there has been little research on their effectiveness for neural machine translation (NMT), particularly within the popular pretrain-then-finetune paradigm. This work performs an extensive comparison across multiple languages and experimental conditions of character- and subword-level pretrained models (ByT5 and mT5, respectively) on NMT. We show the effectiveness of character-level modeling in translation
13

Lu, Kevin, Aditya Grover, Pieter Abbeel, and Igor Mordatch. "Frozen Pretrained Transformers as Universal Computation Engines." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (2022): 7628–36. http://dx.doi.org/10.1609/aaai.v36i7.20729.

Abstract:
We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks. We consider such a model, which we call a Frozen Pretrained Transformer (FPT), and study finetuning it on a variety of sequence classification tasks spanning numerical computation, vision, and protein fold prediction. In contrast to prior works which investigate finetuning on the same modality as the pretraining dataset, we show that pretraining on natural
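A rough sketch of the frozen-transformer recipe this abstract describes, assuming the Hugging Face transformers library and a GPT-2 backbone: the self-attention and feed-forward weights stay frozen, while layer norms, positional embeddings, and new input/output layers are trained on the non-language task. The layer-selection rule and the toy dimensions are simplifications, not the paper's exact configuration.

```python
# Frozen Pretrained Transformer sketch: freeze attention/FFN weights,
# train layer norms, positional embeddings, and new input/output layers.
import torch
import torch.nn as nn
from transformers import GPT2Model

backbone = GPT2Model.from_pretrained("gpt2")
for name, p in backbone.named_parameters():
    # layer norms ("ln_*") and positional embeddings ("wpe") stay trainable
    p.requires_grad = "ln" in name or "wpe" in name

class FPTClassifier(nn.Module):
    def __init__(self, in_dim: int, n_classes: int):
        super().__init__()
        self.embed = nn.Linear(in_dim, backbone.config.n_embd)   # new input layer
        self.backbone = backbone
        self.head = nn.Linear(backbone.config.n_embd, n_classes)  # new output layer

    def forward(self, x):                       # x: (batch, seq, in_dim)
        h = self.backbone(inputs_embeds=self.embed(x)).last_hidden_state
        return self.head(h[:, -1])              # classify from the last position

model = FPTClassifier(in_dim=8, n_classes=2)
print(model(torch.randn(4, 16, 8)).shape)       # torch.Size([4, 2])
```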
14

Khan, Vasima, and Tariq Azfar Meenai. "Pretrained Natural Language Processing Model for Intent Recognition (BERT-IR)." Human-Centric Intelligent Systems 1, no. 3-4 (2021): 66. http://dx.doi.org/10.2991/hcis.k.211109.001.

15

Jawale, Shila S., and S. D. Sawarkar. "Exploiting Emotions via Composite Pretrained Embedding and Ensemble Language Model." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 8s (2023): 362–75. http://dx.doi.org/10.17762/ijritcc.v11i8s.7216.

Abstract:
Decisions in the modern era are based on more than just the available data; they also incorporate feedback from online sources. Processing such reviews is known as sentiment analysis (SA) or emotion analysis. Understanding the user's perspective and routines is crucial nowadays for multiple reasons. It is used by both businesses and governments to make strategic decisions. Various architectural and vector embedding strategies have been developed for SA processing. Accurate representation of text is crucial for automatic SA. Due to the large number of languages spoken and written, polysemy and syntac
16

Li, Jiahuan, Hao Zhou, Shujian Huang, Shanbo Cheng, and Jiajun Chen. "Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions." Transactions of the Association for Computational Linguistics 12 (2024): 576–92. http://dx.doi.org/10.1162/tacl_a_00655.

Abstract:
Abstract Large-scale pretrained language models (LLMs), such as ChatGPT and GPT4, have shown strong abilities in multilingual translation, without being explicitly trained on parallel corpora. It is intriguing how the LLMs obtain their ability to carry out translation instructions for different languages. In this paper, we present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7.5B, to perform multilingual translation following given instructions. Firstly, we show that multilingual LLMs have stronger translation abilities than previously demonstrated. For a ce
17

Bear Don’t Walk IV, Oliver J., Tony Sun, Adler Perotte, and Noémie Elhadad. "Clinically relevant pretraining is all you need." Journal of the American Medical Informatics Association 28, no. 9 (2021): 1970–76. http://dx.doi.org/10.1093/jamia/ocab086.

Abstract:
Abstract Clinical notes present a wealth of information for applications in the clinical domain, but heterogeneity across clinical institutions and settings presents challenges for their processing. The clinical natural language processing field has made strides in overcoming domain heterogeneity, while pretrained deep learning models present opportunities to transfer knowledge from one task to another. Pretrained models have performed well when transferred to new tasks; however, it is not well understood if these models generalize across differences in institutions and settings within the cli
18

Zhang, Yijia, Tiancheng Zhang, Peng Xie, Minghe Yu, and Ge Yu. "BEM-SM: A BERT-Encoder Model with Symmetry Supervision Module for Solving Math Word Problem." Symmetry 15, no. 4 (2023): 916. http://dx.doi.org/10.3390/sym15040916.

Abstract:
In order to find solutions to math word problems, some modules have been designed to check the generated expressions, but they neither take into account the symmetry between math word problems and their corresponding mathematical expressions, nor do they utilize the efficiency of pretrained language models in natural language understanding tasks. Anyway, designing fine-tuning tasks for pretrained language models that encourage cooperation with other modules to improve the performance of math word problem solvers is an unaddressed problem. To solve these problems, in this paper we propose a BER
19

Reddy, K. Sahit, N. Ragavenderan, Vasanth K., Ganesh N. Naik, Vishalakshi Prabhu H, and Nagaraja G. S. "MedicalBERT: enhancing biomedical natural language processing using pretrained BERT-based model." IAES International Journal of Artificial Intelligence (IJ-AI) 14, no. 3 (2025): 2367. https://doi.org/10.11591/ijai.v14.i3.pp2367-2378.

Abstract:
Recent advances in natural language processing (NLP) have been driven by pretrained language models like BERT, RoBERTa, T5, and GPT. These models excel at understanding complex texts, but biomedical literature, with its domain-specific terminology, poses challenges that models like Word2Vec and bidirectional long short-term memory (Bi-LSTM) can't fully address. GPT and T5, despite capturing context, fall short in tasks needing bidirectional understanding, unlike BERT. Addressing this, we proposed MedicalBERT, a pretrained BERT model trained on a large biomedic
20

Zhang, Wenbo, Xiao Li, Yating Yang, Rui Dong, and Gongxu Luo. "Keeping Models Consistent between Pretraining and Translation for Low-Resource Neural Machine Translation." Future Internet 12, no. 12 (2020): 215. http://dx.doi.org/10.3390/fi12120215.

Abstract:
Recently, the pretraining of models has been successfully applied to unsupervised and semi-supervised neural machine translation. A cross-lingual language model uses a pretrained masked language model to initialize the encoder and decoder of the translation model, which greatly improves the translation quality. However, because of a mismatch in the number of layers, the pretrained model can only initialize part of the decoder’s parameters. In this paper, we use a layer-wise coordination transformer and a consistent pretraining translation transformer instead of a vanilla transformer as the tra
21

Javed, Tahir, Sumanth Doddapaneni, Abhigyan Raman, et al. "Towards Building ASR Systems for the Next Billion Users." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (2022): 10813–21. http://dx.doi.org/10.1609/aaai.v36i10.21327.

Abstract:
Recent methods in speech and language technology pretrain very large models which are fine-tuned for specific tasks. However, the benefits of such large models are often limited to a few resource rich languages of the world. In this work, we make multiple contributions towards building ASR systems for low resource languages from the Indian subcontinent. First, we curate 17,000 hours of raw speech data for 40 Indian languages from a wide variety of domains including education, news, technology, and finance. Second, using this raw speech data we pretrain several variants of wav2vec style models
22

Kotei, Evans, and Ramkumar Thirunavukarasu. "A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning." Information 14, no. 3 (2023): 187. http://dx.doi.org/10.3390/info14030187.

Abstract:
Transfer learning is a technique utilized in deep learning applications to transmit learned inference to a different target domain. The approach is mainly to solve the problem of a few training datasets resulting in model overfitting, which affects model performance. The study was carried out on publications retrieved from various digital libraries such as SCOPUS, ScienceDirect, IEEE Xplore, ACM Digital Library, and Google Scholar, which formed the Primary studies. Secondary studies were retrieved from Primary articles using the backward and forward snowballing approach. Based on set inclusion
23

Joukhadar, Alaa, Nada Ghneim, and Ghaida Rebdawi. "Impact of Using Bidirectional Encoder Representations from Transformers (BERT) Models for Arabic Dialogue Acts Identification." Ingénierie des systèmes d information 26, no. 5 (2021): 469–75. http://dx.doi.org/10.18280/isi.260506.

Abstract:
In Human-Computer dialogue systems, the correct identification of the intent underlying a speaker's utterance is crucial to the success of a dialogue. Several researches have studied the Dialogue Act Classification (DAC) task to identify Dialogue Acts (DA) for different languages. Recently, the emergence of Bidirectional Encoder Representations from Transformers (BERT) models, enabled establishing state-of-the-art results for a variety of natural language processing tasks in different languages. Very few researches have been done in the Arabic Dialogue acts identification task. The BERT repres
24

Dudaš, Adam, and Jarmila Skrinarova. "Natural Language Processing in Translation of Relational Languages." IPSI Transactions on Internet Research 19, no. 01 (2023): 17–23. http://dx.doi.org/10.58245/ipsi.tir.2301.04.

Abstract:
Methods of data manipulation in combination with the structure of the data and integrity constraints define the data model used in relational databases. This article focuses on the methods and processes of operation sets which are used for selection of data from a relational database and translation between various formats of this manipulation. The article presents the design, implementation and experimental evaluation of a tool for translating between relational algebra, tuple relational calculus, Structured Query Language and unrestricted natural language in all directions. Presented software tool t
25

Pan, Yu, Ye Yuan, Yichun Yin, et al. "Preparing Lessons for Progressive Training on Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (2024): 18860–68. http://dx.doi.org/10.1609/aaai.v38i17.29851.

Abstract:
The rapid progress of Transformers in artificial intelligence has come at the cost of increased resource consumption and greenhouse gas emissions due to growing model sizes. Prior work suggests using pretrained small models to improve training efficiency, but this approach may not be suitable for new model structures. On the other hand, training from scratch can be slow, and progressively stacking layers often fails to achieve significant acceleration. To address these challenges, we propose a novel method called Apollo, which prepares lessons for expanding operations by learning high-layer fu
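Apollo's "lesson preparation" is not reproduced here, but the progressive-stacking baseline the abstract contrasts it with is easy to sketch: train a shallow Transformer encoder, then duplicate its layers to double the depth and continue training. The plain PyTorch modules below are stand-ins for illustration only.

```python
# Progressive stacking (the baseline mentioned above, not Apollo itself):
# grow a trained encoder from L layers to 2L by duplicating its layers.
import copy
import torch.nn as nn

def make_encoder(num_layers: int, d_model: int = 256) -> nn.TransformerEncoder:
    layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=num_layers)

def stack(encoder: nn.TransformerEncoder) -> nn.TransformerEncoder:
    grown = copy.deepcopy(encoder)
    # copy the existing layers on top of themselves (depth: L -> 2L)
    grown.layers.extend(copy.deepcopy(layer) for layer in encoder.layers)
    grown.num_layers = len(grown.layers)
    return grown

small = make_encoder(num_layers=3)   # ... train the shallow model first ...
large = stack(small)                 # then expand and keep training
print(len(small.layers), "->", len(large.layers))   # 3 -> 6
```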
26

Zheng, Zhe, Xin-Zheng Lu, Ke-Yin Chen, Yu-Cheng Zhou, and Jia-Rui Lin. "Pretrained domain-specific language model for natural language processing tasks in the AEC domain." Computers in Industry 142 (November 2022): 103733. http://dx.doi.org/10.1016/j.compind.2022.103733.

27

Yigzaw, Netsanet, Million Meshesha, and Chala Diriba. "A Generic Approach towards Amharic Sign Language Recognition." Advances in Human-Computer Interaction 2022 (September 22, 2022): 1–11. http://dx.doi.org/10.1155/2022/1112169.

Abstract:
In the day-to-day life of communities, good communication channels are crucial for mutual understanding. The hearing-impaired community uses sign language, which is a visual and gestural language. In terms of orientation and expression, it is separate from written and spoken languages. Despite the fact that sign language is an excellent platform for communication among hearing-impaired persons, it has created a communication barrier between hearing-impaired and non-disabled people. To address this issue, researchers have proposed sign language to text translation systems for English and other
28

Yan, Ming, Chenliang Li, Bin Bi, Wei Wang, and Songfang Huang. "A Unified Pretraining Framework for Passage Ranking and Expansion." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 5 (2021): 4555–63. http://dx.doi.org/10.1609/aaai.v35i5.16584.

Abstract:
Pretrained language models have recently advanced a wide range of natural language processing tasks. Nowadays, the application of pretrained language models to IR tasks has also achieved impressive results. Typical methods either directly apply a pretrained model to improve the re-ranking stage, or use it to conduct passage expansion and term weighting for first-stage retrieval. We observe that the passage ranking and passage expansion tasks share certain inherent relations, and can benefit from each other. Therefore, in this paper, we propose a general pretraining framework to enhance both ta
29

Li, Juncai, and Xiaofei Jiang. "Mol-BERT: An Effective Molecular Representation with BERT for Molecular Property Prediction." Wireless Communications and Mobile Computing 2021 (September 2, 2021): 1–7. http://dx.doi.org/10.1155/2021/7181815.

Abstract:
Molecular property prediction is an essential task in drug discovery. Most computational approaches with deep learning techniques either focus on designing novel molecular representation or combining with some advanced models together. However, researchers pay fewer attention to the potential benefits in massive unlabeled molecular data (e.g., ZINC). This task becomes increasingly challenging owing to the limitation of the scale of labeled data. Motivated by the recent advancements of pretrained models in natural language processing, the drug molecule can be naturally viewed as language to som
30

Khilji, Muhammad Danial. "Features Matching using Natural Language Processing." International Journal on Cybernetics & Informatics 12, no. 2 (2023): 251–60. http://dx.doi.org/10.5121/ijci.2023.120218.

Abstract:
Feature matching is a basic step in matching different datasets. This article proposes a new hybrid model in which a pretrained Natural Language Processing (NLP) based model called BERT is used in parallel with a statistical model based on Jaccard similarity to measure the similarity between lists of features from two different datasets. This reduces the time required to search for correlations or manually match each feature from one dataset to another.
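A minimal sketch of the hybrid scoring idea described in this abstract, assuming the Hugging Face transformers library: candidate feature-name pairs are scored by the cosine similarity of mean-pooled BERT embeddings and, in parallel, by token-level Jaccard similarity. The equal weighting and the underscore-based tokenization are assumptions made for illustration.

```python
# Hybrid feature-matching score: BERT embedding similarity + Jaccard overlap.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    with torch.no_grad():
        out = enc(**tok(text, return_tensors="pt")).last_hidden_state
    return out.mean(dim=1).squeeze(0)            # mean-pooled sentence vector

def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(sa & sb) / len(sa | sb)

def hybrid_score(a: str, b: str, alpha: float = 0.5) -> float:
    cos = torch.cosine_similarity(embed(a), embed(b), dim=0).item()
    return alpha * cos + (1 - alpha) * jaccard(a, b)

print(hybrid_score("customer_birth_date", "client_date_of_birth"))
```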
31

Zhu, Fangqi, Jun Gao, Changlong Yu, et al. "A Generative Approach for Script Event Prediction via Contrastive Fine-Tuning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (2023): 14056–64. http://dx.doi.org/10.1609/aaai.v37i11.26645.

Abstract:
Script event prediction aims to predict the subsequent event given the context. This requires the capability to infer the correlations between events. Recent works have attempted to improve event correlation reasoning by using pretrained language models and incorporating external knowledge (e.g., discourse relations). Though promising results have been achieved, some challenges still remain. First, the pretrained language models adopted by current works ignore event-level knowledge, resulting in an inability to capture the correlations between events well. Second, modeling correlations between
32

Nooralahzadeh, Farhad, and Rico Sennrich. "Improving the Cross-Lingual Generalisation in Visual Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (2023): 13419–27. http://dx.doi.org/10.1609/aaai.v37i11.26574.

Abstract:
While several benefits were realized for multilingual vision-language pretrained models, recent benchmarks across various tasks and languages showed poor cross-lingual generalisation when multilingually pre-trained vision-language models are applied to non-English data, with a large gap between (supervised) English performance and (zero-shot) cross-lingual transfer. In this work, we explore the poor performance of these models on a zero-shot cross-lingual visual question answering (VQA) task, where models are fine-tuned on English visual-question data and evaluated on 7 typologically diverse l
33

Shu, Peng, and Sun Cuiqin. "A Statistical English Syntax Analysis Model Based on Linguistic Evaluation Information." Security and Communication Networks 2022 (July 30, 2022): 1–7. http://dx.doi.org/10.1155/2022/3766417.

Abstract:
Language evaluation research currently focuses on the analysis of scholars from various native language backgrounds, whereas the local grammatical characteristics of other groups, particularly English language learners, are discussed less frequently. Local grammar offers a new perspective for analyzing the meaning characteristics of evaluation languages from the point of view of the people who employ them. In order to provide context for this paper, past research on local syntax is reviewed. The language model generates text that can be analyzed to determine the model’s aggressiveness when per
34

Zhang, Dongqiu, and Wenkui Li. "An Improved Math Word Problem (MWP) Model Using Unified Pretrained Language Model (UniLM) for Pretraining." Computational Intelligence and Neuroscience 2022 (July 14, 2022): 1–9. http://dx.doi.org/10.1155/2022/7468286.

Abstract:
Natural Language Understanding (NLU) and Natural Language Generation (NLG) are the general methods that support machine understanding of text content. They play a very important role in the text information processing system including recommendation and question and answer systems. There are many researches in the field of NLU such as Bag of words, N-Gram, and neural network language model. These models have achieved a good performance in NLU and NLG tasks. However, since they require lots of training data, it is difficult to obtain rich data in practical applications. Thus, pretraining become
35

Yang, Jinyu, Ruijia Wang, Cheng Yang, et al. "Harnessing Language Model for Cross-Heterogeneity Graph Knowledge Transfer." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 12 (2025): 13026–34. https://doi.org/10.1609/aaai.v39i12.33421.

Abstract:
Heterogeneous graphs (HGs) that contain various node and edge types are ubiquitous in real-world scenarios. Considering the common label sparsity problem in HGs, some researchers propose to pretrain on source HGs to extract general knowledge and then fine-tune on a target HG for knowledge transfer. However, existing methods often assume that source and target HGs share a single heterogeneity, meaning that they have the same types of nodes and edges, which contradicts the real-world scenarios requiring cross-heterogeneity transfer. Although a recent study has made some preliminary attempts in c
36

Kim, Boeun, Dohaeng Lee, Damrin Kim, et al. "Generative Model Using Knowledge Graph for Document-Grounded Conversations." Applied Sciences 12, no. 7 (2022): 3367. http://dx.doi.org/10.3390/app12073367.

Abstract:
Document-grounded conversation (DGC) is a natural language generation task to generate fluent and informative responses by leveraging dialogue history and document(s). Recently, DGCs have focused on fine-tuning using pretrained language models. However, these approaches have a problem in that they must leverage the background knowledge under capacity constraints. For example, the maximum length of the input is limited to 512 or 1024 tokens. This problem is fatal in DGC because most documents are longer than the maximum input length. To address this problem, we propose a document-grounded gener
37

Yu, Hyunwook, Yejin Cho, Geunchul Park, and Mucheol Kim. "KRongBERT: Enhanced factorization-based morphological approach for the Korean pretrained language model." Information Processing & Management 62, no. 3 (2025): 104072. https://doi.org/10.1016/j.ipm.2025.104072.

38

Zhang, Weihong, Fan Hu, Wang Li, and Peng Yin. "Does protein pretrained language model facilitate the prediction of protein–ligand interaction?" Methods 219 (November 2023): 8–15. http://dx.doi.org/10.1016/j.ymeth.2023.08.016.

39

Lee, Jaeseong, Dohyeon Lee, and Seung-won Hwang. "Script, Language, and Labels: Overcoming Three Discrepancies for Low-Resource Language Specialization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (2023): 13004–13. http://dx.doi.org/10.1609/aaai.v37i11.26528.

Abstract:
Although multilingual pretrained models (mPLMs) enabled support of various natural language processing in diverse languages, its limited coverage of 100+ languages lets 6500+ languages remain ‘unseen’. One common approach for an unseen language is specializing the model for it as target, by performing additional masked language modeling (MLM) with the target language corpus. However, we argue that, due to the discrepancy from multilingual MLM pretraining, a naive specialization as such can be suboptimal. Specifically, we pose three discrepancies to overcome. Script and linguistic discrepancy o
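The naive specialization baseline the abstract refers to, continued masked language modeling on a target-language corpus starting from a multilingual checkpoint, can be sketched as follows. The xlm-roberta-base checkpoint, the toy corpus, and the hyperparameters are placeholders, and the paper's proposed remedies for the three discrepancies are not shown.

```python
# One step of continued MLM training on a target-language corpus,
# starting from a multilingual pretrained checkpoint.
import torch
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling)

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")
collator = DataCollatorForLanguageModeling(tok, mlm_probability=0.15)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

corpus = ["sentence one in the target language",   # placeholder corpus
          "sentence two in the target language"]
batch = collator([tok(s, truncation=True, max_length=64) for s in corpus])

model.train()
loss = model(**batch).loss      # MLM loss on dynamically masked tokens
loss.backward()
optimizer.step()
print(float(loss))
```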
40

Delgadillo, Josiel, Johnson Kinyua, and Charles Mutigwe. "FinSoSent: Advancing Financial Market Sentiment Analysis through Pretrained Large Language Models." Big Data and Cognitive Computing 8, no. 8 (2024): 87. http://dx.doi.org/10.3390/bdcc8080087.

Abstract:
Predicting the directions of financial markets has been performed using a variety of approaches, and the large volume of unstructured data generated by traders and other stakeholders on social media microblog platforms provides unique opportunities for analyzing financial markets using additional perspectives. Pretrained large language models (LLMs) have demonstrated very good performance on a variety of sentiment analysis tasks in different domains. However, it is known that sentiment analysis is a very domain-dependent NLP task that requires knowledge of the domain ontology, and this is part
41

Guo, Jianyu, Jingnan Chen, Li Ren, Huanlai Zhou, Wenbo Xu, and Haitao Jia. "Constructing Chinese taxonomy trees from understanding and generative pretrained language models." PeerJ Computer Science 10 (October 3, 2024): e2358. http://dx.doi.org/10.7717/peerj-cs.2358.

Abstract:
The construction of hypernym taxonomic trees, a critical task in the field of natural language processing, involves extracting lexical relationships, specifically creating a tree structure that represents hypernym relationships among a given set of words within the same domain. In this work, we present a method for constructing hypernym taxonomy trees in the Chinese language domain, and we named it CHRRM (Chinese Hypernym Relationship Reasoning Model). Our method consists of two main steps: First, we utilize pre-trained models to predict hypernym relationships between pairs of words; second, w
42

Ahuir, Vicent, Lluís-F. Hurtado, José Ángel González, and Encarna Segarra. "NASca and NASes: Two Monolingual Pre-Trained Models for Abstractive Summarization in Catalan and Spanish." Applied Sciences 11, no. 21 (2021): 9872. http://dx.doi.org/10.3390/app11219872.

Abstract:
Most of the models proposed in the literature for abstractive summarization are generally suitable for the English language but not for other languages. Multilingual models were introduced to address that language constraint, but despite their applicability being broader than that of the monolingual models, their performance is typically lower, especially for minority languages like Catalan. In this paper, we present a monolingual model for abstractive summarization of textual content in the Catalan language. The model is a Transformer encoder-decoder which is pretrained and fine-tuned specifi
43

Mallappa, Satishkumar, Dhandra B.V., and Gururaj Mukarambi. "SCRIPT IDENTIFICATION FROM CAMERA CAPTURED INDIAN DOCUMENT IMAGES WITH CNN MODEL." ICTACT Journal on Soft Computing 14, no. 2 (2023): 3232–36. http://dx.doi.org/10.21917/ijsc.2023.0453.

Abstract:
Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and noncontact image capture, which enables many new applications and breathes new life into existing ones, but camera-captured documents may suffer from distortions caused by a nonplanar document shape and perspective projection, which lead to the failure of current optical character recognition (OCR) technologies. This paper presents a new CNN model for script identification from camera-captured Indian multilingual document images. To evaluate the performance of the proposed model 9 regional languages, one n
44

Zhu, Beier, Yulei Niu, Saeil Lee, Minhoe Hur, and Hanwang Zhang. "Debiased Fine-Tuning for Vision-Language Models by Prompt Regularization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (2023): 3834–42. http://dx.doi.org/10.1609/aaai.v37i3.25496.

Abstract:
We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstream task, dubbed Prompt Regularization (ProReg). Different from traditional fine-tuning which easily overfits to the downstream task data, ProReg uses the prediction by prompting the pretrained model to regularize the fine-tuning. The motivation is: by prompting the large model “a photo of a [CLASS]”, the fill-in answer is only dependent on the pretraining encyclopedic knowledge while independent of the task data distribution, which is usually biased. Specifically, given a training sample predict
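The regularization idea in this abstract can be written down generically: add to the task cross-entropy a KL term that pulls the fine-tuned model's predictions toward the frozen pretrained model's prompt-based ("a photo of a [CLASS]") predictions. The sketch below uses random tensors as stand-ins for real model outputs, and the fixed weighting coefficient is a simplification of ProReg's sample-wise scheme.

```python
# Generic prompt-regularized fine-tuning loss: task CE + KL to the
# frozen model's zero-shot, prompt-based predictions.
import torch
import torch.nn.functional as F

def prompt_regularized_loss(finetuned_logits, prompted_logits, labels, lam=0.5):
    ce = F.cross_entropy(finetuned_logits, labels)
    kl = F.kl_div(
        F.log_softmax(finetuned_logits, dim=-1),
        F.softmax(prompted_logits.detach(), dim=-1),   # frozen prompted teacher
        reduction="batchmean",
    )
    return ce + lam * kl

finetuned = torch.randn(8, 10, requires_grad=True)   # downstream model logits
prompted = torch.randn(8, 10)                        # zero-shot prompted logits
labels = torch.randint(0, 10, (8,))
print(float(prompt_regularized_loss(finetuned, prompted, labels)))
```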
45

Chen, Guanlin, Zhao Cheng, Qi Lu, Wenyong Weng, and Wujian Yang. "Named Entity Recognition of Hazardous Chemical Risk Information Based on Multihead Self-Attention Mechanism and BERT." Wireless Communications and Mobile Computing 2022 (July 7, 2022): 1–8. http://dx.doi.org/10.1155/2022/8300672.

Abstract:
An approach based on self-attention mechanism and pretrained model BERT is proposed to solve the problems of entity recognition and relationship recognition of hazardous chemical risk information. The text of hazardous chemical risk information is coded at the character level by adding the pretrained language model BERT, which, when paired with a multihead self-attention mechanism, improves the ability to mine global and local aspects of texts. The experimental results show that the model’s F1 value is 94.57 percent, which is significantly higher than that of other standard models.
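A simplified version of the architecture this abstract describes, assuming the Hugging Face transformers library: character-level token encodings from a pretrained BERT are refined by an additional multi-head self-attention layer and fed to a per-token tag classifier. The bert-base-chinese checkpoint, head count, and tag count are assumptions made for illustration.

```python
# BERT encoder + extra multi-head self-attention + per-token NER classifier.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertSelfAttnNER(nn.Module):
    def __init__(self, num_tags: int, model_name: str = "bert-base-chinese"):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        dim = self.bert.config.hidden_size
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(dim, num_tags)

    def forward(self, input_ids, attention_mask):
        h = self.bert(input_ids=input_ids,
                      attention_mask=attention_mask).last_hidden_state
        pad_mask = attention_mask == 0                  # True where padded
        h, _ = self.attn(h, h, h, key_padding_mask=pad_mask)
        return self.classifier(h)                       # (batch, seq, num_tags)

tok = AutoTokenizer.from_pretrained("bert-base-chinese")
model = BertSelfAttnNER(num_tags=9)
enc = tok(["氢氟酸具有强腐蚀性"], return_tensors="pt")
print(model(enc["input_ids"], enc["attention_mask"]).shape)
```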
46

Joshi, Herat, and Shenson Joseph. "ULMFiT: Universal Language Model Fine-Tuning for Text Classification." International Journal of Advanced Medical Sciences and Technology 4, no. 6 (2024): 1–9. http://dx.doi.org/10.54105/ijamst.e3049.04061024.

Abstract:
While inductive transfer learning has revolutionized computer vision, current approaches to natural language processing still need training from the ground up and task-specific adjustments. As a powerful transfer learning approach applicable to any NLP activity, we provide Universal Language Model Fine-tuning (ULMFiT) and outline essential strategies for language model fine-tuning. With an error reduction of 18–24% on most datasets, our technique considerably surpasses the state-of-the-art on six text categorization tasks. Additionally, it achieves the same level of performance as training on
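Two of ULMFiT's ingredients, discriminative fine-tuning (lower layers get smaller learning rates) and gradual unfreezing, can be illustrated on a toy stack of layer groups. The 2.6 divisor follows the paper, while the four-block stand-in model and the epoch loop are illustrative only; slanted triangular learning rates are omitted.

```python
# Discriminative learning rates and gradual unfreezing on a toy layer stack.
import torch
import torch.nn as nn

blocks = nn.ModuleList([nn.Linear(32, 32) for _ in range(4)])  # stand-in layer groups
base_lr, decay = 1e-3, 2.6                                     # lr divided by 2.6 per lower layer

param_groups = [
    {"params": block.parameters(), "lr": base_lr / (decay ** depth)}
    for depth, block in enumerate(reversed(blocks))            # depth 0 = top block
]
optimizer = torch.optim.AdamW(param_groups)

# Gradual unfreezing: start with everything frozen, then unfreeze one more
# block (from the top down) at the start of each epoch.
for p in blocks.parameters():
    p.requires_grad = False
for epoch, block in enumerate(reversed(blocks)):
    for p in block.parameters():
        p.requires_grad = True
    trainable = sum(p.requires_grad for p in blocks.parameters())
    print(f"epoch {epoch}: {trainable} parameter tensors trainable")
```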
47

Herat, Joshi. "ULMFiT: Universal Language Model Fine-Tuning for Text Classification." International Journal of Advanced Medical Sciences and Technology (IJAMST) 4, no. 6 (2024): 1–9. https://doi.org/10.54105/ijamst.E3049.04061024/.

Abstract:
While inductive transfer learning has revolutionized computer vision, current approaches to natural language processing still need training from the ground up and task-specific adjustments. As a powerful transfer learning approach applicable to any NLP activity, we provide Universal Language Model Fine-tuning (ULMFiT) and outline essential strategies for language model fine-tuning. With an error reduction of 18–24% on most datasets, our technique considerably surpasses the state-of-the-art on six text categorization tasks. Additionally, it achieves the same lev
48

Budige, Usharani, and Srikar Goud Konda. "Text To Image Generation By Using Stable Diffusion Model With Variational Autoencoder Decoder." International Journal for Research in Applied Science and Engineering Technology 11, no. 10 (2023): 514–19. http://dx.doi.org/10.22214/ijraset.2023.56024.

Abstract:
Abstract: Imagen is a text-to-image diffusion model with a profound comprehension of language and an unmatched level of photorealism. Imagen relies on the potency of diffusion models for creating high-fidelity images and draws on the strength of massive transformer language models for comprehending text. Our most important finding is that general large language models, like T5, pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis: expanding the language model in Imagen improves sample fidelity and image to text alignment much more than expanding the
49

Alrashidi, Bedour, Amani Jamal, and Ali Alkhathlan. "Abusive Content Detection in Arabic Tweets Using Multi-Task Learning and Transformer-Based Models." Applied Sciences 13, no. 10 (2023): 5825. http://dx.doi.org/10.3390/app13105825.

Abstract:
Different social media platforms have become increasingly popular in the Arab world in recent years. The increasing use of social media, however, has also led to the emergence of a new challenge in the form of abusive content, including hate speech, offensive language, and abusive language. Existing research work focuses on automatic abusive content detection as a binary classification problem. In addition, the existing research work on the automatic detection task surrounding abusive Arabic content fails to tackle the dialect-specific phenomenon. Consequently, this has led to two important is
50

Jiang, Peihai, Xixiang Lyu, Yige Li, and Jing Ma. "Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 23 (2025): 24285–93. https://doi.org/10.1609/aaai.v39i23.34605.

Abstract:
Supervised fine-tuning has become the predominant method for adapting large pretrained models to downstream tasks. However, recent studies have revealed that these models are vulnerable to backdoor attacks, where even a small number of malicious samples can successfully embed backdoor triggers into the model. While most existing defense methods focus on post-training backdoor defense, efficiently defending against backdoor attacks during training phase remains largely unexplored. To address this gap, we propose a novel defense method called Backdoor Token Unlearning (BTU), which proactively de