Journal articles on the topic 'BERT model'

Consult the top 50 journal articles for your research on the topic 'BERT model.'

1

Yu, Daegon, Yongyeon Kim, Sangwoo Han, and Byung-Won On. "CLES-BERT: Contrastive Learning-based BERT Model for Automated Essay Scoring." Journal of Korean Institute of Information Technology 21, no. 4 (2023): 31–43. http://dx.doi.org/10.14801/jkiit.2023.21.4.31.

2

Cui, Hongyang, Chentao Wang, and Yibo Yu. "News Short Text Classification Based on Bert Model and Fusion Model." Highlights in Science, Engineering and Technology 34 (February 28, 2023): 262–68. http://dx.doi.org/10.54097/hset.v34i.5482.

Abstract:
Text classification is one of the most fundamental tasks in NLP, and the classification of short news texts can serve as the basis for many other tasks. In this paper, we applied a fusion model combining BERT and TextRNN, with some modified details, to achieve higher text classification accuracy. We used THUCNews as the dataset, which consists of two columns, one for the news text and the other for the numeric label. The original dataset was separated into three parts: training set, validation set, and test set. We used the BERT model, which contains two pre-training tasks, and the TextRNN model, which applies an RNN to text classification. We trained the two models in parallel; the optimal BERT and TextRNN models obtained through training and parameter tuning were then combined through a fully-connected layer that weights the outputs of BERT and TextRNN to produce the final results. The fusion model mitigates the over-fitting and under-fitting of a single model and yields better generalization performance. The experimental results show the change in loss and accuracy as well as the final accuracy of the BERT model; precision, recall, and F1-score are also evaluated. The accuracy of the fusion model of BERT and TextRNN is clearly better than that of the single BERT model, with a gap of 1.76%.
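As a rough illustration of the weighted fusion step described above, the sketch below combines the class logits of a fine-tuned BERT classifier and a TextRNN classifier through a small fully-connected layer. The paper does not publish its code; all names, dimensions, and the ten-class setup are assumptions.

```python
# Hypothetical sketch of a BERT + TextRNN fusion head: the two models are
# trained separately, and their class logits are combined through a learned
# fully-connected layer. Dimensions and names are illustrative only.
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        # Learns how to weight the two models' predictions per class.
        self.fc = nn.Linear(2 * num_classes, num_classes)

    def forward(self, bert_logits: torch.Tensor, rnn_logits: torch.Tensor) -> torch.Tensor:
        return self.fc(torch.cat([bert_logits, rnn_logits], dim=-1))

# Dummy logits for a 10-class news categorisation task (THUCNews-style).
head = FusionHead(num_classes=10)
bert_logits = torch.randn(4, 10)   # from a fine-tuned BERT classifier
rnn_logits = torch.randn(4, 10)    # from a trained TextRNN classifier
print(head(bert_logits, rnn_logits).shape)  # torch.Size([4, 10])
```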
3

Wen, Yu, Yezhang Liang, and Xinhua Zhu. "Sentiment analysis of hotel online reviews using the BERT model and ERNIE model—Data from China." PLOS ONE 18, no. 3 (2023): e0275382. http://dx.doi.org/10.1371/journal.pone.0275382.

Abstract:
The emotion analysis of hotel online reviews is discussed by using the neural network model BERT, which proves that this method can not only help hotel network platforms fully understand customer needs but also help customers find suitable hotels according to their needs and affordability and help hotel recommendations be more intelligent. Therefore, using the pretraining BERT model, a number of emotion analytical experiments were carried out through fine-tuning, and a model with high classification accuracy was obtained by frequently adjusting the parameters during the experiment. The BERT layer was taken as a word vector layer, and the input text sequence was used as the input to the BERT layer for vector transformation. The output vectors of BERT passed through the corresponding neural network and were then classified by the softmax activation function. ERNIE is an enhancement of the BERT layer. Both models can lead to good classification results, but the latter performs better. ERNIE exhibits stronger classification and stability than BERT, which provides a promising research direction for the field of tourism and hotels.
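A minimal sketch of the pipeline this abstract outlines, assuming a publicly available Chinese BERT checkpoint and a binary positive/negative label set (neither is specified by the paper): BERT acts as the word-vector layer, its [CLS] output passes through a linear layer, and softmax yields class probabilities.

```python
# Hedged sketch only; the classifier head is untrained here and would be
# fine-tuned on labelled hotel reviews in practice.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")
classifier = nn.Linear(bert.config.hidden_size, 2)    # positive / negative

inputs = tokenizer("房间干净，服务很好", return_tensors="pt")
with torch.no_grad():
    cls_vec = bert(**inputs).last_hidden_state[:, 0]  # [CLS] vector from the BERT layer
probs = torch.softmax(classifier(cls_vec), dim=-1)    # softmax over the two classes
print(probs)
```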
4

S, Sushma, Sasmita Kumari Nayak, and M. Vamsi Krishna. "Enhanced toxic comment detection model through Deep Learning models using Word embeddings and transformer architectures." Future Technology 4, no. 3 (2025): 76–84. https://doi.org/10.55670/fpll.futech.4.3.8.

Abstract:
The proliferation of harmful and toxic comments on social media platforms necessitates the development of robust methods for automatically detecting and classifying such content. This paper investigates the application of natural language processing (NLP) and ML techniques for toxic comment classification using the Jigsaw Toxic Comment Dataset. Several deep learning models, including recurrent neural networks (RNN, LSTM, and GRU), are evaluated in combination with feature extraction methods such as TF-IDF, Word2Vec, and BERT embeddings. The text data is pre-processed using both Word2Vec and TF-IDF techniques for feature extraction. Rather than implementing a combined ensemble output, the study conducts a comparative evaluation of model-embedding combinations to determine the most effective pairings. Results indicate that integrating BERT with traditional models (RNN+BERT, LSTM+BERT, GRU+BERT) leads to significant improvements in classification accuracy, precision, recall, and F1-score, demonstrating the effectiveness of BERT embeddings in capturing nuanced text features. Among all configurations, LSTM combined with Word2Vec and LSTM with BERT yielded the highest performance. This comparative approach highlights the potential of combining classical recurrent models with transformer-based embeddings as a promising direction for detecting toxic comments. The findings of this work provide valuable insights into leveraging deep learning techniques for toxic comment detection, suggesting future directions for refining such models in real-world applications.
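To make one of the model-embedding pairings concrete, here is a hedged sketch (not the authors' code) of an LSTM+BERT combination: frozen BERT token embeddings feed an LSTM whose final hidden state is mapped to the six Jigsaw toxicity labels, with a sigmoid for multi-label output. The checkpoint and hidden size are assumptions.

```python
# Illustrative sketch, not the paper's implementation.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

class LstmOverBert(nn.Module):
    def __init__(self, hidden=128, num_labels=6):
        super().__init__()
        self.lstm = nn.LSTM(bert.config.hidden_size, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():                      # BERT used as a frozen feature extractor
            emb = bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        _, (h, _) = self.lstm(emb)
        return torch.sigmoid(self.head(h[-1]))     # per-label probabilities

model = LstmOverBert()
batch = tokenizer(["you are awful", "have a nice day"], padding=True, return_tensors="pt")
print(model(batch["input_ids"], batch["attention_mask"]).shape)  # torch.Size([2, 6])
```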
5

Okolo, Omachi, B. Y. Baha, and M. D. Philemon. "Using Causal Graph Model variable selection for BERT models Prediction of Patient Survival in a Clinical Text Discharge Dataset." Journal of Future Artificial Intelligence and Technologies 1, no. 4 (2025): 455–73. https://doi.org/10.62411/faith.3048-3719-61.

Abstract:
Feature selection in most black-box machine learning algorithms, such as BERT, is based on the correlations between features and the target variable rather than causal relationships in the dataset. This makes their predictive power and decisions questionable because of their potential bias. This paper presents novel BERT models that learn from causal variables in a clinical discharge dataset. The causal directed acyclic graphs (DAG) identify input variables for patients' survival rate prediction and decisions. The core idea behind our model lies in the ability of the BERT-based model to learn from the causal DAG semi-synthetic dataset, enabling it to model the underlying causal structure accurately instead of the generic spurious correlations devoid of causation. The results from Causal DAG Conditional Independence Test (CIT) validation metrics showed that the conceptual assumptions of the causal DAG were supported, the Pearson correlation coefficient ranges between -1 and 1, the p-value was (>0.05), and the confidence intervals of 95% and 25% were satisfied. We further mapped the semi-synthetic dataset that evolved from the Causal DAG to three BERT models. Two metrics, prediction accuracy and AUC score, were used to compare the performance of the BERT models. The accuracy of the BERT models showed that the regular BERT has a performance of 96%, while ClinicalBERT performance was 90%, and ClinicalBERT-Discharge-summary was 92%. On the other hand, the AUC score for BERT was 79%, ClinicalBERT was 77%, while ClinicalBERT-discharge summary was 84%. Our experiments on the synthetic dataset for the patient's survival rate from the causal DAG datasets demonstrate high predictive performance and explainable input variables for human understanding to justify prediction.
6

Zhao, Lanxin, Wanrong Gao, and Jianbin Fang. "Optimizing Large Language Models on Multi-Core CPUs: A Case Study of the BERT Model." Applied Sciences 14, no. 6 (2024): 2364. http://dx.doi.org/10.3390/app14062364.

Abstract:
The BERT model is regarded as the cornerstone of various pre-trained large language models that have achieved promising results in recent years. This article investigates how to optimize the BERT model in terms of fine-tuning speed and prediction accuracy, aiming to accelerate the execution of the BERT model on a multi-core processor and improve its prediction accuracy in typical downstream natural language processing tasks. Our contributions are two-fold. First, we port and parallelize the fine-tuning training of the BERT model on a multi-core shared-memory processor. We port the BERT model onto a multi-core processor platform to accelerate the fine-tuning training process of the model for downstream tasks. Second, we improve the prediction performance of typical downstream natural language processing tasks through fine-tuning the model parameters. We select five typical downstream natural language processing tasks (CoLA, SST-2, MRPC, RTE, and WNLI) and perform optimization on the multi-core platform, taking the hyperparameters of batch size, learning rate, and training epochs into account. Our experimental results show that, by increasing the number of CPUs and the number of threads, the model training time can be significantly reduced. We observe that the reduced time is primarily concentrated in the self-attention mechanism. Our further experimental results show that setting reasonable hyperparameters can improve the accuracy of the BERT model when applied to downstream tasks and that appropriately increasing the batch size under conditions of sufficient computing resources can significantly reduce training time.
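As an illustration of the CPU-side knobs the study varies (thread count, batch size, learning rate, epochs), the configuration sketch below uses PyTorch thread settings and Hugging Face TrainingArguments; the concrete values are placeholders rather than the paper's settings.

```python
# Assumed configuration sketch; not the authors' setup.
import torch
from transformers import AutoModelForSequenceClassification, TrainingArguments

torch.set_num_interop_threads(2)     # parallelism between independent ops (set before parallel work starts)
torch.set_num_threads(16)            # CPU threads for intra-op parallelism

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
args = TrainingArguments(
    output_dir="bert-cpu-finetune",
    per_device_train_batch_size=32,  # larger batches can shorten wall-clock time if memory allows
    learning_rate=2e-5,
    num_train_epochs=3,
    no_cuda=True,                    # force CPU execution
)
```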
7

Manuel-Ilie, Dorca, Pitic Antoniu Gabriel, and Crețulescu Radu George. "Sentiment Analysis Using Bert Model." International Journal of Advanced Statistics and IT&C for Economics and Life Sciences 13, no. 1 (2023): 59–66. http://dx.doi.org/10.2478/ijasitels-2023-0007.

Abstract:
The topic of this presentation entails a comprehensive investigation of our sentiment analysis algorithm. The document provides a thorough examination of its theoretical underpinnings, meticulous assessment criteria, consequential findings, and an enlightening comparative analysis. Our system makes a substantial contribution to the field of sentiment analysis by using advanced techniques based on deep learning and state-of-the-art architectures.
8

Said, Fadillah, and Lindung Parningotan Manik. "Aspect-Based Sentiment Analysis on Indonesian Presidential Election Using Deep Learning." Paradigma - Jurnal Komputer dan Informatika 24, no. 2 (2022): 160–67. http://dx.doi.org/10.31294/paradigma.v24i2.1415.

Abstract:
The 2019 presidential election was a hot topic of conversation for some time; people had been discussing it on the internet since 2018. To predict the winner of the presidential election, previous work studied an aspect-based sentiment analysis (ABSA) dataset for the 2019 presidential election using machine learning algorithms such as Support Vector Machine (SVM), Naive Bayes (NB), and K-Nearest Neighbors (KNN), achieving reasonably good accuracy. This study proposes a deep learning method using the BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa (A Robustly Optimized BERT Pretraining Approach) models. The results show that the indobenchmark BERT and RoBERTa base-indonesian single-label classification models on the target feature with preprocessing achieved the best accuracy of 98.02%. The indolem and indobenchmark BERT single-label classification models on the target feature without preprocessing achieved the best accuracy of 98.02%. The indobenchmark BERT single-label classification model on the aspect feature with preprocessing achieved the best accuracy of 74.26%. The indolem BERT single-label classification model on the aspect feature without preprocessing achieved the best accuracy of 74.26%. The indolem BERT single-label classification model on the sentiment feature with preprocessing achieved the best accuracy of 93.07%. The indolem BERT single-label classification model on the sentiment feature without preprocessing achieved the best accuracy of 94.06%. The indobenchmark BERT multi-label classification model with preprocessing achieved the best accuracy of 98.66%. The indobenchmark BERT multi-label classification model without preprocessing achieved the best accuracy of 98.66%.
9

Fu, Guanping, and Jianwei Sun. "Chinese text multi-classification based on Sentences Order Prediction improved Bert model." Journal of Physics: Conference Series 2031, no. 1 (2021): 012054. http://dx.doi.org/10.1088/1742-6596/2031/1/012054.

Abstract:
To address the strong noise interference that the NSP (Next Sentence Prediction) mechanism in BERT introduces into the model, and to improve the classification performance of BERT when it is used for text classification, an SOP (Sentence Order Prediction) mechanism is used to replace the NSP mechanism in a BERT model for multi-class classification of Chinese news texts. First, randomly ordered adjacent sentence pairs are used for segment embedding. Then the Transformer structure of the BERT model encodes the Chinese text, and the final CLS vector is obtained as the semantic vector of the text. Finally, the semantic vectors are fed to the multi-class classifier. In ablation experiments, the improved SOP-BERT model obtained the highest F1 value of 96.69. The results show that this model is more effective than the original BERT model on multi-class text classification problems.
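To make the SOP idea concrete, here is a hedged sketch of how order-prediction training pairs can be built (the paper's own data pipeline is not described in the abstract): the negative example swaps two adjacent sentences from the same document, rather than pairing a sentence with a randomly sampled one as NSP does.

```python
# Assumed helper for building SOP examples; names are illustrative.
import random

def make_sop_example(sent_a: str, sent_b: str):
    """Return (first, second, label): label 1 = original order, 0 = swapped."""
    if random.random() < 0.5:
        return sent_a, sent_b, 1
    return sent_b, sent_a, 0

first, second, label = make_sop_example("北京发布新的交通政策。", "市民对此反应不一。")
# The pair is then encoded as "[CLS] first [SEP] second [SEP]" with segment
# embeddings 0/1, and the [CLS] vector is used for the order classification.
print(first, second, label)
```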
10

Mannix, Ilma Alpha, and Evi Yulianti. "Academic expert finding using BERT pre-trained language model." International Journal of Advances in Intelligent Informatics 10, no. 2 (2024): 280. http://dx.doi.org/10.26555/ijain.v10i2.1497.

Abstract:
Academic expert finding has numerous advantages, such as finding paper reviewers, research collaboration, and enhancing knowledge transfer. In particular, for research collaboration, researchers tend to seek collaborators who share similar backgrounds or the same native language. Despite its importance, academic expert finding remains relatively unexplored in the context of the Indonesian language. Recent studies have primarily relied on static word embedding techniques such as Word2Vec to match documents with relevant expertise areas. However, Word2Vec is unable to capture the varying meanings of words in different contexts. To address this research gap, this study employs Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art contextual embedding model. This paper aims to examine the effectiveness of BERT on the task of academic expert finding. The proposed model in this research consists of three variations of BERT, namely IndoBERT (Indonesian BERT), mBERT (Multilingual BERT), and SciBERT (Scientific BERT), which are compared to a static embedding model using Word2Vec. Two approaches were employed to rank experts using the BERT variations: feature-based and fine-tuning. We found that the IndoBERT model outperforms the baseline by 6–9% when utilizing the feature-based approach and shows an improvement of 10–18% with the fine-tuning approach. Our results show that the fine-tuning approach performs better than the feature-based approach, with an improvement of 1–5%. In conclusion, by using IndoBERT, this research demonstrates improved effectiveness for academic expert finding in the context of the Indonesian language.
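A small sketch of what a feature-based approach of this kind can look like, under assumptions not taken from the paper: the [CLS] embedding of a query and of each expert's document are compared by cosine similarity and experts are ranked accordingly. "indobenchmark/indobert-base-p1" is one public IndoBERT checkpoint, used here purely for illustration.

```python
# Hedged sketch of feature-based expert ranking; data and names are made up.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("indobenchmark/indobert-base-p1")
enc = AutoModel.from_pretrained("indobenchmark/indobert-base-p1")

def embed(text: str) -> torch.Tensor:
    with torch.no_grad():
        out = enc(**tok(text, return_tensors="pt", truncation=True))
    return out.last_hidden_state[:, 0].squeeze(0)      # [CLS] vector

query = embed("pakar pemrosesan bahasa alami")
experts = {"A": embed("penelitian tentang NLP bahasa Indonesia"),
           "B": embed("studi geologi dan vulkanologi")}
ranked = sorted(experts,
                key=lambda k: -torch.cosine_similarity(query, experts[k], dim=0).item())
print(ranked)  # expert "A" should rank first
```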
11

Xu, Huatao, Pengfei Zhou, Rui Tan, Mo Li, and Guobin Shen. "LIMU-BERT." GetMobile: Mobile Computing and Communications 26, no. 3 (2022): 39–42. http://dx.doi.org/10.1145/3568113.3568124.

Abstract:
Deep learning greatly empowers Inertial Measurement Unit (IMU) sensors for a wide range of sensing applications. Most existing works require substantial amounts of well-curated labeled data to train IMU-based sensing models, which incurs high annotation and training costs. Compared with labeled data, unlabeled IMU data are abundant and easily accessible. This article presents a novel representation learning model that can make use of unlabeled IMU data and extract generalized rather than task-specific features. With the representations learned via our model, task-specific models trained with limited labeled samples can achieve superior performances in typical IMU sensing applications, such as Human Activity Recognition (HAR).
12

Maryanto, Maryanto, Philips Philips, and Abba Suganda Girsang. "Hybrid model for extractive single document summarization: utilizing BERTopic and BERT model." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 2 (2024): 1723. http://dx.doi.org/10.11591/ijai.v13.i2.pp1723-1731.

Abstract:
Extractive text summarization has been a popular research area for many years. The goal of this task is to generate a compact and coherent summary of a given document, preserving the most important information. However, current extractive summarization methods still face several challenges such as semantic drift, repetition, redundancy, and lack of coherence. A novel approach is presented in this paper to improve the performance of an extractive summarization model based on bidirectional encoder representations from transformers (BERT) by incorporating topic modeling using the BERTopic model. Our method first utilizes BERTopic to identify the dominant topics in a document and then employs a BERT-based deep neural network to extract the most salient sentences related to those topics. Our experiments on the cable news network (CNN)/daily mail dataset demonstrate that our proposed method outperforms state-of-the-art BERT-based extractive summarization models in terms of recall-oriented understudy for gisting evaluation (ROUGE) scores, which resulted in an increase of 32.53% of ROUGE-1, 47.55% of ROUGE-2, and 16.63% of ROUGE-L when compared to baseline BERT-based extractive summarization models. This paper contributes to the field of extractive text summarization, highlights the potential of topic modeling in improving summarization results, and provides a new direction for future research.
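The paper's exact BERTopic-plus-BERT pipeline is not reproduced here; as a rough, assumption-laden stand-in for scoring sentences against a document-level representation, the sketch below ranks sentences by cosine similarity between each sentence's [CLS] embedding and the mean of all sentence embeddings, then keeps the most central ones as an extractive summary.

```python
# Simplified illustration only; the checkpoint and example sentences are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def cls_embedding(sentence: str) -> torch.Tensor:
    with torch.no_grad():
        out = enc(**tok(sentence, return_tensors="pt", truncation=True))
    return out.last_hidden_state[:, 0].squeeze(0)

sentences = [
    "The central bank raised interest rates on Tuesday.",
    "Analysts had widely expected the decision.",
    "In unrelated news, a local bakery won a pastry award.",
]
embs = torch.stack([cls_embedding(s) for s in sentences])
doc = embs.mean(dim=0, keepdim=True)          # crude document-level representation
scores = torch.cosine_similarity(embs, doc)   # salience score per sentence
top = scores.argsort(descending=True)[:2]     # keep the two most central sentences
print([sentences[i] for i in sorted(top.tolist())])
```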
13

Maryanto, Maryanto, Philips Philips, and Suganda Girsang Abba. "Hybrid model for extractive single document summarization: utilizing BERTopic and BERT model." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 2 (2024): 1723–31. https://doi.org/10.11591/ijai.v13.i2.pp1723-1731.

Abstract:
Extractive text summarization has been a popular research area for many years. The goal of this task is to generate a compact and coherent summary of a given document, preserving the most important information. However, current extractive summarization methods still face several challenges such as semantic drift, repetition, redundancy, and lack of coherence. A novel approach is presented in this paper to improve the performance of an extractive summarization model based on bidirectional encoder representations from transformers (BERT) by incorporating topic modeling using the BERTopic model. Our method first utilizes BERTopic to identify the dominant topics in a document and then employs a BERT-based deep neural network to extract the most salient sentences related to those topics. Our experiments on the cable news network (CNN)/daily mail dataset demonstrate that our proposed method outperforms state-of-the-art BERT-based extractive summarization models in terms of recall-oriented understudy for gisting evaluation (ROUGE) scores, which resulted in an increase of 32.53% of ROUGE-1, 47.55% of ROUGE-2, and 16.63% of ROUGE-L when compared to baseline BERT-based extractive summarization models. This paper contributes to the field of extractive text summarization, highlights the potential of topic modeling in improving summarization results, and provides a new direction for future research.
14

Arefeva, Veronika, and Roman Egger. "When BERT Started Traveling: TourBERT—A Natural Language Processing Model for the Travel Industry." Digital 2, no. 4 (2022): 546–59. http://dx.doi.org/10.3390/digital2040030.

Abstract:
In recent years, Natural Language Processing (NLP) has become increasingly important for extracting new insights from unstructured text data, and pre-trained language models now have the ability to perform state-of-the-art tasks like topic modeling, text classification, or sentiment analysis. Currently, BERT is the most widespread and widely used model, but it has been shown that a potential to optimize BERT can be applied to domain-specific contexts. While a number of BERT models that improve downstream tasks’ performance for other domains already exist, an optimized BERT model for tourism has yet to be revealed. This study thus aimed to develop and evaluate TourBERT, a pre-trained BERT model for the tourism industry. It was trained from scratch and outperforms BERT-Base in all tourism-specific evaluations. Therefore, this study makes an essential contribution to the growing importance of NLP in tourism by providing an open-source BERT model adapted to tourism requirements and particularities.
15

Yu, Geyang. "An analysis of BERT-based model for Berkshire stock performance prediction using Warren Buffet's letters." Applied and Computational Engineering 52, no. 1 (2024): 55–61. http://dx.doi.org/10.54254/2755-2721/52/20241232.

Abstract:
The objective of this study is to discover and validate effective Bidirectional Encoder Representations from Transformers (BERT)-based models for stock market prediction of Berkshire Hathaway. The stock market is full of uncertainty and dynamism and its prediction has always been a critical challenge in the financial domain. Therefore, accurate predictions of market trends are important for making investment decisions and risk management. The primary approach involves sentiment analysis of reviews on market performance. This work selects Warren Buffett's annual letters to investors and the year-by-year stock market performance of Berkshire Hathaway as the dataset. This work leverages three BERT-based models, which are the BERT-Gated Recurrent Units (BERT-GRU) model, the BERT-Long Short-Term Memory (BERT-LSTM) model, and the BERT-Multi-Head Attention model, to analyse Buffett's annual letters and predict Berkshire Hathaway's stock price changes. After conducting experiments, it can be concluded that all three models have a certain degree of predictive capability, with the BERT-Multi-Head Attention model demonstrating the best predictive performance.
16

Wu, Jun, Tianliang Zhu, Xinli Zheng, and Chunzhi Wang. "Multi-Modal Sentiment Analysis Based on Interactive Attention Mechanism." Applied Sciences 12, no. 16 (2022): 8174. http://dx.doi.org/10.3390/app12168174.

Abstract:
In recent years, multi-modal sentiment analysis has become more and more popular in the field of natural language processing. Multi-modal sentiment analysis mainly concentrates on text, image, and audio information. Previous work based on BERT utilizes only the text representation to fine-tune BERT, ignoring the importance of nonverbal information, and most current methods fine-tune BERT without optimizing its internal structure. Therefore, in this paper, we propose an optimized BERT model composed of three modules: a Hierarchical Multi-head Self-Attention module realizes the hierarchical extraction of features; a Gate Channel module replaces BERT's original feed-forward layer to realize information filtering; and a tensor fusion module based on the self-attention mechanism implements the fusion of the different modal features. On CMU-MOSI, a public multimodal sentiment analysis dataset, accuracy and F1-score improved by 0.44% and 0.46%, respectively, compared with the original BERT model using custom fusion. Compared with traditional models such as LSTM and Transformer, the results are also improved to a certain extent.
17

Jiao, Yuxin, and Li Zhao. "Real-Time Extraction of News Events Based on BERT Model." International Journal of Advanced Network, Monitoring and Controls 9, no. 3 (2024): 24–31. http://dx.doi.org/10.2478/ijanmc-2024-0023.

Abstract:
A large number of news reports are generated every day, and effective information needs to be obtained from these unstructured news texts more efficiently. In this paper, we crawl news data from news websites in real time and propose a BERT-based approach to extract events from long news texts. NetEase News is selected as the example website for real-time extraction, and its news data are crawled. As a pre-trained model based on the bidirectional encoder representation of the Transformer, BERT performs well on natural language understanding and natural language generation tasks. In this study, we fine-tune the BERT model on a news-related corpus and perform sequence annotation through a CRF layer to complete the event extraction task. The DuEE dataset is chosen to train the model, and the experiments show that the overall performance of the BERT model is better than that of other network models. Finally, the model is further optimised using the ALBERT and RoBERTa models, which improve on the BERT model; the results show that both improve over BERT, with the ALBERT model performing best, its F1 value being 1% higher than that of BERT.
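A simplified sketch of the sequence-labelling setup is given below. The paper adds a CRF layer on top of BERT; a plain token-classification head stands in for it here, and the BIO label set is invented for illustration.

```python
# Hedged sketch; the model is untrained and would be fine-tuned on DuEE in practice.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-TRIGGER", "I-TRIGGER"]           # illustrative BIO scheme for event triggers
tok = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForTokenClassification.from_pretrained("bert-base-chinese",
                                                        num_labels=len(labels))

inputs = tok("公司昨日宣布收购计划", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(-1).squeeze(0)
print([labels[i] for i in pred])                    # untrained output; fine-tuning is required
```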
18

Angger Saputra, Revelin, and Yuliant Sibaroni. "Multilabel Hate Speech Classification in Indonesian Political Discourse on X using Combined Deep Learning Models with Considering Sentence Length." Jurnal Ilmu Komputer dan Informasi 18, no. 1 (2025): 113–25. https://doi.org/10.21609/jiki.v18i1.1440.

Abstract:
Hate speech, the public expression of hatred or offensive discourse targeting race, religion, gender, or sexual orientation, is widespread on social media. This study assesses BERT-based models for multi-label hate speech detection, emphasizing how text length impacts model performance. The models tested include BERT, BERT-CNN, BERT-LSTM, BERT-BiLSTM, and BERT with two LSTM layers. Overall, BERT-BiLSTM achieved the highest overall score (82.00%) and the best performance on longer texts (83.20%), highlighting its ability to capture nuanced context. BERT-CNN excelled on shorter texts, scoring 79.80% and 79.10% on the reported metrics, indicating its effectiveness in extracting features from brief content. BERT-LSTM showed balanced performance across text lengths, while BERT-BiLSTM, although strong overall, scored slightly lower on short texts due to its reliance on broader context. These results highlight the importance of model selection based on text characteristics: BERT-BiLSTM is ideal for nuanced analysis of longer texts, while BERT-CNN better captures key features in shorter content.
19

Chen, Xi. "Performance analysis of robustness of BERT model under attack." Journal of Physics: Conference Series 2580, no. 1 (2023): 012022. http://dx.doi.org/10.1088/1742-6596/2580/1/012022.

Abstract:
With the aim of testing the robustness of machine learning models, this paper tests the performance of five classification models on the IMDB dataset. Furthermore, two types of sentence embeddings, generated by word2vec and BERT, are perturbed with Gaussian noise of different intensities and fed into a Support Vector Machine for testing. The experimental results show that model performance slowly decreases as the noise intensity is increased, and the BERT-based sentence embedding degrades less than the Word2vec-based one. This paper takes this experimental phenomenon as evidence supporting the robustness of BERT representation learning. At the same time, this paper also notes that previous work perturbed the BERT model by replacing words in the original text of the IMDB dataset, causing the BERT model's performance to drop sharply, whereas the experiment in this paper tested the robustness of the BERT model by tampering with the embeddings, and the results were stable. It is assumed that both word substitution and embedding interference are possible situations when operating the model. Therefore, to reconcile the two different experimental phenomena, more detailed experiments are planned in future work to check the robustness of the model.
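A sketch of the robustness probe described above, using random stand-in data rather than IMDB embeddings: sentence embeddings are perturbed with zero-mean Gaussian noise of increasing intensity and fed to an SVM, and accuracy is tracked as the noise scale grows.

```python
# Illustrative probe on synthetic data; not the paper's experiment.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 768))          # stand-in for BERT sentence embeddings
y = (X[:, 0] > 0).astype(int)            # stand-in sentiment labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC().fit(X_tr, y_tr)
for sigma in [0.0, 0.1, 0.5, 1.0]:
    noisy = X_te + rng.normal(scale=sigma, size=X_te.shape)   # add N(0, sigma) noise
    print(f"noise sigma={sigma}: accuracy={clf.score(noisy, y_te):.3f}")
```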
20

Cheng, Hua, Renjie Yu, Yixin Tang, Yiquan Fang, and Tao Cheng. "Text Classification Model Enhanced by Unlabeled Data for LaTeX Formula." Applied Sciences 11, no. 22 (2021): 10536. http://dx.doi.org/10.3390/app112210536.

Abstract:
Generic language models pretrained on large unspecific domains are currently the foundation of NLP. Labeled data are limited in most model training due to the cost of manual annotation, especially in domains with massive numbers of proper nouns, such as mathematics and biology, where this affects the accuracy and robustness of model prediction. However, directly applying a generic language model to a specific domain does not work well. This paper introduces a BERT-based text classification model enhanced by unlabeled data (UL-BERT) in the LaTeX formula domain. A two-stage pretraining model based on BERT (TP-BERT) is pretrained on unlabeled data from the LaTeX formula domain. A double-prediction pseudo-labeling (DPP) method is introduced to obtain high-confidence pseudo-labels for unlabeled data by self-training. Moreover, a multi-round teacher-student model training approach is proposed for UL-BERT model training with few labeled data and more unlabeled data with pseudo-labels. Experiments on classification in the LaTeX formula domain show that the classification accuracies are significantly improved by UL-BERT, with the F1-score enhanced by up to 2.76%, and fewer resources are needed in model training. It is concluded that our method may be applicable to other specific domains with enormous unlabeled data and limited labeled data.
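A hedged sketch of a double-prediction pseudo-labelling rule in the spirit of the DPP method (thresholds and model details are assumptions): an unlabeled example receives a pseudo-label only when two separate predictions agree and both are confident.

```python
# Illustrative selection rule; the paper's exact criterion may differ.
import numpy as np

def pseudo_label(probs_a: np.ndarray, probs_b: np.ndarray, threshold: float = 0.9):
    """probs_a/probs_b: (n_samples, n_classes) softmax outputs of two predictions."""
    labels_a, labels_b = probs_a.argmax(1), probs_b.argmax(1)
    conf = np.minimum(probs_a.max(1), probs_b.max(1))
    keep = (labels_a == labels_b) & (conf >= threshold)
    return labels_a[keep], keep          # accepted pseudo-labels and their mask

p1 = np.array([[0.95, 0.05], [0.55, 0.45]])
p2 = np.array([[0.92, 0.08], [0.40, 0.60]])
print(pseudo_label(p1, p2))              # only the first sample is accepted
```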
21

Yang, Menglin, and Yiqing Lu. "Sentiment Analysis of online reviews based on LDA and AP-Bert model." Highlights in Science, Engineering and Technology 1 (June 14, 2022): 261–69. http://dx.doi.org/10.54097/hset.v1i.472.

Abstract:
The purpose of this paper is to construct a more accurate behavior matrix by using a fine-grained, aspect-level emotion analysis method. First, an LDA topic extraction model is used to extract the topics of online review texts and the attributes of concern. According to the characteristics of online comments, a BERT emotion analysis model with enhanced pooling (AP-Bert) is proposed: an activation function layer and a max-average pooling layer are designed to address the over-fitting problem of the BERT model in the process of emotion analysis. Finally, by combining the LDA extraction results and the AP-Bert sentiment analysis results, the proportion matrix is obtained. Experimental results show that the accuracy, recall rate, and AUC value of the AP-Bert model are better than those of models of the same type and the original BERT model.
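An illustrative max-average pooling head of the kind the abstract describes, with assumed dimensions: token-level BERT outputs are reduced by both max- and mean-pooling, and the concatenated vector is classified.

```python
# Sketch only; hidden size and class count are assumptions.
import torch
import torch.nn as nn

class MaxAvgPoolHead(nn.Module):
    def __init__(self, hidden=768, num_classes=2):
        super().__init__()
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden) from the BERT encoder
        max_pool, _ = token_states.max(dim=1)
        avg_pool = token_states.mean(dim=1)
        return self.fc(torch.cat([max_pool, avg_pool], dim=-1))

head = MaxAvgPoolHead()
print(head(torch.randn(4, 32, 768)).shape)   # torch.Size([4, 2])
```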
22

MacLean, Alexander, and Alexander Wong. "Where do Clinical Language Models Break Down? A Critical Behavioural Exploration of the ClinicalBERT Deep Transformer Model." Journal of Computational Vision and Imaging Systems 6, no. 1 (2021): 1–4. http://dx.doi.org/10.15353/jcvis.v6i1.3548.

Abstract:

The introduction of Bidirectional Encoder Representations from Transformers (BERT) was a major breakthrough for transfer learning in natural language processing, enabling state-of-the-art performance across a large variety of complex language understanding tasks. In the realm of clinical language modeling, the advent of BERT led to the creation of ClinicalBERT, a state-of-the-art deep transformer model pretrained on a wealth of patient clinical notes to facilitate downstream predictive tasks in the clinical domain. While ClinicalBERT has been widely leveraged by the research community as the foundation for building clinical domain-specific predictive models, given its improved performance on the Medical Natural Language Inference (MedNLI) challenge compared to the seminal BERT model, the fine-grained behaviour and intricacies of this popular clinical language model have not been well studied. Without this deeper understanding, it is very challenging to understand where ClinicalBERT does well given its additional exposure to clinical knowledge, where it does not, and where it can be improved in a meaningful manner. Motivated to garner a deeper understanding, this study presents a critical behaviour exploration of the ClinicalBERT deep transformer model using the MedNLI challenge dataset to better understand the following intricacies: 1) decision-making similarities between ClinicalBERT and BERT (leveraging a new metric we introduce called Model Alignment), 2) where ClinicalBERT holds advantages over BERT given its clinical knowledge exposure, and 3) where ClinicalBERT struggles when compared to BERT. The insights gained about the behaviour of ClinicalBERT will help guide new directions for designing and training clinical language models in a way that not only addresses the remaining gaps and facilitates further improvements in clinical language understanding performance, but also highlights the limitations and boundaries of use for such models.
23

Shem-Tov, Eliad, Moshe Sipper, and Achiya Elyasaf. "BERT Mutation: Deep Transformer Model for Masked Uniform Mutation in Genetic Programming." Mathematics 13, no. 5 (2025): 779. https://doi.org/10.3390/math13050779.

Abstract:
We introduce BERT mutation, a novel, domain-independent mutation operator for Genetic Programming (GP) that leverages advanced Natural Language Processing (NLP) techniques to improve convergence, particularly using the Masked Language Modeling approach. By combining the capabilities of deep reinforcement learning and the BERT transformer architecture, BERT mutation intelligently suggests node replacements within GP trees to enhance their fitness. Unlike traditional stochastic mutation methods, BERT mutation adapts dynamically by using historical fitness data to optimize mutation decisions, resulting in more effective evolutionary improvements. Through comprehensive evaluations across three benchmark domains, we demonstrate that BERT mutation significantly outperforms conventional and state-of-the-art mutation operators in terms of convergence speed and solution quality. This work represents a pivotal step toward integrating state-of-the-art deep learning into evolutionary algorithms, pushing the boundaries of adaptive optimization in GP.
24

Rajan, Abraham, and Manohar Manur. "Aspect based sentiment analysis using fine-tuned BERT model with deep context features." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 2 (2024): 1250. http://dx.doi.org/10.11591/ijai.v13.i2.pp1250-1261.

Abstract:
Sentiment analysis is the task of analysing, processing, inferring from, and drawing conclusions about subjective texts and their sentiment. Considering its applications, sentiment analysis is categorized into document-level, sentence-level, and aspect-level analysis. In the past, several studies have built solutions on the bidirectional encoder representations from transformers (BERT) model; however, existing models do not understand the context of the aspect in depth, which leads to low metrics. This work studies aspect-based sentiment analysis with deep context bidirectional encoder representations from transformers (DC-BERT); the main aim of the DC-BERT model is to improve contextual understanding of the targeted aspects to enhance the metrics. The DC-BERT model comprises a fine-tuned BERT model along with a deep context features layer, which enables the model to understand the context of targeted aspects deeply. A customized feature layer is introduced to extract two distinctive features, which are then integrated through an interaction layer. The DC-BERT model is evaluated on the laptop and restaurant review datasets from SemEval 2014 Task 4 using different metrics. In comparison with other models, DC-BERT achieves an accuracy of 84.48% and 92.86% for the laptop and restaurant datasets, respectively.
25

Nicolae, Dragoş Constantin, Rohan Kumar Yadav, and Dan Tufiş. "A Lite Romanian BERT: ALR-BERT." Computers 11, no. 4 (2022): 57. http://dx.doi.org/10.3390/computers11040057.

Abstract:
Large-scale pre-trained language representation and its promising performance in various downstream applications have become an area of interest in the field of natural language processing (NLP). There has been huge interest in further increasing the model’s size in order to outperform the best previously obtained performances. However, at some point, increasing the model’s parameters may lead to reaching its saturation point due to the limited capacity of GPU/TPU. In addition to this, such models are mostly available in English or a shared multilingual structure. Hence, in this paper, we propose a lite BERT trained on a large corpus solely in the Romanian language, which we called “A Lite Romanian BERT (ALR-BERT)”. Based on comprehensive empirical results, ALR-BERT produces models that scale far better than the original Romanian BERT. Alongside presenting the performance on downstream tasks, we detail the analysis of the training process and its parameters. We also intend to distribute our code and model as an open source together with the downstream task.
26

Sudianto, Sudianto. "Pre-trained BERT Architecture Analysis for Indonesian Question Answer Model." Journal of Applied Engineering and Technological Science (JAETS) 6, no. 1 (2024): 60–68. https://doi.org/10.37385/jaets.v6i1.4746.

Abstract:
Developing a question-and-answer system in Natural Language Processing (NLP) has become a major concern in the Indonesian language context. One of the main challenges in developing a question-and-answer system is the limited dataset, which can cause instability in system performance; the limitations of the dataset make it difficult for the question-and-answer model to understand and answer questions well. The proposed solution uses transfer learning with pre-trained models such as BERT. This research aims to analyze the performance of the BERT model adapted for question-and-answer tasks in Indonesian. The BERT model uses an Indonesian-language dataset adapted specifically for question-and-answer tasks, and a customization approach tunes the BERT parameters according to the given training data. The model is improved by minimizing the loss function, and evaluation of the trained model shows that the best validation loss is 0.00057 after 150 epochs. In addition, through an in-depth evaluation of the similarity of question texts, the BERT model can answer questions measurably, according to the existing knowledge in the dataset.
27

Zheng, Min, Bo Liu, and Le Sun. "Study of Deep Learning-Based Legal Judgment Prediction in Internet of Things Era." Computational Intelligence and Neuroscience 2022 (August 8, 2022): 1–6. http://dx.doi.org/10.1155/2022/8490760.

Abstract:
Legal judgment prediction is the most typical application of artificial intelligence technology, especially natural language processing methods, in the judicial field. In a practical environment, the performance of algorithms is often restricted by the computing resource conditions due to the uneven computing performance of the devices. Reducing the computational resource consumption of the model and improving the inference speed can effectively reduce the deployment difficulty of the legal judgment prediction model. To improve the prediction accuracy, enhance the model inference speed, and reduce the model memory consumption, we propose a BERT knowledge distillation-based legal decision prediction model, called KD-BERT. To reduce the resource consumption in the model inference process, we use the BERT pretraining model with lower memory requirements to be the encoder. Then, the knowledge distillation strategy transfers the knowledge to the student model of the shallow transformer structure. Experiment results show that the proposed KD-BERT has the highest F1-score compared with traditional BERT models. Its inference speed is also much faster than the other BERT models.
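The abstract does not give the exact KD-BERT objective; as a sketch of the distillation loss commonly used in this setting, soft targets from the teacher are matched with a temperature-scaled KL term and mixed with the usual cross-entropy on hard labels.

```python
# Generic knowledge-distillation loss sketch; temperature and weighting are assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                                   # match the teacher's softened distribution
    hard = F.cross_entropy(student_logits, labels)  # supervised loss on gold labels
    return alpha * soft + (1 - alpha) * hard

loss = distillation_loss(torch.randn(8, 3), torch.randn(8, 3), torch.randint(0, 3, (8,)))
print(loss.item())
```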
28

Li, Fenfang, Zhengzhang Zhao, Li Wang, and Han Deng. "Tibetan Sentence Boundaries Automatic Disambiguation Based on Bidirectional Encoder Representations from Transformers on Byte Pair Encoding Word Cutting Method." Applied Sciences 14, no. 7 (2024): 2989. http://dx.doi.org/10.3390/app14072989.

Abstract:
Sentence Boundary Disambiguation (SBD) is crucial for building datasets for tasks such as machine translation, syntactic analysis, and semantic analysis. Currently, most automatic sentence segmentation in Tibetan adopts the methods of rule-based and statistical learning, as well as the combination of the two, which have high requirements on the corpus and the linguistic foundation of the researchers and are more costly to annotate manually. In this study, we explore Tibetan SBD using deep learning technology. Initially, we analyze Tibetan characteristics and various subword techniques, selecting Byte Pair Encoding (BPE) and Sentencepiece (SP) for text segmentation and training the Bidirectional Encoder Representations from Transformers (BERT) pre-trained language model. Secondly, we studied the Tibetan SBD based on different BERT pre-trained language models, which mainly learns the ambiguity of the shad (“།”) in different positions in modern Tibetan texts and determines through the model whether the shad (“།”) in the texts has the function of segmenting sentences. Meanwhile, this study introduces four models, BERT-CNN, BERT-RNN, BERT-RCNN, and BERT-DPCNN, based on the BERT model for performance comparison. Finally, to verify the performance of the pre-trained language models on the SBD task, this study conducts SBD experiments on both the publicly available Tibetan pre-trained language model TiBERT and the multilingual pre-trained language model (Multi-BERT). The experimental results show that the F1 score of the BERT (BPE) model trained in this study reaches 95.32% on 465,669 Tibetan sentences, nearly five percentage points higher than BERT (SP) and Multi-BERT. The SBD method based on pre-trained language models in this study lays the foundation for establishing datasets for the later tasks of Tibetan pre-training, summary extraction, and machine translation.
29

Li, Yanjie, and He Mao. "Evaluation and Construction of College Students’ Growth and Development Index System Based on Data Association Mining and Deep Learning Model." Security and Communication Networks 2021 (December 31, 2021): 1–8. http://dx.doi.org/10.1155/2021/7415129.

Abstract:
The rise of big data in the field of education provides an opportunity to address college students' growth and development. The establishment of a personalized student management mode based on big data in universities will promote the change of personalized student management from an empirical mode to a scientific mode, from passive response to active warning, and from reliance on point data to holistic data, and thus improve the efficiency and quality of personalized student management. In this paper, using the latest ideas and techniques in deep learning, such as self-supervised learning and multitask learning, we propose an open-source educational big data pretrained language model, F-BERT, based on the BERT model architecture. Based on the BERT architecture, F-BERT can effectively and automatically extract knowledge from educational big data and memorize it in the model without modifying the model structure for specific educational big data tasks, so that it can be directly applied to various downstream educational big data tasks. The experiments demonstrate that F-BERT outperformed the two vanilla BERT-based baseline models by 0.06 and 0.03 percent, respectively, in terms of accuracy.
30

Subowo, Edy. "Implementasi Pembelajaran Mendalam dalam Klasifikasi Sentimen Ulasan Aplikasi: Evaluasi Model BERT, LSTM, dan CNN." Jurnal Surya Informatika 14, no. 2 (2024): 66–70. https://doi.org/10.48144/suryainformatika.v14i2.1973.

Abstract:
This research discusses an application-review sentiment analysis system using machine learning techniques. The aim is to identify positive, neutral, and negative sentiment in reviews of the Shopee Indonesia application. The methods used include collecting application review data with Google Play Scraper, removing duplicates, and preprocessing the data so that it is ready for use in the classification models. The models used are BERT, LSTM, and CNN. The BERT model is obtained by using the pre-trained bert-base-uncased model and training it on the preprocessed application review dataset, while the LSTM and CNN models use a tokenizer and sequence padding to handle variable text lengths. The experimental results show that all three models perform well in classifying the sentiment of application reviews. However, the BERT model gives the highest accuracy (83%) compared with the LSTM (78%) and CNN (75%) models. These results show that the BERT model can analyse application sentiment effectively because of its ability to detect complex language patterns and improve prediction accuracy.
31

Xu, Jiayu, Nan Xu, Weixin Xie, Chengkui Zhao, Lei Yu, and Weixing Feng. "BERT-siRNA: siRNA target prediction based on BERT pre-trained interpretable model." Gene 910 (June 2024): 148330. http://dx.doi.org/10.1016/j.gene.2024.148330.

32

Kim, Kyungmo, Seongkeun Park, Jeongwon Min, et al. "Multifaceted Natural Language Processing Task–Based Evaluation of Bidirectional Encoder Representations From Transformers Models for Bilingual (Korean and English) Clinical Notes: Algorithm Development and Validation." JMIR Medical Informatics 12 (October 30, 2024): e52897-e52897. http://dx.doi.org/10.2196/52897.

Abstract:
Background: The bidirectional encoder representations from transformers (BERT) model has attracted considerable attention in clinical applications, such as patient classification and disease prediction. However, current studies have typically progressed to application development without a thorough assessment of the model's comprehension of clinical context. Furthermore, limited comparative studies have been conducted on BERT models using medical documents from non-English-speaking countries. Therefore, the applicability of BERT models trained on English clinical notes to non-English contexts is yet to be confirmed. To address these gaps in the literature, this study focused on identifying the most effective BERT model for non-English clinical notes. Objective: In this study, we evaluated the contextual understanding abilities of various BERT models applied to mixed Korean and English clinical notes. The objective of this study was to identify the BERT model that excels in understanding the context of such documents. Methods: Using data from 164,460 patients in a South Korean tertiary hospital, we pretrained BERT-base, BERT for Biomedical Text Mining (BioBERT), Korean BERT (KoBERT), and Multilingual BERT (M-BERT) to improve their contextual comprehension capabilities and subsequently compared their performances in 7 fine-tuning tasks. Results: The model performance varied based on the task and token usage. First, BERT-base and BioBERT excelled in tasks using classification ([CLS]) token embeddings, such as document classification. BioBERT achieved the highest F1-score of 89.32. Both BERT-base and BioBERT demonstrated their effectiveness in document pattern recognition, even with limited Korean tokens in the dictionary. Second, M-BERT exhibited a superior performance in reading comprehension tasks, achieving an F1-score of 93.77. Better results were obtained when fewer words were replaced with unknown ([UNK]) tokens. Third, M-BERT excelled in the knowledge inference task, in which correct disease names were inferred from 63 candidate disease names in a document with disease names replaced with [MASK] tokens. M-BERT achieved the highest hit@10 score of 95.41. Conclusions: This study highlighted the effectiveness of various BERT models in a multilingual clinical domain. The findings can be used as a reference in clinical and language-based applications.
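A rough sketch of the knowledge-inference probe, under simplifying assumptions: a disease mention is replaced with [MASK] and candidate completions are scored by the masked-language-model head. A single-token mask and a general-domain English checkpoint are used here for brevity; the study masked full disease names in bilingual notes and scored 63 candidates with hit@10.

```python
# Illustrative masked-token probe; checkpoint and sentence are assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = "The patient was diagnosed with [MASK] and started on insulin."
inputs = tok(text, return_tensors="pt")
mask_pos = (inputs["input_ids"] == tok.mask_token_id).nonzero()[0, 1]
with torch.no_grad():
    logits = mlm(**inputs).logits[0, mask_pos]
top10 = torch.topk(logits, 10).indices              # top-10 candidates for hit@10-style scoring
print(tok.convert_ids_to_tokens(top10))
```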
33

Nurjoko, Agus Rahardi. "Model Indo-BERT untuk Identifikasi Sentimen Kekerasan Verbal di Twitter." TEKNIKA 18, no. 2 (2024): 583–93. https://doi.org/10.5281/zenodo.12788184.

Abstract:
The emergence of verbal violence on the social media platform Twitter has become an increasingly worrying issue in recent years. Verbal violence covers various forms of communication that demean, insult, and threaten, often harming individuals or groups. The main problem addressed in this research is the identification of sentiment related to verbally violent behaviour on social media, particularly Twitter. This research aims to develop a sentiment analysis model for Verbal Violence Behavior (VVB) on Twitter using Indo-BERT, and its accuracy is compared with BERT. The research begins with data collection through web crawling, using categories of verbally violent behaviour as a reference. The dataset is labelled through a combination of manual and automatic methods with a semi-supervised learning approach; this process involves self-training, in which unlabeled data are automatically labelled using a previously trained model. The dataset is categorized into positive, negative, and neutral sentiment. The Indo-BERT model is used as the analysis framework, and the results are evaluated using a confusion matrix. The findings of this experiment show that the Indo-BERT model performs better, with an accuracy of 72% when processing Indonesian-language text, compared with the BERT model's accuracy of 69%.
34

Reddy, K. Sahit, N. Ragavenderan, Vasanth K., Ganesh N. Naik, Vishalakshi Prabhu H, and Nagaraja G. S. "MedicalBERT: enhancing biomedical natural language processing using pretrained BERT-based model." IAES International Journal of Artificial Intelligence (IJ-AI) 14, no. 3 (2025): 2367. https://doi.org/10.11591/ijai.v14.i3.pp2367-2378.

Abstract:
Recent advances in natural language processing (NLP) have been driven by pretrained language models like BERT, RoBERTa, T5, and GPT. These models excel at understanding complex texts, but biomedical literature, with its domain-specific terminology, poses challenges that models like Word2Vec and bidirectional long short-term memory (Bi-LSTM) can't fully address. GPT and T5, despite capturing context, fall short in tasks needing bidirectional understanding, unlike BERT. Addressing this, we proposed MedicalBERT, a pretrained BERT model trained on a large biomedical dataset and equipped with domain-specific vocabulary that enhances the comprehension of biomedical terminology. The MedicalBERT model is further optimized and fine-tuned to address diverse tasks, including named entity recognition, relation extraction, question answering, sentence similarity, and document classification. Performance metrics such as the F1-score, accuracy, and Pearson correlation are employed to showcase the efficiency of our model in comparison to other BERT-based models such as BioBERT, SciBERT, and ClinicalBERT. MedicalBERT outperforms these models on most of the benchmarks, and surpasses the general-purpose BERT model by 5.67% on average across all the tasks evaluated. This work also underscores the potential of leveraging pretrained BERT models for medical NLP tasks, demonstrating the effectiveness of transfer learning techniques in capturing domain-specific information.
35

Zhou, Lu, Shuangqiao Liu, Caiyan Li, et al. "Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine." Evidence-Based Complementary and Alternative Medicine 2021 (October 11, 2021): 1–12. http://dx.doi.org/10.1155/2021/6676607.

Abstract:
Background. The modernization of traditional Chinese medicine (TCM) demands systematic data mining using medical records. However, this process is hindered by the fact that many TCM symptoms have the same meaning but different literal expressions (i.e., TCM synonymous symptoms). This problem can be solved by using natural language processing algorithms to construct a high-quality TCM symptom normalization model for normalizing TCM synonymous symptoms to unified literal expressions. Methods. Four types of TCM symptom normalization models, based on natural language processing, were constructed to find a high-quality one: (1) a text sequence generation model based on a bidirectional long short-term memory (Bi-LSTM) neural network with an encoder-decoder structure; (2) a text classification model based on a Bi-LSTM neural network and sigmoid function; (3) a text sequence generation model based on bidirectional encoder representation from transformers (BERT) with sequence-to-sequence training method of unified language model (BERT-UniLM); (4) a text classification model based on BERT and sigmoid function (BERT-Classification). The performance of the models was compared using four metrics: accuracy, recall, precision, and F1-score. Results. The BERT-Classification model outperformed the models based on Bi-LSTM and BERT-UniLM with respect to the four metrics. Conclusions. The BERT-Classification model has superior performance in normalizing expressions of TCM synonymous symptoms.
36

Kannan, Eswariah, and Lakshmi Anusha Kothamasu. "Fine-Tuning BERT Based Approach for Multi-Class Sentiment Analysis on Twitter Emotion Data." Ingénierie des systèmes d information 27, no. 1 (2022): 93–100. http://dx.doi.org/10.18280/isi.270111.

Abstract:
Tweets are difficult to classify due to their brevity and frequent use of non-standard orthography or slang. Although several studies have achieved highly accurate sentiment classification, most have not been tested on Twitter data. Previous research on sentiment interpretation focused on binary or ternary sentiments in monolingual texts; however, emotions also emerge in bilingual and multilingual texts, and the emotions expressed in today's social media, including microblogs, are different. We combine everyday dialogue and emotional stimulus data to create a balanced dataset with five labels: joy, sad, anger, fear, and neutral. This entails the preparation of datasets and conventional machine learning models. We categorize tweets using the Bidirectional Encoder Representations from Transformers (BERT) language model, which is pre-trained on plain text rather than tweets, via transfer learning (TensorFlow Keras). In this paper we use the HuggingFace transformers library to fine-tune a pretrained BERT model for the classification task, which we term modified BERT (M-BERT). Our M-BERT model achieves an average F1-score of 97.63% across all of our taxonomy, which still leaves room for improvement. We show that combining M-BERT with machine learning methods increases classification accuracy by 24.92% relative to the baseline BERT model.
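A minimal fine-tuning sketch in the spirit of the described setup, with five emotion labels (joy, sad, anger, fear, neutral); the dataset, label encoding, and checkpoint are placeholders, since the abstract does not specify them.

```python
# Toy fine-tuning sketch; real training would use the full labelled tweet corpus.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=5)

texts = ["I am so happy today", "this is terrifying"]
labels = [0, 3]                                   # illustrative encoding: 0 = joy, 3 = fear
enc = tok(texts, truncation=True, padding=True)
dataset = [{"input_ids": enc["input_ids"][i],
            "attention_mask": enc["attention_mask"][i],
            "labels": labels[i]} for i in range(len(texts))]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="m-bert-demo", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
)
trainer.train()
```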
APA, Harvard, Vancouver, ISO, and other styles
37

Zengeya, Tsitsi, Jean Vincent Fonou Dombeu, and Mandlenkosi Gwetu. "A Centrality-Weighted Bidirectional Encoder Representation from Transformers Model for Enhanced Sequence Labeling in Key Phrase Extraction from Scientific Texts." Big Data and Cognitive Computing 8, no. 12 (2024): 182. https://doi.org/10.3390/bdcc8120182.

Full text
Abstract:
Deep learning approaches, utilizing Bidirectional Encoder Representation from Transformers (BERT) and advanced fine-tuning techniques, have achieved state-of-the-art accuracies in the domain of term extraction from texts. However, BERT presents some limitations in that it primarily captures the semantic context relative to the surrounding text without considering how relevant or central a token is to the overall document content. There has also been research on the application of sequence labeling on contextualized embeddings; however, the existing methods often rely solely on local context for extracting key phrases from texts. To address these limitations, this study proposes a centrality-weighted BERT model for key phrase extraction from text using sequence labelling (CenBERT-SEQ). The proposed CenBERT-SEQ model utilizes BERT to represent terms with various contextual embedding architectures, and introduces a centrality-weighting layer that integrates document-level context into BERT. This layer leverages document embeddings to influence the importance of each term based on its relevance to the entire document. Finally, a linear classifier layer is employed to model the dependencies between the outputs, thereby enhancing the accuracy of the CenBERT-SEQ model. The proposed CenBERT-SEQ model was evaluated against the standard BERT base-uncased model using three Computer Science article datasets, namely, SemEval-2010, WWW, and KDD. The experimental results show that, although the CenBERT-SEQ and BERT-base models achieved high and closely comparable accuracy, the proposed CenBERT-SEQ model achieved higher precision, recall, and F1-score than the BERT-base model. Furthermore, a comparison of the proposed CenBERT-SEQ model with related studies revealed that it achieved higher accuracy, precision, recall, and F1-score of 95%, 97%, 91%, and 94%, respectively, than related studies, showing the superior capabilities of the CenBERT-SEQ model in keyphrase extraction from scientific documents.
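The abstract describes the centrality-weighting layer only at a high level; one plausible reading (our assumption, not the authors' published code) is to scale each token representation by its similarity to a document-level embedding before the token-classification layer, as sketched here in PyTorch.

    # Hypothetical centrality weighting: re-weight BERT token states by their cosine
    # similarity to a document embedding so document-central tokens carry more weight.
    import torch
    import torch.nn.functional as F

    def centrality_weight(token_states, doc_embedding):
        """token_states: (seq_len, hidden); doc_embedding: (hidden,)"""
        sims = F.cosine_similarity(token_states, doc_embedding.unsqueeze(0), dim=-1)  # (seq_len,)
        weights = torch.softmax(sims, dim=0).unsqueeze(-1)                            # (seq_len, 1)
        return token_states * weights

    hidden = torch.randn(12, 768)    # stand-in for BERT token states of a 12-token sentence
    doc = hidden.mean(dim=0)         # simple stand-in for the document embedding
    weighted = centrality_weight(hidden, doc)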
APA, Harvard, Vancouver, ISO, and other styles
38

Wang, Yang, and Zhengjie Zhu. "The Application of Deep Learning Model in Recruitment Decision." Wireless Communications and Mobile Computing 2022 (March 2, 2022): 1–13. http://dx.doi.org/10.1155/2022/9645830.

Full text
Abstract:
With the rapid development of the social economy, competition for human resources is becoming increasingly fierce. Recruitment, as the main way for enterprises to obtain talent, largely determines their future development. Compared with advanced Western countries, research on recruitment in China started late, the overall research level is relatively backward, and most of the relevant techniques and analytical methods have been imported from those countries. The existing literature is still relatively scattered, which hinders both the rapid development of recruitment research and its practical application. Building on existing deep learning work, this study applies four models to identify and automatically extract recruitment entities: the traditional machine learning model conditional random field (CRF) and the deep learning models Bi-LSTM-CRF, BERT, and BERT-Bi-LSTM-CRF. BERT and BERT-Bi-LSTM-CRF are the models with the worst recognition effect; although they have stronger text feature extraction and context-capturing ability, they are limited by the small scale of the information-science recruitment corpus and the small number of entities, so their performance on this task is not fully realized. Although CRF is relatively traditional, it can still achieve excellent results on some small-scale, sparse datasets.
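To make the strongest-sounding architecture above concrete, here is a schematic sketch of a BERT-Bi-LSTM-CRF tagger (an illustration only; it assumes PyTorch, the pytorch-crf package and a Chinese BERT checkpoint, none of which the abstract details).

    # Schematic BERT-Bi-LSTM-CRF sequence tagger for recruitment entity extraction.
    import torch.nn as nn
    from transformers import BertModel
    from torchcrf import CRF

    class BertBiLstmCrf(nn.Module):
        def __init__(self, num_tags, pretrained="bert-base-chinese"):
            super().__init__()
            self.bert = BertModel.from_pretrained(pretrained)
            self.lstm = nn.LSTM(self.bert.config.hidden_size, 256,
                                batch_first=True, bidirectional=True)
            self.emission = nn.Linear(512, num_tags)
            self.crf = CRF(num_tags, batch_first=True)

        def forward(self, input_ids, attention_mask, tags=None):
            states = self.bert(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
            emissions = self.emission(self.lstm(states)[0])
            mask = attention_mask.bool()
            if tags is not None:                              # training: negative log-likelihood
                return -self.crf(emissions, tags, mask=mask)
            return self.crf.decode(emissions, mask=mask)      # inference: best tag sequences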
APA, Harvard, Vancouver, ISO, and other styles
39

Sukmawati, Enjeli Cistia, Lintang Suryaningrum, Diva Angelica, and Nur Ghaniaviyanto Ramadhan. "Klasifikasi Berita Palsu Menggunakan Model Bidirectional Encoder Representations From Transformers (BERT)." SisInfo 6, no. 2 (2024): 76–85. https://doi.org/10.37278/sisinfo.v6i2.934.

Full text
Abstract:
The spread of false information has become a serious challenge in the ever-growing digital era, especially through the internet and social media platforms. Easy access to unverified information makes it difficult to distinguish between facts and hoaxes. One key issue to address is classifying fake news with a high degree of accuracy. News, as a source of current information, needs to be grouped to facilitate access, yet not all news from the various sources is highly credible, particularly given the presence of fake news. Fake news can harm individuals and potentially manipulate public perception, especially through social media. Identifying false information has become a challenge in Natural Language Processing (NLP) with the rapid growth of social media platforms. Although several methods exist for detecting fake news, no widely known platform applies algorithms focused on specific news sources. This study therefore aims to address the spread of false information by classifying fake news based on linguistic features, adopting a classification method built on Bidirectional Encoder Representations from Transformers (BERT). BERT, as a deeply pre-trained language model, captures the context of words in text well. We adopt BERT to improve the accuracy of fake news detection. Although BERT is complex, the pre-trained models released by Google make it usable without building a model from scratch. With its pretraining and fine-tuning steps, BERT is considered more accurate in detecting fake news than other methods. This study contributes to tackling the spread of false information by leveraging the strengths of BERT for news classification.
APA, Harvard, Vancouver, ISO, and other styles
40

Noviana, Medi, and Sunny Arief Sudiro. "AUTOMATION OF THE BERT AND RESNET50 MODEL INFERENCE CONFIGURATION ANALYSIS PROCESS." JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer) 10, no. 2 (2024): 324–32. http://dx.doi.org/10.33480/jitk.v10i2.5053.

Full text
Abstract:
Inference is the process of using models to make predictions on new data; its performance is measured in terms of throughput, latency, GPU memory usage, and GPU power usage. The models used are BERT and ResNet50. The right configuration can maximise inference performance, so configuration analysis is needed to determine which configuration suits model inference. The main challenge in the analysis process lies in its time-intensive nature and inherent complexity, making it far from simple. To ease the analysis, an automation programme was built. The programme analyses the BERT model inference configuration across 10 configurations, bert-large_config_0 to bert-large_config_9; the result is that the best configuration is bert-large_config_2, yielding a throughput of 12.8 infer/sec with a latency of 618 ms. The ResNet50 model is divided into 5 configurations, resnet50_config_0 to resnet50_config_4; the best configuration is resnet50_config_1, which produces a throughput of 120.6 infer/sec with a latency of 60.9 ms. The automation programme thus facilitates the process of analysing inference configurations.
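A minimal sketch of the kind of measurement loop such an automation programme wraps is shown below; the configuration names and the predict() callable are assumptions for illustration, not the paper's code.

    # Measure throughput (infer/sec) and mean latency (ms) for one inference configuration.
    import time

    def benchmark(predict, batch, n_runs=50):
        latencies = []
        for _ in range(n_runs):
            start = time.perf_counter()
            predict(batch)                              # one inference request
            latencies.append(time.perf_counter() - start)
        mean_latency = sum(latencies) / len(latencies)
        throughput = len(batch) / mean_latency          # items served per second
        return throughput, mean_latency * 1000

    # for name, predict_fn in configurations.items():  # e.g. "bert-large_config_0" ... "_9"
    #     tps, ms = benchmark(predict_fn, sample_batch)
    #     print(f"{name}: {tps:.1f} infer/sec, {ms:.0f} ms latency")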
APA, Harvard, Vancouver, ISO, and other styles
41

Kaur, ChehakPreet. "DECODING REVIEW SENTIMENT ANALYSIS MODEL." International Scientific Journal of Engineering and Management 04, no. 05 (2025): 1–7. https://doi.org/10.55041/isjem03305.

Full text
Abstract:
Review sentiment analysis is performed so that a business or organization can understand how its audience perceives its products or services. Studying these opinions provides extremely valuable information, and the analysis helps the organization decide where and in which aspects to invest. This paper reviews the different deep learning methods that can be used and expands on BERT, a deep learning model designed to improve the accuracy and efficiency of natural language processing tasks.
APA, Harvard, Vancouver, ISO, and other styles
42

Li, Luqi, Yunkai Zhai, Jinghong Gao, Linlin Wang, Li Hou, and Jie Zhao. "Stacking-BERT model for Chinese medical procedure entity normalization." Mathematical Biosciences and Engineering 20, no. 1 (2022): 1018–36. http://dx.doi.org/10.3934/mbe.2023047.

Full text
Abstract:
Medical procedure entity normalization is an important task for realizing medical information sharing at the semantic level; it faces challenges such as the variety and similarity of terms in real-world practice. Although deep learning-based methods have been successfully applied to biomedical entity normalization, they often depend on traditional context-independent word embeddings, and there is minimal research on medical entity recognition in Chinese. Regarding the entity normalization task as a sentence-pair classification task, we applied a three-step framework to normalize Chinese medical procedure terms, consisting of dataset construction, candidate concept generation and candidate concept ranking. For dataset construction, an external knowledge base and easy data augmentation techniques were used to increase the diversity of training samples. For candidate concept generation, we implemented the BM25 retrieval method, integrating synonym knowledge from SNOMED CT and the training data. For candidate concept ranking, we designed a stacking-BERT model, including the original BERT-based and Siamese-BERT ranking models, to capture semantic information and choose the optimal mapping pairs through the stacking mechanism. In the training process, we also added adversarial training tricks to improve the model's learning ability on small-scale training data. On the clinical entity normalization dataset of the 5th China Health Information Processing Conference, our stacking-BERT model achieved an accuracy of 93.1%, outperforming single BERT models and other traditional deep learning models. In conclusion, this paper presents an effective method for Chinese medical procedure entity normalization and a validation of different BERT-based models. In addition, we found that adversarial training and data augmentation can effectively improve deep learning models on small samples, which may provide useful ideas for future research.
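To illustrate just the candidate-generation step, the following sketch uses the rank_bm25 package with a toy concept list (an assumption-level illustration, not the authors' implementation); the retrieved candidates would then be paired with the raw term and re-ranked by the stacking-BERT model.

    # BM25 candidate generation for a raw Chinese procedure expression.
    from rank_bm25 import BM25Okapi

    standard_concepts = ["冠状动脉支架置入术", "经皮冠状动脉介入治疗", "心脏搭桥术"]  # toy list
    tokenized = [list(c) for c in standard_concepts]      # character-level tokens for Chinese
    bm25 = BM25Okapi(tokenized)

    query = list("冠脉支架手术")                            # raw clinical expression
    top_candidates = bm25.get_top_n(query, standard_concepts, n=2)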
APA, Harvard, Vancouver, ISO, and other styles
43

Aydın, Özlem, and Hüsein Kantarcı. "Türkçe Anahtar Sözcük Çıkarımında LSTM ve BERT Tabanlı Modellerin Karşılaştırılması." Bilgisayar Bilimleri ve Mühendisliği Dergisi 17, no. 1 (2024): 9–18. http://dx.doi.org/10.54525/bbmd.1454220.

Full text
Abstract:
Text-based data on the internet is growing very rapidly today, and reaching the right content that contains the desired information within this big data is an important need. Knowing the keywords of the content can have a positive effect in meeting this need. This study aims to identify keywords that represent Turkish texts using natural language processing and deep learning models. The Turkish Labeled Text Corpus (Türkçe Etiketli Metin Derlemi) and the Text Summarization-Keyword Extraction Dataset (Metin Özetleme-Anahtar Kelime Çıkarma Veri Kümesi) were used together as the dataset. Two different deep learning models were proposed in the study. First, a Sequence-to-Sequence (Seq2Seq) model with Long Short-Term Memory (LSTM) layers was designed. The other model is a Seq2Seq model built with BERT (Bidirectional Encoder Representations from Transformers). In the evaluation of the LSTM-layered Seq2Seq model, an F1 value of 0.38 was reached on the ROUGE-1 metric. The BERT-based Seq2Seq model achieved an F1 value of 0.399 on the ROUGE-1 metric. In conclusion, the BERT-based Seq2Seq model, which builds on the transformer architecture, was observed to be more successful than the LSTM-based Seq2Seq model.
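For reference, ROUGE-1 F1 is the unigram-overlap measure behind the 0.38 and 0.399 scores; a simplified re-implementation (for illustration only, not the study's evaluation script) is:

    # ROUGE-1 F1 from unigram overlap between predicted and reference keyword lists.
    from collections import Counter

    def rouge1_f1(predicted_tokens, reference_tokens):
        overlap = sum((Counter(predicted_tokens) & Counter(reference_tokens)).values())
        if overlap == 0:
            return 0.0
        precision = overlap / len(predicted_tokens)
        recall = overlap / len(reference_tokens)
        return 2 * precision * recall / (precision + recall)

    print(rouge1_f1(["derin", "ogrenme", "bert"], ["bert", "anahtar", "sozcuk", "derin"]))  # ~0.571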
APA, Harvard, Vancouver, ISO, and other styles
44

Sak, Semih, and Mustafa Alper Akkaş. "6G'de Nesnelerin İnterneti Teknolojisinin Medikal Alandaki Gelişmeleri." Bilgisayar Bilimleri ve Mühendisliği Dergisi 17, no. 1 (2024): 1–8. http://dx.doi.org/10.54525/bbmd.1454186.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Lee, Sangah, and Hyopil Shin. "Combining Sentiment-Combined Model with Pre-Trained BERT Models for Sentiment Analysis." Journal of KIISE 48, no. 7 (2021): 815–24. http://dx.doi.org/10.5626/jok.2021.48.7.815.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Zhang, Yucong. "Research on Sentiment Analysis of Government Short Video Comments based on BERT's Multi Strategy Combination Model." Frontiers in Computing and Intelligent Systems 3, no. 3 (2023): 75–80. http://dx.doi.org/10.54097/fcis.v3i3.8570.

Full text
Abstract:
In the field of NLP, pre-trained models greatly shorten development time, reduce usage difficulty, and improve model robustness. In this paper, the BERT pre-trained model is used to analyze the sentiment of TikTok short-video comments. Experimental comparison shows that the BERT pre-trained model outperforms CNN and BiLSTM in emotion classification. Building on BERT, a multi-strategy hybrid method feeds the dynamic word vectors output by BERT into CNN and BiLSTM models to further extract local features and long-distance dependencies, improving the accuracy of the model and achieving a classification effect on the government short-video comment dataset that is better than any single model.
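A sketch of that multi-strategy combination follows, with assumed layer sizes rather than the paper's exact configuration: BERT's dynamic word vectors feed a CNN branch for local features and a BiLSTM branch for long-distance dependencies, and the two are fused for classification.

    # BERT output fed into parallel CNN and BiLSTM branches, then fused.
    import torch
    import torch.nn as nn
    from transformers import BertModel

    class BertCnnBiLstm(nn.Module):
        def __init__(self, num_classes=3, pretrained="bert-base-chinese"):
            super().__init__()
            self.bert = BertModel.from_pretrained(pretrained)
            h = self.bert.config.hidden_size
            self.conv = nn.Conv1d(h, 128, kernel_size=3, padding=1)            # local n-gram features
            self.lstm = nn.LSTM(h, 128, batch_first=True, bidirectional=True)  # long-range context
            self.out = nn.Linear(128 + 256, num_classes)

        def forward(self, input_ids, attention_mask):
            states = self.bert(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
            cnn_feat = torch.relu(self.conv(states.transpose(1, 2))).max(dim=2).values
            lstm_feat = self.lstm(states)[0][:, -1, :]
            return self.out(torch.cat([cnn_feat, lstm_feat], dim=1))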
APA, Harvard, Vancouver, ISO, and other styles
47

Park, Myeong-Jun, and Jae-Hyun Seo. "Improving BERT Classification Model Performance using Data Augmentation." Journal of Korean Institute of Intelligent Systems 34, no. 6 (2024): 502–9. https://doi.org/10.5391/jkiis.2024.34.6.502.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Xiao Zunlan and Ning Xiaohong. "Dynamic BERT-SVM Hybrid Model for Enhanced Semantic Similarity Evaluation in English Teaching Texts." Journal of English Language Teaching and Applied Linguistics 7, no. 1 (2025): 01–11. https://doi.org/10.32996/jeltal.2025.7.1.1.

Full text
Abstract:
Accurate semantic similarity evaluation in English teaching texts is essential for enhancing automated feedback systems and personalized learning. This study introduces a Dynamic BERT-SVM Hybrid Model, an innovative framework that combines the deep contextual understanding of BERT (Bidirectional Encoder Representations from Transformers) with the robust classification capabilities of Support Vector Machines (SVM). The primary objective is to develop a method that effectively addresses the complexities of language semantics in educational materials by leveraging BERT's ability to generate rich, dynamic embeddings and SVM's proficiency in handling high-dimensional data. The model processes English teaching texts through BERT to obtain nuanced semantic representations, which are subsequently classified by an optimized SVM. Extensive experiments were conducted on a diverse dataset encompassing various genres and proficiency levels. The Dynamic BERT-SVM Hybrid Model outperformed baseline models, including pure BERT and traditional machine learning approaches, achieving higher accuracy, precision, recall, and F1-scores. Additionally, the model demonstrated strong generalizability across different text types, highlighting its adaptability for real-world educational applications. This research bridges advanced natural language processing techniques with educational technology, providing a robust tool for precise semantic evaluation. The Dynamic BERT-SVM Hybrid Model sets a new standard for semantic similarity assessment in language education, offering significant contributions to both academic research and practical instructional methodologies.
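The core hybrid idea can be sketched in a few lines: BERT supplies sentence embeddings and an SVM performs the classification. The checkpoint, kernel and toy data below are assumptions for illustration; the paper's pipeline adds its own dynamic tuning on top.

    # BERT [CLS] embeddings as features for an SVM classifier.
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.svm import SVC

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    encoder = AutoModel.from_pretrained("bert-base-uncased")

    def embed(sentences):
        enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            cls = encoder(**enc).last_hidden_state[:, 0, :]   # [CLS] token embedding
        return cls.numpy()

    texts = ["The cat sits on the mat.", "A cat is resting on a rug.", "Stocks fell sharply today."]
    labels = [1, 1, 0]                                        # toy similarity-class labels
    clf = SVC(kernel="rbf").fit(embed(texts), labels)
    print(clf.predict(embed(["A kitten lies on the carpet."])))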
APA, Harvard, Vancouver, ISO, and other styles
49

Umair, Muhammad, Iftikhar Alam, Atif Khan, Inayat Khan, Niamat Ullah, and Mohammad Yusuf Momand. "N-GPETS: Neural Attention Graph-Based Pretrained Statistical Model for Extractive Text Summarization." Computational Intelligence and Neuroscience 2022 (November 22, 2022): 1–14. http://dx.doi.org/10.1155/2022/6241373.

Full text
Abstract:
The extractive summarization approach involves selecting the source document's salient sentences to build a summary. One of the most important aspects of extractive summarization is learning and modelling cross-sentence associations. Inspired by the popularity of the Transformer-based Bidirectional Encoder Representations (BERT) pretrained language model and of the graph attention network (GAT), whose sophisticated structure captures intersentence associations, this research work proposes a novel neural model, N-GPETS, that combines a heterogeneous graph attention network with the BERT model and a statistical approach using TF-IDF values for the extractive summarization task. Apart from sentence nodes, N-GPETS also works with different semantic word nodes of varying granularity levels that serve as links between sentences, improving intersentence interaction. Furthermore, N-GPETS becomes richer in features by integrating the graph layer with the BERT encoder at the graph initialization step rather than employing other neural network encoders such as CNN or LSTM. To the best of our knowledge, this work is the first attempt to combine the BERT encoder and TF-IDF values of the entire document with a heterogeneous attention graph structure for the extractive summarization task. The empirical outcomes on the benchmark CNN/DM news datasets show that the proposed N-GPETS model achieves favorable results in comparison with other heterogeneous graph structures employing the BERT model and with graph structures without the BERT model.
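One of the simpler ingredients named above can be shown directly: TF-IDF vectors computed per sentence, which could initialize sentence nodes alongside BERT encodings in a heterogeneous graph. This is an assumed illustration with scikit-learn, not the authors' code.

    # Sentence-level TF-IDF features as candidate graph-node initializations.
    from sklearn.feature_extraction.text import TfidfVectorizer

    sentences = ["The committee approved the new budget.",
                 "The budget increases funding for schools.",
                 "Local weather stayed mild through the week."]
    vectorizer = TfidfVectorizer()
    node_features = vectorizer.fit_transform(sentences)   # one TF-IDF vector per sentence node
    print(node_features.shape)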
APA, Harvard, Vancouver, ISO, and other styles
50

Sureja, Nitesh, Nandini M. Chaudhari, Jalpa Bhatt, et al. "An improved reptile search algorithm-based machine learning for sentiment analysis." International Journal of Electrical and Computer Engineering (IJECE) 15, no. 1 (2025): 755. http://dx.doi.org/10.11591/ijece.v15i1.pp755-766.

Full text
Abstract:
The rapid growth of mobile technologies has transformed social media, making it a crucial outlet for expressing emotions and thoughts. Businesses and governments can benefit from understanding public opinion when making significant decisions, which makes sentiment analysis vital for gauging public sentiment polarity. This study develops a hyper-tuned deep learning model with swarm intelligence and multiple approaches for sentiment analysis. Convolutional neural network (CNN), bidirectional encoder representations from transformers (BERT), long short-term memory (LSTM), CNN-LSTM, BERT-LSTM, and BERT-CNN are the six deep learning models of the sentiment analysis using deep learning with reinforced learning based on the reptile search algorithm (SA-DLRLRSA) model. The reptile search algorithm, an enhanced swarm intelligence algorithm (SIA), optimizes the deep learning models' hyperparameters. Word2Vec word embedding is used to convert textual input sequences into representative embedding spaces, and pre-trained Word2Vec embeddings are also used to address the issue of unbalanced datasets. Experimental results demonstrate that the SA-DLRLRSA model works best, with accuracies of 93.1%, 94.7%, 96.8%, 96.3%, 97.2%, and 98.3% using CNN, LSTM, BERT, CNN-LSTM, BERT-CNN, and BERT-LSTM, respectively.
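The swarm-based tuning step can be pictured with a deliberately simplified stand-in: the sketch below is a plain random search over a hypothetical hyperparameter space, not the reptile search algorithm itself, and the evaluate() function is a placeholder for training and validating one of the deep models.

    # Simplified stand-in for metaheuristic hyperparameter search.
    import random

    search_space = {"learning_rate": [1e-5, 2e-5, 3e-5, 5e-5],
                    "batch_size": [16, 32, 64],
                    "dropout": [0.1, 0.2, 0.3]}

    def sample():
        return {k: random.choice(v) for k, v in search_space.items()}

    def evaluate(params):        # placeholder: train the model and return validation accuracy
        return random.random()

    candidates = [sample() for _ in range(20)]
    best = max(candidates, key=evaluate)
    print("best hyperparameters found:", best)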
APA, Harvard, Vancouver, ISO, and other styles