Journal articles on the topic "Natural language processing biomedical nlp deep learning transfer learning"

To see the other types of publications on this topic, follow the link: Natural language processing biomedical nlp deep learning transfer learning.

Consult the top 50 journal articles for your research on the topic "Natural language processing biomedical nlp deep learning transfer learning".

Next to every source in the list of references, there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication in .pdf format and read the online abstract, if these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organize your bibliography correctly.

1

Vaghasia, Rishil. "An Improvised Approach of Deep Learning Neural Networks in NLP Applications." International Journal for Research in Applied Science and Engineering Technology 11, no. 1 (January 31, 2023): 1599–603. http://dx.doi.org/10.22214/ijraset.2023.48884.

Abstract:
In recent years, natural language processing (NLP) has drawn a lot of interest for its ability to computationally represent and analyze human language. Its uses have expanded to include machine translation, email spam detection, information extraction, summarization, medical diagnosis, and question answering, among other areas. The purpose of this research is to investigate how deep learning and neural networks are used to analyze the syntax of natural language. The research first investigates a feed-forward neural network-based classifier for a transition-based dependency parser, and then presents a dependency parsing paradigm based on a long short-term memory (LSTM) neural network. The feed-forward model serves as a feature extractor; after it is learned, a sentence-level classifier is trained that uses an LSTM network to predict transition actions from the features retrieved by the syntactic analyzer. Syntactic analysis thus replaces the modeling of each decision independently with modeling the analysis of the entire sentence as a whole. The experimental findings demonstrate that the model outperforms the benchmark techniques.
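To make the transition-based setup concrete, the following is a minimal sketch of a feed-forward classifier that scores parser transitions for a given stack/buffer configuration. All layer sizes, the three-token feature template, and the token ids are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical feed-forward transition classifier for dependency parsing.
import torch
import torch.nn as nn

class TransitionClassifier(nn.Module):
    """Scores parser transitions (SHIFT / LEFT-ARC / RIGHT-ARC) for a configuration."""
    def __init__(self, vocab_size=10_000, embed_dim=50, hidden_dim=200, n_actions=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.ff = nn.Sequential(
            nn.Linear(3 * embed_dim, hidden_dim),  # features: stack top, stack 2nd, buffer front
            nn.Tanh(),
            nn.Linear(hidden_dim, n_actions),
        )

    def forward(self, feature_ids):        # feature_ids: (batch, 3) token ids
        emb = self.embed(feature_ids).flatten(1)
        return self.ff(emb)                # unnormalized transition scores

model = TransitionClassifier()
config = torch.tensor([[12, 47, 3]])       # dummy ids for one parser configuration
action = model(config).argmax(dim=-1)      # 0=SHIFT, 1=LEFT-ARC, 2=RIGHT-ARC
print(action.item())
```

In the paper's full model, an LSTM layer then conditions these per-step decisions on the whole sentence instead of treating them independently.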
2

Garrido-Muñoz, Ismael, Arturo Montejo-Ráez, Fernando Martínez-Santiago, and L. Alfonso Ureña-López. "A Survey on Bias in Deep NLP." Applied Sciences 11, no. 7 (April 2, 2021): 3184. http://dx.doi.org/10.3390/app11073184.

Abstract:
Deep neural networks are hegemonic approaches to many machine learning areas, including natural language processing (NLP). Thanks to the availability of large corpora collections and the capability of deep architectures to shape internal language mechanisms in self-supervised learning processes (also known as “pre-training”), versatile and performing models are released continuously for every new network design. These networks, somehow, learn a probability distribution of words and relations across the training collection used, inheriting the potential flaws, inconsistencies and biases contained in such a collection. As pre-trained models have been found to be very useful approaches to transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction. In addition, available resources are identified and a strategy to deal with bias in deep NLP is proposed.
3

Guarasci, Raffaele, Giuseppe De Pietro, and Massimo Esposito. "Quantum Natural Language Processing: Challenges and Opportunities." Applied Sciences 12, no. 11 (June 2, 2022): 5651. http://dx.doi.org/10.3390/app12115651.

Abstract:
The meeting between Natural Language Processing (NLP) and Quantum Computing has been very successful in recent years, leading to the development of several approaches of the so-called Quantum Natural Language Processing (QNLP). This is a hybrid field in which the potential of quantum mechanics is exploited and applied to critical aspects of language processing, involving different NLP tasks. Approaches developed so far span from those that demonstrate the quantum advantage only at the theoretical level to the ones implementing algorithms on quantum hardware. This paper aims to list the approaches developed so far, categorizing them by type, i.e., theoretical work and those implemented on classical or quantum hardware; by task, i.e., general purpose such as syntax-semantic representation or specific NLP tasks, like sentiment analysis or question answering; and by the resource used in the evaluation phase, i.e., whether a benchmark dataset or a custom one has been used. The advantages offered by QNLP are discussed, both in terms of performance and methodology, and some considerations about the possible usage of QNLP approaches in place of state-of-the-art deep learning-based ones are given.
4

Ok, Changwon, Geonseok Lee, and Kichun Lee. "Informative Language Encoding by Variational Autoencoders Using Transformer." Applied Sciences 12, no. 16 (August 9, 2022): 7968. http://dx.doi.org/10.3390/app12167968.

Abstract:
In natural language processing (NLP), Transformer is widely used and has reached the state-of-the-art level in numerous NLP tasks such as language modeling, summarization, and classification. Moreover, a variational autoencoder (VAE) is an efficient generative model in representation learning, combining deep learning with statistical inference in encoded representations. However, the use of VAE in natural language processing often brings forth practical difficulties such as posterior collapse, also known as Kullback–Leibler (KL) vanishing. To mitigate this problem, while taking advantage of the parallelization of language data processing, we propose a new language representation model as the integration of two seemingly different deep learning models, which is a Transformer model solely coupled with a variational autoencoder. We compare the proposed model with previous works, such as a VAE connected with a recurrent neural network (RNN). Our experiments with four real-life datasets show that implementation with KL annealing mitigates posterior collapse. The results also show that the proposed Transformer model outperforms RNN-based models in reconstruction and representation learning, and that the encoded representations of the proposed model are more informative than those of the other tested models.
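The following is a minimal sketch of the KL-annealed VAE objective that the abstract reports using against posterior collapse; the linear schedule, the anneal_steps value, and the tensor shapes are assumptions for illustration only.

```python
# KL annealing: gradually increase the weight of the KL term so the decoder
# cannot simply ignore the latent code early in training.
import torch
import torch.nn.functional as F

def vae_loss(recon_logits, targets, mu, logvar, step, anneal_steps=10_000):
    """Reconstruction loss plus a linearly annealed KL divergence (beta: 0 -> 1)."""
    recon = F.cross_entropy(recon_logits.transpose(1, 2), targets)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    beta = min(1.0, step / anneal_steps)
    return recon + beta * kl

# Dummy usage: batch of 4 sequences, length 16, vocab 100, latent size 32.
logits = torch.randn(4, 16, 100)
targets = torch.randint(0, 100, (4, 16))
mu, logvar = torch.randn(4, 32), torch.randn(4, 32)
print(vae_loss(logits, targets, mu, logvar, step=2_500))
```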
5

Gupta, Manish, and Puneet Agrawal. "Compression of Deep Learning Models for Text: A Survey." ACM Transactions on Knowledge Discovery from Data 16, no. 4 (August 31, 2022): 1–55. http://dx.doi.org/10.1145/3487045.

Abstract:
In recent years, the fields of natural language processing (NLP) and information retrieval (IR) have made tremendous progress thanks to deep learning models like Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) networks, and Transformer-based models like Bidirectional Encoder Representations from Transformers (BERT), Generative Pre-training Transformer (GPT-2), Multi-task Deep Neural Network (MT-DNN), Extra-Long Network (XLNet), Text-to-text transfer transformer (T5), T-NLG, and GShard. But these models are humongous in size. On the other hand, real-world applications demand small model size, low response times, and low computational power wattage. In this survey, we discuss six different types of methods (Pruning, Quantization, Knowledge Distillation (KD), Parameter Sharing, Tensor Decomposition, and Sub-quadratic Transformer-based methods) for compression of such models to enable their deployment in real industry NLP projects. Given the critical need of building applications with efficient and small models, and the large amount of recently published work in this area, we believe that this survey organizes the plethora of work done by the “deep learning for NLP” community in the past few years and presents it as a coherent story.
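As a concrete instance of one surveyed family, knowledge distillation, here is a hedged sketch of the standard temperature-scaled distillation loss in which a small student mimics a large teacher; the temperature and mixing weight are illustrative defaults, not values prescribed by the survey.

```python
# Knowledge distillation: soft targets from the teacher plus hard-label loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                                # rescale gradients after temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(8, 5)                    # 8 examples, 5 classes (dummy logits)
teacher = torch.randn(8, 5)
labels = torch.randint(0, 5, (8,))
print(distillation_loss(student, teacher, labels))
```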
6

Schomacker, Thorben, and Marina Tropmann-Frick. "Language Representation Models: An Overview." Entropy 23, no. 11 (October 28, 2021): 1422. http://dx.doi.org/10.3390/e23111422.

Abstract:
In the last few decades, text mining has been used to extract knowledge from free texts. Applying neural networks and deep learning to natural language processing (NLP) tasks has led to many accomplishments for real-world language problems over the years. The developments of the last five years have resulted in techniques that have allowed for the practical application of transfer learning in NLP. The advances in the field have been substantial, and the milestone of outperforming human baseline performance based on the general language understanding evaluation has been achieved. This paper implements a targeted literature review to outline, describe, explain, and put into context the crucial techniques that helped achieve this milestone. The research presented here is a targeted review of neural language models that present vital steps towards a general language representation model.
7

Laparra, Egoitz, Aurelie Mascio, Sumithra Velupillai, and Timothy Miller. "A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records." Yearbook of Medical Informatics 30, no. 01 (August 2021): 239–44. http://dx.doi.org/10.1055/s-0041-1726522.

Abstract:
Objectives: We survey recent work in biomedical NLP on building more adaptable or generalizable models, with a focus on work dealing with electronic health record (EHR) texts, to better understand recent trends in this area and identify opportunities for future research. Methods: We searched PubMed, the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computational Linguistics (ACL) anthology, the Association for the Advancement of Artificial Intelligence (AAAI) proceedings, and Google Scholar for the years 2018-2020. We reviewed abstracts to identify the most relevant and impactful work, and manually extracted data points from each of these papers to characterize the types of methods and tasks that were studied, in which clinical domains, and current state-of-the-art results. Results: The ubiquity of pre-trained transformers in clinical NLP research has contributed to an increase in domain adaptation and generalization-focused work that uses these models as the key component. Most recently, work has started to train biomedical transformers and to extend the fine-tuning process with additional domain adaptation techniques. We also highlight recent research in cross-lingual adaptation, as a special case of adaptation. Conclusions: While pre-trained transformer models have led to some large performance improvements, general domain pre-training does not always transfer adequately to the clinical domain due to its highly specialized language. There is also much work to be done in showing that the gains obtained by pre-trained transformers are beneficial in real world use cases. The amount of work in domain adaptation and transfer learning is limited by dataset availability and creating datasets for new domains is challenging. The growing body of research in languages other than English is encouraging, and more collaboration between researchers across the language divide would likely accelerate progress in non-English clinical NLP.
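A minimal sketch of the continued (domain-adaptive) pre-training recipe the review describes, using the Hugging Face transformers API on a toy in-memory "clinical" corpus; the base checkpoint, hyperparameters, and example notes are assumptions, and a real EHR corpus would replace them.

```python
# Continued masked-language-model pre-training on clinical text before task fine-tuning.
import torch
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

clinical_notes = ["pt presents with acute dyspnea and chest pain",      # toy stand-ins
                  "no evidence of lymphovascular invasion identified"]
enc = tokenizer(clinical_notes, truncation=True, padding=True, max_length=128)

class NotesDataset(torch.utils.data.Dataset):
    def __init__(self, enc): self.enc = enc
    def __len__(self): return len(self.enc["input_ids"])
    def __getitem__(self, i):
        return {k: torch.tensor(v[i]) for k, v in self.enc.items()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-clinical", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=NotesDataset(enc),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()   # afterwards, fine-tune the adapted encoder on the clinical task
```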
8

Sarhan, Injy, and Marco Spruit. "Can We Survive without Labelled Data in NLP? Transfer Learning for Open Information Extraction." Applied Sciences 10, no. 17 (August 20, 2020): 5758. http://dx.doi.org/10.3390/app10175758.

Abstract:
Various tasks in natural language processing (NLP) suffer from lack of labelled training data, which deep neural networks are hungry for. In this paper, we relied upon features learned to generate relation triples from the open information extraction (OIE) task. First, we studied how transferable these features are from one OIE domain to another, such as from a news domain to a bio-medical domain. Second, we analyzed their transferability to a semantically related NLP task, namely, relation extraction (RE). We thereby contribute to answering the question: can OIE help us achieve adequate NLP performance without labelled data? Our results showed comparable performance when using inductive transfer learning in both experiments by relying on a very small amount of the target data, wherein promising results were achieved. When transferring to the OIE bio-medical domain, we achieved an F-measure of 78.0%, only 1% lower when compared to traditional learning. Additionally, transferring to RE using an inductive approach scored an F-measure of 67.2%, which was 3.8% lower than training and testing on the same task. Hereby, our analysis shows that OIE can act as a reliable source task.
9

Peña-Torres, Jefferson A., Raúl E. Gutiérrez, Víctor A. Bucheli, and Fabio A. González. "How to Adapt Deep Learning Models to a New Domain: The Case of Biomedical Relation Extraction." TecnoLógicas 22 (December 5, 2019): 49–62. http://dx.doi.org/10.22430/22565337.1483.

Abstract:
In this article, we study the relation extraction problem from Natural Language Processing (NLP) implementing a domain adaptation setting without external resources. We trained a Deep Learning (DL) model for Relation Extraction (RE), which extracts semantic relations in the biomedical domain. However, can the model be applied to different domains? The model should be adaptable to automatically extract relationships across different domains using the DL network. Completely training DL models in a short time is impractical because the models should quickly adapt to different datasets in several domains without delay. Therefore, adaptation is crucial for intelligent systems, where changing factors and unanticipated perturbations are common. In this study, we present a detailed analysis of the problem, as well as preliminary experimentation, results, and their evaluation.
10

Son, Suhyune, Seonjeong Hwang, Sohyeun Bae, Soo Jun Park, and Jang-Hwan Choi. "A Sequential and Intensive Weighted Language Modeling Scheme for Multi-Task Learning-Based Natural Language Understanding." Applied Sciences 11, no. 7 (March 31, 2021): 3095. http://dx.doi.org/10.3390/app11073095.

Abstract:
Multi-task learning (MTL) approaches are actively used for various natural language processing (NLP) tasks. The Multi-Task Deep Neural Network (MT-DNN) has contributed significantly to improving the performance of natural language understanding (NLU) tasks. However, one drawback is that confusion about the language representation of various tasks arises during the training of the MT-DNN model. Inspired by the internal-transfer weighting of MTL in medical imaging, we introduce a Sequential and Intensive Weighted Language Modeling (SIWLM) scheme. The SIWLM consists of two stages: (1) Sequential weighted learning (SWL), which trains a model to learn entire tasks sequentially and concentrically, and (2) Intensive weighted learning (IWL), which enables the model to focus on the central task. We apply this scheme to the MT-DNN model and call this model the MTDNN-SIWLM. Our model achieves higher performance than the existing reference algorithms on six out of the eight GLUE benchmark tasks. Moreover, our model outperforms MT-DNN by 0.77 on average on the overall task. Finally, we conducted a thorough empirical investigation to determine the optimal weight for each GLUE task.
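To illustrate stage-dependent task weighting, here is a hypothetical sketch of a two-stage weighted multi-task loss; the stage names echo the SWL/IWL terminology above, but the weights are invented for illustration, not the paper's values.

```python
# Two-stage weighted multi-task loss: near-uniform first, central-task-focused later.
import torch

def weighted_mtl_loss(task_losses, central_task, stage):
    """task_losses: dict mapping task name -> scalar loss tensor."""
    if stage == "SWL":          # stage 1: learn all tasks with equal weight
        weights = {t: 1.0 for t in task_losses}
    else:                       # stage 2 (IWL): concentrate on the central task
        weights = {t: (2.0 if t == central_task else 0.5) for t in task_losses}
    return sum(weights[t] * loss for t, loss in task_losses.items())

losses = {"mnli": torch.tensor(0.9), "sst2": torch.tensor(0.4), "qnli": torch.tensor(0.7)}
print(weighted_mtl_loss(losses, central_task="sst2", stage="IWL"))
```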
11

Prottasha, Nusrat Jahan, Abdullah As Sami, Md Kowsher, Saydul Akbar Murad, Anupam Kumar Bairagi, Mehedi Masud, and Mohammed Baz. "Transfer Learning for Sentiment Analysis Using BERT Based Supervised Fine-Tuning." Sensors 22, no. 11 (May 30, 2022): 4157. http://dx.doi.org/10.3390/s22114157.

Abstract:
The growth of the Internet has expanded the amount of data expressed by users across multiple platforms. The availability of these different worldviews and individuals’ emotions empowers sentiment analysis. However, sentiment analysis becomes even more challenging due to a scarcity of standardized labeled data in the Bangla NLP domain. The majority of the existing Bangla research has relied on models of deep learning that significantly focus on context-independent word embeddings, such as Word2Vec, GloVe, and fastText, in which each word has a fixed representation irrespective of its context. Meanwhile, context-based pre-trained language models such as BERT have recently revolutionized the state of natural language processing. In this work, we applied BERT’s transfer learning ability to a deep integrated CNN-BiLSTM model for enhanced decision-making performance in sentiment analysis. In addition, we also applied transfer learning to classical machine learning algorithms to compare their performance with CNN-BiLSTM. Additionally, we explore various word embedding techniques, such as Word2Vec, GloVe, and fastText, and compare their performance to the BERT transfer learning strategy. As a result, we have shown a state-of-the-art binary classification performance for Bangla sentiment analysis that significantly outperforms all tested embeddings and algorithms.
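A hedged sketch of a CNN-BiLSTM head consuming pre-computed BERT token states, in the spirit of the integrated model described above; all dimensions, and the choice to keep BERT frozen, are illustrative assumptions.

```python
# CNN over BERT token states, summarized by a BiLSTM for the sentiment decision.
import torch
import torch.nn as nn

class CnnBiLstmHead(nn.Module):
    def __init__(self, bert_dim=768, n_filters=128, hidden=64, n_classes=2):
        super().__init__()
        self.conv = nn.Conv1d(bert_dim, n_filters, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(n_filters, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, bert_states):                 # (batch, seq_len, bert_dim)
        x = torch.relu(self.conv(bert_states.transpose(1, 2))).transpose(1, 2)
        _, (h, _) = self.bilstm(x)                  # h: (2, batch, hidden)
        return self.out(torch.cat([h[0], h[1]], dim=-1))

states = torch.randn(4, 32, 768)                    # dummy states: 4 sentences, 32 tokens
print(CnnBiLstmHead()(states).shape)                # torch.Size([4, 2])
```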
12

Kowsher, Md, Md Shohanur Islam Sobuj, Md Fahim Shahriar, Nusrat Jahan Prottasha, Mohammad Shamsul Arefin, Pranab Kumar Dhar, and Takeshi Koshiba. "An Enhanced Neural Word Embedding Model for Transfer Learning." Applied Sciences 12, no. 6 (March 10, 2022): 2848. http://dx.doi.org/10.3390/app12062848.

Abstract:
Due to the expansion of data generation, more and more natural language processing (NLP) tasks need to be solved. For this, word representation plays a vital role. Computation-based word embedding is very useful in various high-resource languages. However, until now, low-resource languages such as Bangla have had very limited resources available in terms of models, toolkits, and datasets. Considering this fact, in this paper, an enhanced BanglaFastText word embedding model is developed using Python and two large pre-trained Bangla FastText models (skip-gram and CBOW). These pre-trained models were trained on a collected large Bangla corpus (around 20 million points of text data, in which every paragraph of text is considered as a data point). BanglaFastText outperformed Facebook’s FastText by a significant margin. To evaluate and analyze the performance of these pre-trained models, the proposed work accomplished text classification on three popular textual Bangla datasets, developing models with various classical machine learning approaches as well as a deep neural network. The evaluations showed a superior performance over existing word embedding techniques and the Facebook Bangla FastText pre-trained model for Bangla NLP, with excellent results on these textual datasets. A Python toolkit is proposed, which is convenient for accessing the models and using them for word embedding, obtaining semantic relationships word-by-word or sentence-by-sentence, producing sentence embeddings for classical machine learning approaches, and performing unsupervised fine-tuning of any Bangla linguistic dataset.
13

Borah, Trinayan, and S. Ganesh Kumar. "Application of NLP and Machine Learning for Mental Health Improvement." International Journal of Engineering and Advanced Technology 11, no. 6 (August 30, 2022): 47–52. http://dx.doi.org/10.35940/ijeat.f3657.0811622.

Abstract:
Humans' most powerful tool is their mental wellness, and poor mental health can impact individuals' well-being. This paper focuses on a smart technical solution to detecting mental health issues related to stress, sadness, depression, anxiety, etc., which, if not handled efficiently, may lead to severe problems. It deals with the design of an automated smart system based on social media posts that helps mental health experts identify and understand the mental health condition of social media users, through text analysis of rich social media resources such as Reddit and Twitter posts. The system is implemented using Natural Language Processing (NLP) methods and machine learning and deep learning algorithms, with models trained on a prepared dataset of social media postings. With this automated system, mental health experts can detect stress or other emotions of social media users much earlier and faster. The proposed system predicts five emotional categories: 'Happy', 'Angry', 'Surprise', 'Sad', and 'Fear', based on machine learning (Logistic Regression, Random Forest, SVM), deep learning Long Short-Term Memory (LSTM), and BERT transfer learning algorithms. All the applied algorithms are evaluated using a confusion matrix; the highest accuracy and F1 score achieved exceed 90%, which is better than existing human emotion detection systems.
14

Ait-Mlouk, Addi, Sadi A. Alawadi, Salman Toor, and Andreas Hellander. "FedQAS: Privacy-Aware Machine Reading Comprehension with Federated Learning." Applied Sciences 12, no. 6 (March 18, 2022): 3130. http://dx.doi.org/10.3390/app12063130.

Abstract:
Machine reading comprehension (MRC) of text data is a challenging task in Natural Language Processing (NLP), with a lot of ongoing research fueled by the release of the Stanford Question Answering Dataset (SQuAD) and Conversational Question Answering (CoQA). It is considered to be an effort to teach computers how to “understand” a text, and then to be able to answer questions about it using deep learning. However, until now, large-scale training on private text data and knowledge sharing has been missing for this NLP task. Hence, we present FedQAS, a privacy-preserving machine reading system capable of leveraging large-scale private data without the need to pool those datasets in a central location. The proposed approach combines transformer models and federated learning technologies. The system is developed using the FEDn framework and deployed as a proof-of-concept alliance initiative. FedQAS is flexible, language-agnostic, and allows intuitive participation and execution of local model training. In addition, we present the architecture and implementation of the system, as well as provide a reference evaluation based on the SQuAD dataset, to showcase how it overcomes data privacy issues and enables knowledge sharing between alliance members in a Federated learning setting.
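A generic sketch of the federated aggregation principle behind such a system: each alliance member trains locally, and only model weights are averaged centrally (FedAvg-style). This is not the FEDn API, just an illustration with dummy clients and made-up dataset sizes.

```python
# FedAvg-style aggregation: weight each client's parameters by its data size.
import copy
import torch.nn as nn

def fed_avg(client_states, client_sizes):
    total = sum(client_sizes)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(s[key] * (n / total) for s, n in zip(client_states, client_sizes))
    return avg

global_model = nn.Linear(10, 2)                            # toy stand-in for a QA model
clients = [copy.deepcopy(global_model) for _ in range(3)]  # local training happens here
global_model.load_state_dict(fed_avg([c.state_dict() for c in clients],
                                     client_sizes=[100, 250, 80]))
```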
15

Dong, Shanshan, and Chang Liu. "Sentiment Classification for Financial Texts Based on Deep Learning." Computational Intelligence and Neuroscience 2021 (October 11, 2021): 1–9. http://dx.doi.org/10.1155/2021/9524705.

Abstract:
Sentiment classification for financial texts is of great importance for predicting stock markets and financial crises. At present, with the popularity of applications in the field of natural language processing (NLP) adopting deep learning, the application of automatic text classification and text-based sentiment classification has become more and more extensive. However, in the field of financial text-based sentiment classification, due to a lack of labeled samples, such applications are limited. A domain-adaptation-based financial text sentiment classification method is proposed in this paper, which can adopt source domain (SD) text data with sentiment labels and a large amount of unlabeled target domain (TD) financial text data as training samples for the proposed neural network. The proposed method is a cross-domain transfer-learning-based method. The domain classification subnetwork is added to the original neural network, and the domain classification loss function is also added to the original training loss function. Therefore, the network can simultaneously adapt to the target domain and then accomplish the classification task. The experiment of the proposed sentiment classification transfer learning method is carried out through an open-source dataset. The proposed method in this paper uses the reviews of Amazon Books, DVDs, electronics, and kitchen appliances as the source domain for cross-domain learning, and the classification accuracy rates can reach 65.0%, 61.2%, 61.6%, and 66.3%, respectively. Compared with nontransfer learning, the classification accuracy rate has improved by 11.0%, 7.6%, 11.4%, and 13.4%, respectively.
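A minimal sketch of the domain-adversarial idea the abstract describes: a domain-classification head is attached through a gradient reversal layer so that the shared features become domain-invariant; the layer sizes and the reversal strength are assumptions.

```python
# Gradient reversal ties the domain classifier adversarially to the feature extractor.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None        # flip gradients flowing into the extractor

feature_net = nn.Sequential(nn.Linear(300, 100), nn.ReLU())
sentiment_head = nn.Linear(100, 2)          # trained on labeled source reviews only
domain_head = nn.Linear(100, 2)             # source-vs-target discriminator

x = torch.randn(8, 300)                     # dummy text features
feats = feature_net(x)
sentiment_logits = sentiment_head(feats)
domain_logits = domain_head(GradReverse.apply(feats, 1.0))
# total loss = sentiment cross-entropy (source) + domain cross-entropy (source + target)
```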
16

Kastrati, Zenun, Fisnik Dalipi, Ali Shariq Imran, Krenare Pireva Nuci, and Mudasir Ahmad Wani. "Sentiment Analysis of Students’ Feedback with NLP and Deep Learning: A Systematic Mapping Study." Applied Sciences 11, no. 9 (April 28, 2021): 3986. http://dx.doi.org/10.3390/app11093986.

Abstract:
In the last decade, sentiment analysis has been widely applied in many domains, including business, social networks and education. Particularly in the education domain, where dealing with and processing students’ opinions is a complicated task due to the nature of the language used by students and the large volume of information, the application of sentiment analysis is growing yet remains challenging. Several literature reviews reveal the state of the application of sentiment analysis in this domain from different perspectives and contexts. However, the body of literature is lacking a review that systematically classifies the research and results of the application of natural language processing (NLP), deep learning (DL), and machine learning (ML) solutions for sentiment analysis in the education domain. In this article, we present the results of a systematic mapping study to structure the published information available. We used a stepwise PRISMA framework to guide the search process and searched for studies conducted between 2015 and 2020 in the electronic research databases of the scientific literature. We identified 92 relevant studies out of 612 that were initially found on the sentiment analysis of students’ feedback in learning platform environments. The mapping results showed that, despite the identified challenges, the field is rapidly growing, especially regarding the application of DL, which is the most recent trend. We identified various aspects that need to be considered in order to contribute to the maturity of research and development in the field. Among these aspects, we highlighted the need of having structured datasets, standardized solutions and increased focus on emotional expression and detection.
17

Kaur, Gaganpreet, Pratibha, Amandeep Kaur, and Meenu Khurana. "A Review of Opinion Mining Techniques." ECS Transactions 107, no. 1 (April 24, 2022): 10125–32. http://dx.doi.org/10.1149/10701.10125ecst.

Abstract:
Opinion mining, also known as sentiment analysis, is one of the most recent challenges in Natural Language Processing (NLP). Analyzing the opinions individuals express on platforms such as Facebook, Twitter, and Yelp is a difficult task, as such platforms have grown exponentially. With the popularity of social media, a massive amount of data, namely comments, reviews, and opinions, has been generated. According to researchers, sentiment analysis is performed at the sentence level, document level, aspect level, and user level. Analyzing this data consumes considerable time and is difficult to process. As a result, there is a need to create an intelligent system that can classify opinions as positive, negative, or neutral. The main goal of this paper is to provide a brief summary of opinion mining using techniques such as machine learning, deep learning, transfer learning, and the Hadoop framework.
18

Qasim, Rukhma, Waqas Haider Bangyal, Mohammed A. Alqarni, and Abdulwahab Ali Almazroi. "A Fine-Tuned BERT-Based Transfer Learning Approach for Text Classification." Journal of Healthcare Engineering 2022 (January 7, 2022): 1–17. http://dx.doi.org/10.1155/2022/3498123.

Abstract:
The text classification problem has been thoroughly studied in information retrieval and data mining. It is beneficial in multiple tasks, including medical diagnosis in the health care domain, targeted marketing, the entertainment industry, and group filtering processes. Recent innovation in both data mining and natural language processing has gained the attention of researchers from all over the world for developing automated text classification systems. NLP allows categorizing documents containing different texts, and a huge amount of data is generated on social media sites by their users. Three datasets have been used for experimental purposes: the COVID-19 fake news dataset, the COVID-19 English tweet dataset, and the extremist-non-extremist dataset, which contain news blogs, posts, and tweets related to coronavirus and hate speech. Transfer learning approaches had not previously been tested on the COVID-19 fake news and extremist-non-extremist datasets; therefore, the proposed work applies transfer learning classification models to both datasets to check their performance. Models are trained and evaluated on accuracy, precision, recall, and F1-score, and heat maps are generated for every model. In the end, future directions are proposed.
19

Al Duhayyim, Mesfer, Sana Alazwari, Hanan Abdullah Mengash, Radwa Marzouk, Jaber S. Alzahrani, Hany Mahgoub, Fahd Althukair, and Ahmed S. Salama. "Metaheuristics Optimization with Deep Learning Enabled Automated Image Captioning System." Applied Sciences 12, no. 15 (July 31, 2022): 7724. http://dx.doi.org/10.3390/app12157724.

Abstract:
Image captioning is a popular topic in the domains of computer vision and natural language processing (NLP). Recent advancements in deep learning (DL) models have enabled the improvement of the overall performance of the image captioning approach. This study develops a metaheuristic optimization with a deep learning-enabled automated image captioning technique (MODLE-AICT). The proposed MODLE-AICT model focuses on the generation of effective captions for the input images by using two processes involving an encoding unit and a decoding unit. Initially, at the encoding part, the salp swarm algorithm (SSA), with a HybridNet model, is utilized to generate an effectual input image representation using fixed-length vectors, showing the novelty of the work. Moreover, the decoding part includes a bidirectional gated recurrent unit (BiGRU) model used to generate descriptive sentences. The inclusion of an SSA-based hyperparameter optimizer helps in attaining effectual performance. For inspecting the enhanced performance of the MODLE-AICT model, a series of simulations were carried out, and the results are examined under several aspects. The experimental values suggest the superiority of the MODLE-AICT model over recent approaches.
20

Benítez-Andrades, José Alberto, Álvaro González-Jiménez, Álvaro López-Brea, Jose Aveleira-Mata, José-Manuel Alija-Pérez, and María Teresa García-Ordás. "Detecting racism and xenophobia using deep learning models on Twitter data: CNN, LSTM and BERT." PeerJ Computer Science 8 (March 1, 2022): e906. http://dx.doi.org/10.7717/peerj-cs.906.

Abstract:
With the growth that social networks have experienced in recent years, it is entirely impossible to moderate content manually. Thanks to the different existing techniques in natural language processing, it is possible to generate predictive models that automatically classify texts into different categories. However, a weakness has been detected concerning the language used to train such models. This work aimed to develop a predictive model based on BERT, capable of detecting racist and xenophobic messages in tweets written in Spanish. A comparison was made with different Deep Learning models. A total of five predictive models were developed, two based on BERT and three using other deep learning techniques, CNN, LSTM and a model combining CNN + LSTM techniques. After exhaustively analyzing the results obtained by the different models, it was found that the one that got the best metrics was BETO, a BERT-based model trained only with texts written in Spanish. The results of our study show that the BETO model achieves a precision of 85.22% compared to the 82.00% precision of the mBERT model. The rest of the models obtained between 79.34% and 80.48% precision. On this basis, it has been possible to justify the vital importance of developing native transfer learning models for solving Natural Language Processing (NLP) problems in Spanish. Our main contribution is the achievement of promising results in the field of racism and hate speech in Spanish by applying different deep learning techniques.
21

Zhou, Shuohua, and Yanping Zhang. "DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering." Applied Sciences 11, no. 23 (November 26, 2021): 11251. http://dx.doi.org/10.3390/app112311251.

Abstract:
With the outbreak of COVID-19 that has prompted an increased focus on self-care, more and more people hope to obtain disease knowledge from the Internet. In response to this demand, medical question answering and question generation tasks have become an important part of natural language processing (NLP). However, there are limited samples of medical questions and answers, and the question generation systems cannot fully meet the needs of non-professionals for medical questions. In this research, we propose a BERT medical pretraining model, using GPT-2 for question augmentation and T5-Small for topic extraction, calculating the cosine similarity of the extracted topic and using XGBoost for prediction. With augmentation using GPT-2, the prediction accuracy of our model outperforms the state-of-the-art (SOTA) model performance. Our experiment results demonstrate the outstanding performance of our model in medical question answering and question generation tasks, and its great potential to solve other biomedical question answering challenges.
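A small sketch of the topic-matching step described above: compare the topic extracted from a user question with candidate topics by cosine similarity. The embed function is a random stand-in for a real sentence encoder, so the selected candidate is illustrative only.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def embed(text):                        # placeholder for a real sentence encoder
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=128)

question_topic = embed("diabetes symptoms")
candidates = {"diabetes care": embed("diabetes care"),
              "flu season": embed("flu season")}
best = max(candidates, key=lambda k: cosine(question_topic, candidates[k]))
print(best)     # in the full system, XGBoost then predicts over the matched answers
```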
22

JP, Sanjanasri, Vijay Krishna Menon, Soman KP, Rajendran S, and Agnieszka Wolk. "Generation of Cross-Lingual Word Vectors for Low-Resourced Languages Using Deep Learning and Topological Metrics in a Data-Efficient Way." Electronics 10, no. 12 (June 8, 2021): 1372. http://dx.doi.org/10.3390/electronics10121372.

Abstract:
Linguists have been focused on a qualitative comparison of the semantics from different languages. Evaluation of the semantic interpretation among disparate language pairs like English and Tamil is an even more formidable task than for Slavic languages. The concept of word embedding in Natural Language Processing (NLP) has enabled a felicitous opportunity to quantify linguistic semantics. Multi-lingual tasks can be performed by projecting the word embeddings of one language onto the semantic space of the other. This research presents a suite of data-efficient deep learning approaches to deduce the transfer function from the embedding space of English to that of Tamil, deploying three popular embedding algorithms: Word2Vec, GloVe and FastText. A novel evaluation paradigm was devised for the generation of embeddings to assess their effectiveness, using the original embeddings as ground truths. Transferability across other target languages of the proposed model was assessed via pre-trained Word2Vec embeddings from Hindi and Chinese languages. We empirically prove that with a bilingual dictionary of a thousand words and a corresponding small monolingual target (Tamil) corpus, useful embeddings can be generated by transfer learning from a well-trained source (English) embedding. Furthermore, we demonstrate the usability of generated target embeddings in a few NLP use-case tasks, such as text summarization, part-of-speech (POS) tagging, and bilingual dictionary induction (BDI), bearing in mind that those are not the only possible applications.
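For reference on what the transfer function between embedding spaces accomplishes, here is a closed-form baseline: an orthogonal Procrustes mapping fitted on a bilingual dictionary. The paper learns deep, non-linear transfer functions instead; the vectors below are random placeholders for real Word2Vec/GloVe/FastText embeddings.

```python
# Orthogonal Procrustes: W = argmin ||XW - Y||_F subject to W^T W = I.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 300))   # English vectors of the dictionary words (dummy)
Y = rng.normal(size=(1000, 300))   # corresponding Tamil vectors (dummy)

u, _, vt = np.linalg.svd(X.T @ Y)  # SVD gives the optimal orthogonal map
W = u @ vt

def to_target_space(vec_en):
    return vec_en @ W              # project an English vector into the Tamil space

print(to_target_space(X[0]).shape) # (300,)
```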
23

Lee, Ju-Sang, Joon-Choul Shin, and Choel-Young Ock. "The Multi-Hot Representation-Based Language Model to Maintain Morpheme Units." Applied Sciences 12, no. 20 (October 20, 2022): 10612. http://dx.doi.org/10.3390/app122010612.

Abstract:
Natural language models brought rapid developments to Natural Language Processing (NLP) performance following the emergence of large-scale deep learning models. Language models have previously used token units to represent natural language while reducing the proportion of unknown tokens. However, tokenization in language models raises language-specific issues. One of the key issues is that separating words by morphemes may cause distortion to the original meaning; also, it can prove challenging to apply the information surrounding a word, such as its semantic network. We propose a multi-hot representation language model to maintain Korean morpheme units. This method represents a single morpheme as a group of syllable-based tokens for cases where no matching tokens exist. This model has demonstrated similar performance to existing models in various natural language processing applications. The proposed model retains the minimum unit of meaning by maintaining the morpheme units and can easily accommodate the extension of semantic information.
24

Leyh-Bannurah, Sami-Ramzi, Zhe Tian, Pierre I. Karakiewicz, Ulrich Wolffgang, Guido Sauter, Margit Fisch, Dirk Pehrke, Hartwig Huland, Markus Graefen, and Lars Budäus. "Deep Learning for Natural Language Processing in Urology: State-of-the-Art Automated Extraction of Detailed Pathologic Prostate Cancer Data From Narratively Written Electronic Health Records." JCO Clinical Cancer Informatics, no. 2 (December 2018): 1–9. http://dx.doi.org/10.1200/cci.18.00080.

Abstract:
Purpose: Entering all information from narrative documentation for clinical research into databases is time consuming, costly, and nearly impossible. Even high-volume databases do not cover all patient characteristics and drawn results may be limited. A new viable automated solution is machine learning based on deep neural networks applied to natural language processing (NLP), extracting detailed information from narratively written (eg, pathologic radical prostatectomy [RP]) electronic health records (EHRs). Methods: Within an RP pathologic database, 3,679 RP EHRs were randomly split into 70% training and 30% test data sets. Training EHRs were automatically annotated, providing a semiautomatically annotated corpus of narratively written pathologic reports with initially context-free gold standard encodings. Primary and secondary Gleason pattern, corresponding percentages, tumor stage, nodal stage, total volume, tumor volume and diameter, and surgical margin were variables of interest. Second, state-of-the-art NLP techniques were used to train an industry-standard language model for pathologic EHRs by transfer learning. Finally, accuracy of the named entity extractors was compared with the gold standard encodings. Results: Agreement rates (95% confidence interval) for primary and secondary Gleason patterns each were 91.3% (89.4 to 93.0), corresponding to the following: Gleason percentages, 70.5% (67.6 to 73.3) and 80.9% (78.4 to 83.3); tumor stage, 99.3% (98.6 to 99.7); nodal stage, 98.7% (97.8 to 99.3); total volume, 98.3% (97.3 to 99.0); tumor volume, 93.3% (91.6 to 94.8); maximum diameter, 96.3% (94.9 to 97.3); and surgical margin, 98.7% (97.8 to 99.3). Cumulative agreement was 91.3%. Conclusion: Our proposed NLP pipeline offers new abilities for precise and efficient data management from narrative documentation for clinical research. The scalable approach potentially allows the NLP pipeline to be generalized to other genitourinary EHRs, tumor entities, and other medical disciplines.
25

Croce, Danilo, Giuseppe Castellucci, and Roberto Basili. "Adversarial training for few-shot text classification." Intelligenza Artificiale 14, no. 2 (January 11, 2021): 201–14. http://dx.doi.org/10.3233/ia-200051.

Abstract:
In recent years, Deep Learning methods have become very popular in classification tasks for Natural Language Processing (NLP); this is mainly due to their ability to reach high performances by relying on very simple input representations, i.e., raw tokens. One of the drawbacks of deep architectures is the large amount of annotated data required for an effective training. Usually, in Machine Learning this problem is mitigated by the usage of semi-supervised methods or, more recently, by using Transfer Learning, in the context of deep architectures. One recent promising method to enable semi-supervised learning in deep architectures has been formalized within Semi-Supervised Generative Adversarial Networks (SS-GANs) in the context of Computer Vision. In this paper, we adopt the SS-GAN framework to enable semi-supervised learning in the context of NLP. We demonstrate how an SS-GAN can boost the performances of simple architectures when operating in expressive low-dimensional embeddings; these are derived by combining the unsupervised approximation of linguistic Reproducing Kernel Hilbert Spaces and the so-called Universal Sentence Encoders. We experimentally evaluate the proposed approach over a semantic classification task, i.e., Question Classification, by considering different sizes of training material and different numbers of target classes. By applying such adversarial schema to a simple Multi-Layer Perceptron, a classifier trained over a subset derived from 1% of the original training material achieves 92% of accuracy. Moreover, when considering a complex classification schema, e.g., involving 50 classes, the proposed method outperforms state-of-the-art alternatives such as BERT.
26

Qiao, Yanhua, Xiaolei Zhu, and Haipeng Gong. "BERT-Kcr: prediction of lysine crotonylation sites by a transfer learning method with pre-trained BERT models." Bioinformatics 38, no. 3 (October 13, 2021): 648–54. http://dx.doi.org/10.1093/bioinformatics/btab712.

Abstract:
Motivation: As one of the most important post-translational modifications (PTMs), protein lysine crotonylation (Kcr) has attracted wide attention, as it is involved in important physiological activities such as cell differentiation and metabolism. However, experimental methods are expensive and time-consuming for Kcr identification. Instead, computational methods can predict Kcr sites in silico with high efficiency and low cost. Results: In this study, we proposed a novel predictor, BERT-Kcr, for protein Kcr site prediction, which was developed by using a transfer learning method with pre-trained bidirectional encoder representations from transformers (BERT) models. These models were originally used for natural language processing (NLP) tasks, such as sentence classification. Here, we transferred each amino acid into a word as the input information to the pre-trained BERT model. The features encoded by BERT were extracted and then fed to a BiLSTM network to build our final model. Compared with the models built by other machine learning and deep learning classifiers, BERT-Kcr achieved the best performance, with an AUROC of 0.983 for 10-fold cross validation. Further evaluation on the independent test set indicates that BERT-Kcr outperforms the state-of-the-art model Deep-Kcr with an improvement of about 5% in AUROC. The results of our experiment indicate that the direct use of sequence information and advanced pre-trained NLP models could be an effective way of identifying PTM sites of proteins. Availability and implementation: The BERT-Kcr model is publicly available at http://zhulab.org.cn/BERT-Kcr_models/. Supplementary information: Supplementary data are available at Bioinformatics online.
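A minimal sketch of the paper's key input encoding idea: treat each residue in a window around a candidate lysine (K) site as a "word", so a pre-trained BERT tokenizer can consume it as a sentence. The window size and the toy sequence are assumptions.

```python
def kcr_site_to_sentence(sequence, k_index, window=15):
    """Space-separate the +/-window residues around a lysine so each one is a 'word'."""
    left = max(0, k_index - window)
    context = sequence[left:k_index + window + 1]
    return " ".join(context)            # e.g. "M K V L ..." - one token per residue

seq = "MKVLAAGITKCRPLSEQK"
print(kcr_site_to_sentence(seq, seq.index("K")))
# The resulting "sentence" is encoded by BERT, and the per-residue features
# are then classified by a BiLSTM head, as described above.
```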
27

Moon, Junhyung, Gyuyoung Park, and Jongpil Jeong. "POP-ON: Prediction of Process Using One-Way Language Model Based on NLP Approach." Applied Sciences 11, no. 2 (January 18, 2021): 864. http://dx.doi.org/10.3390/app11020864.

Abstract:
In business process management, monitoring services are an important element that can prevent various problems in companies and industries before they occur. Execution logs are created in process-aware enterprise information systems, and they help predict the process. The ultimate goal of the proposed method is to predict the process following the running process instance and to predict events based on previously completed event log data, so that companies can flexibly respond to unwanted deviations in their workflow. To solve the next-event prediction problem, we use a fully attention-based transformer, which has performed well in recent natural language processing approaches. After recognizing the name attribute of the event in natural language and predicting the next event, several necessary elements were applied. The model is trained using the proposed deep learning architecture after specific pre-processing steps. Experiments using various business process log datasets demonstrate the superior performance of the proposed method. The name of the proposed process prediction model is "POP-ON".
28

Cheng, Zishuai, Baojiang Cui, Tao Qi, Wenchuan Yang, and Junsong Fu. "An Improved Feature Extraction Approach for Web Anomaly Detection Based on Semantic Structure." Security and Communication Networks 2021 (February 11, 2021): 1–11. http://dx.doi.org/10.1155/2021/6661124.

Abstract:
Anomaly-based Web application firewalls (WAFs) are vital for providing early reactions to novel Web attacks. In recent years, various machine learning, deep learning, and transfer learning-based anomaly detection approaches have been developed to protect against Web attacks. Most of them directly treat the request URL as a general string of letters and roughly use natural language processing (NLP) methods (e.g., Word2Vec and Doc2Vec) or domain knowledge to extract features. In this paper, we propose an improved feature extraction approach which leverages the advantage of the semantic structure of URLs. Semantic structure is an inherent interpretative property of the URL that identifies the function and vulnerability of each part in the URL. The evaluations on CSIC-2020 show that our feature extraction method performs better than the conventional feature extraction routine, with an average improvement of more than 5% in accuracy, recall, and F1-score.
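A hedged sketch of what semantic-structure feature extraction for a request URL might look like: the URL is split into functional parts (path segments, parameter names, parameter values) instead of being treated as one flat string. The specific features are assumptions, not the paper's exact set.

```python
# Structure-aware URL features: names identify function, values carry payloads.
from urllib.parse import urlparse, parse_qsl

def url_structure_features(url):
    parsed = urlparse(url)
    params = parse_qsl(parsed.query)
    return {
        "path_segments": [s for s in parsed.path.split("/") if s],
        "param_names": [k for k, _ in params],
        "param_value_lens": [len(v) for _, v in params],
        "has_special": any(c in parsed.query for c in "<>'\";"),
    }

print(url_structure_features(
    "http://example.com/tienda/publico/vaciar.jsp?B2=Vaciar+carrito&id=1'--"))
```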
29

Pardamean, Amsal, and Hilman F. Pardede. "Tuned bidirectional encoder representations from transformers for fake news detection." Indonesian Journal of Electrical Engineering and Computer Science 22, no. 3 (June 1, 2021): 1667. http://dx.doi.org/10.11591/ijeecs.v22.i3.pp1667-1671.

Abstract:
Online media are currently the dominant source of information, as they are not limited by time and place and offer fast and wide distribution. However, inaccurate news, often referred to as fake news, is a major problem in news dissemination for online media. Inaccurate news is information that is not true, that is engineered to cover the real information, and that has no factual basis. Usually, inaccurate news is made in the form of news that has mass appeal and is presented in the guise of genuine and legitimate news to deceive or change the reader's mind or opinion. Identification of inaccurate news versus real news can be done with natural language processing (NLP) technologies. In this paper, we propose bidirectional encoder representations from transformers (BERT) for inaccurate news identification. BERT is a language model based on deep learning technologies that has been found effective for many NLP tasks. In this study, we use transfer learning and fine-tuning to adapt BERT for inaccurate news identification. The experiments show that our method achieves an accuracy of 99.23%, recall of 99.46%, precision of 98.86%, and F-score of 99.15%, largely better than traditional methods for the same task.
30

Yoon, Hong-Jun, Christopher Stanley, J. Blair Christian, Hilda B. Klasky, Andrew E. Blanchard, Eric B. Durbin, Xiao-Cheng Wu, et al. "Optimal vocabulary selection approaches for privacy-preserving deep NLP model training for information extraction and cancer epidemiology." Cancer Biomarkers 33, no. 2 (February 14, 2022): 185–98. http://dx.doi.org/10.3233/cbm-210306.

Abstract:
BACKGROUND: With the use of artificial intelligence and machine learning techniques for biomedical informatics, security and privacy concerns over the data and subject identities have also become an important issue and essential research topic. Without intentional safeguards, machine learning models may find patterns and features to improve task performance that are associated with private personal information. OBJECTIVE: The privacy vulnerability of deep learning models for information extraction from medical textual contents needs to be quantified since the models are exposed to private health information and personally identifiable information. The objective of the study is to quantify the privacy vulnerability of the deep learning models for natural language processing and explore a proper way of securing patients’ information to mitigate confidentiality breaches. METHODS: The target model is the multitask convolutional neural network for information extraction from cancer pathology reports, where the data for training the model are from multiple state population-based cancer registries. This study proposes the following schemes to collect vocabularies from the cancer pathology reports: (a) words appearing in multiple registries, and (b) words that have higher mutual information. We performed membership inference attacks on the models in high-performance computing environments. RESULTS: The comparison outcomes suggest that the proposed vocabulary selection methods resulted in lower privacy vulnerability while maintaining the same level of clinical task performance.
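A small sketch of vocabulary selection scheme (b), keeping only words with high mutual information with the class label, using scikit-learn; the toy reports, labels, and threshold are placeholders for registry data and tuned values.

```python
# Mutual-information-based vocabulary selection over a toy pathology corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import mutual_info_classif

reports = ["infiltrating ductal carcinoma grade 2",
           "squamous cell carcinoma of the lung",
           "adenocarcinoma with lymph node involvement",
           "benign fibrous tissue no malignancy"]
labels = [0, 1, 2, 3]                        # dummy site/histology codes

vec = CountVectorizer()
X = vec.fit_transform(reports)
mi = mutual_info_classif(X, labels, discrete_features=True, random_state=0)

vocabulary = [w for w, score in zip(vec.get_feature_names_out(), mi) if score > 0.1]
print(sorted(vocabulary))                    # reduced, less privacy-leaking vocabulary
```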
31

Shi, Binbin, Lijuan Zhang, Jie Huang, Huilin Zheng, Jian Wan, and Lei Zhang. "MDA: An Intelligent Medical Data Augmentation Scheme Based on Medical Knowledge Graph for Chinese Medical Tasks." Applied Sciences 12, no. 20 (October 21, 2022): 10655. http://dx.doi.org/10.3390/app122010655.

Abstract:
Text data augmentation is essential in the field of medicine for natural language processing (NLP) tasks. However, most traditional text data augmentation focuses on English datasets, and there is little research on augmenting Chinese sentences. Moreover, traditional text data augmentation ignores the semantics between words in sentences and has limitations in alleviating the problem of the diversity of augmented sentences. In this paper, a novel medical data augmentation (MDA) is proposed for NLP tasks, which combines the medical knowledge graph with text data augmentation to generate augmented data. Experiments on the named entity recognition task and relational classification task demonstrate that the MDA can significantly enhance the efficiency of deep learning models compared to cases without augmentation.
32

Mars, Mourad. "From Word Embeddings to Pre-Trained Language Models: A State-of-the-Art Walkthrough." Applied Sciences 12, no. 17 (September 1, 2022): 8805. http://dx.doi.org/10.3390/app12178805.

Abstract:
With the recent advances in deep learning, different approaches to improving pre-trained language models (PLMs) have been proposed. PLMs have advanced state-of-the-art (SOTA) performance on various natural language processing (NLP) tasks such as machine translation, text classification, question answering, text summarization, information retrieval, recommendation systems, named entity recognition, etc. In this paper, we provide a comprehensive review of prior embedding models as well as current breakthroughs in the field of PLMs. Then, we analyse and contrast the various models and provide an analysis of the way they have been built (number of parameters, compression techniques, etc.). Finally, we discuss the major issues and future directions for each of the main points.
33

Balouchzahi, Fazlourrahman, Grigori Sidorov, and Hosahalli Lakshmaiah Shashirekha. "Fake news spreaders profiling using N-grams of various types and SHAP-based feature selection." Journal of Intelligent & Fuzzy Systems 42, no. 5 (March 31, 2022): 4437–48. http://dx.doi.org/10.3233/jifs-219233.

Abstract:
Complex learning approaches along with complicated and expensive features are not always the best or the only solution for Natural Language Processing (NLP) tasks. Despite huge progress and advancements in learning approaches such as Deep Learning (DL) and Transfer Learning (TL), there are many NLP tasks such as Text Classification (TC), for which basic Machine Learning (ML) classifiers perform superior to DL or TL approaches. Added to this, an efficient feature engineering step can significantly improve the performance of ML based systems. To check the efficacy of ML based systems and feature engineering on TC, this paper explores char, character sequences, syllables, word n-grams as well as syntactic n-grams as features and SHapley Additive exPlanations (SHAP) values to select the important features from the collection of extracted features. Voting Classifiers (VC) with soft and hard voting of four ML classifiers, namely: Support Vector Machine (SVM) with Linear and Radial Basis Function (RBF) kernel, Logistic Regression (LR), and Random Forest (RF) was trained and evaluated on Fake News Spreaders Profiling (FNSP) shared task dataset in PAN 2020. This shared task consists of profiling fake news spreaders in English and Spanish languages. The proposed models exhibited an average accuracy of 0.785 for both languages in this shared task and outperformed the best models submitted to this task.
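A minimal sketch of the base pipeline described above: word and character n-gram TF-IDF features feeding a hard-voting ensemble of SVM, LR, and RF; the SHAP-based feature selection step is omitted here, and the corpus is a dummy placeholder.

```python
# N-gram features + voting ensemble for fake-news-spreader-style classification.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVC

texts = ["shocking cure they don't want you to know",
         "city council passes annual budget",
         "aliens built the pyramids, experts say",
         "light rainfall expected this weekend"]
labels = [1, 0, 1, 0]                        # 1 = fake-news spreader style (dummy)

features = FeatureUnion([
    ("word", TfidfVectorizer(ngram_range=(1, 2))),
    ("char", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
])
ensemble = VotingClassifier(
    [("svm", SVC(kernel="rbf")),
     ("lr", LogisticRegression(max_iter=1000)),
     ("rf", RandomForestClassifier(n_estimators=100, random_state=0))],
    voting="hard",
)
model = Pipeline([("feats", features), ("vote", ensemble)])
model.fit(texts, labels)
print(model.predict(["miracle diet doctors hate"]))
```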
34

Almanaseer, Waref, Mohammad Alshraideh, and Omar Alkadi. "A Deep Belief Network Classification Approach for Automatic Diacritization of Arabic Text." Applied Sciences 11, no. 11 (June 4, 2021): 5228. http://dx.doi.org/10.3390/app11115228.

Abstract:
Deep learning has emerged as a new area of machine learning research. It is an approach that can learn features and hierarchical representation purely from data and has been successfully applied to several fields such as images, sounds, text and motion. The techniques developed from deep learning research have already been impacting the research on Natural Language Processing (NLP). Arabic diacritics are vital components of Arabic text that remove ambiguity from words and reinforce the meaning of the text. In this paper, a Deep Belief Network (DBN) is used as a diacritizer for Arabic text. DBN is an algorithm among deep learning that has recently proved to be very effective for a variety of machine learning problems. We evaluate the use of DBNs as classifiers in automatic Arabic text diacritization. The DBN was trained to individually classify each input letter with the corresponding diacritized version. Experiments were conducted using two benchmark datasets, the LDC ATB3 and Tashkeela. Our best settings achieve a DER and WER of 2.21% and 6.73%, respectively, on the ATB3 benchmark, with an improvement of 26% over the best published results. On the Tashkeela benchmark, our system continues to achieve high accuracy, with a DER of 1.79% and a 14% improvement.
35

Mouratidis, Despoina, Katia Lida Kermanidis, and Vilelmini Sosoni. "Innovatively Fused Deep Learning with Limited Noisy Data for Evaluating Translations from Poor into Rich Morphology." Applied Sciences 11, no. 2 (January 11, 2021): 639. http://dx.doi.org/10.3390/app11020639.

Abstract:
Evaluation of machine translation (MT) into morphologically rich languages has not been well studied despite its importance. This paper proposes a classifier, namely a deep learning (DL) schema for MT evaluation, based on different categories of information (linguistic features, natural language processing (NLP) metrics, and embeddings), using a machine learning model designed for small, noisy datasets. The linguistic features are string-based for the language pairs English (EN)–Greek (EL) and EN–Italian (IT). The paper also explores the linguistic differences that affect evaluation accuracy between different kinds of corpora. A comparative study between using a simple (mathematically calculated) embedding layer and pre-trained embeddings is conducted. Moreover, an analysis of the impact of feature selection and dimensionality reduction on classification accuracy has been conducted. Results show that using a neural network (NN) model with different input representations produces results that clearly outperform the state of the art for MT evaluation for EN–EL and EN–IT, with an increase of almost 0.40 points in correlation with human judgments on pairwise MT evaluation. The proposed algorithm achieved better results on noisy and small datasets. In addition, for a more integrated analysis of the accuracy results, a qualitative linguistic analysis has been carried out in order to address complex linguistic phenomena.
36

Mahany, Ahmed, Heba Khaled, Nouh Sabri Elmitwally, Naif Aljohani, and Said Ghoniemy. "Negation and Speculation in NLP: A Survey, Corpora, Methods, and Applications." Applied Sciences 12, no. 10 (May 21, 2022): 5209. http://dx.doi.org/10.3390/app12105209.

Abstract:
Negation and speculation are universal linguistic phenomena that affect the performance of Natural Language Processing (NLP) applications, such as those for opinion mining and information retrieval, especially in biomedical data. In this article, we review the corpora annotated with negation and speculation in various natural languages and domains. Furthermore, we discuss the ongoing research into recent rule-based, supervised, and transfer learning techniques for the detection of negated and speculative content. Many English corpora for various domains are now annotated with negation and speculation; moreover, the availability of annotated corpora in other languages has started to increase. However, this growth is insufficient to address these important phenomena in languages with limited resources. The use of cross-lingual models and translation from well-resourced languages are acceptable alternatives. We also highlight the lack of consistent annotation guidelines and the shortcomings of the existing techniques, and suggest alternatives that may speed up progress in this research direction. Adding more syntactic features may alleviate the limitations of the existing techniques, such as cue ambiguity and the detection of discontinuous scopes. In some NLP applications, the inclusion of a negation- and speculation-aware system improves performance, yet this aspect is still not addressed or is not considered an essential step.
37

Gupta, Meenu, Hao Wu, Simrann Arora, Akash Gupta, Gopal Chaudhary, and Qiaozhi Hua. "Gene Mutation Classification through Text Evidence Facilitating Cancer Tumour Detection." Journal of Healthcare Engineering 2021 (July 27, 2021): 1–16. http://dx.doi.org/10.1155/2021/8689873.

Abstract:
A cancer tumour consists of thousands of genetic mutations. Even with advances in technology, the task of distinguishing the genetic mutations that act as drivers of tumour growth from passengers (neutral genetic mutations) is still performed manually. This is a time-consuming process in which pathologists interpret every genetic mutation from the clinical evidence manually. These pieces of clinical evidence belong to a total of nine classes, but the classification criterion is still unknown. The main aim of this research is to propose a multiclass classifier to classify the genetic mutations based on clinical evidence (i.e., the text description of these genetic mutations) using Natural Language Processing (NLP) techniques. The dataset for this research is taken from Kaggle and is provided by the Memorial Sloan Kettering Cancer Center (MSKCC); it was contributed by world-class researchers and oncologists. Three text transformation models, namely CountVectorizer, TfidfVectorizer, and Word2Vec, are utilized for the conversion of text to a matrix of token counts. Three machine learning classification models, namely Logistic Regression (LR), Random Forest (RF), and XGBoost (XGB), along with the Recurrent Neural Network (RNN) model of deep learning, are applied to the sparse matrix (keyword-count representation) of text descriptions. The accuracy score of all the proposed classifiers is evaluated using the confusion matrix. Finally, the empirical results show that the RNN model of deep learning performed better than the other proposed classifiers, with the highest accuracy of 70%.
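To make the pipeline concrete, here is a hedged sketch of the vectorize-then-classify stage using scikit-learn; the synthetic nine-class toy data stands in for the MSKCC text descriptions, and the RNN and XGBoost variants are omitted.

```python
# Illustrative sketch only: TF-IDF features + multiclass classifiers over
# nine mutation classes, mirroring the pipeline described in the abstract.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

docs = [f"clinical evidence text for variant {i}" for i in range(90)]  # placeholders
classes = [i % 9 for i in range(90)]                                   # nine classes

X_tr, X_te, y_tr, y_te = train_test_split(docs, classes, test_size=0.2, random_state=0)
vec = TfidfVectorizer(ngram_range=(1, 2), max_features=50_000)
M_tr, M_te = vec.fit_transform(X_tr), vec.transform(X_te)

for clf in (LogisticRegression(max_iter=1000), RandomForestClassifier(n_estimators=300)):
    clf.fit(M_tr, y_tr)
    pred = clf.predict(M_te)
    # The paper evaluates accuracy via the confusion matrix; shown here too.
    print(type(clf).__name__, accuracy_score(y_te, pred))
    print(confusion_matrix(y_te, pred))
```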
38

Rosado, Eduardo, Miguel Garcia-Remesal, Sergio Paraiso-Medina, Alejandro Pazos, and Victor Maojo. "Using Machine Learning to Collect and Facilitate Remote Access to Biomedical Databases: Development of the Biomedical Database Inventory." JMIR Medical Informatics 9, no. 2 (February 25, 2021): e22976. http://dx.doi.org/10.2196/22976.

Abstract:
Background Currently, existing biomedical literature repositories do not commonly provide users with specific means to locate and remotely access biomedical databases. Objective To address this issue, we developed the Biomedical Database Inventory (BiDI), a repository linking to biomedical databases automatically extracted from the scientific literature. BiDI provides an index of data resources and a path to access them seamlessly. Methods We designed an ensemble of deep learning methods to extract database mentions. To train the system, we annotated a set of 1242 articles that included mentions of database publications. Such a data set was used along with transfer learning techniques to train an ensemble of deep learning natural language processing models targeted at database publication detection. Results The system obtained an F1 score of 0.929 on database detection, showing high precision and recall values. When applying this model to the PubMed and PubMed Central databases, we identified over 10,000 unique databases. The ensemble model also extracted the weblinks to the reported databases and discarded irrelevant links. For the extraction of weblinks, the model achieved a cross-validated F1 score of 0.908. We show two use cases: one related to “omics” and the other related to the COVID-19 pandemic. Conclusions BiDI enables access to biomedical resources over the internet and facilitates data-driven research and other scientific initiatives. The repository is openly available online and will be regularly updated with an automatic text processing pipeline. The approach can be reused to create repositories of different types (ie, biomedical and others).
39

Silvestri, Stefano, Francesco Gargiulo, and Mario Ciampi. "Iterative Annotation of Biomedical NER Corpora with Deep Neural Networks and Knowledge Bases." Applied Sciences 12, no. 12 (June 7, 2022): 5775. http://dx.doi.org/10.3390/app12125775.

Abstract:
The large availability of clinical natural language documents, such as clinical narratives or diagnoses, requires the definition of smart automatic systems for their processing and analysis, but the lack of annotated corpora in the biomedical domain, especially in languages other than English, makes it difficult to exploit state-of-the-art machine learning systems to extract information from such documents. For these reasons, healthcare professionals miss significant opportunities that can arise from the analysis of this data. In this paper, we propose a methodology to reduce the manual effort needed to annotate a biomedical named entity recognition (B-NER) corpus, exploiting both active learning and distant supervision, respectively based on deep learning models (e.g., Bi-LSTM, word2vec FastText, ELMo and BERT) and biomedical knowledge bases, in order to speed up the annotation task and limit class imbalance issues. We assessed this approach by creating an Italian-language electronic health record corpus annotated with biomedical domain entities in a small fraction of the time required for a fully manual annotation. The obtained corpus was used to train a B-NER deep neural network whose performance is comparable with the state of the art, with F1-scores of 0.9661 and 0.8875 on two test sets.
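The active learning half of such a methodology can be sketched as an uncertainty-sampling loop; below, a simple scikit-learn classifier stands in for the Bi-LSTM/BERT taggers, and the pool contents, batch size, and oracle are placeholder assumptions.

```python
# Sketch of an uncertainty-sampling active learning loop (illustrative only).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

pool = [f"clinical sentence number {i}" for i in range(200)]  # unlabelled pool
labels = {i: i % 2 for i in range(10)}                        # tiny labelled seed set

vec = TfidfVectorizer().fit(pool)
X = vec.transform(pool)

for round_ in range(3):
    train_idx = sorted(labels)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[train_idx], [labels[i] for i in train_idx])
    probs = np.sort(clf.predict_proba(X), axis=1)
    margin = probs[:, -1] - probs[:, -2]          # small margin = uncertain
    queries = [i for i in np.argsort(margin) if i not in labels][:20]
    for i in queries:
        # In practice these go to a human annotator, possibly pre-labelled
        # by distant supervision against a biomedical knowledge base.
        labels[i] = i % 2
print(f"labelled examples after 3 rounds: {len(labels)}")
```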
40

Balabin, Helena, Charles Tapley Hoyt, Colin Birkenbihl, Benjamin M. Gyori, John Bachman, Alpha Tom Kodamullil, Paul G. Plöger, Martin Hofmann-Apitius, and Daniel Domingo-Fernández. "STonKGs: a sophisticated transformer trained on biomedical text and knowledge graphs." Bioinformatics 38, no. 6 (January 5, 2022): 1648–56. http://dx.doi.org/10.1093/bioinformatics/btac001.

Abstract:
Abstract Motivation The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models. However, representations based on a single modality are inherently limited. Results To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs (KGs). This multimodal Transformer uses combined input sequences of structured information from KGs and unstructured text data from biomedical literature to learn joint representations in a shared embedding space. First, we pre-trained STonKGs on a knowledge base assembled by the Integrated Network and Dynamical Reasoning Assembler consisting of millions of text-triple pairs extracted from biomedical literature by multiple NLP systems. Then, we benchmarked STonKGs against three baseline models trained on either one of the modalities (i.e. text or KG) across eight different classification tasks, each corresponding to a different biological application. Our results demonstrate that STonKGs outperforms both baselines, especially on the more challenging tasks with respect to the number of classes, improving upon the F1-score of the best baseline by up to 0.084 (i.e. from 0.881 to 0.965). Finally, our pre-trained model as well as the model architecture can be adapted to various other transfer learning applications. Availability and implementation We make the source code and the Python package of STonKGs available at GitHub (https://github.com/stonkgs/stonkgs) and PyPI (https://pypi.org/project/stonkgs/). The pre-trained STonKGs models and the task-specific classification models are respectively available at https://huggingface.co/stonkgs/stonkgs-150k and https://zenodo.org/communities/stonkgs. Supplementary information Supplementary data are available at Bioinformatics online.
41

Saleh, Hager, Sherif Mostafa, Lubna Abdelkareim Gabralla, Ahmad O. Aseeri, and Shaker El-Sappagh. "Enhanced Arabic Sentiment Analysis Using a Novel Stacking Ensemble of Hybrid and Deep Learning Models." Applied Sciences 12, no. 18 (September 7, 2022): 8967. http://dx.doi.org/10.3390/app12188967.

Abstract:
Sentiment analysis (SA) is a machine learning application that derives people's opinions from text using natural language processing (NLP) techniques. Implementing Arabic SA is challenging for many reasons, including equivocation, numerous dialects, lack of resources, morphological diversity, lack of contextual information, and the hiding of sentiment terms in implicit text. Deep learning models such as convolutional neural networks (CNN) and long short-term memory (LSTM) have brought significant improvements to the Arabic SA domain. Hybrid models combining CNN with LSTM or gated recurrent units (GRU) have further improved the performance of single DL models. In addition, ensembles of deep learning models, especially stacking ensembles, are expected to increase the robustness and accuracy of the previous DL models. In this paper, we propose a stacking ensemble model that combines the predictive power of CNN and hybrid deep learning models to predict Arabic sentiment accurately. The stacking ensemble algorithm has two main phases. Three DL models are optimized in the first phase: a deep CNN, a hybrid CNN-LSTM, and a hybrid CNN-GRU. In the second phase, the outputs of these three separately pre-trained models are integrated with a support vector machine (SVM) meta-learner. To extract features for the DL models, the continuous bag of words (CBOW) and skip-gram models with 300-dimensional word embeddings were used. The Arabic health services datasets (Main-AHS and Sub-AHS) and the Arabic sentiment tweets dataset (ASTD) were used to train and test the models. A number of well-known deep learning models (DeepCNN, hybrid CNN-LSTM, and hybrid CNN-GRU) and conventional ML algorithms were used to compare the performance of the proposed ensemble model. We found that the proposed deep stacking model achieved the best performance compared to the previous models. Based on the CBOW word embedding, the proposed model achieved the highest accuracies of 92.12%, 95.81%, and 81.4% for the Main-AHS, Sub-AHS, and ASTD datasets, respectively.
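The two-phase stacking idea can be sketched compactly: out-of-fold predictions from pre-trained base models become the meta-learner's features. In the sketch below, simple scikit-learn classifiers stand in for the CNN, CNN-LSTM, and CNN-GRU base networks; the toy data is an assumption.

```python
# Sketch of the two-phase stacking ensemble with an SVM meta-learner.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

texts = [f"review {i} " + ("great food" if i % 2 else "poor service")
         for i in range(100)]                       # placeholder sentiment data
y = np.array([i % 2 for i in range(100)])
X = TfidfVectorizer().fit_transform(texts)

# Phase 1: out-of-fold class probabilities from each base model
# (stand-ins here for the CNN, CNN-LSTM, and CNN-GRU networks).
base_models = [LogisticRegression(max_iter=1000), MultinomialNB()]
meta_features = np.hstack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba") for m in base_models
])

# Phase 2: the SVM meta-learner combines the base predictions.
meta = SVC(kernel="rbf").fit(meta_features, y)
print("meta-learner training accuracy:", meta.score(meta_features, y))
```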
42

Rivera Zavala, Renzo, and Paloma Martinez. "The Impact of Pretrained Language Models on Negation and Speculation Detection in Cross-Lingual Medical Text: Comparative Study." JMIR Medical Informatics 8, no. 12 (December 3, 2020): e18953. http://dx.doi.org/10.2196/18953.

Abstract:
Background: Negation and speculation are critical elements in natural language processing (NLP)-related tasks, such as information extraction, as these phenomena change the truth value of a proposition. In the informal clinical narrative, these linguistic facts are used extensively with the objective of indicating hypotheses, impressions, or negative findings. Previous state-of-the-art approaches addressed negation and speculation detection tasks using rule-based methods, but in the last few years, models based on machine learning and deep learning exploiting morphological, syntactic, and semantic features represented as sparse and dense vectors have emerged. However, although such methods of named entity recognition (NER) employ a broad set of features, they are limited to existing pretrained models for a specific domain or language. Objective: As a fundamental subsystem of any information extraction pipeline, a system for cross-lingual and domain-independent negation and speculation detection was introduced, with special focus on the biomedical scientific literature and clinical narrative. In this work, detection of negation and speculation was considered a sequence-labeling task where cues and the scopes of both phenomena are recognized as a sequence of nested labels in a single step. Methods: We propose the following two approaches for negation and speculation detection: (1) bidirectional long short-term memory (Bi-LSTM) and conditional random field using character, word, and sense embeddings to deal with the extraction of semantic, syntactic, and contextual patterns, and (2) bidirectional encoder representations from transformers (BERT) with fine-tuning for NER. Results: The approach was evaluated for English and Spanish on biomedical and review text, particularly the BioScope corpus, the IULA corpus, and the SFU Spanish Review corpus, with F-measures of 86.6%, 85.0%, and 88.1%, respectively, for NeuroNER and 86.4%, 80.8%, and 91.7%, respectively, for BERT. Conclusions: These results show that these architectures perform considerably better than the previous rule-based and conventional machine learning-based systems. Moreover, our analysis shows that pretrained word embeddings, and particularly contextualized embeddings for biomedical corpora, help to understand the complexities inherent to biomedical text.
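As a hedged illustration of approach (2), the sketch below sets up BERT token classification over BIO-style cue/scope labels with Hugging Face Transformers; the multilingual checkpoint and the label set are assumptions, and the head is untrained until fine-tuned on corpora such as BioScope.

```python
# Sketch: BERT token classification over BIO-style negation cue/scope labels.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

labels = ["O", "B-CUE", "I-CUE", "B-SCOPE", "I-SCOPE"]     # assumed label set
name = "bert-base-multilingual-cased"                      # assumed checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForTokenClassification.from_pretrained(name, num_labels=len(labels))

enc = tok("No evidence of tumour recurrence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits              # shape: (1, seq_len, num_labels)
print([labels[i] for i in logits.argmax(-1)[0].tolist()])
# Output is random until the head is fine-tuned on annotated cue/scope data.
```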
43

Ali, Noha, Ahmed H. AbuEl-Atta, and Hala H. Zayed. "Enhancing the performance of cancer text classification model based on cancer hallmarks." IAES International Journal of Artificial Intelligence (IJ-AI) 10, no. 2 (June 1, 2021): 316. http://dx.doi.org/10.11591/ijai.v10.i2.pp316-323.

Abstract:
<span id="docs-internal-guid-cb130a3a-7fff-3e11-ae3d-ad2310e265f8"><span>Deep learning (DL) algorithms achieved state-of-the-art performance in computer vision, speech recognition, and natural language processing (NLP). In this paper, we enhance the convolutional neural network (CNN) algorithm to classify cancer articles according to cancer hallmarks. The model implements a recent word embedding technique in the embedding layer. This technique uses the concept of distributed phrase representation and multi-word phrases embedding. The proposed model enhances the performance of the existing model used for biomedical text classification. The result of the proposed model overcomes the previous model by achieving an F-score equal to 83.87% using an unsupervised technique that trained on PubMed abstracts called PMC vectors (PMCVec) embedding. Also, we made another experiment on the same dataset using the recurrent neural network (RNN) algorithm with two different word embeddings Google news and PMCVec which achieving F-score equal to 74.9% and 76.26%, respectively.</span></span>
44

Salman, Muhammad, Hafiz Suliman Munawar, Khalid Latif, Muhammad Waseem Akram, Sara Imran Khan, and Fahim Ullah. "Big Data Management in Drug–Drug Interaction: A Modern Deep Learning Approach for Smart Healthcare." Big Data and Cognitive Computing 6, no. 1 (March 9, 2022): 30. http://dx.doi.org/10.3390/bdcc6010030.

Abstract:
The detection and classification of drug–drug interactions (DDI) from existing data are of high importance because recent reports show that DDIs are among the major causes of hospital-acquired conditions and readmissions; such knowledge is also necessary for smart healthcare. Therefore, to avoid adverse drug interactions, it is necessary to have up-to-date knowledge of DDIs. This knowledge could be extracted by applying text-processing techniques to the medical literature published in the form of ‘Big Data’ because, whenever a drug interaction is investigated, it is typically reported and published in healthcare and clinical pharmacology journals. However, it is crucial to automate the extraction of the interactions taking place between drugs because the medical literature is being published in immense volumes, and it is impossible for healthcare professionals to read and collect all of the investigated DDI reports from these Big Data. To avoid this time-consuming procedure, the Information Extraction (IE) and Relationship Extraction (RE) techniques that have been studied in depth in Natural Language Processing (NLP) are very promising. Since 2011, a lot of research has been reported in this particular area, and many approaches have been implemented that can also be applied to biomedical texts to extract DDI-related information. A benchmark corpus is also publicly available for the advancement of DDI extraction tasks. The current state-of-the-art implementations for extracting DDIs from biomedical texts have employed Support Vector Machines (SVM) or other machine learning methods that work on manually defined features, which might be the cause of the low precision and recall achieved in this domain so far. Modern deep learning techniques have also been applied to the automatic extraction of DDIs from the scientific literature and have proven very promising for the advancement of DDI extraction tasks. As such, it is pertinent to investigate deep learning techniques for the extraction and classification of DDIs so that they can be used in the smart healthcare domain. We propose a deep neural network-based method (SEV-DDI: Severity-Drug–Drug Interaction) with further integrated units/layers to achieve higher precision and accuracy. After successfully outperforming other methods in the DDI classification task, we moved a step further and utilized the method in a sentiment analysis task to investigate the severity of an interaction. The ability to determine the severity of a DDI will be very helpful for clinical decision support systems in making more accurate and informed decisions, ensuring the safety of patients.
45

Vu, Van-Hai, Quang-Phuoc Nguyen, Ebipatei Victoria Tunyan, and Cheol-Young Ock. "Improving the Performance of Vietnamese–Korean Neural Machine Translation with Contextual Embedding." Applied Sciences 11, no. 23 (November 23, 2021): 11119. http://dx.doi.org/10.3390/app112311119.

Abstract:
With the recent evolution of deep learning, machine translation (MT) models and systems are being steadily improved. However, research on MT in low-resource languages such as Vietnamese and Korean is still very limited. In recent years, a state-of-the-art context-based embedding model introduced by Google, bidirectional encoder representations from transformers (BERT), has begun to appear in neural MT (NMT) models in different ways to enhance the accuracy of MT systems. The BERT model for Vietnamese has been developed and has significantly improved natural language processing (NLP) tasks such as part-of-speech (POS) tagging, named-entity recognition, dependency parsing, and natural language inference. Our research experimented with applying the Vietnamese BERT model to provide POS tagging and morphological analysis (MA) for Vietnamese sentences, and applying word-sense disambiguation (WSD) to Korean sentences in our Vietnamese–Korean bilingual corpus. In the Vietnamese–Korean NMT system, with contextual embedding, the BERT model for Vietnamese is concurrently connected to both the encoder layers and the decoder layers of the NMT model. Experimental results assessed through the BLEU, METEOR, and TER metrics show that contextual embedding significantly improves the quality of Vietnamese–Korean NMT.
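A hedged sketch of the contextual-embedding ingredient: obtain BERT hidden states for a Vietnamese sentence, which an NMT model could consume in its encoder and decoder layers. The PhoBERT checkpoint is an assumption standing in for "the BERT model for Vietnamese", and PhoBERT normally expects word-segmented input.

```python
# Sketch: contextual embeddings from a Vietnamese BERT for an NMT encoder/decoder.
import torch
from transformers import AutoModel, AutoTokenizer

name = "vinai/phobert-base"   # assumed checkpoint; expects word-segmented input
tok = AutoTokenizer.from_pretrained(name)
bert = AutoModel.from_pretrained(name)

sentence = "Tôi thích học tiếng Hàn ."   # "I like learning Korean."
enc = tok(sentence, return_tensors="pt")
with torch.no_grad():
    ctx = bert(**enc).last_hidden_state   # (1, seq_len, 768) contextual vectors
print(ctx.shape)
# These vectors would be fused into both the encoder and decoder layers
# of the NMT model, as the paper describes.
```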
46

Ali, Wazir, Jay Kumar, Zenglin Xu, Rajesh Kumar, and Yazhou Ren. "Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition." Applied Sciences 11, no. 19 (September 28, 2021): 9038. http://dx.doi.org/10.3390/app11199038.

Abstract:
Named entity recognition (NER) is a fundamental task in many natural language processing (NLP) applications, such as text summarization and semantic information retrieval. Recently, deep neural networks (NNs) with the attention mechanism have yielded excellent performance in NER by taking advantage of character-level and word-level representation learning. In this paper, we propose a deep context-aware bidirectional long short-term memory (CaBiLSTM) model for the Sindhi NER task. The model relies upon contextual representation learning (CRL), a bidirectional encoder, self-attention, and a sequential conditional random field (CRF). The CaBiLSTM model incorporates task-oriented CRL based on joint character-level and word-level representations. It takes character-level input to learn the character representations. Afterwards, the character representations are transformed into word features, and the bidirectional encoder learns the word representations. The output of the final encoder is fed into the self-attention module through a hidden layer before decoding. Finally, we employ the CRF for the prediction of label sequences. The baselines and the proposed CaBiLSTM model are compared by exploiting pretrained Sindhi GloVe (SdGloVe), Sindhi fastText (SdfastText), task-oriented, and CRL-based word representations on the recently proposed SiNER dataset. Our proposed CaBiLSTM model achieved a high F1-score of 91.25% on the SiNER dataset with CRL, without relying on additional handmade features such as hand-crafted rules, gazetteers, or dictionaries.
47

Mitra, Avijit, Bhanu Pratap Singh Rawat, David D. McManus, and Hong Yu. "Relation Classification for Bleeding Events From Electronic Health Records Using Deep Learning Systems: An Empirical Study." JMIR Medical Informatics 9, no. 7 (July 2, 2021): e27527. http://dx.doi.org/10.2196/27527.

Abstract:
Background: Accurate detection of bleeding events from electronic health records (EHRs) is crucial for identifying and characterizing different common and serious medical problems. To extract such information from EHRs, it is essential to identify the relations between bleeding events and related clinical entities (e.g., bleeding anatomic sites and lab tests). With the advent of natural language processing (NLP) and deep learning (DL)-based techniques, many studies have focused on their applicability to various clinical applications. However, no prior work has utilized DL to extract relations between bleeding events and relevant entities. Objective: In this study, we aimed to evaluate multiple DL systems on a novel EHR data set for bleeding event-related relation classification. Methods: We first expert-annotated a new data set of 1046 deidentified EHR notes for bleeding events and their attributes. On this data set, we evaluated three state-of-the-art DL architectures for the bleeding event relation classification task, namely, the convolutional neural network (CNN), the attention-guided graph convolutional network (AGGCN), and Bidirectional Encoder Representations from Transformers (BERT). We used three BERT-based models, namely, BERT pretrained on biomedical data (BioBERT), BioBERT pretrained on clinical text (Bio+Clinical BERT), and BioBERT pretrained on EHR notes (EhrBERT). Results: Our experiments showed that the BERT-based models significantly outperformed the CNN and AGGCN models. Specifically, BioBERT achieved a macro F1 score of 0.842, outperforming both the AGGCN (macro F1 score, 0.828) and CNN models (macro F1 score, 0.763) by 1.4% (P<.001) and 7.9% (P<.001), respectively. Conclusions: In this comprehensive study, we explored and compared different DL systems for classifying relations between bleeding events and other medical concepts. On our corpus, BERT-based models outperformed other DL models for identifying the relations of bleeding-related entities. In addition to pretrained contextualized word representations, BERT-based models benefited from the use of target entity representations over traditional sequence representations.
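One common way to give a BERT classifier a target entity representation is to wrap the two entities in marker tokens before encoding; the sketch below shows this with a public BioBERT checkpoint. The marker tokens, example sentence, and relation label set are illustrative assumptions, and predictions are meaningful only after fine-tuning.

```python
# Sketch: relation classification with entity marker tokens and BioBERT.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

relations = ["no-relation", "bleeding-site", "bleeding-labtest"]  # assumed labels
name = "dmis-lab/biobert-base-cased-v1.1"                         # public BioBERT
tok = AutoTokenizer.from_pretrained(name)
tok.add_special_tokens({"additional_special_tokens": ["[E1]", "[/E1]", "[E2]", "[/E2]"]})
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=len(relations))
model.resize_token_embeddings(len(tok))   # account for the new marker tokens

text = "Patient had [E1] melena [/E1] confirmed by [E2] hemoglobin drop [/E2]."
enc = tok(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits
print(relations[int(logits.argmax())])    # meaningful only after fine-tuning
```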
48

Moqurrab, Syed Atif, Adeel Anjum, Abid Khan, Mansoor Ahmed, Awais Ahmad, and Gwanggil Jeon. "Deep-Confidentiality : An IoT-Enabled Privacy-Preserving Framework for Unstructured Big Biomedical Data." ACM Transactions on Internet Technology 22, no. 2 (May 31, 2022): 1–21. http://dx.doi.org/10.1145/3421509.

Abstract:
With the evolution of the Internet of Things, clinical data is growing exponentially and making use of smart technologies. The generated big biomedical data is confidential, as it contains patients' personal information and findings. Usually, big biomedical data is stored over the cloud, making it convenient to access and share. In this view, data shared for research purposes helps to reveal useful and unexposed aspects. Unfortunately, sharing such sensitive data also leads to certain privacy threats. Generally, clinical data is available in textual format (e.g., perception reports). In the domain of natural language processing, many research studies have been published to mitigate privacy breaches in textual clinical data. However, there are still limitations and shortcomings in the current studies that need to be addressed. In this article, a novel framework for textual medical data privacy, called Deep-Confidentiality, is proposed. The proposed framework improves Medical Entity Recognition (MER) using deep neural networks and sanitization compared to the current state-of-the-art techniques. Moreover, a new and generic utility metric is proposed, which overcomes the shortcomings of the existing utility metric. It provides a truer representation of sanitized documents as compared to the original documents. To check the proposed framework's effectiveness, it is evaluated on the i2b2-2010 NLP challenge dataset, which is considered one of the most complex medical datasets for MER. The proposed framework improves MER by 7.8% recall, 7% precision, and 3.8% F1-score compared to the existing deep learning models. It also improves the data utility of sanitized documents by up to 13.79%, where the value of k is 3.
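The recognise-then-sanitise idea behind such frameworks can be shown with a toy example: detected entities are replaced with generalised placeholders before the text is shared. The regex "recogniser" and the generalisation table below are stand-ins for the deep MER model and are purely illustrative.

```python
# Toy sketch of entity sanitisation: recognised spans become placeholders.
import re

generalisations = {
    r"\b\d{2,3}/\d{2,3}\b": "<BLOOD-PRESSURE>",   # e.g., 120/80
    r"\bmetformin\b": "<MEDICATION>",              # illustrative entity pattern
}

note = "Patient on metformin, BP 120/80 at admission."
for pattern, placeholder in generalisations.items():
    note = re.sub(pattern, placeholder, note, flags=re.IGNORECASE)
print(note)   # Patient on <MEDICATION>, BP <BLOOD-PRESSURE> at admission.
```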
49

Syed, Muzamil Hussain, and Sun-Tae Chung. "MenuNER: Domain-Adapted BERT Based NER Approach for a Domain with Limited Dataset and Its Application to Food Menu Domain." Applied Sciences 11, no. 13 (June 28, 2021): 6007. http://dx.doi.org/10.3390/app11136007.

Abstract:
Entity-based information extraction is one of the main applications of Natural Language Processing (NLP). Recently, deep transfer learning utilizing contextualized word embeddings from pre-trained language models has shown remarkable results for many NLP tasks, including named-entity recognition (NER). BERT (Bidirectional Encoder Representations from Transformers) is gaining prominent attention among various contextualized word embedding models as a state-of-the-art pre-trained language model. It is quite expensive to train a BERT model from scratch for a new application domain, since doing so needs a huge dataset and enormous computing time. In this paper, we focus on menu entity extraction from online user reviews of restaurants and propose a simple but effective approach to the NER task for a new domain where a large dataset is rarely available or difficult to prepare, such as the food menu domain, based on a domain adaptation technique for word embeddings and fine-tuning of the popular NER network model 'Bi-LSTM+CRF' with extended feature vectors. The proposed NER approach (named 'MenuNER') consists of a two-step process: (1) domain adaptation for the target domain: further pre-training of the off-the-shelf BERT language model (BERT-base) in a semi-supervised fashion on a domain-specific dataset; and (2) supervised fine-tuning of the popular Bi-LSTM+CRF network for the downstream task with extended feature vectors obtained by concatenating word embeddings from the domain-adapted pre-trained BERT model of the first step, character embeddings, and POS tag feature information. Experimental results on a handcrafted food menu corpus built from a customer review dataset show that our proposed approach to the domain-specific NER task, that is, food menu named-entity recognition, performs significantly better than one based on the baseline off-the-shelf BERT-base model. The proposed approach achieves a 92.5% F1 score on the YELP dataset for the MenuNER task.
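Step (1), domain-adaptive further pre-training, amounts to continuing masked language modelling on in-domain text; a condensed sketch with Hugging Face Transformers follows. The two-review corpus, checkpoint name, and hyperparameters are placeholders, not the authors' settings.

```python
# Sketch: domain-adaptive further pre-training of BERT-base with MLM.
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

reviews = ["The truffle pasta was great.", "Loved the miso ramen."]  # placeholder corpus
tok = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")

ds = Dataset.from_dict({"text": reviews}).map(
    lambda batch: tok(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tok, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="menu-bert", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=ds,
    data_collator=collator,
)
trainer.train()  # the adapted encoder then feeds the Bi-LSTM+CRF tagger
```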
50

Suravarapu, Vasu Krishna, and Hemprasad Yashwant Patil. "Person Identification and Gender Classification Based on Vision Transformers for Periocular Images." Applied Sciences 13, no. 5 (February 28, 2023): 3116. http://dx.doi.org/10.3390/app13053116.

Abstract:
Many biometric advancements have been widely used for security applications. This field's evolution began with fingerprints and continued with periocular imaging, which has gained popularity due to the pandemic scenario. CNNs (convolutional neural networks) have revolutionized the computer vision domain by demonstrating various state-of-the-art results (performance metrics) with the help of deep-learning-based architectures. The latest transformation has happened with the invention of transformers, which are used in NLP (natural language processing) and are presently being adapted for computer vision. In this work, we implemented five different ViT- (vision transformer) based architectures for person identification and gender classification. The experiment was performed on the ViT architectures and their modified counterparts. In general, the samples selected for train:val:test splits are random, and the trained model may be affected by overfitting. To overcome this, we performed a 5-fold cross-validation-based analysis. The experiments' performance metrics indicate that the proposed method achieved better results for gender classification as well as person identification. We also experimented with train-val-test partitions for benchmarking against existing architectures and observed significant improvements. We utilized the publicly available UBIPr dataset for this experimentation.
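The 5-fold protocol around a ViT backbone can be sketched as below; torchvision's ViT-B/16 stands in for the five ViT variants compared in the paper, and the random tensors, label scheme, and single training step are placeholder assumptions.

```python
# Schematic 5-fold cross-validation around a vision transformer (sketch only).
import torch
from sklearn.model_selection import StratifiedKFold
from torch import nn
from torchvision.models import vit_b_16

images = torch.rand(30, 3, 224, 224)                 # placeholder periocular crops
targets = torch.tensor([i % 2 for i in range(30)])   # e.g., toy gender labels

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr, va) in enumerate(skf.split(images, targets)):
    model = vit_b_16(weights="IMAGENET1K_V1")        # ImageNet-pretrained backbone
    model.heads = nn.Linear(model.hidden_dim, 2)     # replace the classification head
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    model.train()
    loss = nn.functional.cross_entropy(model(images[tr]), targets[tr])
    opt.zero_grad(); loss.backward(); opt.step()     # one step; real runs train longer
    model.eval()
    with torch.no_grad():
        acc = (model(images[va]).argmax(1) == targets[va]).float().mean()
    print(f"fold {fold}: validation accuracy {acc:.2f}")
```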
