Journal articles on the topic 'Word embeddings'

Consult the top 50 journal articles for your research on the topic 'Word embeddings.'

You can also download the full text of each academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.

1

Ahn, Yoonjoo, Eugene Rhee, and Jihoon Lee. "Dual embedding with input embedding and output embedding for better word representation." Indonesian Journal of Electrical Engineering and Computer Science 27, no. 2 (2022): 1091–99. https://doi.org/10.11591/ijeecs.v27.i2.pp1091-1099.

Abstract:
Recent studies in distributed vector representations for words have a variety of ways to represent words. We propose various ways of using input embedding and output embedding to better represent words than a single model. We compared the performance in terms of word analogy and word similarity with each of the input and output embeddings and with various dual embeddings which are combinations of those two embeddings. Performance evaluation results show that the proposed dual embeddings outperform each single embedding, especially with the way of simply adding input and output embeddings. We figured out two
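The combination step this abstract highlights can be sketched in a few lines. The vocabulary, dimensionality, and random matrices below are illustrative stand-ins, not the authors' trained models:

```python
import numpy as np

# Illustrative stand-ins for the two matrices a word2vec-style model learns:
# an input (target-word) embedding and an output (context-word) embedding.
rng = np.random.default_rng(0)
vocab = ["king", "queen", "man", "woman"]
dim = 8
input_emb = rng.normal(size=(len(vocab), dim))
output_emb = rng.normal(size=(len(vocab), dim))

# The dual embedding described above: simply add the two matrices,
# giving each word a single combined representation.
dual_emb = input_emb + output_emb
```

In a real setting the two matrices would come from a trained model (e.g. gensim exposes both for word2vec); the addition itself is the whole trick.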
2

Ahn, Yoonjoo, Eugene Rhee, and Jihoon Lee. "Dual embedding with input embedding and output embedding for better word representation." Indonesian Journal of Electrical Engineering and Computer Science 27, no. 2 (2022): 1091. http://dx.doi.org/10.11591/ijeecs.v27.i2.pp1091-1099.

Abstract:
Recent studies in distributed vector representations for words have a variety of ways to represent words. We propose various ways of using input embedding and output embedding to better represent words than a single model. We compared the performance in terms of word analogy and word similarity with each of the input and output embeddings and with various dual embeddings which are combinations of those two embeddings. Performance evaluation results show that the proposed dual embeddings outperform each single embedding, especially with the way of simply adding input and output embed
3

Srinidhi, K., T. L.S Tejaswi, CH Rama Rupesh Kumar, and I. Sai Siva Charan. "An Advanced Sentiment Embeddings with Applications to Sentiment Based Result Analysis." International Journal of Engineering & Technology 7, no. 2.32 (2018): 393. http://dx.doi.org/10.14419/ijet.v7i2.32.15721.

Abstract:
We propose an advanced, well-trained, sentiment-analysis-based adaptive analysis of word-specific embeddings, dubbed sentiment embeddings. Available word- and phrase-embedding learning and training algorithms mainly make use of the contexts of terms but ignore the sentiment of texts when analyzing the process of word and text classification. Sentiment analysis of unlike words conveying the same meaning is matched to the corresponding word vector. This problem is bridged by combining the encoding of opinion-carrying text with sentiment embedding words. But performing sentiment analysis on e-commerce, social n
4

Zhu, Lixing, Yulan He, and Deyu Zhou. "A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings." Transactions of the Association for Computational Linguistics 8 (August 2020): 471–85. http://dx.doi.org/10.1162/tacl_a_00326.

Abstract:
We propose a novel generative model to explore both local and global context for joint learning topics and topic-specific word embeddings. In particular, we assume that global latent topics are shared across documents, a word is generated by a hidden semantic vector encoding its contextual semantic meaning, and its context words are generated conditional on both the hidden semantic vector and global latent topics. Topics are trained jointly with the word embeddings. The trained model maps words to topic-dependent embeddings, which naturally addresses the issue of word polysemy. Experimental re
5

Yadav, Aditya Kumar. "Refined Global Word Embeddings Based on Sentiment Concept for Sentiment Analysis." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem49245.

Abstract:
Sentiment analysis is a significant area of study in natural language processing that finds extensive use in journalism, politics, and other domains. In sentiment analysis, word embeddings are important. The sentiment lexicons are directly incorporated into conventional word representations by current sentiment embedding techniques. This sentiment representation technique is unable to offer precise sentiment information for words in many situations, since it can only distinguish the sentiment information of distinct words, not the same word in several settings. To address the i
6

Jang, Youngjin, and Harksoo Kim. "Reliable Classification of FAQs with Spelling Errors Using an Encoder-Decoder Neural Network in Korean." Applied Sciences 9, no. 22 (2019): 4758. http://dx.doi.org/10.3390/app9224758.

Abstract:
To resolve lexical disagreement problems between queries and frequently asked questions (FAQs), we propose a reliable sentence classification model based on an encoder-decoder neural network. The proposed model uses three types of word embeddings: fixed word embeddings for representing domain-independent meanings of words, fine-tuned word embeddings for representing domain-specific meanings of words, and character-level word embeddings for bridging lexical gaps caused by spelling errors. It also uses class embeddings to represent domain knowledge associated with each category. In the experime
7

Chang, Haw-Shiuan, Amol Agrawal, and Andrew McCallum. "Extending Multi-Sense Word Embedding to Phrases and Sentences for Unsupervised Semantic Applications." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (2021): 6956–65. http://dx.doi.org/10.1609/aaai.v35i8.16857.

Abstract:
Most unsupervised NLP models represent each word with a single point or single region in semantic space, while the existing multi-sense word embeddings cannot represent longer word sequences like phrases or sentences. We propose a novel embedding method for a text sequence (a phrase or a sentence) where each sequence is represented by a distinct set of multi-mode codebook embeddings to capture different semantic facets of its meaning. The codebook embeddings can be viewed as the cluster centers which summarize the distribution of possibly co-occurring words in a pre-trained word embedding spac
8

Ramos-Vargas, Rigo E., Israel Román-Godínez, and Sulema Torres-Ramos. "Comparing general and specialized word embeddings for biomedical named entity recognition." PeerJ Computer Science 7 (February 18, 2021): e384. http://dx.doi.org/10.7717/peerj-cs.384.

Abstract:
Increased interest in the use of word embeddings, such as word representation, for biomedical named entity recognition (BioNER) has highlighted the need for evaluations that aid in selecting the best word embedding to be used. One common criterion for selecting a word embedding is the type of source from which it is generated; that is, general (e.g., Wikipedia, Common Crawl), or specific (e.g., biomedical literature). Using specific word embeddings for the BioNER task has been strongly recommended, considering that they have provided better coverage and semantic relationships among medical ent
9

Schick, Timo, and Hinrich Schütze. "Learning Semantic Representations for Novel Words: Leveraging Both Form and Context." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6965–73. http://dx.doi.org/10.1609/aaai.v33i01.33016965.

Abstract:
Word embeddings are a key component of high-performing natural language processing (NLP) systems, but it remains a challenge to learn good representations for novel words on the fly, i.e., for words that did not occur in the training data. The general problem setting is that word embeddings are induced on an unlabeled training corpus and then a model is trained that embeds novel words into this induced embedding space. Currently, two approaches for learning embeddings of novel words exist: (i) learning an embedding from the novel word’s surface-form (e.g., subword n-grams) and (ii) learning an
10

Shen, Feiyu, Chenpeng Du, and Kai Yu. "Acoustic Word Embeddings for End-to-End Speech Synthesis." Applied Sciences 11, no. 19 (2021): 9010. http://dx.doi.org/10.3390/app11199010.

Abstract:
The most recent end-to-end speech synthesis systems use phonemes as acoustic input tokens and ignore the information about which word the phonemes come from. However, many words have their specific prosody type, which may significantly affect the naturalness. Prior works have employed pre-trained linguistic word embeddings as TTS system input. However, since linguistic information is not directly relevant to how words are pronounced, TTS quality improvement of these systems is mild. In this paper, we propose a novel and effective way of jointly training acoustic phone and word embeddings for e
11

Toshevska, Martina, Frosina Stojanovska, and Jovan Kalajdjieski. "The Ability of Word Embeddings to Capture Word Similarities." International Journal on Natural Language Computing (IJNLC) 9, no. 3 (June 2020): 18. https://doi.org/10.5281/zenodo.7827290.

Abstract:
Distributed language representation has become the most widely used technique for language representation in various natural language processing tasks. Most of the natural language processing models that are based on deep learning techniques use already pre-trained distributed word representations, commonly called word embeddings. Determining the most qualitative word embeddings is of crucial importance for such models. However, selecting the appropriate word embeddings is a perplexing task since the projected embedding space is not intuitive to humans. In this paper, we explore different appr
12

Song, Yuting, Biligsaikhan Batjargal, and Akira Maeda. "Learning Japanese-English Bilingual Word Embeddings by Using Language Specificity." International Journal of Asian Language Processing 30, no. 03 (2020): 2050014. http://dx.doi.org/10.1142/s2717554520500149.

Abstract:
Cross-lingual word embeddings have been gaining attention because they can capture the semantic meaning of words across languages, which can be applied to cross-lingual tasks. Most methods learn a single mapping (e.g., a linear mapping) to transform a word embedding space from one language to another. To improve bilingual word embeddings, we propose an advanced method that adds a language-specific mapping. We focus on learning Japanese-English bilingual word embedding mapping by considering the specificity of the Japanese language. We evaluated our method by comparing it with single mapping-ba
13

Zhang, Yuhan, Wenqi Chen, Ruihan Zhang, and Xiajie Zhang. "Representing affect information in word embeddings." Experiments in Linguistic Meaning 2 (January 27, 2023): 310. http://dx.doi.org/10.3765/elm.2.5391.

Abstract:
A growing body of research in natural language processing (NLP) and natural language understanding (NLU) is investigating human-like knowledge learned or encoded in the word embeddings from large language models. This is a step towards understanding what knowledge language models capture that resembles human understanding of language and communication. Here, we investigated whether and how the affect meaning of a word (i.e., valence, arousal, dominance) is encoded in word embeddings pre-trained in large neural networks. We used the human-labeled dataset (Mohammad 2018) as the ground truth and
14

Liao, Xianwen, Yongzhong Huang, Changfu Wei, Chenhao Zhang, Yongqing Deng, and Ke Yi. "Efficient Estimate of Low-Frequency Words’ Embeddings Based on the Dictionary: A Case Study on Chinese." Applied Sciences 11, no. 22 (2021): 11018. http://dx.doi.org/10.3390/app112211018.

Abstract:
Obtaining high-quality embeddings of out-of-vocabularies (OOVs) and low-frequency words is a challenge in natural language processing (NLP). To efficiently estimate the embeddings of OOVs and low-frequency words, we propose a new method that uses the dictionary to estimate the embeddings of OOVs and low-frequency words. More specifically, the explanatory note of an entry in dictionaries accurately describes the semantics of the corresponding word. Naturally, we adopt the sentence representation model to extract the semantics of the explanatory note and regard the semantics as the embedding of
15

Li, Qizhi, Xianyong Li, Yajun Du, Yongquan Fan, and Xiaoliang Chen. "A New Sentiment-Enhanced Word Embedding Method for Sentiment Analysis." Applied Sciences 12, no. 20 (2022): 10236. http://dx.doi.org/10.3390/app122010236.

Abstract:
Since some sentiment words have similar syntactic and semantic features in the corpus, existing pre-trained word embeddings always perform poorly in sentiment analysis tasks. This paper proposes a new sentiment-enhanced word embedding (S-EWE) method to improve the effectiveness of sentence-level sentiment classification. This sentiment enhancement method takes full advantage of the mapping relationship between word embeddings and their corresponding sentiment orientations. This method first converts words to word embeddings and assigns sentiment mapping vectors to all word embeddings. Then, wo
16

Mao, Xingliang, Shuai Chang, Jinjing Shi, Fangfang Li, and Ronghua Shi. "Sentiment-Aware Word Embedding for Emotion Classification." Applied Sciences 9, no. 7 (2019): 1334. http://dx.doi.org/10.3390/app9071334.

Abstract:
Word embeddings are effective intermediate representations for capturing semantic regularities between words in natural language processing (NLP) tasks. We propose sentiment-aware word embedding for emotion classification, which consists of integrating sentiment evidence within the emotional embedding component of a term vector. We take advantage of multiple types of emotional knowledge, such as existing emotional lexicons, to build emotional word vectors that represent emotional information. Then the emotional word vector is combined with the traditional word embedding to construct the
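One simple way to combine an emotional word vector with a traditional embedding, in the spirit of what this abstract describes, is concatenation. The semantic vectors and lexicon polarity scores below are made up for illustration:

```python
import numpy as np

# Made-up semantic vectors: "good" and "bad" sit close together because they
# share syntactic contexts, the problem sentiment-aware embeddings target.
semantic = {"good": np.array([0.50, 0.10]), "bad": np.array([0.45, 0.12])}
# Made-up lexicon polarity scores standing in for emotional knowledge.
lexicon = {"good": 1.0, "bad": -1.0}

# Concatenate the emotional signal onto the traditional embedding.
combined = {w: np.concatenate([v, [lexicon[w]]]) for w, v in semantic.items()}

# The added sentiment dimension pushes opposite-polarity words apart.
before = np.linalg.norm(semantic["good"] - semantic["bad"])
after = np.linalg.norm(combined["good"] - combined["bad"])
```

Whether to concatenate, add, or jointly train the two signals is a design choice; the papers in this list explore several variants.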
17

Yang, Zekun, and Juan Feng. "A Causal Inference Method for Reducing Gender Bias in Word Embedding Relations." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (2020): 9434–41. http://dx.doi.org/10.1609/aaai.v34i05.6486.

Abstract:
Word embedding has become essential for natural language processing as it boosts empirical performances of various tasks. However, recent research discovers that gender bias is incorporated in neural word embeddings, and downstream tasks that rely on these biased word vectors also produce gender-biased results. While some word-embedding gender-debiasing methods have been developed, these methods mainly focus on reducing gender bias associated with gender direction and fail to reduce the gender bias presented in word embedding relations. In this paper, we design a causal and simple approach for
18

Alsuhaibani, Mohammed, and Danushka Bollegala. "Fine-Tuning Word Embeddings for Hierarchical Representation of Data Using a Corpus and a Knowledge Base for Various Machine Learning Applications." Computational and Mathematical Methods in Medicine 2021 (November 16, 2021): 1–12. http://dx.doi.org/10.1155/2021/9761163.

Abstract:
Word embedding models have recently shown some capability to encode hierarchical information that exists in textual data. However, such models do not explicitly encode the hierarchical structure that exists among words. In this work, we propose a method to learn hierarchical word embeddings (HWEs) in a specific order to encode the hierarchical information of a knowledge base (KB) in a vector space. To learn the word embeddings, our proposed method considers not only the hypernym relations that exist between words in a KB but also contextual information in a text corpus. The experimental result
19

Hashimoto, Tatsunori B., David Alvarez-Melis, and Tommi S. Jaakkola. "Word Embeddings as Metric Recovery in Semantic Spaces." Transactions of the Association for Computational Linguistics 4 (December 2016): 273–86. http://dx.doi.org/10.1162/tacl_a_00098.

Abstract:
Continuous word representations have been remarkably useful across NLP tasks but remain poorly understood. We ground word embeddings in semantic spaces studied in the cognitive-psychometric literature, taking these spaces as the primary objects to recover. To this end, we relate log co-occurrences of words in large corpora to semantic similarity assessments and show that co-occurrences are indeed consistent with a Euclidean semantic space hypothesis. Framing word embedding as metric recovery of a semantic space unifies existing word embedding algorithms, ties them to manifold learning, and de
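The quantity this abstract relates to semantic similarity, the log co-occurrence of word pairs, is easy to compute on a toy corpus (the sentences here are invented, and a sentence-wide window is used rather than the paper's exact counting scheme):

```python
import numpy as np
from collections import Counter
from itertools import combinations

# Invented toy corpus; real work uses large corpora.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# Count unordered word pairs co-occurring within the same sentence.
pair_counts = Counter()
for sentence in corpus:
    words = sentence.split()
    for a, b in combinations(words, 2):
        pair_counts[frozenset((a, b))] += 1

# Higher log co-occurrence ~ closer in the recovered semantic space.
log_cooc = {pair: np.log(c) for pair, c in pair_counts.items()}
```

The metric-recovery view then asks which embedding of the vocabulary makes Euclidean distances agree with these (suitably transformed) counts.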
20

Yan, Muheng, Yu-Ru Lin, Rebecca Hwa, Ali Mert Ertugrul, Meiqi Guo, and Wen-Ting Chung. "MimicProp: Learning to Incorporate Lexicon Knowledge into Distributed Word Representation for Social Media Analysis." Proceedings of the International AAAI Conference on Web and Social Media 14 (May 26, 2020): 738–49. http://dx.doi.org/10.1609/icwsm.v14i1.7339.

Abstract:
Lexicon-based methods and word embeddings are the two widely used approaches for analyzing texts in social media. The choice of an approach can have a significant impact on the reliability of the text analysis. For example, lexicons provide manually curated, domain-specific attributes about a limited set of words, while word embeddings learn to encode some loose semantic interpretations for a much broader set of words. Text analysis can benefit from a representation that offers both the broad coverage of word embeddings and the domain knowledge of lexicons. This paper presents MimicProp, a new
21

Balogh, Vanda, Gábor Berend, Dimitrios I. Diochnos, and György Turán. "Understanding the Semantic Content of Sparse Word Embeddings Using a Commonsense Knowledge Base." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (2020): 7399–406. http://dx.doi.org/10.1609/aaai.v34i05.6235.

Abstract:
Word embeddings have developed into a major NLP tool with broad applicability. Understanding the semantic content of word embeddings remains an important challenge for additional applications. One aspect of this issue is to explore the interpretability of word embeddings. Sparse word embeddings have been proposed as models with improved interpretability. Continuing this line of research, we investigate the extent to which human interpretable semantic concepts emerge along the bases of sparse word representations. In order to have a broad framework for evaluation, we consider three general appr
22

Sheinidashtegol, Pezhman, and Aibek Musaev. "Learning Cross-Lingual Word Embeddings with Universal Concepts." International Journal on Web Service Computing (IJWSC) 10, no. 1/2/3 (2019): 13–20. https://doi.org/10.5281/zenodo.3889327.

Abstract:
Recent advances in generating monolingual word embeddings based on word co-occurrence for universal languages inspired new efforts to extend the model to support diversified languages. State-of-the-art methods for learning cross-lingual word embeddings rely on the alignment of monolingual word embedding spaces. Our goal is to implement a word co-occurrence across languages with the universal concepts’ method. Such concepts are notions that are fundamental to humankind and are thus persistent across languages, e.g., a man or woman, war or peace, etc. Given bilingual lexicons, we built uni
23

Sheinidashtegol, Pezhman, and Aibek Musaev. "Learning Cross-Lingual Word Embeddings with Universal Concepts." International Journal on Web Service Computing (IJWSC) 10, no. 1/2/3 (2019): 13–20. https://doi.org/10.5281/zenodo.3477888.

Abstract:
Recent advances in generating monolingual word embeddings based on word co-occurrence for universal languages inspired new efforts to extend the model to support diversified languages. State-of-the-art methods for learning cross-lingual word embeddings rely on the alignment of monolingual word embedding spaces. Our goal is to implement a word co-occurrence across languages with the universal concepts’ method. Such concepts are notions that are fundamental to humankind and are thus persistent across languages, e.g., a man or woman, war or peace, etc. Given bilingual lexicons, we built uni
24

Karsi, Redouane, Mounia Zaim, and Jamila El Alami. "Leveraging Pre-Trained Contextualized Word Embeddings to Enhance Sentiment Classification of Drug Reviews." Revue d'Intelligence Artificielle 35, no. 4 (2021): 307–14. http://dx.doi.org/10.18280/ria.350405.

Abstract:
Traditionally, pharmacovigilance data are collected during clinical trials on a small sample of patients and are therefore insufficient to adequately assess drugs. Nowadays, consumers use online drug forums to share their opinions and experiences about medication. These feedbacks, which are widely available on the web, are automatically analyzed to extract relevant information for decision-making. Currently, sentiment analysis methods are being put forward to leverage consumers' opinions and produce useful drug monitoring indicators. However, these methods' effectiveness depends on the quality
25

David, Merlin Susan, and Shini Renjith. "Comparison of word embeddings in text classification based on RNN and CNN." IOP Conference Series: Materials Science and Engineering 1187, no. 1 (2021): 012029. http://dx.doi.org/10.1088/1757-899x/1187/1/012029.

Abstract:
This paper presents a comparison of word embeddings in text classification using RNN and CNN. Deep learning methods such as RNN and CNN have proven popular in the field of image classification. CNN is the most popular model among deep learning techniques in the field of NLP because of its simplicity and parallelism, even if the dataset is huge. The word embedding techniques employed are GloVe and fastText. The use of different word embeddings showed a major difference in the accuracy of the models. When it comes to embedding rare words, GloVe can sometimes perform poorly. In order to tackl
26

Li, Quanzhi, Sameena Shah, Xiaomo Liu, and Armineh Nourbakhsh. "Data Sets: Word Embeddings Learned from Tweets and General Data." Proceedings of the International AAAI Conference on Web and Social Media 11, no. 1 (2017): 428–36. http://dx.doi.org/10.1609/icwsm.v11i1.14859.

Abstract:
A word embedding is a low-dimensional, dense and real-valued vector representation of a word. Word embeddings have been used in many NLP tasks. They are usually generated from a large text corpus. The embedding of a word captures both its syntactic and semantic aspects. Tweets are short, noisy and have unique lexical and semantic features that are different from other types of text. Therefore, it is necessary to have word embeddings learned specifically from tweets. In this paper, we present ten word embedding data sets. In addition to the data sets learned from just tweet data, we also built
27

Gao, Yan, Yandong Wang, Patrick Wang, and Lei Gu. "Medical Named Entity Extraction from Chinese Resident Admit Notes Using Character and Word Attention-Enhanced Neural Network." International Journal of Environmental Research and Public Health 17, no. 5 (2020): 1614. http://dx.doi.org/10.3390/ijerph17051614.

Abstract:
The resident admit notes (RANs) in electronic medical records (EMRs) are first-hand information for studying the patient's condition. Medical entity extraction from RANs is an important task for obtaining disease information for medical decision-making. For Chinese electronic medical records, each medical entity contains not only word information but also rich character information. Effective combination of words and characters is very important for medical entity extraction. We propose a medical entity recognition model based on a character and word attention-enhanced (CWAE) neural network for Chinese RANs
28

Bhopale, A. P., and Ashish Tiwari. "Leveraging Neural Network Phrase Embedding Model for Query Reformulation in Ad-Hoc Biomedical Information Retrieval." Malaysian Journal of Computer Science 34, no. 2 (2021): 151–70. http://dx.doi.org/10.22452/mjcs.vol34no2.2.

Abstract:
This study presents a spark enhanced neural network phrase embedding model to leverage query representation for relevant biomedical literature retrieval. Information retrieval for clinical decision support demands high precision. In recent years, word embeddings have been evolved as a solution to such requirements. It represents vocabulary words in low-dimensional vectors in the context of their similar words; however, it is inadequate to deal with semantic phrases or multi-word units. Learning vector embeddings for phrases by maintaining word meanings is a challenging task. This study propose
29

Mao, Yuqing, and Kin Wah Fung. "Use of word and graph embedding to measure semantic relatedness between Unified Medical Language System concepts." Journal of the American Medical Informatics Association 27, no. 10 (2020): 1538–46. http://dx.doi.org/10.1093/jamia/ocaa136.

Abstract:
Objective: The study sought to explore the use of deep learning techniques to measure the semantic relatedness between Unified Medical Language System (UMLS) concepts. Materials and Methods: Concept sentence embeddings were generated for UMLS concepts by applying the word embedding models BioWordVec and various flavors of BERT to concept sentences formed by concatenating UMLS terms. Graph embeddings were generated by graph convolutional networks and 4 knowledge graph embedding models, using graphs built from UMLS hierarchical relations. Semantic relatedness was measured by the cosin
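The relatedness measure named at the cut-off, cosine similarity, is straightforward to compute. The three-dimensional concept vectors below are invented stand-ins, not real UMLS embeddings:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors (1 = same direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Invented vectors standing in for concept embeddings: two related concepts
# point in nearly the same direction; an unrelated one does not.
heart_attack = np.array([0.9, 0.1, 0.30])
myocardial_infarction = np.array([0.8, 0.2, 0.35])
fracture = np.array([0.1, 0.9, 0.05])

related = cosine_similarity(heart_attack, myocardial_infarction)
unrelated = cosine_similarity(heart_attack, fracture)
```

The study's contribution lies in how the vectors are produced (word vs. graph embeddings); the final comparison step is exactly this cosine.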
30

Romanyuk, Andriy. "Vector Representations of Ukrainian Words." Ukraina Moderna 27, no. 27 (2019): 46–72. http://dx.doi.org/10.30970/uam.2019.27.1062.

Abstract:
In this paper, Ukrainian word embeddings and their properties are examined. Provided are a theoretical description, a brief account of the most common technologies used to produce an embedding, and lists of implemented algorithms. Word2vec, the first technology for calculating word embeddings, is used to demonstrate modern approaches to calculation using neural networks. Word2vec and FastText, which evolved from word2vec, are compared, and FastText's benefits are described. Word embeddings have been applied to solving the majority of the practical tasks of natural language processing. One of the
31

Padarian, José, and Ignacio Fuentes. "Word embeddings for application in geosciences: development, evaluation, and examples of soil-related concepts." SOIL 5, no. 2 (2019): 177–87. http://dx.doi.org/10.5194/soil-5-177-2019.

Abstract:
A large amount of descriptive information is available in geosciences. This information is usually considered subjective and ill-favoured compared with its numerical counterpart. Considering the advances in natural language processing and machine learning, it is possible to utilise descriptive information and encode it as dense vectors. These word embeddings, which encode information about a word and its linguistic relationships with other words, lay on a multidimensional space where angles and distances have a linguistic interpretation. We used 280 764 full-text scientific articles
32

Meyer, Francois, Brink van der Merwe, and Dirko Coetsee. "Learning Concept Embeddings from Temporal Data." JUCS - Journal of Universal Computer Science 24, no. 10 (2018): 1378–402. https://doi.org/10.3217/jucs-024-10-1378.

Abstract:
Word embedding techniques can be used to learn vector representations of concepts from temporal datasets. Previous attempts to do this amounted to applying word embedding techniques to event sequences. We propose a concept embedding model that extends existing word embedding techniques to take time into account by explicitly modelling the time between concept occurrences. The model is implemented and evaluated using medical temporal data. It is found that incorporating time into the learning algorithm can improve the quality of the resulting embeddings, as measured by an existing methodological
33

Alkaabi, Hussein, Ali Kadhim Jasim, and Ali Darroudi. "From Static to Contextual: A Survey of Embedding Advances in NLP." PERFECT: Journal of Smart Algorithms 2, no. 2 (2025): 57–66. https://doi.org/10.62671/perfect.v2i2.77.

Abstract:
Embedding techniques have been a cornerstone of Natural Language Processing (NLP), enabling machines to represent textual data in a form that captures semantic and syntactic relationships. Over the years, the field has witnessed a significant evolution—from static word embeddings, such as Word2Vec and GloVe, which represent words as fixed vectors, to dynamic, contextualized embeddings like BERT and GPT, which generate word representations based on their surrounding context. This survey provides a comprehensive overview of embedding techniques, tracing their development from early methods to st
34

Garg, Nikhil, Londa Schiebinger, Dan Jurafsky, and James Zou. "Word embeddings quantify 100 years of gender and ethnic stereotypes." Proceedings of the National Academy of Sciences 115, no. 16 (2018): E3635–E3644. http://dx.doi.org/10.1073/pnas.1720347115.

Abstract:
Word embeddings are a powerful machine-learning framework that represents each English word by a vector. The geometric relationship between these vectors captures meaningful semantic relationships between the corresponding words. In this paper, we develop a framework to demonstrate how the temporal dynamics of the embedding helps to quantify changes in stereotypes and attitudes toward women and ethnic minorities in the 20th and 21st centuries in the United States. We integrate word embeddings trained on 100 y of text data with the US Census to show that changes in the embedding track closely w
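The geometric measurement this line of work relies on can be illustrated by projecting word vectors onto a gender direction. Every vector below is a fabricated toy value, not an embedding trained on historical text:

```python
import numpy as np

# Fabricated toy embeddings: dimension 1 is, by construction, gendered.
she = np.array([0.0, 1.0, 0.0])
he = np.array([0.0, -1.0, 0.0])
gender_direction = she - he

nurse = np.array([0.2, 0.6, 0.1])
engineer = np.array([0.3, -0.5, 0.2])

def gender_score(word_vec):
    # Signed projection onto the gender direction:
    # positive -> closer to "she", negative -> closer to "he".
    return float(np.dot(word_vec, gender_direction) / np.linalg.norm(gender_direction))
```

Tracking how such scores for occupation words drift across decade-specific embeddings is the kind of temporal analysis the abstract describes.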
APA, Harvard, Vancouver, ISO, and other styles
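A bias measure in the spirit of this work can be sketched as follows (a hedged illustration: Garg et al. use a relative norm distance between neutral words, such as occupations, and the average vectors of two group word lists; the 2-d vectors below are made-up values, not trained embeddings):

```python
import math

def l2(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mean_vec(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def relative_norm_distance(neutral, group1, group2):
    """Average over neutral words of ||w - c1|| - ||w - c2||, where c1 and c2
    are the mean vectors of the two group word lists. A more negative score
    means the neutral words sit closer to group 1 in the embedding space."""
    c1, c2 = mean_vec(group1), mean_vec(group2)
    return sum(l2(w, c1) - l2(w, c2) for w in neutral) / len(neutral)

# Illustrative vectors (assumed values):
women = [[1.0, 0.0]]
men = [[0.0, 1.0]]
occupations = [[0.9, 0.1], [0.8, 0.2]]
print(relative_norm_distance(occupations, women, men))  # negative: closer to `women`
```

Tracking this score across embeddings trained on different decades of text is what lets such studies quantify how associations shift over time.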
35

Bandyopadhyay, Saptarashmi, Jason Xu, Neel Pawar, and David Touretzky. "Interactive Visualizations of Word Embeddings for K-12 Students." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 11 (2022): 12713–20. http://dx.doi.org/10.1609/aaai.v36i11.21548.

Full text
Abstract:
Word embeddings, which represent words as dense feature vectors, are widely used in natural language processing. In their seminal paper on word2vec, Mikolov and colleagues showed that a feature space created by training a word prediction network on a large text corpus will encode semantic information that supports analogy by vector arithmetic, e.g., "king" minus "man" plus "woman" equals "queen". To help novices appreciate this idea, people have sought effective graphical representations of word embeddings. We describe a new interactive tool for visually exploring word embeddings. Our tool all
APA, Harvard, Vancouver, ISO, and other styles
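The analogy-by-vector-arithmetic idea this abstract cites ("king" minus "man" plus "woman" equals "queen") can be sketched in pure Python; the 3-d vectors below are illustrative values chosen to make the analogy work, not actual word2vec output:

```python
import math

# Toy embeddings with made-up values (real vectors are learned from a corpus).
emb = {
    "king":  [0.8, 0.9, 0.1],
    "queen": [0.8, 0.1, 0.9],
    "man":   [0.2, 0.9, 0.1],
    "woman": [0.2, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c),
    excluding the three query words themselves, as analogy tooling usually does."""
    target = [x - y + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(analogy("king", "man", "woman"))  # queen
```

Visualization tools like the one described project these high-dimensional relationships down to two or three dimensions so learners can see the parallel offsets.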
36

JP, Sanjanasri, Vijay Krishna Menon, Soman KP, Rajendran S, and Agnieszka Wolk. "Generation of Cross-Lingual Word Vectors for Low-Resourced Languages Using Deep Learning and Topological Metrics in a Data-Efficient Way." Electronics 10, no. 12 (2021): 1372. http://dx.doi.org/10.3390/electronics10121372.

Full text
Abstract:
Linguists have been focused on a qualitative comparison of the semantics from different languages. Evaluation of the semantic interpretation among disparate language pairs like English and Tamil is an even more formidable task than for Slavic languages. The concept of word embedding in Natural Language Processing (NLP) has enabled a felicitous opportunity to quantify linguistic semantics. Multi-lingual tasks can be performed by projecting the word embeddings of one language onto the semantic space of the other. This research presents a suite of data-efficient deep learning approaches to deduce
APA, Harvard, Vancouver, ISO, and other styles
37

Najafabadi, Maryam Khanian, Thoon Zar Chi Ko, Saman Shojae Chaeikar, and Nasrin Shabani. "A Multi-Level Embedding Framework for Decoding Sarcasm Using Context, Emotion, and Sentiment Feature." Electronics 13, no. 22 (2024): 4429. http://dx.doi.org/10.3390/electronics13224429.

Full text
Abstract:
Sarcasm detection in text poses significant challenges for traditional sentiment analysis, as it often requires an understanding of context, word meanings, and emotional undertones. For example, in the sentence “I totally love working on Christmas holiday”, detecting sarcasm depends on capturing the contrast between affective words and their context. Existing methods often focus on single-embedding levels, such as word-level or affective-level, neglecting the importance of multi-level context. In this paper, we propose SAWE (Sentence, Affect, and Word Embeddings), a framework that combines sen
APA, Harvard, Vancouver, ISO, and other styles
38

Khushhal, Saquib, Abdul Majid, Syed Ali Abass, Rabia Riaz, Mohammad Babar, and Shafiq Ahmad. "Cword2vec: a novel morphological rule-based word embedding approach for Urdu text sentiment analysis." PeerJ Computer Science 11 (July 15, 2025): e2937. https://doi.org/10.7717/peerj-cs.2937.

Full text
Abstract:
Word embeddings are essential to natural language processing tasks because they contain a single word’s syntactic and semantic information. Word embeddings have been developed widely for numerous spoken languages across the globe like English. The research community needs to pay more attention to the Urdu language despite its significant number of speakers, which amounts to approximately 231.3 million individuals. Urdu is a complex language because word boundaries in Urdu are unspecified, as it does not employ delimiters between words. The compound word, a multiword expression, is a more compl
APA, Harvard, Vancouver, ISO, and other styles
39

Ravindran, Renjith P., and Kavi Narayana Murthy. "Syntactic Coherence in Word Embedding Spaces." International Journal of Semantic Computing 15, no. 02 (2021): 263–90. http://dx.doi.org/10.1142/s1793351x21500057.

Full text
Abstract:
Word embeddings have recently become a vital part of many Natural Language Processing (NLP) systems. Word embeddings are a suite of techniques that represent words in a language as vectors in an n-dimensional real space that has been shown to encode a significant amount of syntactic and semantic information. When used in NLP systems, these representations have resulted in improved performance across a wide range of NLP tasks. However, it is not clear how syntactic properties interact with the more widely studied semantic properties of words. Or what the main factors in the modeling formulation
APA, Harvard, Vancouver, ISO, and other styles
40

Parikh, Soham, Anahita Davoudi, Shun Yu, Carolina Giraldo, Emily Schriver, and Danielle Mowery. "Lexicon Development for COVID-19-related Concepts Using Open-source Word Embedding Sources: An Intrinsic and Extrinsic Evaluation." JMIR Medical Informatics 9, no. 2 (2021): e21679. http://dx.doi.org/10.2196/21679.

Full text
Abstract:
Background Scientists are developing new computational methods and prediction models to better clinically understand COVID-19 prevalence, treatment efficacy, and patient outcomes. These efforts could be improved by leveraging documented COVID-19–related symptoms, findings, and disorders from clinical text sources in an electronic health record. Word embeddings can identify terms related to these clinical concepts from both the biomedical and nonbiomedical domains, and are being shared with the open-source community at large. However, it’s unclear how useful openly available word embeddings are
APA, Harvard, Vancouver, ISO, and other styles
41

Jadon, Anil Kumar, and Suresh Kumar. "Enhancing emotion detection with synergistic combination of word embeddings and convolutional neural networks." Indonesian Journal of Electrical Engineering and Computer Science 35, no. 3 (2024): 1933. http://dx.doi.org/10.11591/ijeecs.v35.i3.pp1933-1941.

Full text
Abstract:
Recognizing emotions in textual data is crucial in a wide range of natural language processing (NLP) applications, from consumer sentiment research to mental health evaluation. The word embedding techniques play a pivotal role in text processing. In this paper, the performance of several well-known word embedding methods is evaluated in the context of emotion recognition. The classification of emotions is further enhanced using a convolutional neural network (CNN) model because of its propensity to capture local patterns and its recent triumphs in text-related tasks. The integration of CNN wit
APA, Harvard, Vancouver, ISO, and other styles
42

Jadon, Anil Kumar, and Suresh Kumar. "Enhancing emotion detection with synergistic combination of word embeddings and convolutional neural networks." Indonesian Journal of Electrical Engineering and Computer Science 35, no. 3 (2024): 1933–41. https://doi.org/10.11591/ijeecs.v35.i3.pp1933-1941.

Full text
Abstract:
Recognizing emotions in textual data is crucial in a wide range of natural language processing (NLP) applications, from consumer sentiment research to mental health evaluation. The word embedding techniques play a pivotal role in text processing. In this paper, the performance of several well-known word embedding methods is evaluated in the context of emotion recognition. The classification of emotions is further enhanced using a convolutional neural network (CNN) model because of its propensity to capture local patterns and its recent triumphs in text-related tasks. The integration of CNN wit
APA, Harvard, Vancouver, ISO, and other styles
43

Doval, Yerai, Jesús Vilares, and Carlos Gómez-Rodríguez. "Towards Robust Word Embeddings for Noisy Texts." Applied Sciences 10, no. 19 (2020): 6893. http://dx.doi.org/10.3390/app10196893.

Full text
Abstract:
Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing from social media. In this work, we propose a simple extension to the skipgram model in which we introduce the concept of bridge-words, which are artificial words added to the model to strengthen the similarity between standard words and their noisy variants. Our new embeddings outperform baseline models on noisy texts on a wide range of evaluation tasks, both intrinsic and extrinsic,
APA, Harvard, Vancouver, ISO, and other styles
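A loose sketch of the bridge-word idea (assumed details; the paper's exact training scheme may differ): each standard word and its known noisy variants are tied to a shared artificial token, so that skipgram training over the augmented pairs pulls their vectors together:

```python
def bridge_pairs(norm_dict):
    """For each standard word and its noisy variants, emit skipgram-style
    training pairs linking every form to one shared artificial bridge token.
    `norm_dict` maps a standard spelling to a list of noisy variants."""
    pairs = []
    for standard, variants in norm_dict.items():
        bridge = "~" + standard  # artificial bridge token (naming is assumed)
        pairs.append((standard, bridge))
        for variant in variants:
            pairs.append((variant, bridge))
    return pairs

print(bridge_pairs({"tomorrow": ["tmrw", "2morrow"]}))
# [('tomorrow', '~tomorrow'), ('tmrw', '~tomorrow'), ('2morrow', '~tomorrow')]
```

Because the bridge token co-occurs with both spellings, the learned vectors for "tomorrow" and "tmrw" end up close even if the variants are rare in the corpus.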
44

Oh, Dongsuk, Jungwoo Lim, and Heuiseok Lim. "Neuro-Symbolic Word Embedding Using Textual and Knowledge Graph Information." Applied Sciences 12, no. 19 (2022): 9424. http://dx.doi.org/10.3390/app12199424.

Full text
Abstract:
The construction of high-quality word embeddings is essential in natural language processing. In existing approaches using a large text corpus, the word embeddings learn only sequential patterns in the context; thus, accurate learning of the syntax and semantic relationships between words is limited. Several methods have been proposed for constructing word embeddings using syntactic information. However, these methods are not trained for the semantic relationships between words in sentences or external knowledge. In this paper, we present a method for improved word embeddings using symbolic gr
APA, Harvard, Vancouver, ISO, and other styles
45

Moudhich, Ihab, and Abdelhadi Fennan. "Evaluating sentiment analysis and word embedding techniques on Brexit." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 695–702. https://doi.org/10.11591/ijai.v13.i1.pp695-702.

Full text
Abstract:
In this study, we investigate the effectiveness of pre-trained word embeddings for sentiment analysis on a real-world topic, namely Brexit. We compare the performance of several popular word embedding models such global vectors for word representation (GloVe), FastText, word to vec (word2vec), and embeddings from language models (ELMo) on a dataset of tweets related to Brexit and evaluate their ability to classify the sentiment of the tweets as positive, negative, or neutral. We find that pre-trained word embeddings provide useful features for sentiment analysis and can significantly improve t
APA, Harvard, Vancouver, ISO, and other styles
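A common way to turn pre-trained word embeddings into tweet-level features for a sentiment classifier is mean pooling; the sketch below uses assumed vector values and a generic convention (skipping out-of-vocabulary tokens), not necessarily the exact setup in this study:

```python
def tweet_features(tokens, emb, dim=3):
    """Mean-pool pre-trained word vectors into one feature vector per tweet.
    Tokens missing from the embedding table are skipped; an all-zero vector
    is returned if nothing is in vocabulary."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# Illustrative pre-trained vectors (assumed values):
emb = {"brexit": [0.5, 0.1, 0.0], "good": [0.0, 1.0, 0.0]}
print(tweet_features(["brexit", "is", "good"], emb))  # ~[0.25, 0.55, 0.0]
```

The pooled vector then feeds a standard classifier (logistic regression, SVM, or a neural layer) that predicts positive, negative, or neutral.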
46

Lassner, David, Stephanie Brandl, Anne Baillot, and Shinichi Nakajima. "Domain-Specific Word Embeddings with Structure Prediction." Transactions of the Association for Computational Linguistics 11 (March 27, 2023): 320–35. http://dx.doi.org/10.1162/tacl_a_00538.

Full text
Abstract:
Abstract Complementary to finding good general word embeddings, an important question for representation learning is to find dynamic word embeddings, for example, across time or domain. Current methods do not offer a way to use or predict information on structure between sub-corpora, time or domain and dynamic embeddings can only be compared after post-alignment. We propose novel word embedding methods that provide general word representations for the whole corpus, domain- specific representations for each sub-corpus, sub-corpus structure, and embedding alignment simultaneously. We present an
APA, Harvard, Vancouver, ISO, and other styles
47

Tahmasebi, Nina. "A Study on Word2Vec on a Historical Swedish Newspaper Corpus." Digital Humanities in the Nordic and Baltic Countries Publications 1, no. 1 (2018): 25–37. http://dx.doi.org/10.5617/dhnbpub.11007.

Full text
Abstract:
Detecting word sense changes can be of great interest in the field of digital humanities. Thus far, most investigations and automatic methods have been developed and carried out on English text and most recent methods make use of word embeddings. This paper presents a study on using Word2Vec, a neural word embedding method, on a Swedish historical newspaper collection. Our study includes a set of 11 words and our focus is the quality and stability of the word vectors over time. We investigate whether a word embedding method like Word2Vec can be effectively used on texts where the volume and qu
APA, Harvard, Vancouver, ISO, and other styles
48

Takase, Sho, Jun Suzuki, and Masaaki Nagata. "Character n-Gram Embeddings to Improve RNN Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5074–82. http://dx.doi.org/10.1609/aaai.v33i01.33015074.

Full text
Abstract:
This paper proposes a novel Recurrent Neural Network (RNN) language model that takes advantage of character information. We focus on character n-grams based on research in the field of word embedding construction (Wieting et al. 2016). Our proposed method constructs word embeddings from character ngram embeddings and combines them with ordinary word embeddings. We demonstrate that the proposed method achieves the best perplexities on the language modeling datasets: Penn Treebank, WikiText-2, and WikiText-103. Moreover, we conduct experiments on application tasks: machine translation and headli
APA, Harvard, Vancouver, ISO, and other styles
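The composition step described here, building a word representation from character n-gram embeddings combined with the ordinary word embedding, can be sketched as a simple sum (one possible combination; Takase et al. also study richer ones, and all embedding values below are assumed for illustration):

```python
def char_ngrams(word, n=3):
    """Character n-grams of a word with boundary markers, fastText-style."""
    s = "<" + word + ">"
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def compose(word, word_emb, ngram_emb, dim=4):
    """Sum the word's own vector with the vectors of its character n-grams.
    Missing words or n-grams contribute zero vectors."""
    vec = list(word_emb.get(word, [0.0] * dim))
    for g in char_ngrams(word):
        for i, x in enumerate(ngram_emb.get(g, [0.0] * dim)):
            vec[i] += x
    return vec

# Illustrative embedding tables (assumed values, not learned):
word_emb = {"cats": [1.0, 0.0, 0.0, 0.0]}
ngram_emb = {"<ca": [0.0, 1.0, 0.0, 0.0], "cat": [0.0, 0.0, 1.0, 0.0]}
print(char_ngrams("cats"))                    # ['<ca', 'cat', 'ats', 'ts>']
print(compose("cats", word_emb, ngram_emb))   # [1.0, 1.0, 1.0, 0.0]
```

Because n-grams are shared across words, rare and unseen words still receive informative vectors, which is what helps the RNN language model's perplexity.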
49

Corcoran, Padraig, Geraint Palmer, Laura Arman, Dawn Knight, and Irena Spasić. "Creating Welsh Language Word Embeddings." Applied Sciences 11, no. 15 (2021): 6896. http://dx.doi.org/10.3390/app11156896.

Full text
Abstract:
Word embeddings are representations of words in a vector space that models semantic relationships between words by means of distance and direction. In this study, we adapted two existing methods, word2vec and fastText, to automatically learn Welsh word embeddings taking into account syntactic and morphological idiosyncrasies of this language. These methods exploit the principles of distributional semantics and, therefore, require a large corpus to be trained on. However, Welsh is a minoritised language, hence significantly less Welsh language data are publicly available in comparison to Englis
APA, Harvard, Vancouver, ISO, and other styles
50

Si, Yuqi, Jingqi Wang, Hua Xu, and Kirk Roberts. "Enhancing clinical concept extraction with contextual embeddings." Journal of the American Medical Informatics Association 26, no. 11 (2019): 1297–304. http://dx.doi.org/10.1093/jamia/ocz096.

Full text
Abstract:
Abstract Objective Neural network–based representations (“embeddings”) have dramatically advanced natural language processing (NLP) tasks, including clinical NLP tasks such as concept extraction. Recently, however, more advanced embedding methods and representations (eg, ELMo, BERT) have further pushed the state of the art in NLP, yet there are no common best practices for how to integrate these representations into clinical tasks. The purpose of this study, then, is to explore the space of possible options in utilizing these new models for clinical concept extraction, including comparing thes
APA, Harvard, Vancouver, ISO, and other styles