Dissertations / Theses on the topic 'Word2Vec'

Author: Grafiati

Published: 5 June 2025

Last updated: 1 August 2025

Consult the top 50 dissertations / theses for your research on the topic 'Word2Vec'.

1

Pettersson, Tove. "Word2vec2syn : Synonymidentifiering med Word2vec." Thesis, Linköpings universitet, Statistik och maskininlärning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157638.

Abstract:

Within natural language processing (NLP), synonym identification is one of the linguistic challenges that many take on. Fodina Language Technology AB is a company that has created a tool, Termograph, intended to collect terms within companies and keep internal language use consistent. Its synonym identification consists of a combination of language-technology strategies, and Fodina wants broader coverage and a more dynamic extraction process. This work therefore aimed to develop a new method, in addition to the existing combination, specifically for synonym identification. En färdigträ

2

Šůstek, Martin. "Word2vec modely s přidanou kontextovou informací." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2017. http://www.nusl.cz/ntk/nusl-363837.

Abstract:

This thesis is concerned with the explanation of the word2vec models. Even though word2vec was introduced recently (2013), many researchers have already tried to extend, understand or at least use the model because it provides surprisingly rich semantic information. This information is encoded in an N-dimensional vector representation and can be recalled by performing some operations over the algebra. In addition, I suggest model modifications in order to obtain different word representations. To achieve that, I use public picture datasets. This thesis also includes parts dedicated to word2vec extensio

3

Lena, Erika. "Combining Vector Space Model and Word2Vec: a preliminary study." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23545/.

Abstract:

This paper investigates traditional Information Retrieval (IR) methods for syntax evaluation of documents. The main aim of this study is to search for a method to combine IR models with new studies in Natural Language Processing (NLP), specifically with the Word2Vec model. These techniques provide a semantic distributed representation of terms which can be used to improve retrieval and document ranking. The present work focuses on the selection of words to be used for document retrieval. A further point of interest is the search for a suitable weighting scheme to be applied for the eva

4

Handler, Abram. "An empirical study of semantic similarity in WordNet and Word2Vec." ScholarWorks@UNO, 2014. http://scholarworks.uno.edu/td/1922.

Abstract:

This thesis performs an empirical analysis of Word2Vec by comparing its output to WordNet, a well-known, human-curated lexical database. It finds that Word2Vec tends to uncover more of certain types of semantic relations than others, with Word2Vec returning more hypernyms, synonyms and hyponyms than hyponyms or holonyms. It also shows the probability that neighbors separated by a given cosine distance in Word2Vec are semantically related in WordNet. This result both adds to our understanding of the still-unknown Word2Vec and helps to benchmark new semantic tools built from word vectors.
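
As a rough illustration of the idea of neighbors separated by a given cosine distance, the sketch below ranks words by cosine distance over a tiny embedding table. The vectors, the vocabulary and the function names are hypothetical and are not taken from the thesis; real Word2Vec vectors usually have a few hundred dimensions and come from a trained model.

```python
import math

# Hypothetical 3-dimensional toy vectors; NOT real Word2Vec output.
TOY_VECTORS = {
    "dog": [1.0, 0.0, 0.0],
    "puppy": [0.9, 0.1, 0.0],
    "car": [0.0, 1.0, 0.0],
    "truck": [0.0, 0.9, 0.1],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance, defined here as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


def nearest_neighbors(word, vectors, k=2):
    """Return the k words closest to `word`, as (word, distance) pairs."""
    query = vectors[word]
    scored = [
        (other, cosine_distance(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


if __name__ == "__main__":
    for neighbor, distance in nearest_neighbors("dog", TOY_VECTORS):
        print(f"{neighbor}\t{distance:.4f}")
```

A study of the kind described could then look up each returned pair in a lexical database and tabulate, per distance bucket, how often the pair is related; that lookup step is not sketched here.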

5

Kojic, Kemal, and Emil Petersson. "Automatisk synonymgenerering med Word2Vec for query expansion inom e-handel." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20818.

Abstract:

This work investigates how well automatic synonym generation with the machine learning method Word2Vec, trained on a Google News dataset of one hundred billion words, is suited to query expansion in e-commerce. This is done using product and event data from a well-known fashion company: synonyms are generated by various methods from search strings logged in the event data, and these in turn form synonym dictionaries that are used in future searches through query expansion. To answer the study's research questions, a quantitative analysis is performed first. This analysis is performed on data s

6

Aguiar, Eliane Martins de. "Aplicação do Word2vec e do Gradiente descendente estocástico em tradução automática." Repositório Institucional do FGV, 2016. http://hdl.handle.net/10438/16798.

7

Lewenhaupt, Adam, and Emil Brismar. "The impact of corpus choice in domain specific knowledge representation." Thesis, KTH, Skolan för industriell teknik och management (ITM), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-220679.

Abstract:

Recent advances in the machine learning community, driven by larger datasets and novel algorithmic approaches to deep reinforcement learning, reward the use of large datasets. In this thesis, we examine whether dataset size has a significant impact on the recall quality in a very specific knowledge domain. We compare a large corpus extracted from Wikipedia to smaller ones from Stack Overflow and evaluate their representational quality of niche computer science knowledge. We show that a smaller dataset with high-quality data points greatly outperforms a larger one, even though the smaller is a subset of

8

Kärrby, Andreas, and Maja Tennander. "Detecting changes in word associations over short time periods : Analysing Twitter data with Word2Vec over time." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280315.

Abstract:

The meanings and connotations of words are constantly changing. Traditionally, one way to track such changes over relatively long time periods is by analysing variations in word usage in written records such as books and newspapers, and by comparing dictionary entries for the words of interest. For short time periods, however, these methods are unsuitable because too little such language data is available. In this thesis, we explore a method to detect changes in word associations over short time periods by analysing word usage on Twitter. A large number of tweets (roughly 3 million) were

9

Ramström, Kasper. "Botnet detection on flow data using the reconstruction error from Autoencoders trained on Word2Vec network embeddings." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-393285.

Abstract:

Botnet network attacks are a growing issue in network security. These types of attacks consist of compromised devices which are used for malicious activities. Many traditional systems use pre-defined pattern matching methods for detecting network intrusions based on the characteristics of previously seen attacks. This means that previously unseen attacks often go unnoticed as they do not have the patterns that the traditional systems are looking for. This paper proposes an anomaly detection approach which doesn’t use the characteristics of known attacks in order to detect new ones, instead

10

Blomkvist, Oscar. "Machine Learning Based Sentiment Classification of Text, with Application to Equity Research Reports." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-257506.

Abstract:

In this thesis, we analyse the sentiment in equity research reports written by analysts at Skandinaviska Enskilda Banken (SEB). We provide a description of established statistical and machine learning methods for classifying the sentiment in text documents as positive or negative. Specifically, a form of recurrent neural network known as long short-term memory (LSTM) is of interest. We investigate two different labelling regimes for generating training data from the reports. Benchmark classification accuracies are obtained using logistic regression models. Finally, two different word embedding

11

Garcia, Bernal Daniel. "Decentralizing Large-Scale Natural Language Processing with Federated Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-278822.

Abstract:

Natural Language Processing (NLP) is one of the most popular and visible forms of Artificial Intelligence in recent years. This is partly because it has to do with a common characteristic of human beings: language. NLP applications make it possible to create new services in the industrial sector in order to offer new solutions and provide significant productivity gains. All of this has happened thanks to the rapid progression of Deep Learning models. Large scale contextual representation models, such as Word2Vec, ELMo and BERT, have significantly advanced NLP in recent years. With these latest NLP model

12

Fong, Vivian Lin. "Software Requirements Classification Using Word Embeddings and Convolutional Neural Networks." DigitalCommons@CalPoly, 2018. https://digitalcommons.calpoly.edu/theses/1851.

Abstract:

Software requirements classification, the practice of categorizing requirements by their type or purpose, can improve organization and transparency in the requirements engineering process and thus promote requirement fulfillment and software project completion. Requirements classification automation is a prominent area of research as automation can alleviate the tediousness of manual labeling and loosen its necessity for domain-expertise. This thesis explores the application of deep learning techniques on software requirements classification, specifically the use of word embeddings for documen

13

Akinepally, Pratima Rao. "Investigating Performance of Different Models at Short Text Topic Modelling." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288531.

Abstract:

The key objective of this project was to quantitatively and qualitatively assess the performance of a sentence embedding model, Universal Sentence Encoder (USE), and a word embedding model, word2vec, at the task of topic modelling. The first step in the process was data collection. The data used for the project was podcast descriptions available at Spotify, and the topics associated with them. Following this, the data was used to generate description vectors and topic vectors using the embedding models, which were then used to assign topics to descriptions. The results from this study led to t

14

Hyberg, Martin. "Software Issue Time Estimation With Natural Language Processing and Machine Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295202.

Abstract:

Time estimation for software issues is crucial to planning projects. Developers and experts have for many decades tried to estimate time requirements for issues as accurately as possible. The methods that are used today are often time-consuming and complex. This thesis investigates whether the time estimation process can be done with natural language processing and machine learning. Three different word embeddings were used to represent the free text description: bag-of-words with tf-idf weighting, word2vec and fastText. The different word embeddings were then fed into two types of machine learning

15

Le, Thu Anh. "An Exploration of the Word2vec Algorithm: Creating a Vector Representation of a Language Vocabulary that Encodes Meaning and Usage Patterns in the Vector Space Structure." Thesis, University of North Texas, 2016. https://digital.library.unt.edu/ark:/67531/metadc849728/.

Abstract:

This thesis is an exploration and exposition of a highly efficient shallow neural network algorithm called word2vec, which was developed by T. Mikolov et al. in order to create vector representations of a language vocabulary such that information about the meaning and usage of the vocabulary words is encoded in the vector space structure. Chapter 1 introduces natural language processing, vector representations of language vocabularies, and the word2vec algorithm. Chapter 2 reviews the basic mathematical theory of deterministic convex optimization. Chapter 3 provides background on some concepts

16

Korger, Christina. "Clustering of Distributed Word Representations and its Applicability for Enterprise Search." Master's thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-208869.

Abstract:

Machine learning of distributed word representations with neural embeddings is a state-of-the-art approach to modelling semantic relationships hidden in natural language. The thesis “Clustering of Distributed Word Representations and its Applicability for Enterprise Search” covers different aspects of how such a model can be applied to knowledge management in enterprises. A review of distributed word representations and related language modelling techniques, combined with an overview of applicable clustering algorithms, constitutes the basis for practical studies. The latter have two goals: fi

17

Alkathiri, Abdul Aziz. "Decentralized Large-Scale Natural Language Processing Using Gossip Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281277.

Abstract:

The field of Natural Language Processing in machine learning has seen rising popularity and use in recent years. The nature of Natural Language Processing, which deals with natural human language and computers, has led to the research and development of many algorithms that produce word embeddings. One of the most widely-used of these algorithms is Word2Vec. With the abundance of data generated by users and organizations and the complexity of machine learning and deep learning models, performing training using a single machine becomes unfeasible. The advancement in distributed machine learning

18

Lekic, Sasa, and Kasper Liu. "Intent classification through conversational interfaces : Classification within a small domain." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-257863.

Abstract:

Natural language processing and machine learning are subjects undergoing intense study nowadays. These fields are continually spreading, and are more interrelated than ever before. A case in point is text classification, which is an instance of a machine learning (ML) application in natural language processing (NLP). Although these subjects have evolved over recent years, they still have some problems that have to be considered. Some are related to the computing power that techniques from these subjects require, whereas others relate to how much training data they require. The research problem addressed

19

Foschini, Federico. "Analisi della polarità su chat di livestreaming." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/10561/.

Abstract:

Sentiment analysis, which originated in computer science, is one of the most active research areas in natural language analysis and has also spread widely to other scientific fields such as the social sciences, economics and marketing. The enormous spread of sentiment analysis coincides with the growth of so-called social media: e-commerce and product review sites, discussion forums, blogs, micro-blogs and various social networks. The aim of this thesis was to design a sentiment analysis system capable of rileva

20

Bugo, Laura. "authorship analysis: studio delle metodologie e sviluppo di un sistema di riconoscimento." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Abstract:

The aim of this work is to implement an author-recognition program that can identify, among a group of suspects, the author of an unknown text, given as input some texts for each suspect. Stylistic features were extracted from the authors' texts; they were built on experiments from the literature and on new technologies not yet tested on the authorship attribution problem. The stylistic features built are then used to recognize the authors of texts whose authorship is unknown

21

Elm, Emilia. "Comparison of methods applied to job matching based on soft skills." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-420374.

Abstract:

The expression "Hire for attitude, train for skills" is used as a motive to create a matching program where companies and job seekers' soft qualities are measured and compared against each other. Are there better or worse methods for this purpose, and how do they compare with each other? By associating soft qualities with companies and job seekers, it is possible to generate a value for how well they match. Therefore, data has been collected on several companies and job seekers. Their associated qualities are then translated into numerical vectors that can be used for matching purposes, wher

22

Gouron, Romain Víctor Olivier. "Estudiando obras literarias con herramientas de procesamiento de lenguaje natural." Tesis, Universidad de Chile, 2017. http://repositorio.uchile.cl/handle/2250/146690.

Abstract:

Ingeniero Civil Matemático. In recent years, natural language processing (NLP) has seen important advances. Specifically, in 2013 Google released "word2vec", an algorithm that, from a given corpus, proposes a vector representation of the words that compose it. The algorithm has been very successful mainly for two reasons: the first is the low computational cost of its training, which allowed widespread use, while the second is the intuitive topology induced by the vector representation, illustrated by the

23

Giorgis, Stavros. "Evaluation of Approaches for Representation and Sentiment of Customer Reviews." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291214.

Abstract:

Classification of sentiment on customer reviews is a real-world application for many companies that offer text analytics and opinion extraction on customer reviews on different domains such as consumer electronics, hotels, restaurants, and car rental agencies. Natural Language Processing’s latest progress has seen the development of many new state-of-the-art approaches for representing the meaning of sentences, phrases, and words in the text using vector space models, so-called embeddings. In this thesis, we evaluated the most current and most popular text representation techniques against tra

24

Murgia, Antonio. "Lightweight Internet Traffic Classification - A Subject Based Solution with Word Embeddings." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/10569/.

Abstract:

Internet traffic classification is a relevant and mature research field, yet one of growing importance and with still-open technical challenges, partly because of the pervasive presence of Internet-connected devices in everyday life. We claim the need for innovative traffic classification solutions capable of being lightweight, of adopting a domain-based approach, and of not only concentrating on application-level protocol categorization but also classifying Internet traffic by subject. To this purpose, this paper proposes an original classification solution that leverages domain name informati

25

Wang, Run Fen. "Semantic Text Matching Using Convolutional Neural Networks." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-362134.

Abstract:

Semantic text matching is a fundamental task for many applications in Natural Language Processing (NLP). Traditional methods that use term frequency-inverse document frequency (TF-IDF) to match exact words in documents have one strong drawback: TF-IDF is unable to capture semantic relations between closely related words, which leads to a disappointing matching result. Neural networks have recently been used for various applications in NLP, and achieved state-of-the-art performance on many tasks. Recurrent Neural Networks (RNN) have been tested on text classification and text matching, but it d

26

Kindberg, Erik. "Word embeddings and Patient records : The identification of MRI risk patients." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157467.

Abstract:

Identification of risks ahead of MRI examinations is identified as a cumbersome and time-consuming process at the Linköping University Hospital radiology clinic. The hospital staff often have to search through large amounts of unstructured patient data to find information about implants. Word embeddings have been identified as a possible tool to speed up this process. The purpose of this thesis is to evaluate this method, which is done by training a Word2Vec model on patient journal data and analyzing the close neighbours of key search words by calculating cosine similarity. The 50 closest n

27

Lachmann, Tim, and Johan Sabel. "Distributionella representationer av ord för effektiv informationssökning : Algoritmer för sökning i kundsupportforum." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209695.

Abstract:

As the amount of information in society grows, greater demands are placed on more refined methods for searching and handling information. Extracting relevant data from internal company systems becomes a more complex task as larger volumes of information must be handled and much communication moves to digital platforms. Methods for vector-based word embedding have made great progress in recent years; in particular, in 2013 Google showed groundbreaking results with the Word2vec model, outperforming older methods. We implement a search engine that uses word embeddings based on Word2vec and l

28

Wallin, Moa. "Ambiguous synonyms : Implementing an unsupervised WSD system for division of synonym clusters containing multiple senses." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157622.

Abstract:

When clustering together synonyms, complications arise in cases of the words having multiple senses as each sense’s synonyms are erroneously clustered together. The task of automatically distinguishing word senses in cases of ambiguity, known as word sense disambiguation (WSD), has been an extensively researched problem over the years. This thesis studies the possibility of applying an unsupervised machine learning based WSD-system for analysing existing synonym clusters (N = 149) and dividing them correctly when two or more senses are present. Based on sense embeddings induced from a large co

29

Fulda, Nancy Ellen. "Semantically Aligned Sentence-Level Embeddings for Agent Autonomy and Natural Language Understanding." BYU ScholarsArchive, 2019. https://scholarsarchive.byu.edu/etd/7550.

Abstract:

Many applications of neural linguistic models rely on their use as pre-trained features for downstream tasks such as dialog modeling, machine translation, and question answering. This work presents an alternate paradigm: Rather than treating linguistic embeddings as input features, we treat them as common sense knowledge repositories that can be queried using simple mathematical operations within the embedding space, without the need for additional training. Because current state-of-the-art embedding models were not optimized for this purpose, this work presents a novel embedding model designe

30

Sundström, Johan. "Sentiment analysis of Swedish reviews and transfer learning using Convolutional Neural Networks." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-339066.

Abstract:

Sentiment analysis is a field within machine learning that focuses on determining the contextual polarity of subjective information. It is a technique that can be used to analyze the "voice of the customer" and has been applied with success in the English language to opinionated information such as customer reviews, political opinions and social media data. A major problem with machine learning models is that they are domain dependent and will therefore not perform well in other domains. Transfer learning or domain adaptation is a research field that studies a model's ability to transfer k

31

van Luenen, Anne Fleur. "Recognising Moral Foundations in Online Extremist Discourse : A Cross-Domain Classification Study." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-426921.

Abstract:

So far, studies seeking to recognise moral foundations in texts have been relatively successful (Araque et al., 2019; Lin et al., 2018; Mooijman et al., 2017; Rezapour et al., 2019). There are, however, two issues with these studies: Firstly, it is an extensive process to gather and annotate sufficient material for training. Secondly, models are only trained and tested within the same domain. It is yet unexplored how these models for moral foundation prediction perform when tested in other domains, but from their experience with annotation, Hoover et al. (2017) describe how moral sentiments on

32

Svensson, Pontus. "Automated Image Suggestions for News Articles : An Evaluation of Text and Image Representations in an Image Retrieval System." Thesis, Linköpings universitet, Interaktiva och kognitiva system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166669.

Abstract:

Multimodal machine learning is a subfield of machine learning that aims to relate data from different modalities, such as texts and images. One of the many applications that could be built upon this technique is an image retrieval system that, given a text query, retrieves suitable images from a database. In this thesis, a retrieval system based on canonical correlation is used to suggest images for news articles. Different dense text representations produced by Word2vec and Doc2vec, and image representations produced by pre-trained convolutional neural networks are explored to find out how th

33

Ankaräng, Fredrik, and Fabian Waldner. "Evaluating Random Forest and a Long Short-Term Memory in Classifying a Given Sentence as a Question or Non-Question." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-262209.

Abstract:

Natural language processing and text classification are topics of much discussion among researchers of machine learning. Contributions in the form of new methods and models are presented on a yearly basis. However, less attention is paid to comparing models, especially comparing less complex models with state-of-the-art models. This paper compares a Random Forest with a Long Short-Term Memory neural network for the task of classifying sentences as questions or non-questions, without considering punctuation. The models were trained and optimized on chat data from a Swedish insurance company

34

Futia, Giuseppe. "Neural Networks for Building Semantic Models and Knowledge Graphs." Doctoral thesis, Politecnico di Torino, 2020. http://hdl.handle.net/11583/2850594.

35

Jadrníček, Zbyněk. "Shlukování slov podle významu." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2015. http://www.nusl.cz/ntk/nusl-234899.

Abstract:

This thesis focuses on the problem of semantic similarity of words in the English language. First, the reader is introduced to the theory of word sense clustering; selected methods and tools related to the topic are then described. In the practical part, we design and implement a system for determining semantic similarity using the Word2Vec tool, focusing in particular on biomedical texts from the MEDLINE database. At the end of the thesis, we discuss the results achieved and give some ideas for improving the system.

36

Moon, Gordon Euhyun. "Parallel Algorithms for Machine Learning." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1561980674706558.

37

Malmberg, Jacob, Öhman Marcus Nystad, and Alexandra Hotti. "Implementing Machine Learning in the Credit Process of a Learning Organization While Maintaining Transparency Using LIME." Thesis, KTH, Industriell ekonomi och organisation (Inst.), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-232579.

Full text

Abstract:

To determine whether a credit limit for a corporate client should be changed, a financial institution writes a PM containing text and financial data, which is then assessed by a credit committee that decides whether to increase the limit or not. To make this process more efficient, machine learning algorithms were used to classify the credit PMs instead of a committee. Since most machine learning algorithms are black boxes, the LIME framework was used to find the most important features driving the classification. The results of this study show that credit memos can be classified with high accuracy


38

Maitre, Julien. "Détection et analyse des signaux faibles. Développement d’un framework d’investigation numérique pour un service caché Lanceurs d’alerte." Thesis, La Rochelle, 2022. http://www.theses.fr/2022LAROS020.

Full text

Abstract:

This manuscript is part of the development of an automatic document analysis platform associated with a secure whistleblower service of the GlobalLeaks type. We propose a pipeline covering extraction from document corpora, semi-automated analysis, and search by means of Web queries, in order ultimately to provide dashboards describing potential weak signals. We identify and resolve a number of inherent methodological and technological obstacles relating to: 1) the automatic analysis of textual content with a minimum of a priori assumptions, 2) the enrichment


39

Moyer, Eric David. "What Machines Understand about Personality Words after Reading the News." Wright State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=wright1404902086.

Full text


40

Keisala, Simon. "Using a Character-Based Language Model for Caption Generation." Thesis, Linköpings universitet, Interaktiva och kognitiva system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-163001.

Full text

Abstract:

Using AI to automatically describe images is a challenging task. The aim of this study has been to compare the use of character-based language models with one of the current state-of-the-art token-based language models, im2txt, to generate image captions, with a focus on morphological correctness. Previous work has shown that character-based language models are able to outperform token-based language models in morphologically rich languages. Other studies show that simple multi-layered LSTM blocks are able to learn to replicate the syntax of their training data. To study the usability of character


41

Yu, Chen-Ning, and 游鎮寧. "Situation Retrieval Cognition Based on Word2Vec." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/9fz33a.

Full text

Abstract:

Master's thesis, National Chin-Yi University of Technology, Department of Computer Science and Information Engineering, academic year 105. This study uses word vectors to model the relationships between words, and designs and analyzes a human-computer interaction retrieval system based on Word2Vec that identifies what the user wants to express without requiring a match against computer-defined keywords. The user can develop interactive and matching situations without entering specific words. This paper develops more flexible search keyword matching to address the inflexibility of chat robots. In this paper, the system will automatically match the keyword to find the user expressed the imp


42

Huang, Chung-Ting, and 黃中廷. "A Study of word2vec: Application of Media Bias Investigation." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/rhjs53.

Full text

Abstract:

Master's thesis, National Chengchi University, Department of Statistics, academic year 107. News media play an important role in transmitting information and supervising the government, but the massive volume of news, especially political news, brings with it the problem of media bias. Word2Vec is used to map categorical data into real number space. The correlation between words can be measured after quantifying. In this paper, we apply Word2Vec to news data from Taiwanese electronic media, capturing keywords and analyzing them to determine whether media bias exists. We also explore the differences of views between the model in Word2Vec and original
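The idea in this abstract, that words mapped to real-valued vectors can be compared numerically, can be illustrated with a minimal sketch. It is not the thesis's method: the three-dimensional vectors and the helper names `cosine_similarity` and `most_similar` are invented for illustration, and cosine similarity is assumed here as the measure of relatedness because it is the usual choice for Word2Vec vectors.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, made up for this example.
# Real Word2Vec vectors are learned from a corpus and have far more dimensions.
toy_vectors = {
    "election": [0.9, 0.1, 0.0],
    "vote": [0.8, 0.2, 0.1],
    "weather": [0.0, 0.1, 0.9],
}


def most_similar(word, vectors):
    """Rank every other word by cosine similarity to `word`, highest first."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("election", toy_vectors)
```

With these toy values, "vote" ranks above "weather" as a neighbour of "election", which is the kind of word-to-word relationship such an analysis would inspect.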


43

Wu, Zong-Yao, and 吳宗耀. "Sentiment Analysis for Patient-Author Text: Using Word2Vec and Symptoms." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/q78nt2.

Full text

Abstract:

Master's thesis, Chung Yuan Christian University, Graduate Institute of Information and Computer Engineering, academic year 105. Sentiment analysis (SA) has recently been gaining popularity. Most previous work studied product reviews with machine learning techniques to predict the sentiment polarity. They focused on how to build patterns such as statistical language models or to extract semantic features from texts. In this paper, we apply SA techniques to patient-authored text on online medical communities. Our datasets are patient-authored text (PAT) from a well-known medical website, patientslikeme.com (PLM). Patients can share mood phrases, severity of symptoms, treatment, and quality of


44

Wang, Yu-Hsuan, and 王育軒. "Segmental Audio Word2Vec: Representing Utterances as Sequences of Audio Word Vectors." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/3zy3z6.

Full text


45

Cheng, Hao-Hsin, and 鄭皓心. "Developing an Automated Scoring Technique for Divergent Thinking Tests Based on word2vec." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/abgw56.

Full text


46

TIAN, JIA-JHEN, and 田家禎. "Using Word2vec to Examine the Mappings between Educational Objectives and Core Competencies." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/yrs934.

Full text

Abstract:

Master's thesis, National Pingtung University, Master's Program of the Department of Information Management, academic year 108. Educational Objectives aim to educate and shape the talents of students, and core competencies are the abilities that students need to possess to achieve these Educational Objectives. If the correspondence between Educational Objectives and core competencies is not perfect, certain Educational Objectives may be difficult to achieve. Colleges and universities usually develop their Educational Objectives and core competencies by recruiting scholars and experts inside and outside the school to determine the corresponding relationship manually. T


47

Shen, Chia-Hao, and 沈家豪. "Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/z7v2w8.

Full text


48

Gungor, Abdulmecit. "Benchmarking authorship attribution techniques using over a thousand books by fifty Victorian era novelists." Thesis, 2018. https://doi.org/10.7912/C21T01.

Full text

Abstract:

Indiana University-Purdue University Indianapolis (IUPUI). Authorship attribution (AA) is the process of identifying the author of a given text and from the machine learning perspective, it can be seen as a classification problem. In the literature, there are a lot of classification methods for which feature extraction techniques are conducted. In this thesis, we explore information retrieval techniques such as Word2Vec, paragraph2vec, and other useful feature selection and extraction techniques for a given text with different classifiers. We have performed experiments on novels that are ext


49

Akbar, Shayan Ali A. "Source code search for automatic bug localization." Thesis, 2020.

Find full text

Abstract:

This dissertation advances the state of the art in information retrieval (IR) based automatic bug localization for large software systems. We present techniques from three generations of IR based bug localization and compare their performances on our large and diverse bug localization dataset, the Bugzbook dataset. The three generations span over fifteen years of research in mining software repositories for bug localization and include: (1) the generation of simple bag-of-words (BoW) based techniques, (2) the generation in which software-centric information such as bug and code change hist


50

Pombo, José Luís Fava de Matos. "Landing on the right job : a machine learning approach to match candidates with jobs applying semantic embeddings." Master's thesis, 2019. http://hdl.handle.net/10362/60405.

Full text

Abstract:

Project Work presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Screening job applications is a challenging and time-consuming task to execute manually. For recruiting companies such as Landing.Jobs it poses constraints on the ability to scale the business. Some systems have been built to assist recruiters in screening applications, but they tend to overlook the challenges related to natural language. On the other hand, most people nowadays, especially in the IT sector, use the Internet to look for jobs; however, given the huge amount

