Dissertations / Theses on the topic 'Attention LSTM'

Consult the top 16 dissertations / theses for your research on the topic 'Attention LSTM.'

1

Singh, J. P., A. Kumar, Nripendra P. Rana, and Y. K. Dwivedi. "Attention-based LSTM network for rumor veracity estimation of tweets." Springer, 2020. http://hdl.handle.net/10454/17942.

Abstract:
Twitter has become a fertile place for rumors, as information can spread to a large number of people immediately. Rumors can mislead public opinion, weaken social order, decrease the legitimacy of government, and pose a significant threat to social stability. Therefore, timely detection and debunking of rumors are urgently needed. In this work, we propose an attention-based Long Short-Term Memory (LSTM) network that uses tweet text together with thirteen linguistic and user features to distinguish rumor from non-rumor tweets. The performance of the proposed model is compared with several conventional machine learning and deep learning models. The proposed attention-based LSTM model achieved an F1-score of 0.88 in classifying rumor and non-rumor tweets, which is better than the state-of-the-art results. The proposed system can reduce the impact of rumors on society, reduce losses of life and money, and build users' trust in social media platforms.
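As a rough illustration of the general idea in this abstract (attention over recurrent hidden states, combined with hand-crafted features), the following NumPy sketch pools a sequence of hidden states with softmax attention weights and concatenates the result with a 13-dimensional feature vector. It is not the authors' architecture: the scoring vector `query`, the dimensions, and the random inputs are all assumptions made for the example.

```python
import numpy as np

def attention_pool(hidden_states, query):
    """Softmax-weighted sum of per-step hidden states.

    hidden_states: (T, d) array, one row per time step (e.g. LSTM outputs).
    query: (d,) scoring vector; learned in a real model, fixed here (assumption).
    """
    scores = hidden_states @ query                 # one score per time step, shape (T,)
    scores = scores - scores.max()                 # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ hidden_states              # weighted average, shape (d,)
    return context, weights

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))        # hypothetical hidden states: 6 tokens, 4 units
q = rng.normal(size=4)             # hypothetical scoring vector
extra = rng.normal(size=13)        # stand-in for thirteen linguistic and user features

context, weights = attention_pool(H, q)
classifier_input = np.concatenate([context, extra])  # would feed a final classifier
```

The attention weights sum to one, so `context` is a convex combination of the hidden states; inspecting `weights` shows which time steps contributed most.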
2

Kindbom, Hannes. "Investigating the Attribution Quality of LSTM with Attention and SHAP : Going Beyond Predictive Performance." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302412.

Abstract:
Estimating each marketing channel’s impact on conversion can help advertisers develop strategies and spend their marketing budgets optimally. This problem is often referred to as attribution modelling, and it is gaining increasing attention in both industry and academia as access to online tracking data improves. With its focus on higher predictive performance, the Long Short-Term Memory (LSTM) architecture is currently trending as a data-driven solution to attribution modelling. However, such deep neural networks have been criticised for being difficult to interpret. Interpretability is critical, since channel attributions are generally obtained by studying how a model makes a binary conversion prediction given a sequence of clicks or views of ads in different channels. Therefore, this degree project studies the quality of LSTM attributions, calculated with SHapley Additive exPlanations (SHAP), attention and fractional scores, and compares them to three baseline models. The fractional score is the mean difference in a model’s predicted conversion probability with and without a channel. Furthermore, a synthetic data generator based on a Poisson process is developed and validated against real data to measure attribution quality as the Mean Absolute Error (MAE) between calculated attributions and the true causal relationships between channel clicks and conversions. The experimental results demonstrate that the quality of attributions is not unambiguously reflected by the predictive performance of LSTMs. In general, it is not possible to assume a high attribution quality solely based on high predictive performance. For example, all models achieve ~82% accuracy on real data, whereas LSTM Fractional and SHAP produce the lowest attribution quality of 0.0566 and 0.0311 MAE respectively. This can be compared to an improved MAE of 0.0058, which is obtained with a Last-Touch Attribution (LTA) model. The attribution quality also varies significantly depending on which attribution calculation method is used for the LSTM. This suggests that the ongoing quest for improved accuracy may be questioned and that it is not always justified to use an LSTM when aiming for high quality attributions.
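The fractional score and the MAE-based quality measure described in this abstract can be sketched as follows. This is only one plausible reading of "with and without a channel" (dropping every occurrence of the channel from a sequence), and `toy_predict`, the channel names, and the effect sizes are hypothetical stand-ins for a trained model and real data.

```python
import numpy as np

def fractional_scores(predict, sequences, channels):
    """Mean change in predicted conversion probability when a channel is removed."""
    scores = {}
    for ch in channels:
        diffs = []
        for seq in sequences:
            if ch not in seq:
                continue                                # channel absent: nothing to remove
            without = [c for c in seq if c != ch]       # same sequence minus the channel
            diffs.append(predict(seq) - predict(without))
        scores[ch] = float(np.mean(diffs)) if diffs else 0.0
    return scores

def mae(estimated, true):
    """Mean absolute error between estimated and true per-channel attributions."""
    return float(np.mean([abs(estimated[c] - true[c]) for c in true]))

# Hypothetical "true" causal effect of each channel on conversion probability.
TRUE_EFFECT = {"search": 0.3, "email": 0.1, "display": 0.0}

def toy_predict(seq):
    """Toy stand-in for a trained model: base rate plus additive channel effects."""
    return min(1.0, 0.05 + sum(TRUE_EFFECT[c] for c in set(seq)))

sequences = [["search", "email"], ["display", "search"], ["email"], ["display"]]
scores = fractional_scores(toy_predict, sequences, list(TRUE_EFFECT))
quality = mae(scores, TRUE_EFFECT)
```

Because the toy model is purely additive, the fractional scores recover the true effects and the MAE is essentially zero; with a real LSTM the two can diverge, which is the gap the abstract measures.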
3

Forch, Valentin, Julien Vitay, and Fred H. Hamker. "Recurrent Spatial Attention for Facial Emotion Recognition." Technische Universität Chemnitz, 2020. https://monarch.qucosa.de/id/qucosa%3A72453.

Abstract:
Automatic processing of emotion information through deep neural networks (DNN) can have great benefits (e.g., for human-machine interaction). Vice versa, machine learning can profit from concepts known from human information processing (e.g., visual attention). We employed a recurrent DNN incorporating a spatial attention mechanism for facial emotion recognition (FER) and compared the output of the network with results from human experiments. The attention mechanism enabled the network to select relevant face regions to achieve state-of-the-art performance on a FER database containing images from realistic settings. A visual search strategy showing some similarities with human saccading behavior emerged when the model’s perceptive capabilities were restricted. However, the model then failed to form a useful scene representation.
4

Bopaiah, Jeevith. "A recurrent neural network architecture for biomedical event trigger classification." UKnowledge, 2018. https://uknowledge.uky.edu/cs_etds/73.

Abstract:
A “biomedical event” is a broad term used to describe the roles and interactions between entities (such as proteins, genes and cells) in a biological system. The task of biomedical event extraction aims at identifying and extracting these events from unstructured texts. An important component in the early stage of the task is biomedical trigger classification, which involves identifying and classifying words or phrases that indicate an event. In this thesis, we present our work on biomedical trigger classification developed using the multi-level event extraction dataset. We restrict the scope of our classification to 19 biomedical event types grouped under four broad categories: Anatomical, Molecular, General and Planned. While most existing approaches are based on traditional machine learning algorithms that require extensive feature engineering, our model relies on neural networks to implicitly learn important features directly from the text. We use natural language processing techniques to transform the text into vectorized inputs that can be used in a neural network architecture. To our knowledge, this is the first time neural attention strategies have been explored in the area of biomedical trigger classification. Our best results were obtained from an ensemble of 50 models, which produced a micro F-score of 79.82%, an improvement of 1.3% over the previous best score.
5

Soncini, Filippo. "Classificazione di documenti tramite reti neurali." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20509/.

Abstract:
This thesis was proposed with the aim of tackling the problem of document classification using both visual and textual content, analysing different networks and different combinations of them in order to then develop a custom model.
6

Näslund, Per. "Artificial Neural Networks in Swedish Speech Synthesis." Thesis, KTH, Tal-kommunikation, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239350.

Abstract:
Text-to-speech (TTS) systems have entered our daily lives in the form of smart assistants and many other applications. Contemporary research applies machine learning and artificial neural networks (ANNs) to synthesize speech. It has been shown that these systems outperform the older concatenative and parametric methods. In this paper, ANN-based methods for speech synthesis are explored and one of the methods is implemented for the Swedish language. The implemented method is dubbed “Tacotron” and is a first step towards end-to-end ANN-based TTS which puts many different ANN techniques to work. The resulting system is compared to a parametric TTS through a strength-of-preference test that is carried out with 20 Swedish-speaking subjects. A statistically significant preference for the ANN-based TTS is found. Test subjects indicate that the ANN-based TTS performs better than the parametric TTS when it comes to audio quality and naturalness but sometimes lacks in intelligibility.
7

Carman, Benjamin Andrew. "Translating LaTeX to Coq: A Recurrent Neural Network Approach to Formalizing Natural Language Proofs." Ohio University Honors Tutorial College / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ouhonors161919616626269.

8

Ujihara, Rintaro. "Multi-objective optimization for model selection in music classification." Thesis, KTH, Optimeringslära och systemteori, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370.

Abstract:
With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, it is known that how music samples should be preprocessed and which classification algorithm should be used depend on the data set and the objective of each project. The collaborating company of this thesis, Ichigoichie AB, is currently developing a system to categorize music data into positive/negative classes. To enhance the accuracy of the existing system, this project aims to find the best model through experiments with six audio features (Mel spectrogram, MFCC, HPSS, Onset, CENS, Tonnetz) and several machine learning models, including deep neural network models, for the classification task. For each model, hyperparameter tuning is performed and the model evaluation is carried out according to Pareto optimality with regard to accuracy and execution time. The results show that the most promising model accomplished 95% correct classification with an execution time of less than 15 seconds.
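Model selection by Pareto optimality over accuracy and execution time, as mentioned in this abstract, can be sketched in a few lines. The model names and numbers below are invented for illustration and are not results from the thesis.

```python
def pareto_front(models):
    """Return names of models not dominated on (higher accuracy, lower time).

    models: list of (name, accuracy, seconds) tuples.
    A model is dominated if another is at least as good on both
    criteria and strictly better on at least one.
    """
    front = []
    for name, acc, secs in models:
        dominated = any(
            (a >= acc and s <= secs) and (a > acc or s < secs)
            for _, a, s in models
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical candidates: (name, accuracy, execution time in seconds).
candidates = [
    ("model_a", 0.95, 14.0),
    ("model_b", 0.90, 3.0),
    ("model_c", 0.89, 9.0),    # dominated by model_b (less accurate and slower)
    ("model_d", 0.95, 20.0),   # dominated by model_a (same accuracy, slower)
]
front = pareto_front(candidates)
```

Here `front` contains `model_a` and `model_b`: neither beats the other on both criteria, so choosing between them is a trade-off rather than a ranking.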
9

GAO, SHAO-EN, and 高紹恩. "Share Price Trend Prediction Using Attention with LSTM Structure." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/n57t99.

Abstract:
Master's thesis
National Chin-Yi University of Technology
Department of Computer Science and Information Engineering
107
The stock market has a considerable impact on the financial market as a whole. Among prediction research topics, stock price movement prediction is a popular one. In this paper, stock price movements are predicted from various kinds of stock information using deep learning. The LSTM-based architecture with attention proposed in this paper is shown experimentally to improve prediction accuracy effectively. Since a stock's price rise is usually related to its past prices, a long short-term memory (LSTM) based architecture is proposed. LSTM improves on the traditional RNN's handling of long-term dependencies and effectively improves the accuracy and stability of prediction; adding attention further improves the accuracy and stability of the network.
10

Tseng, Po-Yen, and 曾博彥. "Android Malware Analysis Based on System Call sequences and Attention-LSTM." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/gdrth9.

Abstract:
Master's thesis
National Central University
Department of Information Management
107
With the popularity of Android mobile devices, detecting and protecting against malicious software has become an important issue. Studies have proposed that dynamic analysis can overcome detection-evasion techniques such as code obfuscation. However, how to learn more detailed correlations among the sequence-type features extracted by dynamic analysis, so as to improve the accuracy of the classification model, remains the direction of many research efforts. This study extracts system call sequences as features and captures the correlations between system calls with a Long Short-Term Memory (LSTM) deep learning model. In addition, to prevent classification accuracy from degrading as the system call sequence grows longer, an attention mechanism is added to the classification model. The experimental results show that a deep neural network with a two-layer Bi-LSTM architecture and an attention mechanism distinguishes benign from malicious programs with an accuracy of 93.5%, and that the finer-grained classification of benign programs and two malicious types reaches an accuracy of 93.1%, showing excellent classification ability.
11

Su, Ruei-Ye, and 蘇瑞燁. "A Bi-directional LSTM-CNN Model with Attention for Chinese Sentiment Analysis." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/2y9j7r.

Abstract:
Master's thesis
Shu-Te University
Master's Program, Department of Computer Science and Information Engineering
107
With the massive development of social media, people are used to sharing personal ideas and opinions on social media platforms, and most people have personal viewpoints on specific topics. Over time, large amounts of data are generated that contain potentially valuable information from a business perspective. In the field of natural language processing (NLP), sentiment analysis of Chinese messages is one of the major approaches to grasping Internet public opinion. This paper originally proposed LSAEB-CNN (Bi-LSTM Self-Attention of Emoticon-Based Convolutional Neural Network), a deep learning method that combines Bi-directional Long Short-Term Memory (Bi-LSTM) with Convolutional Neural Networks (CNN) and embeds emoticons into Self-Attention. The method could effectively identify different emotional polarities without external knowledge, but its Self-Attention suffers from an excessive-focus problem. This paper thus proposes a further improved method: Bi-LSTM Multi-Head Attention of Emoticon-Based Convolutional Neural Network (LMAEB-CNN). Most importantly, the method lets each vector perform multi-layer operations. The data was collected from Plurk, the micro-blogging service, with deep learning conducted in Keras. Chinese micro-blogs were classified by sentiment polarity, and the study achieved an accuracy rate of about 98.9%, which is significantly higher than other methods.
12

Huang, Yen-Cheng, and 黃彥誠. "Deep Neural Network with Attention Mechansim and LSTM for Temporal Information Exploration in Classification of Motor-Imagery EEG." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/94vys5.

Abstract:
Master's thesis
National Chiao Tung University
Institute of Computer Science and Engineering
107
The EEG signal is a medium for realizing a brain-computer interface (BCI) system, which helps motor-disabled patients communicate with the outside world through external devices. The problems associated with this task include recordings with a poor signal-to-noise ratio and contamination from external body movements such as muscle activity, blinking, and head movement. Considerable variability between subjects and recording sessions compounds the difficulty of this task, particularly when seeking to train a model using trials obtained from all of the subjects. Recent works have demonstrated positive outcomes using CNNs for motor-imagery classification. This paper outlines two novel neural network architectures for the classification of motor-imagery EEG recordings using deep learning techniques. One of the proposed methods comprises an attention mechanism; the other is a CNN equipped with an LSTM. The attention mechanism in the former model calculates the importance of each electrode; the LSTM in the latter model is used to find the temporal information within features. Compared to the results obtained using a variety of state-of-the-art deep learning techniques, the proposed scheme represents a considerable advancement in classification accuracy when applied to the BCI Competition IV dataset IIa, reaching an accuracy of 85.2%. In addition, when the proposed models were applied to motor-imagery EEG data collected in this work, they yielded better results than a pure CNN model by 9.2%. Aside from comparing accuracy to assess the effectiveness of the proposed models, we also determine that the attention mechanism mentioned above performs the same process as CSP and common temporal pattern (CTP), wherein inputs from all classes are projected onto a similar coordinate system considered the optimal space for classification. Moreover, through power-feature correlation maps, visualization of the LSTM, and representation erasure determined by RL, we rationalize the semantic meanings behind the operations of the CNN and the LSTM and, eventually, illustrate two decisive factors of temporal features affecting the capability of the LSTM in sequence modeling: (i) the critical time range for classification and (ii) the correct frequency range for the event-related potential (ERP) which induces the activation of the features. These two factors could serve as indications for designing models consisting of CNN and RNN for processing other types of bio-signals that are also closely related to ERP.
13

Soudamalla, Sharath Kumar. "Implications of Conversational AI on Humanoid Robots." 2019. https://monarch.qucosa.de/id/qucosa%3A72426.

Abstract:
Humanizing Technologies GmbH develops intelligent software for the humanoid robots from Softbank Robotics. The main objective of this thesis is to develop and deploy conversational artificial intelligence software on the humanoid robots using deep learning techniques. The development of conversational agents using machine learning or artificial intelligence is an intriguing problem in natural language processing, and a great deal of research and experimentation is being conducted in this area. Currently, most chatbots are developed with rule-based programming and cannot hold a conversation that replicates real human interaction. This thesis addresses the issue by developing a deep learning conversational AI based on sequence-to-sequence models, attention mechanisms, transfer learning, active learning and beam search decoding, which emulates human-like conversation. The complete end-to-end conversational AI software is designed, implemented and deployed in this thesis work according to the conceptual specifications. The research objectives are successfully accomplished and the results of the proposed concept are discussed in detail.
14

Hsu, Yi-Kuan, and 許以觀. "A Factory-aware Attentional LSTM Model for PM2.5 Prediction." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/fcx28w.

Abstract:
Master's thesis
National Chiao Tung University
Institute of Information Management
107
With air quality becoming a global concern, many countries are facing serious air pollution problems. Monitoring stations have been established to collect air quality information, and scientists have been committed to the study of air quality prediction, but few studies have taken the different monitoring areas and industrial features into account. In this paper, we propose a deep neural network for PM2.5 prediction, named FAA-LSTM, which collects air quality data from three types of monitors together with factory data that is highly related to air quality. A spatial transformation component is designed to obtain the local factors by segmenting the monitoring areas into grids, and we consider the influence of neighboring factory data on local PM2.5 grids by adopting an attention mechanism to determine their importance. Next, the factor of the global air quality station is considered. We combine these heterogeneous data and feed them into a long short-term memory neural network to extract the hidden features and forecast PM2.5 concentrations. In this research, we evaluate our model FAA-LSTM with data from the EPA and Academia Sinica in Taichung; it surpasses the results of multiple methods, including linear regression, support vector regression, multi-layer perceptron and LSTM.
15

Wang, Yu-Jen, and 王育任. "Using Attentive to improve Recursive LSTM End-to-End Chinese Discourse Parsing." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/4zurd5.

Abstract:
Master's thesis
National Central University
Department of Computer Science and Information Engineering
107
A discourse parser can help us understand the relationships and connections between sentences from different angles, but tree-structured data still relies on manual annotation, which prevents this technology from being used directly in everyday applications. So far, there have been many studies on automatically constructing the complete tree structure by computer. Since deep learning has progressed rapidly in recent years, the construction method for discourse parsers has also changed from traditional SVM and CRF methods to the current recursive neural networks. In the Chinese discourse treebank CDTB, the parsing problem can be divided into four main subproblems: elementary discourse unit (EDU) segmentation, tree structure construction, center labeling, and sense labeling. In this paper, we use many state-of-the-art deep learning techniques, such as attentive recursive neural networks, self-attention, and BERT, to improve performance. In the end, we improve the F1 score of each task by more than 10%, reaching the best performance we know of so far.
16

Hsu, Chih-Jung, and 徐志榮. "Predicting Transportation Demand based on AR-LSTMs Model with Multi-Head Attention." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/j7pg8k.

Abstract:
Master's thesis
National Central University
Institute of Software Engineering
107
Smart transportation is a crucial issue for a smart city, and taxi demand forecasting is one of the important topics in smart transportation. If we can effectively predict taxi demand in the near future, we may be able to reduce the taxi vacancy rate, reduce passengers' waiting time, increase the number of trips per taxi, raise drivers' income, and diminish the power consumption and pollution caused by vehicle dispatches. This paper proposes an efficient taxi demand prediction model based on a state-of-the-art deep learning architecture. Specifically, we use the LSTM model as the foundation, because the LSTM model is effective at predicting time-series data. We enhance the LSTM model by introducing an attention mechanism so that traffic during both peak hours and off-peak periods can be predicted better. We leverage a multi-layer architecture to increase prediction accuracy. Additionally, we design a loss function that incorporates both the absolute mean-square-error (which tends to under-estimate the low taxi demand areas) and the relative mean-square-error (which tends to misestimate the high taxi demand areas). To validate our model, we conduct experiments on two real datasets: the NYC taxi demand dataset and the Taiwan Taxi taxi demand dataset for Taipei City. We compare the proposed model with non-machine-learning models, traditional machine learning models, and deep learning models. Experimental results show that the proposed model outperforms the baseline models.
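A loss that mixes absolute and relative mean-square-error, as this abstract describes, might look like the sketch below. The abstract does not give the exact formula, so the weighting parameter `alpha`, the smoothing constant `eps` in the denominator, and the linear combination itself are assumptions made for illustration.

```python
import numpy as np

def combined_loss(y_true, y_pred, alpha=0.5, eps=1.0):
    """Weighted sum of absolute MSE and relative MSE (assumed form).

    alpha: weight on the absolute term (hypothetical parameter).
    eps:   added to the denominator so zero-demand cells do not divide by zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    abs_mse = np.mean(err ** 2)                      # dominated by high-demand areas
    rel_mse = np.mean((err / (y_true + eps)) ** 2)   # dominated by low-demand areas
    return alpha * abs_mse + (1 - alpha) * rel_mse

# Same absolute error (2 trips) in a low-demand and a high-demand area.
low_high_true = [1, 100]
low_high_pred = [3, 102]
loss_abs_only = combined_loss(low_high_true, low_high_pred, alpha=1.0)
loss_rel_only = combined_loss(low_high_true, low_high_pred, alpha=0.0)
loss_mixed = combined_loss(low_high_true, low_high_pred)
```

In the example, the absolute term treats both areas identically, while the relative term is driven almost entirely by the low-demand area; mixing the two is one way to keep either kind of area from being neglected.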