Dissertations / Theses on the topic 'Attention LSTM'
Consult the top 16 dissertations / theses for your research on the topic 'Attention LSTM.'
Singh, J. P., A. Kumar, Nripendra P. Rana, and Y. K. Dwivedi. "Attention-based LSTM network for rumor veracity estimation of tweets." Springer, 2020. http://hdl.handle.net/10454/17942.
Twitter has become a fertile place for rumors, as information can spread to a large number of people immediately. Rumors can mislead public opinion, weaken social order, decrease the legitimacy of government, and pose a significant threat to social stability. Therefore, timely detection and debunking of rumors are urgently needed. In this work, we propose an attention-based Long Short-Term Memory (LSTM) network that uses tweet text together with thirteen different linguistic and user features to distinguish rumor from non-rumor tweets. The performance of the proposed attention-based LSTM model is compared with several conventional machine learning and deep learning models. The proposed model achieved an F1-score of 0.88 in classifying rumor and non-rumor tweets, which improves on the state-of-the-art results. The proposed system can reduce the impact of rumors on society, mitigate losses of life and money, and build firm user trust in social media platforms.
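As a hedged illustration of the attention-over-LSTM idea described in this abstract (not the authors' code), the sketch below pools a sequence of hidden-state vectors with softmax attention weights before classification; the hidden states and the attention weights are toy values chosen for the example.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_pool(hidden_states, attn_w):
    """Score each timestep with a learned vector attn_w, normalize the
    scores with softmax, and return the attention-weighted sum of the
    hidden states (the context vector) plus the weights themselves."""
    scores = [sum(w * h for w, h in zip(attn_w, h_t)) for h_t in hidden_states]
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(a * h_t[d] for a, h_t in zip(alphas, hidden_states))
               for d in range(dim)]
    return context, alphas

# Toy hidden states for a 3-step sequence (dim 2), standing in for LSTM outputs.
H = [[0.1, 0.3], [0.9, 0.2], [0.4, 0.8]]
context, alphas = attention_pool(H, attn_w=[1.0, 0.5])
```

The context vector would then feed a small dense layer that outputs the rumor/non-rumor probability; the timestep with the highest score (here the second) dominates the pooled representation.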
Kindbom, Hannes. "Investigating the Attribution Quality of LSTM with Attention and SHAP : Going Beyond Predictive Performance." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302412.
By estimating the influence each marketing channel has on conversions, advertisers can develop strategies and spend their marketing budgets optimally. This is often called attribution modelling, and it is receiving growing attention in both industry and academia as the availability of online tracking data increases. With its focus on achieving higher predictive performance, the Long Short-Term Memory (LSTM) network is currently a popular data-driven solution in attribution modelling. Such deep neural networks have, however, been criticized for being difficult to interpret. Interpretability matters, since channel attributions are generally obtained by studying how a model makes a binary conversion prediction given a sequence of clicks or impressions of ads in different channels. This thesis therefore studies and compares the quality of an LSTM's attributions, computed with SHapley Additive exPlanations (SHAP), attention, and fractional scores, against three baseline models. Fractional scores are computed as the mean difference in a model's predicted conversion probability with and without a given channel. In addition, a synthetic data generator based on a Poisson process is developed and validated against real data. The generator makes it possible to measure attribution quality as the Mean Absolute Error (MAE) between computed attributions and the true causal relationships between channel clicks and conversions. The experimental results show that attribution quality is not unambiguously reflected in an LSTM's predictive performance: it is generally not possible to assume high attribution quality from high predictive performance alone. For example, all models achieve ~82% predictive accuracy on real data, while LSTM Fractional and SHAP yield the lowest attribution quality at 0.0566 and 0.0311 MAE respectively, compared with the better MAE of 0.0058 obtained with a last-touch model.
Attribution quality also varies significantly depending on which attribution method is used for the LSTM. This suggests that the ongoing pursuit of higher predictive accuracy can be questioned, and that using an LSTM is not always justified when high-quality attributions are sought.
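A minimal sketch of the "fractional score" attribution described in this entry, under assumed interfaces (the `model` callable and the toy probabilities are illustrative, not from the thesis): a channel's score is the average change in predicted conversion probability when that channel is removed from the click sequence.

```python
def fractional_score(model, sequences, channel):
    """Average difference in predicted conversion probability with and
    without `channel`, over a set of click sequences."""
    diffs = []
    for seq in sequences:
        without = [c for c in seq if c != channel]
        diffs.append(model(seq) - model(without))
    return sum(diffs) / len(diffs)

def toy_model(seq):
    # Hypothetical stand-in model: conversion probability grows with the
    # number of touches, with the "search" channel counting double.
    score = sum(2.0 if c == "search" else 1.0 for c in seq)
    return score / (score + 5.0)

seqs = [["search", "display"], ["display", "social"], ["search"]]
attribution = fractional_score(toy_model, seqs, "search")
```

A channel that never appears in any sequence naturally receives a score of zero, since removing it changes nothing.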
Forch, Valentin, Julien Vitay, and Fred H. Hamker. "Recurrent Spatial Attention for Facial Emotion Recognition." Technische Universität Chemnitz, 2020. https://monarch.qucosa.de/id/qucosa%3A72453.
Bopaiah, Jeevith. "A recurrent neural network architecture for biomedical event trigger classification." UKnowledge, 2018. https://uknowledge.uky.edu/cs_etds/73.
Soncini, Filippo. "Classificazione di documenti tramite reti neurali" [Document classification with neural networks]. Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20509/.
Näslund, Per. "Artificial Neural Networks in Swedish Speech Synthesis." Thesis, KTH, Tal-kommunikation, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239350.
Full textTalsynteser, också kallat TTS (text-to-speech) används i stor utsträckning inom smarta assistenter och många andra applikationer. Samtida forskning applicerar maskininlärning och artificiella neurala nätverk (ANN) för att utföra talsyntes. Det har visats i studier att dessa system presterar bättre än de äldre konkatenativa och parametriska metoderna. I den här rapporten utforskas ANN-baserade TTS-metoder och en av metoderna implementeras för det svenska språket. Den använda metoden kallas “Tacotron” och är ett första steg mot end-to-end TTS baserat på neurala nätverk. Metoden binder samman flertalet olika ANN-tekniker. Det resulterande systemet jämförs med en parametriskt TTS genom ett graderat preferens-test som innefattar 20 svensktalande försökspersoner. En statistiskt säkerställd preferens för det ANN- baserade TTS-systemet fastställs. Försökspersonerna indikerar att det ANN-baserade TTS-systemet presterar bättre än det parametriska när det kommer till ljudkvalitet och naturlighet men visar brister inom tydlighet.
Carman, Benjamin Andrew. "Translating LaTeX to Coq: A Recurrent Neural Network Approach to Formalizing Natural Language Proofs." Ohio University Honors Tutorial College / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ouhonors161919616626269.
Full textUjihara, Rintaro. "Multi-objective optimization for model selection in music classification." Thesis, KTH, Optimeringslära och systemteori, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370.
Full textI och med genombrottet av maskininlärningstekniker har forskning kring känsloklassificering i musik sett betydande framsteg genom att kombinera olikamusikanalysverktyg med nya maskinlärningsmodeller. Trots detta är hur man förbehandlar ljuddatat och valet av vilken maskinklassificeringsalgoritm som ska tillämpas beroende på vilken typ av data man arbetar med samt målet med projektet. Denna uppsats samarbetspartner, Ichigoichie AB, utvecklar för närvarande ett system för att kategorisera musikdata enligt positiva och negativa känslor. För att höja systemets noggrannhet är målet med denna uppsats att experimentellt hitta bästa modellen baserat på sex musik-egenskaper (Mel-spektrogram, MFCC, HPSS, Onset, CENS samt Tonnetz) och ett antal olika maskininlärningsmodeller, inklusive Deep Learning-modeller. Varje modell hyperparameteroptimeras och utvärderas enligt paretooptimalitet med hänsyn till noggrannhet och beräkningstid. Resultaten visar att den mest lovande modellen uppnådde 95% korrekt klassificering med en beräkningstid på mindre än 15 sekunder.
GAO, SHAO-EN, and 高紹恩. "Share Price Trend Prediction Using Attention with LSTM Structure." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/n57t99.
National Chin-Yi University of Technology
Department of Computer Science and Information Engineering
107 (ROC year, 2018)
The stock market has a considerable impact on the whole financial market, and among prediction research, stock price movement prediction is a hot topic. In this paper, stock price movements are predicted from various kinds of stock information by means of deep learning. Since future stock prices are usually related to past prices, a Long Short-Term Memory (LSTM) based architecture is proposed. LSTM mitigates the long-term dependency problem of traditional RNNs, and adding an attention mechanism further improves the accuracy and stability of the network. Experiments show that the proposed attention-based LSTM architecture effectively improves prediction accuracy.
Tseng, Po-Yen, and 曾博彥. "Android Malware Analysis Based on System Call sequences and Attention-LSTM." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/gdrth9.
National Central University
Department of Information Management
107 (ROC year, 2018)
With the popularity of Android mobile devices, detecting and protecting against malicious software has become an important issue. Studies have shown that dynamic analysis can overcome evasion techniques such as code obfuscation. However, how to learn more of the correlation within the sequence-type features extracted by dynamic analysis, so as to improve the accuracy of the classification model, is the direction of many research efforts. This study extracts system call sequences as features and captures the correlations between system calls with a Long Short-Term Memory (LSTM) deep learning model. In addition, to prevent longer system call sequences from reducing classification accuracy, an attention mechanism is added to the model. The experimental results show that, with a two-layer Bi-LSTM architecture and an attention mechanism, the model distinguishes benign from malicious programs with 93.5% accuracy, and classifies benign programs and two further malicious types with 93.1% accuracy, showing excellent classification ability.
Su, Ruei-Ye, and 蘇瑞燁. "A Bi-directional LSTM-CNN Model with Attention for Chinese Sentiment Analysis." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/2y9j7r.
Shu-Te University
Master's Program, Department of Computer Science and Information Engineering
107 (ROC year, 2018)
With the massive development of social media, people are used to sharing personal ideas and opinions on social media platforms, and most people hold personal viewpoints on specific topics. As time goes on, large amounts of data are generated, which contain potentially valuable information from a business perspective. In the field of NLP (Natural Language Processing), sentiment analysis of Chinese messages is one of the major approaches to grasping Internet public opinion. This paper first proposes LSAEB-CNN (Bi-LSTM Self-Attention of Emoticon-Based Convolutional Neural Network), a deep learning method that combines Bi-directional Long Short-Term Memory (Bi-LSTM) with a Convolutional Neural Network (CNN) and embeds emoticons into Self-Attention. The method can effectively identify different emotional polarities without external knowledge, but Self-Attention suffers from an over-focusing problem. This paper therefore proposes a further improved method, Bi-LSTM Multi-Head Attention of Emoticon-Based Convolutional Neural Network (LMAEB-CNN), which replaces Self-Attention with multi-head attention; most importantly, the method lets each vector perform multi-layer operations. The data was collected from Plurk, a micro-blogging service, with deep learning conducted in Keras. Chinese micro-blogs were classified for sentiment polarity, and the study achieved an accuracy of about 98.9%, significantly higher than other methods.
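A minimal, hypothetical sketch (not the thesis code) of the multi-head idea this abstract contrasts with single self-attention: each vector is split into head-sized slices, each head runs its own scaled dot-product self-attention with its own softmax, and the per-head outputs are concatenated, which relieves the over-focusing of a single attention distribution.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def self_attention(seq):
    """Scaled dot-product self-attention with Q = K = V = seq."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in seq]
        alphas = softmax(scores)
        out.append([sum(a * v[i] for a, v in zip(alphas, seq))
                    for i in range(d)])
    return out

def multi_head(seq, n_heads):
    """Split each vector into n_heads slices, attend within each head,
    then concatenate the per-head outputs per timestep."""
    d = len(seq[0])
    assert d % n_heads == 0
    hd = d // n_heads
    heads = []
    for h in range(n_heads):
        sub = [v[h * hd:(h + 1) * hd] for v in seq]
        heads.append(self_attention(sub))
    return [sum((heads[h][t] for h in range(n_heads)), [])
            for t in range(len(seq))]

# Toy 3-step sequence of dim-4 vectors, e.g. Bi-LSTM outputs, with 2 heads.
seq = [[0.2, 0.7, 1.0, 0.1], [0.9, 0.1, 0.3, 0.8], [0.4, 0.4, 0.6, 0.5]]
out = multi_head(seq, n_heads=2)
```

In a production model the heads would also apply learned query/key/value projections; they are omitted here to keep the head-splitting mechanics visible.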
Huang, Yen-Cheng, and 黃彥誠. "Deep Neural Network with Attention Mechanism and LSTM for Temporal Information Exploration in Classification of Motor-Imagery EEG." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/94vys5.
National Chiao Tung University
Institute of Computer Science and Engineering
107 (ROC year, 2018)
The EEG signal is a medium for realizing a brain-computer interface (BCI) system, which helps motor-disabled patients communicate with the outside world through external devices. The problems associated with this task include recordings with a poor signal-to-noise ratio and contamination from external body movements, such as muscle activity, blinking, and head movement. Considerable variability between subjects and recording sessions compounds the difficulty, particularly when seeking to train a model on trials obtained from all subjects. Recent work has demonstrated positive outcomes using CNNs for motor-imagery classification. This paper outlines two novel neural network architectures for the classification of motor-imagery EEG recordings using deep learning techniques. One of the proposed models comprises an attention mechanism; the other is a CNN equipped with an LSTM. The attention mechanism in the former model calculates the importance of each electrode, while the LSTM in the latter model captures the temporal information within the features. Compared with the results obtained using a variety of state-of-the-art deep learning techniques, the proposed scheme represents a considerable advance in classification accuracy on the BCI Competition IV dataset IIa, reaching 85.2%. Moreover, when the proposed models were applied to the motor-imagery EEG data collected in this work, they outperformed a pure CNN model by 9.2%. Aside from comparing the accuracy of the proposed models, we also determine that the attention mechanism performs the same process as CSP and common temporal pattern (CTP), wherein inputs from all classes are projected onto a similar coordinate system considered the optimal space for classification.
Moreover, through power-feature correlation maps, visualization of the LSTM, and representation erasure determined by RL, we rationalize the semantic meaning behind the operations of the CNN and LSTM and, eventually, identify two decisive factors of temporal features affecting the capability of LSTM in sequence modeling: (i) the critical time range for classification and (ii) the correct frequency range for the event-related potential (ERP) that induces the activation of the features. These two factors can guide the design of models consisting of CNN and RNN components for processing other types of bio-signals that are closely related to ERPs.
Soudamalla, Sharath Kumar. "Implications of Conversational AI on Humanoid Robots." 2019. https://monarch.qucosa.de/id/qucosa%3A72426.
Hsu, Yi-Kuan, and 許以觀. "A Factory-aware Attentional LSTM Model for PM2.5 Prediction." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/fcx28w.
National Chiao Tung University
Institute of Information Management
107 (ROC year, 2018)
With air quality becoming a global concern, many countries are facing serious air pollution problems. While monitoring stations have been established to collect air quality information and scientists have been committed to the study of air quality prediction, few studies have taken different monitoring areas and industrial features into account. In this paper, we propose a deep neural network for PM2.5 prediction, named FAA-LSTM, which combines air quality data from three types of monitors with factory data that is highly related to air quality. A spatial transformation component is designed to obtain local factors by segmenting the monitoring areas into grids, and we account for the influence of neighboring factories on local PM2.5 grids by adopting an attention mechanism to determine their importance. Next, the factor of the global air quality station is considered. We combine these heterogeneous data and feed them into a long short-term memory network to extract hidden features and forecast PM2.5 concentrations. We evaluate FAA-LSTM with data from the EPA and Academia Sinica in Taichung, surpassing the results of multiple methods, including linear regression, support vector regression, multi-layer perceptron, and LSTM.
Wang, Yu-Jen, and 王育任. "Using Attentive to improve Recursive LSTM End-to-End Chinese Discourse Parsing." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/4zurd5.
National Central University
Department of Computer Science and Information Engineering
107 (ROC year, 2018)
A discourse parser can help us understand the relationships and connections between sentences from different angles, but the tree-structured data still relies on manual annotation, which prevents this technology from being used directly in practice. So far, there have been many studies on automatically constructing the complete tree structure. Since deep learning has progressed rapidly in recent years, the construction methods for discourse parsers have also shifted from traditional SVM and CRF approaches to recursive neural networks. In the Chinese Discourse Treebank (CDTB), the parsing problem can be divided into four main sub-tasks: elementary discourse unit (EDU) segmentation, tree structure construction, center labeling, and sense labeling. In this paper, we use several state-of-the-art deep learning techniques, such as attentive recursive neural networks, self-attention, and BERT, to improve performance. In the end, we increase the F1 score of each task by more than 10%, reaching the best performance we know of so far.
Hsu, Chih-Jung, and 徐志榮. "Predicting Transportation Demand based on AR-LSTMs Model with Multi-Head Attention." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/j7pg8k.
National Central University
Graduate Institute of Software Engineering
107 (ROC year, 2018)
Smart transportation is a crucial issue for a smart city, and forecasting taxi demand is one of its important topics. If we can effectively predict taxi demand in the near future, we may be able to reduce the taxi vacancy rate, reduce passenger waiting time, increase the number of trips per taxi, raise drivers' income, and diminish the power consumption and pollution caused by vehicle dispatches. This paper proposes an efficient taxi demand prediction model based on a state-of-the-art deep learning architecture. Specifically, we use the LSTM model as the foundation, because LSTM is effective at predicting time-series data. We enhance the LSTM model with a multi-head attention mechanism so that traffic during both peak hours and off-peak periods can be better predicted, and we leverage a multi-layer architecture to increase predictive accuracy. Additionally, we design a loss function that incorporates both the absolute mean-square-error (which tends to underestimate low taxi demand areas) and the relative mean-square-error (which tends to misestimate high taxi demand areas). To validate our model, we conduct experiments on two real datasets: the NYC taxi demand dataset and the Taiwan Taxi demand dataset for Taipei City. We compare the proposed model with non-machine-learning models, traditional machine learning models, and deep learning models. Experimental results show that the proposed model outperforms the baseline models.
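A hypothetical sketch of the combined loss idea this abstract describes: an absolute mean-square-error term plus a relative (percentage) mean-square-error term, so that neither low- nor high-demand areas dominate training. The weighting factor `lam` and the smoothing constant `eps` are assumptions for illustration, not values from the thesis.

```python
def combined_mse(y_true, y_pred, lam=0.5, eps=1.0):
    """Weighted sum of absolute MSE and relative MSE.

    `lam` balances the two terms; `eps` keeps the relative term finite
    when the true demand is zero. Both are illustrative assumptions.
    """
    n = len(y_true)
    abs_mse = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n
    rel_mse = sum(((p - t) / (t + eps)) ** 2 for t, p in zip(y_true, y_pred)) / n
    return lam * abs_mse + (1 - lam) * rel_mse

# Toy example: one low-demand area (10 taxis) and one high-demand area (200).
loss = combined_mse([10.0, 200.0], [12.0, 180.0])
```

The absolute term is driven by the large error in the high-demand area, while the relative term weights the 20% miss on the low-demand area more heavily; the sum penalizes both.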