To see the other types of publications on this topic, follow the link: Deep learning neural network.

Dissertations / Theses on the topic 'Deep learning neural network'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Deep learning neural network.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Sarpangala, Kishan. "Semantic Segmentation Using Deep Learning Neural Architectures." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin157106185092304.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Redkar, Shrutika. "Deep Learning Binary Neural Network on an FPGA." Digital WPI, 2017. https://digitalcommons.wpi.edu/etd-theses/407.

Full text
Abstract:
In recent years, deep neural networks have attracted a great deal of attention in the fields of computer vision and artificial intelligence. A convolutional neural network exploits spatial correlations in an input image by performing convolution operations in local receptive fields. Compared with fully connected neural networks, convolutional neural networks have fewer weights and are faster to train. Much research has been conducted to further reduce the computational complexity and memory requirements of convolutional neural networks, to make them applicable to low-power embedded applications. This thesis focuses on a special class of convolutional neural network with only binary weights and activations, referred to as binary neural networks. Weights and activations for convolutional and fully connected layers are binarized to take only two values, +1 and -1. The computation and memory requirements are therefore reduced significantly. The proposed binary neural network architecture has been implemented on an FPGA as a real-time, high-speed, low-power computer vision platform. Only on-chip memories are utilized in the FPGA design. The FPGA implementation is evaluated on the CIFAR-10 benchmark and achieves a processing speed of 332,164 images per second with a classification accuracy of about 86.06%.
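As an illustration of the binarization this abstract describes (not the thesis's FPGA implementation), here is a minimal NumPy sketch: weights and activations are mapped to {+1, -1}, after which a dot product reduces to an XNOR-and-popcount operation. All sizes are arbitrary.

```python
import numpy as np

def binarize(x):
    """Binarize to {+1, -1} via the sign function (0 maps to +1)."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

# Hypothetical full-precision weights and activations
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))
a = rng.normal(size=8)

wb, ab = binarize(w), binarize(a)

def xnor_dot(xb, yb):
    """Binary dot product: encode +1 -> 1, -1 -> 0, then
    dot = 2 * popcount(XNOR(x, y)) - n."""
    agree = np.logical_not(np.logical_xor(xb > 0, yb > 0))
    return 2 * int(agree.sum()) - len(xb)

# The XNOR form matches the ordinary integer dot product of the binary vectors
assert xnor_dot(wb[0], ab) == int(wb[0] @ ab)
```

This equivalence is what lets binary networks replace multiply-accumulate hardware with bitwise logic, which is the source of the speed and memory savings the abstract reports.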
3

Abrishami, Hedayat. "Deep Learning Based Electrocardiogram Delineation." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1563525992210273.

Full text
4

Wang, Xutao. "Chinese Text Classification Based On Deep Learning." Thesis, Mittuniversitetet, Avdelningen för informationssystem och -teknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-35322.

Full text
Abstract:
Text classification has long been a concern in natural language processing, especially now that data volumes are growing massively with the development of the internet. The recurrent neural network (RNN) is one of the most popular methods for natural language processing, because its recurrent architecture gives it the ability to process serialized information. Meanwhile, the convolutional neural network (CNN) has shown its ability to extract features from visual imagery. This paper combines the advantages of the RNN and the CNN and proposes a model called BLSTM-C for Chinese text classification. BLSTM-C begins with a bidirectional long short-term memory (BLSTM) layer, a special kind of RNN, to obtain a sequence output based on both past and future context. It then feeds this sequence to a CNN layer, which is utilized to extract features from it. We evaluate the BLSTM-C model on several tasks, such as sentiment classification and category classification, and the results show our model's remarkable performance on these text tasks.
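A shape-level sketch of the BLSTM-C idea (a bidirectional recurrent layer feeding a convolutional layer with max-over-time pooling) might look as follows. A plain tanh RNN stands in for the LSTM cell, and all sizes are illustrative assumptions, not the thesis's configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
T, d_in, d_h, n_filters, k, n_classes = 12, 16, 8, 4, 3, 5

x = rng.normal(size=(T, d_in))          # one embedded sentence

def simple_rnn(seq, W, U):
    """A plain tanh RNN stands in for the LSTM here."""
    h = np.zeros(d_h)
    out = []
    for t in range(len(seq)):
        h = np.tanh(seq[t] @ W + h @ U)
        out.append(h)
    return np.stack(out)

W_f, U_f = rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h))
W_b, U_b = rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h))

# Bidirectional layer: run forward and backward passes, concatenate per step
h_fwd = simple_rnn(x, W_f, U_f)
h_bwd = simple_rnn(x[::-1], W_b, U_b)[::-1]
h = np.concatenate([h_fwd, h_bwd], axis=1)       # (T, 2*d_h)

# 1-D convolution over the BLSTM output, then max-over-time pooling
filt = rng.normal(size=(n_filters, k, 2 * d_h))
conv = np.array([[np.sum(h[t:t + k] * filt[f]) for t in range(T - k + 1)]
                 for f in range(n_filters)])     # (n_filters, T-k+1)
pooled = conv.max(axis=1)                        # (n_filters,)

# Softmax classifier over the pooled features
W_out = rng.normal(size=(n_filters, n_classes))
logits = pooled @ W_out
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

The pooling step is what makes the classifier independent of sentence length, which is why CNN layers over recurrent outputs are a common pattern for text.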
5

Kabore, Raogo. "Hybrid deep neural network anomaly detection system for SCADA networks." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2020. http://www.theses.fr/2020IMTA0190.

Full text
Abstract:
SCADA systems are increasingly targeted by cyber-attacks because of the many vulnerabilities in their hardware, software, protocols and communication stacks. These systems nowadays use standard hardware, software, operating systems and protocols. Furthermore, SCADA systems, which used to be air-gapped, are now interconnected with corporate networks and the Internet, widening the attack surface. In this thesis, we use a deep learning approach to propose an efficient hybrid deep neural network for anomaly detection in SCADA systems. The salient features of SCADA data are learnt automatically and in an unsupervised manner, and then fed to a supervised classifier in order to determine whether those data are normal or abnormal, i.e. whether there is a cyber-attack or not. Afterwards, in response to the challenge posed by the high training time of deep learning models, we propose a distributed version of our anomaly detection system in order to lessen the training time of our model.
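The two-stage pipeline described above (unsupervised feature learning followed by a supervised classifier) can be sketched as follows. PCA stands in for the deep unsupervised feature learner, logistic regression for the supervised stage, and the data are synthetic; this is an illustration of the pipeline shape, not the thesis's model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in for SCADA traffic records: feature 0 carries most of
# the variance and also the "attack" signal (an assumption for illustration).
X = rng.normal(size=(200, 10))
X[:, 0] *= 3.0
y = (X[:, 0] > 0).astype(float)

# Stage 1: unsupervised feature learning (PCA here; the thesis uses a
# deep unsupervised feature learner).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:4].T                      # four unsupervised features

# Stage 2: supervised classifier (logistic regression by gradient descent)
w, b = np.zeros(4), 0.0
for _ in range(500):
    s = np.clip(Z @ w + b, -30, 30)            # clip to avoid overflow
    p = 1.0 / (1.0 + np.exp(-s))               # sigmoid
    g = p - y                                  # cross-entropy gradient
    w -= 0.1 * (Z.T @ g) / len(y)
    b -= 0.1 * g.mean()

accuracy = ((p > 0.5) == y).mean()
```

Because the first stage never sees the labels, it can be trained on the abundant unlabelled traffic, while only the second stage needs labelled normal/attack examples.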
6

Lopes, André Teixeira. "Facial expression recognition using deep learning - convolutional neural network." Universidade Federal do Espírito Santo, 2016. http://repositorio.ufes.br/handle/10/4301.

Full text
Abstract:
CAPES
Facial expression recognition has been an active research area in the past ten years, with growing application areas such as avatar animation, neuromarketing and sociable robots. The recognition of facial expressions is not an easy problem for machine learning methods, since people can vary significantly in the way that they show their expressions. Even images of the same person in one expression can vary in brightness, background and position. Hence, facial expression recognition is still a challenging problem in computer vision. To address these problems, in this work we propose a facial expression recognition system that uses convolutional neural networks. Data augmentation and different preprocessing steps were studied together with various convolutional neural network architectures. The data augmentation and preprocessing steps were used to help the network with feature selection. Experiments were carried out with three widely used databases (Cohn-Kanade, JAFFE, and BU3DFE), and cross-database validations (i.e. training on one database and testing on another) were also performed. The proposed approach has shown to be very effective, improving on the state-of-the-art results in the literature and allowing real-time facial expression recognition with standard PCs.
7

Suárez-Varela, Macià José Rafael. "Enabling knowledge-defined networks : deep reinforcement learning, graph neural networks and network analytics." Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/669212.

Full text
Abstract:
Significant breakthroughs in the last decade in the Machine Learning (ML) field have ushered in a new era of Artificial Intelligence (AI). In particular, recent advances in Deep Learning (DL) have made it possible to develop a new breed of modeling and optimization tools with a plethora of applications in fields such as natural language processing or computer vision. In this context, the Knowledge-Defined Networking (KDN) paradigm highlights the lack of adoption of AI techniques in computer networks and, as a result, proposes a novel architecture that relies on Software-Defined Networking (SDN) and modern network analytics techniques to facilitate the deployment of ML-based solutions for efficient network operation. This dissertation aims to be a step forward in the realization of Knowledge-Defined Networks. In particular, we focus on the application of AI techniques to control and optimize networks more efficiently and automatically. To this end, we identify two components within the KDN context whose development may be crucial to achieving self-operating networks in the future: (i) the automatic control module, and (ii) the network analytics platform. The first part of this thesis is devoted to the construction of efficient automatic control modules. First, we explore the application of Deep Reinforcement Learning (DRL) algorithms to optimize the routing configuration in networks. DRL has recently demonstrated an outstanding capability to solve decision-making problems efficiently in other fields. However, the first DRL-based attempts to optimize routing in networks failed to achieve good results, often under-performing traditional heuristics. In contrast to previous DRL-based solutions, we propose a more elaborate network representation that helps DRL agents learn efficient routing strategies.
Our evaluation results show that DRL agents using the proposed representation achieve better performance and learn faster how to route traffic in an Optical Transport Network (OTN) use case. Second, we lay the foundations for the use of Graph Neural Networks (GNN) to build ML-based network optimization tools. GNNs are a newly proposed family of DL models specifically tailored to operate on, and generalize over, graphs of variable size and structure. In this thesis, we posit that GNNs are well suited to model the relationships between different network elements inherently represented as graphs (e.g., topology, routing). In particular, we use a custom GNN architecture to build a routing optimization solution that, unlike previous ML-based proposals, is able to generalize well to topologies, routing configurations, and traffic never seen during the training phase. The second part of this thesis investigates the design of practical and efficient network analytics solutions in the KDN context. Network analytics tools are crucial to provide the control plane with a rich and timely view of the network state. However, this is not a trivial task, considering that all this information typically turns into big data in real-world networks. In this context, we analyze the main aspects that should be considered when measuring and classifying traffic in SDN (e.g., scalability, accuracy, cost). As a result, we propose a practical solution that produces flow-level measurement reports similar to those of NetFlow/IPFIX in traditional networks. The proposed system relies only on native features of OpenFlow, currently among the most established standards in SDN, and incorporates mechanisms to efficiently maintain flow-level statistics in commodity switches and report them asynchronously to the control plane. Additionally, a system that combines ML and Deep Packet Inspection (DPI) identifies the applications that generate each traffic flow.
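The message-passing idea behind the GNNs mentioned above can be sketched minimally as follows; the topology, dimensions and update rule are illustrative assumptions, not the custom architecture of the thesis. Each node repeatedly aggregates its neighbours' messages and updates its own hidden state, so the same weights apply to any graph size or shape.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy topology: adjacency matrix of a 4-node network (assumed for illustration)
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)

d = 6
h = rng.normal(size=(4, d))          # per-node hidden states
W_msg = rng.normal(size=(d, d))      # message transform
W_upd = rng.normal(size=(2 * d, d))  # state-update transform

# Three message-passing iterations: aggregate neighbour messages, then update
for _ in range(3):
    msgs = adj @ (h @ W_msg)                           # sum over neighbours
    h = np.tanh(np.concatenate([h, msgs], axis=1) @ W_upd)

# A readout over the node states yields a graph-level estimate
# (e.g. a routing-quality score in the thesis's setting)
readout = h.mean(axis=0)
```

Because `W_msg` and `W_upd` are shared across nodes and iterations, the same trained model can be applied to a topology it never saw, which is the generalization property the abstract emphasizes.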
8

Squadrani, Lorenzo. "Deep neural networks and thermodynamics." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Find full text
Abstract:
Deep learning is the most effective and widely used approach to artificial intelligence, and yet it is far from being properly understood. Understanding it is the way to further improve its effectiveness and, in the best case, to gain some understanding of "natural" intelligence. We attempt a step in this direction with the aid of physics. We describe a convolutional neural network for image classification (trained on CIFAR-10) within the descriptive framework of thermodynamics. In particular, we define and study the temperature of each component of the network. Our results provide a new point of view on deep learning models, which may be a starting point towards a better understanding of artificial intelligence.
9

Chen, Tairui. "Going Deeper with Convolutional Neural Network for Intelligent Transportation." Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-theses/144.

Full text
Abstract:
Over the last several decades, computer vision researchers have been devoted to finding good features to solve different tasks: object recognition, object detection, object segmentation, activity recognition and so forth. Ideal features transform raw pixel intensity values into a representation in which these computer vision problems are easier to solve. Recently, deep features from convolutional neural networks (CNNs) have attracted many researchers seeking to solve problems in computer vision. In the supervised setting, these hierarchies are trained to solve specific problems by minimizing an objective function for different tasks. More recently, features learned from large-scale image datasets have proved to be very effective and generic for many computer vision tasks; for example, features learned for a recognition task can be used for object detection. This work aims to uncover the principles that lead to these generic feature representations in transfer learning, which does not require training on the new dataset from scratch but transfers the rich features a CNN learned from the ImageNet dataset. We begin by summarizing related prior work, particularly papers on object recognition, object detection and segmentation. We then introduce deep features to computer vision tasks in intelligent transportation systems. First, we apply deep features to an object detection task, specifically vehicle detection. Second, to make full use of objectness proposals, we apply a proposal generator to road marking detection and recognition. Third, to fully understand the transportation situation, we introduce deep features into road scene understanding. We run experiments for each task on different public datasets and show that our framework is robust.
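The transfer-learning recipe this abstract describes (reuse rich pretrained features, train only a new task-specific head) can be sketched as follows. A fixed random ReLU projection stands in for a frozen ImageNet-trained CNN trunk, and the task and data are synthetic; only the pattern of "freeze the trunk, fit the head" is illustrated.

```python
import numpy as np

rng = np.random.default_rng(4)

# Frozen "pretrained" trunk: a fixed random ReLU projection stands in for
# a CNN trained on ImageNet (illustrative assumption).
W_frozen = rng.normal(size=(64, 32))

def features(x):
    """Extract features with the frozen trunk; its weights are never updated."""
    return np.maximum(x @ W_frozen, 0.0)

# A new synthetic task whose labels are realisable from the frozen features
X = rng.normal(size=(300, 64))
F = features(X)
w_true = rng.normal(size=32)
scores = F @ w_true
y = (scores > np.median(scores)).astype(float)

# Transfer learning: fit only a ridge-regression head (with bias column)
# on the frozen features, in closed form
F1 = np.hstack([F, np.ones((300, 1))])
lam = 1e-2
w_head = np.linalg.solve(F1.T @ F1 + lam * np.eye(33), F1.T @ (2 * y - 1))
accuracy = (((F1 @ w_head) > 0) == y).mean()
```

Fitting only the head is cheap and needs far less labelled data than retraining the trunk, which is why transfer from ImageNet features works well for tasks like vehicle detection.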
10

Parakkal, Sreenivasan Akshai. "Deep learning prediction of Quantmap clusters." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-445909.

Full text
Abstract:
The hypothesis that similar chemicals exert similar biological activities has been widely adopted in the field of drug discovery and development. Quantitative Structure-Activity Relationship (QSAR) models have been used ubiquitously in drug discovery to understand the function of chemicals in biological systems. A common QSAR modeling method calculates similarity scores between chemicals to assess their biological function. However, due to the fact that some chemicals can be similar and yet have different biological activities, or conversely can be structurally different yet have similar biological functions, various methods have instead been developed to quantify chemical similarity at the functional level. Quantmap is one such method, which utilizes biological databases to quantify the biological similarity between chemicals. Quantmap uses quantitative molecular network topology analysis to cluster chemical substances based on their bioactivities. This method by itself, unfortunately, cannot assign new chemicals (those which may not yet have biological data) to the derived clusters. Owing to the fact that there is a lack of biological data for many chemicals, deep learning models were explored in this project with respect to their ability to correctly assign unknown chemicals to Quantmap clusters. The deep learning methods explored included both convolutional and recurrent neural networks. Transfer learning/pretraining based approaches and data augmentation methods were also investigated. The best performing model, among those considered, was the Seq2seq model (a recurrent neural network containing two joint networks, a perceiver and an interpreter network) without pretraining, but including data augmentation.
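As a sketch of how chemicals might be fed to the convolutional and recurrent models mentioned above, SMILES strings can be one-hot encoded at the character level. The molecules and vocabulary below are illustrative; the thesis's actual encoding and augmentation scheme may differ.

```python
import numpy as np

# Hypothetical character-level encoding of SMILES strings for a neural model.
# The vocabulary is built from the data; real work would fix it in advance.
smiles = ["CCO", "c1ccccc1", "CC(=O)O"]       # ethanol, benzene, acetic acid
vocab = sorted(set("".join(smiles)))
idx = {ch: i for i, ch in enumerate(vocab)}
max_len = max(len(s) for s in smiles)

def one_hot(s):
    """Encode one SMILES string as a (max_len, |vocab|) matrix, zero-padded."""
    m = np.zeros((max_len, len(vocab)))
    for t, ch in enumerate(s):
        m[t, idx[ch]] = 1.0
    return m

batch = np.stack([one_hot(s) for s in smiles])   # (3, max_len, |vocab|)
```

Data augmentation in this setting often enumerates several equivalent SMILES strings per molecule, so that the model learns the chemistry rather than one canonical spelling.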
11

Wu, Chunyang. "Structured deep neural networks for speech recognition." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/276084.

Full text
Abstract:
Deep neural networks (DNNs) and deep learning approaches yield state-of-the-art performance in a range of machine learning tasks, including automatic speech recognition. The multi-layer transformations and activation functions in DNNs, or related network variations, allow complex and difficult data to be modelled well. However, the highly distributed representations associated with these models make it hard to interpret the parameters. The whole neural network is commonly treated as a 'black box'. The behaviour of activation functions and the meaning of network parameters are rarely controlled in standard DNN training. Though sensible performance can be achieved, the lack of interpretation of network structures and parameters makes better regularisation and adaptation of DNN models challenging. In regularisation, parameters have to be regularised universally and indiscriminately; for instance, the widely used L2 regularisation encourages all parameters to be zero. In adaptation, a large number of independent parameters must be re-estimated, so adaptation schemes in this framework cannot be performed effectively when adaptation data are limited. This thesis investigates structured deep neural networks. Special structures are explicitly designed and given desired interpretations to improve DNN regularisation and adaptation. For regularisation, parameters can be regularised separately based on their functions. For adaptation, parameters can be adapted in groups or partially adapted according to their roles in the network topology. Three forms of structured DNNs are proposed in this thesis, and their contributions are presented as follows. The first contribution of this thesis is the multi-basis adaptive neural network. This form of structured DNN introduces a set of parallel sub-networks with restricted connections. The design of restricted connectivity allows different aspects of the data to be explicitly learned.
Sub-network outputs are then combined, and this combination module is used as the speaker-dependent structure that can be robustly estimated for adaptation. The second contribution of this thesis is the stimulated deep neural network. This form of structured DNN relates and smooths activation functions in regions of the network. It aids the visualisation and interpretation of DNN models and also has the potential to reduce over-fitting. Novel adaptation schemes can be performed on it, taking advantage of the smoothness that the stimulated DNN offers. The third contribution of this thesis is the deep activation mixture model. This form of structured DNN likewise encourages the outputs of activation functions to form a smooth surface. The output of one hidden layer is explicitly modelled as the sum of a mixture model and a residual model. The mixture model forms an activation contour, and the residual model depicts fluctuations around this contour. The smoothness yielded by the mixture model helps to regularise the overall model and allows novel adaptation schemes.
12

Maestri, Rita. "Metodiche di deep learning e applicazioni all’imaging medico: la radiomica." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/15452/.

Full text
Abstract:
This thesis presents deep learning and one of its most successful applications in image analysis: the convolutional neural network. In particular, it discusses the advantages, the drawbacks and the results obtained in applying convolutional networks to radiomics, a new discipline that involves extracting a large number of features from medical images in order to build models that support diagnosis and prognosis. The first chapter introduces machine learning concepts useful for understanding the learning algorithms also used in deep learning. Neural networks, the structures on which deep learning algorithms are based, are then presented. Finally, the workings and uses of convolutional neural networks are explained. The second chapter covers the techniques and uses of radiomics and, finally, the advantages of using convolutional neural networks in this field, presenting some recent studies carried out on the subject.
13

Santos, Claudio Filipi Gonçalves dos. "Optical character recognition using deep learning." Universidade Estadual Paulista (UNESP), 2018. http://hdl.handle.net/11449/154100.

Full text
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Optical Character Recognition (OCR) is the name given to the technology that translates image data into text files. The goal of this project is to use deep learning, also known as hierarchical learning, to develop an application able to detect candidate areas, segment those regions of the image, and generate the text contained in the figure. Since 2006, Deep Learning has emerged as a new area of machine learning. In recent years, the techniques developed in Deep Learning research have influenced and expanded its scope, including key aspects of artificial intelligence and machine learning. A thorough study was conducted with the aim of developing an OCR system using only Deep Learning architectures. The evolution of these techniques, some past works, and how those works influenced the development of this framework are explained in this text. This thesis demonstrates, with results, how a character classifier was developed. It then explains how a neural network can be built to serve as an object detector and how that detector can be turned into a text detector. Next, it shows how two different Deep Learning techniques can be combined and used in the task of transforming image segments into a sequence of characters. Finally, it demonstrates how the text detector and the image-to-text system can be combined to build a complete OCR system that detects text regions in images and recognises what is written in them. This study shows that using only Deep Learning structures can outperform techniques based on other areas of computing, such as image processing.
For text detection, over 70% precision was achieved when a more complex architecture was used, around 69% of image-to-text translations were correct, and around 50% was reached on the end-to-end task of detecting text areas and translating them into character sequences.
Optical Character Recognition (OCR) is the name given to the technology used to translate image data into a text file. The objective of this project is to use Deep Learning techniques to develop software with the ability to segment images, detecting candidate characters and generating the text that is in the picture. Since 2006, Deep Learning, or hierarchical learning, has emerged as a new machine learning area. Over recent years, the techniques developed from deep learning research have influenced and expanded in scope, including key aspects of artificial intelligence and machine learning. A thorough study was carried out in order to develop an OCR system using only Deep Learning architectures. The evolution of these techniques is explained, along with some past works and how they influenced this framework's development. This thesis demonstrates with results how a single character classifier was developed. It then explains how a neural network can be developed to be an object detector and how to transform this object detector into a text detector. After that, it shows how a set of two Deep Learning techniques can be combined and used in the task of transforming a cropped region of an image into a string of characters. Finally, it demonstrates how the text detector and the image-to-text system were combined in order to develop a full end-to-end OCR system that detects the regions of a given image containing text and what is written in those regions. It shows that using only Deep Learning structures can outperform other techniques based on other areas such as image processing. In text detection, it reached over 70% precision when a more complex architecture was used, around 69% correct image-to-text translation of cropped areas, and around 50% on the end-to-end task of detecting text areas and translating them into text.
APA, Harvard, Vancouver, ISO, and other styles
14

Tran, Khanh-Hung. "Semi-supervised dictionary learning and Semi-supervised deep neural network." Thesis, université Paris-Saclay, 2021. http://www.theses.fr/2021UPASP014.

Full text
Abstract:
Since the 2010s, machine learning (ML) has been one of the topics attracting much attention from scientific researchers. Many ML models have demonstrated their ability to produce excellent results in various fields such as computer vision, natural language processing, robotics... However, most of these models use supervised learning, which requires massive annotation. The objective of this thesis is therefore to study and propose semi-supervised approaches, which have several advantages over supervised learning. Instead of directly applying a semi-supervised classifier to the original representation of the data, we use types of model that integrate a representation-learning phase before the classification phase, to better adapt to the non-linearity of the data. First, we revisit the tools that allow us to build our semi-supervised models. We present two types of model that include representation learning in their architecture: dictionary learning and the neural network, together with the optimisation methods for each type of model; in the case of neural networks, we also describe the problem of adversarial examples. We then present the techniques that often accompany semi-supervised learning, such as manifold learning and pseudo-labelling. Second, we work on dictionary learning. We summarise three general steps for building a semi-supervised model from a supervised one. We then propose our semi-supervised model for the classification problem, typically in the case of a small number of training samples (counting both labelled and unlabelled samples).
On the one hand, we apply the preservation of the data structure from the original space to the sparse-code space (manifold learning), which acts as a regulariser on the sparse codes. On the other hand, we integrate a semi-supervised classifier into the sparse-code space. In addition, we perform sparse coding for the test samples while also taking the preservation of the data structure into account. This method improves the accuracy rate compared with existing methods. Third, we work on neural networks. We propose an approach called "manifold attack" which reinforces manifold learning. The approach is inspired by adversarial learning: find virtual points that perturb the manifold-learning cost function (by maximising it) while the model parameters are fixed; then update the model parameters by minimising this cost function while the virtual points are fixed. We also provide criteria for limiting the space to which the virtual points belong, and a method for initialising them. This approach brings not only an improvement in accuracy but also strong robustness against adversarial examples. Finally, we analyse the similarities and differences, as well as the advantages and disadvantages, of dictionary learning and neural networks, and we propose some perspectives on both types of model. For semi-supervised dictionary learning, we propose techniques inspired by neural networks; for neural networks, we propose integrating the manifold attack into generative models.
Since the 2010s, machine learning (ML) has been one of the topics that attract a lot of attention from scientific researchers. Many ML models have demonstrated their ability to produce excellent results in various fields such as Computer Vision, Natural Language Processing, Robotics... However, most of these models use supervised learning, which requires massive annotation. Therefore, the objective of this thesis is to study and to propose semi-supervised learning approaches that have many advantages over supervised learning. Instead of directly applying a semi-supervised classifier on the original representation of data, we rather use models that integrate a representation learning stage before the classification stage, to better adapt to the non-linearity of the data. In the first step, we revisit tools that allow us to build our semi-supervised models. First, we present two types of model that possess representation learning in their architecture: dictionary learning and the neural network, as well as the optimization methods for each type of model; moreover, in the case of the neural network, we describe the problem of adversarial examples. Then, we present the techniques that often accompany semi-supervised learning, such as manifold learning and pseudo-labeling. In the second part, we work on dictionary learning. We synthesize three general steps to build a semi-supervised model from a supervised model. Then, we propose our semi-supervised model to deal with the classification problem, typically in the case of a low number of training samples (including both labelled and unlabelled samples). On the one hand, we apply the preservation of the data structure from the original space to the sparse code space (manifold learning), which is considered as a regularization for the sparse codes. On the other hand, we integrate a semi-supervised classifier in the sparse code space.
In addition, we perform sparse coding for test samples while also taking the preservation of the data structure into account. This method provides an improvement in the accuracy rate compared to other existing methods. In the third step, we work on neural network models. We propose an approach called "manifold attack" which reinforces manifold learning. This approach is inspired by adversarial learning: finding virtual points that disrupt the manifold-learning cost function (by maximizing it) while fixing the model parameters; then the model parameters are updated by minimizing this cost function while fixing these virtual points. We also provide criteria for limiting the space to which the virtual points belong and a method for initializing them. This approach provides not only an improvement in the accuracy rate but also significant robustness to adversarial examples. Finally, we analyze the similarities and differences, as well as the advantages and disadvantages, between dictionary learning and neural network models, and we propose some perspectives on both types of model. In the case of semi-supervised dictionary learning, we propose some techniques inspired by neural network models. As for the neural network, we propose to integrate the manifold attack into generative models.
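The max-min scheme behind the "manifold attack" can be sketched in a few lines of numpy. This is an illustrative toy, not the author's implementation: the embedding is a fixed linear map standing in for a real network, the manifold cost is the squared distance between the embeddings of a virtual point and its anchor, and `eps`, `lr`, and `steps` are assumed hyper-parameters.

```python
import numpy as np

def embed(theta, x):
    # Toy linear embedding standing in for the network's representation.
    return x @ theta

def manifold_cost(theta, z, x):
    # Smoothness term: a virtual point z should embed close to its anchor x.
    return float((embed(theta, z) - embed(theta, x)) ** 2)

def manifold_attack_step(theta, x, z, eps=0.5, lr=0.05, steps=20):
    # Inner loop: move z to MAXIMISE the manifold cost (parameters fixed),
    # projecting back into an eps-box around the anchor after each step.
    for _ in range(steps):
        grad_z = 2.0 * (embed(theta, z) - embed(theta, x)) * theta
        z = np.clip(z + lr * grad_z, x - eps, x + eps)
    return z

def model_update_step(theta, x, z, lr=0.05):
    # Outer loop: update the parameters to MINIMISE the cost (z fixed).
    grad_theta = 2.0 * (embed(theta, z) - embed(theta, x)) * (z - x)
    return theta - lr * grad_theta

theta = np.array([1.0, -2.0])
x = np.array([0.3, 0.7])          # a real (anchor) sample
z = x + np.array([0.01, -0.01])   # virtual point initialised near the anchor

z_adv = manifold_attack_step(theta, x, z)
theta_new = model_update_step(theta, x, z_adv)
```

The `np.clip` projection plays the role of the thesis's criterion limiting the space the virtual points may occupy.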
APA, Harvard, Vancouver, ISO, and other styles
15

Yin, Yonghua. "Random neural networks for deep learning." Thesis, Imperial College London, 2018. http://hdl.handle.net/10044/1/64917.

Full text
Abstract:
The random neural network (RNN) is a mathematical model for an 'integrate and fire' spiking network that closely resembles the stochastic behaviour of neurons in mammalian brains. Since its proposal in 1989, there have been numerous investigations into the RNN's applications and learning algorithms. Deep learning (DL) has achieved great success in machine learning, but there has been no research into the properties of the RNN for DL to combine their power. This thesis intends to bridge the gap between RNNs and DL, in order to provide powerful DL tools that are faster, and that can potentially be used with less energy expenditure than existing methods. Based on the RNN function approximator proposed by Gelenbe in 1999, the approximation capability of the RNN is investigated and an efficient classifier is developed. By combining the RNN, DL and non-negative matrix factorisation, new shallow and multi-layer non-negative autoencoders are developed. The autoencoders are tested on typical image datasets and real-world datasets from different domains, and the test results yield the desired high learning accuracy. The concept of dense nuclei/clusters is examined, using RNN theory as a basis. In dense nuclei, neurons may interconnect via soma-to-soma interactions and conventional synaptic connections. A mathematical model of the dense nuclei is proposed and the transfer function can be deduced. A multi-layer architecture of the dense nuclei is constructed for DL, whose value is demonstrated by experiments on multi-channel datasets and server-state classification in cloud servers. A theoretical study into the multi-layer architecture of the standard RNN (MLRNN) for DL is presented. Based on the layer-output analyses, the MLRNN is shown to be a universal function approximator. The effects of the layer number on the learning capability and high-level representation extraction are analysed. 
A hypothesis for transforming the DL problem into a moment-learning problem is also presented. The power of the standard RNN for DL is investigated. The ability of the RNN with only positive parameters to conduct image convolution operations is demonstrated. The MLRNN equipped with the developed training algorithm achieves comparable or better classification at a lower computation cost than conventional DL methods.
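For context, the 'integrate and fire' RNN above has a closed-form steady state: each neuron's excitation probability is its total excitatory arrival rate divided by its firing rate plus its inhibitory arrival rate. The sketch below is an assumed fixed-point solver for that equation on a two-neuron network, illustrative only and not code from the thesis.

```python
import numpy as np

def rnn_steady_state(ext_plus, ext_minus, w_plus, w_minus, r, iters=200):
    # Fixed-point iteration for the steady-state excitation probabilities
    # of a Gelenbe random neural network:
    #   q_i = lambda_i^+ / (r_i + lambda_i^-),
    # where the arrival rates sum external signals and other neurons' spikes.
    q = np.zeros_like(r)
    for _ in range(iters):
        lam_plus = ext_plus + q @ w_plus     # excitatory spike arrivals
        lam_minus = ext_minus + q @ w_minus  # inhibitory spike arrivals
        q = np.minimum(1.0, lam_plus / (r + lam_minus))
    return q

# Two neurons: neuron 0 driven externally, neuron 1 fed only by neuron 0.
ext_plus = np.array([0.5, 0.0])
ext_minus = np.array([0.0, 0.0])
w_plus = np.array([[0.0, 1.0],
                   [0.0, 0.0]])   # neuron 0 excites neuron 1
w_minus = np.zeros((2, 2))
r = np.array([1.0, 1.0])

q = rnn_steady_state(ext_plus, ext_minus, w_plus, w_minus, r)
```

Because the excitation probabilities stay in [0, 1], stacking such layers gives the bounded, probabilistic activations the thesis builds its multi-layer architectures on.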
APA, Harvard, Vancouver, ISO, and other styles
16

Landeen, Trevor J. "Association Learning Via Deep Neural Networks." DigitalCommons@USU, 2018. https://digitalcommons.usu.edu/etd/7028.

Full text
Abstract:
Deep learning has been making headlines in recent years and is often portrayed as an emerging technology on a meteoric rise towards fully sentient artificial intelligence. In reality, deep learning is the most recent renaissance of a 70-year-old technology and is far from possessing true intelligence. The renewed interest is motivated by recent successes on challenging problems, the accessibility made possible by hardware developments, and dataset availability. The predecessor to deep learning, commonly known as the artificial neural network, is a computational network set up to mimic the biological neural structure found in brains. However, unlike human brains, artificial neural networks in most cases cannot make inferences from one problem to another. As a result, developing an artificial neural network requires a large number of examples of desired behavior for a specific problem. Furthermore, developing an artificial neural network capable of solving the problem can take days, or even weeks, of computation. Two specific problems addressed in this dissertation are both input association problems. One problem challenges a neural network to identify overlapping regions in images and is used to evaluate the ability of a neural network to learn associations between inputs of similar types. The other problem asks a neural network to identify which observed wireless signals originated from observed potential sources and is used to assess the ability of a neural network to learn associations between inputs of different types. The neural network solutions to both problems introduced, discussed, and evaluated in this dissertation demonstrate deep learning's applicability to problems which have previously attracted little attention.
APA, Harvard, Vancouver, ISO, and other styles
17

Campbell, Tanner, and Tanner Campbell. "A Deep Learning Approach to Autonomous Relative Terrain Navigation." Thesis, The University of Arizona, 2017. http://hdl.handle.net/10150/626706.

Full text
Abstract:
Autonomous relative terrain navigation is a problem at the forefront of many space missions involving close proximity operations to any target body. With no definitive answer, there are many techniques to help cope with this issue using both passive and active sensors, but almost all require high fidelity models of the associated dynamics in the environment. Convolutional Neural Networks (CNNs) trained with images rendered from a digital terrain map (DTM) of the body’s surface can provide a way to side-step the issue of unknown or complex dynamics while still providing reliable autonomous navigation. This is achieved by directly mapping an image to a relative position to the target body. The portability of trained CNNs allows “offline” training that can yield a matured network capable of being loaded onto a spacecraft for real-time position acquisition. In this thesis the lunar surface is used as the proving ground for this optical navigation technique, but the methods used are not unique to the Moon, and are applicable in general.
APA, Harvard, Vancouver, ISO, and other styles
18

Aspandi, Latif Decky. "Deep spatio-temporal neural network for facial analysis." Doctoral thesis, Universitat Pompeu Fabra, 2021. http://hdl.handle.net/10803/671209.

Full text
Abstract:
Automatic facial analysis is one of the most important fields of computer vision due to its significant impact on the world we currently live in. Among the many applications of automatic facial analysis, facial alignment and facial-based emotion recognition are the two most prominent tasks, considering their roles in this field. The former serves as an intermediary step enabling many higher-level facial analysis tasks, while the latter provides direct, real-world, high-level facial-based analysis and applications to society. Together, they have significant impacts ranging from biometric recognition and facial recognition to health and many other areas. These facial analysis tasks are currently even more relevant given the emergence of big data, which enables the rapid development of machine learning based models advancing the current state-of-the-art accuracy. In this regard, the use of video-based data as part of the development of current datasets has become more frequent. Such sequence-based data have been explicitly exploited in other relevant machine learning fields through the use of their inherent temporal information, which, in contrast, has not been the case for either facial alignment or facial-based emotion recognition. Furthermore, the in-the-wild characteristics of the data in current datasets present an additional challenge for developing accurate systems for these tasks. In this context, the main purpose of this thesis is to evaluate the benefit of incorporating both temporal information and in-the-wild data characteristics, which are largely overlooked in both facial alignment and facial-based emotion recognition. We mainly focus on the use of deep learning based models given their capability and capacity to leverage the current sheer size of input data. We also investigate the introduction of internal noise modelling in order to assess its impact on the proposed work.
Specifically, this thesis analyses the benefit of sequence modelling through progressive learning applied to the facial tracking task, in a model that is also fully end-to-end trainable. This arrangement allows us to evaluate the optimum sequence length to increase the quality of our model's estimation. Subsequently, we expand our investigation to the introduction of internal noise modelling, to benefit from the characteristics of each image degradation, for single-image facial alignment alongside the facial tracking task. Following this approach, we can study and quantify its direct impacts. We then combine the sequence-based approach and internal noise modelling by proposing a unified system that can simultaneously perform both single-image facial alignment and facial tracking, with state-of-the-art accuracy. Motivated by our findings on the facial alignment task, we then expand these approaches to the facial-based emotion recognition problem. We first explore the use of adversarial learning to enhance our image degradation modelling, and simultaneously increase the efficiency of our approach through the formation of internal visual latent features. We then equip our base sequence modelling with soft attention modules to allow the proposed model to adjust its focus using an adaptive weighting scheme. Subsequently, we introduce a more effective fusion method for both the facial feature modality and the visual representation of audio, using a gating mechanism. At this stage, we also analyse the impact of our proposed gating mechanisms along with the attention-enhanced sequence modelling. Finally, we found that these approaches improve our model's estimation quality, leading to a high level of accuracy and outperforming the results of other alternatives.
Facial analysis is one of the important fields in computer vision because of its impact on the world we live in. Facial alignment and facial-based emotion recognition are two fundamental tasks in this field. While the first can be an intermediate step for later analysis tasks, the second provides direct, socially useful applications. Together they have an impact ranging from biometric recognition to capturing a person's emotional state. In the current era of Big Data, these facial analysis tasks are even more relevant, since continued progress of the state of the art is possible. The use of large video-based databases has enabled the use of temporal models in machine learning and computer vision. Despite this, the use of temporal models is still insufficient. Moreover, presenting the data in natural, unconstrained form adds new challenges to developing accurate systems. In this context, the main objective of this thesis is to evaluate the benefit of incorporating both temporal information and data with in-the-wild characteristics, since these are still rarely taken into account in facial alignment and facial emotion recognition. We focus mainly on deep learning based models, given their capacity to exploit large amounts of data, and we also model the noise in the data to evaluate its impact on the developed algorithms. Specifically, this thesis analyses the impact of modelling sequences through progressive learning applied to facial tracking, trainable end to end. In this way we can evaluate the optimal temporal length and avoid suboptimal accuracy. We then investigate incorporating internal noise models, to take advantage of the characteristics of each visual degradation and achieve facial alignment for each image.
In this way, we can study the impacts and quantify the direct effects. Next, combining both sequence-based modelling and internal noise modelling, we created a unified system that can accurately perform both image alignment and facial tracking. We extend this facial alignment and tracking model, robust to unforeseen conditions and degradations, to affective computing based on facial emotion recognition. We first explore the use of adversarial learning to improve both the image-degradation model and the latent-feature model, which improves the efficiency of the system. We then equip the model with attention modules so that it processes the sequence according to an adaptive weighting. Finally, we introduce a more effective fusion method for both the facial-feature modality and the visual representation of audio, using a gating mechanism. We also analyse the impacts of these gating mechanisms and the attention-enhanced sequence modelling. We found that these approaches improve the quality of our estimation, and we achieved current state-of-the-art accuracy.
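The two ingredients named in the abstract, soft attention over a frame sequence and a gated fusion of two modalities, can be sketched with plain numpy. This is a generic illustration of the mechanisms, not the author's architecture; the feature sizes and the scalar gate are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def soft_attention(frames, score_w):
    # Adaptive weighting: score each frame, normalise the scores with a
    # softmax, and return the weighted sum of frame features (the context).
    weights = softmax(frames @ score_w)
    return weights @ frames, weights

def gated_fusion(face_feat, audio_feat, gate_w, gate_b):
    # A sigmoid gate decides how much the fused feature relies on the
    # facial modality versus the (visual representation of the) audio.
    pre = gate_w @ np.concatenate([face_feat, audio_feat]) + gate_b
    g = 1.0 / (1.0 + np.exp(-pre))
    return g * face_feat + (1.0 - g) * audio_feat, g

rng = np.random.default_rng(0)
frames = rng.standard_normal((6, 4))       # 6 frames, 4-dim features each
context, weights = soft_attention(frames, rng.standard_normal(4))
fused, g = gated_fusion(context, rng.standard_normal(4),
                        rng.standard_normal(8), 0.0)
```

Because the gate output lies strictly between 0 and 1, the fused vector is always a convex combination of the two modalities, which is what makes the weighting adaptive rather than fixed.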
APA, Harvard, Vancouver, ISO, and other styles
19

Avramova, Vanya. "Curriculum Learning with Deep Convolutional Neural Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-178453.

Full text
Abstract:
Curriculum learning is a machine learning technique inspired by the way humans acquire knowledge and skills: by mastering simple concepts first, and progressing through information with increasing difficulty to grasp more complex topics. Curriculum learning, and its derivatives Self-Paced Learning (SPL) and Self-Paced Learning with Diversity (SPLD), have previously been applied within various machine learning contexts: Support Vector Machines (SVMs), perceptrons, and multi-layer neural networks, where they have been shown to improve both training speed and model accuracy. This project ventured to apply the techniques within the previously unexplored context of deep learning, by investigating how they affect the performance of a deep convolutional neural network (ConvNet) trained on a large labeled image dataset. The curriculum was formed by presenting the training samples to the network in order of increasing difficulty, measured by each sample's loss value based on the network's objective function. The project evaluated SPL and SPLD, and proposed two new curriculum learning sub-variants, p-SPL and p-SPLD, which allow for a smooth progression of sample inclusion during training. The project also explored the "inversed" versions of the SPL, SPLD, p-SPL and p-SPLD techniques, where the samples were selected for the curriculum in order of decreasing difficulty. The experiments demonstrated that all learning variants perform fairly similarly, within a ≈1% average test accuracy margin, based on five trained models per variant. Surprisingly, models trained with the inversed versions of the algorithms performed slightly better than the standard curriculum training variants. The SPLD-Inversed, SPL-Inversed and SPLD networks also registered marginally higher accuracy results than the network trained with the usual random sample presentation.
The results suggest that while sample ordering does affect the training process, the optimal order in which samples are presented may vary based on the data set and algorithm used. The project also investigated whether some samples were more beneficial for the training process than others. Based on sample difficulty, subsets of samples were removed from the training data set. The models trained on the remaining samples were compared to a default model trained on all samples. On the data set used, removing the “easiest” 10% of samples had no effect on the achieved test accuracy compared to the default model, and removing the “easiest” 40% of samples reduced model accuracy by only ≈1% (compared to ≈6% loss when 40% of the "most difficult" samples were removed, and ≈3% loss when 40% of samples were randomly removed). Taking away the "easiest" samples first (up to a certain percentage of the data set) affected the learning process less negatively than removing random samples, while removing the "most difficult" samples first had the most detrimental effect. The results suggest that the networks derived most learning value from the "difficult" samples, and that a large subset of the "easiest" samples can be excluded from training with minimal impact on the attained model accuracy. Moreover, it is possible to identify these samples early during training, which can greatly reduce the training time for these models.
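The sample-selection rules compared in the thesis can be stated in a few lines. Below is a minimal numpy sketch of the core of standard SPL and of loss-ordered curriculum construction; the threshold `lam` and the function names are illustrative assumptions, and the p-SPL/p-SPLD smoothing and diversity terms are not reproduced.

```python
import numpy as np

def spl_weights(losses, lam):
    # Self-paced learning (hard weighting): include a sample (v_i = 1)
    # only while its current loss is below the age parameter lam;
    # growing lam over epochs admits progressively harder samples.
    return (losses < lam).astype(float)

def curriculum_order(losses, inversed=False):
    # Present samples easiest-first (lowest loss first), or hardest-first
    # for the "inversed" variants evaluated in the thesis.
    order = np.argsort(losses)
    return order[::-1] if inversed else order

losses = np.array([0.10, 0.90, 0.45])  # per-sample losses from the network
```

Dropping the "easiest" samples, as in the removal experiments above, amounts to discarding the lowest entries of `curriculum_order(losses)` before training.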
APA, Harvard, Vancouver, ISO, and other styles
20

Tavanaei, Amirhossein. "Spiking Neural Networks and Sparse Deep Learning." Thesis, University of Louisiana at Lafayette, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=10807940.

Full text
Abstract:

This document proposes new methods for training multi-layer and deep spiking neural networks (SNNs), specifically, spiking convolutional neural networks (CNNs). Training a multi-layer spiking network poses difficulties because the output spikes do not have derivatives and the commonly used backpropagation method for non-spiking networks is not easily applied. Our methods use novel versions of the brain-like, local learning rule named spike-timing-dependent plasticity (STDP) that incorporates supervised and unsupervised components. Our method starts with conventional learning methods and converts them to spatio-temporally local rules suited for SNNs.

The training uses two components for unsupervised feature extraction and supervised classification. The first component refers to new STDP rules for spike-based representation learning that trains convolutional filters and initial representations. The second introduces new STDP-based supervised learning rules for spike pattern classification via an approximation to gradient descent by combining the STDP and anti-STDP rules. Specifically, the STDP-based supervised learning model approximates gradient descent by using temporally local STDP rules. Stacking these components implements a novel sparse, spiking deep learning model. Our spiking deep learning model is categorized as a variation of spiking CNNs of integrate-and-fire (IF) neurons with performance comparable with the state-of-the-art deep SNNs. The experimental results show the success of the proposed model for image classification. Our network architecture is the only spiking CNN which provides bio-inspired STDP rules in a hierarchy of feature extraction and classification in an entirely spike-based framework.
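As background, the unsupervised component builds on the classical pair-based STDP rule. The sketch below is the textbook form of that rule, not the thesis's supervised STDP/anti-STDP variant; the amplitudes and time constant are assumed values.

```python
import numpy as np

def stdp_update(w, dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    # Pair-based STDP: if the presynaptic spike precedes the postsynaptic
    # spike (dt = t_post - t_pre > 0) the synapse is potentiated, otherwise
    # it is depressed; the magnitude decays exponentially with |dt| (ms).
    if dt > 0:
        dw = a_plus * np.exp(-dt / tau)
    else:
        dw = -a_minus * np.exp(dt / tau)
    return float(np.clip(w + dw, 0.0, 1.0))  # keep the weight in [0, 1]
```

The rule is spatio-temporally local: the update for a synapse depends only on that synapse's weight and the timing of its own pre- and postsynaptic spikes, which is what makes it usable where backpropagation is not.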

APA, Harvard, Vancouver, ISO, and other styles
21

Kabir, Md Faisal. "Application of Deep Learning in Deep Space Wireless Signal Identification for Intelligent Channel Sensing." University of Toledo / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1588886429314726.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Alpire, Adam. "Predicting Solar Radiation using a Deep Neural Network." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-215715.

Full text
Abstract:
Simulating the global climate at fine granularity is essential in climate science research. Current algorithms for computing climate models are based on mathematical models that are computationally expensive. Climate simulation runs can take days or months to execute on High Performance Computing (HPC) platforms. As such, the amount of computational resources determines the level of resolution for the simulations. If simulation time could be reduced without compromising model fidelity, higher resolution simulations would be possible, leading to potentially new insights in climate science research. In this project, broadband radiative transfer modeling is examined, as it is an important part of climate simulators that takes around 30% to 50% of the time of a typical general circulation model. This thesis project presents a convolutional neural network (CNN) to model this most time-consuming component. As a result, swift radiation prediction through the trained deep neural network achieves a 7x speedup compared to the calculation time of the original function. The average prediction error (MSE) is around 0.004, with 98.71% accuracy.
High-resolution global climate simulations are indispensable to climate research. The algorithms used today to compute climate models are based on mathematical models that are computationally heavy. Climate simulations can take days or months to run on a supercomputer (HPC). The level of detail is thus limited by the computing resources available. If simulation time could be reduced without compromising model fidelity, the level of detail could be increased and new insights made possible. This project examines broadband radiative transfer modelling, since it is a significant part of today's climate simulations and takes up 30-50% of the computation time of a typical general circulation model (GCM). This thesis presents a convolutional neural network that replaces this computationally intensive component. The result is a seven-fold speedup compared with the original method. The average estimation error is 0.004, with 98.71 percent accuracy.
APA, Harvard, Vancouver, ISO, and other styles
23

Mancevo, del Castillo Ayala Diego. "Compressing Deep Convolutional Neural Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217316.

Full text
Abstract:
Deep Convolutional Neural Networks, and "deep learning" in general, stand at the cutting edge of a range of applications, from image-based recognition and classification to natural language processing, speech and speaker recognition, and reinforcement learning. Very deep models, however, are often large, complex and computationally expensive to train and evaluate. Deep learning models are thus seldom deployed natively in environments where computational resources are scarce or expensive. To address this problem we turn our attention towards a range of techniques that we collectively refer to as "model compression", where a lighter student model is trained to approximate the output produced by the model we wish to compress. To this end, the output from the original model is used to craft the training labels of the smaller student model. This work contains some experiments on CIFAR-10 and demonstrates how to use the aforementioned techniques to compress a people-counting model whose precision, recall and F1-score are improved by as much as 14% against our baseline.
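Crafting the student's training labels from the teacher's outputs is commonly done by softening those outputs with a temperature, as in Hinton-style knowledge distillation. The sketch below shows that common recipe under an assumed temperature `T`; it is not necessarily the exact loss used in this thesis.

```python
import numpy as np

def softened(logits, T):
    # Softmax with temperature T: T > 1 flattens the distribution and
    # exposes the teacher's relative confidence in the wrong classes.
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # Cross-entropy between the softened teacher distribution (used as
    # the training label) and the softened student distribution.
    p = softened(teacher_logits, T)
    q = softened(student_logits, T)
    return float(-(p * np.log(q + 1e-12)).sum())

teacher = [2.0, 1.0, 0.0]  # a single teacher prediction, for illustration
```

In practice this soft-label term is usually mixed with the ordinary cross-entropy on the hard labels, so the student learns both the ground truth and the teacher's inter-class similarities.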
APA, Harvard, Vancouver, ISO, and other styles
24

Engström, Isak. "Automated Gait Analysis : Using Deep Metric Learning." Thesis, Linköpings universitet, Medie- och Informationsteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-178139.

Full text
Abstract:
Sectors of security, safety, and defence require methods for identifying people on the individual level. Automation of these tasks has the potential of outperforming manual labor, as well as relieving workloads. The ever-extending surveillance camera networks, advances in human pose estimation from monocular cameras, together with the progress of deep learning techniques, pave the way for automated walking gait analysis as an identification method. This thesis investigates the use of 2D kinematic pose sequences to represent gait, monocularly extracted from a limited dataset containing walking individuals captured from five camera views. The sequential information of the gait is captured using recurrent neural networks. Techniques in deep metric learning are applied to evaluate two network models, with contrasting output dimensionalities, against deep-metric-, and non-deep-metric-based embedding spaces. The results indicate that the gait representation, network designs, and network learning structure show promise when identifying individuals, scaling particularly well to unseen individuals. However, with the limited dataset, the network models performed best when the dataset included the labels from both the individuals and the camera views simultaneously, contrary to when the data only contained the labels from the individuals without the information of the camera views. For further investigations, an extension of the data would be required to evaluate the accuracy and effectiveness of these methods, for the re-identification task of each individual.

The thesis work was carried out at the Department of Science and Technology (ITN), Faculty of Engineering, Linköping University.
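Deep metric learning of the kind evaluated here is commonly trained with a triplet objective that pulls embeddings of the same individual together and pushes different individuals apart. A minimal sketch, assuming Euclidean gait embeddings and an illustrative margin:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # anchor and positive share an identity; negative is a different walker.
    d_pos = np.linalg.norm(anchor - positive)   # same-identity distance
    d_neg = np.linalg.norm(anchor - negative)   # cross-identity distance
    return max(0.0, d_pos - d_neg + margin)     # zero once the gap exceeds margin

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])        # same walker seen from another camera view
n_far = np.array([1.0, 0.0])    # well-separated negative: triplet satisfied
n_near = np.array([0.05, 0.0])  # violating negative: loss is positive
```

Because the loss vanishes for well-separated triplets, training naturally scales to unseen individuals, which matches the scaling behaviour reported in the abstract.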

APA, Harvard, Vancouver, ISO, and other styles
25

Liu, Qian. "Deep spiking neural networks." Thesis, University of Manchester, 2018. https://www.research.manchester.ac.uk/portal/en/theses/deep-spiking-neural-networks(336e6a37-2a0b-41ff-9ffb-cca897220d6c).html.

Full text
Abstract:
Neuromorphic Engineering (NE) has led to the development of biologically-inspired computer architectures whose long-term goal is to approach the performance of the human brain in terms of energy efficiency and cognitive capabilities. Although there are a number of neuromorphic platforms available for large-scale Spiking Neural Network (SNN) simulations, the problem of programming these brain-like machines to be competent in cognitive applications still remains unsolved. On the other hand, Deep Learning has emerged in Artificial Neural Network (ANN) research to dominate state-of-the-art solutions for cognitive tasks. Thus the main research problem emerges of understanding how to operate and train biologically-plausible SNNs to close the gap in cognitive capabilities between SNNs and ANNs. SNNs can be trained by first training an equivalent ANN and then transferring the tuned weights to the SNN. This method is called ‘off-line’ training, since it does not take place on an SNN directly, but rather on an ANN instead. However, previous work on such off-line training methods has struggled in terms of poor modelling accuracy of the spiking neurons and high computational complexity. In this thesis we propose a simple and novel activation function, Noisy Softplus (NSP), to closely model the response firing activity of biologically-plausible spiking neurons, and introduce a generalised off-line training method using the Parametric Activation Function (PAF) to map the abstract numerical values of the ANN to concrete physical units, such as current and firing rate in the SNN. Based on this generalised training method and its fine tuning, we achieve the state-of-the-art accuracy on the MNIST classification task using spiking neurons, 99.07%, on a deep spiking convolutional neural network (ConvNet). We then take a step forward to ‘on-line’ training methods, where Deep Learning modules are trained purely on SNNs in an event-driven manner. 
Existing work has failed to provide SNNs with recognition accuracy equivalent to ANNs due to the lack of mathematical analysis. Thus we propose a formalised Spike-based Rate Multiplication (SRM) method which transforms the product of firing rates to the number of coincident spikes of a pair of rate-coded spike trains. Moreover, these coincident spikes can be captured by the Spike-Time-Dependent Plasticity (STDP) rule to update the weights between the neurons in an on-line, event-based, and biologically-plausible manner. Furthermore, we put forward solutions to reduce correlations between spike trains; thereby addressing the result of performance drop in on-line SNN training. The promising results of spiking Autoencoders (AEs) and Restricted Boltzmann Machines (SRBMs) exhibit equivalent, sometimes even superior, classification and reconstruction capabilities compared to their non-spiking counterparts. To provide meaningful comparisons between these proposed SNN models and other existing methods within this rapidly advancing field of NE, we propose a large dataset of spike-based visual stimuli and a corresponding evaluation methodology to estimate the overall performance of SNN models and their hardware implementations.
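The STDP rule that the on-line training relies on has a standard exponential form. The constants below are conventional illustrative values, not those used in the thesis, and the thesis's SRM formulation is not reproduced here:

```python
import numpy as np

def stdp_update(delta_t, a_plus=0.01, a_minus=0.012, tau=20.0):
    # delta_t = t_post - t_pre in milliseconds. Pre-before-post
    # (delta_t > 0) potentiates the synapse; post-before-pre depresses
    # it. Both effects decay exponentially with time constant tau.
    if delta_t > 0:
        return a_plus * np.exp(-delta_t / tau)
    return -a_minus * np.exp(delta_t / tau)
```

Coincident spikes (small |delta_t|) thus produce the largest weight changes, which is exactly the property the SRM method exploits to capture products of firing rates.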
APA, Harvard, Vancouver, ISO, and other styles
26

Rohlén, Andreas. "UAV geolocalization in Swedish fields and forests using Deep Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300390.

Full text
Abstract:
The ability for unmanned autonomous aerial vehicles (UAV) to localize themselves in an environment is fundamental for them to be able to function, even if they do not have access to a global positioning system. Recently, with the success of deep learning in vision based tasks, there have been some proposed methods for absolute geolocalization using vision based deep learning with satellite and UAV images. Most of these are only tested in urban environments, which begs the question: How well do they work in non-urban areas like forests and fields? One drawback of deep learning is that models are often regarded as black boxes, as it is hard to know why the models make the predictions they do, i.e. what information is important and is used for the prediction. To solve this, several neural network interpretation methods have been developed. These methods provide explanations so that we may understand these models better. This thesis investigates the localization accuracy of one geolocalization method in both urban and non-urban environments as well as applies neural network interpretation in order to see if it can explain the potential difference in localization accuracy of the method in these different environments. The results show that the method performs best in urban environments, getting a mean absolute horizontal error of 38.30m and a mean absolute vertical error of 16.77m, while it performed significantly worse in non-urban environments, getting a mean absolute horizontal error of 68.11m and a mean absolute vertical error of 22.83m. Further, the results show that if the satellite images and images from the unmanned aerial vehicle are collected during different seasons of the year, the localization accuracy is even worse, resulting in a mean absolute horizontal error of 86.91m and a mean absolute vertical error of 23.05m. 
The neural network interpretation did not aid in providing an explanation for why the method performs worse in non-urban environments and is not suitable for this kind of problem.
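The horizontal and vertical error figures above can plausibly be computed as below, assuming predictions and ground truth in local east/north/up coordinates in metres; the function is an illustrative assumption, not the thesis's code:

```python
import numpy as np

def mean_absolute_errors(pred, truth):
    # pred, truth: (n, 3) arrays of local east, north, up positions in metres.
    horizontal = np.linalg.norm(pred[:, :2] - truth[:, :2], axis=1)
    vertical = np.abs(pred[:, 2] - truth[:, 2])
    return horizontal.mean(), vertical.mean()

pred = np.array([[3.0, 4.0, 1.0], [0.0, 0.0, 2.0]])
truth = np.zeros((2, 3))
h_err, v_err = mean_absolute_errors(pred, truth)
```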
APA, Harvard, Vancouver, ISO, and other styles
27

Vekhande, Swapnil Sudhir. "Deep Learning Neural Network-based Sinogram Interpolation for Sparse-View CT Reconstruction." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/90182.

Full text
Abstract:
Computed Tomography (CT) finds applications across domains like medical diagnosis, security screening, and scientific research. In medical imaging, CT allows physicians to diagnose injuries and disease more quickly and accurately than other imaging techniques. However, CT is one of the most significant contributors of radiation dose to the general population and the required radiation dose for scanning could lead to cancer. On the other hand, a shallow radiation dose could sacrifice image quality causing misdiagnosis. To reduce the radiation dose, sparse-view CT, which includes capturing a smaller number of projections, becomes a promising alternative. However, the image reconstructed from linearly interpolated views possesses severe artifacts. Recently, Deep Learning-based methods are increasingly being used to interpret the missing data by learning the nature of the image formation process. The current methods are promising but operate mostly in the image domain presumably due to lack of projection data. Another limitation is the use of simulated data with less sparsity (up to 75%). This research aims to interpolate the missing sparse-view CT in the sinogram domain using deep learning. To this end, a residual U-Net architecture has been trained with patch-wise projection data to minimize Euclidean distance between the ground truth and the interpolated sinogram. The model can generate highly sparse missing projection data. The results show improvement in SSIM and RMSE by 14% and 52% respectively with respect to the linear interpolation-based methods. Thus, experimental sparse-view CT data with 90% sparsity has been successfully interpolated while improving CT image quality.
Master of Science
Computed Tomography is a commonly used imaging technique due to the remarkable ability to visualize internal organs, bones, soft tissues, and blood vessels. It involves exposing the subject to X-ray radiation, which could lead to cancer. On the other hand, the radiation dose is critical for the image quality and subsequent diagnosis. Thus, image reconstruction using only a small number of projection data is an open research problem. Deep learning techniques have already revolutionized various Computer Vision applications. Here, we have used a method which fills missing highly sparse CT data. The results show that the deep learning-based method outperforms standard linear interpolation-based methods while improving the image quality.
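The linear-interpolation baseline that the residual U-Net is compared against can be sketched per detector column. The 4x angular down-sampling factor below is illustrative; the thesis works at 90% sparsity:

```python
import numpy as np

def interpolate_sinogram(sino, keep_every=4):
    # sino: (n_angles, n_detectors); only every keep_every-th angular row
    # was actually measured. Fill the missing rows by linear interpolation
    # along the angle axis, independently per detector column.
    n_angles, n_det = sino.shape
    measured = np.arange(0, n_angles, keep_every)
    full = np.empty_like(sino, dtype=float)
    for d in range(n_det):
        full[:, d] = np.interp(np.arange(n_angles), measured, sino[measured, d])
    return full

# A sinogram that varies linearly with angle is recovered exactly;
# real projection data is not, which is where the artifacts come from.
sino = np.linspace(0.0, 1.0, 9)[:, None] * np.ones((1, 3))
full = interpolate_sinogram(sino)
```

The deep model's job, per the abstract, is to replace this per-column linear fill with a learned mapping that respects the structure of the image formation process.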
APA, Harvard, Vancouver, ISO, and other styles
28

Mishra, Vishal Vijayshankar. "Sequence-to-Sequence Learning using Deep Learning for Optical Character Recognition (OCR)." University of Toledo / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1513273051760905.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Arnström, Daniel. "State Estimation for Truck and Trailer Systems using Deep Learning." Thesis, Linköpings universitet, Reglerteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-148630.

Full text
Abstract:
High precision control of a truck and trailer system requires accurate and robust state estimation of the system. This thesis work explores the possibility of estimating the states with high accuracy from sensors solely mounted on the truck. The sensors used are a LIDAR sensor, a rear-view camera and a RTK-GNSS receiver. Information about the angles between the truck and the trailer are extracted from LIDAR scans and camera images through deep learning and through model-based approaches. The estimates are fused together with a model of the dynamics of the system in an Extended Kalman Filter to obtain high precision state estimates. Training data for the deep learning approaches and data to evaluate and compare these methods with the model-based approaches are collected in a simulation environment established in Gazebo. The deep learning approaches are shown to give decent angle estimations but the model-based approaches are shown to result in more robust and accurate estimates. The flexibility of the deep learning approach to learn any model given sufficient training data has been highlighted and it is shown that a deep learning approach can be viable if the trailer has an irregular shape and a large amount of data is available. It is also shown that biases in measured lengths of the system can be remedied by estimating the biases online in the filter and this improves the state estimates.
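The fusion step described above, folding an angle measurement into the filtered state, reduces to the standard Kalman measurement update. The state layout and noise values below are illustrative assumptions, not the thesis's configuration:

```python
import numpy as np

def kalman_update(x, P, z, H, R):
    # Fuse measurement z (e.g. the truck-trailer angle extracted from a
    # LIDAR scan or camera image) into state x with covariance P.
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

x = np.zeros(2)                 # [trailer angle, angular rate] (illustrative)
P = np.eye(2)
H = np.array([[1.0, 0.0]])      # the sensor observes the angle only
R = np.array([[1.0]])
x_new, P_new = kalman_update(x, P, np.array([1.0]), H, R)
```

In the Extended Kalman Filter of the thesis, H and the prediction step come from linearizing the truck-trailer dynamics; the update itself has this same shape.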
APA, Harvard, Vancouver, ISO, and other styles
30

Hocquet, Guillaume. "Class Incremental Continual Learning in Deep Neural Networks." Thesis, université Paris-Saclay, 2021. http://www.theses.fr/2021UPAST070.

Full text
Abstract:
We are interested in the problem of continual learning of artificial neural networks in the case where the data are available for only one class at a time. To address the problem of catastrophic forgetting that restrains learning performance in these conditions, we propose an approach based on the representation of the data of a class by a normal distribution. The transformations associated with these representations are performed using invertible neural networks, which can be trained with the data of a single class. Each class is assigned a network that will model its features. In this setting, predicting the class of a sample corresponds to identifying the network that best fits the sample. The advantage of such an approach is that once a network is trained, it is no longer necessary to update it later, as each network is independent of the others. It is this particularly advantageous property that sets our method apart from previous work in this area. We support our demonstration with experiments performed on various datasets and show that our approach performs favorably compared to the state of the art. Subsequently, we propose to optimize our approach by reducing its impact on memory by factoring the network parameters. It is then possible to significantly reduce the storage cost of these networks with a limited performance loss. Finally, we also study strategies to produce efficient feature extractor models for continual learning and we show their relevance compared to the networks traditionally used for continual learning.
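The one-model-per-class idea can be illustrated with the simplest possible per-class density model, a diagonal Gaussian fitted from that class's data alone. Adding a new class never touches the models already fitted, which is the independence property the abstract highlights. The functions below are a toy stand-in for the invertible networks:

```python
import numpy as np

def fit_class_model(X_c):
    # Fit one class's model from that class's data only.
    return X_c.mean(axis=0), X_c.std(axis=0) + 1e-6

def log_likelihood(x, mu, sigma):
    # Diagonal Gaussian log-density (up to the usual constants).
    z = (x - mu) / sigma
    return float(-0.5 * (z ** 2 + np.log(2 * np.pi * sigma ** 2)).sum())

def predict(models, x):
    # The predicted class is the model that represents x best.
    return max(models, key=lambda c: log_likelihood(x, *models[c]))

models = {}
models[0] = fit_class_model(np.array([[0.0, 0.0], [0.2, 0.2]]))
models[1] = fit_class_model(np.array([[5.0, 5.0], [5.2, 5.2]]))  # added later
```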
APA, Harvard, Vancouver, ISO, and other styles
31

Boukli, Hacene Ghouthi. "Processing and learning deep neural networks on chip." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2019. http://www.theses.fr/2019IMTA0153/document.

Full text
Abstract:
In the field of machine learning, deep neural networks have become the inescapable reference for a very large number of problems. These systems are made of an assembly of layers, performing elementary operations, and using a large number of tunable variables. Using data available during a learning phase, these variables are adjusted such that the neural network addresses the given task. It is then possible to process new data. To achieve state-of-the-art performance, in many cases these methods rely on a very large number of parameters, and thus large memory and computational costs. Therefore, they are often not well adapted to a hardware implementation on resource-constrained systems. Moreover, the learning process requires reusing the training data several times, making it difficult to adapt to scenarios where new information appears on the fly. In this thesis, we are first interested in methods for reducing the computational and memory footprint of deep neural networks. Secondly, we propose techniques for learning on the fly, in an embedded context.
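One widely used way to cut a layer's memory and compute cost, in the spirit of the methods this thesis surveys (though not necessarily its own technique), is to binarize the weights to a common scaled sign:

```python
import numpy as np

def binarize_weights(w):
    # Replace full-precision weights by alpha * sign(w), where alpha is the
    # mean magnitude: storage drops from 32 bits to 1 bit per weight, and
    # multiply-accumulates reduce to additions and subtractions.
    alpha = np.abs(w).mean()
    return alpha * np.sign(w), alpha

w = np.array([[0.5, -0.25], [0.75, -1.0]])
w_bin, alpha = binarize_weights(w)
```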
APA, Harvard, Vancouver, ISO, and other styles
32

Plummer, Dylan. "Facilitating the Study of Chromatin Organization with Deep Learning." Case Western Reserve University School of Graduate Studies / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=case1589203000193806.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Jonsson, Max. "Deep Learning för klassificering av kundsupport-ärenden." Thesis, Högskolan i Gävle, Datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-32687.

Full text
Abstract:
Companies and organizations providing customer support via email will over time grow a big corpus of text documents. With advances made in Machine Learning, the possibilities to use this data to improve customer support efficiency are steadily increasing. The aim of this study is to analyze and evaluate the use of Deep Learning methods for automating the process of classifying support errands. This study is based on a Swedish company’s domain, where the classification was made within the company’s predefined categories. A dataset was built by obtaining email support errands (subject and body pairs) from the company’s support database. The dataset consisted of data belonging to one of nine separate categories. The evaluation was done by analyzing the change in classification accuracy when using different methods for data cleaning and different network architectures. A delimitation was set to only examine the effects of using different combinations of Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in the shape of both unidirectional and bidirectional Long Short-Term Memory (LSTM) cells. The results of this study show no increase in classification accuracy from any of the examined data cleaning methods. However, a feature reduction of the used vocabulary is shown to have no negative impact on the accuracy either. A feature reduction might still be beneficial to minimize other side effects, such as the time required to train a network, and possibly to help prevent overfitting. Among the examined network architectures, CNNs were shown to outperform RNNs on the used dataset. The most accurate network architecture was a single convolutional network which on two different test sets reached classification rates of 79.3 and 75.4 percent respectively. The results also show some categories to be harder to classify than others, as they are not distinct enough from the rest of the categories in the dataset.
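The single-convolution-per-pipeline architecture that performed best can be sketched as a 1D convolution over token embeddings followed by ReLU and max-over-time pooling. Dimensions and the random weights below are illustrative, not the study's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def text_cnn_features(embeddings, filters):
    # embeddings: (seq_len, emb_dim) token vectors for one email.
    # filters: (n_filters, width, emb_dim) convolution kernels.
    seq_len, _ = embeddings.shape
    n_filters, width, _ = filters.shape
    conv = np.array([
        [max(0.0, float((embeddings[t:t + width] * f).sum()))  # ReLU
         for t in range(seq_len - width + 1)]
        for f in filters
    ])
    return conv.max(axis=1)  # max-over-time pooling: one feature per filter

emb = rng.normal(size=(10, 8))     # a ten-token support message
filt = rng.normal(size=(4, 3, 8))  # four trigram filters
feats = text_cnn_features(emb, filt)
```

A final dense softmax layer over these pooled features would yield the nine category scores; that classification head is omitted here.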
APA, Harvard, Vancouver, ISO, and other styles
34

Ioannou, Yani Andrew. "Structural priors in deep neural networks." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/278976.

Full text
Abstract:
Deep learning has in recent years come to dominate the previously separate fields of research in machine learning, computer vision, natural language understanding and speech recognition. Despite breakthroughs in training deep networks, there remains a lack of understanding of both the optimization and structure of deep networks. The approach advocated by many researchers in the field has been to train monolithic networks with excess complexity, and strong regularization --- an approach that leaves much to desire in efficiency. Instead we propose that carefully designing networks in consideration of our prior knowledge of the task and learned representation can improve the memory and compute efficiency of state-of-the art networks, and even improve generalization --- what we propose to denote as structural priors. We present two such novel structural priors for convolutional neural networks, and evaluate them in state-of-the-art image classification CNN architectures. The first of these methods proposes to exploit our knowledge of the low-rank nature of most filters learned for natural images by structuring a deep network to learn a collection of mostly small, low-rank, filters. The second addresses the filter/channel extents of convolutional filters, by learning filters with limited channel extents. The size of these channel-wise basis filters increases with the depth of the model, giving a novel sparse connection structure that resembles a tree root. Both methods are found to improve the generalization of these architectures while also decreasing the size and increasing the efficiency of their training and test-time computation. Finally, we present work towards conditional computation in deep neural networks, moving towards a method of automatically learning structural priors in deep networks. 
We propose a new discriminative learning model, conditional networks, that jointly exploit the accurate representation learning capabilities of deep neural networks with the efficient conditional computation of decision trees. Conditional networks yield smaller models, and offer test-time flexibility in the trade-off of computation vs. accuracy.
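The low-rank prior on filters can be illustrated by factorizing a separable k x k kernel into a vertical and a horizontal component, cutting k*k parameters to 2k. This SVD-based sketch is a post-hoc factorization; the thesis instead structures the network to learn such small low-rank filters directly:

```python
import numpy as np

def rank1_factorize(kernel):
    # Approximate a k x k filter by the outer product of two length-k
    # vectors, reducing k*k parameters to 2*k.
    U, s, Vt = np.linalg.svd(kernel)
    v = U[:, 0] * np.sqrt(s[0])   # vertical (k x 1) component
    h = Vt[0] * np.sqrt(s[0])     # horizontal (1 x k) component
    return v, h

kernel = np.outer([1.0, 2.0, 1.0], [-1.0, 0.0, 1.0])  # separable Sobel filter
v, h = rank1_factorize(kernel)
```

For an exactly rank-1 filter like the Sobel kernel the factorization is lossless; filters learned on natural images are approximately low-rank, which is the prior being exploited.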
APA, Harvard, Vancouver, ISO, and other styles
35

Bearzotti, Riccardo. "Structural damage detection using deep learning networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Find full text
Abstract:
Research on damage detection of structures using image processing techniques has been actively conducted, especially on infrastructures such as road pavements, achieving considerably high detection accuracies. These techniques are increasingly studied all over the world because they appear to be a powerful method able to replace, under some conditions, the experience and visual ability of humans. This thesis has the purpose of introducing how developments in image processing over the last few years can help avoid some structure-monitoring costs and predict disasters that are often called announced disasters because they could have been avoided. The thesis introduces the deep learning method, implemented in Matlab, to solve these problems, trying to explain in the first part what machine learning and deep learning consist of, which is the best way to use convolutional neural networks, and which parameters to work on, with the purpose of giving some background about this technique so that it can be applied to a large number of problems. Some examples of basic code are also given and their outcomes discussed, in order to figure out which tool, or combination of tools, is best suited to solve a problem of greater complexity. At the end there are some considerations about useful future work that could help in structural monitoring in lab tests, during the life cycle, and in case of collapse.
APA, Harvard, Vancouver, ISO, and other styles
36

Tiwari, Astha. "A Deep Learning Approach to Recognizing Bees in Video Analysis of Bee Traffic." DigitalCommons@USU, 2018. https://digitalcommons.usu.edu/etd/7076.

Full text
Abstract:
Colony Collapse Disorder (CCD) has been a major threat to bee colonies around the world which affects vital human food crop pollination. The decline in bee population can have tragic consequences, for humans as well as the bees and the ecosystem. Bee health has been a cause of urgent concern for farmers and scientists around the world for at least a decade, but a specific cause for the phenomenon has yet to be conclusively identified. This work uses Artificial Intelligence and Computer Vision approaches to develop and analyze techniques to help in continuous monitoring of bee traffic, which will further help in monitoring forager traffic. Bee traffic is the number of bees moving in a given area in front of the hive over a given period of time. Forager traffic is the number of bees entering and/or exiting the hive over a given period of time. Forager traffic is an important variable to monitor food availability, food demand, colony age structure, impact of pesticides, etc. on bee hives. This will lead to improved remote monitoring of general hive status and improved real-time detection of the impact of pests, diseases, pesticide exposure and other hive management problems.
APA, Harvard, Vancouver, ISO, and other styles
37

Kalogiras, Vasileios. "Sentiment Classification with Deep Neural Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217858.

Full text
Abstract:
Sentiment analysis is a subfield of natural language processing (NLP) that attempts to analyze the sentiment of written text. It is a complex problem that entails different challenges, and for this reason it has been studied extensively. In past years, traditional machine learning algorithms or handcrafted methodologies used to provide state-of-the-art results. However, the recent deep learning renaissance shifted interest towards end-to-end deep learning models. On the one hand this resulted in more powerful models, but on the other hand clear mathematical reasoning or intuition behind distinct models is still lacking. As a result, this thesis attempts to shed some light on recently proposed deep learning architectures for sentiment classification. A study of their differences is performed, and empirical results are provided on how changes in the structure or capacity of a model can affect its accuracy and the way it represents and ''comprehends'' sentences.
APA, Harvard, Vancouver, ISO, and other styles
38

Sharma, Astha. "Emotion Recognition Using Deep Convolutional Neural Network with Large Scale Physiological Data." Scholar Commons, 2018. https://scholarcommons.usf.edu/etd/7570.

Full text
Abstract:
Classification of emotions plays a very important role in affective computing and has real-world applications in fields as diverse as entertainment, medicine, defense, retail, and education. These applications include video games, virtual reality, pain recognition, lie detection, classification of Autistic Spectrum Disorder (ASD), analysis of stress levels, and determining attention levels. This vast range of applications motivated us to study automatic emotion recognition, which can be done using facial expression, speech, and physiological data. A person's physiological signals, such as heart rate and blood pressure, are deeply linked with their emotional states and can be used to identify a variety of emotions; however, they are less frequently explored for emotion recognition than audiovisual signals such as facial expression and voice. In this thesis, we investigate a multimodal approach to emotion recognition using physiological signals, showing how these signals can be combined and used to accurately identify a wide range of emotions such as happiness, sadness, and pain. We also detail comparisons between gender-specific models of emotion. Our investigation makes use of deep convolutional neural networks, the latest state of the art in supervised learning, on two publicly available databases, namely DEAP and BP4D+. We achieved an average emotion recognition accuracy of 98.89% on BP4D+; on DEAP, accuracy is 86.09% for valence, 90.61% for arousal, 90.48% for liking, and 90.95% for dominance. We also compare our results to the current state of the art, showing the superior performance of our method.
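The core building block of the convolutional approach described above can be illustrated with a minimal 1-D convolution over a physiological time series. This is a sketch only: the signal values and kernel weights below are made up, and the thesis's actual models are full deep CNNs trained on DEAP and BP4D+.

```python
# Minimal sketch: the core operation of a 1-D convolutional layer applied
# to a physiological signal (e.g. a heart-rate series). Illustrative only;
# the values and kernel are hypothetical, and real kernels are learned.

def conv1d(signal, kernel):
    """Valid-mode 1-D convolution (cross-correlation, as in most deep
    learning frameworks): slide the kernel over the signal and take dot
    products."""
    n, k = len(signal), len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]

def relu(xs):
    """Element-wise ReLU non-linearity, applied after the convolution."""
    return [max(0.0, x) for x in xs]

# A toy "heart-rate" signal and a smoothing kernel.
signal = [72, 74, 80, 95, 90, 76, 73, 72]
kernel = [0.25, 0.5, 0.25]
feature_map = relu(conv1d(signal, kernel))
print(feature_map)
```

Stacking many such convolutions with learned kernels, plus pooling and fully connected layers, yields the kind of deep CNN classifier the abstract describes.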
APA, Harvard, Vancouver, ISO, and other styles
39

Michailoff, John. "Email Classification : An evaluation of Deep Neural Networks with Naive Bayes." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-37590.

Full text
Abstract:
Machine learning (ML) is an area of computer science that gives computers the ability to learn data patterns without being explicitly programmed for those patterns. Using neural networks in this area is based on simulating the biological function of neurons in the brain to learn patterns in data, giving computers a predictive ability to comprehend how data can be clustered. This research investigates the possibilities of using neural networks for classifying email, i.e. working as an email case manager. A Deep Neural Network (DNN) consists of multiple layers of neurons connected to each other by trainable weights. The main objective of this thesis was to evaluate how three input arguments (data size, training time, and neural network structure) affect the pattern-recognition accuracy of Deep Neural Networks; to compare the DNN with the statistical ML method Naïve Bayes in terms of prediction accuracy and complexity; and finally to assess the viability of the resulting DNN as a case manager. Results show that accuracy improves with increases in training time and data size. By testing increasingly complex network structures (larger networks of neurons with more layers), we observe that overfitting becomes a problem with increased training time, i.e. accuracy decreases after a certain threshold of training time. Naïve Bayes classifiers perform worse than DNNs in terms of accuracy but have lower complexity, making them viable on mobile platforms. We conclude that our developed prototype may work well in tandem with existing case management systems, to be tested by future research.
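The Naïve Bayes baseline mentioned above can be sketched in a few lines: count words per class, then score a new message with Laplace-smoothed log-probabilities. The labels and example emails are hypothetical; a real email case manager would need proper tokenization and far more data.

```python
# Minimal multinomial Naive Bayes text classifier, in the spirit of the
# baseline the thesis compares DNNs against. Labels and documents are
# made up for illustration.
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (label, text). Returns class counts, per-class word
    counts, and the vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in docs:
        class_counts[label] += 1
        for w in text.lower().split():
            word_counts[label][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Pick the class maximizing log P(class) + sum of log P(word|class),
    with Laplace (add-one) smoothing over the vocabulary."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for label in class_counts:
        lp = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in text.lower().split():
            lp += math.log((word_counts[label][w] + 1)
                           / (total_words + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [("billing", "invoice payment overdue"),
        ("billing", "payment invoice attached"),
        ("support", "password reset help"),
        ("support", "help login error")]
model = train_nb(docs)
print(predict_nb(model, "invoice overdue"))   # billing
print(predict_nb(model, "reset my password")) # support
```

Its low training and inference cost is what makes this family of classifiers attractive on constrained platforms, as the abstract notes.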
APA, Harvard, Vancouver, ISO, and other styles
40

Donati, Lorenzo. "Domain Adaptation through Deep Neural Networks for Health Informatics." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14888/.

Full text
Abstract:
The PreventIT project is an EU Horizon 2020 project aimed at preventing early functional decline at younger old age. The analysis of causal links between risk factors and functional decline has been made possible by the cooperation of several research institutes' studies. However, since each research institute collects and delivers different kinds of data in different formats, so far the analysis has been assisted by expert geriatricians whose role is to detect the best candidates among hundreds of fields and offer a semantic interpretation of the values. This manual data harmonization approach is very common in both scientific and industrial environments. In this thesis project an alternative method for parsing heterogeneous data is proposed. Since all the datasets represent semantically related data, being all made from longitudinal studies on aging-related metrics, it is possible to train an artificial neural network to perform an automatic domain adaptation. To achieve this goal, a Stacked Denoising Autoencoder has been implemented and trained to extract a domain-invariant representation of the data. Then, from this high-level representation, multiple classifiers have been trained to validate the model and ultimately to predict the probability of functional decline of the patient. This innovative approach to the domain adaptation process can provide an easy and fast solution to many research fields that now rely on human interaction to analyze the semantic data model and perform cross-dataset analysis. Functional decline classifiers show a great improvement in their performance when trained on the domain-invariant features extracted by the Stacked Denoising Autoencoder. Furthermore, this project applies multiple deep neural network classifiers on top of the Stacked Denoising Autoencoder representation, achieving excellent results for the prediction of functional decline in a real case study that involves two different datasets.
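The "denoising" part of the Stacked Denoising Autoencoder described above comes from a corruption step: inputs are randomly masked before encoding, and the network is trained to reconstruct the clean vector. A minimal sketch of that step follows; the encoder/decoder themselves, and the stacking of layers, are omitted, and the feature vector is made up.

```python
# Sketch of the corruption step that gives a *denoising* autoencoder its
# name. The autoencoder (not shown) maps the corrupted input to a hidden
# code and back, minimizing reconstruction error against the clean input.
import random

def corrupt(x, mask_prob, rng):
    """Masking noise: zero out each feature with probability mask_prob."""
    return [0.0 if rng.random() < mask_prob else v for v in x]

def reconstruction_error(x, x_hat):
    """Squared reconstruction loss between clean input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

rng = random.Random(0)
clean = [0.2, 0.9, 0.4, 0.7, 0.1]
noisy = corrupt(clean, mask_prob=0.4, rng=rng)
# Training would minimize reconstruction_error(clean, reconstruction).
print(noisy)
print(reconstruction_error(clean, noisy))
```

Because reconstruction from corrupted inputs forces the hidden code to capture stable structure, the learned representation tends to be robust to dataset-specific noise, which is what makes it useful for the cross-dataset harmonization described above.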
APA, Harvard, Vancouver, ISO, and other styles
41

Hesamifard, Ehsan. "Privacy Preserving Machine Learning as a Service." Thesis, University of North Texas, 2020. https://digital.library.unt.edu/ark:/67531/metadc1703277/.

Full text
Abstract:
Machine learning algorithms based on neural networks have achieved remarkable results and are being extensively used in different domains. However, these algorithms require access to raw data, which is often privacy sensitive. To address this issue, we develop new techniques for running deep neural networks over encrypted data, adapting them to the practical limitations of current homomorphic encryption schemes. We focus on training and classification of well-known neural networks and convolutional neural networks. First, we design methods for approximating the activation functions commonly used in CNNs (i.e. ReLU, Sigmoid, and Tanh) with low-degree polynomials, which is essential for efficient homomorphic encryption schemes. Then, we train neural networks with the approximating polynomials instead of the original activation functions and analyze the performance of the models. Finally, we implement neural networks and convolutional neural networks over encrypted data and measure the performance of the models.
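The polynomial-approximation idea above can be illustrated with the degree-3 Taylor expansion of the sigmoid around zero. This is only one way to obtain a low-degree polynomial (the thesis may fit polynomials differently, e.g. by least squares over an interval), but it shows why the substitution works: homomorphic encryption supports only additions and multiplications, and near zero the polynomial tracks sigmoid closely.

```python
# Illustration of replacing an activation function with a low-degree
# polynomial, as needed for computation over homomorphically encrypted
# data. Here: the degree-3 Taylor expansion of sigmoid around 0.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_poly(x):
    """Degree-3 Taylor approximation: 1/2 + x/4 - x^3/48."""
    return 0.5 + x / 4.0 - x ** 3 / 48.0

# The gap grows with |x|, which is one reason inputs are typically
# normalized before encryption.
for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(x, sigmoid(x), sigmoid_poly(x))
```

On [-1, 1] the approximation error stays below about 0.002, while the polynomial involves only operations an encrypted evaluation can perform.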
APA, Harvard, Vancouver, ISO, and other styles
42

Mansour, Tarek M. Eng Massachusetts Institute of Technology. "Deep neural networks are lazy : on the inductive bias of deep learning." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/121680.

Full text
Abstract:
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 75-78).
Deep learning models exhibit superior generalization performance despite being heavily overparametrized. Although widely observed in practice, there is currently very little theoretical backing for such a phenomenon. In this thesis, we propose a step towards understanding generalization in deep learning. We present evidence that deep neural networks have an inherent inductive bias that makes them inclined to learn generalizable hypotheses and avoid memorization. In this respect, we propose results suggesting that the inductive bias stems from neural networks being lazy: they tend to learn simpler rules first. We also propose a definition of simplicity in deep learning based on the implicit priors ingrained in deep neural networks.
by Tarek Mansour.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
APA, Harvard, Vancouver, ISO, and other styles
43

Glazier, Seth William. "Sequential Survival Analysis with Deep Learning." BYU ScholarsArchive, 2019. https://scholarsarchive.byu.edu/etd/7528.

Full text
Abstract:
Survival Analysis is the collection of statistical techniques used to model the time of occurrence, i.e. survival time, of an event of interest such as death, marriage, the lifespan of a consumer product or the onset of a disease. Traditional survival analysis methods rely on assumptions that make it difficult, if not impossible to learn complex non-linear relationships between the covariates and survival time that is inherent in many real world applications. We first demonstrate that a recurrent neural network (RNN) is better suited to model problems with non-linear dependencies in synthetic time-dependent and non-time-dependent experiments.
APA, Harvard, Vancouver, ISO, and other styles
44

Purmonen, Sami. "Predicting Game Level Difficulty Using Deep Neural Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217140.

Full text
Abstract:
We explored the usage of Monte Carlo tree search (MCTS) and deep learning in order to predict game level difficulty in Candy Crush Saga (Candy) measured as number of attempts per success. A deep neural network (DNN) was trained to predict moves from game states from large amounts of game play data. The DNN played a diverse set of levels in Candy and a regression model was fitted to predict human difficulty from bot difficulty. We compared our results to an MCTS bot. Our results show that the DNN can make estimations of game level difficulty comparable to MCTS in substantially shorter time.
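The final step described above, fitting a regression model that maps bot difficulty to human difficulty (attempts per success), can be sketched with ordinary least squares. The difficulty numbers below are made up for illustration; the thesis's actual regression model and data are not specified here.

```python
# Minimal sketch: least-squares line mapping bot difficulty to human
# difficulty. The (bot, human) pairs are hypothetical per-level values.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

bot = [1.0, 2.0, 3.0, 4.0]
human = [2.1, 4.2, 5.9, 8.0]
a, b = fit_line(bot, human)
print(a, b)             # slope and intercept
print(a * 2.5 + b)      # predicted human difficulty for a new level
```

Once fitted, the line lets the (fast) bot play a new level and produce a human-difficulty estimate without any human playtesting.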
APA, Harvard, Vancouver, ISO, and other styles
45

Monica, Riccardo. "Deep Incremental Learning for Object Recognition." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/12331/.

Full text
Abstract:
In recent years, deep learning techniques have received great attention in the field of information technology. These techniques proved to be particularly useful and effective in domains like natural language processing, speech recognition, and computer vision, and in several real-world applications deep learning approaches have improved the state of the art. In the field of machine learning, deep learning was a real revolution, and a number of effective techniques have been proposed for supervised, unsupervised, and representation learning. This thesis focuses on deep learning for object recognition and, in particular, addresses incremental learning techniques. By incremental learning we denote approaches able to create an initial model from a small training set and to improve the model as new data become available. Using temporally coherent sequences proved to be useful for incremental learning, since temporal coherence also allows operating in an unsupervised manner. A critical issue in incremental learning is forgetting: the risk of losing previously learned patterns as new data are presented. In the first chapters of this work we introduce the basic theory on neural networks, Convolutional Neural Networks, and incremental learning. The CNN is today one of the most effective approaches for supervised object recognition; it is well accepted by the scientific community and largely used by ICT big players like Google and Facebook: relevant applications are Facebook face recognition and Google image search. The scientific community has several (large) datasets (e.g., ImageNet) for the development and evaluation of object recognition approaches, but very few temporally coherent datasets are available to study incremental approaches. For this reason we decided to collect a new dataset named TCD4R (Temporal Coherent Dataset For Robotics).
APA, Harvard, Vancouver, ISO, and other styles
46

Vikström, Johan. "Comparing decentralized learning to Federated Learning when training Deep Neural Networks under churn." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300391.

Full text
Abstract:
Decentralized Machine Learning could address some problematic facets of Federated Learning. There is no central server acting as an arbiter of whom or what may benefit from Machine Learning models created by the vast amount of data becoming available in recent years. It could also increase the reliability and scalability of Machine Learning systems, thereby benefiting from having more data accessible. Gossip Learning is such a protocol, but it has primarily been designed with linear models in mind. How does Gossip Learning perform when training Deep Neural Networks? Could it be a viable alternative to Federated Learning? In this thesis, we implement Gossip Learning using two different model merging strategies. We also design and implement two extensions to this protocol with the goal of achieving higher performance when training under churn. The training methods are compared on two tasks: image classification on the Federated Extended MNIST dataset and time-series forecasting on the NN5 dataset. Additionally, we run an experiment where learners churn, alternating between being available and unavailable. We find that Gossip Learning performs slightly better in settings where learners do not churn but is vastly outperformed in the setting where they do.
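The model merging at the heart of Gossip Learning can be sketched in a few lines. Plain averaging and age-weighted averaging are two common strategies in the Gossip Learning literature; whether these are the two the thesis implements is an assumption made for illustration.

```python
# Sketch of model merging in Gossip Learning: when a peer receives a
# neighbour's model, it merges the parameters with its own before
# continuing to train. Both strategies below are common in the
# literature; they are illustrative, not necessarily the thesis's exact pair.

def merge_average(own, received):
    """Element-wise mean of two parameter vectors."""
    return [(a + b) / 2.0 for a, b in zip(own, received)]

def merge_weighted(own, received, own_age, received_age):
    """Weight each model by its 'age' (number of updates it has seen),
    so a well-trained model is not diluted by a fresh one."""
    total = own_age + received_age
    return [(a * own_age + b * received_age) / total
            for a, b in zip(own, received)]

own_params = [0.2, -0.4, 1.0]
received_params = [0.6, 0.0, 0.0]
print(merge_average(own_params, received_params))
print(merge_weighted(own_params, received_params, 3, 1))
```

Under churn, the choice of merging strategy matters: a newly returned peer holds a stale model, and naive averaging can pull a well-trained neighbour's model backwards.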
APA, Harvard, Vancouver, ISO, and other styles
47

Matuh, Delic Senad. "A Convolutional Neural Network for predicting HIV Integration Sites." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279796.

Full text
Abstract:
Convolutional neural networks are commonly used when training deep networks with time-independent data and have demonstrated positive results in predicting binding sites for DNA-binding proteins. Building on this success, this project intends to determine whether a convolutional neural network can predict possible HIV-B provirus integration sites. When exploring existing research, little information was found regarding the DNA sequences targeted by HIV for integration, and few studies, if any, have attempted to use artificial neural networks to identify these sequences and the integration sites themselves. Using data from the Retrovirus Integration Database, we train a convolutional neural network to determine whether it can detect potential target sites for HIV integration. The analysis and results reveal that the created convolutional neural network is able to predict HIV integration sites in human DNA with an accuracy that exceeds that of a random binary classifier. When analyzing the datasets separated by the neural network, the relative distribution of the different nucleotides in the immediate vicinity of HIV integration sites reveals that some nucleotides occur disproportionately less often at these sites compared to nucleotides in randomly sampled human DNA.
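To feed DNA to a convolutional network, each nucleotide is typically one-hot encoded, and the nucleotide-frequency analysis mentioned above reduces to counting within a window. The window string below is hypothetical, and the exact preprocessing in the thesis may differ.

```python
# Sketch of the standard input encoding for CNNs over DNA: each
# nucleotide becomes a one-hot vector over A, C, G, T. The window around
# an integration site is made up for illustration.

NUCLEOTIDES = "ACGT"

def one_hot(seq):
    """Map a DNA string to a list of one-hot vectors over A, C, G, T."""
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq]

def nucleotide_frequencies(seq):
    """Relative frequency of each nucleotide in a sequence window,
    as used when comparing sites against randomly sampled DNA."""
    return {n: seq.count(n) / len(seq) for n in NUCLEOTIDES}

window = "ACGTTGCA"   # hypothetical window around an integration site
encoded = one_hot(window)
print(encoded[0])     # 'A' -> [1, 0, 0, 0]
print(nucleotide_frequencies(window))
```

The convolutional filters then slide over this length-by-4 matrix, exactly as they slide over image channels, which is what makes CNNs a natural fit for sequence motifs.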
APA, Harvard, Vancouver, ISO, and other styles
48

Aspiras, Theus Herrera. "Hierarchical Autoassociative Polynomial Network for Deep Learning of Complex Manifolds." University of Dayton / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1449104879.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Valeriana, Riccardo. "Deep Learning: Algoritmo di Classificazione Immagini." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/17557/.

Full text
Abstract:
This thesis describes a Deep Learning algorithm for classifying images imported from the CIFAR-10 dataset. The code was tested in a test environment for an image classifier on Apple macOS platforms, using images of fruit and vegetables as classification input. The first chapter covers the CIFAR-10 dataset: how it was built, its structure, and how it can be used. It then describes the linear classification approach to machine learning, what it consists of, how it separates classes, and its advantages and drawbacks. This is followed by an introduction to the supervised learning methods known as Support Vector Machines (SVMs), used for pattern classification, and by the definition of the score function used to determine which class each image belongs to. The second chapter covers Deep Learning and (non-convolutional) neural networks, a mathematical model based on an approximate reproduction of biological neural structure, the mammalian brain, in particular the functioning of the visual cortex. The third and final chapter gives a deeper, more technical treatment of the implementation of a Deep Learning algorithm developed in PyTorch, together with the NumPy library, explaining step by step how it is structured. Finally, it shows how an image classifier for macOS platforms performs on heterogeneous pictures of fruit and vegetables collected from the internet.
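The score function described above can be sketched directly: a linear classifier assigns each image x a score vector s = Wx + b and predicts the class with the highest score. The tiny weight matrix and "image" below are made up; real CIFAR-10 inputs are 3072-dimensional with 10 classes.

```python
# Sketch of linear classification via a score function s = Wx + b,
# with the prediction being the argmax over class scores. All numbers
# are hypothetical, reduced to 3 classes and a 4-dimensional input.

def scores(W, b, x):
    """One score per class: s_k = W[k] . x + b[k]."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_k
            for row, b_k in zip(W, b)]

def predict(W, b, x):
    """Index of the highest-scoring class."""
    s = scores(W, b, x)
    return max(range(len(s)), key=lambda k: s[k])

W = [[0.1, -0.2, 0.0, 0.3],
     [0.0,  0.4, 0.1, 0.0],
     [-0.1, 0.0, 0.2, 0.1]]
b = [0.0, 0.5, -0.5]
x = [1.0, 2.0, 0.5, 1.5]
print(scores(W, b, x))
print(predict(W, b, x))   # index of the winning class
```

SVM training, as introduced in the first chapter, amounts to choosing W and b so that each image's correct class scores higher than the others by a margin.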
APA, Harvard, Vancouver, ISO, and other styles
50

Conciatori, Marco. "tecniche di deep learning applicate a giochi atari." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/19132/.

Full text
Abstract:
This thesis focuses on designing and experimenting with variants of the DQN algorithm, which is widely used to tackle reinforcement learning problems with neural networks. The goal is to improve its performance, especially in environments with sparse rewards, where reinforcement learning algorithms, including DQN itself, struggle to achieve satisfactory results. The modifications to the algorithm aim to optimize how it learns, following, in short, two main directions: better exploitation of the few available rewards through their more frequent reuse, or more effective exploration, for example by introducing random action selection and/or entropy. This yields several versions of DQN, which are then compared with each other and with the original algorithm on the basis of the results obtained.
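The exploration mechanism that such DQN variants build on is epsilon-greedy action selection: with probability epsilon the agent picks a random move, otherwise the move with the highest estimated Q-value. A minimal sketch, with made-up Q-values:

```python
# Sketch of epsilon-greedy action selection, the standard exploration
# mechanism in DQN. The Q-values below are hypothetical; in DQN they
# come from the neural network's output for the current state.
import random

def epsilon_greedy(q_values, epsilon, rng):
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

rng = random.Random(42)
q = [0.1, 0.9, 0.3]   # hypothetical Q-values for three moves
greedy_only = [epsilon_greedy(q, 0.0, rng) for _ in range(5)]
mixed = [epsilon_greedy(q, 0.3, rng) for _ in range(5)]
print(greedy_only)    # always the best move (index 1)
print(mixed)
```

With sparse rewards, pure greedy play can get stuck before ever seeing a reward, which is why tuning the exploration side (randomness, entropy bonuses) is one of the two directions the thesis pursues.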
APA, Harvard, Vancouver, ISO, and other styles