Dissertations / Theses on the topic 'Artificial Intelligence; Deep learning; Representation learning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 dissertations / theses for your research on the topic 'Artificial Intelligence; Deep learning; Representation learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Carvalho, Micael. "Deep representation spaces." Electronic Thesis or Diss., Sorbonne université, 2018. http://www.theses.fr/2018SORUS292.

Full text
Abstract:
In recent years, Deep Learning techniques have swept the state of the art of many applications of Machine Learning, becoming the new standard approach for them. The architectures issued from these techniques have been used for transfer learning, which extended the power of deep models to tasks that did not have enough data to fully train them from scratch. This thesis' subject of study is the representation spaces created by deep architectures. First, we study properties inherent to them, with particular interest in dimensionality redundancy and the precision of their features. Our findings reveal a strong degree of robustness, pointing the path to simple and powerful compression schemes. Then, we focus on refining these representations. We choose to adopt a cross-modal multi-task problem, and design a loss function capable of taking advantage of data coming from multiple modalities, while also taking into account the different tasks associated with the same dataset.
In order to correctly balance these losses, we also develop a new sampling scheme that only takes into account examples contributing to the learning phase, i.e. those having a positive loss. Finally, we test our approach on a large-scale dataset of cooking recipes and associated pictures. Our method achieves a 5-fold improvement over the state of the art, and we show that the multi-task aspect of our approach promotes a semantically meaningful organization of the representation space, allowing it to perform subtasks never seen during training, such as ingredient exclusion and selection. The results we present in this thesis open many possibilities, including feature compression for remote applications, robust multi-modal and multi-task learning, and feature space refinement. For the cooking application in particular, many of our findings are directly applicable in a real-world context, especially for the detection of allergens, finding alternative recipes due to dietary restrictions, and menu planning.
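As a rough illustration of the positive-loss sampling idea described in this abstract, the sketch below averages a margin-based triplet loss only over examples whose individual loss is strictly positive; the function name, margin value, and triplet form are illustrative assumptions, not the thesis' exact objective.

```python
import torch

def positive_only_triplet_loss(anchor, positive, negative, margin=0.3):
    """Margin-based triplet loss averaged only over 'active' examples,
    i.e. those whose individual loss is strictly positive.

    anchor, positive, negative: (batch, dim) embedding tensors.
    """
    d_pos = (anchor - positive).pow(2).sum(dim=1)    # squared distance to positive
    d_neg = (anchor - negative).pow(2).sum(dim=1)    # squared distance to negative
    per_example = torch.clamp(d_pos - d_neg + margin, min=0.0)

    active = per_example > 0                          # examples that still contribute
    if active.any():
        return per_example[active].mean()             # average over contributing examples only
    return per_example.sum() * 0.0                    # zero loss, graph preserved
```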
APA, Harvard, Vancouver, ISO, and other styles
2

Azizpour, Hossein. "Visual Representations and Models: From Latent SVM to Deep Learning." Doctoral thesis, KTH, Datorseende och robotik, CVAP, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-192289.

Full text
Abstract:
Two important components of a visual recognition system are the representation and the model. Both involve the selection and learning of features that are indicative for recognition and the discarding of features that are uninformative. This thesis, in its general form, proposes different techniques within the frameworks of two learning systems for representation and modeling, namely latent support vector machines (latent SVMs) and deep learning. First, we propose various approaches to group the positive samples into clusters of visually similar instances. Given a fixed representation, the sampled space of the positive distribution is usually structured. The proposed clustering techniques include a novel similarity measure based on exemplar learning, an approach for using additional annotation, and an augmentation of the latent SVM to automatically find clusters whose members can be reliably distinguished from the background class. In another effort, a strongly supervised DPM is suggested to study how these models can benefit from privileged information. The extra information comes in the form of semantic part annotations (i.e. their presence and location), which are used to constrain the DPM's latent variables during or prior to the optimization of the latent SVM. Its effectiveness is demonstrated on the task of animal detection. Finally, we generalize the formulation of discriminative latent variable models, including DPMs, to incorporate a new set of latent variables representing the structure or properties of negative samples; we therefore term them negative latent variables. We show that this generalization affects state-of-the-art techniques and helps visual recognition by explicitly searching for counter-evidence of an object's presence. Following the resurgence of deep networks, the last works of this thesis focus on deep learning in order to produce a generic representation for visual recognition. A Convolutional Network (ConvNet) is trained on a large annotated image classification dataset called ImageNet, with approximately 1.3 million images. The activations at each layer of the trained ConvNet can then be treated as the representation of an input image. We show that such a representation is surprisingly effective for various recognition tasks, making it clearly superior to all the handcrafted features previously used in visual recognition (such as HOG in our first works on DPM). We further investigate the ways that one can improve this representation for a task in mind, and propose various factors, applied before or after the training of the representation, which can improve the efficacy of the ConvNet representation. These factors are analyzed on 16 datasets from various subfields of visual recognition.
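A minimal sketch of the off-the-shelf ConvNet representation described above, assuming a PyTorch/torchvision setup; the specific backbone (`resnet18`) and weight tag are illustrative choices, not the network used in the thesis.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a CNN pre-trained on ImageNet and drop its classification head,
# keeping the penultimate activations as a generic image representation.
backbone = models.resnet18(weights="IMAGENET1K_V1")
feature_extractor = nn.Sequential(*list(backbone.children())[:-1]).eval()

with torch.no_grad():
    images = torch.randn(4, 3, 224, 224)      # stand-in for a preprocessed batch
    features = feature_extractor(images)      # shape (4, 512, 1, 1)
    features = features.flatten(1)            # (4, 512) off-the-shelf descriptors

# These descriptors can then feed a simple classifier (e.g. a linear SVM)
# for the target recognition task.
```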
APA, Harvard, Vancouver, ISO, and other styles
3

Panesar, Kulvinder. "Conversational artificial intelligence - demystifying statistical vs linguistic NLP solutions." Universitat Politécnica de Valéncia, 2020. http://hdl.handle.net/10454/18121.

Full text
Abstract:
This paper aims to demystify the hype and attention on chatbots and their association with conversational artificial intelligence. Both are slowly emerging as a real presence in our lives thanks to impressive technological developments in machine learning, deep learning and natural language understanding solutions. However, what is under the hood, and how far and to what extent can chatbot/conversational artificial intelligence solutions work, is our question. Natural language is the most easily understood knowledge representation for people, but certainly not the best for computers because of its inherently ambiguous, complex and dynamic nature. We will critique the knowledge representation of heavily statistical chatbot solutions against linguistic alternatives. In order to react intelligently to the user, natural language solutions must critically consider other factors such as context, memory, intelligent understanding, previous experience, and personalized knowledge of the user. We will delve into the spectrum of conversational interfaces and focus on a strong artificial intelligence concept. This is explored via a text-based conversational software agent with a deep strategic role: to hold a conversation and enable the mechanisms needed to plan, to decide what to do next, and to manage the dialogue to achieve a goal. To demonstrate this, a deep linguistically aware and knowledge-aware text-based conversational agent (LING-CSA) presents a proof-of-concept of a non-statistical conversational AI solution.
APA, Harvard, Vancouver, ISO, and other styles
4

Denize, Julien. "Self-supervised representation learning and applications to image and video analysis." Electronic Thesis or Diss., Normandie, 2023. http://www.theses.fr/2023NORMIR37.

Full text
Abstract:
In this thesis, we develop approaches to perform self-supervised learning for image and video analysis. Self-supervised representation learning makes it possible to pretrain neural networks to learn general concepts without labels before specializing them to downstream tasks faster and with few annotations. We present three contributions to self-supervised image and video representation learning. First, we introduce the theoretical paradigm of soft contrastive learning and its practical implementation called Similarity Contrastive Estimation (SCE), connecting contrastive and relational learning for image representation. Second, SCE is extended to global temporal video representation learning. Lastly, we propose COMEDIAN, a pipeline for local-temporal video representation learning for transformers. These contributions achieved state-of-the-art results on multiple benchmarks and led to several published academic and technical contributions.
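As a hedged illustration of contrastive representation learning in general (not the thesis' actual SCE objective, whose soft relational targets differ), a standard InfoNCE-style loss over two augmented views of the same images could look like the following sketch.

```python
import torch
import torch.nn.functional as F

def infonce_loss(z_a, z_b, temperature=0.1):
    """Generic contrastive objective over two batches of embeddings,
    where z_a[i] and z_b[i] come from two augmented views of image i."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature                      # pairwise similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)    # positives on the diagonal
    return F.cross_entropy(logits, targets)
```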
APA, Harvard, Vancouver, ISO, and other styles
5

Tamaazousti, Youssef. "Vers l’universalité des représentations visuelle et multimodales." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLC038/document.

Full text
Abstract:
Because of its key societal, economic and cultural stakes, Artificial Intelligence (AI) is a hot topic. One of its main goals is to develop systems that facilitate the daily life of humans, with applications such as household robots, industrial robots, autonomous vehicles and much more. The rise of AI is largely due to the emergence of tools based on deep neural networks which make it possible to simultaneously learn the representation of the data (which was traditionally hand-crafted) and the task to solve (traditionally learned with statistical models). This has resulted from the conjunction of theoretical advances, growing computational capacity, and the availability of much annotated data. A long-standing goal of AI is to design machines inspired by humans, capable of perceiving the world and interacting with humans, in an evolutionary way.
We categorize, in this Thesis, the works around AI into the two following learning approaches: (i) Specialization: learn representations from a few specific tasks with the goal of carrying out very specific tasks (specialized in a certain field) with a very good level of performance; (ii) Universality: learn representations from several general tasks with the goal of performing as many tasks as possible in different contexts. While specialization has been extensively explored by the deep-learning community, only a few implicit attempts have been made towards universality. Thus, the goal of this Thesis is to explicitly address the problem of improving universality with deep-learning methods, for image and text data. We have addressed this topic of universality in two different forms: through the implementation of methods to improve universality (“universalizing methods”); and through the establishment of a protocol to quantify universality. Concerning universalizing methods, we proposed three technical contributions: (i) in a context of large semantic representations, we proposed a method to reduce redundancy between the detectors through adaptive thresholding and the relations between concepts; (ii) in the context of neural-network representations, we proposed an approach that increases the number of detectors without increasing the amount of annotated data; (iii) in a context of multimodal representations, we proposed a method to preserve the semantics of unimodal representations in multimodal ones. Regarding the quantification of universality, we proposed to evaluate universalizing methods in a transfer-learning scheme. Indeed, this technical scheme is relevant for assessing the universal ability of representations. This also led us to propose a new framework as well as new quantitative evaluation criteria for universalizing methods.
APA, Harvard, Vancouver, ISO, and other styles
6

Cribier-Delande, Perrine. "Contexts and user modeling through disentangled representations learning." Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS407.

Full text
Abstract:
The recent, sometimes very publicised, successes of Deep Learning (DL) have drawn a lot of attention to the field, and many questions are asked about the limitations of these techniques. The great strength of DL is its ability to learn representations of complex objects. Renault, as a car manufacturer, has a vested interest in discovering how its cars are used, and learning representations of drivers is one of its long-term goals. Renault's strength partly lies in its knowledge of cars and the data they use and produce. This data is almost entirely contained in the Controller Area Network (CAN). However, the CAN data only contains the inner workings of a car, not its surroundings. As many factors exterior to the driver and the car (such as weather, other road users, road condition...) can affect driving, we must find a way to disentangle them. Seeing the user (or driver) as just another context allowed us to use context modelling approaches. By transferring disentanglement approaches used in computer vision, we were able to develop models that learn disentangled representations of contexts. We tested these models on a few public datasets of time series with clearly labelled contexts. Using only forecasting as supervision during training, our models are able to generate data from the learned representations of contexts alone. They even learn to represent new contexts, seen only after training. We then transferred the developed models to CAN data and were able to confirm that information about driving contexts (including the driver's identity) is indeed contained in the CAN.
APA, Harvard, Vancouver, ISO, and other styles
7

Kilinc, Ismail Ozsel. "Graph-based Latent Embedding, Annotation and Representation Learning in Neural Networks for Semi-supervised and Unsupervised Settings." Scholar Commons, 2017. https://scholarcommons.usf.edu/etd/7415.

Full text
Abstract:
Machine learning has been immensely successful in supervised learning, with outstanding examples in major industrial applications such as voice and image recognition. Following these developments, the most recent research has begun to focus primarily on algorithms which can exploit very large sets of unlabeled examples to reduce the amount of manually labeled data required for existing models to perform well. In this dissertation, we propose graph-based latent embedding/annotation/representation learning techniques in neural networks tailored for semi-supervised and unsupervised learning problems. Specifically, we propose a novel regularization technique called Graph-based Activity Regularization (GAR) and a novel output layer modification called Auto-clustering Output Layer (ACOL), which can be used separately or collaboratively to develop scalable and efficient learning frameworks for semi-supervised and unsupervised settings. First, using the GAR technique on its own, we develop a framework providing an effective and scalable graph-based solution for semi-supervised settings in which there exists a large number of observations but only a small subset with ground-truth labels. The proposed approach is natural for the classification framework on neural networks as it requires no additional task such as calculating the reconstruction error (as in autoencoder-based methods) or implementing a zero-sum game mechanism (as in adversarial-training-based methods). We demonstrate that GAR effectively and accurately propagates the available labels to unlabeled examples. Our results show comparable performance with state-of-the-art generative approaches for this setting using an easier-to-train framework. Second, we explore a different type of semi-supervised setting where a coarse level of labeling is available for all the observations but the model has to learn a fine, deeper level of latent annotations for each one. Problems in this setting are likely to be encountered in many domains such as text categorization, protein function prediction, and image classification, as well as in exploratory scientific studies such as medical and genomics research. We consider this setting as simultaneously performed supervised classification (per the available coarse labels) and unsupervised clustering (within each one of the coarse labels) and propose a novel framework combining GAR with ACOL, which enables the network to perform concurrent classification and clustering. We demonstrate how the coarse label supervision impacts performance and how the classification task actually helps propagate useful clustering information between sub-classes. Comparative tests on the most popular image datasets rigorously demonstrate the effectiveness and competitiveness of the proposed approach. The third and final setup builds on the prior framework to unlock fully unsupervised learning, where we propose to substitute real, yet unavailable, parent-class information with pseudo class labels. In this novel unsupervised clustering approach, the network can exploit hidden information indirectly introduced through a pseudo classification objective. We train an ACOL network through this pseudo supervision together with an unsupervised objective based on GAR, and ultimately obtain a k-means-friendly latent representation. Furthermore, we demonstrate how the chosen transformation type impacts performance and helps propagate the latent information that is useful in revealing unknown clusters.
Our results show state-of-the-art performance for unsupervised clustering tasks on MNIST, SVHN and USPS datasets with the highest accuracies reported to date in the literature.
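As a hedged illustration of what a "k-means-friendly" latent space buys, the sketch below simply runs k-means on learned representations; the data, cluster count, and variable names are placeholders, and this shows the evaluation idea only, not the GAR/ACOL training procedure itself.

```python
import numpy as np
from sklearn.cluster import KMeans

# Given latent representations Z (n_samples, dim) produced by a trained network,
# a "k-means-friendly" space is one where plain k-means recovers the hidden classes.
rng = np.random.default_rng(0)
Z = rng.normal(size=(1000, 32))            # stand-in for learned latent features
pred = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(Z)

# The cluster assignments `pred` can then be matched to ground-truth labels
# (e.g. via the Hungarian algorithm) to compute unsupervised clustering accuracy.
```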
APA, Harvard, Vancouver, ISO, and other styles
8

El-Shaer, Mennat Allah. "An Experimental Evaluation of Probabilistic Deep Networks for Real-time Traffic Scene Representation using Graphical Processing Units." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1546539166677894.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Marza, Pierre. "Learning spatial representations for single-task navigation and multi-task policies." Electronic Thesis or Diss., Lyon, INSA, 2024. http://www.theses.fr/2024ISAL0105.

Full text
Abstract:
Autonomously behaving in the 3D world requires a large set of skills, among which are perceiving the surrounding environment, representing it precisely and efficiently enough to keep track of the past, making decisions, and acting to achieve specified goals. Animals, for instance humans, stand out by their robustness when it comes to acting in the world. In particular, they can efficiently generalize to new environments, but are also able to rapidly master many tasks of interest from a few examples. This manuscript will study how artificial neural networks can be trained to acquire a subset of these abilities. We will first focus on training neural agents to perform semantic mapping, both from an augmented supervision signal and with proposed neural-based scene representations. Neural agents are often trained with Reinforcement Learning (RL) from a sparse reward signal.
Guiding the learning of scene-mapping abilities by augmenting the vanilla RL supervision signal with auxiliary spatial-reasoning tasks will help the agent navigate efficiently. Instead of modifying the training signal of neural agents, we will also see how incorporating specific neural-based representations of semantics and geometry within the architecture of the agent can help improve performance in goal-driven navigation. Then, we will study how to best explore a 3D environment in order to build neural representations of space that are as satisfying as possible based on robotics-oriented metrics we will propose. Finally, we will move from navigation-only to multi-task agents, and see how important it is to tailor visual features from sensor observations to the task at hand, both to perform a wide variety of tasks and to adapt to new unknown tasks from a few demonstrations. This manuscript will thus address different important questions such as: How to represent a 3D scene and keep track of previous experience in an environment? How to robustly adapt to new environments, scenarios, and potentially new tasks? How to train agents on long-horizon sequential tasks? How to jointly master all required sub-skills? What is the importance of perception in robotics?
APA, Harvard, Vancouver, ISO, and other styles
10

Terreau, Enzo. "Apprentissage de représentations d'auteurs et d'autrices à partir de modèles de langue pour l'analyse des dynamiques d'écriture." Electronic Thesis or Diss., Lyon 2, 2024. http://www.theses.fr/2024LYO20001.

Full text
Abstract:
The recent and massive democratization of digital tools has empowered individuals to generate and share information on the web through various means such as blogs, social networks, sharing platforms, and more.
The exponential growth of available information, mostly textual data, requires the development of Natural Language Processing (NLP) models to represent it mathematically and subsequently classify, sort, or recommend it. This is the essence of representation learning. It aims to construct a low-dimensional space where the distances between projected objects (words, texts) reflect real-world distances, whether semantic, stylistic, and so on. The proliferation of available data, coupled with the rise in computing power and deep learning, has led to the creation of highly effective language models for word and document embeddings. These models incorporate complex semantic and linguistic concepts while remaining accessible to everyone and easily adaptable to specific tasks or corpora. One can use them to create author embeddings. However, it is challenging to determine the aspects on which a model will focus to bring authors closer or move them apart. In a literary context, it is preferable for similarities to primarily relate to writing style, which raises several issues: the definition of literary style is vague, and assessing the stylistic difference between two texts and their embeddings is complex. In computational linguistics, approaches aiming to characterize it are mainly statistical, relying on language markers. In light of this, our first contribution is a framework to evaluate the ability of language models to grasp writing style. We will have previously elaborated on text embedding models in machine learning and deep learning, at the word, document, and author levels. We will also have presented the treatment of the notion of literary style in Natural Language Processing, which forms the basis of our method. Transferring knowledge between black-box large language models and these methods derived from linguistics remains a complex task. Our second contribution aims to reconcile these approaches through a representation learning model focusing on style, VADES (Variational Author and Document Embedding with Style). We compare our model to state-of-the-art ones and analyze their limitations in this context. Finally, we delve into dynamic author and document embeddings. Temporal information is crucial, allowing for a more fine-grained representation of writing dynamics. After presenting the state of the art, we elaborate on our last contribution, B²ADE (Brownian Bridge Author and Document Embedding), which models authors as trajectories. We conclude by outlining several leads for improving our methods and highlighting potential research directions for the future.
APA, Harvard, Vancouver, ISO, and other styles
11

McNeil, Patrick N. "Integrating Multiple Modalities into Deep Learning Networks." Thesis, Nova Southeastern University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10283417.

Full text
Abstract:
Deep learning networks in the literature traditionally used only a single input modality (or data stream). Integrating multiple modalities into deep learning networks with the goal of correlating extracted features was a major issue. Traditional methods involved treating each modality separately and then writing custom code to combine the extracted features. Current solutions for small numbers of modalities (three or fewer) showed there are multiple architectures for modality integration. With an increase in the number of modalities, the "curse of dimensionality" affects the performance of the system. The research showed that current methods for larger-scale integrations required separate, custom-created modules with another integration layer outside the deep learning network. These current solutions neither scale well nor provide good generalized performance. This research report studied architectures using multiple modalities and the creation of a scalable and efficient architecture.
APA, Harvard, Vancouver, ISO, and other styles
12

Li, Hao. "Towards Fast and Efficient Representation Learning." Thesis, University of Maryland, College Park, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10845690.

Full text
Abstract:
The success of deep learning and convolutional neural networks in many fields is accompanied by a significant increase in computation cost. With increasing model complexity and the pervasive usage of deep neural networks, there is a surge of interest in fast and efficient model training and inference on both cloud and embedded devices. Meanwhile, understanding the reasons for trainability and generalization is fundamental for its further development. This dissertation explores approaches for fast and efficient representation learning with a better understanding of trainability and generalization. In particular, we ask the following questions and provide our solutions: 1) How to reduce the computation cost for fast inference? 2) How to train low-precision models on resource-constrained devices? 3) What does the loss surface look like for neural nets, and how does it affect generalization?
To reduce the computation cost for fast inference, we propose to prune filters from CNNs that are identified as having a small effect on the prediction accuracy. By removing filters with small norms together with their connected feature maps, the computation cost can be reduced accordingly without using special software or hardware. We show that this simple filter pruning approach can reduce the inference cost while regaining close to the original accuracy by retraining the networks.
To further reduce the inference cost, quantizing model parameters with low-precision representations has shown significant speedup, especially for edge devices that have limited computing resources, memory capacity, and power consumption. To enable on-device learning on lower-power systems, removing the dependency on a full-precision model during training is the key challenge. We study various quantized training methods with the goal of understanding the differences in behavior and the reasons for success or failure. We address the issue of why algorithms that maintain floating-point representations work so well, while fully quantized training methods stall before training is complete. We show that training algorithms that exploit high-precision representations have an important greedy search phase that purely quantized training methods lack, which explains the difficulty of training using low-precision arithmetic.
Finally, we explore the structure of neural loss functions, and the effect of loss landscapes on generalization, using a range of visualization methods. We introduce a simple filter normalization method that helps us visualize loss function curvature and make meaningful side-by-side comparisons between loss functions. The sharpness of minimizers correlates well with generalization error when this visualization is used. Then, using a variety of visualizations, we explore how training hyper-parameters affect the shape of minimizers, and how network architecture affects the loss landscape.
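A minimal sketch of norm-based filter pruning in the spirit of the first contribution, assuming PyTorch; the L1 criterion, `keep_ratio`, and function name are illustrative choices rather than the dissertation's exact procedure.

```python
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float = 0.7) -> nn.Conv2d:
    """Keep the filters of a Conv2d layer with the largest L1 norms and
    return a smaller layer; the corresponding output feature maps (and the
    downstream layer's input channels) would also have to be removed."""
    weights = conv.weight.data                          # (out_ch, in_ch, k, k)
    norms = weights.abs().sum(dim=(1, 2, 3))            # L1 norm per filter
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep_idx = torch.argsort(norms, descending=True)[:n_keep]

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = weights[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned
```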
APA, Harvard, Vancouver, ISO, and other styles
13

Nguyen, Thien Huu. "Deep Learning for Information Extraction." Thesis, New York University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10260911.

Full text
Abstract:
The explosion of data has made it crucial to analyze the data and distill important information effectively and efficiently. A significant part of such data is presented in unstructured and free-text documents. This has prompted the development of techniques for information extraction that allow computers to automatically extract structured information from natural free-text data. Information extraction is a branch of natural language processing in artificial intelligence that has a wide range of applications, including question answering, knowledge base population, information retrieval, etc. The traditional approach for information extraction has mainly involved hand-designing large feature sets (feature engineering) for different information extraction problems, i.e., entity mention detection, relation extraction, coreference resolution, event extraction, and entity linking. This approach is limited by the laborious and expensive effort required for feature engineering across different domains, and suffers from the unseen word/feature problem of natural languages.
This dissertation explores a different approach for information extraction that uses deep learning to automate the representation learning process and generate more effective features. Deep learning is a subfield of machine learning that uses multiple layers of connections to reveal the underlying representations of data. I develop fundamental deep learning models for information extraction problems and demonstrate their benefits through systematic experiments.
First, I examine word embeddings, a general word representation that is produced by training a deep learning model on a large unlabelled dataset. I introduce methods to use word embeddings to obtain new features that generalize well across domains for relation extraction. This is done for both the feature-based method and the kernel-based method of relation extraction.
Second, I investigate deep learning models for different problems, including entity mention detection, relation extraction and event detection. I develop new mechanisms and network architectures that allow deep learning to model the structures of information extraction problems more effectively. Extensive experiments are conducted on domain adaptation and transfer learning settings to highlight the generalization advantage of the deep learning models for information extraction.
Finally, I investigate joint frameworks to simultaneously solve several information extraction problems and benefit from the inter-dependencies among these problems. I design a novel memory-augmented network for deep learning to properly exploit such inter-dependencies. I demonstrate the effectiveness of this network on two important problems of information extraction, i.e., event extraction and entity linking.
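As a hedged illustration of turning pretrained word embeddings into relation-extraction features, the toy sketch below averages the embeddings of the words between two entity mentions; the tiny embedding table, tokens, and helper name are hypothetical, not the dissertation's feature set.

```python
import numpy as np

# Hypothetical pretrained word-embedding lookup (word -> vector); in practice
# these could be word2vec or GloVe vectors trained on a large unlabelled corpus.
dim = 3
emb = {"acquired": np.array([0.3, -0.1, 0.7]),
       "by":       np.array([0.1,  0.2, 0.0])}

def embedding_features(tokens, head_idx, tail_idx):
    """Average the embeddings of the words between two entity mentions,
    giving a dense feature vector that generalizes across domains better
    than sparse lexical features."""
    between = tokens[head_idx + 1:tail_idx]
    vecs = [emb.get(w, np.zeros(dim)) for w in between]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

tokens = ["Acme", "acquired", "by", "Globex"]
features = embedding_features(tokens, head_idx=0, tail_idx=3)
```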
APA, Harvard, Vancouver, ISO, and other styles
14

Sun, Haozhe. "Modularity in deep learning." Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG090.

Full text
Abstract:
This Ph.D. thesis is dedicated to enhancing the efficiency of Deep Learning by leveraging the principle of modularity. It contains several main contributions: a literature survey on modularity in Deep Learning; the introduction of OmniPrint and Meta-Album, tools that facilitate the investigation of data modularity; case studies examining the effects of episodic few-shot learning, an instance of data modularity; a modular evaluation mechanism named LTU for assessing privacy risks; and the method RRR for reusing pre-trained modular models to create more compact versions. Modularity, which involves decomposing an entity into sub-entities, is a prevalent concept across various disciplines. This thesis examines modularity across three axes of Deep Learning: data, task, and model. OmniPrint and Meta-Album assist in benchmarking modular models and exploring the impacts of data modularity. LTU ensures the reliability of the privacy assessment. RRR significantly enhances the utilization efficiency of pre-trained modular models. Collectively, this thesis bridges the modularity principle with Deep Learning and underscores its advantages in selected fields of Deep Learning, contributing to more resource-efficient Artificial Intelligence.
APA, Harvard, Vancouver, ISO, and other styles
15

Tavanaei, Amirhossein. "Spiking Neural Networks and Sparse Deep Learning." Thesis, University of Louisiana at Lafayette, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=10807940.

Full text
Abstract:
This document proposes new methods for training multi-layer and deep spiking neural networks (SNNs), specifically, spiking convolutional neural networks (CNNs). Training a multi-layer spiking network poses difficulties because the output spikes do not have derivatives and the commonly used backpropagation method for non-spiking networks is not easily applied. Our methods use novel versions of the brain-like, local learning rule named spike-timing-dependent plasticity (STDP) that incorporate supervised and unsupervised components. Our method starts with conventional learning methods and converts them to spatio-temporally local rules suited for SNNs.
The training uses two components for unsupervised feature extraction and supervised classification. The first component refers to new STDP rules for spike-based representation learning that train convolutional filters and initial representations. The second introduces new STDP-based supervised learning rules for spike pattern classification via an approximation to gradient descent obtained by combining the STDP and anti-STDP rules. Specifically, the STDP-based supervised learning model approximates gradient descent by using temporally local STDP rules. Stacking these components implements a novel sparse, spiking deep learning model. Our spiking deep learning model is categorized as a variation of spiking CNNs of integrate-and-fire (IF) neurons with performance comparable to state-of-the-art deep SNNs. The experimental results show the success of the proposed model for image classification. Our network architecture is the only spiking CNN which provides bio-inspired STDP rules in a hierarchy of feature extraction and classification in an entirely spike-based framework.
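A minimal sketch of the classic pairwise STDP window that such rules build on, with illustrative constants; the thesis' supervised and anti-STDP variants differ in detail.

```python
import numpy as np

def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012,
                 tau_plus=20.0, tau_minus=20.0):
    """Pairwise STDP window: potentiate when the presynaptic spike
    precedes the postsynaptic one (dt > 0), depress otherwise.
    Times are in milliseconds; constants are illustrative."""
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * np.exp(-dt / tau_plus)      # long-term potentiation
    return -a_minus * np.exp(dt / tau_minus)        # long-term depression

# Example: a pre-spike 5 ms before the post-spike strengthens the synapse.
print(stdp_delta_w(t_pre=10.0, t_post=15.0))
```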
APA, Harvard, Vancouver, ISO, and other styles
16

Zhang, Shanshan. "Deep Learning for Unstructured Data by Leveraging Domain Knowledge." Diss., Temple University Libraries, 2019. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/580099.

Full text
Abstract:
Unstructured data such as texts, strings, images, audio, and video are everywhere due to social interaction on the Internet and high-throughput technologies in the sciences, e.g., chemistry and biology. However, for traditional machine learning algorithms, classifying a text document is far more difficult than classifying a data entry in a spreadsheet. We have to convert the unstructured data into numeric vectors which can then be understood by machine learning algorithms. For example, a sentence is first converted to a vector of word counts, and then fed into a classification algorithm such as logistic regression or support vector machines. The creation of such numerical vectors is very challenging and difficult. Recent progress in deep learning provides us a new way to jointly learn features and train classifiers for unstructured data. For example, recurrent neural networks proved successful at learning from sequences of word indices; convolutional neural networks are effective at learning from videos, which are sequences of pixel matrices. Our research focuses on developing novel deep learning approaches for text and graph data. Breakthroughs using deep learning have been made during the last few years for many core tasks in natural language processing, such as machine translation, POS tagging, named entity recognition, etc. However, when it comes to informal and noisy text data, such as tweets, HTML, and OCR output, there are two major issues with modern deep learning technologies. First, deep learning requires a large amount of labeled data to train an effective model; second, neural network architectures that work with natural language are not well suited to informal text. In this thesis, we address these two important issues and develop new deep learning approaches for four supervised and unsupervised tasks with noisy text. We first present a deep feature engineering approach for informative tweet discovery during emerging disasters. We propose to use unlabeled microblogs to cluster words into a limited number of clusters and use the word clusters as features for tweet discovery. Our results indicate that when the number of labeled tweets is 100 or less, the proposed approach is superior to standard classification based on the bag-of-words feature representation. We then introduce a human-in-the-loop (HIL) framework for entity identification from noisy web text. Our work explores ways to combine the expressive power of regular expressions (REs) and the ability of deep learning to learn from large data into a new integrated framework for entity identification from web data. The evaluation on several entity identification problems shows that the proposed framework achieves very high accuracy while requiring only modest human involvement. We further extend the framework of entity identification to an iterative HIL framework that addresses the entity recognition problem. We particularly investigate how humans invest their time when a user is allowed to choose between regex construction and manual labeling. Finally, we address a fundamental problem in the text mining domain, i.e., embedding of rare and out-of-vocabulary (OOV) words, by refining word embedding models and character embedding models in an iterative way. We illustrate the simplicity but effectiveness of our method when applying it to online professional profiles allowing noisy user input.
Graph neural networks have shown great success in the domains of drug design and materials science, where organic molecules and crystal structures of materials are represented as attributed graphs. A deep learning architecture that is capable of learning from graph nodes and graph edges is crucial for property estimation of molecules. In this dissertation, we propose a simple graph representation for molecules and three neural network architectures that are able to directly learn predictive functions from graphs. We discover that graph networks are indeed superior to feature-driven algorithms for formation energy prediction; however, this superiority is not reproduced on band gap prediction. We also discovered that our proposed simple shallow neural networks perform comparably with state-of-the-art deep neural networks.
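As a hedged sketch of an attributed molecular graph and one generic message-passing (GCN-style) step, with made-up node features and a random weight matrix; this is not the dissertation's specific architectures.

```python
import numpy as np

# A molecule as an attributed graph: node features X (atoms) and adjacency A (bonds).
X = np.array([[6, 4], [8, 2], [1, 1]], dtype=float)   # e.g. [atomic number, valence]
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)                # bond structure

def gcn_layer(X, A, W):
    """One graph-convolution step: aggregate neighbour (and self) features,
    normalise by node degree, then apply a learned linear map with ReLU."""
    A_hat = A + np.eye(A.shape[0])                    # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    H = (A_hat @ X) / deg                             # mean aggregation
    return np.maximum(H @ W, 0.0)                     # ReLU

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))
node_embeddings = gcn_layer(X, A, W)
graph_embedding = node_embeddings.mean(axis=0)        # pool to a molecule-level vector
```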
APA, Harvard, Vancouver, ISO, and other styles
17

He, Fengxiang. "Theoretical Deep Learning." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/25674.

Full text
Abstract:
Deep learning has long been criticised as a black-box model for lacking sound theoretical explanation. During the PhD course, I explore and establish theoretical foundations for deep learning. In this thesis, I present my contributions positioned upon existing literature: (1) analysing the generalizability of neural networks with residual connections via capacity-based hypothesis complexity measures; (2) modeling stochastic gradient descent (SGD) by stochastic differential equations (SDEs) and their dynamics, and further characterizing the generalizability of deep learning; (3) understanding the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems, which sheds light on reconciling the over-representation and excellent generalizability of deep learning; and (4) discovering the interplay between generalization, privacy preservation, and adversarial robustness, which have seen rising concerns in deep learning deployment.
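As a hedged illustration of point (2), a widely used continuous-time model treats SGD with learning rate $\eta$ and gradient-noise covariance $\Sigma(\theta)$ as the Itô SDE below; this is the standard approximation from the literature, not necessarily the exact formulation adopted in the thesis.

```latex
% SGD iterate: \theta_{k+1} = \theta_k - \eta\,\hat{\nabla} L(\theta_k),
% where \hat{\nabla} L is a noisy minibatch gradient. A standard SDE approximation is
\[
  \mathrm{d}\theta_t \;=\; -\nabla L(\theta_t)\,\mathrm{d}t
  \;+\; \sqrt{\eta}\,\Sigma(\theta_t)^{1/2}\,\mathrm{d}W_t ,
\]
% with W_t a standard Wiener process; the dynamics of this process are then
% used to characterize generalization.
```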
APA, Harvard, Vancouver, ISO, and other styles
18

Prabhakar, Nachiketh. "Deep Learning To Improve Hi-C Data." Case Western Reserve University School of Graduate Studies / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=case1562582435975614.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Kapkic, Ahmet. "Empirical Study of Pedestrian Detection using Deep Learning." Youngstown State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1620400901834886.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Benedetti, Riccardo. "From Artificial Intelligence to Artificial Art: Deep Learning with Generative Adversarial Networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18167/.

Full text
Abstract:
Neural networks have had a great impact on Artificial Intelligence, and nowadays Deep Learning algorithms are widely used to extract knowledge from huge amounts of data. This thesis aims to revisit the evolution of Deep Learning from its origins to the current state of the art by focusing on a particular perspective. The main question we try to answer is: can AI exhibit artistic abilities comparable to human ones? Recalling the definition of the Turing Test, we propose a similar formulation of the concept: we would like to test the machine's ability to exhibit artistic behaviour equivalent to, or indistinguishable from, that of a human. The argument we analyze in support of this debate is an interesting and innovative idea coming from the field of Deep Learning, known as the Generative Adversarial Network (GAN). A GAN is basically a system composed of two neural networks fighting each other in a zero-sum game. The 'bullets' fired during this challenge are simply images generated by one of the two networks. The interesting part of this scenario is that, with a proper system design and training, after several iterations these generated images become closer and closer to the ones we see in reality, making it hard to distinguish what is real from what is not. We discuss some real anecdotes around GANs to further enliven the debate raised by the question posed above, and we present some recent real-world applications based on GANs to emphasize their importance in business terms as well. We conclude with a practical experiment on an Amazon catalogue of clothing images and reviews, with the aim of generating new, never-seen products starting from the most popular existing ones.
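For reference, the zero-sum game mentioned above is usually formalized by the minimax objective introduced by Goodfellow et al. (2014), with generator G and discriminator D:

```latex
\[
  \min_{G}\,\max_{D}\; V(D, G) \;=\;
  \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
  \;+\;
  \mathbb{E}_{z \sim p_{z}}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
\]
```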
APA, Harvard, Vancouver, ISO, and other styles
21

Shi, Shaohuai. "Communication optimizations for distributed deep learning." HKBU Institutional Repository, 2020. https://repository.hkbu.edu.hk/etd_oa/813.

Full text
Abstract:
With the increasing amount of data and the growing computing power, deep learning techniques using deep neural networks (DNNs) have been successfully applied in many practical artificial intelligence applications. The mini-batch stochastic gradient descent (SGD) algorithm and its variants are the most widely used algorithms for training deep models. SGD is an iterative algorithm that needs to update the model parameters many times by traversing the training data, which is very time-consuming even on a single powerful GPU or TPU. Therefore, it has become common practice to exploit multiple processors (e.g., GPUs or TPUs) to accelerate training using distributed SGD. However, the iterative nature of distributed SGD requires the processors to communicate with each other repeatedly to collaboratively update the model parameters. The intensive communication cost easily becomes the system bottleneck and limits system scalability. In this thesis, we study communication-efficient techniques for distributed SGD to improve system scalability and thus accelerate training. We identify the performance issues in distributed SGD through benchmarking and modeling, and then propose several communication optimization algorithms to address them. First, we build a performance model with a directed acyclic graph (DAG) to model the training process of distributed SGD and verify the model with extensive benchmarks on existing state-of-the-art deep learning frameworks, including Caffe, MXNet, TensorFlow, and CNTK. Our benchmarking and modeling show that existing optimizations for the communication problems are sub-optimal, which we address in this thesis. Second, to address the startup problem (due to the high latency of each communication) of layer-wise communications with wait-free backpropagation (WFBP), we propose an optimal gradient merging solution for WFBP, named MG-WFBP, which exploits the layer-wise structure to overlap communication tasks with computing tasks and adapts to the training environment. Experiments conducted on dense-GPU clusters with Ethernet and InfiniBand show that MG-WFBP addresses the startup problem in distributed training of layer-wise structured DNNs. Third, to make compute-intensive training feasible on GPU clusters with low-bandwidth interconnects, we investigate gradient compression techniques for distributed training. Top-k sparsification can compress the communication traffic with little impact on model convergence, but it suffers from communication complexity that is linear in the number of workers, so it cannot scale well to large clusters. To address this problem, we propose a global top-k (gTop-k) sparsification algorithm that reduces the communication complexity to be logarithmic in the number of workers. We also provide a detailed theoretical analysis of the gTop-k SGD training algorithm, and the results show that gTop-k SGD has the same order of convergence rate as SGD. Experiments on clusters of up to 64 GPUs verify that gTop-k SGD significantly improves system scalability with only a slight impact on model convergence.
Lastly, to enjoy the benefits of both the pipelining technique and gradient sparsification, we propose a new distributed training algorithm, layer-wise adaptive gradient sparsification SGD (LAGS-SGD), which supports layer-wise sparsification and communication, and we theoretically and empirically prove that LAGS-SGD preserves the convergence properties. To further alleviate the impact of the startup problem of layer-wise communications in LAGS-SGD, we also propose an optimal gradient merging solution for LAGS-SGD, named OMGS-SGD, and theoretically prove its optimality. The experimental results on a 16-node GPU cluster connected with 1 Gbps Ethernet show that OMGS-SGD consistently improves system scalability without affecting the model convergence properties.
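The local compression step underlying these sparsification methods can be sketched as follows (PyTorch; a generic top-k selection with error feedback, not the full MG-WFBP/gTop-k communication scheduling, which is what the thesis actually contributes).

import torch

def topk_sparsify(grad, ratio=0.001):
    """Keep only the largest-magnitude gradient entries for communication.

    Returns the selected (values, indices) to be exchanged between workers and
    the local residual, which is accumulated and added back before the next
    compression step (error feedback). gTop-k additionally merges the workers'
    local selections into a global top-k in a logarithmic number of rounds.
    """
    flat = grad.flatten()
    k = max(1, int(flat.numel() * ratio))
    _, idx = torch.topk(flat.abs(), k)
    values = flat[idx]
    residual = flat.clone()
    residual[idx] = 0.0            # zero out what was sent; keep the rest locally
    return values, idx, residual.view_as(grad)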
APA, Harvard, Vancouver, ISO, and other styles
22

Sarpangala, Kishan. "Semantic Segmentation Using Deep Learning Neural Architectures." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin157106185092304.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Aboul-Enien, Hisham Abdel-Ghaffer. "Neural network learning and knowledge representation in a multi-agent system." Thesis, Imperial College London, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.252040.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Lander, Sean. "An evolutionary method for training autoencoders for deep learning networks." Thesis, University of Missouri - Columbia, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10180878.

Full text
Abstract:
Introduced in 2006, Deep Learning has made large strides in both supervised and unsupervised learning. The abilities of Deep Learning have been shown to beat both generic and highly specialized classification and clustering techniques with little change to the underlying concept of a multi-layer perceptron. Though this has caused a resurgence of interest in neural networks, many of the drawbacks and pitfalls of such systems have yet to be addressed after nearly 30 years: speed of training, local minima and manual tuning of hyper-parameters. In this thesis we propose using an evolutionary technique to work toward solving these issues and increase the overall quality and abilities of Deep Learning networks. In the evolution of a population of autoencoders for input reconstruction, we abstract multiple features for each autoencoder in the form of hidden nodes, score the autoencoders based on their ability to reconstruct their input, and finally select autoencoders for crossover and mutation with hidden nodes as the chromosome. In this way we are able not only to quickly find optimal abstracted feature sets but also to optimize the structure of the autoencoder to match the features being selected. This also allows us to experiment with different training methods with respect to data partitioning and selection, drastically reducing overall training time for large and complex datasets. The proposed method allows even large datasets to be trained quickly and efficiently with little manual parameter choice required by the user, leading to faster, more accurate creation of Deep Learning networks.
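The selection-crossover-mutation loop described above might look roughly like the following sketch (NumPy; single-hidden-layer autoencoders with reconstruction error as fitness; the population size, mutation scale and crossover scheme are illustrative assumptions, not the thesis' settings).

import numpy as np

rng = np.random.default_rng(0)

def fitness(w_in, w_out, X):
    # Negative reconstruction error of a tanh-encoder / linear-decoder autoencoder.
    H = np.tanh(X @ w_in)
    return -np.mean((X - H @ w_out) ** 2)

def evolve(X, pop_size=20, hidden=32, generations=50):
    d = X.shape[1]
    pop = [(rng.normal(0, 0.1, (d, hidden)), rng.normal(0, 0.1, (hidden, d)))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(*w, X), reverse=True)
        parents = pop[: pop_size // 2]                    # keep the best reconstructors
        children = []
        while len(children) < pop_size - len(parents):
            i, j = rng.choice(len(parents), size=2, replace=False)
            (a_in, a_out), (b_in, b_out) = parents[i], parents[j]
            mask = rng.random(hidden) < 0.5               # hidden nodes are the chromosome
            c_in = np.where(mask, a_in, b_in)             # pick each encoder column from a parent
            c_out = np.where(mask[:, None], a_out, b_out)
            c_in = c_in + rng.normal(0, 0.01, c_in.shape)     # mutation
            c_out = c_out + rng.normal(0, 0.01, c_out.shape)
            children.append((c_in, c_out))
        pop = parents + children
    return max(pop, key=lambda w: fitness(*w, X))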
APA, Harvard, Vancouver, ISO, and other styles
25

Bhagat, Kunj H. "Automatic Snooker-Playing Robot with Speech Recognition Using Deep Learning." Thesis, California State University, Long Beach, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10977867.

Full text
Abstract:
Research on natural language processing and on image and speech recognition is rapidly shifting its focus from statistical methods to neural networks. In this study, we combine speech recognition with computer vision to allow a robot to play snooker completely by itself. The color of the ball to be pocketed is provided as audio input through a device such as a microphone. The system recognizes the color from the input using a trained deep learning network and then commands the camera to locate the ball of the identified color on the snooker table using computer vision. To pocket the target ball, the system predicts the best shot using an algorithm. The accuracy of this process depends on the efficiency of the trained deep learning model.
APA, Harvard, Vancouver, ISO, and other styles
26

Nelsson, Mikael. "Deep learning for medical report texts." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-356140.

Full text
Abstract:
Data within the medical sector is often stored as free-text entries. This is especially true for report texts, which are written after an examination. To automatically gather data from these texts, they need to be analyzed and classified to show what findings the examinations had. This thesis compares three state-of-the-art deep learning approaches to classifying short medical report texts. This is done for two types of examinations, so the concept of transfer learning plays a role in the evaluation. An optimal model should learn concepts that are applicable to more than one type of examination, since we can expect the texts to be similar. The two data sets from the examinations are also of different sizes, and both have an uneven distribution among the target classes. One of the models is based on techniques traditionally used for language processing with deep learning. The other two models are based on techniques usually used for image recognition and classification. The latter models prove to be the best across the different metrics, not least in the sense of transfer learning, as they improve the results when learning from both types of examinations. This becomes especially apparent for the least frequent class in the smaller data set, as none of the models correctly predicts this class without transfer learning.
APA, Harvard, Vancouver, ISO, and other styles
27

Abrishami, Hedayat. "Deep Learning Based Electrocardiogram Delineation." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1563525992210273.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Franceschelli, Giorgio. "Generative Deep Learning and Creativity." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Find full text
Abstract:
"It has no pretensions to originate anything; it can only do whatever we know how to order it to perform." This is how, more than 150 years ago, Lady Lovelace commented on Babbage's Analytical Engine, the ancestor of our computers. After so many years, the sentence sounds almost like a challenge: thanks to the spread of Generative Deep Learning techniques and to research in Computational Creativity, ever more effort has been devoted to refuting the now-famous Lovelace Objection. Starting from it, four questions form the cornerstones of Computational Creativity: whether computational techniques can be exploited to understand human creativity; and, above all, whether computers can do things that seem creative (if not things that actually are creative), and whether they can learn to recognise creativity. This thesis places itself in that context, exploring the last two questions with Deep Learning techniques. In particular, building on the definition of creativity proposed by Margaret Boden, a metric given by the weighted sum of three individual components (value, novelty and surprise) is presented for the recognition of creativity. In addition, exploiting this measure, UCAN (Unexpectedly Creative Adversarial Network) is also presented, a creativity-oriented generative model that learns to produce creative works by maximising the metric above. Both the generator and the metric were tested on nineteenth-century American poetry; the results show that the metric is indeed able to capture the historical trajectory and can represent an important step forward for the study of Computational Creativity, while the generator, though not achieving equally strong results, stands as a starting point for the future definition of a genuinely creative model.
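The weighted-sum metric described above is simple enough to state directly. The sketch below (Python) assumes the three components have already been produced by separate estimators, and the weights are placeholders rather than the thesis' actual values.

def creativity_score(value, novelty, surprise, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three Boden-inspired components.

    value, novelty, surprise: scores in [0, 1] produced by separate estimators
    (e.g., a learned value model, distance from the training corpus, model
    surprisal); the weights here are illustrative placeholders.
    """
    w_v, w_n, w_s = weights
    return w_v * value + w_n * novelty + w_s * surprise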
APA, Harvard, Vancouver, ISO, and other styles
29

Morri, Francesco. "A thermodynamic approach to deep learning." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Find full text
Abstract:
Neural networks are an incredibly powerful tool used to solve complex problems. However, the actual functioning of this tool and its behaviour when applied to different kinds of problems are not completely explained. In this work we study the behaviour of a neural network used to classify images through a physical model based on statistical thermodynamics. We found interesting results regarding the temperature of the different components of the network, which may be exploited in a more efficient training algorithm.
APA, Harvard, Vancouver, ISO, and other styles
30

Chen, Jianan. "Deep Learning Based Multimodal Retrieval." Electronic Thesis or Diss., Rennes, INSA, 2023. http://www.theses.fr/2023ISAR0019.

Full text
Abstract:
Multimodal tasks play a crucial role in the progression towards achieving general artificial intelligence (AI). The primary goal of multimodal retrieval is to employ machine learning algorithms to extract relevant semantic information, bridging the gap between different modalities such as visual images, linguistic text, and other data sources. It is worth noting that the information entropy associated with heterogeneous data for the same high-level semantics varies significantly, posing a significant challenge for multimodal models. Deep learning-based multimodal network models provide an effective solution to tackle the difficulties arising from substantial differences in information entropy. These models exhibit impressive accuracy and stability in large-scale cross-modal information matching tasks, such as image-text retrieval. Furthermore, they demonstrate strong transfer learning capabilities, enabling a well-trained model from one multimodal task to be fine-tuned and applied to a new multimodal task, even in scenarios involving few-shot or zero-shot learning. In our research, we develop a novel generative multimodal multi-view database specifically designed for the multimodal referential segmentation task. Additionally, we establish a state-of-the-art (SOTA) benchmark and multi-view metric for referring expression segmentation models in the multimodal domain. The results of our comparative experiments are presented visually, providing clear and comprehensive insights.
APA, Harvard, Vancouver, ISO, and other styles
31

Abd, Gaus Yona Falinie. "Artificial intelligence system for continuous affect estimation from naturalistic human expressions." Thesis, Brunel University, 2018. http://bura.brunel.ac.uk/handle/2438/16348.

Full text
Abstract:
The analysis and automatic estimation of affect from human expressions has been acknowledged as an active research topic in the computer vision community. Most reported affect recognition systems, however, only consider subjects performing well-defined acted expressions under very controlled conditions, so they are not robust enough for real-life recognition tasks with subject variation, acoustic surroundings and illumination changes. In this thesis, an artificial intelligence system is proposed to continuously (represented along a continuum, e.g., from -1 to +1) estimate affective behaviour in terms of latent dimensions (e.g., arousal and valence) from naturalistic human expressions. To tackle these issues, feature representation and machine learning strategies are addressed. In feature representation, human expression is represented by modalities such as audio, video, physiological signals and text. Hand-crafted features are extracted from each modality per frame in order to match the consecutive affect labels. However, the extracted features may be missing information due to factors such as background noise or lighting conditions. The Haar Wavelet Transform is employed to determine whether a noise cancellation mechanism in feature space should be considered in the design of the affect estimation system. Beyond hand-crafted features, deep learning features are also analysed layer-wise, at the convolutional and fully connected layers. Convolutional Neural Networks such as AlexNet, VGGFace and ResNet have been selected as deep learning architectures for feature extraction on facial expression images. A multimodal fusion scheme is then applied by fusing deep learning and hand-crafted features together to improve performance. In machine learning strategies, a two-stage regression approach is introduced. In the first stage, baseline regression methods such as Support Vector Regression are applied to estimate each affect dimension per frame. In the second stage, a subsequent model such as a Time Delay Neural Network, Long Short-Term Memory network or Kalman Filter is proposed to model the temporal relationships between consecutive estimates of each affect dimension. In doing so, the temporal information employed by the subsequent model is not biased by the high variability present in consecutive frames and, at the same time, the network can exploit the slowly changing emotional dynamics more efficiently. Following the two-stage regression approach for unimodal affect analysis, the fusion of information from different modalities is elaborated. Continuous emotion recognition in the wild is leveraged by investigating mathematical modelling for each emotion dimension. Linear Regression, Exponent Weighted Decision Fusion and Multi-Gene Genetic Programming are implemented to quantify the relationship between the modalities. In summary, the research work presented in this thesis reveals a fundamental approach to automatically and continuously estimate affect values from naturalistic human expressions. The proposed system, which consists of feature smoothing, deep learning features, a two-stage regression framework and fusion using mathematical equations between modalities, is demonstrated. It offers a strong basis towards the development of artificial intelligence systems for continuous affect estimation and, more broadly, towards building real-time emotion recognition systems for human-computer interaction.
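The two-stage idea can be illustrated with a small sketch (scikit-learn/NumPy), where frame-wise Support Vector Regression is followed by a simple moving-average smoother standing in for the second-stage temporal model (TDNN, LSTM or Kalman filtering in the thesis); feature shapes and hyper-parameters are illustrative.

import numpy as np
from sklearn.svm import SVR

def two_stage_affect(frame_features, labels, window=25):
    """Stage 1: frame-wise SVR on per-frame features.
    Stage 2: temporal smoothing of the raw per-frame estimates.

    frame_features: (n_frames, n_dims) hand-crafted or deep features
    labels:         (n_frames,) continuous arousal or valence annotations
    """
    svr = SVR(kernel="rbf", C=1.0).fit(frame_features, labels)
    raw = svr.predict(frame_features)
    kernel = np.ones(window) / window      # stand-in for a TDNN/LSTM/Kalman second stage
    return np.convolve(raw, kernel, mode="same")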
APA, Harvard, Vancouver, ISO, and other styles
32

Chaudhuri, Sumon. "Three essays on how deep learning can revolutionize coopetition contexts." Electronic Thesis or Diss., Cergy-Pontoise, Ecole supérieure des sciences économiques et commerciales, 2024. http://www.theses.fr/2024ESEC0003.

Full text
Abstract:
Deep learning has been used to find novel solutions to long-standing problems in social science research. It not only allows us to analyze multiple modalities of data (i.e., text, audio, video, images) but also to uncover new techniques to improve upon existing modeling approaches. In my thesis, I look at how deep learning can provide new solutions to problems in the context of coopetition. Coopetition refers to a setting where multiple agents can simultaneously compete and collaborate with each other. Naturally, the biggest question in this area is how to find a balance between the opposing forces of collaboration and competition. In my thesis, I use deep learning to tackle this problem at both the consumer level and the firm level. Each of my chapters considers a different coopetition context. Chapter 1 looks at negotiations between firms and their customers, Chapter 2 is on coopetition between multiple firms on the basis of customer data, and Chapter 3 focuses on unstructured bargaining games with asymmetric information between two agents.
APA, Harvard, Vancouver, ISO, and other styles
33

Li, Chang. "Sequence learning using deep neural networks with flexibility and interpretability." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/25390.

Full text
Abstract:
Throughout this thesis, I investigate two long-standing yet rarely explored sequence learning challenges under the Probabilistic Graphical Models (PGMs) framework: learning multi-timescale representations on a single sequence and learning higher-order dynamics between multiple sequences. The first challenge is tackled with Hidden Markov Models (HMMs), a type of directed PGM, under the reinforcement learning framework. I prove that the option framework formulated as a Semi-Markov Decision Problem (SMDP) [Sutton et al., 1999, Bacon et al., 2017, Zhang and Whiteson, 2019], one of the most promising Hierarchical Reinforcement Learning (HRL) frameworks, has a Markov Decision Problem (MDP) equivalence. Based on this equivalence, a simple yet effective Skill-Action (SA) architecture is proposed. Our empirical studies on challenging robot simulation environments demonstrate that SA significantly outperforms all baselines in both infinite-horizon and transfer learning environments. Because of its exceptional scalability, SA gives rise to a large-scale pre-training architecture for reinforcement learning. The second challenge is tackled with Markov Random Fields (MRFs), also known as undirected PGMs, under the supervised learning framework. I employ binary MRFs with weighted Lower Linear Envelope Potentials (LLEPs) to capture higher-order dependencies, and propose an exact inference algorithm under the graph-cuts framework and an efficient learning algorithm under the Latent Structural Support Vector Machines (LSSVMs) framework. In order to learn higher-order latent dynamics on time series, we layer multi-task recurrent neural networks (RNNs) on top of the MRFs, with a sub-gradient algorithm employed for end-to-end training. We conduct thorough empirical studies on three popular Chinese stock market indexes, and the proposed method outperforms all baselines. To the best of our knowledge, the proposed technique is the first to investigate higher-order dynamics between stocks.
APA, Harvard, Vancouver, ISO, and other styles
34

Wong, Alison. "Artificial Intelligence for Astronomical Imaging." Thesis, The University of Sydney, 2023. https://hdl.handle.net/2123/30068.

Full text
Abstract:
Astronomy is the ultimate observational science. Objects outside our solar system are beyond our reach, so we are limited to acquiring knowledge at a distance. This motivates the need to advance astrophysical imaging technologies, particularly for the field of high contrast imaging, where some of the most highly prized science goals require high-fidelity imagery of exoplanets and of the circumstellar structures associated with stellar and planetary birth. Such technical capabilities address questions of both the birth and death of stars, which in turn inform the grand recycling of matter in the chemical evolution of the galaxy and the universe itself. Ground-based astronomical observation primarily relies on extreme adaptive optics systems to extract signals arising from faint structures within the immediate vicinity of luminous host stars. These systems are distinguished from standard adaptive optics systems by performing faster and more precise wavefront correction, which leads to better imaging performance. The overall theme of this thesis therefore ties together advanced topics in artificial intelligence with the techniques and technologies required for high contrast imaging. This is accomplished with demonstrations of deep learning methods used to improve the performance of extreme adaptive optics systems, deployed and benchmarked with data obtained at the Subaru Coronagraphic Extreme Adaptive Optics (SCExAO) system operating at the observatory on the summit of Mauna Kea in Hawaii. Solutions encompass both hardware and software, with optimal recovery of scientific outcomes delivered by model fitting of high contrast imaging data with modern machine learning techniques. This broad-ranging study, spanning the acquisition, analysis and modelling of data, aims to yield more accurate and higher-fidelity observables, which in turn deliver improved interpretation and scientific return.
APA, Harvard, Vancouver, ISO, and other styles
35

Jai, Mansouri Nabil. "Deep learning approach for ultrasound signals processing." Electronic Thesis or Diss., Université de Montpellier (2022-....), 2023. http://www.theses.fr/2023UMONS058.

Full text
Abstract:
Ultrasonic imaging is a non-destructive testing method used for many applications in the medical and industrial fields. This method has a number of advantages over other non-destructive testing methods. At the same time, deep learning has attracted the attention of researchers in many fields. In particular, in the field of ultrasound imaging, several studies based on deep learning algorithms have focused on the processing of ultrasound waves and ultrasound signals, which have subsequently been used to perform measurements or to form acoustic images. The main objective of this thesis is to improve ultrasound signal processing methods using deep learning algorithms. A review of the literature has shown that most studies aim to take precise measurements, which are often exploited to synthesize ultrasound images. To this end, this thesis focused on the accurate estimation of the Time-of-Flight (ToF) of ultrasound waves using artificial neural networks. The results showed considerable improvements in the accuracy of ToF estimation, with the proposed method outperforming the standard signal processing method on 99% of the samples. The remaining samples were analyzed and showed statistical bias. For this reason, a classification deep learning model was proposed to distinguish these data distributions; it achieved an accuracy of 99.7%, allowing the appropriate method to be used depending on the signal distribution. This distinction led to an improvement in the accuracy of the ToF estimate, which directly improves the quality of the measurements. The main limitation at this stage was the presence of noise, which is commonly encountered in real ultrasound signals and degrades the information contained in them as a function of its intensity. For this reason, this thesis treated noise separately using a data-centric approach. To this end, it proposes a denoising deep learning method employing convolutional autoencoders with attention skip connections. The noise reduction of this method was compared with that of signal processing methods, machine learning-based methods and other deep learning methods, looking at the improvement in signal-to-noise ratio (SNR) and Pearson's coefficient (P'r) before and after noise reduction. The proposed method showed a considerable improvement in SNR, reaching 30 dB on very noisy signals, and it maintained the P'r very close to 1 even on signals with a low initial SNR. The same architecture was also trained to deconvolve ultrasound signals. Theoretically, the axial resolution of ultrasound imaging is limited to half the spatial pulse length; the proposed deconvolution method allows precise localization of echoes and reduces the axial resolution to less than one-twentieth of the echo's spatial pulse length. Finally, the work in this thesis was devoted to setting up a pipeline to reproduce the same results on real signals. Tests on real signals for thickness measurement and surface condition exploration showed similar results to those on synthetic signals.
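A minimal 1-D convolutional denoising autoencoder with a skip connection gives the flavour of this kind of architecture (PyTorch; a plain additive skip stands in for the attention skip connections described in the abstract, channel sizes are illustrative, and even-length inputs are assumed).

import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    """1-D convolutional autoencoder with a skip connection for signal denoising."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv1d(1, 16, 9, padding=4), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv1d(16, 32, 9, stride=2, padding=4), nn.ReLU())
        self.dec1 = nn.Sequential(nn.ConvTranspose1d(32, 16, 9, stride=2,
                                                     padding=4, output_padding=1), nn.ReLU())
        self.dec2 = nn.Conv1d(16, 1, 9, padding=4)

    def forward(self, x):                 # x: (batch, 1, n_samples), n_samples even
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        d1 = self.dec1(e2) + e1           # additive skip from encoder to decoder
        return self.dec2(d1)

# Training would minimise the MSE between the network output and the clean signal:
# loss = nn.functional.mse_loss(model(noisy), clean)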
APA, Harvard, Vancouver, ISO, and other styles
36

Maltbie, Nicholas. "Integrating Explainability in Deep Learning Application Development: A Categorization and Case Study." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1623169431719474.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

FELICETTI, ANDREA. "Artificial Intelligence approaches for spatial data processing." Doctoral thesis, Università Politecnica delle Marche, 2021. http://hdl.handle.net/11566/289699.

Full text
Abstract:
Researchers have explored the benefits and applications of artificial intelligence (AI) algorithms in different scenarios. For the processing of spatial data, AI offers overwhelming opportunities. Fundamental questions include how AI can be applied to, or must be specifically created for, spatial data. This change is having a significant impact on spatial data. Machine learning (ML) has been an important component of spatial analysis for classification, clustering, and prediction. In addition, deep learning (DL) is being integrated to automatically extract useful information for classification, object detection, semantic and instance segmentation, and more. The integration of AI, ML, and DL in geomatics has led to the concept of Geospatial Artificial Intelligence (GeoAI), a new paradigm for geo-information knowledge discovery and beyond. Starting from this premise, this thesis addresses the development of AI-based techniques for analysing and interpreting complex spatial data. The analysis covers several gaps, for instance defining the relationships between AI-based approaches and spatial data. Considering the multidisciplinary nature of spatial data, major efforts have been undertaken with regard to social media data, infrared thermographic (IRT) images, orthophotos, and point clouds. Initially, a literature review was conducted to understand the main data acquisition technologies and whether and how AI methods and techniques could help in this field. More specifically, attention is given to the state of the art in AI for the data types mentioned above, which is important to address four different problems: tourism destination management using sentiment analysis and geo-location information; automatic fault detection on photovoltaic farms; mosaic segmentation based on deep cascading learning; and facial landmark detection for head 3D modelling in medical applications. The proposed AI applications open up a wealth of novel and important opportunities for both the geomatics and computer science communities. The newly collected datasets, as well as the complexity of the data under examination, make the research challenging. In fact, it is crucial to evaluate the performance of state-of-the-art methods to demonstrate their strengths and weaknesses and to help identify future research for designing more robust AI algorithms. For comprehensive performance evaluation, it is of great importance to develop a library and benchmarks to gauge the state of the art, because design methods tuned to a specific problem do not work properly on other problems. Intensive attention has been drawn to the exploration of tailored learning models and algorithms. The tailored AI methods adopted for the proposed applications have shown themselves capable of extracting complex statistical features and efficiently learning their representations, allowing them to generalize well across a wide variety of AI tasks, including image classification, text recognition and so on. The limitations point towards unexplored areas for future investigation, serving as useful guidelines for future research directions.
APA, Harvard, Vancouver, ISO, and other styles
38

Parascandolo, Fiorenzo. "Trading System: a Deep Reinforcement Learning Approach." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022.

Find full text
Abstract:
The main objective of this work is to show the advantages of Reinforcement Learning-based approaches to developing a trading system. The experimental results showed the great adaptability of the developed models, which obtained very satisfactory econometric performance on five Forex market datasets characterized by different volatilities. The TradingEnv baseline provided by OpenAI was used to simulate the financial market; it was improved by implementing a rendering of the simulation and the commission plan applied by a real Electronic Communication Network. As regards the artificial agent, the main contributions are the use of the Gramian Angular Field transformation to encode the historical financial series as images and the experimental evidence that the presence of Locally Connected Layers brings a benefit in terms of performance. The Vanilla Saliency Map was used as an explainability method to tune the window size of the observations of the environment. From the explanation of the best-performing model it is possible to observe that the most important information is the price changes observed at greater granularity, in accordance with state-of-the-art theoretical results on historical financial series.
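The Gramian Angular Field encoding mentioned above maps a 1-D series onto an image. A minimal sketch (NumPy, summation variant, with min-max rescaling as an illustrative choice) is:

import numpy as np

def gramian_angular_field(series):
    """Encode a 1-D price/return series as a Gramian Angular Summation Field image.

    Steps: rescale to [-1, 1], take the angular representation phi = arccos(x),
    and build the matrix cos(phi_i + phi_j).
    """
    x = np.asarray(series, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1.0     # rescale to [-1, 1]
    x = np.clip(x, -1.0, 1.0)                             # guard against rounding
    phi = np.arccos(x)
    return np.cos(phi[:, None] + phi[None, :])            # (n, n) image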
APA, Harvard, Vancouver, ISO, and other styles
39

Monica, Riccardo. "Deep Incremental Learning for Object Recognition." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/12331/.

Full text
Abstract:
In recent years, deep learning techniques have received great attention in the field of information technology. These techniques proved to be particularly useful and effective in domains like natural language processing, speech recognition and computer vision, and in several real-world applications deep learning approaches improved the state of the art. In the field of machine learning, deep learning was a real revolution, and a number of effective techniques have been proposed for supervised, unsupervised and representation learning. This thesis focuses on deep learning for object recognition and, in particular, addresses incremental learning techniques. By incremental learning we denote approaches able to create an initial model from a small training set and to improve the model as new data become available. Using temporally coherent sequences proved to be useful for incremental learning, since temporal coherence also allows operating in an unsupervised manner. A critical issue in incremental learning is forgetting, i.e., the risk of losing previously learned patterns as new data are presented. In the first chapters of this work we introduce the basic theory of neural networks, Convolutional Neural Networks (CNNs) and incremental learning. The CNN is today one of the most effective approaches for supervised object recognition; it is well accepted by the scientific community and largely used by ICT big players like Google and Facebook, with relevant applications such as Facebook face recognition and Google image search. The scientific community has several (large) datasets (e.g., ImageNet) for the development and evaluation of object recognition approaches, but very few temporally coherent datasets are available to study incremental approaches. For this reason we decided to collect a new dataset named TCD4R (Temporal Coherent Dataset For Robotics).
APA, Harvard, Vancouver, ISO, and other styles
40

Bhuyan, Bikram Pratim. "Neuro-symbolic knowledge hypergraphs : knowledge representation and learning in neuro-symbolic artificial intelligence." Electronic Thesis or Diss., université Paris-Saclay, 2025. http://www.theses.fr/2025UPASG024.

Full text
Abstract:
The integration of symbolic reasoning and neural learning in Artificial Intelligence (AI) has become increasingly important as the demand for models capable of handling complex, dynamic, and interconnected data grows. While traditional approaches have made progress in these domains separately, a unified framework that combines these paradigms is crucial for advancing AI's ability to interpret, learn, and predict in real-world environments. Despite the advancements in symbolic and neural models, the existing literature reveals a gap in frameworks that effectively merge the two, particularly in the context of spatio-temporal knowledge representation and learning. Traditional knowledge graphs (KGs), though useful, struggle to capture high-order relationships and dynamic, temporal changes. This limitation necessitates a novel approach that can incorporate higher-order logic and a flexible structure to model real-world complexities. The objective of this study is to develop and validate a Neuro-Symbolic Knowledge Hypergraph framework that extends traditional knowledge graphs into Higher-Ordered Knowledge Graphs (HOKGs) capable of representing n-ary and temporal relationships. The framework integrates Monadic Second-Order Temporal Logic (MSOTL) for temporal reasoning and Hypergraph Neural Networks (HGNNs) for learning and predictive modeling, bridging the symbolic and neural paradigms. The methodology involves formulating a robust hypergraph structure that encodes spatio-temporal and semantic relationships using MSOTL. It is then coupled with hypergraph neural networks, incorporating convolution and attention mechanisms for effective learning and inference. The framework is tested on real-world scenarios, specifically in urban agriculture, to demonstrate its predictive capabilities and robustness. Key findings show that the proposed framework significantly enhances expressiveness and inference capacity compared to traditional KGs. The MSOTL component ensures precise modeling of temporal and spatial relationships, while the HGNNs validate predictive accuracy. The case study on urban agriculture highlights the framework's utility, showcasing how it can provide meaningful insights and precise predictions about dynamic agricultural practices. This research has broad implications for AI applications requiring complex, dynamic knowledge representation, such as smart cities, environmental monitoring, and beyond. The proposed hypergraph-based approach opens pathways for integrating higher-order logic, ontology-based knowledge, and deep learning, offering a comprehensive solution to the limitations observed in current knowledge graphs and AI models.
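A single hypergraph convolution layer in the style popularised by HGNN-type models can be sketched from the incidence matrix. This is a generic formulation for illustration only, not necessarily the exact operator or the MSOTL integration used in the thesis.

import numpy as np

def hypergraph_conv(X, H, Theta, edge_weights=None):
    """One hypergraph convolution layer: X' = Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta.

    X: (n_nodes, in_dim) node features
    H: (n_nodes, n_edges) incidence matrix (H[v, e] = 1 if node v belongs to hyperedge e)
    Theta: (in_dim, out_dim) learnable weights
    """
    n_nodes, n_edges = H.shape
    w = edge_weights if edge_weights is not None else np.ones(n_edges)
    dv = H @ w                         # weighted node degrees
    de = H.sum(axis=0)                 # edge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    De_inv = np.diag(1.0 / de)
    W = np.diag(w)
    A = Dv_inv_sqrt @ H @ W @ De_inv @ H.T @ Dv_inv_sqrt
    return np.maximum(A @ X @ Theta, 0.0)   # ReLU non-linearity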
APA, Harvard, Vancouver, ISO, and other styles
41

El, Qadi El Haouari Ayoub. "An EXplainable Artificial Intelligence Credit Rating System." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS486.

Full text
Abstract:
Over the past few years, the trade finance gap has surged to an alarming 1.5 trillion dollars, underscoring a growing crisis in global commerce. This gap is particularly detrimental to small and medium-sized enterprises (SMEs), which often find it difficult to access trade finance. Traditional credit scoring systems, which are the backbone of trade finance, are not always tailored to adequately assess the creditworthiness of SMEs. The term credit scoring stands for the methods and techniques used to evaluate the creditworthiness of individuals or businesses. The score generated is then used by financial institutions to make decisions on loan approvals, interest rates, and credit limits. Credit scoring presents several characteristics that make it a challenging task. First, the lack of explainability of complex machine learning models often results in lower acceptance of credit assessments, particularly among stakeholders who require a transparent decision-making process. This opacity can be an obstacle to the widespread adoption of advanced scoring techniques. Another significant challenge is the variability in data availability across countries and the often incomplete financial records of SMEs, which make it difficult to develop universally applicable models. In this thesis, we initially tackled the issue of explainability by employing state-of-the-art techniques in Explainable Artificial Intelligence (XAI). We introduced a novel strategy that involves comparing the explanations generated by machine learning models with the criteria used by credit experts. This comparative analysis revealed a divergence between the model's reasoning and the experts' judgment, underscoring the necessity of incorporating expert criteria into the training phase of the model. The findings suggest that aligning machine-generated explanations with human expertise could be a pivotal step in enhancing the model's acceptance and trustworthiness. Subsequently, we shifted our focus to the challenge of sparse or incomplete financial data. We incorporated textual credit assessments into the credit scoring model using cutting-edge Natural Language Processing (NLP) techniques. Our results demonstrated that models trained with both financial data and textual credit assessments outperformed those relying solely on financial data. Moreover, we showed that our approach could effectively generate credit scores using only textual risk assessments, thereby offering a viable solution for scenarios where traditional financial metrics are unavailable or insufficient.
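The idea of scoring on textual assessments alongside (or instead of) financial indicators can be sketched as follows (scikit-learn/NumPy); TF-IDF features stand in for the NLP encoder used in the thesis, and the function and variable names are hypothetical.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def train_credit_model(assessments, financials, defaulted):
    """Fit a scoring model on textual risk assessments plus financial ratios.

    assessments: list of n free-text credit assessments
    financials:  (n, d) array of financial indicators (d may be 0 when unavailable)
    defaulted:   (n,) binary labels
    """
    vec = TfidfVectorizer(max_features=2000)
    X_text = vec.fit_transform(assessments).toarray()
    X = np.hstack([X_text, financials]) if financials.size else X_text
    model = LogisticRegression(max_iter=1000).fit(X, defaulted)
    return vec, model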
APA, Harvard, Vancouver, ISO, and other styles
42

Neverova, Natalia. "Deep learning for human motion analysis." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEI029/document.

Full text
Abstract:
The goal of this work is to develop learning methods that advance automatic analysis and interpretation of human motion from different perspectives and based on various sources of information, such as images, video, depth maps, motion capture (MoCap) data, audio and inertial sensors. For this purpose, we propose several deep neural models and associated training algorithms for supervised classification and semi-supervised feature learning, as well as for modelling temporal dependencies, and show their efficiency on a set of fundamental tasks, including detection, classification, parameter estimation and user verification. First, we present a method for human action and gesture spotting and classification based on multi-scale and multi-modal deep learning from visual signals (such as video, depth and MoCap data). Key to our technique is a training strategy which exploits, first, careful initialization of individual modalities and, second, gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving the uniqueness of each modality-specific representation. Fusing modalities at several spatial and temporal scales leads to a significant increase in recognition rates, allowing the model to compensate for the errors of individual classifiers and for noise in the separate channels, and makes the classifier robust to the loss of one or more channels. Moving from discrete 1-of-N classification to continuous evaluation of gesture parameters, we then address the problem of hand pose estimation and present a new method for regression on depth images, based on semi-supervised learning with deep convolutional neural networks, where raw depth data is fused with an intermediate representation in the form of a segmentation of the hand into parts. In separate but thematically related work, we explore convolutional temporal models for authenticating smartphone users from the way they hold, move and handle their phones. In this project, the data is captured by inertial sensors (such as accelerometers and gyroscopes) built into mobile devices. We propose an optimized shift-invariant dense convolutional mechanism and incorporate the discriminatively trained dynamic features into a probabilistic generative framework that takes temporal characteristics into account. Our results demonstrate that human kinematics convey important information about user identity and can serve as a valuable component of multi-modal authentication systems.
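A minimal sketch of ModDrop-style training as described above: per-modality encoders are fused, and whole modality channels are randomly zeroed during training so the network learns cross-modal correlations while staying usable when a channel is missing. The layer sizes, fusion head, and drop probability are assumptions, not the thesis's architecture, and the careful per-modality pre-training it relies on is omitted.

```python
import torch
import torch.nn as nn

class ModDropFusion(nn.Module):
    def __init__(self, modality_dims, hidden=64, n_classes=10, p_drop=0.5):
        super().__init__()
        # one small encoder per modality (video, depth, mocap, ...)
        self.encoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in modality_dims]
        )
        self.p_drop = p_drop
        self.head = nn.Linear(hidden * len(modality_dims), n_classes)

    def forward(self, inputs):  # inputs: list of tensors, one per modality
        feats = []
        for enc, x in zip(self.encoders, inputs):
            h = enc(x)
            if self.training:
                # drop this modality independently for each sample in the batch
                keep = (torch.rand(h.size(0), 1, device=h.device) > self.p_drop).float()
                h = h * keep
            feats.append(h)
        return self.head(torch.cat(feats, dim=-1))

model = ModDropFusion(modality_dims=[128, 64, 32])
logits = model([torch.randn(8, 128), torch.randn(8, 64), torch.randn(8, 32)])
print(logits.shape)  # torch.Size([8, 10])
```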
APA, Harvard, Vancouver, ISO, and other styles
43

PASQUALINI, LUCA. "Real World Problems through Deep Reinforcement Learning." Doctoral thesis, Università di Siena, 2022. http://hdl.handle.net/11365/1192945.

Full text
Abstract:
Reinforcement Learning (RL) is a very promising field under the umbrella of Machine Learning (ML). Using algorithms inspired by psychology, specifically by the operant conditioning of behaviorism, RL makes it possible to solve problems from scratch, without any prior knowledge or data about the task at hand. When used in conjunction with Neural Networks (NNs), RL has proven especially effective: we call this Deep Reinforcement Learning (DRL). In the recent past, DRL has demonstrated super-human capabilities on many games, but its real-world applications are varied and range from robotics to general optimization problems. Much of the current research and literature in the broader field of ML revolves around benchmarks, in a never-ending challenge between researchers down to the last decimal figure of certain metrics. However, making it the main objective to pass some benchmark or to beat some other approach is, more often than not, limiting from the point of view of actually contributing to the overall goal of ML: to automate as many real tasks as possible. Following this intuition, this thesis first analyzes a collection of highly varied real-world tasks and then develops a set of associated models. Finally, we apply DRL to solve these tasks by means of exploration and exploitation of these models. Specifically, we start by studying how using the score as the target influences the performance of a well-known artificial Go player, in order to develop an agent capable of teaching humans how to play so as to maximize their score. Then, we move on to machine creativity, using DRL in conjunction with state-of-the-art Natural Language Processing (NLP) techniques to generate and revise poems in a human-like fashion. We then dive into a queue optimization task, dynamically scheduling Ultra Reliable Low Latency Communication (URLLC) packets on top of a set of frequencies previously allocated to enhanced Mobile Broadband (eMBB) users. Finally, we propose a novel DRL approach to the task of generating black-box Pseudo Random Number Generators (PRNGs) with variable periods, by exploiting the autonomous navigation of a state-of-the-art DRL algorithm in both a feedforward and a recurrent fashion.
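For readers unfamiliar with the DRL loop the abstract refers to, here is a generic illustration: a tiny policy-gradient (REINFORCE) agent learning to walk to the right end of a short corridor. The environment and network are invented toys; the thesis applies far more elaborate DRL algorithms to Go teaching, poetry revision, URLLC scheduling and PRNG construction.

```python
import torch
import torch.nn as nn

N = 5  # corridor length; reaching the right end yields reward 1

policy = nn.Sequential(nn.Linear(N, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

def one_hot(i):
    s = torch.zeros(N)
    s[i] = 1.0
    return s

for episode in range(300):
    pos, log_probs, rewards = 0, [], []
    for _ in range(20):
        dist = torch.distributions.Categorical(logits=policy(one_hot(pos)))
        a = dist.sample()
        log_probs.append(dist.log_prob(a))
        pos = max(0, pos - 1) if a.item() == 0 else min(N - 1, pos + 1)
        done = pos == N - 1
        rewards.append(1.0 if done else 0.0)
        if done:
            break
    # Monte-Carlo returns (no discounting for this tiny task)
    returns = torch.tensor([sum(rewards[t:]) for t in range(len(rewards))])
    loss = -(torch.stack(log_probs) * returns).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Greedy rollout after training: the agent should head straight for the goal.
pos, steps = 0, 0
while pos != N - 1 and steps < 20:
    pos = min(N - 1, pos + 1) if policy(one_hot(pos)).argmax().item() == 1 else max(0, pos - 1)
    steps += 1
print("greedy rollout reached the goal in", steps, "steps")
```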
APA, Harvard, Vancouver, ISO, and other styles
44

ABUKMEIL, MOHANAD. "UNSUPERVISED GENERATIVE MODELS FOR DATA ANALYSIS AND EXPLAINABLE ARTIFICIAL INTELLIGENCE." Doctoral thesis, Università degli Studi di Milano, 2022. http://hdl.handle.net/2434/889159.

Full text
Abstract:
For more than a century, methods for learning representations and exploring the intrinsic structure of data have developed remarkably, and currently include supervised, semi-supervised, and unsupervised methods. However, recent years have witnessed the flourishing of big data, where typical dataset dimensions are high and the data can come in messy, missing, incomplete, unlabeled, or corrupted forms. Consequently, discovering and learning the hidden structure buried inside such data becomes highly challenging. From this perspective, latent data analysis and dimensionality reduction play a substantial role in decomposing the exploratory factors and learning the hidden structures of data, which encompass the significant features that characterize the categories and trends among data samples in an ordered manner. This is achieved by extracting patterns, differentiating trends, and testing hypotheses in order to identify anomalies, learn compact knowledge, and perform many different machine learning (ML) tasks such as classification, detection, and prediction. Unsupervised generative learning (UGL) methods are a class of ML characterized by their ability to analyze and decompose latent data, reduce dimensionality, visualize the manifold of data, and learn representations with limited levels of predefined labels and prior assumptions. Furthermore, explainable artificial intelligence (XAI) is an emerging field of ML that deals with explaining the decisions and behaviors of learned models. XAI is also associated with UGL models to explain the hidden structure of data and the learned representations of ML models. However, current UGL models lack large-scale generalizability and explainability at the testing stage, which restricts their potential in ML and XAI applications. To overcome these limitations, this thesis proposes innovative methods that integrate UGL and XAI to enable data factorization and dimensionality reduction and to improve the generalizability of the learned ML models. Moreover, the proposed methods enable visual explainability in modern applications such as anomaly detection and autonomous driving systems. The main research contributions are as follows:
• A novel overview of UGL models, including blind source separation (BSS), manifold learning (MfL), and neural networks (NNs), which also considers the open issues and challenges of each UGL method.
• An innovative method to identify the dimensions of the compact feature space via a generalized rank, applied to image dimensionality reduction.
• An innovative method to hierarchically reduce and visualize the manifold of data, improving generalizability in limited-data learning scenarios and reducing computational complexity.
• An original method to visually explain autoencoders by reconstructing an attention map, applied to anomaly detection and explainable autonomous driving systems.
The novel methods introduced in this thesis are benchmarked on publicly available datasets, where they outperform state-of-the-art methods under different evaluation metrics, confirming the feasibility of the proposed methodologies in terms of computational complexity, availability of learning data, model explainability, and data reconstruction accuracy.
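To make the last contribution concrete, here is a minimal sketch of the general idea of explaining an autoencoder visually: the per-pixel reconstruction error of a convolutional autoencoder trained on "normal" data is used as an explanation / anomaly map. The architecture, random stand-in images, and error-based map are assumptions for illustration; the thesis's attention-map reconstruction method is more elaborate.

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

model = ConvAE()
x = torch.rand(4, 1, 28, 28)          # stand-in for "normal" training images
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(50):                    # tiny training loop
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), x)
    loss.backward()
    opt.step()

test = torch.rand(1, 1, 28, 28)        # a query image (possibly anomalous)
with torch.no_grad():
    error_map = (model(test) - test).pow(2).squeeze()  # 28x28 visual explanation
score = error_map.mean().item()        # scalar anomaly score
print(error_map.shape, score)
```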
APA, Harvard, Vancouver, ISO, and other styles
45

Yang, Zhaoyuan Yang. "Adversarial Reinforcement Learning for Control System Design: A Deep Reinforcement Learning Approach." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu152411491981452.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Beretta, Davide. "Experience Replay in Sparse Rewards Problems using Deep Reinforcement Techniques." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/17531/.

Full text
Abstract:
This work first introduces the reader to Reinforcement Learning, an area of Machine Learning that has attracted a great deal of research in recent years. It then presents several modifications to ACER, a well-known and very interesting algorithm that makes use of Experience Replay. The goal is to improve its performance on general problems, and on sparse-reward problems in particular. To assess the merit of the proposed ideas, the experiments use Montezuma's Revenge, a game developed for the Atari 2600 and considered among the hardest to tackle.
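For context, here is a minimal sketch of the experience replay idea the abstract builds on: transitions are stored while acting and later sampled uniformly for off-policy updates. This is a plain uniform buffer for illustration; ACER itself replays whole trajectories with importance-weighted off-policy corrections, which is beyond this sketch.

```python
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)

# Usage: store transitions while acting, then train on random mini-batches.
buf = ReplayBuffer()
for t in range(1000):
    buf.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
states, actions, rewards, next_states, dones = buf.sample(32)
```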
APA, Harvard, Vancouver, ISO, and other styles
47

Mollaret, Sébastien. "Artificial intelligence algorithms in quantitative finance." Thesis, Paris Est, 2021. http://www.theses.fr/2021PESC2002.

Full text
Abstract:
Artificial intelligence has become more and more popular in quantitative finance with the increase in computing capacity and in model complexity, and has led to many financial applications. In this thesis, we explore three different applications that address challenges in the financial derivatives domain, from model selection to model calibration and the pricing of derivatives. In Part I, we focus on a regime-switching volatility model to price equity derivatives. The model parameters are estimated using the Expectation-Maximization (EM) algorithm, and a local volatility component is added so that the model fits vanilla option prices, calibrated using the particle method. In Part II, we use deep neural networks to calibrate a stochastic volatility model, in which the volatility is modelled as the exponential of an Ornstein-Uhlenbeck process, by approximating offline the mapping between model parameters and the corresponding implied volatilities. Once this expensive approximation has been performed offline, the calibration reduces to a standard and fast optimization problem. In Part III, we use deep neural networks to price American options on large baskets of stocks, in order to overcome the curse of dimensionality. Different methods are studied: a Longstaff-Schwartz approach, where we approximate the continuation values, and a stochastic control approach, where we solve the pricing partial differential equation by reformulating the problem as a stochastic control problem using the non-linear Feynman-Kac formula.
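A sketch of the offline-surrogate calibration idea from Part II: (1) learn a fast network mapping model parameters to an implied-volatility grid from pre-computed examples, (2) calibrate to market vols with a standard optimizer. The "pricer" below is a synthetic stand-in, not an actual exponential-OU stochastic volatility model, and the parameter names and grid size are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N_GRID = 20      # strikes x maturities, flattened
N_PARAMS = 3     # e.g. vol-of-vol, mean reversion, long-run level (assumed)

def toy_pricer(params):  # stand-in for the slow model-to-implied-vol map
    grid = np.linspace(0.0, 1.0, N_GRID)
    return 0.2 + params[0] * grid + params[1] * np.sin(3 * grid) + params[2] * grid**2

# 1) Offline: generate (parameters -> implied vols) examples and fit the surrogate.
X = rng.uniform(-0.5, 0.5, size=(5000, N_PARAMS))
Y = np.stack([toy_pricer(p) for p in X])
surrogate = nn.Sequential(nn.Linear(N_PARAMS, 64), nn.ReLU(), nn.Linear(64, N_GRID))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
Xt, Yt = torch.tensor(X, dtype=torch.float32), torch.tensor(Y, dtype=torch.float32)
for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(surrogate(Xt), Yt)
    loss.backward()
    opt.step()

# 2) Online: calibration is now a fast, standard optimisation problem.
market_vols = toy_pricer(np.array([0.1, -0.2, 0.3]))  # pretend these are observed

def objective(p):
    with torch.no_grad():
        pred = surrogate(torch.tensor(p, dtype=torch.float32)).numpy()
    return float(np.mean((pred - market_vols) ** 2))

result = minimize(objective, x0=np.zeros(N_PARAMS), method="Nelder-Mead")
print("calibrated parameters:", result.x)
```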
APA, Harvard, Vancouver, ISO, and other styles
48

Reza, Tasmia. "Object Detection Using Feature Extraction and Deep Learning for Advanced Driver Assistance Systems." Thesis, Mississippi State University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10841471.

Full text
Abstract:
A comparison of performance between a traditional linear support vector machine (SVM), a non-linear single-kernel SVM, multiple kernel learning (MKL), and modern deep learning (DL) classifiers is presented in this thesis. The goal is to implement different machine-learning classification systems for object detection on three-dimensional (3D) Light Detection and Ranging (LiDAR) data. The linear SVM, non-linear single-kernel, and MKL approaches require hand-crafted features for training and testing, whereas the DL approach learns the features itself during training. The thesis concludes with an assessment of all the different classification methods.
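A small illustration of the kernel-based side of this comparison: a linear SVM versus a single RBF-kernel SVM evaluated on hand-crafted features. Synthetic data stands in for the LiDAR-derived features used in the thesis, and the MKL and deep learning baselines are omitted.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# stand-in for hand-crafted LiDAR features and object labels
X, y = make_classification(n_samples=500, n_features=20, n_informative=8, random_state=0)

for name, clf in [("linear SVM", SVC(kernel="linear")), ("RBF-kernel SVM", SVC(kernel="rbf"))]:
    scores = cross_val_score(make_pipeline(StandardScaler(), clf), X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```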
APA, Harvard, Vancouver, ISO, and other styles
49

Shapiro, Daniel. "Composing Recommendations Using Computer Screen Images: A Deep Learning Recommender System for PC Users." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36272.

Full text
Abstract:
A new way to train a virtual assistant with unsupervised learning is presented in this thesis. Rather than integrating with a particular set of programs and interfaces, this new approach involves shallow integration between the virtual assistant and the computer through machine vision. In effect, the assistant interprets the computer screen in order to produce helpful recommendations to assist the computer user. In developing this new approach, called AVRA, the following methods are described: an unsupervised learning algorithm which enables the system to watch and learn from user behavior, a method for fast filtering of the text displayed on the computer screen, a deep learning classifier used to recognize key onscreen text in the presence of OCR translation errors, and a recommendation filtering algorithm to triage the many possible action recommendations. AVRA is compared to a similar commercial state-of-the-art system to highlight how this work adds to the state of the art. AVRA is a deep learning image processing and recommender system that can collaborate with the computer user to accomplish various tasks. This document presents a comprehensive overview of the development and possible applications of this novel virtual assistant technology. It detects onscreen tasks based upon the context it perceives by analyzing successive computer screen images with neural networks. AVRA is a recommender system, as it assists the user by producing action recommendations regarding onscreen tasks. In order to simplify the interaction between the user and AVRA, the system was designed to only produce action recommendations that can be accepted with a single mouse click. These action recommendations are produced without integration into each individual application executing on the computer. Furthermore, the action recommendations are personalized to the user's interests utilizing a history of the user's interaction.
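A rough sketch of the screen-reading pipeline described above (capture, OCR, trigger, recommendation). It uses the off-the-shelf pytesseract library and simple keyword matching instead of AVRA's trained classifiers and recommendation filtering, and the trigger phrases are invented examples.

```python
from PIL import ImageGrab      # screen capture (Windows/macOS; needs X11 on Linux)
import pytesseract             # requires the Tesseract OCR binary to be installed

RECOMMENDATIONS = {            # hypothetical trigger -> one-click action suggestion
    "traceback": "Search this error message on Stack Overflow?",
    "meeting": "Create a calendar event for this meeting?",
    "invoice": "Open the accounting template?",
}

screen = ImageGrab.grab()                        # current screen contents
text = pytesseract.image_to_string(screen).lower()

for trigger, action in RECOMMENDATIONS.items():
    if trigger in text:                          # AVRA instead uses a deep classifier
        print("Recommendation:", action)         # that is robust to OCR errors
```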
APA, Harvard, Vancouver, ISO, and other styles
50

Assouel, Rim. "Entity-centric representations in deep learning." Thesis, 2020. http://hdl.handle.net/1866/24306.

Full text
Abstract:
Humans' incredible capacity to model the complexity of the physical world is possible because they cast this complexity as the composition of simpler entities and of rules to process them. Extensive work in cognitive science indeed shows that human perception and reasoning are structured around objects. Motivated by this observation, a growing body of recent work has focused on entity-centric approaches to representation learning and on their potential to facilitate downstream tasks. In the first contribution, we show how an entity-centric approach to learning a transition model allows us to extract meaningful visual entities and to learn transition rules that achieve better compositional generalization. In the second contribution, we show how an entity-centric approach to generating graphs allows us to design a model for conditional graph generation that permits direct optimization of graph properties. We investigate the performance of our model on a prototype-based molecular graph generation task. In this task, known as lead optimization in drug discovery, we wish to adjust a few physico-chemical properties of a molecule that has proven effective in vitro in order to turn it into a drug.
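A minimal sketch of the general idea behind an entity-centric transition model: the state is a set of K entity vectors, and the transition applies shared per-entity updates plus pairwise interaction terms. The sizes, action encoding, and interaction scheme below are assumptions for illustration, not the thesis's architecture.

```python
import torch
import torch.nn as nn

class EntityTransition(nn.Module):
    def __init__(self, slot_dim=16, action_dim=4):
        super().__init__()
        self.edge = nn.Sequential(nn.Linear(2 * slot_dim, slot_dim), nn.ReLU())
        self.node = nn.Linear(2 * slot_dim + action_dim, slot_dim)

    def forward(self, slots, action):
        # slots: (batch, K, D) entity vectors; action: (batch, A), broadcast to all slots
        B, K, D = slots.shape
        # pairwise messages between every ordered pair of distinct slots
        src = slots.unsqueeze(2).expand(B, K, K, D)
        dst = slots.unsqueeze(1).expand(B, K, K, D)
        msgs = self.edge(torch.cat([src, dst], dim=-1))            # (B, K, K, D)
        mask = 1.0 - torch.eye(K, device=slots.device).view(1, K, K, 1)
        agg = (msgs * mask).sum(dim=2)                             # (B, K, D)
        act = action.unsqueeze(1).expand(B, K, action.shape[-1])
        delta = self.node(torch.cat([slots, agg, act], dim=-1))    # shared per-slot update
        return slots + delta                                       # next-state slots

model = EntityTransition()
slots = torch.randn(2, 4, 16)       # e.g. produced by an object-centric encoder
action = torch.randn(2, 4)
print(model(slots, action).shape)   # torch.Size([2, 4, 16])
```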
APA, Harvard, Vancouver, ISO, and other styles
