
Dissertations / Theses on the topic 'Learning and Forgetting'

Consult the top 46 dissertations / theses for your research on the topic 'Learning and Forgetting.'


1

Packer, Heather S. "Evolving ontologies with online learning and forgetting algorithms." Thesis, University of Southampton, 2011. https://eprints.soton.ac.uk/194923/.

Abstract:
Agents that require vocabularies to complete tasks can be limited by static vocabularies, which cannot evolve to meet unforeseen domain tasks or reflect the agents' changing needs or environment. However, agents can benefit from using evolution algorithms to evolve their vocabularies, namely gaining the ability to support new domain tasks. While an agent can capitalise on being able to support more domain tasks, existing techniques can hinder it because they do not consider the costs involved in evolving an agent's ontology. With this motivation, we explore ontology evolution in agent systems and focus on reducing the costs associated with an evolving ontology. In more detail, we consider how an agent can reduce the costs of evolving an ontology, including the costs of acquiring new concepts, processing new concepts, the increased memory usage from storing new concepts, and removing unnecessary concepts. Previous work reported in the literature has largely failed to analyse these costs in the context of evolving an agent's ontology. Against this background, we investigate and develop algorithms to enable agents to evolve their ontologies. More specifically, we present three online evolution algorithms that enable agents to: i) augment domain-related concepts, ii) use prediction to select concepts to learn, and iii) prune unnecessary concepts from their ontology, with the aim of reducing the costs associated with the acquisition, processing and storage of acquired concepts. In order to evaluate our evolution algorithms, we developed an agent framework which enables agents to use these algorithms and measures an agent's performance. Finally, our empirical evaluation shows that our algorithms are successful in reducing the costs associated with evolving an agent's ontology.
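The pruning step in iii) lends itself to a small illustration. The sketch below is not the thesis's algorithm: the Concept record, its usage counter and the utility threshold are hypothetical stand-ins for the richer acquisition, processing, storage and removal costs the thesis actually models.

```python
from dataclasses import dataclass

@dataclass
class Concept:
    name: str
    uses: int = 0              # how often the agent needed this concept for a task
    storage_cost: float = 1.0  # relative memory cost of keeping it

def prune_ontology(concepts, min_utility=0.5):
    """Drop concepts whose task utility no longer justifies their storage cost.

    Utility here is a bare uses-per-unit-cost ratio, chosen only to make the
    cost/benefit trade-off of pruning concrete."""
    kept = [c for c in concepts if c.uses / c.storage_cost >= min_utility]
    removed = [c for c in concepts if c.uses / c.storage_cost < min_utility]
    return kept, removed

# A rarely used, expensive concept is removed; a well-used one survives.
kept, removed = prune_ontology([Concept("River", uses=9),
                                Concept("OxbowLake", uses=0, storage_cost=2.0)])
print([c.name for c in kept], [c.name for c in removed])  # ['River'] ['OxbowLake']
```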
2

Vik, Mikael Eikrem. "Reducing catastrophic forgetting in neural networks using slow learning." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2006. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-8702.

Abstract:

This thesis describes a connectionist approach to learning and long-term memory consolidation, inspired by empirical studies of the roles of the hippocampus and neocortex in the brain. The existence of complementary learning systems is due to the demands posed on our cognitive system by the nature of our experiences. It has been shown that dual-network architectures utilizing information transfer can successfully avoid the phenomenon of catastrophic forgetting involved in multiple-sequence learning. The experiments involve a Reverberated Simple Recurrent Network which is trained on multiple sequences, with the memory being reinforced by means of self-generated pseudopatterns. My focus is on how differentiated learning speed affects the level of forgetting, without explicit training on the data used to form the existing memory.
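The self-generated pseudopatterns mentioned above admit a compact illustration: random inputs are passed through the already-trained network, and its own responses become rehearsal targets that are interleaved with the new sequence. This is a minimal sketch of generic pseudorehearsal, not the thesis's Reverberated Simple Recurrent Network; net_forward stands in for the trained network's forward pass.

```python
import numpy as np

def make_pseudopatterns(net_forward, n_items, input_dim, seed=0):
    """Pair random inputs with the current network's own outputs.

    Rehearsing on such pairs while learning a new sequence protects the
    old input-output mapping without access to the original training data."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n_items, input_dim))
    y = net_forward(x)  # the network's responses act as rehearsal targets
    return x, y
```

During training on a new sequence, each batch would then mix freshly drawn pseudopatterns with the new items, so updates that serve the new material are also constrained to preserve the old mapping.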

3

Besedin, Andrey. "Continual forgetting-free deep learning from high-dimensional data streams." Electronic Thesis or Diss., Paris, CNAM, 2019. http://www.theses.fr/2019CNAM1263.

Abstract:
In this thesis, we propose a new deep-learning-based approach for online classification on streams of high-dimensional data. In recent years, Neural Networks (NN) have become the primary building block of state-of-the-art methods in various machine learning problems. Most of these methods, however, are designed to solve the static learning problem, where all data are available at once at training time. Performing online deep learning is exceptionally challenging. The main difficulty is that NN-based classifiers usually rely on the assumption that the sequence of data batches used during training is stationary, or in other words, that the distribution of data classes is the same for all batches (i.i.d. assumption). When this assumption does not hold, neural networks tend to forget the concepts that are temporarily not available in the stream. In the literature, this phenomenon is known as catastrophic forgetting. The approaches we propose in this thesis aim to guarantee the i.i.d. nature of each batch that comes from the stream and to compensate for the lack of historical data. To do this, we train generative and pseudo-generative models capable of producing synthetic samples from classes that are absent or underrepresented in the stream, and complete the stream's batches with these samples. We test our approaches in an incremental learning scenario and in a specific type of continual learning. Our approaches perform classification on dynamic data streams with accuracy close to the results obtained in the static classification setting, where all data are available for the duration of learning. Besides, we demonstrate the ability of our methods to adapt to unseen data classes and to new instances of already known data categories, while avoiding forgetting the previously acquired knowledge.
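A minimal sketch of the batch-completion step described above, assuming a generator(cls, n) callable (e.g. a conditional generative model trained on past data) that returns n synthetic inputs for class cls; that interface is an assumption, not the thesis's API.

```python
import numpy as np

def complete_batch(x, y, generator, all_classes, per_class, seed=0):
    """Top up a stream batch with synthetic samples so every class reaches
    per_class examples, pushing the batch toward the i.i.d. ideal."""
    rng = np.random.default_rng(seed)
    xs, ys = [x], [y]
    for cls in all_classes:
        missing = per_class - int(np.sum(y == cls))
        if missing > 0:
            xs.append(generator(cls, missing))
            ys.append(np.full(missing, cls))
    x_full, y_full = np.concatenate(xs), np.concatenate(ys)
    order = rng.permutation(len(y_full))  # reshuffle real and synthetic samples
    return x_full[order], y_full[order]
```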
4

Evilevitch, Anton, and Robert Ingram. "Avoiding Catastrophic Forgetting in Continual Learning through Elastic Weight Consolidation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302552.

Abstract:
Image classification is an area of computer science with many areas of application. One key issue with using Artificial Neural Networks (ANN) for image classification is the phenomenon of Catastrophic Forgetting when training on tasks sequentially (i.e. Continual Learning): the network quickly loses its performance on a given task after it has been trained on a new one. Elastic Weight Consolidation (EWC) has previously been proposed as a remedy to lessen the effects of this phenomenon through the use of a loss function which utilizes a Fisher Information Matrix. We want to explore and establish whether this still holds true for modern network architectures, and to what extent it can be applied using today's state-of-the-art networks. We focus on applying this approach to tasks within the same dataset. Our results indicate that the approach is feasible, and does in fact lessen the effect of Catastrophic Forgetting. These results are achieved, however, at the cost of much longer execution times and time spent tuning the hyperparameters.
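For reference, the EWC penalty evaluated in this thesis is usually written as a quadratic term that anchors each weight to its value after the previous task, scaled by the diagonal Fisher information. A minimal PyTorch sketch, with the Fisher diagonal and saved parameters assumed to be computed beforehand:

```python
import torch

def ewc_penalty(model, fisher_diag, old_params, lam):
    """Elastic Weight Consolidation penalty (Kirkpatrick et al., 2017).

    fisher_diag and old_params map parameter names to tensors recorded
    after the previous task; the total loss on the new task is
    L_new + ewc_penalty(...)."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (fisher_diag[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty
```

The Fisher diagonal is typically estimated from squared gradients of the log-likelihood over the old task's data, and lam is exactly the kind of hyperparameter whose tuning the authors report as costly.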
5

Ahmad, Neida Basheer. "Forgetting Can Be Helpful for Learning: How Wakeful, Offline Processing Influences Infant Language Learning." Thesis, The University of Arizona, 2017. http://hdl.handle.net/10150/624894.

Abstract:
In previous work, 11-month-old infants were unable to learn rules about the relation of the consonants in CVCV words when the stimuli were randomly ordered. By chance, the random ordering of the stimuli promoted spurious local generalizations that impeded infants' learning of the phonotactic rules. This experiment asked whether a 30-second delay after exposure to a list of 24 randomly ordered words promotes learning. The 30-second delay did promote learning, though not until the third block of testing, suggesting that a longer delay might have shown a more robust effect. The interaction between conformity and block did not approach significance. However, t-tests performed on each of the three blocks revealed that in the third block, infants displayed a novelty preference, wherein they listened longer to stimuli that did not conform to their familiarization rule than to stimuli that did. Additionally, there is a trend toward an interaction between the previous experiment (no delay) and the current experiment (30-second delay), suggesting that the 30-second delay may have made a difference in infants' behavior.
6

Hough, Gerald E. "Learning, forgetting, and remembering: retention of song in the adult songbird." The Ohio State University, 2000. http://rave.ohiolink.edu/etdc/view?acc_num=osu148820355277807.

7

Beaulieu, Shawn L. "Developing Toward Generality: Combating Catastrophic Forgetting with Developmental Compression." ScholarWorks @ UVM, 2018. https://scholarworks.uvm.edu/graddis/874.

Abstract:
General intelligence is the exhibition of intelligent behavior across multiple problems in a variety of settings, however intelligence is defined and measured. Endemic in approaches to realize such intelligence in machines is catastrophic forgetting, in which sequential learning corrupts knowledge obtained earlier in the sequence or in which tasks antagonistically compete for system resources. Methods for obviating catastrophic forgetting have sought either to identify and preserve features of the system necessary to solve one problem when learning to solve another, or to enforce modularity such that minimally overlapping sub-functions contain task-specific knowledge. While successful in some domains, both approaches scale poorly because they require larger architectures as the number of training instances grows, causing different parts of the system to specialize for separate subsets of the data. Presented here is a method called developmental compression that addresses catastrophic forgetting in the neural networks of embodied agents. It exploits the mild impacts of developmental mutations to lessen adverse changes to previously evolved capabilities and 'compresses' specialized neural networks into a single generalized one. In the absence of domain knowledge, developmental compression produces systems that avoid overt specialization, alleviating the need to engineer a bespoke system for every task permutation, and does so in a way that suggests better scalability than existing approaches. This method is validated on a robot control problem and may be extended to other machine learning domains in the future.
8

Weeks, Clinton. "Investigation of the differential forgetting rates of item and associative information." [St. Lucia, Qld.], 2002. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe16837.pdf.

9

Ariel, Robert. "The Contribution of Past Test Performance, New Learning, and Forgetting to Judgment-of-Learning Resolution." Kent State University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=kent1277315741.

10

Jaber, Mohamad Y. "The effects of learning and forgetting on the economic manufactured quantity (EMQ)." Thesis, University of Nottingham, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.319967.

11

Lesort, Timothée. "Continual Learning : Tackling Catastrophic Forgetting in Deep Neural Networks with Replay Processes." Thesis, Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAE003.

Abstract:
Humans learn all their lives. They accumulate knowledge from a sequence of learning experiences and remember the essential concepts without forgetting what they have learned previously. Artificial neural networks struggle to learn similarly. They often rely on rigorously preprocessed data to learn solutions to specific problems such as classification or regression. In particular, they forget their past learning experiences if trained on new ones. Therefore, artificial neural networks are often inept at dealing with real-life settings such as an autonomous robot that has to learn online to adapt to new situations and overcome new problems without forgetting its past learning experiences. Continual learning (CL) is a branch of machine learning addressing this type of problem. Continual algorithms are designed to accumulate and improve knowledge in a curriculum of learning experiences without forgetting. In this thesis, we propose to explore continual algorithms with replay processes. Replay processes gather together rehearsal methods and generative replay methods. Generative replay consists of regenerating past learning experiences with a generative model to remember them. Rehearsal consists of saving a core-set of samples from past learning experiences to rehearse them later. Replay processes make possible a compromise between optimizing the current learning objective and the past ones, enabling learning without forgetting in sequences-of-tasks settings. We show that they are very promising methods for continual learning. Notably, they enable the re-evaluation of past data with new knowledge and the confrontation of data from different learning experiences. We demonstrate their ability to learn continually through unsupervised learning, supervised learning and reinforcement learning tasks.
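Of the two replay processes, rehearsal is the easier to sketch: maintain a bounded core-set of past samples and replay them alongside new data. The sketch below fills the core-set by reservoir sampling, which is one simple selection rule, not necessarily the one used in the thesis.

```python
import random

class RehearsalBuffer:
    """Fixed-capacity core-set of past samples for replay.

    Reservoir sampling keeps every sample seen so far with equal
    probability, however long the stream grows."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            j = self.rng.randrange(self.seen)  # uniform over all seen samples
            if j < self.capacity:
                self.items[j] = sample

    def draw(self, n):
        """Samples replayed next to the current batch during training."""
        return self.rng.sample(self.items, min(n, len(self.items)))
```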
12

Larsen, Caroline, and Elin Ryman. "A quantitative analysis of how the Variational Continual Learning method handles catastrophic forgetting." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280447.

Abstract:
Catastrophic forgetting is a problem that occurs when an artificial neural network in the continual learning setting overwrites historic information as additional information is acquired. Several methods claiming to handle this problem are trained and evaluated using data sets with a small number of tasks, which does not represent a real continual learning situation, where the number of tasks can be large. In this report, we examine how three versions of the method Variational Continual Learning (VCL) handle catastrophic forgetting when training an artificial neural network on a data set with 20 tasks, as well as on a data set with 5 tasks. The results show that all three versions of VCL performed well, even though there were some signs of catastrophic forgetting. Notably, the two versions of VCL extended with an episodic memory achieved the highest accuracy of the three. In conclusion, we believe that all three versions of the VCL method handle the problem of catastrophic forgetting when trained on data sets with up to 20 tasks.
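For orientation, the recursion at the heart of VCL (Nguyen et al., 2018) can be stated compactly: each task's approximate posterior is fitted to the new data while staying close, in KL divergence, to the previous posterior, which is the mechanism meant to counter catastrophic forgetting.

```latex
% VCL update after observing task t's data D_t:
q_t(\theta) = \operatorname*{arg\,max}_{q \in \mathcal{Q}}
  \; \mathbb{E}_{q(\theta)}\big[\log p(\mathcal{D}_t \mid \theta)\big]
  - \mathrm{KL}\big(q(\theta) \,\Vert\, q_{t-1}(\theta)\big)
```

The episodic-memory versions examined in the report additionally retain a small coreset of real samples from earlier tasks and use it alongside this update.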
13

Masana, Castrillo Marc. "Lifelong Learning of Neural Networks: Detecting Novelty and Adapting to New Domains without Forgetting." Doctoral thesis, Universitat Autònoma de Barcelona, 2020. http://hdl.handle.net/10803/671591.

Abstract:
Computer vision has gone through considerable changes in the last decade as neural networks have come into common use. As available computational capabilities have grown, neural networks have achieved breakthroughs in many computer vision tasks, and have even surpassed human performance in others. With accuracy being so high, focus has shifted to other issues and challenges. One research direction that saw a notable increase in interest is lifelong learning systems. Such systems should be capable of efficiently performing tasks, identifying and learning new ones, and should moreover be able to deploy smaller versions of themselves which are experts on specific tasks. In this thesis, we contribute to research on lifelong learning and address the compression and adaptation of networks to small target domains, the incremental learning of networks faced with a variety of tasks, and finally the detection of out-of-distribution samples at inference time. We explore how knowledge can be transferred from large pretrained models to more task-specific networks capable of running on smaller devices by extracting the most relevant information based on activation statistics. Using a pretrained model provides more robust representations and a more stable initialization when learning a smaller task, which leads to higher performance and is known as domain adaptation. However, those models are too large for certain applications that need to be deployed on devices with limited memory and computational capacity. In this thesis we show that, after performing domain adaptation, some learned activations barely contribute to the predictions of the model. Therefore, we propose to apply network compression based on low-rank matrix decomposition using the activation statistics. This results in a significant reduction of the model size and the computational cost. Like human intelligence, machine intelligence aims to have the ability to learn and remember knowledge. However, when a trained neural network is presented with learning a new task, it ends up forgetting previous ones. This is known as catastrophic forgetting and its avoidance is studied in continual learning. The work presented in this thesis extensively surveys continual learning techniques (both with and without access to the task-ID at test time) and presents an approach to avoid catastrophic forgetting in sequential task learning scenarios. Our technique is based on using ternary masks in order to update a network to new tasks, reusing the knowledge of previous ones while not forgetting anything about them. In contrast to earlier work, our masks are applied to the activations of each layer instead of the weights. This considerably reduces the number of mask parameters to be added for each new task, by more than three orders of magnitude for most networks. Furthermore, the analysis of a wide range of work on incremental learning without access to the task-ID provides insight into current state-of-the-art approaches that focus on avoiding catastrophic forgetting by using regularization, rehearsal of previous tasks from a small memory, or compensating the task-recency bias. We also consider the problem of out-of-distribution detection. Neural networks trained with a cross-entropy loss force the outputs of the model to tend toward a one-hot encoded vector. This leads to models being overly confident when presented with images or classes that were not present in the training distribution.
The capacity of a system to be aware of the boundaries of the learned tasks and identify anomalies or classes which have not been learned yet is key to lifelong learning and autonomous systems. In this thesis, we present a metric learning approach to out-of-distribution detection that learns the task at hand on an embedding space.
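The activation-masking idea can be illustrated with a stripped-down sketch. The thesis's masks are ternary and richer than the plain per-task gating below, and the layer and task-ID plumbing here are assumptions for illustration; the key property is that masks of finished tasks are frozen, so earlier tasks' computations never change.

```python
import torch
import torch.nn as nn

class MaskedLayer(nn.Module):
    """Linear layer whose activations are gated by one mask per task."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.fc = nn.Linear(d_in, d_out)
        self.masks = nn.ParameterDict()  # task id (str) -> activation mask

    def add_task(self, task_id):
        # New tasks get a fresh trainable mask; previous masks stay frozen.
        self.masks[str(task_id)] = nn.Parameter(torch.ones(self.fc.out_features))

    def forward(self, x, task_id):
        return torch.relu(self.fc(x)) * self.masks[str(task_id)]
```

Because each task adds only one value per activation rather than per weight, the per-task parameter overhead stays far below weight-level masking, in line with the reduction reported above.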
14

Tummaluri, Raghuram R. "Operator Assignment in Labor Intensive Cells Considering Operation Time Based Skill Levels, Learning and Forgetting." Ohio University / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1126900571.

15

Sharp, Jessica Lynn. "Retention in Male and Female Rats: Forgetting Curves for an Element that Violates Pattern Structure." Kent State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=kent1505299781953525.

16

Osorio, Ricardo M. Tamayo. "Sources of dissociation in the forgetting trajectories of implicit and explicit knowledge." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II, 2009. http://dx.doi.org/10.18452/15867.

Abstract:
In this dissertation I investigate dissociations in the forgetting patterns of implicit and explicit knowledge. I claim that this approach may provide significant constraints on the assumption that a single system or mechanism determines both implicit and explicit processes. In the theoretical part, I construe a definition of implicit knowledge as information learned and retrieved without intention. I also explain the general role of single dissociations in theories of implicit knowledge, and I present an overview of the main lines of research concerned with the functions, operation, development, neural substrates, and forgetting patterns of implicit knowledge. In general, I argue that comparing the forgetting patterns of implicit and explicit knowledge is best regarded from a graded perspective and may usefully bridge the gap between research on implicit learning and implicit memory. In a series of four experiments, university students were exposed to environmental regularities embedded in artificial grammar (AG) and serial reaction time (SRT) tasks. To compare the forgetting patterns, participants' implicit (motor-performance-based) and explicit (recognition-based) knowledge was assessed before and after a retention interval. Taken together, the results indicate that explicit knowledge decays faster than implicit knowledge in both AG and SRT tasks. Furthermore, an interference task introduced instead of a retention interval produced the same pattern of dissociations. Finally, I conducted a set of simulations to assess the ability of a single-system model (Shanks, Wilkinson, & Channon, 2003) to account for my experimental results. The simulations showed that the model best fits the empirical data when changes are introduced in the parameters related to (a) the common knowledge strength (for implicit and explicit knowledge), and (b) the reliability of the explicit test. In sum, my dissertation (1) suggests a conceptual framework for implicit and explicit knowledge, (2) provides new empirical evidence of dissociations in their forgetting patterns, and (3) identifies specific boundary conditions for a single-system model.
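A standard way to quantify such forgetting-trajectory comparisons is to fit a decay curve per knowledge type and compare the rates. A minimal sketch assuming exponential decay; the functional form and the numbers are illustrative only, not taken from the dissertation.

```python
import numpy as np

def decay_rate(delays_h, scores):
    """Fit score = a * exp(-b * delay) by regressing log scores on delay;
    a larger b means faster forgetting."""
    b, _ = np.polyfit(np.asarray(delays_h, float),
                      np.log(np.asarray(scores, float)), 1)
    return -b

delays = [0, 12, 24]            # retention intervals in hours (made up)
explicit = [0.90, 0.62, 0.45]   # recognition-based scores (made up)
implicit = [0.80, 0.74, 0.70]   # motor-performance-based scores (made up)
print(decay_rate(delays, explicit) > decay_rate(delays, implicit))  # True
```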
17

Li, Max Hongming. "Extension on Adaptive MAC Protocol for Space Communications." Digital WPI, 2018. https://digitalcommons.wpi.edu/etd-theses/1275.

Abstract:
This work devises a novel approach for mitigating the effects of Catastrophic Forgetting in Deep Reinforcement Learning-based cognitive radio engine implementations employed in space communication applications. Previous implementations of cognitive radio space communication systems utilized a moving-window-based online learning method, which discards part of its understanding of the environment each time the window is moved; this act of discarding is called Catastrophic Forgetting. This work investigated ways to control the forgetting process more systematically, both through a recursive training technique that implements forgetting in a controlled manner and through an ensemble learning technique in which each member of the ensemble represents the engine's understanding over a certain period of time. Both of these techniques were integrated into a cognitive radio engine proof-of-concept and delivered to the SDR platform on the International Space Station. The results were then compared to the results from the original proof-of-concept. Through this comparison, the ensemble learning technique showed promise when comparing performance between training techniques during different communication channel contexts.
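The ensemble technique can be sketched abstractly: when the learning window moves, the engine's current understanding is retained as a new ensemble member rather than discarded, and predictions combine all retained members. The training callback and the plain averaging below are illustrative assumptions, not the delivered engine's design.

```python
from collections import deque

class WindowEnsemble:
    """One member per time window; old members are kept, not discarded."""
    def __init__(self, max_members=5):
        self.members = deque(maxlen=max_members)  # oldest member drops off

    def end_of_window(self, train_member, experience):
        # Freeze the understanding gathered over this window as a member.
        self.members.append(train_member(experience))

    def predict(self, x):
        votes = [member(x) for member in self.members]
        return sum(votes) / len(votes)
```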
18

Sharp, Jessica L. "Learning And Forgetting Of Complex Serial Behaviors In Rats: Interference And Spacing Effects In The Serial Multiple Choice Task." Kent State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=kent1564070613748065.

19

Kurtz, Tanja. "Individual differences in learning and forgetting in old age: the role of basic cognitive abilities and subjective organization." Ulm: Universität Ulm, Fakultät für Ingenieurwissenschaften und Informatik, 2014. http://d-nb.info/1047384558/34.

20

Kubik, Veit. "Effects of Testing and Enactment on Memory." Doctoral thesis, Stockholms universitet, Psykologiska institutionen, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-108094.

Abstract:
Learning occurs not only when we encode information but also when we test our memory for this information at a later time. In three empirical studies, I investigated the individual and combined effects of interleaved testing (via repeated rounds of study and test practice) and encoding (via motor enactment) during learning on later cued-recall performance for action phrases. Such materials (e.g., "water the flowers") contain a verb and a noun and approximate everyday memory, which typically revolves around past and future actions. Study I demonstrated that both interleaved testing (vs. study only) and enactment (vs. verbal encoding) individually reduced the forgetting rate over a period of 1 week, but these effects were nonadditive. That is, the direct testing effect on the forgetting rate occurred for verbal, but not for enactive encoding; enactment reduced the forgetting rate for the study-only condition, but not for the study–test condition. A possible explanation of these findings is that both study techniques sufficiently elicit verb–noun relational processing that cannot be increased further by combining them. In Studies II and III, I replicated these testing-effect results and investigated whether they varied as a function of recall type (i.e., noun-cued recall of verbs and verb-cued recall of nouns). For verbal encoding (Study II), the direct testing effect was of similar size for both noun- and verb-cued recall. For enactive encoding, the direct testing effect was lacking irrespective of recall type. In addition, interleaved tests enhanced subsequent re-encoding of action phrases, leading to accelerated learning. This indirect testing effect was increased for the noun-cued recall of verbs, for both verbal and enactive encoding. A possible explanation is that because nouns are semantically more stable, in that their meaning changes less over time and across different contexts, they are more recognizable. Hence, associated information (e.g., about the recall status) may be more available to the learner during restudy, which, in turn, can initiate more effective re-encoding. The two different testing benefits (i.e., direct and indirect) may partly engage different mechanisms, as they were influenced differentially by the manipulations of encoding type and recall type. The findings presented in the thesis provide new knowledge regarding the combined effects of strategies and materials that influence memory.

At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 1: Epub ahead of print. Paper 2: Manuscript. Paper 3: Manuscript.

21

Wilson, Haley Pace. "Generalizability of Predictive Performance Optimizer Predictions Across Learning Task Type." Wright State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1471010032.

22

Gatto, Lorenzo. "Apprendimento continuo per il riconoscimento di immagini." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/15607/.

Abstract:
In recent years, deep learning has attracted great interest from the scientific community, largely thanks to the results obtained since 2012 in computer vision, speech recognition and speech synthesis. The most important deep learning results have been obtained by training machine learning models on static datasets, iterating the training procedure several times over all the available data. This contrasts with how humans learn, namely by seeing the data (images, sounds, etc.) only once but still managing to remember the past with a high level of precision. The way humans learn is called continual (or continuous) learning. Training approaches that avoid observing a pattern of the dataset repeatedly suffer from a problem called catastrophic forgetting, whereby the characteristics of patterns seen in the past tend to be forgotten, so that the model only recognizes patterns similar to those seen recently. Various solutions to the problem have been proposed, but none yet achieves performance similar to that of the cumulative approach, which runs the training procedure on all available data iteratively. The author's contribution was to measure the accuracy of the iCaRL continual learning technique on the CORe50 dataset and to compare the performance obtained with the results obtained by other authors on the same dataset using other techniques. The results show that current approaches to continual learning are still in their infancy, achieving accuracy levels not comparable to those obtainable with the cumulative approach.
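For context, the classification rule at the core of iCaRL (Rebuffi et al., 2017) is nearest-mean-of-exemplars: each class is summarized by the mean of its stored exemplars in feature space, and a sample is assigned to the closest class mean. A minimal sketch, with feature extraction assumed to happen elsewhere:

```python
import numpy as np

def nearest_mean_classify(feats, exemplars_by_class):
    """Assign each row of feats to the class with the nearest
    L2-normalized mean of its exemplar features."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    classes, centers = [], []
    for cls, ex in sorted(exemplars_by_class.items()):
        ex = ex / np.linalg.norm(ex, axis=1, keepdims=True)
        mu = ex.mean(axis=0)
        classes.append(cls)
        centers.append(mu / np.linalg.norm(mu))
    centers = np.stack(centers)                               # (C, d)
    dists = np.linalg.norm(feats[:, None, :] - centers[None], axis=2)
    return np.array(classes)[dists.argmin(axis=1)]
```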
23

Martínez Plumed, Fernando. "Incremental and developmental perspectives for general-purpose learning systems." Doctoral thesis, Universitat Politècnica de València, 2016. http://hdl.handle.net/10251/67269.

Abstract:
The stupefying success of Artificial Intelligence (AI) for specific problems, from recommender systems to self-driving cars, has not yet been matched by similar progress in general AI systems capable of coping with a variety of problems. This dissertation deals with the long-standing problem of creating more general AI systems, through the analysis of their development and the evaluation of their cognitive abilities. Firstly, this thesis contributes a general-purpose learning system that meets several desirable characteristics in terms of expressiveness, comprehensibility and versatility. The system works with approaches that are inherently general: inductive programming and reinforcement learning. The system does not rely on a fixed library of learning operators, but can be endowed with new ones, so being able to operate in a wide variety of contexts. This flexibility, jointly with its declarative character, makes it possible to use the system as an instrument for better understanding the role (and difficulty) of the constructs that each task requires. The learning process is also overhauled with a new developmental and lifelong approach to knowledge acquisition, consolidation and forgetting, which is necessary when bounded resources (memory and time) are considered. Secondly, this thesis analyses whether the use of intelligence tests for AI evaluation is a much better alternative to most task-oriented evaluation approaches in AI. Accordingly, we review what has been done when AI systems have been confronted with tasks taken from intelligence tests. In this regard, we scrutinise what intelligence tests measure in machines, whether they are useful to evaluate AI systems, whether they are really challenging problems, and whether they are useful to understand (human) intelligence. Finally, the analysis of the concepts of development and incremental learning in AI systems is done at the conceptual level but also through several of these intelligence tests, providing further insight for the understanding and construction of general-purpose developing AI systems.
24

Johansson, Philip. "Incremental Learning of Deep Convolutional Neural Networks for Tumour Classification in Pathology Images." Thesis, Linköpings universitet, Institutionen för medicinsk teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158225.

Abstract:
Understaffing of medical doctors is becoming a compelling problem in many healthcare systems. This problem can be alleviated by utilising Computer-Aided Diagnosis (CAD) systems to substitute for doctors in different tasks, for instance histopathological image classification. The recent surge of deep learning has allowed CAD systems to perform this task at a very competitive level. However, a major challenge with this task is the need to periodically update the models with new data and/or new classes or diseases. These periodical updates will result in catastrophic forgetting, as Convolutional Neural Networks typically require the entire data set beforehand and tend to lose knowledge about old data when trained on new data. Incremental learning methods were proposed to alleviate this problem. In this thesis, two incremental learning methods, Learning without Forgetting (LwF) and a generative rehearsal-based method, are investigated. They are evaluated on two criteria: first, the capability of incrementally adding new classes to a pre-trained model, and second, the ability to update the current model with a new, unbalanced data set. Experiments show that LwF does not retain knowledge properly in either case. Further experiments are needed to draw any definite conclusions, for instance using another training approach for the classes and trying different combinations of losses. On the other hand, the generative rehearsal-based method tends to work for one class, showing good potential to work if better-quality images were generated. Additional experiments are also required to investigate new architectures and approaches for more stable training.
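The LwF objective evaluated here pairs the new task's cross-entropy with a distillation term that keeps the updated network's outputs for old classes close to those recorded from the frozen pre-update model. A minimal PyTorch sketch; the temperature and weighting follow the common formulation rather than the exact values used in the thesis.

```python
import torch.nn.functional as F

def lwf_loss(new_logits, old_logits, labels, T=2.0, lam=1.0):
    """Learning without Forgetting loss (Li & Hoiem, 2017), sketched.

    old_logits are the frozen pre-update model's outputs for the old
    classes; the first old_logits.size(1) columns of new_logits are
    assumed to correspond to those same classes."""
    ce = F.cross_entropy(new_logits, labels)  # new-task term
    p_old = F.softmax(old_logits / T, dim=1)
    log_p_new = F.log_softmax(new_logits[:, :old_logits.size(1)] / T, dim=1)
    distill = -(p_old * log_p_new).sum(dim=1).mean() * (T * T)
    return ce + lam * distill
```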
25

Hocquet, Guillaume. "Class Incremental Continual Learning in Deep Neural Networks." Thesis, université Paris-Saclay, 2021. http://www.theses.fr/2021UPAST070.

Abstract:
We are interested in the problem of continual learning of artificial neural networks in the case where the data are available for only one class at a time. To address the problem of catastrophic forgetting that restrain the learning performances in these conditions, we propose an approach based on the representation of the data of a class by a normal distribution. The transformations associated with these representations are performed using invertible neural networks, which can be trained with the data of a single class. Each class is assigned a network that will model its features. In this setting, predicting the class of a sample corresponds to identifying the network that best fit the sample. The advantage of such an approach is that once a network is trained, it is no longer necessary to update it later, as each network is independent of the others. It is this particularly advantageous property that sets our method apart from previous work in this area. We support our demonstration with experiments performed on various datasets and show that our approach performs favorably compared to the state of the art. Subsequently, we propose to optimize our approach by reducing its impact on memory by factoring the network parameters. It is then possible to significantly reduce the storage cost of these networks with a limited performance loss. Finally, we also study strategies to produce efficient feature extractor models for continual learning and we show their relevance compared to the networks traditionally used for continual learning
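The one-network-per-class idea above can be made concrete with a toy stand-in: give each class its own density model trained only on that class's data, and predict by maximum likelihood. A sketch under that simplification (Gaussian densities in place of the thesis's invertible networks; names are illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal

class OneModelPerClass:
    """Class-incremental classifier: each class gets its own density model,
    trained only on that class's data; adding a class never touches the others."""
    def __init__(self):
        self.models = {}  # class label -> fitted density

    def add_class(self, label, X):
        # Stand-in for training an invertible network on one class:
        # fit a Gaussian to that class's features.
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        self.models[label] = multivariate_normal(mean=mu, cov=cov)

    def predict(self, x):
        # Prediction = the class whose model assigns the highest likelihood.
        return max(self.models, key=lambda c: self.models[c].logpdf(x))
```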
APA, Harvard, Vancouver, ISO, and other styles
26

Liang, Hongyan. "Three Essays on Performance Evaluation in Operations and Supply Chain Management." Kent State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=kent1504827189112207.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Alhawari, Omar I. "Operator Assignment Decisions in a Highly Dynamic Cellular Environment." Ohio University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1221596218.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Nguyen, Minh Ha (Information Technology & Electrical Engineering, Australian Defence Force Academy, UNSW). "Cooperative coevolutionary mixture of experts: a neuro ensemble approach for automatic decomposition of classification problems." Awarded by: University of New South Wales - Australian Defence Force Academy, School of Information Technology and Electrical Engineering, 2006. http://handle.unsw.edu.au/1959.4/38752.

Full text
Abstract:
Artificial neural networks have been widely used for machine learning and optimization. A neuro ensemble is a collection of neural networks that works cooperatively on a problem. In the literature, it has been shown that by combining several neural networks, the generalization of the overall system can be enhanced beyond the separate generalization abilities of the individuals. Evolutionary computation can be used to search for a suitable architecture and weights for neural networks; when evolutionary computation is used to evolve a neuro ensemble, the result is usually known as an evolutionary neuro ensemble. In most real-world problems, we either know little about the problem or the problem is too complex to have a clear vision of how to decompose it by hand. Thus, it is usually desirable to have a method to automatically decompose a complex problem into a set of overlapping or non-overlapping sub-problems and assign one or more specialists (i.e. experts, learning machines) to each of these sub-problems. An important feature of neuro ensembles is automatic problem decomposition: some neuro ensemble methods are able to generate networks where each individual network is specialized on a unique sub-task, such as mapping a subspace of the feature space. In real-world problems, this is usually an important feature for a number of reasons, including: (1) it provides an understanding of the decomposition nature of a problem; (2) if a problem changes, one can replace the network associated with the subspace where the change occurs without affecting the overall ensemble; (3) if one network fails, the rest of the ensemble can still function in their subspaces; (4) if one learns the structure of one problem, it can potentially be transferred to other similar problems. In this thesis, I focus on classification problems and present a systematic study of a novel evolutionary neuro ensemble approach which I call cooperative coevolutionary mixture of experts (CCME). Cooperative coevolution (CC) is a branch of evolutionary computation in which individuals in different populations cooperate to solve a problem and their fitness is calculated based on their reciprocal interactions. The mixture of experts (ME) model is a neuro ensemble approach which can generate networks that are specialized on different subspaces of the feature space. By combining CC and ME, I obtain a powerful framework that is able to automatically form the experts and train each of them. I show that the CCME method produces competitive results in terms of generalization ability without increasing the computational cost when compared to traditional training approaches. I also propose two different mechanisms for visualizing the resultant decomposition in high-dimensional feature spaces. The first is a simple one in which data are grouped based on the specialization of each expert and a color map of the data records is visualized. The second relies on principal component analysis to project the feature space onto lower dimensions, whereby decision boundaries generated by each expert are visualized through convex approximations. I also investigate the regularization effect of learning by forgetting on the proposed CCME, and show that learning by forgetting helps CCME to generate neuro ensembles of low structural complexity while maintaining their generalization abilities.
Overall, the thesis presents an evolutionary neuro ensemble method whereby (1) the generated ensemble generalizes well; (2) it is able to automatically decompose the classification problem; and (3) it generates networks with small architectures.
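To illustrate the mixture-of-experts component that CCME builds on: a gating network produces a soft assignment over experts, and the ensemble output is the gate-weighted combination of the expert outputs. A minimal NumPy sketch of the forward pass (weights assumed already evolved or trained; names are illustrative):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mixture_of_experts(x, expert_weights, gate_weights):
    """Forward pass of a mixture of experts for one input vector x:
    the gate softly assigns x to experts, and the ensemble output is
    the gate-weighted combination of the expert class posteriors."""
    gates = softmax(x @ gate_weights)                      # (n_experts,)
    expert_outputs = np.stack([softmax(x @ W) for W in expert_weights])
    return gates @ expert_outputs                          # (n_classes,)
```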
APA, Harvard, Vancouver, ISO, and other styles
29

Cook, Samantha. "The Effect of oestrogen in a series of models related to schizophrenia and Alzheimer's disease. A preclinical investigation into the effect of oestrogen on memory, executive function and anxiety in response to pharmacological insult and in a model of natural forgetting." Thesis, University of Bradford, 2012. http://hdl.handle.net/10454/5508.

Full text
Abstract:
Alzheimer's disease is associated with aging and is characterised by a progressive cognitive decline. Its onset in women coincides with the abrupt depletion of ovarian steroids, prompting investigation of oestrogen replacement therapy as a restorative or preventative measure. Gonadal steroids have also recently been implicated in other disease states, particularly schizophrenia. In addition to cognitive decline, sufferers of Alzheimer's disease and schizophrenia display anxiety-related behaviour, which gonadal steroids have also been shown to ameliorate. In this thesis several paradigms were used to investigate the effects of oestradiol benzoate (EB) on cognition and anxiety, utilising the NMDA receptor antagonist PCP, the muscarinic receptor antagonist scopolamine and the dopamine-releasing agent amphetamine to induce cognitive deficits in rats through different pharmacological mechanisms. The thesis also investigated the effects of EB on a delay-dependent cognitive deficit model of forgetfulness in natural aging. Results showed that subchronic PCP dosing failed to induce a significant deficit in the novel object recognition task. Locomotor activity tests demonstrated that the PCP-treated rats were sensitised to the treatment, suggesting that the PCP dosing regimen was successful. There was no significant effect of oestrogen in the reversal learning model or in the plus maze task designed to explore EB's effects on anxiety; however, in the latter task there was a trend towards an anxiogenic effect of EB. Results from the delay-dependent model of forgetfulness in natural aging demonstrated that EB could enhance recognition memory, but not spatial memory. The results are discussed in the context of the role of gonadal steroids, especially oestrogen, in combating the cognitive decline seen in schizophrenia, neurodegenerative disease and natural aging.
APA, Harvard, Vancouver, ISO, and other styles
30

Velková, Romana. "Psychologické aspekty reklamy." Master's thesis, Vysoká škola ekonomická v Praze, 2012. http://www.nusl.cz/ntk/nusl-199786.

Full text
Abstract:
The thesis deals with the topic of advertisement and advertising campaigns from the perspective of psychology. It first explains the basic psychological concepts relating to advertising, then describes the elements most commonly used in advertisements. These findings are applied in the practical part to a specific campaign in order to form hypotheses about the impact of advertising on customers. The aim of the practical part is to use the results of the survey to assess the validity of these hypotheses.
APA, Harvard, Vancouver, ISO, and other styles
31

Gerbier, Emilie. "Effet du type d’agencement temporel des répétitions d’une information sur la récupération explicite." Thesis, Lyon 2, 2011. http://www.theses.fr/2011LYO20029/document.

Full text
Abstract:
How information is repeated over time determines future recollection of this information. Studies in psychology have revealed a distributed practice effect: one retains information better when its occurrences are separated by long lags rather than short lags. Our studies focused specifically on cases in which items were repeated over several days. We compared the efficiency of three different temporal schedules of repetitions: a uniform schedule, in which repetitions occur at equal intervals; an expanding schedule, in which repetitions occur at longer and longer intervals; and a contracting schedule, in which repetitions occur at shorter and shorter intervals. In Experiments 1 and 2, the learning phase lasted one week and the retention interval two days; the expanding and uniform schedules proved more efficient than the contracting schedule. In Experiment 3, the learning phase lasted two weeks and the retention interval 2, 6, or 13 days. The superiority of the expanding schedule over the other two appeared gradually as the retention interval increased, suggesting that different schedules yield different forgetting rates. We also tested major theories of the distributed practice effect, namely encoding variability (Experiment 4) and study-phase retrieval (Experiment 2); our results appear consistent with the study-phase retrieval theory. We conclude by emphasizing the importance of considering findings from other areas of cognitive science, especially neuroscience and computer science, in the study of the distributed practice effect.
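The three repetition schedules compared in these experiments are easy to state precisely; a small sketch generating the inter-repetition gaps (parameters are illustrative, not those of the experiments):

```python
def schedule(first_gap, n_reps, kind="uniform", ratio=2.0):
    """Gaps (e.g., in days) between successive repetitions of an item."""
    if kind == "uniform":
        return [first_gap] * n_reps
    gaps = [first_gap * ratio ** i for i in range(n_reps)]
    return gaps if kind == "expanding" else gaps[::-1]  # contracting

# schedule(1, 3, "uniform")     -> [1, 1, 1]
# schedule(1, 3, "expanding")   -> [1, 2, 4]
# schedule(1, 3, "contracting") -> [4, 2, 1]   (same total span, reversed)
```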
APA, Harvard, Vancouver, ISO, and other styles
32

Kim, Jong Wook, Richard J. Koubek, and Frank E. Ritter. "Procedural skills: from learning to forgetting." 2008. http://etda.libraries.psu.edu/theses/approved/WorldWideIndex/ETD-3130/index.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Chen, Hsin Min, and 陳新民. "Lot-Sizing Models with Learning and Forgetting Effects." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/18037690590189420474.

Full text
Abstract:
Doctorate
National Taiwan University of Science and Technology
Department of Industrial Management
90 (ROC academic year)
This dissertation studies the problem of incorporating both learning and forgetting effects into lot-sizing models in order to determine lot sizes and related management decisions. Three discrete time-varying demand models and one continuous stochastic demand model are proposed, and the study provides practical guidance for choosing appropriate lot-sizing techniques. Chapter two deals with the discrete time-varying demand lot-sizing problem in which learning and forgetting effects on setup time and unit production time are both considered under fixed learning and forgetting rates. The optimal production policy, including the number of production runs, lot sizes, and the time points at which to start setups and production, can be obtained with a multi-dimensional forward dynamic programming algorithm. Experimental results indicate that learning effects influence the lot-size decision more than forgetting effects, and that the production learning effect on total cost is more influential than either the forgetting effects or the setup learning effect. Since this multi-dimensional forward dynamic programming algorithm becomes computationally intricate and intolerably slow when the forgetting effect on unit production time is a function of the break length and the level of experience gained before the break, a near-optimal forward dynamic programming algorithm is proposed in Chapter three. The near-optimal solution is compared with those obtained by the multi-dimensional forward dynamic programming algorithm and by four extended heuristics (the least unit cost heuristic, the technique for order placement and sizing, the Silver-Meal heuristic, and the economic production quantity algorithm). Several important observations from a two-phase experiment verify the quality of the proposed algorithm and the chosen heuristic method. In Chapter four, the original Wagner-Whitin and classical Economic Order Quantity algorithms are extended to the problem in which the effects of production learning, production forgetting, and the time value of money are considered simultaneously. Numerical examples indicate that the parameters of these three effects have significant impacts on lot sizes and relevant costs; comparisons between models with and without the three effects are also made. In Chapter five, we consider a continuous stochastic demand lot-sizing model in which the replenishment lead time is affected by manufacturing learning and forgetting. Based on three propositions that each feasible solution must satisfy, an effective search algorithm is derived to obtain the optimal solution with integer decision variables, including the number of orders, the order size, and the reorder level. Computational results indicate that the learning and forgetting effects on the expected total cost become significant as the ordering cost or the backorder cost increases.
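As background for the production learning effect used throughout this dissertation: under Wright's log-linear learning curve, the time to produce the n-th unit is t(n) = t1 · n^b with b = log2(learning rate), and forgetting reduces the experience carried into the next lot. A minimal sketch of this standard model (the dissertation's own forgetting function may differ; names are illustrative):

```python
import math

def unit_time(n, t1=1.0, learning_rate=0.8):
    """Wright's learning curve: time to produce the n-th unit."""
    b = math.log(learning_rate, 2)   # negative exponent
    return t1 * n ** b

def lot_time(lot_size, carried_units, t1=1.0, learning_rate=0.8):
    """Total production time for a lot, starting from `carried_units` of
    experience retained from earlier runs; forgetting shrinks this
    carry-over towards zero after long breaks."""
    return sum(unit_time(carried_units + i, t1, learning_rate)
               for i in range(1, lot_size + 1))

# Full retention vs. total forgetting between two lots of 50:
# lot_time(50, 50) < lot_time(50, 0) -- forgetting raises the second lot's time.
```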
APA, Harvard, Vancouver, ISO, and other styles
34

Van, Rensburg Madri Stephani Jansen. "Forgetting to remember : organisational memory." Thesis, 2011. http://hdl.handle.net/10500/4812.

Full text
Abstract:
Organisations need to learn from their current and past experiences to optimise their activities, decisions and future strategies. Non-governmental organisations are similar to public or governmental departments in that learning is crucial for their existence. One of the key factors influencing learning is the development and maintenance of a functional organisational memory. The organisational memory is a dynamic entity encompassing more than the storage facilities provided by an information technology system: it also resides in human form, with people acting as reservoirs and interpretation centres and feeding the organisational memory as a whole. Previous research on organisational memory focused mostly on describing the structure of storage systems, with the current focus on developing management information systems to enhance organisational memory storage and retrieval. Some work has been undertaken to describe the processes involved, which include accessing, storing and retrieving the memory. Other functions that need special attention are the development of data into information and, especially, creating and using knowledge. Earlier studies mostly examined organisational memory as it was represented at a specific point in an organisation's development. This study looks at all the developmental phases of a regional NGO: start-up, expansion in target territory, expansion in activities, consolidation and close-out. To investigate the temporal changes of organisational memory in a regional intermediary NGO, a retrospective case study methodology was used. The NGO was closing down, providing an opportunity to investigate all the stages of development. The data collection, analysis and interpretation involved in-depth interviews with current and past staff members and other key stakeholders, such as beneficiary organisations and consultants, as well as a complex set of documents, including proposals, strategic documents, minutes of meetings, and audiovisual material. The main themes and factors, including individuals, leadership, electronic and other management of the organisational memory, culture (notably the importance of a vision and a theory of change), policies and global developments, are discussed using a temporal ecological framework. The key findings illustrate the importance of directories, as part of the metamemory, in accessing seemingly dormant organisational memories. The conclusion is that organisational memory survives the demise of the organisation and remains accessible through directories.
Psychology
Ph. D. (Consulting Psychology)
APA, Harvard, Vancouver, ISO, and other styles
35

Gao, Zhi-Xian, and 高植賢. "CONSIDERING LEARNING AND FORGETTING EFFECTS IN LOT SIZING METHODS." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/33904955168299427783.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Osothsilp, Napassavong. "Worker-task assignment based on individual learning, forgetting, and task complexity." 2002. http://www.library.wisc.edu/databases/connect/dissertations.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Weng, Li Cheng, and 翁麗卿. "Lot sizing models with learning and forgetting in production and setups." Thesis, 1997. http://ndltd.ncl.edu.tw/handle/14093641266444446126.

Full text
Abstract:
Master's
National Taiwan Institute of Technology
Graduate Institute of Management Technology
85 (ROC academic year)
The thesis deals with the problem of incorporating both learning and forgetting effects into discrete time-varying demand lot-sizing models to determine lot sizes. Forgetting is a retrogression in learning which causes a loss of labour productivity due to breaks between intermittent production runs, and a loss of setup skill due to the frequency of setups. The focus of this work is on the producer's view; we assume that inventory is received at the end of each period. For concision, only four lot-sizing models chosen from the literature are included in this study. Three heuristic models, the economic order quantity (EOQ), the least unit cost (LUC), and the Silver-Meal (SM) models, are extended to the case where the two effects are considered simultaneously, and the extended Wagner-Whitin algorithm (WWA) is used to generate optimal solutions. Several important conclusions are drawn from a comparison of the three heuristic solutions with the optimal solutions, and suggestions are made for future research and for practitioners choosing an appropriate lot-sizing technique.
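One of the heuristics extended here, Silver-Meal, selects each lot by extending it one period at a time while the average cost per period (setup plus carrying) keeps falling. A minimal sketch of the classical rule, without the learning and forgetting extensions (names are illustrative):

```python
def silver_meal(demand, setup_cost, holding_cost):
    """Silver-Meal heuristic: grow each lot period by period while the
    average cost per period (setup + carrying) keeps decreasing."""
    orders, t, n = [], 0, len(demand)
    while t < n:
        best_avg, periods, carry = float("inf"), 1, 0.0
        for k in range(1, n - t + 1):
            # Demand of period t+k-1 is carried for k-1 periods.
            carry += holding_cost * (k - 1) * demand[t + k - 1]
            avg = (setup_cost + carry) / k
            if avg > best_avg:
                break           # per-period cost started rising: stop here
            best_avg, periods = avg, k
        orders.append((t, sum(demand[t:t + periods])))  # (period, lot size)
        t += periods
    return orders
```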
APA, Harvard, Vancouver, ISO, and other styles
38

Hockenberry, Jason. "Learning, forgetting and technology substitution in the treatment of coronary artery disease." 2008. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3316896.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

王哲宏. "A study on resource-constrained multi-project scheduling with learning and forgetting considerations." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/46820788578569134309.

Full text
Abstract:
Master's
National Pingtung University of Science and Technology
Department of Industrial Management
93 (ROC academic year)
In today's competitive environment, the ability to estimate costs and time activities when bidding on projects is critical for a business, both for operations and for profit. For a project characterised by repetitive procedures, the effects of learning experience and of forgetting during breaks between consecutive operations are the main factors determining variation in the net present value. This study incorporates the time value of money, together with learning and forgetting effects, into the resource-constrained multi-project scheduling problem and develops two efficient solution procedures: an optimal model and a genetic algorithm. To test the superiority of the proposed model and algorithm, the developed heuristic rule is compared with existing heuristic rules, followed by an analysis of the key factors affecting project scheduling performance. The results indicate that the proposed heuristic rule is superior to existing heuristic rules and that the effects of learning experience and forgetting periods are significant. This study therefore recommends that decision makers consider these features in project planning; the proposed search heuristic is also helpful to project managers as a means of cost reduction in project scheduling.
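The link between learning effects and net present value in this study is the discounting of cash flows: shorter, learned activity durations move cash receipts earlier in time. A minimal illustration (rates and amounts are hypothetical):

```python
def project_npv(cash_flows, rate):
    """Net present value of a list of period cash flows; learning shortens
    activity durations (cash arrives earlier), forgetting delays them."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Moving a 100-unit receipt from period 5 to period 4 (a faster, learned crew)
# at a 10% rate raises its present value from ~62.1 to ~68.3.
```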
APA, Harvard, Vancouver, ISO, and other styles
40

Glysch, Randall L. "The influence of story structure on learning and forgetting of younger and older adults." 1990. http://catalog.hathitrust.org/api/volumes/oclc/23856798.html.

Full text
Abstract:
Thesis (M.S.)--University of Wisconsin--Madison, 1990.
Typescript. Includes bibliographical references (leaves 36-40).
APA, Harvard, Vancouver, ISO, and other styles
41

Chiang, Chun-Yi, and 江俊儀. "A Study of Mixed Inventory Models with Ordering Process under Learning and Forgetting Effect." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/ng5q78.

Full text
Abstract:
Master's
National Taipei University of Technology
Graduate Institute of Industrial Engineering and Management
98 (ROC academic year)
Traditionally, repeated ordering processes are modelled under the assumption that the ordering cost is constant, so that the relationship between ordering cost and total cost is a simple linear function. In fact, the cost of each order is not fixed, because of the learning effect in the ordering operation; moreover, interruptions of the ordering operation cause a forgetting effect, which makes the actual total cost higher than a total cost that accounts for learning alone. This thesis investigates the impact of learning and forgetting effects on ordering cost for a continuous review inventory model with controllable lead time and a mixture of backorder price discounts and partial lost sales. The order quantity, backorder price discount, safety factor and lead time are decision variables, and the objective is to minimize the expected total cost with respect to these variables. We assume two probability distributions for lead time demand, the normal distribution and a general distribution, and develop an algorithm for each to find the optimal order quantity, backorder price discount, safety factor and lead time. Two numerical examples illustrate the results.
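The ordering-cost behaviour studied here can be sketched with the usual log-linear learning curve applied to successive orders, with an interruption resetting the accumulated experience. A minimal illustration under the simplifying assumption of total forgetting at the break (the thesis's forgetting model may be more gradual; names are illustrative):

```python
import math

def ordering_cost(i, a0=100.0, learning_rate=0.9):
    """Cost of the i-th consecutive order under a log-linear learning curve:
    repeated processing of orders makes each successive order cheaper."""
    b = math.log(learning_rate, 2)
    return a0 * i ** b

def total_ordering_cost(n_orders, interrupted_at=None, a0=100.0, lr=0.9):
    """Sum ordering costs over n orders; an interruption resets the
    experience counter (total forgetting), so later orders cost more again."""
    cost, exp = 0.0, 0
    for i in range(1, n_orders + 1):
        if interrupted_at is not None and i == interrupted_at:
            exp = 0  # forgetting: experience lost during the break
        exp += 1
        cost += ordering_cost(exp, a0, lr)
    return cost
```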
APA, Harvard, Vancouver, ISO, and other styles
42

"Incremental Learning With Sample Generation From Pretrained Networks." Master's thesis, 2020. http://hdl.handle.net/2286/R.I.57207.

Full text
Abstract:
In the last decade, deep learning-based models have revolutionized machine learning and computer vision applications. However, these models are data-hungry and training them is time-consuming. In addition, when deep neural networks are updated to augment their prediction space with new data, they run into the problem of catastrophic forgetting, where the model forgets previously learned knowledge as it overfits to the newly available data. Incremental learning algorithms enable deep neural networks to prevent catastrophic forgetting by retaining knowledge of previously observed data while also learning from newly available data. This thesis presents three models for incremental learning: (i) an algorithm for generative incremental learning using a pre-trained deep neural network classifier; (ii) a hashing-based clustering algorithm for efficient incremental learning; (iii) a student-teacher coupled neural network that distills knowledge for incremental learning. The proposed algorithms were evaluated on popular vision datasets for classification tasks. The thesis concludes with a discussion of the feasibility of using these techniques to transfer information between networks and for incremental learning applications.
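The generative flavour of incremental learning described here, often called generative rehearsal or replay, mixes each batch of new-class data with pseudo-samples drawn from a generator fitted to the old data and labelled by the frozen pre-update model. A minimal PyTorch sketch (`generator.latent_dim` and the other names are assumptions, not the thesis's code):

```python
import torch
import torch.nn.functional as F

def rehearsal_epoch(classifier, generator, old_model, new_loader, optimizer,
                    n_replay=64):
    """One epoch of generative rehearsal: each real batch of new-class data
    is mixed with generated pseudo-samples labelled by the frozen old model,
    so knowledge of the old classes keeps being rehearsed."""
    for x_new, y_new in new_loader:
        with torch.no_grad():
            z = torch.randn(n_replay, generator.latent_dim)  # assumed attribute
            x_old = generator(z)                   # replayed pseudo-samples
            y_old = old_model(x_old).argmax(dim=1) # labels from the old model
        x, y = torch.cat([x_new, x_old]), torch.cat([y_new, y_old])
        loss = F.cross_entropy(classifier(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```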
Dissertation/Thesis
Masters Thesis Computer Science 2020
APA, Harvard, Vancouver, ISO, and other styles
43

Mondesire, Sean. "Complementary Layered Learning." Doctoral diss., 2014. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/6140.

Full text
Abstract:
Layered learning is a machine learning paradigm used to develop autonomous robotic agents by decomposing a complex task into simpler subtasks and learning each sequentially. Although the paradigm continues to have success in multiple domains, performance can be unexpectedly unsatisfactory. Using Boolean-logic problems and autonomous agent navigation, we show that poor performance is due to the learner either forgetting how to perform earlier learned subtasks too quickly (favoring plasticity) or having difficulty learning new things (favoring stability). We demonstrate that this imbalance can hinder learning so that task performance is no better than that of a sub-optimal learning technique, monolithic learning, which does not use decomposition. Through the resulting analyses, we identify factors that can lead to imbalance and their negative effects, providing a deeper understanding of stability and plasticity in decomposition-based approaches such as layered learning. To combat the negative effects of the imbalance, a complementary learning system is applied to layered learning. The new technique augments the original learning approach with dual storage region policies to prevent useful information from being removed from an agent's policy prematurely. Through multi-agent experiments, a 28% task performance increase is obtained with the proposed augmentations over the original technique.
Ph.D.
Doctorate
Computer Science
Engineering and Computer Science
Computer Science
APA, Harvard, Vancouver, ISO, and other styles
44

"Lifelong Adaptive Neuronal Learning for Autonomous Multi-Robot Demining in Colombia, and Enhancing the Science, Technology and Innovation Capacity of the Ejército Nacional de Colombia." Doctoral diss., 2019. http://hdl.handle.net/2286/R.I.55488.

Full text
Abstract:
In order to deploy autonomous multi-robot teams for humanitarian demining in Colombia, two key problems need to be addressed. First, a robotic controller with limited power that can completely cover a dynamic search area is needed. Second, the Colombian National Army (COLAR) needs to increase its science, technology and innovation (STI) capacity to help develop, build and maintain such robots. Using Thangavelautham's (2012, 2017) Artificial Neural Tissue (ANT) control algorithm, a robotic controller for an autonomous multi-robot team was developed. Trained by a simple genetic algorithm, ANT is an artificial neural network (ANN) controller with a sparse, coarse-coding network architecture and adaptive activation functions. Starting from the exterior of open, basic geometric grid areas, computer simulations of an ANT multi-robot team with limited time steps, no central controller and limited a priori information covered some areas completely in linear time, and other areas near-completely in quasi-linear time, comparable to the theoretical cover-time bounds of grid-based, ant-pheromone area coverage algorithms. To mitigate catastrophic forgetting, a new learning method for ANT, Lifelong Adaptive Neuronal Learning (LANL), was developed, in which neural network weight parameters for a specific coverage task were frozen and only the activation function and output behavior parameters were re-trained for a new coverage task. The performance of the LANL controllers was comparable to training all parameters ab initio for a new ANT controller on the new coverage task. To increase COLAR's STI capacity, a proposal for a new STI officer corps, Project ÉLITE (Equipo de Líderes en Investigación y Tecnología del Ejército), was developed, in which officers enroll in a research-intensive master of science program in applied mathematics or physics in Colombia and conduct research in the US during their final year. ÉLITE is inspired by the Israel Defense Forces Talpiot program.
Dissertation/Thesis
Doctoral Dissertation Applied Mathematics for the Life and Social Sciences 2019
APA, Harvard, Vancouver, ISO, and other styles
45

Langa, Selaelo Norah. "The role and function of emotions in primary school children's meaningful learning." Diss., 1999. http://hdl.handle.net/10500/17169.

Full text
Abstract:
The aim of this study was to critically examine the role and function of emotions in primary school children's meaningful learning. Emotions commonly experienced by primary school children were identified, and an indication was given of how they relate to meaningful learning. Factors that affect both emotions and meaningful learning were also discussed. An empirical investigation found that emotions influence the meaningful learning of primary school children either positively or negatively. The following emotions showed significant positive and negative correlations with meaningful learning: anger, aggression, anxiety, fear, love, joy and affection. Factors such as family size, gender and the environment (the life world of primary school children) also influence meaningful learning.
Psychology of Education
M.Ed.(Psychology of Education)
APA, Harvard, Vancouver, ISO, and other styles
46

Anbil, Parthipan Sarath Chandar. "On challenges in training recurrent neural networks." Thèse, 2019. http://hdl.handle.net/1866/23435.

Full text
Abstract:
In a multi-step prediction problem, the prediction at each time step can depend on the input at any of the previous time steps, far in the past. Modelling such long-term dependencies is one of the fundamental problems in machine learning. In theory, Recurrent Neural Networks (RNNs) can model any long-term dependency; in practice, they can only model short-term dependencies, due to the problem of vanishing and exploding gradients. This thesis explores the problem of vanishing gradients in recurrent neural networks and proposes novel solutions. Chapter 3 explores the idea of using external memory to store the hidden states of a Long Short-Term Memory (LSTM) network. By making the read and write operations of the external memory discrete, the proposed architecture reduces the rate at which gradients vanish in an LSTM; these discrete operations also enable the network to create dynamic skip connections across time. Chapter 4 attempts to characterize all the sources of vanishing gradients in a recurrent neural network and proposes a new recurrent architecture with significantly better gradient flow than state-of-the-art recurrent architectures. The proposed Non-saturating Recurrent Units (NRUs) have no saturating activation functions and use additive cell updates instead of multiplicative cell updates. Chapter 5 discusses the challenges of using recurrent neural networks in the context of lifelong learning, in which the network is expected to learn a series of tasks over its lifetime and the dependencies are not just within a task but also across tasks. This chapter discusses two fundamental problems in lifelong learning, (i) catastrophic forgetting of old tasks and (ii) network capacity saturation, and proposes a solution to both while training a recurrent neural network.
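To make the Chapter 4 idea concrete: a non-saturating recurrent cell avoids tanh/sigmoid gates and updates its memory additively, so gradients are not repeatedly squashed by saturating gate derivatives. A toy cell in that spirit (a simplification for illustration, not the exact NRU of the thesis):

```python
import torch
import torch.nn as nn

class AdditiveCell(nn.Module):
    """Toy recurrent cell in the spirit of non-saturating units: ReLU
    (non-saturating) activations and an additive, not multiplicative,
    memory update."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.in_proj = nn.Linear(input_size + hidden_size, hidden_size)
        self.write = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, h, c):
        z = torch.cat([x, h], dim=-1)
        h_new = torch.relu(self.in_proj(z) + c)  # read memory, non-saturating
        c_new = c + self.write(z)                # additive memory update
        return h_new, c_new
```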
APA, Harvard, Vancouver, ISO, and other styles