Dissertations / Theses on the topic 'Generative adversarial networks'

Consult the top 50 dissertations / theses for your research on the topic 'Generative adversarial networks.'

1

Wang, Zesen. "Generative Adversarial Networks in Text Generation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-264575.

Abstract:
The Generative Adversarial Network (GAN) was first proposed in 2014 and has been studied and developed intensively in recent years. It has achieved great success on problems that cannot be explicitly defined by a mathematical equation, such as generating realistic images. However, because the GAN was originally designed for continuous domains (image generation, for example), its performance in text generation is still developing: sentences are inherently discrete (no interpolation exists between "hello" and "bye"). The thesis first introduces fundamental concepts in natural language processing, generative models, and reinforcement learning. For each part, some state-of-the-art methods and commonly used metrics are introduced. The thesis also proposes two models, one for random sentence generation and one for context-based summary generation. Both models use GAN techniques and are trained on large-scale datasets. Owing to limited resources, the models are designed and trained as prototypes and therefore cannot achieve state-of-the-art performance. However, the results still show promise for the application of GANs to text generation. The thesis also proposes a novel model-based metric that evaluates the quality of a summary with reference to both the source text and the summary. The source code of the thesis will be available soon in the GitHub repository: https://github.com/WangZesen/Text-Generation-GAN.
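As a rough illustration of the discreteness problem this abstract describes, the following minimal sketch shows that the midpoint between two one-hot token vectors is not itself a valid token. It is not taken from the thesis: the toy vocabulary and the helper names `one_hot`, `interpolate`, and `is_valid_token` are hypothetical and chosen only for this example.

```python
# Toy illustration only; the vocabulary and helpers are assumptions,
# not part of the cited thesis.
VOCAB = ["hello", "bye", "thanks"]

def one_hot(token):
    """Encode a token as a one-hot vector over VOCAB."""
    return [1.0 if t == token else 0.0 for t in VOCAB]

def interpolate(a, b, alpha):
    """Linear interpolation between two vectors."""
    return [(1 - alpha) * x + alpha * y for x, y in zip(a, b)]

def is_valid_token(vec):
    """A valid token vector has exactly one 1.0 and zeros elsewhere."""
    return sorted(vec) == [0.0] * (len(vec) - 1) + [1.0]

# The midpoint between "hello" and "bye" puts half the mass on each token,
# so it corresponds to no word in the vocabulary.
mid = interpolate(one_hot("hello"), one_hot("bye"), 0.5)
print(mid, is_valid_token(mid))
```

In an image, by contrast, averaging two pixel values still yields a valid pixel value, which is one reason continuous domains are a more natural fit for gradient-based GAN training.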
2

Daley, Jr John. "Generating Synthetic Schematics with Generative Adversarial Networks." Thesis, Högskolan Kristianstad, Fakulteten för naturvetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hkr:diva-20901.

Abstract:
This study investigates synthetic schematic generation using conditional generative adversarial networks; specifically, the Pix2Pix algorithm was implemented for the experimental phase of the study. With the increase in the capabilities and availability of deep neural networks, there is a demand for verbose datasets. This, in combination with increased privacy concerns, has led to the use of synthetic data generation. Analysis of the synthetic images was completed using a survey. Blueprint images were generated and were successful in passing as genuine images with an accuracy of 40%. This study confirms the ability of generative neural networks to produce synthetic blueprint images.
3

Berman, Alan. "Generative adversarial networks for fine art generation." Master's thesis, University of Cape Town, 2020. http://hdl.handle.net/11427/32458.

Abstract:
Generative Adversarial Networks (GANs), a generative modelling technique most commonly used for image generation, have recently been applied to the task of fine art generation. Wasserstein GANs and GANHack techniques have not been applied in GANs that generate fine art, despite their showing improved GAN results in other applications. This thesis investigates whether Wasserstein GANs and GANHack extensions to DCGANs can improve the quality of DCGAN-based fine art generation. There is also no accepted method of evaluating or comparing GANs for fine art generation. DCGAN's, Wasserstein GANs' and GANHack techniques' outputs on a modest computational budget were quantitatively and qualitatively compared to see which techniques showed improvement over DCGAN. A method for evaluating computer-generated fine art, HEART, is proposed to cover both the qualities of good human-created fine art and the shortcomings of computer-created fine art, and to include the cognitive and emotional impact as well as the visual appearance. Prominent GAN quantitative evaluation techniques were used to compare sample images these GANs produced on the MNIST, CIFAR-10 and Imagenet-1K image data sets. These results were compared with sample images these GANs produced on the above data sets, as well as on art data sets. A pilot study of HEART was performed with 20 users. Wasserstein GANs achieved higher visual quality outputs than the baseline DCGAN, as did the use of GANHacks, on all the fine art data sets and are thus recommended for use in future work on GAN-based fine art generation. The study also demonstrated that HEART can be used for the evaluation and comparison of art GANs, providing comprehensive, objective quality assessments which can be substantiated in terms of emotional and cognitive impact as well as visual appearance.
4

Zeid, Baker Mousa. "Generation of Synthetic Images with Generative Adversarial Networks." Thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-15866.

Abstract:
Machine Learning is a fast-growing area that revolutionizes computer programs by providing systems with the ability to automatically learn and improve from experience. In most cases, the training process begins with extracting patterns from data. Data is a key factor for machine learning algorithms; without data, the algorithms will not work. Thus, having sufficient and relevant data is crucial for performance. In this thesis, the researcher tackles the problem of not having a sufficient dataset, in terms of the number of training examples, for an image classification task. The idea is to use Generative Adversarial Networks to generate synthetic images similar to the ground truth, and in this way expand a dataset. Two types of experiments were conducted: the first was used to fine-tune a Deep Convolutional Generative Adversarial Network for a specific dataset, while the second was used to analyze how synthetic data examples affect the accuracy of a Convolutional Neural Network in a classification task. Three well-known datasets were used in the first experiment, namely MNIST, Fashion-MNIST and Flower photos, while two datasets were used in the second experiment: MNIST and Fashion-MNIST. The generated MNIST and Fashion-MNIST images had good overall quality. Some classes had clear visual errors while others were indistinguishable from ground truth examples. The generated Flower photos images, by contrast, suffered from poor visual quality: one can easily tell the synthetic images from the real ones. One reason for the poor performance is the large quantity of noise in the Flower photos dataset, which made it difficult for the model to spot the important features of the flowers. The results from the second experiment show that the accuracy does not increase when the two datasets, MNIST and Fashion-MNIST, are expanded with synthetic images. This is not because the generated images had bad visual quality, but because the accuracy turned out not to be highly dependent on the number of training examples. It can be concluded that Deep Convolutional Generative Adversarial Networks are capable of generating synthetic images similar to the ground truth and thus can be used to expand a dataset. However, this approach does not completely solve the initial problem of not having adequate datasets, because Deep Convolutional Generative Adversarial Networks may themselves require, depending on the dataset, a large quantity of training examples.
5

Haiderbhai, Mustafa. "Generating Synthetic X-rays Using Generative Adversarial Networks." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/41092.

Abstract:
We propose a novel method for generating synthetic X-rays from atypical inputs. This method creates approximate X-rays for use in non-diagnostic visualization problems where only generic cameras and sensors are available. Traditional methods are restricted to 3-D inputs such as meshes or Computed Tomography (CT) scans. We create custom synthetic X-ray datasets using a custom generator capable of creating RGB images, point cloud images, and 2-D pose images. We create a dataset using natural hand poses and train general-purpose Conditional Generative Adversarial Networks (CGANs) as well as our own novel network, pix2xray. Our results show that generating X-rays from point cloud and RGB images is plausible. We also demonstrate the superiority of our pix2xray approach, especially in the troublesome cases of occlusion due to overlapping or rotated anatomy. Overall, our work establishes a baseline showing that synthetic X-rays can be simulated using inputs such as RGB images and point clouds.
6

Garcia, Torres Douglas. "Generation of Synthetic Data with Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254366.

Abstract:
The aim of synthetic data generation is to provide data that is not real for cases where the use of real data is somehow limited: for example, when there is a need for larger volumes of data, when the data is sensitive to use, or simply when it is hard to get access to the real data. Traditional methods of synthetic data generation use techniques that do not intend to replicate important statistical properties of the original data. Properties such as the distribution, the patterns or the correlation between variables are often omitted. Moreover, most of the existing tools and approaches require a great deal of user-defined rules and do not make use of advanced techniques like Machine Learning or Deep Learning. While Machine Learning is an innovative area of Artificial Intelligence and Computer Science that uses statistical techniques to give computers the ability to learn from data, Deep Learning is a closely related field based on learning data representations, which may prove useful for the task of synthetic data generation. This thesis focuses on one of the most interesting and promising innovations of recent years in the Machine Learning community: Generative Adversarial Networks. An approach for generating discrete, continuous or text synthetic data with Generative Adversarial Networks is proposed, tested, evaluated and compared with a baseline approach. The results prove the feasibility and show the advantages and disadvantages of using this framework. Despite its high demand for computational resources, a Generative Adversarial Networks framework is capable of generating quality synthetic data that preserves the statistical properties of a given dataset.
7

Graffieti, Gabriele. "Style Transfer with Generative Adversarial Networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/17015/.

Abstract:
This dissertation applies concepts from style transfer and image-to-image translation to the problem of defogging. Defogging (or dehazing) is the task of removing fog from an image, restoring it as if the photograph had been taken in optimal weather conditions. Defogging is of particular interest in many fields, such as surveillance or self-driving cars. In this thesis an unpaired approach to defogging is adopted, translating a foggy image to the corresponding clear picture without having pairs of foggy and ground-truth haze-free images during training. This approach is particularly significant because of the difficulty of gathering an image collection of exactly the same scenes with and without fog. Many of the models and techniques used in this dissertation already existed in the literature, but they are extremely difficult to train, and it is often highly problematic to obtain the desired behavior. Our contribution was a systematic implementation and experimental activity, conducted with the aim of attaining a comprehensive understanding of how these models work and of the role of datasets and training procedures in the final results. We also analyzed metrics and evaluation strategies in order to assess the quality of the presented model in the most correct and appropriate manner. First, the feasibility of an unpaired approach to defogging was analyzed using the cycleGAN model. Then, the base model was enhanced with a cycle perceptual loss, inspired by style transfer techniques. Next, the role of the training set was investigated, showing that improving the quality of data is at least as important as using more powerful models. Finally, our approach is compared with state-of-the-art defogging methods, showing that the quality of our results is in line with preexisting approaches, even though our model was trained using unpaired data.
8

Aftab, Nadeem. "Disocclusion Inpainting using Generative Adversarial Networks." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-40502.

Abstract:
The older methods used for image inpainting in the Depth Image Based Rendering (DIBR) process are inefficient at producing high-quality virtual views from captured data. From the viewpoint of the original image, the structure of the generated data seems less distorted in a virtual view obtained by translation, but when the virtual view involves rotation, gaps and missing spaces become visible in the DIBR-generated data. The typical approaches for filling the disocclusion tend to be slow, inefficient, and inaccurate. In this project, a modern technique, the Generative Adversarial Network (GAN), is used to fill the disocclusion. A GAN consists of two or more neural networks that compete against each other during training. The results of this study show that a GAN can inpaint the disocclusion while keeping the structure consistent. Additionally, another method (Filling) is used to enhance the quality of GAN and DIBR images. The statistical evaluation of the results shows that the GAN and the filling method enhance the quality of DIBR images.
9

Paget, Bryan. "An Introduction to Generative Adversarial Networks." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39603.

10

Daniel, Filippo <1995>. "Transfer learning with generative adversarial networks." Master's Degree Thesis, Università Ca' Foscari Venezia, 2020. http://hdl.handle.net/10579/16989.

Abstract:
Generative Adversarial Networks (GANs) have emerged in recent years as the undisputed state of the art for image synthesis. This model leverages the recent successes of convolutional networks in the field of computer vision to learn the probability distribution of image datasets. Since GANs were first proposed, many developments and uses of the model have followed. This thesis aims to review the evolution of the model and to use one of the most recent variations to generate realistic portrait images with a targeted set of features. The model is applied in a transfer learning approach, and its advantages and disadvantages relative to standard approaches are discussed. Furthermore, classical and deep computer vision tools are used to edit and confirm the results obtained from the GAN model.
11

Fan, Zijian. "Applying Generative Adversarial Networks for the Generation of Adversarial Attacks Against Continuous Authentication." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-289634.

Abstract:
Cybersecurity has been a hot topic over the past decades, with many approaches proposed to secure our private information. One of the emerging approaches in security is continuous authentication, in which the computer system authenticates the user by monitoring the user's behavior during the login session. Although research on continuous authentication has made significant progress, the security of state-of-the-art continuous authentication systems is far from perfect. In this thesis, we explore the ability of classifiers used in continuous authentication and examine whether they can be bypassed by samples of user behavior generated by generative models. In our work, we considered four machine learning classifiers as the continuous authentication system: one-class support vector machine, support vector machine, Gaussian mixture model and an artificial neural network. Furthermore, we considered three generative models used to mimic the user behavior: generative adversarial network, kernel density estimation generator, and MMSE-based generator. The considered classifiers and generative models were tested on two continuous authentication datasets. The results show that generative adversarial networks achieved superior results, with more than 50% of the generated samples passing continuous authentication.
12

Beyki, Mohammad Reza. "Synthetic Electronic Medical Record Generation using Generative Adversarial Networks." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/104642.

Abstract:
Computers replaced our record books some time ago, and medical records are no exception. Electronic Health Records (EHRs) are digital versions of a patient's medical records. EHRs are available to authorized users, and they contain the medical records of the patient, which should help doctors understand a patient's condition quickly. In recent years, Deep Learning models have proved their value and have become state-of-the-art in computer vision, natural language processing, speech and other areas. The private nature of EHR data has prevented public access to EHR datasets. There are many obstacles to creating a deep learning model with EHR data. Because EHR data consist primarily of huge sparse matrices, these challenges are mostly unique to this field. As a result, research in this area is limited, and existing research can be improved substantially. In this study, we focus on high-performance synthetic data generation in EHR datasets. Artificial data generation can help reduce privacy leakage for dataset owners, as it has been shown that de-identification methods are prone to re-identification attacks. We propose a novel approach we call Improved Correlation Capturing Wasserstein Generative Adversarial Network (SCorGAN) to create EHR data. This work leverages Deep Convolutional Neural Networks to extract and understand spatial dependencies in EHR data. To improve our model's performance, we focus on our Deep Convolutional AutoEncoder to better map our real EHR data to the latent space where we train the Generator. To assess our model's performance, we demonstrate that our generative model can create excellent data that are statistically close to the input dataset. Additionally, we evaluate our synthetic dataset against the original data using our previous work that focused on GAN Performance Evaluation. This work is publicly available at https://github.com/mohibeyki/SCorGAN
Master of Science
Artificial Intelligence (AI) systems have improved greatly in recent years. They are being used to understand all kinds of data. A practical use case for AI systems is to leverage their power to identify illnesses and find correlations between different conditions. To train AI and Machine Learning systems, we need to feed them huge datasets, and in the training process, we need to guide them so that they learn different features in our data. The more data an intelligent system has seen, the better it performs. However, health records are private, and we cannot share real people's health records with the public, whether they are a researcher or not. This study provides a novel approach to synthetic data generation that others can use with intelligent systems. Then these systems can work with actual health records and give us accurate feedback on people's health conditions. We then show that our synthetic dataset is a good substitute for real datasets to train intelligent systems. Lastly, we present an intelligent system that we have trained using synthetic datasets to identify illnesses in a real dataset with high accuracy and precision.
13

Oskarsson, Joel. "Probabilistic Regression using Conditional Generative Adversarial Networks." Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166637.

Abstract:
Regression is a central problem in statistics and machine learning with applications everywhere in science and technology. In probabilistic regression the relationship between a set of features and a real-valued target variable is modelled as a conditional probability distribution. There are cases where this distribution is very complex and not properly captured by simple approximations, such as assuming a normal distribution. This thesis investigates how conditional Generative Adversarial Networks (GANs) can be used to properly capture more complex conditional distributions. GANs have seen great success in generating complex high-dimensional data, but less work has been done on their use for regression problems. This thesis presents experiments to better understand how conditional GANs can be used in probabilistic regression. Different versions of GANs are extended to the conditional case and evaluated on synthetic and real datasets. It is shown that conditional GANs can learn to estimate a wide range of different distributions and be competitive with existing probabilistic regression models.
14

Egan, Nicholas R. (Nicholas Ryan). "Natural video synthesis with Generative Adversarial Networks." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/123076.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 71-74).
Generative Adversarial Networks (GANs) are the state of the art neural network models for image generation, but the use of GANs for video generation is still largely unexplored. This thesis introduces new GAN based video generation methods by proposing the technique of model inflation and the segmentation-to-video task. The model inflation technique converts image generative models into video generative models, and experiments show that model inflation improves training speed, training stability, and output video quality. The segmentation-to-video task is that of turning an input image segmentation mask into an output video matching that segmentation. A GAN model was created to perform this task, and its usefulness as a creative tool was demonstrated.
15

Nataraj, Vismitha, and Sushmitha Narayanan. "Resolving Class Imbalance using Generative Adversarial Networks." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-41405.

16

Brodie, Michael B. "Methods for Generative Adversarial Output Enhancement." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8763.

Abstract:
Generative Adversarial Networks (GAN) learn to synthesize novel samples for a given data distribution. While GANs can train on diverse data of various modalities, the most successful use cases to date apply GANs to computer vision tasks. Despite significant advances in training algorithms and network architectures, GANs still struggle to consistently generate high-quality outputs after training. We present a series of papers that improve GAN output inference qualitatively and quantitatively. The first chapter, Alpha Model Domination, addresses a related subfield of Multiple Choice Learning, which -- like GANs -- aims to generate diverse sets of outputs. The next chapter, CoachGAN, introduces a real-time refinement method for the latent input space that improves inference quality for pretrained GANs. The following two chapters introduce finetuning methods for arbitrary, end-to-end differentiable GANs. The first, PuzzleGAN, proposes a self-supervised puzzle-solving task to improve global coherence in generated images. The latter, Trained Truncation Trick, improves upon a common inference heuristic by better maintaining output diversity while increasing image realism. Our final work, Two Second StyleGAN Projection, reduces the time for high-quality, image-to-latent GAN projections by two orders of magnitude. We present a wide array of results and applications of our method. We conclude with implications and directions for future work.
17

Yamazaki, Hiroyuki Vincent. "On Depth and Complexity of Generative Adversarial Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217293.

Abstract:
Although generative adversarial networks (GANs) have achieved state-of-the-art results in generating realistic looking images, they are often parameterized by neural networks with relatively few learnable weights compared to those that are used for discriminative tasks. We argue that this is suboptimal in a generative setting where data is often entangled in high dimensional space and models are expected to benefit from high expressive power. Additionally, in a generative setting, a model often needs to extrapolate missing information from low dimensional latent space when generating data samples while in a typical discriminative task, the model only needs to extract lower dimensional features from high dimensional space. We evaluate different architectures for GANs with varying model capacities using shortcut connections in order to study the impacts of the capacity on training stability and sample quality. We show that while training tends to oscillate and not benefit from additional capacity of naively stacked layers, GANs are capable of generating samples with higher quality, specifically for images, samples of higher visual fidelity given proper regularization and careful balancing.
Although Generative Adversarial Networks (GANs) have succeeded in generating realistic images, they still consist of neural networks parameterized with relatively few trainable weights compared with the neural networks used for classification. We believe that such a model is suboptimal for generating high-dimensional, complex data and argue that models with higher capacity should give better estimates. Moreover, in a generative task a model is expected to extrapolate information from lower to higher dimensions, whereas in a classification task the model only needs to extract low-dimensional information from high-dimensional data. We evaluate several GANs of varying capacity, using shortcut connections, to study how capacity affects training stability and the quality of the generated data points. The results show that training becomes less stable for models given higher capacity through naively added layers, but also that the quality of the data points can increase; for images specifically, this means higher visual fidelity. This is achieved through regularization and careful balancing.
18

Westberg, Simon. "Investigating the Learning Behavior of Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301315.

Abstract:
Since their introduction in 2014, generative adversarial networks (GANs) have quickly become one of the most popular and successful frameworks for training deep generative models. GANs have shown exceptional results on different image generation tasks and they are known for their ability to produce realistic high-resolution images. However, despite the success and widespread use of the GAN framework, the models can be highly unstable to train and the training process has been shown to be extremely sensitive to hyperparameter settings and network architectures. In this thesis, we investigate how the latent dimension, batch size, and learning rate affect the stability and performance of both the non-saturating GAN (NS-GAN) and the Wasserstein GAN with added gradient penalty (WGAN-GP). Furthermore, we examine how the addition of noise to the inputs of the discriminator affects the stability of both GAN variants. The experiments are performed on three data sets – MNIST, CIFAR-10, and a grayscale version of CIFAR-10 – and all models are evaluated using both the Fréchet Inception Distance (FID) and precision and recall. The results presented in the thesis indicate that the learning rate has the largest impact on the stability and performance of both NS-GAN and WGAN-GP, and that the latent dimension and batch size have a relatively small impact when combined with an appropriate learning rate. Furthermore, we find that WGAN-GP outperforms NS-GAN on MNIST, while NS-GAN outperforms WGAN-GP on both versions of CIFAR-10. We also find that adding noise to the inputs of the discriminator during training greatly helps to stabilize the NS-GAN training process, while it has a limited effect on WGAN-GP.
Since their introduction in 2014, generative adversarial networks (GANs) have become one of the most popular and successful approaches for training deep generative models. GAN models have shown exceptional results in image generation and are known for producing sharp, realistic images. Although the GAN method has been used frequently and successfully, the models can be very unstable to train, and the training process has proven to be strongly affected by the design of the networks and by hyperparameter values. This thesis investigates how the dimension of the latent space, the batch size, and the learning rate affect the stability and performance of both the original non-saturating GAN variant (NS-GAN) and the later Wasserstein GAN variant with gradient penalty (WGAN-GP). It also investigates how the stability of both GAN variants is affected when normally distributed noise is added to the images given to the discriminative model. The experiments are performed on three data sets (MNIST, CIFAR-10, and a grayscale version of CIFAR-10), and all models are evaluated using the Fréchet Inception Distance (FID) as well as precision and recall. The results presented in the thesis indicate that the learning rate has the largest impact on the stability and performance of both NS-GAN and WGAN-GP, while the latent dimension and batch size have a relatively small impact when combined with a suitable learning rate. We also find that WGAN-GP performs better than NS-GAN on MNIST, while NS-GAN performs better than WGAN-GP on both versions of CIFAR-10. Finally, we find that adding normally distributed noise to the images given to the discriminative model during training has a substantial stabilizing effect on the NS-GAN models, while the method has a limited effect on WGAN-GP.
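The noise-injection idea described in the abstract above can be illustrated with a minimal sketch. It assumes zero-mean Gaussian noise with a fixed standard deviation applied to both real and generated batches before they reach the discriminator. The function name `add_instance_noise`, the value `sigma=0.1`, and the random stand-in batches are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np


def add_instance_noise(batch, sigma, rng):
    """Return a copy of `batch` with zero-mean Gaussian noise (std `sigma`) added.

    Hypothetical helper: the thesis may schedule or scale the noise differently.
    """
    noise = rng.normal(loc=0.0, scale=sigma, size=batch.shape)
    return batch + noise


rng = np.random.default_rng(0)

# Stand-ins for a batch of real images and a batch of generator outputs
# (8 grayscale 28x28 images each, values in [-1, 1]).
real_images = rng.uniform(-1.0, 1.0, size=(8, 1, 28, 28))
fake_images = rng.uniform(-1.0, 1.0, size=(8, 1, 28, 28))

# Both inputs to the discriminator are perturbed in the same way.
noisy_real = add_instance_noise(real_images, sigma=0.1, rng=rng)
noisy_fake = add_instance_noise(fake_images, sigma=0.1, rng=rng)
```

In a training loop, `noisy_real` and `noisy_fake` would replace the clean batches in the discriminator's forward pass, while the generator itself is left unchanged.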
19

Desentz, Derek. "Partial Facial Re-imaging Using Generative Adversarial Networks." Wright State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=wright1622122813797895.

20

Ankaräng, Fredrik. "Generative Adversarial Networks for Cross-Lingual Voice Conversion." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-299560.

Abstract:
Speech synthesis is a technology that increasingly influences our daily lives, in the form of smart assistants, advanced translation systems and similar applications. In this thesis, the phenomenon of making one’s voice sound like the voice of someone else is explored. This topic is called voice conversion and needs to be done without altering the linguistic content of speech. More specifically, a Cycle-Consistent Adversarial Network that has proven to work well in a monolingual setting is evaluated in a multilingual environment. The model is trained to convert voices between native speakers from the Nordic countries. In the experiments, no parallel, transcribed or aligned speech data is used, forcing the model to focus on the raw audio signal. The goal of the thesis is to evaluate whether performance is degraded in a multilingual environment, in comparison to monolingual voice conversion, and to measure the impact of the potential performance drop. In the study, performance is measured in terms of naturalness and speaker similarity between the generated speech and the target voice. For evaluation, listening tests are conducted, as well as objective comparisons of the synthesized speech. The results show that voice conversion between a Swedish and Norwegian speaker is possible and also that it can be performed without performance degradation in comparison to Swedish-to-Swedish conversion. Furthermore, conversion between Finnish and Swedish speakers, as well as between Danish and Swedish speakers, shows a performance drop for the generated speech. However, despite the performance decrease, the model produces fluent and clearly articulated converted speech in all experiments. These results are noteworthy, especially since the network is trained on less than 15 minutes of nonparallel speaker data for each speaker.
This thesis opens up for further areas of research, for instance investigating more languages, more recent Generative Adversarial Network architectures and devoting more resources to tweaking the hyperparameters to further optimize the model for multilingual voice conversion.
Speech synthesis is a field that increasingly influences our everyday lives, for example through smart assistants, advanced translation systems and similar applications. This thesis explores voice conversion, which means making a speaker sound like someone else without changing what was said. More specifically, a Cycle-Consistent Adversarial Network that has worked well for voice conversion within a single language is examined for voice conversion between different languages. The neural network is trained to convert between voices of native speakers from the Nordic countries. No parallel or transcribed data is used in the experiments, which forces the model to rely on the audio signal alone. The goal of the thesis is to evaluate whether the model's performance degrades in a multilingual context compared with a monolingual one, and to measure how large the degradation is in that case. In the study, performance is measured in terms of quality and speaker similarity between the generated speech and the voice being imitated. To evaluate this, listening tests are conducted, along with objective analyses of the generated speech. The results show that voice conversion between a Swedish and a Norwegian speaker is possible without degrading the model's performance, compared with conversion between Swedish speakers. For conversion between Finnish and Swedish speakers, and between Danish and Swedish speakers, however, the quality of the generated speech declined. Despite this decline, the model produced clear and coherent speech in all experiments. This is noteworthy because the model was trained on less than 15 minutes of non-parallel data for each speaker. This thesis opens up new future studies; for example, more languages could be included or newer Generative Adversarial Network variants evaluated. More resources could also be devoted to tuning the hyperparameters to further optimize the studied model for multilingual voice conversion.
21

Ljung, Mikael. "Synthetic Data Generation for the Financial Industry Using Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301307.

Abstract:
Following the introduction of new laws and regulations to ensure data protection in GDPR and PIPEDA, interest in technologies to protect data privacy has increased. A promising research trajectory in this area is found in Generative Adversarial Networks (GANs), an architecture trained to produce data that reflects the statistical properties of its underlying dataset without compromising the integrity of the data subjects. Despite the technology’s young age, prior research has made significant progress in the generation process of so-called synthetic data, and the current models can generate high-quality images. Due to the architecture’s success with images, it has been adapted to new domains, and this study examines its potential to synthesize financial tabular data. The study investigates a state-of-the-art model within tabular GANs, called CTGAN, together with two proposed ideas to enhance its generative ability. The results indicate that a modified training dynamic and a novel early stopping strategy improve the architecture’s capacity to synthesize data. The generated data presents realistic features with clear influences from its underlying dataset, and the inferred conclusions on subsequent analyses are similar to those based on the original data. Thus, the conclusion is that GANs have great potential to generate tabular data that can be considered a substitute for sensitive data, which could enable organizations to have more generous data sharing policies.
With stricter rules on how data must be handled under GDPR and PIPEDA, interest in anonymization methods for censoring sensitive data has come to the fore. A promising technique in this area is the Generative Adversarial Network, an architecture that aims to generate data reflecting the statistical properties of its underlying dataset without compromising the integrity of the data subjects. Despite the young age of the research field, great progress has been made in the generation of so-called synthetic data, and there are now models that can generate highly realistic images. As a step forward in the research, the architecture has been adapted to new domains, and this study aims to examine its ability to synthesize financial tabular data. The study examines a prominent model in the field, CTGAN, together with two proposed ideas intended to improve its generative ability. The results indicate that a modified training dynamic and a new optimization strategy improve the architecture's ability to generate synthetic data. The generated data is in turn of high quality, with clear influences from its underlying dataset, and the results of subsequent analyses on the two data sources are comparable. The conclusion is thus that GANs have great potential to generate tabular data that can be regarded as a substitute for sensitive data, which enables a more generous data-sharing policy within organizations.
22

Karlsson, Anton, and Torbjörn Sjöberg. "Synthesis of Tabular Financial Data using Generative Adversarial Networks." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273633.

Abstract:
Digitalization has led to large amounts of available customer data and possibilities for data-driven innovation. However, the data needs to be handled carefully to protect the privacy of the customers. Generative Adversarial Networks (GANs) are a promising recent development in generative modeling. They can be used to create synthetic data that facilitates analysis while ensuring that customer privacy is maintained. Prior research on GANs has shown impressive results on image data. In this thesis, we investigate the viability of using GANs within the financial industry. We investigate two state-of-the-art GAN models for synthesizing tabular data, TGAN and CTGAN, along with a simpler GAN model that we call WGAN. A comprehensive evaluation framework is developed to facilitate comparison of the synthetic datasets. The results indicate that GANs are able to generate quality synthetic datasets that preserve the statistical properties of the underlying data and enable a viable and reproducible subsequent analysis. It was, however, found that all of the investigated models had problems with reproducing numerical data.
Digitalization has brought large amounts of available customer data and created opportunities for data-driven innovation. To protect customers' privacy, however, the data must be handled carefully. Generative Adversarial Networks (GANs) are a promising new development in generative modeling. They can be used to synthesize data that facilitates data analysis while preserving customers' privacy. Previous research on GANs has shown promising results on image data. In this thesis, we investigate the viability of GANs in the financial industry. We examine two prominent GANs designed to synthesize tabular data, TGAN and CTGAN, as well as a simpler GAN model that we call WGAN. A comprehensive framework for evaluating synthetic datasets is developed to enable comparison between different GANs. The results indicate that GANs are able to synthesize high-quality datasets that preserve the statistical properties of the underlying data, enabling a viable and reproducible subsequent analysis. However, all of the models tested had problems reproducing numerical data.
23

Castillo, Araújo Victor. "Ensembles of Single Image Super-Resolution Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-290945.

Abstract:
Generative Adversarial Networks have been used to obtain state-of-the-art results for low-level computer vision tasks like single image super-resolution; however, they are notoriously difficult to train due to the instability related to the competing minimax framework. Additionally, traditional ensembling mechanisms cannot be effectively applied with these types of networks due to the resources they require at inference time and the complexity of their architectures. This thesis studies an alternative method that creates ensembles of individual models, each more stable and easier to train, by interpolating in the parameter space of the models. The method is found to produce better results than the initial individual models when evaluated using perceptual metrics as a proxy for human judges. It can be used as a framework to train GANs with competitive perceptual results in comparison to state-of-the-art alternatives.
Generative Adversarial Networks (GANs) have been used to achieve state-of-the-art results for basic image analysis tasks, such as generating high-resolution images from low-resolution images, but they are notoriously difficult to train because of the instability related to the competing minimax framework. In addition, traditional mechanisms for generating ensembles cannot be applied effectively to these types of networks because of the resources they need at inference time and the complexity of their architecture. In this project, an alternative method for combining individual models that are more stable and easier to train, through interpolation in the parameter space, has been shown to give better perceptual results than the original individual models, and this method can be used as a framework for training GANs with competitive perceptual performance compared with state-of-the-art techniques.
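Interpolation in parameter space, as mentioned in the abstract above, can be sketched as a weighted average of two models' weights. The sketch assumes both models share the same architecture, so their parameters can be matched by name, and that a plain linear blend is used. The helper `interpolate_parameters`, the layer names, and the value `alpha=0.5` are hypothetical and are not taken from the thesis, which may combine the models differently.

```python
import numpy as np


def interpolate_parameters(params_a, params_b, alpha):
    """Linearly blend two parameter dictionaries with identical keys and shapes.

    alpha = 0.0 returns model A's weights, alpha = 1.0 returns model B's.
    Hypothetical helper for illustration only.
    """
    if params_a.keys() != params_b.keys():
        raise ValueError("both models must expose the same parameter names")
    return {
        name: (1.0 - alpha) * params_a[name] + alpha * params_b[name]
        for name in params_a
    }


# Two toy "models" with the same layer names and shapes.
model_a = {"conv1.weight": np.zeros((4, 3, 3, 3)), "conv1.bias": np.zeros(4)}
model_b = {"conv1.weight": np.ones((4, 3, 3, 3)), "conv1.bias": np.ones(4)}

blended = interpolate_parameters(model_a, model_b, alpha=0.5)
```

Unlike a conventional ensemble, the blended parameters define a single network, so inference cost stays the same as for one model.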
24

Liu, Jiaping. "A Study on Distribution Learning of Generative Adversarial Networks." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/41250.

Abstract:
This thesis is an exploration of the properties of shallow generative adversarial networks (GANs). We focus on several aspects of GANs to investigate the learnability of a class of distributions using shallow GANs and conduct experiments to explore the influence of these aspects on the performance of the GAN models. We identify and analyze several pathological phenomena in theoretical analysis and experiments, and propose potential solutions for them.
25

Radhakrishnan, Saieshwar. "Domain Adaptation of IMU sensors using Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286821.

Abstract:
Autonomous vehicles rely on sensors for a clear understanding of the environment and in a heavy duty truck, the sensors are placed at multiple locations like the cabin, chassis and the trailer in order to increase the field of view and reduce the blind spot area. Usually, these sensors perform best when they are stationary relative to the ground, hence large and fast movements, which are quite common in a truck, may lead to performance reduction, erroneous data or in the worst case, a sensor failure. This enforces a need to validate the sensors before using them for making life-critical decisions. This thesis proposes Domain Adaptation as one of the strategies to co-validate Inertial Measurement Unit (IMU) sensors. The proposed Generative Adversarial Network (GAN) based framework predicts the data of one IMU using other IMUs in the truck by implicitly learning the internal dynamics. This prediction model along with other sensor fusion strategies would be used by the supervising system to validate the IMUs in real-time. Through data collected from real-world experiments, it is shown that the proposed framework is able to accurately transform raw IMU sequences across domains. A further comparison is made between Long Short Term Memory (LSTM) and WaveNet based architectures to show the superiority of WaveNets in terms of performance and computational efficiency.
Autonomous vehicles rely on sensors to build a picture of their surroundings. On a heavy truck, the sensors are placed in multiple locations, for example on the cabin, the chassis and the trailer, to increase the field of view and reduce blind areas. The sensors usually perform best when they are stationary relative to the ground, so large and fast movements, which are common on a truck, can lead to reduced performance, erroneous data and, in the worst case, failing sensors. Because of this, there is a great need to validate sensor data before it is used for critical decision-making. This thesis proposes domain adaptation as one of the strategies for co-validating Inertial Measurement Unit (IMU) sensors. The proposed Generative Adversarial Network (GAN) based framework predicts the data of one IMU by implicitly learning the internal dynamics from other IMUs mounted on the truck. This prediction model, combined with other sensor fusion strategies, can be used by the control system to validate the IMUs in real time. Using data collected from real-world experiments, it is shown that the proposed framework can convert raw IMU sequences between domains with high accuracy. A further comparison between Long Short Term Memory (LSTM) and WaveNet-based architectures is made to show the superiority of WaveNets in terms of performance and computational efficiency.
26

Sheriff, Waseem. "Learning to predict text quality using Generative Adversarial Networks." Thesis, KTH, Numerisk analys, NA, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-264109.

Abstract:
Generating summaries of long text articles is a common application in natural language processing. Automatic text summarization models often find themselves generating summaries that don’t resemble the quality of human written text, even though they preserve factual accuracy. In this thesis, a method to improve quality of summaries is created by combining loss functions from an existing baseline competitive model (Pointer Generator Networks) for abstractive text summarization with SeqGAN - a successful text generation algorithm based on Generative Adversarial Networks. The model is tested on the CNN/Daily Mail dataset of news articles. The results show that the summaries generated by the model are more human-like in quality, as well as more accurate in content, than those of the baseline model.
Generating summaries of long articles is a common application in language technology. Automatic text summarization models often produce summaries that do not match the quality of human-written text, even when the factual accuracy of the content is preserved. This thesis presents a method for improving the quality of automatically generated text summaries, which combines loss functions from an existing model (Pointer Generator Networks) for automatic summary generation with SeqGAN, a successful text generation algorithm based on Generative Adversarial Networks. The model is tested on a set of news articles from CNN/Daily Mail. The results show that the summaries generated by the model are closer to human-written text, in both quality and content, than those of the model based on Pointer Generator Networks.
27

Birgersson, Anna, and Klara Hellgren. "Texture Enhancement in 3D Maps using Generative Adversarial Networks." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162446.

Abstract:
In this thesis we investigate the use of GANs for texture enhancement. To achieve this, we have studied if synthetic satellite images generated by GANs will improve the texture in satellite-based 3D maps. We investigate two GANs: SRGAN and pix2pix. SRGAN increases the pixel resolution of the satellite images by generating upsampled images from low resolution images. As for pix2pix, the GAN performs image-to-image translation by translating a source image to a target image, without changing the pixel resolution. We trained the GANs in two different approaches, named SAT-to-AER and SAT-to-AER-3D, where SAT, AER and AER-3D are different datasets provided by the company Vricon. In the first approach, aerial images were used as ground truth and in the second approach, rendered images from an aerial-based 3D map were used as ground truth. The procedure of enhancing the texture in a satellite-based 3D map was divided into two steps: the generation of synthetic satellite images and the re-texturing of the 3D map. Synthetic satellite images generated by two SRGAN models and one pix2pix model were used for the re-texturing. The best results were presented using SRGAN in the SAT-to-AER approach, where the re-textured 3D map had enhanced structures and an increased perceived quality. SRGAN also presented a good result in the SAT-to-AER-3D approach, where the re-textured 3D map had changed color distribution and the road markers were easier to distinguish from the ground. The images generated by the pix2pix model presented the worst result. As for the SAT-to-AER approach, even though the synthetic satellite images generated by pix2pix were somewhat enhanced and contained less noise, they had no significant impact in the re-texturing. In the SAT-to-AER-3D approach, none of the investigated models based on the pix2pix framework presented any successful results.
We concluded that GANs can be used as a texture enhancer using both aerial images and images rendered from an aerial-based 3D map as ground truth. The use of GANs as a texture enhancer has great potential and offers several interesting areas for future work.
28

Nistal, Hurlé Javier. "Exploring generative adversarial networks for controllable musical audio synthesis." Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAT009.

Abstract:
Audio synthesizers are electronic musical instruments that generate artificial sounds under some parametric control. While synthesizers have evolved since they became popular in the 1970s, two fundamental challenges remain unresolved: 1) the development of synthesis systems that respond to semantically intuitive parameters; 2) the design of "universal" synthesis techniques that are independent of the source to be modeled. This thesis studies the use of generative adversarial networks (GANs) to build such systems. The main objective is to research and develop new tools for music production that offer intuitive means of manipulating sound, for example by controlling parameters that respond to perceptual properties of the sound and to other characteristics. Our first work studies the performance of GANs when trained on various audio signal representations. These experiments compare different forms of audio data in the context of tonal sound synthesis. The results show that the magnitude and instantaneous frequency representation and the complex-valued Fourier transform achieve the best results. Building on this result, our next work presents DrumGAN, an audio synthesizer of percussive sounds. By conditioning the model on perceptual features describing high-level timbral properties, we demonstrate that intuitive control can be obtained over the generation process. This work leads to the development of a VST plugin that generates high-resolution audio. The scarcity of annotations in musical audio datasets challenges the application of supervised methods to conditional generation. We use a knowledge distillation approach to extract such annotations from a pretrained audio tagging system.
DarkGAN is a synthesizer of tonal sounds that uses the output probabilities of such a system (called "soft labels") as conditional information. The results show that DarkGAN can respond moderately to many intuitive attributes, even with out-of-distribution input conditioning. Applications of GANs to audio synthesis generally learn from fixed-size spectrogram data. We address this limitation by exploiting a self-supervised method for learning discrete features from sequential data. Such features are used as conditional input to provide the model with step-wise, time-dependent information. Global consistency is ensured by fixing the input noise z (characteristic of GANs). The results show that, while models trained on a fixed-size scheme achieve better audio quality and diversity, ours can competently generate sound of any duration. An interesting research direction is the generation of audio conditioned on preexisting musical material. We study whether a GAN generator, conditioned on highly compressed musical audio signals, can generate outputs resembling the original uncompressed audio. The results show that the GAN can improve the quality of the audio signals compared with the MP3 versions at very high compression rates (16 and 32 kbit/s). As a direct consequence of applying artificial intelligence techniques in musical contexts, we ask how AI-based technology can foster innovation in musical practice.
We therefore conclude this thesis by offering a broad perspective on the development of AI tools for music production, informed by theoretical considerations and reports of real-world use of AI tools by professional artists.
Audio synthesizers are electronic musical instruments that generate artificial sounds under some parametric control. While synthesizers have evolved since they were popularized in the 70s, two fundamental challenges are still unresolved: 1) the development of synthesis systems responding to semantically intuitive parameters; 2) the design of "universal," source-agnostic synthesis techniques. This thesis researches the use of Generative Adversarial Networks (GAN) towards building such systems. The main goal is to research and develop novel tools for music production that afford intuitive and expressive means of sound manipulation, e.g., by controlling parameters that respond to perceptual properties of the sound and other high-level features. Our first work studies the performance of GANs when trained on various common audio signal representations (e.g., waveform, time-frequency representations). These experiments compare different forms of audio data in the context of tonal sound synthesis. Results show that the Magnitude and Instantaneous Frequency of the phase and the complex-valued Short-Time Fourier Transform achieve the best results. Building on this, our following work presents DrumGAN, a controllable adversarial audio synthesizer of percussive sounds. By conditioning the model on perceptual features describing high-level timbre properties, we demonstrate that intuitive control can be gained over the generation process. This work results in the development of a VST plugin generating full-resolution audio and compatible with any Digital Audio Workstation (DAW). We show extensive musical material produced by professional artists from Sony ATV using DrumGAN. The scarcity of annotations in musical audio datasets challenges the application of supervised methods to conditional generation settings. Our third contribution employs a knowledge distillation approach to extract such annotations from a pre-trained audio tagging system. 
DarkGAN is an adversarial synthesizer of tonal sounds that employs the output probabilities of such a system (so-called “soft labels”) as conditional information. Results show that DarkGAN can respond moderately to many intuitive attributes, even with out-of-distribution input conditioning. Applications of GANs to audio synthesis typically learn from fixed-size two-dimensional spectrogram data analogously to the "image data" in computer vision; thus, they cannot generate sounds with variable duration. In our fourth paper, we address this limitation by exploiting a self-supervised method for learning discrete features from sequential data. Such features are used as conditional input to provide step-wise time-dependent information to the model. Global consistency is ensured by fixing the input noise z (characteristic in adversarial settings). Results show that, while models trained on a fixed-size scheme obtain better audio quality and diversity, ours can competently generate audio of any duration. One interesting direction for research is the generation of audio conditioned on preexisting musical material, e.g., the generation of some drum pattern given the recording of a bass line. Our fifth paper explores a simple pretext task tailored at learning such types of complex musical relationships. Concretely, we study whether a GAN generator, conditioned on highly compressed MP3 musical audio signals, can generate outputs resembling the original uncompressed audio. Results show that the GAN can improve the quality of the audio signals over the MP3 versions for very high compression rates (16 and 32 kbit/s). As a direct consequence of applying artificial intelligence techniques in musical contexts, we ask how AI-based technology can foster innovation in musical practice. 
Therefore, we conclude this thesis by providing a broad perspective on the development of AI tools for music production, informed by theoretical considerations and reports of real-world AI tool use by professional artists.
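The magnitude and instantaneous frequency (IF) representation named in the abstract above can be illustrated on a single STFT frequency bin. The following is a minimal sketch, not the thesis implementation: it assumes IF is the frame-to-frame phase difference, wrapped to [-pi, pi) and divided by pi, which is one common convention. The function name `magnitude_and_if` and the toy input are hypothetical.

```python
import cmath
import math

def magnitude_and_if(stft_bin):
    """Return magnitudes and instantaneous frequencies for one STFT bin.

    stft_bin is a list of complex STFT values of a single frequency bin,
    one value per time frame (assumed layout).
    """
    magnitudes = [abs(c) for c in stft_bin]
    phases = [cmath.phase(c) for c in stft_bin]
    inst_freq = []
    for prev, cur in zip(phases, phases[1:]):
        delta = cur - prev
        # Wrap the phase difference into [-pi, pi) to undo 2*pi jumps.
        delta = (delta + math.pi) % (2 * math.pi) - math.pi
        # Scale to [-1, 1); this scaling is an assumption of the sketch.
        inst_freq.append(delta / math.pi)
    return magnitudes, inst_freq

# Toy bin: constant magnitude 2.0, phase advancing 0.5 rad per frame.
frames = [cmath.rect(2.0, 0.5 * t) for t in range(8)]
mags, inst_freq = magnitude_and_if(frames)
```

For a stationary sinusoid the IF values are constant over time, which is one reason this representation tends to be smoother than raw phase.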
29

Evholt, David, and Oscar Larsson. "Generative Adversarial Networks and Natural Language Processing for Macroeconomic Forecasting." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273422.

Abstract:
Macroeconomic forecasting is a classic problem, today most often modeled using time series analysis. Few attempts have been made using machine learning methods, and even fewer incorporating unconventional data, such as that from social media. In this thesis, a Generative Adversarial Network (GAN) is used to predict U.S. unemployment, beating the ARIMA benchmark on all horizons. Furthermore, attempts at using Twitter data and the Natural Language Processing (NLP) model DistilBERT are performed. While these attempts do not beat the benchmark, they do show promising results with predictive power. The models are also tested at predicting the U.S. stock index S&P 500. For these models, the Twitter data does improve the accuracy and shows the potential of social media data when predicting a more erratic index with less seasonality that is more responsive to current trends in public discourse. The results also show that Twitter data can be used to predict trends in both unemployment and the S&P 500 index. This sets the stage for further research into NLP-GAN models for macroeconomic predictions using social media data.
Makroekonomiska prognoser är sedan länge en svår utmaning. Idag löses de oftast med tidsserieanalys och få försök har gjorts med maskininlärning. I denna uppsats används ett generativt motstridande nätverk (GAN) för att förutspå amerikansk arbetslöshet, med resultat som slår samtliga riktmärken satta av en ARIMA. Ett försök görs också till att använda data från Twitter och den datorlingvistiska (NLP) modellen DistilBERT. Dessa modeller slår inte riktmärkena men visar lovande resultat. Modellerna testas vidare på det amerikanska börsindexet S&P 500. För dessa modeller förbättrade Twitterdata resultaten vilket visar på den potential data från sociala medier har när de appliceras på mer oregelbunda index, utan tydligt säsongsberoende och som är mer känsliga för trender i det offentliga samtalet. Resultaten visar på att Twitterdata kan användas för att hitta trender i både amerikansk arbetslöshet och S&P 500 indexet. Detta lägger grunden för fortsatt forskning inom NLP-GAN modeller för makroekonomiska prognoser baserade på data från sociala medier.
30

Stenhagen, Petter. "Improving Realism in Synthetic Barcode Images using Generative Adversarial Networks." Thesis, Linköpings universitet, Datorseende, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-151959.

Abstract:
This master's thesis explores the possibility of using Generative Adversarial Networks (GANs) to refine labeled synthetic code images so that they resemble real code images while preserving label information. The GAN used in this thesis consists of a refiner and a discriminator. The discriminator tries to distinguish between real images and refined synthetic images. The refiner tries to fool the discriminator by producing refined synthetic images that the discriminator classifies as real. The idea is that updating these two networks iteratively pushes each of them to improve, resulting in refined synthetic images with real image characteristics. The aspiration, if the exploration of GANs turns out successful, is to use refined synthetic images as training data in Semantic Segmentation (SS) tasks and thereby eliminate the laborious task of gathering and labeling real data. Starting from a foundational GAN model, different network architectures, hyperparameters and other design choices are explored to find the best-performing GAN model. As is widely acknowledged in the relevant literature, GANs can be difficult to train, and the results in this thesis are varying and sometimes ambiguous. Nevertheless, the best-performing models in this study do perform better in SS tasks, with regard to Intersection over Union, than the unrefined synthetic set they are based on and benchmarked against.
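Intersection over Union, the segmentation metric used in the abstract above, is the ratio of the overlap to the combined area of the predicted and reference masks. A minimal sketch for binary masks follows; the function name, the nested-list mask layout, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def intersection_over_union(pred, target):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly; this convention is an assumption.
    return intersection / union if union else 1.0

pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou = intersection_over_union(pred, target)  # 2 shared pixels / 4 in union
```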
31

Hagvall, Hörnstedt Julia. "Synthesis of Thoracic Computer Tomography Images using Generative Adversarial Networks." Thesis, Linköpings universitet, Avdelningen för medicinsk teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158280.

Abstract:
The use of machine learning algorithms to enhance and facilitate medical diagnosis and analysis is a promising and important area, which could reduce the workload of clinicians substantially. For machine learning algorithms to learn a certain task, large amounts of data need to be available. Data sets for medical image analysis are rarely public due to restrictions concerning the sharing of patient data. The production of synthetic images could act as an anonymization tool to enable the distribution of medical images and facilitate the training of machine learning algorithms that could be used in practice. This thesis investigates the use of Generative Adversarial Networks (GAN) for the synthesis of new thoracic computer tomography (CT) images with no connection to real patients. It also examines the usefulness of the images by comparing the quantitative performance of a segmentation network trained with the synthetic images against that of the same segmentation network trained with real thoracic CT images. The synthetic thoracic CT images were generated using CycleGAN for image-to-image translation between label map ground truth images and thoracic CT images. The synthetic images were evaluated using different set-ups of synthetic and real images for training the segmentation network. All set-ups were evaluated according to sensitivity, accuracy, Dice and F2-score and compared with the same metrics obtained from a segmentation network trained with 344 real images. The thesis shows that it was possible to generate synthetic thoracic CT images using GAN. However, within the scope of this thesis, a segmentation network trained with synthetic data did not reach the quantitative performance of a segmentation network trained with the same number of real images.
A segmentation network could nevertheless match the quantitative performance of a network trained on real images when it was trained with a combination of real and synthetic images in which the majority were synthetic and a minority were real. A combination of 59 real images and 590 synthetic images achieved performance equal to that of a segmentation network trained with 344 real images regarding sensitivity, Dice and F2-score. Equal quantitative performance of a segmentation network could thus be achieved by using fewer real images together with an abundance of synthetic images, created at close to no cost, which indicates the usefulness of synthetically generated images.
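The four evaluation measures named in the abstract above (sensitivity, accuracy, Dice and F2-score) can all be derived from pixel-level confusion counts. The sketch below uses their standard definitions; the function name and the example counts are hypothetical and are not taken from the thesis.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Standard metrics from true/false positive/negative pixel counts."""
    sensitivity = tp / (tp + fn)                 # recall of the foreground
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    dice = 2 * tp / (2 * tp + fp + fn)           # equals the F1-score
    # F-beta with beta = 2 weights recall four times as much as precision.
    f2 = 5 * tp / (5 * tp + 4 * fn + fp)
    return {"sensitivity": sensitivity, "accuracy": accuracy,
            "dice": dice, "f2": f2}

# Made-up counts for a 100-pixel image.
metrics = segmentation_metrics(tp=8, fp=2, fn=2, tn=88)
```

Because F2 penalizes false negatives more heavily than false positives, it suits settings where missing a structure is costlier than over-segmenting it.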
32

Zou, Xiaozhou. "Improve the Convergence Speed and Stability of Generative Adversarial Networks." Digital WPI, 2018. https://digitalcommons.wpi.edu/etd-theses/1309.

Abstract:
In this thesis, we address two major problems in Generative Adversarial Networks (GAN), an important sub-field in deep learning. The first problem that we address is the instability in the training process that happens in many real-world problems and the second problem that we address is the lack of a good evaluation metric for the performance of GAN algorithms. To understand and address the first problem, three approaches are developed. Namely, we introduce randomness to the training process; we investigate various normalization methods; most importantly we develop a better parameter initialization strategy to help stabilize training. In the randomness techniques part of the thesis, we developed two randomness approaches, namely the addition of gradient noise and the batch random flipping of the results from the discrimination section of a GAN. In the normalization part of the thesis, we compared the performances of the z-score transform, the min-max normalization, affine transformations and batch normalization. In the most novel and important part of this thesis, we developed techniques to initialize the GAN generator section with parameters that can produce a uniform distribution on the range of the training data. As far as we are aware, this seemingly simple idea has not yet appeared in the extant literature, and the empirical results we obtain on 2-dimensional synthetic data show marked improvement. As to better evaluation metrics, we demonstrate a simple yet effective way to evaluate the effectiveness of the generator using a novel "overlap loss".
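Two of the normalization methods compared in the abstract above, the z-score transform and min-max normalization, have simple closed forms. The following minimal sketch shows both on a one-dimensional sample; the function names are hypothetical, and the code assumes the input values are not all identical (otherwise the denominators are zero).

```python
import statistics

def z_score(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def min_max(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

sample = [2.0, 4.0, 6.0, 8.0]
standardized = z_score(sample)
rescaled = min_max(sample)
```

The z-score output is unbounded but centered, while the min-max output is confined to [0, 1], which matters when the generator's final activation has a bounded range.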
33

De, Biase Alessia. "Generative Adversarial Networks to enhance decision support in digital pathology." Thesis, Linköpings universitet, Statistik och maskininlärning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158486.

Abstract:
Histopathological evaluation and Gleason grading on Hematoxylin and Eosin (H&E) stained specimens is the clinical standard in grading prostate cancer. Recently, deep learning models have been trained to assist pathologists in detecting prostate cancer. However, these predictions could be made more robust to variations in morphology and staining and to differences across scanners. One approach to tackling such problems is to employ conditional GANs for style transfer. A total of 52 prostatectomies from 48 patients were scanned with two different scanners. Data was split into 40 images for training and 12 images for testing, and all images were divided into overlapping 256x256 patches. A segmentation model was trained using images from scanner A, and the model was tested on images from both scanner A and scanner B. Next, GANs were trained to perform style transfer from scanner A to scanner B. The training was performed using unpaired training images and different types of Unsupervised Image to Image Translation GANs (CycleGAN and UNIT). Besides the common CycleGAN architecture, a modified version was also tested, adding Kullback-Leibler (KL) divergence to the loss function. Then, the segmentation model was tested on the augmented images from scanner B. The models were evaluated on 2,000 randomly selected patches of 256x256 pixels from 10 prostatectomies. The resulting predictions were evaluated both qualitatively and quantitatively. All proposed methods yielded an improvement in AUC, which reached 16% in the best case. However, only CycleGAN trained on a large dataset proved capable of improving the performance of the segmentation tool while preserving tissue morphology and obtaining higher results in all the evaluation measurements. All the models were analyzed and, finally, the significance of the difference between the segmentation model's performance on style-transferred images and on untransferred images was assessed using statistical tests.
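Dividing a scanned image into overlapping 256x256 patches, as described in the abstract above, amounts to sliding a window with a stride smaller than the patch size. The sketch below only computes the top-left corners of the patches. The stride of 128 pixels (50% overlap) is an assumption, since the abstract does not state the overlap, and the function name is hypothetical.

```python
def patch_origins(height, width, patch=256, stride=128):
    """Top-left (row, col) corners of all patches that fit inside the image.

    A stride smaller than the patch size makes neighbouring patches overlap.
    Border regions that cannot hold a full patch are skipped in this sketch.
    """
    rows = range(0, height - patch + 1, stride)
    cols = range(0, width - patch + 1, stride)
    return [(r, c) for r in rows for c in cols]

# A 512x512 image yields a 3x3 grid of overlapping 256x256 patches.
origins = patch_origins(512, 512)
```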
34

Lavault, Antoine. "Generative Adversarial Networks for Synthesis and Control of Drum Sounds." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS614.

Abstract:
Les synthétiseurs audio sont des systèmes électroniques capable de générer des sons artificiels sous un ensemble de paramètres dépendants de leur architecture. Quand bien même de multiples évolutions ont transformé les synthétiseurs de simples curiosités sonores dans les années 60 et précédentes à des instruments maîtres dans les productions musicales modernes, deux grands défis restent à relever: le développement d'un système de synthèse répondant à des paramètres cohérent avec leur perception par un humain et la conception d'une méthode de synthèse universelle, capable de modéliser n'importe quelle source et de la dépasser. Cette thèse étudie l'utilisation et la valorisation des réseaux antagonistes génératifs (Generative Adversarial Networks, abrégé en GAN) pour construire un système répondant aux deux problèmes exposés précédemment. L'objectif principal est ainsi de proposer un synthétiseur neuronal capable de générer des sons de batteries réalistes et contrôlable par un ensemble de paramètres de timbres prédéfinis, ainsi que de proposer un contrôle de la vélocité de la synthèse. La première étape dans le projet a été de proposer une approche basée sur les dernières avancées techniques au moment de sa conception pour générer des sons de batteries réalistes. A cette méthode de synthèse neuronale, nous avons aussi ajouter des capacités de contrôle du timbre en explorant une voie différente des solutions existantes: l'utilisation de descripteurs différentiables. Pour donner des garanties expérimentales à notre travail, nous avons réalisé des expériences d'évaluation à la fois via des métriques objectives basées sur les statistiques mais aussi des évaluations subjectives et psychoĥysiques sur la qualité perçue et la perception des erreurs de contrôle. Pour proposer un synthétiseur utilisable pour des performances musicales, nous avons aussi ajouter un contrôle de la vélocité. 
Toujours dans l'idée de poursuivre la réalisation d'un synthétiseur universel et à contrôle universel, nous avons créer ex-nihilo un jeu de données composé de sons de batteries dans le but avoué de créer une base exhaustive des sons accessibles dans l'immense majorité des conditions rencontrées dans le contexte de la production musicale. De ce jeu de données, nous présentons des résultats expérimentaux liés au contrôle de la dynamique, un des aspects phares de la performance musicale mais laissé de côté par la littérature. Pour justifier des capacités offertes par la méthode de synthèse par GANs, nous montrons qu'il est possible de marier les méthodes de synthèse classiques avec la synthèse neuronale en exploitant les limites et particularités des GANs pour obtenir des sons hybrides nouveaux et musicalement intéressants
Audio synthesizers are electronic systems capable of generating artificial sounds under parameters that depend on their architecture. Even though multiple evolutions have transformed synthesizers from simple sonic curiosities in the 1960s and earlier into the main instruments of modern musical productions, two major challenges remain: the development of a sound synthesis system with a parameter set coherent with its perception by a human, and the design of a universal synthesis method, able to model any source and provide new original sounds. This thesis studies the use and enhancement of Generative Adversarial Networks (GAN) to build a system that answers both of these problems. The main objective is to propose a neural synthesizer capable of generating realistic drum sounds, controllable by predefined timbre parameters and hit velocity. The first step in the project was to propose an approach, based on the latest technological advances at the time of its conception, to generate realistic drum sounds. We added timbre control capabilities to this method by exploring a path different from existing solutions, namely differentiable descriptors. To give experimental guarantees to our work, we performed evaluation experiments via objective metrics based on statistics as well as subjective and psychophysical evaluations of perceived quality and of the perception of control errors. These experiments continued with the addition of velocity control to the timbral control. Still pursuing the idea of a versatile synthesizer with universal control, we created a dataset of drum sounds from scratch, aiming at an exhaustive database of the sounds accessible in the vast majority of conditions encountered in music production. From this dataset, we present experimental results related to the control of dynamics, one of the critical aspects of musical performance but one left aside by the literature.
To justify the capabilities offered by the GAN synthesis method, we show that it is possible to marry classical synthesis methods with neural synthesis by exploiting the limits and particularities of GANs to obtain new and musically interesting hybrid sounds.
35

Berglöf, Olle, and Adam Jacobs. "Effects of Transfer Learning on Data Augmentation with Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259485.

Abstract:
Data augmentation is a technique that acquires more training data by augmenting available samples, where the training data is used to fit model parameters. Data augmentation is utilized to compensate for a shortage of training data in certain domains and to reduce overfitting. Augmenting a training dataset for image classification with a Generative Adversarial Network (GAN) has been shown to increase classification accuracy. This report investigates whether transfer learning within a GAN can further increase classification accuracy when utilizing the augmented training dataset. The method section describes a specific GAN architecture for the experiments that includes a label condition. When transfer learning is used within this GAN architecture, a statistical analysis shows a statistically significant increase in classification accuracy for a classification problem with the EMNIST dataset, which consists of images of handwritten alphanumeric characters. In the discussion section, the authors analyze the results and motivate other use cases for the proposed GAN architecture.
Datautökning är en metod som skapar mer träningsdata genom att utöka befintlig träningsdata, där träningsdatan används för att anpassa modellers parametrar. Datautökning används på grund av en brist på träningsdata inom vissa områden samt för att minska overfitting. Att utöka ett träningsdataset för att genomföra bildklassificering med ett generativt adversarialt nätverk (GAN) har visats kunna öka precisionen av klassificering av bilder. Denna rapport undersöker om transferlärande inom en GAN kan vidare öka klassificeringsprecisionen när ett utökat träningsdataset används. Metoden beskriver en specific GANarkitektur som innehåller ett etikettvillkor. När transferlärande används inom den utvalda GAN-arkitekturen visar en statistisk analys en statistiskt säkerställd ökning av klassificeringsprecisionen för ett klassificeringsproblem med EMNIST datasetet, som innehåller bilder på handskrivna bokstäver och siffror. I diskussionen diskuteras orsakerna bakom resultaten och fler användningsområden nämns.
36

Gawande, Saurabh. "Generative adversarial networks for single image super resolution in microscopy images." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230188.

Abstract:
Image super-resolution is a widely studied problem in computer vision, where the objective is to convert a low-resolution image to a high-resolution image. Conventional methods for achieving super-resolution, such as image priors, interpolation and sparse coding, require a lot of pre- and post-processing and optimization. Recently, deep learning methods such as convolutional neural networks and generative adversarial networks have been used to perform super-resolution with results competitive with the state of the art, but none of them have been used on microscopy images. In this thesis, a generative adversarial network, mSRGAN, is proposed for super-resolution with a perceptual loss function consisting of an adversarial loss, mean squared error and content loss. The objective of our implementation is to learn an end-to-end mapping between the low- and high-resolution images and to optimize the upscaled image for quantitative metrics as well as perceptual quality. We then compare our results with the current state-of-the-art methods in super-resolution, conduct a proof-of-concept segmentation study to show that super-resolved images can be used as an effective preprocessing step before segmentation, and validate the findings statistically.
Image Super-resolution är ett allmänt studerad problem i datasyn, där målet är att konvertera en lågupplösningsbild till en högupplöst bild. Konventionella metoder för att uppnå superupplösning som image priors, interpolation, sparse coding behöver mycket föroch efterbehandling och optimering.Nyligen djupa inlärningsmetoder som convolutional neurala nätverk och generativa adversariella nätverk är användas för att utföra superupplösning med resultat som är konkurrenskraftiga mot toppmoderna teknik, men ingen av dem har använts på mikroskopibilder. I denna avhandling, ett generativ kontradiktorisktsnätverk, mSRGAN, är föreslås för superupplösning med en perceptuell förlustfunktion bestående av en motsatt förlust, medelkvadratfel och innehållförlust.Mål med vår implementering är att lära oss ett slut på att slut kartläggning mellan bilder med låg / hög upplösning och optimera den uppskalade bilden för kvantitativa metriks såväl som perceptuell kvalitet. Vi jämför sedan våra resultat med de nuvarande toppmoderna metoderna i superupplösning, och uppträdande ett bevis på konceptsegmenteringsstudie för att visa att superlösa bilder kan användas som ett effektivt förbehandling steg före segmentering och validera fynden statistiskt.
37

Delacruz, Gian P. "Using Generative Adversarial Networks to Classify Structural Damage Caused by Earthquakes." DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2158.

Abstract:
The amount of structural damage image data produced in the aftermath of an earthquake can be staggering. It is challenging for a few human volunteers to efficiently filter and tag these images with meaningful damage information. Several solutions exist that automate post-earthquake reconnaissance image tagging, using Machine Learning (ML) to classify each occurrence of damage by building material and structural member type. ML algorithms are data driven and improve with increased training data. Thanks to the vast amount of data available and advances in computer architectures, ML, and in particular Deep Learning (DL), has become one of the most popular approaches to image classification, producing results comparable to, and in some cases superior to, those of human experts. These algorithms need the input images used for training to be labeled; even when a large number of images is available, most of them are unlabeled, and labeling them takes structural engineers a large amount of time. Current earthquake image databases either lack label information or contain incomplete labels, which significantly slows progress toward a solution, and they are incredibly difficult to search. To train an ML algorithm to classify one type of structural damage, it took the architecture school an entire year to gather 200 images of that specific damage. That number is clearly not enough to avoid overfitting, so for this thesis we decided to generate synthetic images of the specific structural damage. In particular, we attempt to use Generative Adversarial Neural Networks (GANs) to generate the synthetic images and enable the fast classification of rail and road damage caused by earthquakes. Fast classification of rail and road damage can help keep people safe and better prepare the reconnaissance teams that manage recovery tasks. GANs combine classification neural networks with generative neural networks.
For this thesis we combine a convolutional neural network (CNN) with a generative neural network. By taking a classifier trained in a GAN and modifying it to classify other images, the classifier can take advantage of the GAN training without requiring additional training data. The classifier trained in this way achieved an accuracy of 88% when classifying images of structural damage caused by earthquakes.
38

Thaung, Ludwig. "Advanced Data Augmentation : With Generative Adversarial Networks and Computer-Aided Design." Thesis, Linköpings universitet, Datorseende, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-170886.

Abstract:
CNN-based (Convolutional Neural Network) visual object detectors often reach human-level accuracy but need to be trained with large amounts of manually annotated data. Collecting and annotating this data can be time-consuming and financially expensive. Using generative models to augment the data can help minimize the amount of data required and increase detection performance. Many state-of-the-art generative models are Generative Adversarial Networks (GANs). This thesis investigates if and how one can utilize image data to generate new data through GANs to train a YOLO-based (You Only Look Once) object detector, and how CAD (Computer-Aided Design) models can aid in this process. In the experiments, different GAN models are trained and evaluated by visual inspection or with the Fréchet Inception Distance (FID) metric. The data provided by Ericsson Research consists of images of antenna and baseband equipment along with annotations and segmentations. Ericsson Research supplied the YOLO detector, and no modifications are made to this detector. Finally, the YOLO detector is trained on data generated by the chosen model and evaluated by the Average Precision (AP). The results show that the generative models designed in this work can produce RGB images of high quality. However, the quality decreases if binary segmentation masks are to be generated as well. The experiments with CAD input data did not result in images that could be used for training the detector. The GAN designed in this work is able to successfully replace objects in images with the style of other objects. The results show that training the YOLO detector with GAN-modified data leads to the same detection performance as training with real data. The results also show that the shapes and backgrounds of the antennas contributed more to detection performance than their style and colour.
39

Hinz, Tobias [Verfasser]. "Disentanglement, Compositionality, Specification: Representation Learning with Generative Adversarial Networks / Tobias Hinz." Hamburg : Staats- und Universitätsbibliothek Hamburg Carl von Ossietzky, 2021. http://d-nb.info/1234150344/34.

40

GIANSANTI, VALENTINA. "Integration of heterogeneous single cell data with Wasserstein Generative Adversarial Networks." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2023. https://hdl.handle.net/10281/404516.

Abstract:
Tissues, organs and organisms are complex biological systems, and many studies aim to characterize their biological processes. Understanding how they work and interact in healthy and diseased samples makes it possible to intervene, correcting and preventing the dysfunctions that may lead to disease. Recent advances in single-cell sequencing technologies are expanding our ability to profile various molecular layers (transcriptome, genome, epigenome, proteome) at single-cell resolution. The number of single-cell datasets, their size and the diversity of the modalities they describe are continuously increasing, prompting the need for robust methods to integrate multi-omic datasets, whether paired from the same cells or, most challenging, unpaired from separate experiments. Integrating different sources of information yields a more comprehensive description of the whole system. Most published methods allow the integration of a limited number of omics (generally two) and require prior knowledge or assumptions about their inter-relationships. They often impose the conversion of one data modality into the variables of another (e.g., ATAC peaks converted into a gene activity matrix). This step introduces a substantial level of approximation, which could affect downstream analyses. Here we propose MOWGAN (Multi Omic Wasserstein Generative Adversarial Network), a deep-learning-based framework to simulate paired multimodal data that supports a high number of modalities (more than two) and is agnostic about their relationships (no assumption is imposed). Each modality is embedded into a feature space whose dimensionality is the same across all modalities; this step avoids any conversion between data modalities. The cells, described by vectors in the reduced space, are sorted by the first component of their Laplacian Eigenmap. A Bayesian ridge regressor then selects the mini-batches used to train a Wasserstein Generative Adversarial Network with gradient penalty.
The output of the generative network is a new paired dataset, which is used as a bridge to transfer information between the real unpaired datasets. MOWGAN was prototyped on public data for which both paired and unpaired RNA and ATAC experiments exist. Evaluation was based on the ability to produce data that can be integrated with the original data, on the amount of information shared between synthetic layers, and on the ability to impose associations only between molecular layers that are truly connected: associations must not appear between cells representing different cell types. Organizing the embeddings in mini-batches gives MOWGAN a network architecture that is independent of the number of modalities evaluated. Indeed, the framework was also successfully applied to integrate three modalities (RNA, ATAC and protein; RNA, ATAC and histone modifications) and four modalities (RNA, ATAC, protein and histone modifications). MOWGAN's performance was evaluated in terms of both computational scalability and biological meaning, the latter being the more important in order to avoid erroneous conclusions. A comparison with published methods showed that MOWGAN performs better at retrieving the correct biological identity (e.g., cell types) and associations. In conclusion, MOWGAN is a powerful tool for multi-omics data integration in single-cell studies, which addresses most of the critical issues observed in the field.
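The sorting step mentioned in the abstract above (ordering cells by the first Laplacian Eigenmap component) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the MOWGAN implementation: the function names are hypothetical, the affinity graph is a dense Gaussian kernel, the unnormalized Laplacian L = D - W is used instead of the generalized eigenproblem, and the eigenvector is approximated by shifted power iteration so that only the standard library is needed.

```python
import math
import random


def first_laplacian_component(points, sigma=1.0, iters=1000, seed=0):
    """Approximate the first non-trivial eigenvector of the graph Laplacian."""
    n = len(points)
    if n < 2:
        return [0.0] * n
    # Dense Gaussian affinity matrix W with a zero diagonal (illustrative choice).
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    deg = [sum(row) for row in W]
    # Eigenvalues of L = D - W lie in [0, 2 * max degree], so shift*I - L is
    # positive semi-definite and its dominant eigenvectors are the
    # eigenvectors of L with the smallest eigenvalues.
    shift = 2.0 * max(deg)
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the trivial constant eigenvector
        Wv = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [(shift - deg[i]) * v[i] + Wv[i] for i in range(n)]  # (shift*I - L) v
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


def sort_by_first_component(points, **kwargs):
    """Return the indices of `points` ordered by the first Laplacian component."""
    v = first_laplacian_component(points, **kwargs)
    return sorted(range(len(points)), key=lambda i: v[i])


# Toy one-dimensional "embeddings" given in shuffled order.
cells = [(3.0,), (0.0,), (5.0,), (1.0,), (4.0,), (2.0,)]
order = sort_by_first_component(cells)
ordered = [cells[i][0] for i in order]
```

For points lying along a line, the resulting order follows the line in one direction or the other, since the sign of an eigenvector is arbitrary.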
41

Bartocci, John Timothy. "Generating a synthetic dataset for kidney transplantation using generative adversarial networks and categorical logit encoding." Bowling Green State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1617104572023027.

42

Paiano, Michele. "Sperimentazione di tools per la creazione e l'addestramento di Generative Adversarial Networks." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2017.

Abstract:
The main subject of this work is artificial neural networks, and in particular a class of generative models called "Generative Adversarial Networks". These represent a springboard towards building Artificial Intelligence systems able to consume raw data from the real world and automatically extract from it an understanding that represents the intrinsic structure of the world. This is a major step forward compared with the systems used in the past, which could learn only from training data carefully pre-labelled by competent humans. The final aim of this thesis is not to present state-of-the-art results, but rather to provide an accurate treatment and to describe the various design and implementation phases of a simple generative adversarial network using modern machine learning development tools.
43

Albertazzi, Riccardo. "A study on the application of generative adversarial networks to industrial OCR." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Abstract:
High performance and nearly perfect accuracy are the standards required of OCR algorithms for industrial applications. In recent years, research on Deep Learning has shown that Convolutional Neural Networks (CNNs) are a very powerful and robust tool for image analysis and classification; when applied to OCR tasks, CNNs perform much better than previously adopted techniques and easily reach 99% accuracy. However, the effectiveness of Deep Learning models relies on the quality of the data used to train them; this can become a problem since OCR tools can run for months without interruption, and during this period unpredictable variations (printer errors, background modifications, light conditions) could affect the accuracy of the trained system. We cannot expect the final user who trains the tool to take thousands of training pictures under different conditions until all imaginable variations have been captured; we therefore have to be able to generate these variations programmatically. Generative Adversarial Networks (GANs) are a recent breakthrough in machine learning; these networks are able to learn the distribution of the input data and therefore generate realistic samples belonging to that distribution. The objective of this thesis is to learn how GANs work in detail and to perform experiments on generative models that make it possible to create unseen variations of OCR training characters, thus making the whole OCR system more robust to future character variations.
44

Benedetti, Riccardo. "From Artificial Intelligence to Artificial Art: Deep Learning with Generative Adversarial Networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18167/.

Abstract:
Neural networks have had a great impact on Artificial Intelligence, and nowadays Deep Learning algorithms are widely used to extract knowledge from huge amounts of data. This thesis aims to revisit the evolution of Deep Learning from its origins to the current state of the art by focusing on a particular perspective. The main question we try to answer is: can AI exhibit artistic abilities comparable to human ones? Recalling the definition of the Turing Test, we propose a similar formulation of the concept: we would like to test the machine's ability to exhibit artistic behaviour equivalent to, or indistinguishable from, that of a human. The argument we analyze in support of this debate is an interesting and innovative idea from the field of Deep Learning, known as the Generative Adversarial Network (GAN). A GAN is basically a system composed of two neural networks fighting each other in a zero-sum game. The "bullets" fired during this challenge are simply images generated by one of the two networks. The interesting part of this scenario is that, with proper system design and training, after several iterations these fake generated images become closer and closer to the ones we see in reality, making what is real indistinguishable from what is not. We discuss some real anecdotes about GANs to further enliven the discussion raised by the question posed above, and we present some recent real-world applications based on GANs to emphasize their importance in business terms as well. We conclude with a practical experiment on an Amazon catalogue of clothing images and reviews, with the aim of generating new, never-before-seen products starting from the most popular existing ones.
45

Lenninger, Movitz. "Generative adversarial networks as integrated forward and inverse model for motor control." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-220535.

Abstract:
Internal models are believed to be crucial components of human motor control. It has been suggested that the central nervous system (CNS) uses forward and inverse models as internal representations of the motor systems. However, it is still unclear how the CNS implements the high-dimensional control of our movements. In this project, generative adversarial networks (GANs) are studied as a generative model of movement data. It is shown that, for a relatively small number of effectors (a two-part arm), it is possible to train a GAN that produces new movement samples that resemble the training data and are plausible given a simulator environment. It is believed that these models can be extended to generate high-dimensional movement data. Furthermore, this project investigates the possibility of using a trained GAN as an integrated forward and inverse model for motor control.
46

Tseng, Ching-Hsun, and 曾敬勳. "Ternary Generative Adversarial Networks." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/wgz93c.

Abstract:
Master's
National Chiao Tung University
Institute of Management of Technology
107
With the introduction of a variety of learning methods, using deep learning architectures to solve current problems has become a prevalent option, especially for image tasks. In image recognition, convolutional neural networks (CNNs) are regarded as a major achievement and have outperformed a series of other methods in competitions. Recently, semi-supervised learning methods, such as GANs, have also opened a different avenue for unsupervised image classification. In this paper, in order to offer a more robust solution, we propose the ternary generative adversarial network (TGAN), which draws on DCGAN, WGAN-GP, ACGAN, and Triple GAN. Unlike these GANs, TGAN comprises three components: the generator, the discriminator, and the supervisor. TGAN can therefore not only fulfil the original duties of distinguishing fake from real images and producing images, but also classify image labels properly and send the loss to all three components so that they update properly on low-resolution images. In our experiments and model comparisons, TGAN converges efficiently and offers decent accuracy on label classification, being more readily trainable for label discrimination than ACGAN. Most importantly, this structure helps to produce more reasonable generated images than the competing models' samples.
47

Kuo, Chun-Lin, and 郭俊麟. "Variational Bayesian Generative Adversarial Networks." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/jmkbnx.

Abstract:
Master's
National Chiao Tung University
Department of Electrical Engineering
107
In the past decade, deep neural networks have attracted considerable attention in different applications, especially in pattern recognition tasks such as image classification, object recognition, speech recognition, and speaker recognition, and in the synthesis or generation of different technical data including image, text, audio, speech and other types of complicated data. For the task of data generation, instead of estimating the density function, building a generative model makes it possible to manipulate high-dimensional probability distributions. In addition, generative models can be trained with missing data in a semi-supervised manner and can also be incorporated into reinforcement learning in many ways. In particular, the generative model based on the generative adversarial network (GAN) is seen as a realization of inverse reinforcement learning. Furthermore, GAN is also capable of learning to work with multi-modal outputs for tasks that intrinsically require the generation of samples from some distribution, as in super-resolution imaging and image-to-image translation. It is common to distinguish two types of generative models: implicit density models and explicit density models. An implicit density model implements a stochastic procedure that directly generates data. In practice, an implicit density model transforms a latent variable using a deterministic function that maps the latent variable to the observed random variable. Such a mapping function is usually realized by neural networks. The transformed density is generally intractable and the high-dimensional derivative is difficult to compute. GAN provides a practical and analytical solution to this problem. In general, GAN involves a two-player game formulated as a minimax optimization problem over two neural networks: one is the generator and the other is the discriminator.
The discriminator is a classifier that determines whether a given sample looks like a real sample from the training data or an artificially generated sample. The generator attempts to generate plausible samples that the discriminator cannot distinguish from real ones. After a series of adversarial learning steps, this model aims to estimate a converged generative distribution from the observed data. GAN has achieved remarkable performance on image generation tasks but still suffers from the mode collapse problem, so it cannot always guarantee excellent quality of the synthesized samples. On the other hand, the explicit density model provides an explicit parametric specification for the distribution of the observed data based on a log-likelihood function. Maximum likelihood provides a straightforward approach to this category of models. Within this category, the variational auto-encoder (VAE) is known as one of the most popular models, with highly flexible priors and approximate posteriors. VAE maximizes the lower bound of the data log-likelihood, which leads to excellent performance in data reconstruction. However, new images synthesized by VAE tend to be blurry. In this thesis, we develop a variational inference solution to characterize the weight uncertainty in the construction of a generative adversarial network. It is worth noting that, from a probabilistic perspective, optimizing the weights of a standard neural network is equivalent to a maximum likelihood estimation (MLE) problem. MLE ignores the weight uncertainty and easily produces an overfitted model. One common solution is to take model regularization into account. From a Bayesian perspective, regularization is performed by introducing a prior over the weights of the network. If the prior distribution is a Gaussian, then this is equivalent to L2 regularization.
Modeling the uncertainty of the weights in GAN provides a meaningful solution to model regularization, with improved generalization for different amounts of training data. Traditionally, solutions based on Laplace's approximation or on sampling with Markov Chain Monte Carlo (MCMC) either have too low a complexity or take a long time to converge. Hamiltonian Monte Carlo (HMC) is introduced to efficiently calculate the gradients from samples. Also, stochastic gradient Hamiltonian Monte Carlo (SGHMC) is used to scale up the implementation in the presence of large training data. However, the computational overhead still exists when Monte Carlo methods are used to explore the posterior space. This study addresses the issue of computational overhead, avoids convergence to local minima in each parameter set, and proposes a new variational inference method for Bayesian GAN in which the weight uncertainty in the generator and discriminator is compensated. The variational Bayesian GAN (VB-GAN) is constructed by maximizing the variational lower bound and combining it with an auto-encoder, where the generator synthesizes reasonable samples and mode collapse is prevented thanks to the reconstruction of training data. A new type of hybrid of VAE and GAN is developed to carry out adversarial learning in which blurry data generation is avoided. Importantly, data reconstruction based on the Wasserstein auto-encoder is implemented as well, and optimal transport is used to measure the geometric distance between two probability distributions. The distance is minimized to regularize the continuous mixture distribution of the latent variable so that it matches the prior distribution, instead of the conditional distribution, through the adversarial procedure. As a result, we obtain sharper results.
Finally, the proposed method is evaluated in experiments on MNIST handwritten digit image generation and classification, synthetic data from mixtures of Gaussians, and CelebA (Large-scale CelebFaces Attributes), as well as on speaker recognition with NIST i-vectors based on data augmentation. The experimental results show the merits of subspace learning based on various realizations of adversarial learning.
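Two statements in the abstract above can be written out explicitly. The notation is an assumption chosen for illustration (generator G, discriminator D, data distribution p_data, latent prior p_z, weights w, training data 𝒟, prior variance σ²) and is not taken from the thesis itself. The first line is the standard two-player minimax objective of a GAN; the second shows why a zero-mean Gaussian prior over the weights amounts to L2 regularization, since its log-density contributes a squared-norm penalty to the maximum a posteriori objective.

```latex
% Standard GAN minimax objective
\[
\min_{G}\max_{D}\;
  \mathbb{E}_{x\sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z\sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
\]

% MAP estimation with the prior p(w) = N(0, \sigma^2 I)
\[
\hat{w}_{\mathrm{MAP}}
  = \arg\max_{w}\,\bigl[\log p(\mathcal{D}\mid w) + \log p(w)\bigr]
  = \arg\min_{w}\,\Bigl[-\log p(\mathcal{D}\mid w)
      + \tfrac{1}{2\sigma^{2}}\lVert w\rVert_{2}^{2}\Bigr]
\]
```

The regularization strength is λ = 1/(2σ²), so a narrower prior corresponds to a stronger penalty.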
48

Tseng, Bo-Wei, and 曾柏偉. "Compressive Privacy Generative Adversarial Networks." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/vfdscw.

Abstract:
Master's
National Taiwan University
Graduate Institute of Communication Engineering
107
Machine learning as a service (MLaaS) has recently brought much convenience to our daily lives. However, the fact that the service is provided through the cloud raises privacy leakage issues. In this work we propose the compressive privacy generative adversarial network (CPGAN), a data-driven adversarial learning framework for generating compressing representations that retain utility comparable to the state of the art, with the additional feature of defending against reconstruction attacks. This is achieved by applying an adversarial learning scheme to the design of the compression network (privatizer), whose utility and privacy performance are evaluated by the utility classifier and the adversary reconstructor, respectively. Experimental results demonstrate that CPGAN achieves a better utility/privacy trade-off than previous work and is applicable to large real-world datasets.
49

Lee, Chia-Ruei, and 李家睿. "Using Generative Adversarial Networks for Domain Generation Algorithm." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/v7ssr7.

Abstract:
Master's
Yuan Ze University
Department of Computer Science and Engineering
106
Deep Learning has been widely used in fields such as image classification, video inpainting, and dimensionality reduction. Among the different structures of deep learning networks, the generative adversarial network (GAN) is a promising one for revolutionizing generative models. In particular, a GAN, a hybrid structure consisting of a discriminator and a generator, can be used to learn the inherent distribution of the input data. Synthetic data sampled from the learned distribution then exhibit statistics similar to those of the input data. In this thesis, we study the use of GANs as a Domain Generation Algorithm (DGA) in botnets. By putting ourselves in the botmaster’s shoes, we consider the major challenges in designing a stealthy and robust botnet, so that botnets built on the GAN-based DGA can overcome the common weaknesses. More specifically, DGAs are widely used in botnets to achieve stealthy communication between the botmaster and the bots. However, machine learning (ML)-based approaches have been developed to capture the difference between DGA-generated communication patterns and normal traffic patterns, so as to identify botnet communications. Thus, we study how to mimic the normal traffic pattern by taking advantage of a GAN-based DGA. We conducted experiments with four GANs: WGAN-GP, SeqGAN, RNN.WGAN, and RNN.WGAN via Fisher GAN. We found that, under the DGA detection engine Cymon, more than 20%–65% of the DGA-generated traffic from our GAN-based DGA can escape detection, compared with the DGA-generated traffic from Cryptolocker and Ramnit.
50

Santos, Beatriz de Jesus Pereira. "Drug Discovery with Generative Adversarial Networks." Master's thesis, 2021. http://hdl.handle.net/10316/96096.

Abstract:
Integrated Master's dissertation in Biomedical Engineering presented to the Faculty of Sciences and Technology
Drug discovery is a highly time-consuming, complex, and expensive process with low success rates, which can be attributed mainly to the high dimensionality of the chemical space. Evaluating the entire chemical space is prohibitively expensive, so it is of the utmost importance to find ways of narrowing down the search space. Deep Learning algorithms are emerging as a potential way to generate novel chemical structures, since they can speed up the traditional process and reduce its cost. Recurrent Neural Networks (RNNs) and Generative Adversarial Networks (GANs) are two of the most promising methods for generating drug-like molecules from scratch. This work resulted in two independent contributions. The first is a comprehensive study of RNN architectures and parameters, which resulted in an optimized model capable of generating up to 98.7% valid, non-specific drug-like molecules while maintaining high levels of diversity. This study also showed that stereochemical information, which is extremely important in drug development but often overlooked, can be successfully incorporated into and learned by these models. The second is a novel GAN-based framework that includes an optimization stage. This approach combines two deep learning techniques: an Encoder-Decoder model that converts the string notations of molecules into latent space vectors, effectively creating a new type of molecular representation, and a GAN that learns and replicates the training data distribution and can therefore generate new compounds. To generate compounds with bespoke properties, once the GAN is replicating the chemical space, a feedback loop is added that evaluates the generated molecules according to the desired property at every training epoch and replaces the worst-scoring entries in the training data with the best-scoring generated molecules.
This ensures a slow but steady shift of the generated distribution towards the region of the targeted property, resulting in the generation of more molecules that exhibit the desired characteristics.
Other - This research has been funded by the Portuguese Research Agency FCT, through D4 - Deep Drug Discovery and Deployment (CENTRO-01-0145-FEDER029266). This work is funded by national funds through the FCT - Foundation for Science and Technology, I.P., within the scope of the project CISUC - UID/CEC/00326/2020 and by European Social Fund, through the Regional Operational Program Centro 2020.
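The feedback loop described in the abstract above (replacing the worst-scoring training entries with the best-scoring generated molecules at each epoch) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: the function name `feedback_update`, the parameter `n_replace`, and the toy scores are all hypothetical, and a real pipeline would score decoded molecules with a property predictor rather than a lookup table.

```python
def feedback_update(training_set, generated, score, n_replace):
    """Swap the n_replace worst training entries for the best generated ones.

    `score` is any callable mapping an entry to a number (higher is better).
    """
    # Never replace more entries than exist on either side.
    k = min(n_replace, len(training_set), len(generated))
    # Keep the best-scoring training entries, dropping the k worst.
    kept = sorted(training_set, key=score, reverse=True)[: len(training_set) - k]
    # Take the k best-scoring generated entries.
    best_new = sorted(generated, key=score, reverse=True)[:k]
    return kept + best_new


# Toy example: "molecules" are labels and the scores are made-up numbers.
toy_scores = {"a": 0.1, "b": 0.9, "c": 0.4, "d": 0.2, "x": 0.8, "y": 0.3, "z": 0.95}
updated = feedback_update(["a", "b", "c", "d"], ["x", "y", "z"], toy_scores.get, 2)
```

Calling this once per epoch keeps the training set at a constant size while gradually shifting it towards higher-scoring entries, which matches the slow but steady drift the abstract describes.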