Dissertations / Theses on the topic 'Generative Adversarial Network'

Consult the top 50 dissertations / theses for your research on the topic 'Generative Adversarial Network.'

1

Daley, Jr John. "Generating Synthetic Schematics with Generative Adversarial Networks." Thesis, Högskolan Kristianstad, Fakulteten för naturvetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hkr:diva-20901.

Full text
Abstract:
This study investigates synthetic schematic generation using conditional generative adversarial networks; specifically, the Pix2Pix algorithm was implemented for the experimental phase of the study. With the increase in deep neural networks' capabilities and availability, there is a demand for large datasets. This, in combination with increased privacy concerns, has led to the use of synthetic data generation. Analysis of the synthetic images was carried out using a survey. Blueprint images were generated and passed as genuine images in 40% of cases. This study confirms the ability of generative neural networks to produce synthetic blueprint images.
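The abstract does not spell out the Pix2Pix objective it relies on, so a minimal, hedged sketch may help: Pix2Pix pairs a conditional adversarial loss with an L1 reconstruction term. The snippet below assumes PyTorch, and the tiny networks, tensor sizes and loss weighting are illustrative placeholders rather than the setup used in the thesis.

```python
# Minimal sketch of a Pix2Pix-style objective: adversarial term + L1 term.
# PyTorch assumed; the tiny networks and 32x32 tensors are placeholders.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Tanh(),
        )
    def forward(self, x):                    # x: conditioning image (e.g. a sketch)
        return self.net(x)

class TinyDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(16, 1, 3, stride=2, padding=1),   # patch-wise logits
        )
    def forward(self, x, y):                 # judges (input, output) pairs
        return self.net(torch.cat([x, y], dim=1))

G, D = TinyGenerator(), TinyDiscriminator()
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
lambda_l1 = 100.0                            # weight used in the original Pix2Pix paper

x = torch.randn(4, 1, 32, 32)                # stand-in for input schematics
y = torch.randn(4, 1, 32, 32)                # stand-in for target blueprint images

fake = G(x)
real_logits, fake_logits = D(x, y), D(x, fake.detach())
d_loss = bce(real_logits, torch.ones_like(real_logits)) + \
         bce(fake_logits, torch.zeros_like(fake_logits))     # discriminator objective

g_logits = D(x, fake)
g_loss = bce(g_logits, torch.ones_like(g_logits)) + lambda_l1 * l1(fake, y)  # generator objective
print(d_loss.item(), g_loss.item())
```

In the published Pix2Pix method the generator is a U-Net and the discriminator a PatchGAN; the loss structure, however, has the same shape as sketched here.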
APA, Harvard, Vancouver, ISO, and other styles
2

Zeid, Baker Mousa. "Generation of Synthetic Images with Generative Adversarial Networks." Thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-15866.

Full text
Abstract:
Machine Learning is a fast-growing area that revolutionizes computer programs by providing systems with the ability to automatically learn and improve from experience. In most cases, the training process begins with extracting patterns from data. Data is a key factor for machine learning algorithms; without data the algorithms will not work. Thus, having sufficient and relevant data is crucial for the performance. In this thesis, the researcher tackles the problem of not having a sufficient dataset, in terms of the number of training examples, for an image classification task. The idea is to use Generative Adversarial Networks to generate synthetic images similar to the ground truth, and in this way expand a dataset. Two types of experiments were conducted: the first was used to fine-tune a Deep Convolutional Generative Adversarial Network for a specific dataset, while the second experiment was used to analyze how synthetic data examples affect the accuracy of a Convolutional Neural Network in a classification task. Three well-known datasets were used in the first experiment, namely MNIST, Fashion-MNIST and Flower photos, while two datasets were used in the second experiment: MNIST and Fashion-MNIST. The generated MNIST and Fashion-MNIST images had good overall quality. Some classes had clear visual errors while others were indistinguishable from ground-truth examples. When it comes to the Flower photos, the generated images suffered from poor visual quality, and one can easily tell the synthetic images from the real ones. One reason for the poor performance is the large quantity of noise in the Flower photos dataset, which made it difficult for the model to spot the important features of the flowers. The results from the second experiment show that the accuracy does not increase when the two datasets, MNIST and Fashion-MNIST, are expanded with synthetic images. This is not because the generated images had bad visual quality, but because the accuracy turned out not to be highly dependent on the number of training examples. It can be concluded that Deep Convolutional Generative Adversarial Networks are capable of generating synthetic images similar to the ground truth and thus can be used to expand a dataset. However, this approach does not completely solve the initial problem of not having adequate datasets, because Deep Convolutional Generative Adversarial Networks may themselves require, depending on the dataset, a large quantity of training examples.
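As a rough illustration of the dataset-expansion step described above, the sketch below samples an already-trained generator and appends the synthetic images to the training set. NumPy is assumed; the generator, array shapes and label handling are hypothetical, and the dummy generator exists only so the snippet runs stand-alone.

```python
# Sketch of expanding a training set with DCGAN-generated samples.
# `generator` is assumed to be an already-trained model mapping latent
# vectors to images; shapes and class handling are illustrative only.
import numpy as np

def expand_dataset(x_train, y_train, generator, n_synthetic, label, latent_dim=100):
    """Append n_synthetic generated images with a fixed class label."""
    z = np.random.normal(size=(n_synthetic, latent_dim)).astype(np.float32)
    x_fake = generator(z)                       # expected shape: (n_synthetic, 28, 28, 1)
    y_fake = np.full(n_synthetic, label)
    return np.concatenate([x_train, x_fake]), np.concatenate([y_train, y_fake])

# Usage with a dummy "generator" so the sketch runs stand-alone:
dummy_gen = lambda z: np.zeros((len(z), 28, 28, 1), dtype=np.float32)
x = np.zeros((10, 28, 28, 1), dtype=np.float32)
y = np.zeros(10, dtype=int)
x_aug, y_aug = expand_dataset(x, y, dummy_gen, n_synthetic=5, label=3)
print(x_aug.shape, y_aug.shape)     # (15, 28, 28, 1) (15,)
```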
APA, Harvard, Vancouver, ISO, and other styles
3

Aftab, Nadeem. "Disocclusion Inpainting using Generative Adversarial Networks." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-40502.

Full text
Abstract:
The older methods used for image inpainting in the Depth Image Based Rendering (DIBR) process are inefficient at producing high-quality virtual views from captured data. From the viewpoint of the original image, the generated data's structure appears less distorted in a virtual view obtained by translation, but when the virtual view involves rotation, gaps and missing spaces become visible in the DIBR-generated data. The typical approaches for filling the disocclusion tend to be slow, inefficient, and inaccurate. In this project, a modern technique, the Generative Adversarial Network (GAN), is used to fill the disocclusion. A GAN consists of two or more neural networks that compete against each other and get trained. The results of this study show that a GAN can inpaint the disocclusion with structural consistency. Additionally, another method (filling) is used to enhance the quality of the GAN and DIBR images. The statistical evaluation of the results shows that the GAN and the filling method enhance the quality of DIBR images.
APA, Harvard, Vancouver, ISO, and other styles
4

Vanhainen, Erik, and Johan Adamsson. "Generating Realistic Neuronal Morphologies in 3D using a Generative Adversarial Network." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301788.

Full text
Abstract:
Neuronal morphology is primarily responsible for the structure of the connectivity among neurons and is an important determinant of neuronal activity. This raises questions about the relationship between neuron shape and neuron function. To further investigate the structure-function relationship in neurons, extensive modelling with more morphological data is key. Digitally reconstructing neurons is tedious and requires a lot of manual labour, and hence several generative methods have been proposed. However, these generative models utilize the current understanding of neuronal morphology, often by imposing a priori constraints, and thus may be biased or may not capture reality fully. We present an alternative technique using a Generative Adversarial Network that generates neurons without being constrained by current human understanding. The model was trained on digital reconstructions of pyramidal cells from rats and mice in a voxelized representation with dimensionality 128³. The results show that the model can generate objects that exhibit realistic neuronal features with a wide variety of shapes. Even though realistic features are present in the generated objects, they are often easily distinguishable from real neurons because of small discontinuous parts and noise in the complex arborizations. Nevertheless, this work can be seen as a proof of concept for generating realistic three-dimensional morphologies in an unbiased manner.
APA, Harvard, Vancouver, ISO, and other styles
5

Yamazaki, Hiroyuki Vincent. "On Depth and Complexity of Generative Adversarial Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217293.

Full text
Abstract:
Although generative adversarial networks (GANs) have achieved state-of-the-art results in generating realistic-looking images, they are often parameterized by neural networks with relatively few learnable weights compared to those that are used for discriminative tasks. We argue that this is suboptimal in a generative setting where data is often entangled in high dimensional space and models are expected to benefit from high expressive power. Additionally, in a generative setting, a model often needs to extrapolate missing information from low dimensional latent space when generating data samples, while in a typical discriminative task, the model only needs to extract lower dimensional features from high dimensional space. We evaluate different architectures for GANs with varying model capacities using shortcut connections in order to study the impacts of the capacity on training stability and sample quality. We show that while training tends to oscillate and not benefit from the additional capacity of naively stacked layers, GANs are capable of generating samples of higher quality, specifically, for images, samples of higher visual fidelity, given proper regularization and careful balancing.
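The abstract contrasts naively stacked layers with capacity added through shortcut connections; the following is a minimal sketch of such a residual block, assuming PyTorch, with placeholder channel counts rather than the architectures evaluated in the thesis.

```python
# Sketch of a generator building block with a shortcut (residual) connection,
# as opposed to naively stacking convolutions. PyTorch assumed; sizes are placeholders.
import torch
import torch.nn as nn

class ShortcutBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
    def forward(self, x):
        # The shortcut lets extra depth refine features without blocking gradients.
        return torch.relu(x + self.body(x))

x = torch.randn(2, 64, 16, 16)
print(ShortcutBlock(64)(x).shape)   # torch.Size([2, 64, 16, 16])
```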
APA, Harvard, Vancouver, ISO, and other styles
6

Oskarsson, Joel. "Probabilistic Regression using Conditional Generative Adversarial Networks." Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166637.

Full text
Abstract:
Regression is a central problem in statistics and machine learning with applications everywhere in science and technology. In probabilistic regression the relationship between a set of features and a real-valued target variable is modelled as a conditional probability distribution. There are cases where this distribution is very complex and not properly captured by simple approximations, such as assuming a normal distribution. This thesis investigates how conditional Generative Adversarial Networks (GANs) can be used to properly capture more complex conditional distributions. GANs have seen great success in generating complex high-dimensional data, but less work has been done on their use for regression problems. This thesis presents experiments to better understand how conditional GANs can be used in probabilistic regression. Different versions of GANs are extended to the conditional case and evaluated on synthetic and real datasets. It is shown that conditional GANs can learn to estimate a wide range of different distributions and be competitive with existing probabilistic regression models.
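To make the idea of GAN-based probabilistic regression concrete, the sketch below shows how a conditional generator G(x, z) represents p(y | x): for a fixed feature vector x, repeated noise draws yield an empirical conditional distribution. PyTorch is assumed, and the untrained toy network only illustrates the interface, not the models evaluated in the thesis.

```python
# Sketch of how a conditional GAN generator represents p(y | x) in regression:
# a trained generator G(x, z) is sampled repeatedly for a fixed feature vector x.
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    def __init__(self, x_dim=5, z_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim + z_dim, 64), nn.ReLU(), nn.Linear(64, 1))
    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1))   # one sample of the target y

G = CondGenerator()
x = torch.randn(1, 5).repeat(1000, 1)                # fix the features, repeat 1000 times
z = torch.randn(1000, 8)                             # different noise per sample
with torch.no_grad():
    y_samples = G(x, z)                              # empirical draws from p(y | x)
print(y_samples.mean().item(), y_samples.std().item())  # conditional mean / spread
```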
APA, Harvard, Vancouver, ISO, and other styles
7

Li, Yuchuan. "Dual-Attention Generative Adversarial Network and Flame and Smoke Analysis." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42774.

Full text
Abstract:
Flame and smoke image processing and analysis could improve the detection of smoke or fire and the identification of many complicated fire hazards, ultimately helping firefighters to fight fires safely. Deep learning applied to image processing has become prevalent in recent years among image-related research fields. Fire safety researchers have also brought it into their studies due to its leading performance in image-related tasks and statistical analysis. From the perspective of input data type, traditional fire research is based on simple mathematical regressions or empirical correlations relying on sensor data, such as temperature. However, data from advanced vision devices or sensors can be analyzed by applying deep learning beyond auxiliary methods in data processing and analysis. Deep learning has a greater capacity for non-linear problems, especially in high-dimensional spaces, such as flame and smoke image processing. We propose a video-based real-time smoke and flame analysis system built with deep learning networks and fire safety knowledge. It takes videos of fire as input and produces analysis and prediction of flashover. Our system consists of four modules. The Color2IR Conversion module is built from deep neural networks to convert RGB video frames into InfraRed (IR) frames, which provide important thermal information about the fire. Thermal information is critically important for fire hazard detection; for example, 600 °C marks the start of a flashover. As RGB cameras cannot capture thermal information, we propose an image conversion module from RGB to IR images. The core of this conversion is a new network that we propose: the Dual-Attention Generative Adversarial Network (DAGAN), which is trained using pairs of RGB and IR images. Next, a Video Semantic Segmentation Module helps extract flame and smoke areas from the scene in the RGB video frames. We innovatively use synthetic RGB video data generated and captured from 3D modeling software for data augmentation. After that, a Video Prediction Module takes the RGB video frames and IR frames as input and produces predictions of the subsequent frames of their scenes. Finally, a Fire Knowledge Analysis Module predicts whether flashover is coming or not, based on fire knowledge criteria such as thermal information extracted from IR images, temperature increase rate, the flashover occurrence temperature, and the increase rate of the lowest temperature. As for our contributions and innovations, we introduce a novel network, DAGAN, which applies foreground and background attention mechanisms in the image conversion module to help reduce the hardware requirements for flashover prediction. Besides, we make use of a combination of thermal information from IR images and segmentation information from RGB images in our system for flame and smoke analysis. We also apply a hybrid design of deep neural networks and a knowledge-based system to achieve high accuracy. Moreover, data augmentation is applied to the Video Semantic Segmentation Module by introducing synthetic video data for training. The test results for flashover prediction show that our system leads both quantitatively and qualitatively in terms of various metrics compared with other existing approaches. It can give a flashover prediction as early as 51 seconds before it happens, with 94.5% accuracy.
APA, Harvard, Vancouver, ISO, and other styles
8

Rinnarv, Jonathan. "GANChat : A Generative Adversarial Network approach for chat bot learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-278143.

Full text
Abstract:
Recently, a new method for training generative neural networks called Generative Adversarial Networks (GAN) has shown great results in the computer vision domain and shown potential in other generative machine learning tasks as well. GAN training is an adversarial training method where two neural networks compete and attempt to outperform each other, and in the process both learn. In this thesis the effectiveness of GAN training is tested on conversational agents, also called chat bots. To test this, models trained with current state-of-the-art methods such as Maximum Likelihood Estimation (MLE) are compared with GAN-trained models. Model performance was measured by the closeness of the model distribution to the target distribution after training. This thesis shows that the GAN method performs worse than MLE in some scenarios but can outperform MLE in other cases.
APA, Harvard, Vancouver, ISO, and other styles
9

Cabezas, Rodríguez Juan Pablo. "Generative adversarial network based model for multi-domain fault diagnosis." Tesis, Universidad de Chile, 2019. http://repositorio.uchile.cl/handle/2250/170996.

Full text
Abstract:
Thesis submitted for the degree of Mechanical Civil Engineer
With the use of deep neural networks gaining notoriety in the prognostics and health management (PHM) field, sensors getting progressively cheaper, and improved algorithms, the lack of data has become a major issue for data-driven models. Data which is labelled and applicable to specific scenarios is scarce at best. The purpose of this work is to develop a method to diagnose the health state of a bearing in limited-data situations. Nowadays most techniques focus on improving diagnosis accuracy and estimating remaining useful life for well-documented components. As it stands, current methods are ineffective in limited-data scenarios. A method was developed wherein vibration signals are used to create scalograms and spectrograms, which in turn are used to train generative and classification neural networks with the goal of diagnosing a partially or totally unknown dataset based on a fully labelled one. Results were compared to a simpler method in which a classification network is trained on the labelled dataset and used directly to diagnose the unknown dataset. The Case Western Reserve University Bearing Dataset (CWR) and the Society for Machine Failure Prevention Technology Bearing Dataset were used as inputs; both datasets were used as both labelled and unknown. For classification a Convolutional Neural Network (CNN) was designed. A Generative Adversarial Network (GAN) was used as the generative model, based on the network introduced in the paper StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. Results were favourable for the CNN whilst generally negative for the GAN. Result analysis suggests that the cost function is unsuitable for the proposed problem. The conclusions state that cycle-based image-to-image translation does not work correctly on vibration signals for bearing diagnosis.
APA, Harvard, Vancouver, ISO, and other styles
10

Desentz, Derek. "Partial Facial Re-imaging Using Generative Adversarial Networks." Wright State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=wright1622122813797895.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Ankaräng, Fredrik. "Generative Adversarial Networks for Cross-Lingual Voice Conversion." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-299560.

Full text
Abstract:
Speech synthesis is a technology that increasingly influences our daily lives, in the form of smart assistants, advanced translation systems and similar applications. In this thesis, the phenomenon of making one’s voice sound like the voice of someone else is explored. This topic is called voice conversion and needs to be done without altering the linguistic content of speech. More specifically, a Cycle-Consistent Adversarial Network that has proven to work well in a monolingual setting, is evaluated in a multilingual environment. The model is trained to convert voices between native speakers from the Nordic countries. In the experiments no parallel, transcribed or aligned speech data is being used, forcing the model to focus on the raw audio signal. The goal of the thesis is to evaluate if performance is degraded in a multilingual environment, in comparison to monolingual voice conversion, and to measure the impact of the potential performance drop. In the study, performance is measured in terms of naturalness and speaker similarity between the generated speech and the target voice. For evaluation, listening tests are conducted, as well as objective comparisons of the synthesized speech. The results show that voice conversion between a Swedish and Norwegian speaker is possible and also that it can be performed without performance degradation in comparison to Swedish-to-Swedish conversion. Furthermore, conversion between Finnish and Swedish speakers, as well as Danish and Swedish speakers show a performance drop for the generated speech. However, despite the performance decrease, the model produces fluent and clearly articulated converted speech in all experiments. These results are noteworthy, especially since the network is trained on less than 15 minutes of nonparallel speaker data for each speaker. This thesis opens up for further areas of research, for instance investigating more languages, more recent Generative Adversarial Network architectures and devoting more resources to tweaking the hyperparameters to further optimize the model for multilingual voice conversion.
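The Cycle-Consistent Adversarial Network mentioned above hinges on a cycle-consistency term: converting speaker A to B and back should recover the original features. A minimal sketch of that term, assuming PyTorch and placeholder networks and feature tensors (not the thesis's actual model or data), is given below.

```python
# Sketch of the cycle-consistency idea behind CycleGAN-style voice conversion:
# converting speaker A -> B -> A should reproduce the original features.
# G_AB / G_BA are placeholder networks; tensors stand in for acoustic features.
import torch
import torch.nn as nn

G_AB = nn.Sequential(nn.Linear(80, 128), nn.ReLU(), nn.Linear(128, 80))  # A -> B
G_BA = nn.Sequential(nn.Linear(80, 128), nn.ReLU(), nn.Linear(128, 80))  # B -> A
l1 = nn.L1Loss()

feats_a = torch.randn(16, 80)                 # features from speaker A
feats_b = torch.randn(16, 80)                 # features from speaker B

cycle_loss = l1(G_BA(G_AB(feats_a)), feats_a) + l1(G_AB(G_BA(feats_b)), feats_b)
# In training this term is added to the usual adversarial losses so the mapping
# preserves linguistic content while only the speaker identity changes.
print(cycle_loss.item())
```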
APA, Harvard, Vancouver, ISO, and other styles
12

Radhakrishnan, Saieshwar. "Domain Adaptation of IMU sensors using Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286821.

Full text
Abstract:
Autonomous vehicles rely on sensors for a clear understanding of the environment and in a heavy duty truck, the sensors are placed at multiple locations like the cabin, chassis and the trailer in order to increase the field of view and reduce the blind spot area. Usually, these sensors perform best when they are stationary relative to the ground, hence large and fast movements, which are quite common in a truck, may lead to performance reduction, erroneous data or in the worst case, a sensor failure. This enforces a need to validate the sensors before using them for making life-critical decisions. This thesis proposes Domain Adaptation as one of the strategies to co-validate Inertial Measurement Unit (IMU) sensors. The proposed Generative Adversarial Network (GAN) based framework predicts the data of one IMU using other IMUs in the truck by implicitly learning the internal dynamics. This prediction model along with other sensor fusion strategies would be used by the supervising system to validate the IMUs in real-time. Through data collected from real-world experiments, it is shown that the proposed framework is able to accurately transform raw IMU sequences across domains. A further comparison is made between Long Short Term Memory (LSTM) and WaveNet based architectures to show the superiority of WaveNets in terms of performance and computational efficiency.
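The abstract compares LSTM and WaveNet-based architectures for predicting one IMU's signal from the others. The sketch below, assuming PyTorch and placeholder sizes, shows the two kinds of sequence models side by side; the dilated-convolution stack uses symmetric rather than causal padding, so it is only an approximation of a true WaveNet block.

```python
# Sketch of the two sequence architectures compared in the abstract: an LSTM
# and a WaveNet-style stack of dilated 1D convolutions. Sizes are placeholders.
import torch
import torch.nn as nn

class DilatedConvPredictor(nn.Module):
    def __init__(self, in_channels=6, hidden=32, out_channels=6):
        super().__init__()
        layers = []
        for d in (1, 2, 4, 8):                       # exponentially growing receptive field
            layers += [nn.Conv1d(in_channels if d == 1 else hidden, hidden, 3,
                                 padding=d, dilation=d), nn.ReLU()]
        layers += [nn.Conv1d(hidden, out_channels, 1)]
        self.net = nn.Sequential(*layers)
    def forward(self, x):                            # x: (batch, channels, time)
        return self.net(x)

lstm = nn.LSTM(input_size=6, hidden_size=32, batch_first=True)
source_imus = torch.randn(4, 200, 6)                 # 200 time steps, 6 IMU channels

conv_out = DilatedConvPredictor()(source_imus.transpose(1, 2))   # (4, 6, 200)
lstm_out, _ = lstm(source_imus)                                   # (4, 200, 32)
print(conv_out.shape, lstm_out.shape)
```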
APA, Harvard, Vancouver, ISO, and other styles
13

Hermoza, Aragonés Renato. "3D Reconstruction of Incomplete Archaeological Objects Using a Generative Adversarial Network." Master's thesis, Pontificia Universidad Católica del Perú, 2018. http://tesis.pucp.edu.pe/repositorio/handle/123456789/12263.

Full text
Abstract:
We introduce a data-driven approach to aid the repair and conservation of archaeological objects: ORGAN, an object reconstruction generative adversarial network (GAN). By using an encoder-decoder 3D deep neural network on a GAN architecture, and combining two loss objectives: a completion loss and an Improved Wasserstein GAN loss, we can train a network to effectively predict the missing geometry of damaged objects. As archaeological objects can differ greatly from one another, the network is conditioned on a variable, which can be a culture, a region or any metadata of the object. In our results, we show that our method can recover most of the information from damaged objects, even in cases where more than half of the voxels are missing, without producing many errors.
Thesis
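The ORGAN model described above combines a completion (reconstruction) loss with an Improved Wasserstein GAN loss. A rough sketch of how such terms can be combined is given below, assuming PyTorch; the flattened voxel vectors, stand-in networks and loss weights are illustrative, not the thesis's implementation.

```python
# Rough sketch of combining a completion (reconstruction) loss with an
# Improved Wasserstein GAN (gradient-penalty) loss. All networks are stand-ins.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 32), nn.Sigmoid())  # encoder-decoder stand-in
C = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))                 # critic stand-in

damaged = torch.rand(8, 32)          # flattened, partially missing voxel grids
complete = torch.rand(8, 32)         # ground-truth complete objects

fake = G(damaged)
completion_loss = nn.functional.mse_loss(fake, complete)

# WGAN-GP critic terms: Wasserstein estimate plus a gradient penalty
eps = torch.rand(8, 1)
interp = (eps * complete + (1 - eps) * fake.detach()).requires_grad_(True)
grad = torch.autograd.grad(C(interp).sum(), interp, create_graph=True)[0]
gp = ((grad.norm(2, dim=1) - 1) ** 2).mean()
critic_loss = C(fake.detach()).mean() - C(complete).mean() + 10.0 * gp

# Generator combines the adversarial term with the completion term
gen_loss = -C(fake).mean() + 100.0 * completion_loss   # the weighting is illustrative
print(critic_loss.item(), gen_loss.item())
```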
APA, Harvard, Vancouver, ISO, and other styles
14

Wu, Chaoyun M. ArchMassachusetts Institute of Technology. "Machine learning in housing design : exploration of generative adversarial network in site plan / floorplan generation." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/129855.

Full text
Abstract:
Thesis: M. Arch., Massachusetts Institute of Technology, Department of Architecture, February, 2020
Cataloged from student-submitted thesis.
Includes bibliographical references (pages 99-100).
Technology has always been an important factor that shapes the way we think about Architecture. In recent years, Machine Learning technology has been gaining more and more attention. Different from traditional types of programming that rely on explicit instructions, Machine Learning allows computers to learn to execute certain tasks "by themselves". This new technology has revolutionized many industries and shown much potential. Examples like AlphaGo and OpenAI Five have shown Machine Learning's capability in solving complex problems. The Architectural design industry is not an exception. Early-stage explorations of this technology are emerging and have shown potential in solving certain design problems. However, basic questions regarding the nature of Machine Learning and its role in Architectural design remain to be answered. What does Machine Learning mean to Architecture? What will be its role in Architectural design? Will it replace human architects? Will it merely be a design tool? Or is it relevant to Architecture at all? To answer these questions, this thesis explored a specific type of Machine Learning algorithm called Pix2Pix to investigate what can and cannot be learned by a computer through Machine Learning, and to evaluate what Machine Learning means for architects. It concluded that Machine Learning cannot be a creative design agent, but can be a powerful tool for solving conventional design problems. On this basis, this thesis proposed a prototype pipeline for integrating the technology into the design process, which is a combination of a Generative Adversarial Network (Pix2Pix), a Bayesian Network and an Evolutionary Algorithm.
by Chaoyun Wu.
M. Arch.
APA, Harvard, Vancouver, ISO, and other styles
15

Eisenbeiser, Logan Ryan. "Latent Walking Techniques for Conditioning GAN-Generated Music." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/100052.

Full text
Abstract:
Artificial music generation is a rapidly developing field focused on the complex task of creating neural networks that can produce realistic-sounding music. Generating music is very difficult; components like long- and short-term structure present temporal complexity, which can be difficult for neural networks to capture. Additionally, the acoustics of musical features like harmonies and chords, as well as timbre and instrumentation, require complex representations for a network to generate them accurately. Various techniques for both music representation and network architecture have been used in the past decade to address these challenges in music generation. The focus of this thesis extends beyond generating music to the challenge of controlling and/or conditioning that generation. Conditional generation involves an additional piece or pieces of information which are input to the generator and constrain aspects of the results. Conditioning can be used to specify a tempo for the generated song, increase the density of notes, or even change the genre. Latent walking is one of the most popular techniques in conditional image generation, but its effectiveness on music-domain generation is largely unexplored. This paper focuses on latent walking techniques for conditioning the music generation network MuseGAN and examines the impact of this conditioning on the generated music.
Master of Science
Artificial music generation is a rapidly developing field focused on the complex task of creating neural networks that can produce realistic-sounding music. Beyond simply generating music lies the challenge of controlling or conditioning that generation. Conditional generation can be used to specify a tempo for the generated song, increase the density of notes, or even change the genre. Latent walking is one of the most popular techniques in conditional image generation, but its effectiveness on music-domain generation is largely unexplored, especially for generative adversarial networks (GANs). This paper focuses on latent walking techniques for conditioning the music generation network MuseGAN and examines the impact and effectiveness of this conditioning on the generated music.
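Latent walking, the central technique of this thesis, simply means stepping through the generator's latent space so that the output changes gradually. A minimal NumPy sketch of a linear walk and of spherical interpolation (slerp) between two latent codes follows; the vector sizes and step counts are illustrative, and each resulting latent vector would be fed to a generator such as MuseGAN.

```python
# Sketch of latent walking: moving through a GAN's latent space in small steps
# so that the generated music (or images) changes gradually.
import numpy as np

def linear_walk(z_start, direction, n_steps=8, step_size=0.25):
    """Return a sequence of latent vectors along a straight line."""
    return np.stack([z_start + i * step_size * direction for i in range(n_steps)])

def slerp(z0, z1, t):
    """Spherical interpolation, often preferred for Gaussian latent spaces."""
    omega = np.arccos(np.clip(np.dot(z0 / np.linalg.norm(z0),
                                     z1 / np.linalg.norm(z1)), -1.0, 1.0))
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(0)
z0, z1 = rng.normal(size=128), rng.normal(size=128)
path = linear_walk(z0, direction=(z1 - z0) / np.linalg.norm(z1 - z0))
curve = np.stack([slerp(z0, z1, t) for t in np.linspace(0.0, 1.0, 8)])
# Each row of `path` or `curve` would be rendered by the generator as one step of the walk.
print(path.shape, curve.shape)      # (8, 128) (8, 128)
```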
APA, Harvard, Vancouver, ISO, and other styles
16

Pineda, Ancco Ferdinand Edgardo. "A generative adversarial network approach for super resolution of sentinel-2 satellite images." Master's thesis, Pontificia Universidad Católica del Perú, 2020. http://hdl.handle.net/20.500.12404/16137.

Full text
Abstract:
Recently, the number of operational satellites offering very high-resolution (VHR) images has increased significantly, but they remain a small proportion compared with existing lower-resolution (HR) satellites. Our work proposes an alternative for obtaining VHR images: improving the spatial resolution of HR images acquired by the Sentinel-2 satellite by using VHR images from PeruSat-1, a Peruvian satellite, which serve as the reference for a super-resolution approach based on a Generative Adversarial Network (GAN) model. The VHR PeruSat-1 image dataset is used for the training process of the network. The results obtained were analyzed considering the Peak Signal to Noise Ratio (PSNR), the Structural Similarity (SSIM) and the Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS). Finally, some visual outcomes over a given testing dataset are presented so that the performance of the model can be analyzed as well.
Research work
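For reference, two of the metrics named in the abstract can be sketched compactly. The snippet below gives a NumPy version of PSNR and of ERGAS as it is commonly defined (100 times the resolution ratio times the root mean square of per-band relative RMSE); the exact convention for the resolution ratio and the random test arrays are assumptions of this sketch rather than details from the thesis.

```python
# Sketch of two super-resolution evaluation metrics, PSNR and ERGAS, for
# multi-band images. `ratio` is the high-to-low resolution ratio (an assumption).
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(max_value ** 2 / mse)

def ergas(reference, estimate, ratio):
    """reference, estimate: arrays of shape (bands, H, W)."""
    band_terms = []
    for ref_b, est_b in zip(reference, estimate):
        rmse = np.sqrt(np.mean((ref_b - est_b) ** 2))
        band_terms.append((rmse / ref_b.mean()) ** 2)
    return 100.0 * ratio * np.sqrt(np.mean(band_terms))

rng = np.random.default_rng(0)
ref = rng.random((4, 64, 64)) + 0.5          # 4 spectral bands
est = ref + 0.01 * rng.normal(size=ref.shape)
print(psnr(ref, est, max_value=1.5), ergas(ref, est, ratio=0.25))
```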
APA, Harvard, Vancouver, ISO, and other styles
17

Sargent, Garrett Craig. "A Conditional Generative Adversarial Network Demosaicing Strategy for Division of Focal Plane Polarimeters." University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1606050550958383.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Bartocci, John Timothy. "Generating a synthetic dataset for kidney transplantation using generative adversarial networks and categorical logit encoding." Bowling Green State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1617104572023027.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Benedetti, Riccardo. "From Artificial Intelligence to Artificial Art: Deep Learning with Generative Adversarial Networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18167/.

Full text
Abstract:
Neural Networks have had a great impact on Artificial Intelligence, and nowadays Deep Learning algorithms are widely used to extract knowledge from huge amounts of data. This thesis aims to revisit the evolution of Deep Learning from its origins to the current state of the art by focusing on a particular perspective. The main question we try to answer is: can AI exhibit artistic abilities comparable to those of humans? Recalling the definition of the Turing Test, we propose a similar formulation of the concept: we would like to test the machine's ability to exhibit artistic behaviour equivalent to, or indistinguishable from, that of a human. The argument we will analyze in support of this debate is an interesting and innovative idea coming from the field of Deep Learning, known as the Generative Adversarial Network (GAN). A GAN is basically a system composed of two neural networks fighting each other in a zero-sum game. The "bullets" fired during this challenge are simply images generated by one of the two networks. The interesting part in this scenario is that, with a proper system design and training, after several iterations these generated images start to become closer and closer to the ones we see in reality, making it hard to distinguish what is real from what is not. We will talk about some real anecdotes around GANs to spice up the discussion generated by the question posed above, and we will present some recent real-world applications based on GANs to emphasize their importance also in terms of business. We will conclude with a practical experiment on an Amazon catalogue of clothing images and reviews, with the aim of generating new, never-seen products starting from the most popular existing ones.
APA, Harvard, Vancouver, ISO, and other styles
20

Li, Jiawei. "Semantically Correct High-resolution CT Image Interpolation and its Application." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/41150.

Full text
Abstract:
Image interpolation in the medical area is of vital importance, as most 3D biomedical volume images are sampled such that the distance between consecutive slices is significantly greater than the in-plane pixel size due to radiation dose or scanning time. Image interpolation creates a certain number of new slices between known slices in order to obtain an isotropic volume image. The results can be used for higher-quality 2D and 3D visualization or reconstruction of human body structure. Semantic interpolation on the manifold has proved to be very useful for smoothing the interpolation process. Nevertheless, all previous methods focused on low-resolution image interpolation, and most of them work poorly on high-resolution images. Besides, the medical field sets a high threshold for the quality of interpolations, as they need to be semantic and realistic enough, and to resemble real data with only small errors permitted. Typically, images are downsampled to 32² and 64² for semantic interpolation, which does not meet the requirement for high resolution in the medical field. Thus, we explore a novel way to generate semantically correct interpolations while maintaining the resolution. Our method has been shown to generate realistic and high-resolution interpolations at sizes of 256² and 512². Our main contributions are as follows. First, we propose a novel network, the High Resolution Interpolation Network (HRINet), aiming at producing semantically correct high-resolution CT image interpolations. Second, by combining the ideas of ACAI and GANs, we propose a unique alternative supervision method that applies supervised and unsupervised training alternately to raise the accuracy and fidelity of body structure in interpolated CT while keeping high quality. Third, we introduce an extra Markovian discriminator as a texture or fine-detail regularizer to make our model generate results indistinguishable from real data. In addition, we explore other possibilities or tricks to further improve the performance of our model, including mixing low-level feature maps and removing batch normalization layers within the autoencoder. Moreover, we compare the impacts of MSE-based and perceptual-based loss optimization methods for high-quality interpolation, and show the trade-off between structural correctness and sharpness. The interpolation experiments show significant improvement at both 256² and 512² image sizes, quantitatively and qualitatively. We find that interpolations produced by HRINet are sharper and more realistic compared with other existing methods such as AE and ACAI in terms of various metrics. As an application of high-resolution interpolation, we have performed 2D volume projection and 3D volume reconstruction from axial-view CT data and their interpolations. We show the great enhancement from applying HRINet in both sharpness and fidelity. Specifically, for 2D volume projection, we explore orthogonal projection and weighted projection respectively, so as to show the improved effectiveness for visualizing internal and external human body structure.
APA, Harvard, Vancouver, ISO, and other styles
21

Brolli, Sara. "Sviluppo di un tool configurabile per il training di un Adversarial Autoencoder." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14413/.

Full text
Abstract:
An interesting application of neural networks has been their use as generative models: algorithms capable of replicating the distribution of the input data in order to then generate new values from that distribution. This thesis analyses three particularly promising models that have emerged in recent years: Variational Autoencoders, Generative Adversarial Networks, and their combination, the Adversarial Autoencoder (AAE). It presents the implementation of a command-line tool that allows training an AAE and generating images from an already trained model, with the additional possibility of customizing the characteristics of the neural networks used.
APA, Harvard, Vancouver, ISO, and other styles
22

Gustafsson, Alexander, and Jonatan Linberg. "Investigation of generative adversarial network training : The effect of hyperparameters on training time and stability." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-19847.

Full text
Abstract:
Generative Adversarial Networks (GAN) is a technique used to learn the distribution of some dataset in order to generate similar data. GAN models are notoriously difficult to train, which has limited their deployment in industry. The results of this study can be used to accelerate the process of making GANs production-ready. An experiment was conducted in which multiple GAN models were trained, with the hyperparameters Leaky ReLU alpha, convolutional filters, learning rate and batch size as independent variables. A Mann-Whitney U-test was used to compare the training time and training stability of each model to the others'. Except for the Leaky ReLU alpha, changes to the investigated hyperparameters had a significant effect on the training time and stability. This study is limited to a few hyperparameters and values, a single dataset and few data points; further research in the area could look at the generalisability of the results or investigate more hyperparameters.
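The statistical comparison described above can be reproduced in outline with SciPy's Mann-Whitney U-test; the sketch below uses made-up training-time numbers purely as placeholders, not results from the study.

```python
# Sketch of a Mann-Whitney U-test comparing the training times of two GAN
# configurations. SciPy assumed; the numbers are illustrative placeholders.
from scipy.stats import mannwhitneyu

training_times_config_a = [412.0, 398.5, 420.1, 405.3, 415.7]   # seconds per run
training_times_config_b = [388.2, 379.9, 395.4, 383.1, 390.6]

stat, p_value = mannwhitneyu(training_times_config_a,
                             training_times_config_b,
                             alternative="two-sided")
print(f"U = {stat}, p = {p_value:.4f}")
# A small p-value would indicate that changing the hyperparameter had a
# statistically significant effect on training time.
```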
APA, Harvard, Vancouver, ISO, and other styles
23

Wang, Kang. "Image Transfer Between Magnetic Resonance Images and Speech Diagrams." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/41533.

Full text
Abstract:
Real-time Magnetic Resonance Imaging (MRI) is a method used for human anatomical study. MRIs give exceptionally detailed information about soft-tissue structures, such as tongues, that other current imaging techniques cannot provide. However, the process requires special equipment and is expensive; hence, it is not suitable for all patients. Speech diagrams show the side-view positions of organs like the tongue, throat, and lips of a speaking or singing person. The process of making a speech diagram is like the semantic segmentation of an MRI, which focuses on the selected edge structure. Speech diagrams are easy to understand, giving a clear view of the tongue and the structure inside the mouth. However, producing them often requires manual annotation on the MRI machine by an expert in the field. By using machine learning methods, we achieved image transfer between MRIs and speech diagrams in both directions. We first matched videos of speech diagrams and tongue MRIs. Then we used various image processing methods and data augmentation methods to make the paired images easier to train on. We built our network model inspired by different cross-domain image transfer methods and applied reference-based super-resolution methods to generate high-resolution images. Thus, we can do the transfer through our network instead of manually. Also, the generated speech diagram can serve as an intermediary to be transferred to other medical images like computerized tomography (CT), since it is simpler in structure compared to an MRI. We conducted experiments using both the data from our database and other MRI video sources. We use multiple methods for the evaluation, and comparisons with several related methods show the superiority of our approach.
APA, Harvard, Vancouver, ISO, and other styles
24

Nord, Sofia. "Multivariate Time Series Data Generation using Generative Adversarial Networks : Generating Realistic Sensor Time Series Data of Vehicles with an Abnormal Behaviour using TimeGAN." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302644.

Full text
Abstract:
Large datasets are a crucial requirement for achieving high performance, accuracy, and generalisation in any machine learning task, such as prediction or anomaly detection. However, it is not uncommon for datasets to be small or imbalanced, since gathering data can be difficult, time-consuming, and expensive. In the task of collecting vehicle sensor time series data, in particular when the vehicle has an abnormal behaviour, these struggles are present and may hinder the automotive industry in its development. Synthetic data generation has become a growing interest among researchers in several fields as a way to handle the struggles with data gathering. Among the methods explored for generating data, generative adversarial networks (GANs) have become a popular approach due to their wide application domain and successful performance. This thesis focuses on generating multivariate time series data similar to vehicle sensor readings of the air pressures in the brake system of vehicles with an abnormal behaviour, meaning there is a leakage somewhere in the system. A novel GAN architecture called TimeGAN was trained to generate such data and was then evaluated using both qualitative and quantitative evaluation metrics. Two versions of this model were tested and compared. The results obtained show that both models learnt the distribution and the underlying information within the features of the real data. The goal of the thesis was achieved and can become a foundation for future work in this field.
APA, Harvard, Vancouver, ISO, and other styles
25

Waldow, Walter E. "An Adversarial Framework for Deep 3D Target Template Generation." Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1597334881614898.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Nalamothu, Abhishek. "Abusive and Hate Speech Tweets Detection with Text Generation." Wright State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=wright1567510940365305.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Zheng, Yilin. "Text-Based Speech Video Synthesis from a Single Face Image." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1572168353691788.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Käll, Viktor, and Erik Piscator. "Particle Filter Bridge Interpolation in GANs." Thesis, KTH, Matematisk statistik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301733.

Full text
Abstract:
Generative adversarial networks (GANs), a type of generative modeling framework, have received much attention in the past few years since they were discovered for their capacity to recover complex high-dimensional data distributions. These provide a compressed representation of the data in which all but the essential features of a sample are stripped away, subsequently inducing a similarity measure on the space of data. This similarity measure gives rise to the possibility of interpolating in the data, which has been done successfully in the past. Herein we propose a new stochastic interpolation method for GANs where the interpolation is forced to adhere to the data distribution by implementing a sequential Monte Carlo algorithm for data sampling. The results show that the new method outperforms previously known interpolation methods for the dataset LINES; compared to the results of other interpolation methods there was a significant improvement, measured through quantitative and qualitative evaluations. The developed interpolation method has met its expectations and shown promise; however, it needs to be tested on a more complex dataset in order to verify that it also scales well.
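A rough sketch of the sequential Monte Carlo idea behind the proposed interpolation is given below: particles are propagated along a latent path between two endpoints, weighted by a data-fidelity score, and resampled at each step. NumPy is assumed, and the `data_score` function is a hypothetical stand-in for, e.g., a discriminator or density estimate; none of this reproduces the thesis's actual algorithm.

```python
# Sketch of a particle-filter-style latent interpolation: propagate, weight, resample.
import numpy as np

rng = np.random.default_rng(0)
latent_dim, n_particles, n_steps = 16, 64, 10
z_start, z_end = rng.normal(size=latent_dim), rng.normal(size=latent_dim)

def data_score(z):
    # Placeholder: favour latents of roughly unit-sphere norm, mimicking the prior.
    return np.exp(-0.5 * (np.linalg.norm(z, axis=1) - np.sqrt(latent_dim)) ** 2)

particles = np.tile(z_start, (n_particles, 1))
for step in range(1, n_steps + 1):
    target = z_start + (z_end - z_start) * step / n_steps
    # Propagate: drift toward the next point on the path, plus exploration noise.
    particles = particles + (target - particles) * 0.5 + 0.1 * rng.normal(size=particles.shape)
    # Weight by the data-fidelity score and resample (multinomial resampling).
    weights = data_score(particles)
    weights /= weights.sum()
    particles = particles[rng.choice(n_particles, size=n_particles, p=weights)]

print(np.linalg.norm(particles.mean(axis=0) - z_end))   # mean particle distance to the endpoint
```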
APA, Harvard, Vancouver, ISO, and other styles
29

Wu, Xinheng. "A Deep Unsupervised Anomaly Detection Model for Automated Tumor Segmentation." Thesis, The University of Sydney, 2020. https://hdl.handle.net/2123/22502.

Full text
Abstract:
Much research has been conducted to provide computer-aided diagnosis (CAD) with automated tumor segmentation in various medical images, e.g., magnetic resonance (MR), computed tomography (CT) and positron-emission tomography (PET). The recent advances in automated tumor segmentation have been achieved by supervised deep learning (DL) methods trained on large labelled data to cover tumor variations. However, there is a scarcity of such training data due to the cost of the labeling process. Thus, with insufficient training data, supervised DL methods have difficulty in generating effective feature representations for tumor segmentation. This thesis aims to develop an unsupervised DL method to exploit the large amounts of unlabeled data generated during the clinical process. Our assumption, for unsupervised anomaly detection (UAD), is that normal data have constrained anatomy and variations, while anomalies, i.e., tumors, usually differ from normality with high diversity. We demonstrate our method for automated tumor segmentation on two different image modalities. Firstly, given the bilateral symmetry of normal human brains and the asymmetry of brain tumors, we propose a symmetry-driven deep UAD model that uses a GAN to model the normal symmetric variations, thus segmenting tumors by their asymmetry. We evaluated our method on two benchmark datasets. Our results show that our method outperformed the state-of-the-art unsupervised brain tumor segmentation methods and achieved competitive performance with the supervised segmentation methods. Secondly, we propose a multi-modal deep UAD model for PET-CT tumor segmentation. We model a manifold of normal variations shared across normal CT and PET pairs; this manifold represents the normal pairing and can be used to segment the anomalies. We evaluated our method on two PET-CT datasets, and the results show that we outperformed the state-of-the-art unsupervised methods, supervised methods and baseline fusion techniques.
APA, Harvard, Vancouver, ISO, and other styles
30

Fu, Yucheng. "Development of Advanced Image Processing Algorithms for Bubbly Flow Measurement." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/85390.

Full text
Abstract:
An accurate measurement of bubbly flow has significant value for understanding bubble behavior and heat and energy transfer patterns in different engineering systems. It also helps to advance theoretical model development in two-phase flow study. Due to the interaction between the gas and liquid phases, the flow patterns recorded in image data are complicated, and the segmentation and reconstruction of overlapping bubbles in these images is a challenging task. This dissertation provides a complete set of image processing algorithms for bubbly flow measurement. The developed algorithm can deal with bubble overlap and reconstruct bubble outlines in 2D high-speed images over a wide void fraction range. Key bubbly flow parameters such as void fraction, interfacial area concentration, bubble number density and velocity can be computed automatically after bubble segmentation. Time-averaged bubbly flow distributions are generated based on the extracted parameters for flow characteristic study. A 3D imaging system is developed for 3D bubble reconstruction. The proposed 3D reconstruction algorithm can restore the bubble shape in a time sequence for accurate flow visualization with minimal assumptions, and shows an error of less than 2% in volume measurement compared to the syringe reading. Finally, a new image synthesis framework called Bubble Generative Adversarial Networks (BubGAN) is proposed by combining conventional image processing algorithms and deep learning techniques. This framework aims to provide a generic benchmark tool for assessing the performance of existing image processing algorithms, with a significant quality improvement in synthetic bubbly flow image generation.
Ph. D.
Bubbly flow phenomena exist in a wide variety of systems, for example nuclear reactors, heat exchangers, chemical bubble columns and biological systems. Accurate measurement of the bubble distribution can help in understanding the behavior of these systems. Due to the complexity of bubbly flow images, it is not practical to manually process and label these data for analysis. This dissertation developed a complete suite of image processing algorithms to process bubbly flow images. The proposed algorithms are capable of segmenting dense 2D bubble images and reconstructing 3D bubble shapes in coordination with multiple camera systems. The bubbly flow patterns and characteristics are analyzed in this dissertation. Finally, a generic image processing benchmark tool called Bubble Generative Adversarial Networks (BubGAN) is proposed by combining conventional image processing and deep learning techniques. The BubGAN framework aims to bridge the gap between real bubbly flow images and the synthetic images used for algorithm benchmarking.
APA, Harvard, Vancouver, ISO, and other styles
31

Marriott, Richard. "Data-augmentation with synthetic identities for robust facial recognition." Thesis, Lyon, 2020. http://www.theses.fr/2020LYSEC048.

Full text
Abstract:
In 2014, the use of deep neural networks (DNNs) revolutionised facial recognition (FR). DNNs are capable of learning to extract feature-based representations from images that are discriminative and robust to extraneous detail. Arguably, one of the most important factors now limiting the performance of FR algorithms is the data used to train them. High-quality image datasets that are representative of real-world test conditions can be difficult to collect. One potential solution is to augment datasets with synthetic images. This option recently became increasingly viable following the development of generative adversarial networks (GANs), which allow generation of highly realistic, synthetic data samples. This thesis investigates the use of GANs for augmentation of FR datasets. It looks at the ability of GANs to generate new identities, and at their ability to disentangle identity from other forms of variation in images. Ultimately, a GAN integrating a 3D model is proposed in order to fully disentangle pose from identity. Images synthesised using the 3D GAN are shown to improve large-pose FR, and state-of-the-art accuracy is demonstrated on the challenging Cross-Pose LFW evaluation dataset. The final chapter of the thesis evaluates one of the more nefarious uses of synthetic images: the face-morphing attack. Such attacks exploit imprecision in FR systems by manipulating images such that they might be falsely verified as belonging to more than one person. An evaluation of GAN-based face-morphing attacks is provided. Also introduced is a novel, GAN-based morphing method that minimises the distance of the morphed image from the original identities in a biometric feature-space. A potential countermeasure to such morphing attacks is to train FR networks using additional, synthetic identities. In this vein, the effect of training with synthetic 3D GAN data on the success of simulated face-morphing attacks is evaluated.
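To make the augmentation idea concrete, the following is a hedged sketch of how GAN-synthesised identities could be appended to a face-recognition training set; the generator interface, the latent jitter used for intra-class variation and the label offset are assumptions for illustration, not the thesis code.

```python
# Hedged sketch: wrapping a trained GAN as a dataset of synthetic identities.
# The generator is assumed to be an nn.Module mapping (1, latent_dim) -> (1, C, H, W).
import torch
from torch.utils.data import Dataset

class SyntheticIdentityDataset(Dataset):
    def __init__(self, generator, n_identities, images_per_identity,
                 latent_dim=512, label_offset=0):
        self.g = generator.eval()
        self.n_id = n_identities
        self.per_id = images_per_identity
        self.latent_dim = latent_dim
        self.label_offset = label_offset
        # One fixed latent per synthetic person; per-image jitter adds intra-class variation.
        self.id_codes = torch.randn(n_identities, latent_dim)

    def __len__(self):
        return self.n_id * self.per_id

    @torch.no_grad()
    def __getitem__(self, idx):
        identity = idx // self.per_id
        z = self.id_codes[identity] + 0.1 * torch.randn(self.latent_dim)
        image = self.g(z.unsqueeze(0)).squeeze(0)
        return image, self.label_offset + identity
```

The synthetic set could then be concatenated with the real one, for example via `torch.utils.data.ConcatDataset`, with `label_offset` set to the number of real identities so that class labels do not collide.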
APA, Harvard, Vancouver, ISO, and other styles
32

Chowdhury, Muhammad Iqbal Hasan. "Question-answering on image/video content." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205096/1/Muhammad%20Iqbal%20Hasan_Chowdhury_Thesis.pdf.

Full text
Abstract:
This thesis explores a computer's ability to understand multimodal data, where the correspondence between image/video content and natural language text is utilised to answer open-ended natural language questions through question-answering tasks. Static image data consisting of both indoor and outdoor scenes, where complex textual questions are arbitrarily posed to a machine to generate correct answers, was examined. Dynamic videos from both single-camera and multi-camera settings were also considered, for the exploration of more challenging and unconstrained question-answering tasks. In exploring these challenges, new deep learning approaches were developed to improve a computer's ability to understand and reason about multimodal data.
APA, Harvard, Vancouver, ISO, and other styles
33

Lai, Matteo. "Conditional MR image synthesis with Auxiliary Progressive Growing GANs." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022.

Find full text
Abstract:
Training deep learning (DL) algorithms requires a large amount of data, which is often unavailable in the medical domain. This thesis proposes a model for generating labelled synthetic datasets for high-resolution medical imaging. After presenting the advantages and limitations of using DL techniques in radiology, Generative Adversarial Networks (GANs) are proposed as a possible solution to overcome those limitations. In reviewing the state of the art on GANs, attention is focused on Progressive Growing GANs, which can generate high-resolution images, and on Auxiliary Classifier GANs (ACGANs), which can generate class-conditional (target) images. Building on these models, the novel Progressive ACGAN (PACGAN) is proposed, designed to generate high-resolution class-conditional images. The aim of this thesis is to exploit the ability of GANs to build a latent-space representation of the training data, both to generate high-resolution (256 x 256) class-conditional images and to perform classification. The proposed model is tested on a dataset containing 200 brain magnetic resonance (MR) images of healthy subjects and patients with Alzheimer's disease. The results are very promising. The quality of the generated images was evaluated both visually and quantitatively, using the FID (Fréchet Inception Distance) and MS-SSIM (Multi-Scale Structural Similarity Index), showing that PACGAN represents high-resolution class-conditional images better than ACGAN. Classification performance is excellent on the training set, with fair generalization to new data. The proposed model therefore makes it possible to generate high-resolution class-conditional images that can be used to build synthetic datasets.
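Since the PACGAN code is not reproduced here, the sketch below only illustrates the auxiliary-classifier part of the objective that ACGAN-style models such as PACGAN build on: a discriminator with a real/fake head and a class head. The assumed discriminator interface, and the omission of the progressive-growing schedule, are simplifications for illustration.

```python
# Hedged sketch of an ACGAN-style objective: the discriminator returns a
# real/fake logit and class logits. Networks, optimisers and data are assumed.
import torch
import torch.nn.functional as F

def discriminator_loss(d, real_images, real_labels, fake_images, fake_labels):
    rf_real, cls_real = d(real_images)            # assumed: (real/fake logit, class logits)
    rf_fake, cls_fake = d(fake_images.detach())
    adv = F.binary_cross_entropy_with_logits(rf_real, torch.ones_like(rf_real)) + \
          F.binary_cross_entropy_with_logits(rf_fake, torch.zeros_like(rf_fake))
    aux = F.cross_entropy(cls_real, real_labels) + F.cross_entropy(cls_fake, fake_labels)
    return adv + aux

def generator_loss(d, fake_images, fake_labels):
    rf_fake, cls_fake = d(fake_images)
    adv = F.binary_cross_entropy_with_logits(rf_fake, torch.ones_like(rf_fake))
    aux = F.cross_entropy(cls_fake, fake_labels)
    return adv + aux
```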
APA, Harvard, Vancouver, ISO, and other styles
34

Schilling, Lennart. "Generating synthetic brain MR images using a hybrid combination of Noise-to-Image and Image-to-Image GANs." Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166034.

Full text
Abstract:
Generative Adversarial Networks (GANs) have attracted much attention because of their ability to learn high-dimensional, realistic data distributions. In the field of medical imaging, they can be used to augment the often small image sets available; in this way, for example, the training of image classification or segmentation models can be improved to support clinical decision making. GANs can be distinguished according to their input: while Noise-to-Image GANs synthesize new images from a random noise vector, Image-to-Image GANs translate a given image into another domain. In this study, it is investigated whether the performance of a Noise-to-Image GAN, defined by the quality and diversity of its generated output, can be improved by using elements of a previously trained Image-to-Image GAN within its training. The data used consist of paired T1- and T2-weighted MR brain images. With the objective of generating additional T1-weighted images, a hybrid model (Hybrid GAN) is implemented that combines elements of a Deep Convolutional GAN (DCGAN) as the Noise-to-Image GAN and a Pix2Pix as the Image-to-Image GAN. Starting from the dependency on an input image, the model is thereby gradually converted into a Noise-to-Image GAN. Performance is evaluated with an independent classifier that estimates the divergence between the generative output distribution and the real data distribution. When comparing the Hybrid GAN with the DCGAN baseline, no improvement could be observed in either the quality or the diversity of the generated images. Consequently, it could not be shown that the performance of a Noise-to-Image GAN is improved by using elements of a previously trained Image-to-Image GAN within its training.
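The classifier-based evaluation mentioned above can be illustrated with a small two-sample-test sketch: train an independent classifier to separate real from generated samples and read its held-out accuracy as a proxy for divergence (close to 50% means the distributions are hard to tell apart). The choice of logistic regression on flattened pixels is an assumption for illustration, not the classifier used in the study.

```python
# Hedged sketch of a classifier two-sample test for generated images.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def two_sample_score(real: np.ndarray, generated: np.ndarray) -> float:
    x = np.concatenate([real, generated]).reshape(len(real) + len(generated), -1)
    y = np.concatenate([np.ones(len(real)), np.zeros(len(generated))])
    x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.3, stratify=y, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(x_tr, y_tr)
    return clf.score(x_te, y_te)   # ~0.5 suggests the generator matches the real distribution well

# Usage with toy arrays standing in for image batches:
real = np.random.rand(200, 32, 32)
fake = np.random.rand(200, 32, 32)
print(two_sample_score(real, fake))
```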
APA, Harvard, Vancouver, ISO, and other styles
35

Wang, Zesen. "Generative Adversarial Networks in Text Generation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-264575.

Full text
Abstract:
The Generative Adversarial Network (GAN) was first proposed in 2014 and has been intensively studied and developed in recent years. It has achieved great success on problems that cannot be explicitly defined by a mathematical equation, such as generating realistic images. However, since the GAN was initially designed for continuous domains (image generation, for example), its performance in text generation is still developing, because sentences are naturally discrete (no interpolation exists between “hello" and “bye"). The thesis first introduces fundamental concepts in natural language processing, generative models, and reinforcement learning; for each part, state-of-the-art methods and commonly used metrics are presented. The thesis also proposes two models, for random sentence generation and for context-based summary generation, respectively. Both models employ the GAN technique and are trained on large-scale datasets. Due to resource limitations, the models were designed and trained as prototypes and therefore do not reach state-of-the-art performance; nevertheless, the results still show the promise of applying GANs to text generation. A novel model-based metric is also proposed to evaluate the quality of a summary with reference to both the source text and the summary. The source code of the thesis will be available soon in the GitHub repository: https://github.com/WangZesen/Text-Generation-GAN.
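Because sentences are discrete, text GANs are typically trained with a reinforcement-learning surrogate rather than by backpropagating through sampling. The sketch below shows one common formulation (REINFORCE with the discriminator score as reward); the `generator.sample` and `discriminator` interfaces and the mean-reward baseline are assumptions, not the thesis's exact models.

```python
# Hedged sketch of a REINFORCE generator update for a sequence GAN.
import torch

def reinforce_generator_step(generator, discriminator, optimizer, batch_size=32, max_len=20):
    # Assumed API: sample() returns token ids and per-token log-probs, shape (batch, max_len).
    tokens, log_probs = generator.sample(batch_size, max_len)
    with torch.no_grad():
        rewards = discriminator(tokens)      # assumed: realism score per sentence, shape (batch,)
        baseline = rewards.mean()            # simple variance-reduction baseline
    loss = -((rewards - baseline).unsqueeze(1) * log_probs).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```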
APA, Harvard, Vancouver, ISO, and other styles
36

Nilsson, Mårten. "Augmenting High-Dimensional Data with Deep Generative Models." Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233969.

Full text
Abstract:
Data augmentation is a technique that can be performed in various ways to improve the training of discriminative models. Recent developments in deep generative models offer new ways of augmenting existing data sets. In this thesis, a framework for augmenting annotated data sets with deep generative models is proposed, together with a method for quantitatively evaluating the quality of the generated data sets. Using this framework, two data sets for pupil localization were generated with different generative models, including both well-established models and a novel model proposed for this purpose. The novel model was shown, both qualitatively and quantitatively, to generate the best data sets. A set of smaller experiments on standard data sets also revealed cases where this generative model could improve the performance of an existing discriminative model. The results indicate that generative models can be used to augment or replace existing data sets when training discriminative models.
APA, Harvard, Vancouver, ISO, and other styles
37

Berman, Alan. "Generative adversarial networks for fine art generation." Master's thesis, University of Cape Town, 2020. http://hdl.handle.net/11427/32458.

Full text
Abstract:
Generative Adversarial Networks (GANs), a generative modelling technique most commonly used for image generation, have recently been applied to the task of fine art generation. Wasserstein GANs and GANHack techniques have not previously been applied in GANs that generate fine art, despite having improved GAN results in other applications. This thesis investigates whether Wasserstein GANs and GANHack extensions to DCGANs can improve the quality of DCGAN-based fine art generation. There is also no accepted method of evaluating or comparing GANs for fine art generation. The outputs of DCGANs, Wasserstein GANs and GANHack techniques, produced on a modest computational budget, were quantitatively and qualitatively compared to see which techniques showed improvement over DCGAN. A method for evaluating computer-generated fine art, HEART, is proposed to cover both the qualities of good human-created fine art and the shortcomings of computer-created fine art, and to include cognitive and emotional impact as well as visual appearance. Prominent quantitative GAN evaluation techniques were used to compare the sample images these GANs produced on the MNIST, CIFAR-10 and Imagenet-1K image data sets. These results were compared with the sample images the GANs produced on the above data sets, as well as on art data sets. A pilot study of HEART was performed with 20 users. Wasserstein GANs achieved higher visual quality outputs than the baseline DCGAN, as did the use of GANHacks, on all the fine art data sets, and both are thus recommended for use in future work on GAN-based fine art generation. The study also demonstrated that HEART can be used for the evaluation and comparison of art GANs, providing comprehensive, objective quality assessments which can be substantiated in terms of emotional and cognitive impact as well as visual appearance.
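For reference, the Wasserstein GAN variant evaluated here is usually trained with a gradient penalty on the critic. The sketch below shows that penalty term under the standard WGAN-GP formulation (Gulrajani et al., 2017); the critic architecture and the surrounding training loop are assumed.

```python
# Hedged sketch of the WGAN-GP gradient penalty for 4-D image batches.
import torch

def gradient_penalty(critic, real, fake, gp_weight=10.0):
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(mixed)
    grads = torch.autograd.grad(outputs=scores, inputs=mixed,
                                grad_outputs=torch.ones_like(scores),
                                create_graph=True, retain_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return gp_weight * ((grad_norm - 1) ** 2).mean()

# Critic loss sketch:
# critic(fake).mean() - critic(real).mean() + gradient_penalty(critic, real, fake)
```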
APA, Harvard, Vancouver, ISO, and other styles
38

Ackerman, Wesley. "Semantic-Driven Unsupervised Image-to-Image Translation for Distinct Image Domains." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8684.

Full text
Abstract:
We expand the scope of image-to-image translation to include more distinct image domains, where the image sets have analogous structures, but may not share object types between them. Semantic-Driven Unsupervised Image-to-Image Translation for Distinct Image Domains (SUNIT) is built to more successfully translate images in this setting, where content from one domain is not found in the other. Our method trains an image translation model by learning encodings for semantic segmentations of images. These segmentations are translated between image domains to learn meaningful mappings between the structures in the two domains. The translated segmentations are then used as the basis for image generation. Beginning image generation with encoded segmentation information helps maintain the original structure of the image. We qualitatively and quantitatively show that SUNIT improves image translation outcomes, especially for image translation tasks where the image domains are very distinct.
APA, Harvard, Vancouver, ISO, and other styles
39

Haiderbhai, Mustafa. "Generating Synthetic X-rays Using Generative Adversarial Networks." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/41092.

Full text
Abstract:
We propose a novel method for generating synthetic X-rays from atypical inputs. This method creates approximate X-rays for use in non-diagnostic visualization problems where only generic cameras and sensors are available. Traditional methods are restricted to 3-D inputs such as meshes or Computed Tomography (CT) scans. We create custom synthetic X-ray datasets using a custom generator capable of creating RGB images, point cloud images, and 2-D pose images. We create a dataset using natural hand poses and train general-purpose Conditional Generative Adversarial Networks (CGANs) as well as our own novel network, pix2xray. Our results show that plausible X-rays can be generated from point cloud and RGB images. We also demonstrate the superiority of our pix2xray approach, especially in the troublesome cases of occlusion due to overlapping or rotated anatomy. Overall, our work establishes a baseline showing that synthetic X-rays can be simulated from inputs such as RGB images and point clouds.
APA, Harvard, Vancouver, ISO, and other styles
40

Garcia, Torres Douglas. "Generation of Synthetic Data with Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254366.

Full text
Abstract:
The aim of synthetic data generation is to provide data that is not real for cases where the use of real data is somehow limited: for example, when larger volumes of data are needed, when the data is too sensitive to use, or simply when it is hard to get access to the real data. Traditional methods of synthetic data generation use techniques that do not intend to replicate important statistical properties of the original data; properties such as the distribution, the patterns or the correlation between variables are often omitted. Moreover, most of the existing tools and approaches require a great deal of user-defined rules and do not make use of advanced techniques like Machine Learning or Deep Learning. While Machine Learning is an innovative area of Artificial Intelligence and Computer Science that uses statistical techniques to give computers the ability to learn from data, Deep Learning is a closely related field based on learning data representations, which may prove useful for the task of synthetic data generation. This thesis focuses on one of the most interesting and promising innovations of the last years in the Machine Learning community: Generative Adversarial Networks. An approach for generating discrete, continuous or text synthetic data with Generative Adversarial Networks is proposed, tested, evaluated and compared with a baseline approach. The results demonstrate the feasibility of the approach and show the advantages and disadvantages of using this framework. Despite its high demand for computational resources, a Generative Adversarial Networks framework is capable of generating quality synthetic data that preserves the statistical properties of a given dataset.
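One practical issue when generating tabular data with a GAN, as described above, is that rows mix continuous and discrete columns. The sketch below shows one simple way a generator can emit such mixed output (tanh for scaled numeric columns, a softmax per categorical column); the column counts and layer sizes are illustrative assumptions, not the architecture used in the thesis.

```python
# Hedged sketch of a generator head for mixed tabular data.
import torch
import torch.nn as nn

class TabularGenerator(nn.Module):
    def __init__(self, latent_dim=64, n_numeric=4, categorical_sizes=(3, 5)):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                      nn.Linear(128, 128), nn.ReLU())
        self.numeric_head = nn.Linear(128, n_numeric)
        self.cat_heads = nn.ModuleList(nn.Linear(128, k) for k in categorical_sizes)

    def forward(self, z):
        h = self.backbone(z)
        numeric = torch.tanh(self.numeric_head(h))                          # numeric columns scaled to [-1, 1]
        cats = [torch.softmax(head(h), dim=-1) for head in self.cat_heads]  # one distribution per categorical column
        return torch.cat([numeric, *cats], dim=-1)

# Usage: TabularGenerator()(torch.randn(16, 64)) yields a batch of 16 synthetic rows.
```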
APA, Harvard, Vancouver, ISO, and other styles
41

Nilsson, Alexander, and Martin Thönners. "A Framework for Generative Product Design Powered by Deep Learning and Artificial Intelligence : Applied on Everyday Products." Thesis, Linköpings universitet, Maskinkonstruktion, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-149454.

Full text
Abstract:
In this master’s thesis we explore the idea of using artificial intelligence in the product design process and seek to develop a conceptual framework for how it can be incorporated to make user customized products more accessible and affordable for everyone. We show how generative deep learning models such as Variational Auto Encoders and Generative Adversarial Networks can be implemented to generate design variations of windows and clarify the general implementation process along with insights from recent research in the field. The proposed framework consists of three parts: (1) A morphological matrix connecting several identified possibilities of implementation to specific parts of the product design process. (2) A general step-by-step process on how to incorporate generative deep learning. (3) A description of common challenges, strategies and solutions related to the implementation process. Together with the framework we also provide a system for automatic gathering and cleaning of image data as well as a dataset containing 4564 images of windows in a front view perspective.
APA, Harvard, Vancouver, ISO, and other styles
42

Gruneau, Joar. "Investigation of deep learning approaches for overhead imagery analysis." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-232208.

Full text
Abstract:
Analysis of overhead imagery has great potential to produce real-time data cost-effectively. This can be an important foundation for decision-making in business and politics. Every day a massive amount of new satellite imagery is produced, and to take full advantage of these data volumes a computationally efficient pipeline is required for the analysis. This thesis proposes a pipeline which outperforms the Segment Before you Detect network [6] and different types of fast region-based convolutional neural networks [61] by a large margin in a fraction of the time. The model obtains a car-counting prediction error of 1.67% on the Potsdam dataset and increases the vehicle-wise F1 score on the VEDAI dataset from the 0.305 reported by [61] to 0.542. This thesis also shows that it is possible to outperform the Segment Before you Detect network in less than 1% of the time on car counting and vehicle detection while also using less than half of the resolution. This makes the proposed model a viable solution for large-scale satellite imagery analysis.
APA, Harvard, Vancouver, ISO, and other styles
43

Chen, Chieh-Yu, and 陳傑宇. "Basketball Defensive Strategies Generation by Generative Adversarial Network." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/kwfpap.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Institute of Multimedia Engineering
106
In this paper, we present a method to generate realistic defensive plays in a basketball game based on the ball and the offensive team’s movements. Our system allows players and coaches to simulate how the opposing team will react to a newly developed offensive strategy for evaluating its effectiveness. To achieve the aim, we train on the NBA dataset a conditional generative adversarial network that learns spatiotemporal interactions between players’ movements. The network consists of two components: a generator that takes a latent noise vector and the offensive team’s trajectories as input to generate defensive team’s trajectories; and a discriminator that evaluates the realistic degree of the generated results. Since a basketball game can be easily identified as fake if the ball handler, who is not defended, does not shoot the ball or cut into the restricted area, we add the wide open penalty to the objective function to assist model training. To evaluate the results, we compared the similarity of the real and the generated defensive plays, in terms of the players’ movement speed and acceleration, distance to defend ball handlers and non-ball handlers, and the frequency of wide open occurrences. In addition, we conducted a user study with 59 participants for subjective tests. Experimental results show the high fidelity of the generated defensive plays to real data and demonstrate the feasibility of our algorithm.
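The abstract describes the generator's conditioning and the wide open penalty but not their exact form, so the following is only a hedged sketch of how such a penalty and generator step might look; the tensor shapes, the distance threshold and the critic interface are assumptions for illustration.

```python
# Hedged sketch of a conditional trajectory generator step with a "wide open" penalty.
import torch

def wide_open_penalty(defense, offense, ball_handler_idx, radius=6.0):
    # defense/offense: (batch, time, players, 2); ball_handler_idx: (batch, time) long tensor.
    bh = torch.gather(offense, 2,
                      ball_handler_idx[..., None, None].expand(-1, -1, 1, 2))  # (batch, time, 1, 2)
    dist = (defense - bh).norm(dim=-1)          # distance of every defender to the ball handler
    closest = dist.min(dim=-1).values           # nearest defender per frame
    return torch.relu(closest - radius).mean()  # penalise frames where nobody is within `radius`

def generator_step(g, d, offense, ball_handler_idx, noise_dim=32, penalty_weight=1.0):
    z = torch.randn(offense.size(0), noise_dim, device=offense.device)
    defense = g(z, offense)                     # assumed conditional generator signature
    adversarial = -d(defense, offense).mean()   # assumed critic scoring realism of the play
    return adversarial + penalty_weight * wide_open_penalty(defense, offense, ball_handler_idx)
```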
APA, Harvard, Vancouver, ISO, and other styles
44

Kao, Yu-Che, and 高宇哲. "Chinese Story Generation Using Conditional Generative Adversarial Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/e3afxw.

Full text
Abstract:
Master's thesis
National Chung Cheng University
Institute of Computer Science and Information Engineering
107
Natural language processing has advanced considerably in many areas, but progress in automatic text generation has been slower. This study focuses on letting the user decide what content the machine should narrate: the user provides a short text and the system expands it into a longer one. There has been much prior work on language models based on attention mechanisms. Building on the attention mechanism, this study proposes SG-Net, which accepts Chinese word vectors and can learn from Chinese data sets, and SG-GAN, which can generate more realistic sequences. To evaluate the quality of machine-generated text in a reasonable way, this study also designs a set of experiments that manipulate the semantic content of the input sequence. The experimental results show that both SG-Net and SG-GAN can grasp basic semantics and grammar and write an understandable article; however, SG-Net may merely recite statements it has read, whereas SG-GAN understands semantics and grammar better than SG-Net.
APA, Harvard, Vancouver, ISO, and other styles
45

Sung, Yi-Lin, and 宋易霖. "Difference-Seeking Generative Adversarial Network--Unseen Data Generation." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/r4q4t4.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Communication Engineering
107
Unseen data, which are not samples from the distribution of the training data and are difficult to collect, have exhibited their importance in many applications (e.g., novelty detection, semi-supervised learning, adversarial training and so on). In this paper, we introduce a general framework, called the Difference-Seeking Generative Adversarial Network (DSGAN), to create various kinds of unseen data. The novelty is to consider the probability density of the unseen data distribution to be the difference between the densities of two distributions, p_bar_d and p_d, whose samples are relatively easy to collect. DSGAN can learn the target distribution p_t (the unseen data distribution) using only samples from the two distributions p_d and p_bar_d. In our scenario, p_d is the distribution of seen data and p_bar_d can be obtained from p_d via simple operations, implying that we only need samples of p_d during training. Three key applications, namely semi-supervised learning, increasing the robustness of neural networks and novelty detection, are taken as case studies to illustrate that DSGAN can produce various kinds of unseen data. We also provide theoretical analyses of the convergence of DSGAN.
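The paper's exact formulation is not reproduced here, but the stated idea can be sketched as follows: the discriminator's "real" side is drawn from the easy-to-construct distribution p_bar_d, while its "fake" side mixes generated samples with seen data, so that the generator is pushed towards the difference between the two distributions. The additive-noise construction of p_bar_d and the mixing ratio are assumptions for illustration only, not necessarily the thesis's formulation.

```python
# Hedged sketch of a difference-seeking discriminator update.
import torch
import torch.nn.functional as F

def dsgan_discriminator_batch(seen_batch, generated_batch, mix_ratio=0.5, noise_std=0.5):
    p_bar_d = seen_batch + noise_std * torch.randn_like(seen_batch)   # simple surrogate for p_bar_d
    n_seen = int(mix_ratio * generated_batch.size(0))
    fake_side = torch.cat([generated_batch[n_seen:], seen_batch[:n_seen]], dim=0)
    return p_bar_d, fake_side

def dsgan_discriminator_loss(d, seen_batch, generated_batch):
    real_side, fake_side = dsgan_discriminator_batch(seen_batch, generated_batch.detach())
    real_logits, fake_logits = d(real_side), d(fake_side)
    return F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits)) + \
           F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
```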
APA, Harvard, Vancouver, ISO, and other styles
46

Chen, Ming-Han, and 陳明翰. "Colorization Based on Generative Adversarial Network." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/25w9fb.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Electrical Engineering
106
Between the invention of the camera and the spread of color photography, a great number of black-and-white photographs were taken. If we could colorize those black-and-white photos and turn them into color ones, they would certainly mark a brilliant page in the history of mankind. In this thesis, we present an automatic colorization system implemented with the TensorFlow framework and a generative adversarial network architecture. The output resolution can reach as high as 512 by 512. Experiments were conducted on several datasets, including shoes (Zappos 50K), human faces (CelebA), cartoons (The Simpsons), natural landscapes and modern urban cityscapes. Testing on these diverse datasets demonstrates the versatility of the presented technique. In addition, we tested sequential input to verify the stability of our system. The results show that the colorization system generalizes to different scenes, so it can be extended to other applications with different datasets, and that the colors produced for sequential input do not shift abruptly, demonstrating the system's stability.
APA, Harvard, Vancouver, ISO, and other styles
47

WANG, CHEN-HAN, and 王振翰. "Generation of Music Game Beatmap via Generative Adversarial Network." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/rh6vs9.

Full text
Abstract:
Master's thesis
National Chung Cheng University
Institute of Computer Science and Information Engineering
106
Music games are very popular, but designing beatmaps usually takes a great deal of time, and existing methods for generating beatmaps have limitations. In this thesis, a beatmap generation method based on Generative Adversarial Networks (GANs) is proposed. The audio is first separated into vocal and instrument parts so that the method stays close to the design philosophy of beatmap designers. Our model combines Conditional Generative Adversarial Nets (CGANs) and the Improved Wasserstein GAN (WGAN-GP) in order to take audio information into account and to speed up the convergence of model training. Our results are compared with those of different methods, and we also conduct a subjective evaluation of our results against real beatmaps. The generated beatmaps are very competitive with, and close to, the real ones.
APA, Harvard, Vancouver, ISO, and other styles
48

YANG, MING-HAO, and 楊明豪. "Speech Synthesis based on Generative Adversarial Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/b2ztz4.

Full text
Abstract:
Master's thesis
National Yunlin University of Science and Technology
Department of Computer Science and Information Engineering
107
In recent years, building on mature hardware technology and big data, deep neural networks (DNNs) have made breakthroughs, with many successful cases in various fields. One of the most groundbreaking deep network architectures is the generative adversarial network, which provides an innovative way to train a generative model; specifically, it splits the model into two sub-models, a generator and a discriminator. The generator produces samples, and the discriminator attempts to classify the samples as real or fake. This thesis departs from traditional speech synthesis technology and explores speech synthesis based on generative adversarial networks, which can learn the feature distribution of the training data and thereby generate more natural speech. The thesis covers both Chinese and English speech synthesis. For the English models, the CSTR VCTK corpus is used to train three different male and female speaker models; for Chinese, the COSPRO & Toolkit corpus is used, likewise training three different male and female speaker models. The results show that the average Mean Opinion Score (MOS) for English reached 3.18 points out of 5 (3.52 for male voices, 2.83 for female voices), while the average MOS for Chinese reached 1.91 points (2.21 for male voices, 1.6 for female voices). In the speaker identification experiments, the average pass rate of text-dependent synthesized speech was 80.5% for the DNN verifier (72% for Chinese, 89% for English) and 86% for the Support Vector Machine (SVM) verifier (100% for Chinese, 72% for English). For text-independent synthesized speech, the pass rate varied with the speech duration: at 0.5 seconds, the DNN average pass rate was 36% (44% for Chinese, 28% for English) and the SVM average was 44.5%; at 3 seconds, the DNN average was 75% (78% for Chinese, 72% for English) and the SVM average was 80.5% (72% for Chinese, 89% for English); at 5 seconds, the DNN average was 89% (78% for Chinese, 100% for English) and the SVM average was 97% (94% for Chinese, 100% for English). In the MOS evaluation, English benefits from more complete front-end language rules that produce complete text features, allowing the model to generate more natural speech; English synthesized speech therefore scores better than Chinese. In the speaker identification experiments, the English pass rate is in this case lower than the Chinese one because the English utterances are much shorter than the Chinese ones; for the text-independent case, the longer the speech duration, the higher the pass rate. To improve the security of speaker recognition systems, one can therefore shorten the required phrase or improve the model. Since the discriminator of our system learns to judge the authenticity of speech during training, it can be integrated into a speaker recognition system to effectively block synthetic speech attacks.
APA, Harvard, Vancouver, ISO, and other styles
49

WU, JIA-EN, and 吳嘉恩. "Generation of Automated Optical Inspection Images by Generative Adversarial Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/fxg9jr.

Full text
Abstract:
Master's thesis
Asia University
Department of Computer Science and Information Engineering
107
Automated Optical Inspection (AOI) applies machine vision technology to detect whether objects manufactured on a production line have serious defects, improving product yield. In recent years, there have been many successful applications of deep learning models to AOI, enhancing the accuracy of instruments that originally relied on non-deep-learning algorithms. Traditional deep learning relies on a large amount of training data to achieve the best results. However, a wide variety of objects are manufactured on production lines, and for each product it is not always easy to collect enough data to train a deep learning model. Therefore, if defect image data can be expanded quickly, training of AOI deep learning models can be accelerated and their accuracy improved. In recent years, the use of Generative Adversarial Networks (GANs) to synthesize images has attracted much attention; a GAN consists of a generative network and a discriminative network. This thesis studies the use of GANs to generate various kinds of AOI data. In our experiments, the accuracy of the AOI deep learning model was improved by 2.1% by adding defect images produced by the proposed GAN. The experimental results show that GANs can be used to generate defect images to improve the accuracy of automated optical inspection.
APA, Harvard, Vancouver, ISO, and other styles
50

Lin, Sheng-Xiang, and 林聖翔. "Automatic Web Security Testing with Generative Adversarial Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/tp3858.

Full text
Abstract:
Master's thesis
National Ilan University
Master's Program, Department of Computer Science and Information Engineering
107
Assessing software security involves many different kinds of practices. In black-box testing, fuzz testing is often used for vulnerability discovery. However, there is no way to guarantee that all of a target system's vulnerabilities have been found unless every unacceptable input has been tested, which is not feasible; improving the efficiency of testing is therefore important. In the case of web security, engineers usually prepare a large list of attack vectors when testing. Some well-known free vulnerability scanning tools use ready-made lists of attack vectors, while others generate attack vectors from known attack formats. Although this approach saves a great deal of time and labor, it only tests problems that have already been identified, and the success rate is sometimes low. To increase the efficiency of security testing, we hope to uncover more vulnerabilities by increasing the variability of the attack vectors. We therefore propose an automatic security testing system that incorporates a generative adversarial network (GAN). By using the GAN to learn the features of the data, attack vectors can be learned and generated, giving security engineers an additional option when testing a website.
APA, Harvard, Vancouver, ISO, and other styles