Dissertations / Theses on the topic 'Modified convolutional neural network'

Consult the top 50 dissertations / theses for your research on the topic 'Modified convolutional neural network.'


1

Ayoub, Issa. "Multimodal Affective Computing Using Temporal Convolutional Neural Network and Deep Convolutional Neural Networks." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39337.

Full text
Abstract:
Affective computing has gained significant attention from researchers in the last decade due to the wide variety of applications that can benefit from this technology. Often, researchers describe affect using emotional dimensions such as arousal and valence. Valence refers to the spectrum of negative to positive emotions, while arousal determines the level of excitement. Describing emotions through continuous dimensions (e.g. valence and arousal) allows us to encode subtle and complex affects, as opposed to discrete emotions such as the six basic emotions: happiness, anger, fear, disgust, sadness and neutral. Recognizing spontaneous and subtle emotions remains a challenging problem for computers. In our work, we employ two modalities of information: video and audio. Hence, we extract visual and audio features using deep neural network models. Given that emotions are time-dependent, we apply the Temporal Convolutional Neural Network (TCN) to model the variations in emotions. Additionally, we investigate an alternative model that combines a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). Given our inability to fit the latter deep model into main memory, we divide the RNN into smaller segments and propose a scheme to back-propagate gradients across all segments. We configure the hyperparameters of all models using Gaussian processes to obtain a fair comparison between the proposed models. Our results show that the TCN outperforms the RNN for the recognition of the arousal and valence emotional dimensions; we therefore propose the adoption of the TCN for emotion detection problems as a baseline method for future work. Our experimental results show that the TCN outperforms all RNN-based models, yielding a concordance correlation coefficient of 0.7895 (vs. 0.7544) on valence and 0.8207 (vs. 0.7357) on arousal on the validation set of the SEWA dataset for emotion prediction.
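The concordance correlation coefficient (CCC) quoted above can be computed from paired predictions and labels; a minimal pure-Python sketch (an illustration, not the author's code):

```python
from statistics import mean

def concordance_cc(y_true, y_pred):
    """Lin's concordance correlation coefficient between two sequences.

    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    using population (biased) variance and covariance.
    """
    mx, my = mean(y_true), mean(y_pred)
    n = len(y_true)
    vx = sum((a - mx) ** 2 for a in y_true) / n
    vy = sum((b - my) ** 2 for b in y_pred) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(y_true, y_pred)) / n
    return 2 * cov / (vx + vy + (mx - my) ** 2)

# Perfect agreement gives CCC = 1; a constant shift lowers it.
print(concordance_cc([0.1, 0.5, 0.9], [0.1, 0.5, 0.9]))  # 1.0
```

Unlike plain Pearson correlation, the CCC also penalizes bias and scale mismatch between predictions and labels, which is why it is the standard metric for continuous valence/arousal prediction.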
APA, Harvard, Vancouver, ISO, and other styles
2

Long, Cameron E. "Quaternion Temporal Convolutional Neural Networks." University of Dayton / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1565303216180597.

Full text
3

Bylund, Andreas, Anton Erikssen, and Drazen Mazalica. "Hyperparameters impact in a convolutional neural network." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18670.

Full text
Abstract:
Machine learning and image recognition are large and growing subjects in today's society. The aim of this thesis is therefore to compare convolutional neural networks with different hyperparameter settings and see how the hyperparameters affect the networks' test accuracy in identifying images of traffic signs. Traffic signs were chosen as the objects on which to evaluate hyperparameters because of the authors' previous experience in the domain; the object used for image recognition does not itself matter, and any image dataset could be used to study the effect of the hyperparameters. Grid search is used to create a large number of models with different width and depth, learning rate and momentum. Convolution layers, activation functions and batch size are all tested separately. These experiments make it possible to evaluate how the hyperparameters affect the networks' performance in recognizing images of traffic signs. The models are created using the Keras API and then trained and tested on the dataset Traffic Signs Preprocessed. The results show that hyperparameters affect test accuracy, some more than others. Configuring learning rate and momentum can in some cases lead to disastrous results if they are set too high or too low. The activation function also proves to be a crucial hyperparameter that in some cases produces terrible results.
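The grid search described above amounts to an exhaustive sweep over the hyperparameter grid. Below is a runnable sketch: the grid values and the `train_and_score` function are hypothetical stand-ins for training and testing a real Keras model, so the sweep itself runs without TensorFlow.

```python
from itertools import product

# Hypothetical search space in the spirit of the thesis: learning rate and
# momentum swept jointly.
grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "momentum": [0.0, 0.5, 0.9],
}

def train_and_score(learning_rate, momentum):
    # Stand-in for building, training and testing a Keras CNN; returns a
    # dummy score so the sweep is runnable in isolation.
    return 1.0 - abs(learning_rate - 0.01) - abs(momentum - 0.9) * 0.1

def grid_search(grid, score_fn):
    """Evaluate every combination in the grid; return (best_score, params)."""
    keys = list(grid)
    best = None
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = score_fn(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best

score, params = grid_search(grid, train_and_score)
print(params)  # {'learning_rate': 0.01, 'momentum': 0.9}
```

In a real run, `train_and_score` would fit a Keras model per combination, which is why grid search cost grows multiplicatively with each added hyperparameter.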
4

Reiling, Anthony J. "Convolutional Neural Network Optimization Using Genetic Algorithms." University of Dayton / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1512662981172387.

Full text
5

DiMascio, Michelle Augustine. "Convolutional Neural Network Optimization for Homography Estimation." University of Dayton / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1544214038882564.

Full text
6

Embretsén, Niklas. "Representing Voices Using Convolutional Neural Network Embeddings." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-261415.

Full text
Abstract:
In today’s society, services centered around voices are gaining popularity. Being able to provide users with voices they like, to obtain and sustain their attention, is important for enhancing the overall experience of the service. Finding an efficient way of representing voices so that similarity comparisons can be performed is therefore of great use. In the field of Natural Language Processing, great progress has been made using embeddings from Deep Learning models to represent words in an unsupervised fashion; these representations managed to capture the semantics of the words. This thesis sets out to explore whether such embeddings can be found for audio data as well, more specifically voices of narrators of audiobooks, capturing similarities between different voices. For this, two different Convolutional Neural Networks are developed and evaluated, trained on spectrogram representations of the voices. One performs regular classification, while the other uses pairwise relationships and a Kullback–Leibler divergence based loss function, in an attempt to minimize and maximize the difference of the output between similar and dissimilar pairs of samples. From these models, the embeddings used to represent each sample are extracted from the different layers of the fully connected part of the network during the evaluation. Both an objective and a subjective evaluation are performed. During the objective evaluation of the models, it is first investigated whether the found embeddings are distinct for the different narrators, as well as whether the embeddings encode information about gender. The regular classification model is then further evaluated through a user test, as it achieved an order of magnitude better results during the objective evaluation. The user test sets out to evaluate whether the found embeddings capture information based on perceived similarity.
It is concluded that the proposed approach has the potential to be used for representing voices in a way such that similarity is encoded, although more extensive testing, research and evaluation have to be performed to know for sure. For future work it is proposed to perform more sophisticated pre-processing of the data and also to collect and include data about relationships between voices during the training of the models.
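The abstract does not spell out the exact form of the pairwise Kullback–Leibler objective; the following is a hypothetical sketch of a symmetric KL term over softmax outputs, pushed down for similar pairs and up for dissimilar ones.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    exps = [math.exp(x - max(logits)) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def pairwise_kl_loss(logits_a, logits_b, similar):
    """Symmetric KL between the two network outputs; negated for dissimilar
    pairs so minimizing the loss pushes them apart. This exact form is an
    illustration, not the thesis's loss."""
    p, q = softmax(logits_a), softmax(logits_b)
    d = kl_divergence(p, q) + kl_divergence(q, p)
    return d if similar else -d

# Identical outputs give zero loss for a similar pair.
print(pairwise_kl_loss([1.0, 2.0], [1.0, 2.0], similar=True))  # 0.0
```

Symmetrizing the divergence avoids the asymmetry of plain KL, so neither sample in a pair plays a privileged role.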
7

Tawfique, Ziring. "Tool-Mediated Texture Recognition Using Convolutional Neural Network." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-303774.

Full text
Abstract:
Vibration patterns can be captured by an accelerometer sensor attached to a hand-held device when it is scratched across various types of surface textures. These acceleration signals carry information relevant to surface texture classification. Typically, methods rely on hand-crafted feature engineering, but with the use of a Convolutional Neural Network manual feature engineering can be eliminated. A method using modern machine learning techniques such as Dropout is introduced, training a Convolutional Neural Network to distinguish between 69 and 100 different surface textures. EHapNet, the proposed Convolutional Neural Network model, managed to achieve state-of-the-art results with the datasets used.
8

Winicki, Elliott. "ELECTRICITY PRICE FORECASTING USING A CONVOLUTIONAL NEURAL NETWORK." DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2126.

Full text
Abstract:
Many methods have been used to forecast real-time electricity prices in various regions around the world. The problem is difficult because of market volatility affected by a wide range of exogenous variables, from weather to natural gas prices, and accurate price forecasting could help both suppliers and consumers plan effective business strategies. Statistical analysis with autoregressive moving average methods and computational intelligence approaches using artificial neural networks dominate the landscape. With convolutional neural networks rising in popularity for problems with large numbers of inputs, yet conspicuously absent from the current literature in this field, they are applied here to this time series forecasting problem and show some promising results. This document fulfills both the MSEE Master's Thesis and the BSCPE Senior Project requirements.
9

Cui, Chen. "Convolutional Polynomial Neural Network for Improved Face Recognition." University of Dayton / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1497628776210369.

Full text
10

Li, Chao. "WELD PENETRATION IDENTIFICATION BASED ON CONVOLUTIONAL NEURAL NETWORK." UKnowledge, 2019. https://uknowledge.uky.edu/ece_etds/133.

Full text
Abstract:
Weld joint penetration determination is the key factor in welding process control. Not only does it directly affect the weld joint's mechanical properties, such as fatigue; determining it also requires considerable human intelligence, involving either complex modeling or rich welding experience. Weld penetration status identification has therefore become the obstacle for intelligent welding systems. In this dissertation, an innovative method is proposed to detect the weld joint penetration status using machine learning algorithms. A GTAW welding system is first built. A dot-structured laser pattern is projected onto the weld pool surface during the welding process, and the reflected laser pattern, which contains the information about the penetration status, is captured. An experienced welder is able to determine the weld penetration status just from the reflected laser pattern. However, it is difficult to characterize the images in order to extract the key information used to determine penetration status. To overcome the challenges of finding the right features and accurately processing images to extract them using conventional machine vision algorithms, we propose using a convolutional neural network (CNN) to automatically extract key features and determine penetration status. Data-label pairs are needed to train a CNN, so an image acquisition system is designed to collect the reflected laser pattern and an image of the work-piece backside. Data augmentation is performed to enlarge the training data, resulting in 270,000 training samples, 45,000 validation samples and 45,000 test samples. A six-layer CNN has been designed and trained using a revised mini-batch gradient descent optimizer. The final test accuracy is 90.7%, and a voting mechanism based on three consecutive images further improves the prediction accuracy.
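The exact voting rule over consecutive images is not given in the abstract; a simple majority vote over three per-frame predictions might look like this (the labels are illustrative):

```python
from collections import Counter

def vote(predictions):
    """Majority vote over per-frame penetration labels.

    If no label has a strict majority, fall back to the most recent
    frame's prediction (a hypothetical tie-breaking rule).
    """
    counts = Counter(predictions)
    label, n = counts.most_common(1)[0]
    if n > len(predictions) // 2:
        return label
    return predictions[-1]

# Three consecutive per-frame labels; a single mislabel is outvoted.
print(vote(["full", "partial", "full"]))  # full
```

Voting over consecutive frames exploits the fact that penetration status changes slowly relative to the camera frame rate, so isolated misclassifications can be filtered out.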
11

Wang, Zhenyu. "A Digits-Recognition Convolutional Neural Network on FPGA." Thesis, Linköpings universitet, Datorteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-161663.

Full text
Abstract:
A convolutional neural network (CNN) is a deep learning framework that is widely used in computer vision. A CNN extracts important features of input images by performing convolution and reduces the number of parameters in the network by applying pooling operations. CNNs are usually implemented in programming languages and run on central processing units (CPUs) and graphics processing units (GPUs). However, in recent years research has been conducted on implementing CNNs on field-programmable gate arrays (FPGAs). The objective of this thesis is to implement a CNN on an FPGA with few hardware resources and low power consumption. The CNN we implement is for digit recognition: its input is an image of a single digit, and the CNN infers which number is in the image. The performance and power consumption of the FPGA are compared with those of a CPU and a GPU. The results show that our FPGA implementation outperforms the CPU and the GPU with respect to runtime, power consumption, and power efficiency.
12

Jönsson, Jonatan, and Felix Stenbäck. "Fence surveillance with convolutional neural networks." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-37116.

Full text
Abstract:
Broken fences are a big security risk for any facility or area with strict security standards. In this report we suggest a machine learning approach to automating the surveillance of chain-link fences. The main challenge is to classify broken and non-broken fences with the help of a convolutional neural network. The data for this task was gathered by hand; the dataset comprises about 127 videos, 26 minutes in total, recorded at 23 different locations. The model and dataset are tested on three performance traits: scaling, augmentation improvement and false rate. In these tests we found that nearest-neighbour scaling increased accuracy. Classifying fences that were included in the training data gave a low false rate, about 1%; classifying fences unknown to the model produced a false rate of about 90%. From these results we conclude that this method and dataset are useful under the right circumstances, but not in an unknown environment.
13

Keisala, Simon. "Designing an Artificial Neural Network for state evaluation in Arimaa : Using a Convolutional Neural Network." Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-143188.

Full text
Abstract:
Creating agents able to play board games such as Tic-tac-toe, Chess, Go and Arimaa has been, and still is, a major challenge in artificial intelligence. For these board games there is a certain number of legal moves a player can make in a given board state. Tic-tac-toe has on average around 4-5 legal moves, with a total of 255,168 possible games. Chess, Go and Arimaa all have far more legal moves per position and an almost infinite number of possible games, making complete knowledge of the outcome impossible. This thesis work has created various neural networks with the purpose of evaluating the likelihood of winning a game from a given board state. An improved evaluation function would compensate for the inability to do a deeper tree search in Arimaa, and the hope is to compete on equal terms against another well-performing agent (meijin) while using one less level of search depth. The results show great potential. After a mere one hundred games against meijin, the network manages to separate good positions from bad, and after another one hundred games it is able to beat meijin at equal search depth. It seems promising that, by improving the training and testing different sizes for the neural network, a network could win even with one less level of search depth. The huge branching factor of Arimaa makes such an improvement of the evaluation beneficial, even if the evaluation were 10,000 times slower.
14

Buratti, Luca. "Visualisation of Convolutional Neural Networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Find full text
Abstract:
Neural networks, and convolutional neural networks in particular, have recently demonstrated extraordinary results in a variety of fields. Unfortunately, there is still no clear understanding of why these architectures work so well, and above all it is difficult to explain their behaviour when they fail. This lack of clarity is what separates these models from being applied in concrete, critical real-life scenarios such as healthcare or self-driving cars. For this reason, several studies have been carried out in recent years to create methods capable of explaining what is happening inside a neural network, or where the network is looking in order to make a certain prediction. These techniques are the focus of this thesis and the bridge between the two case studies presented below. The aim of this work is therefore twofold: first, to use these methods to analyse, and thus understand how to improve, applications based on convolutional neural networks; and second, to investigate the generalisation capability of these architectures, again by means of these methods.
15

Amaducci, Fabiola. "Reduced order modelling of combustion using convolutional neural network." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/21409/.

Full text
Abstract:
It is well known that CFD simulations of a complex combustion system, such as Moderate or Intense Low-oxygen Dilution (MILD) combustion, require considerable computational resources. This precludes various applications, including the use of CFD in real-time control systems. The idea of a reduced order model (ROM) was born from the desire to overcome this obstacle. A ROM, if properly instructed, returns the output of a requested CFD simulation in extremely short time. It is an ideal mechanism with two basic gears: the input-size reduction technique and the interpolation method. This project proposes a study on the applicability of a convolutional neural network (CNN) as a dimensionality reduction technique. The code written for this purpose is presented in detail, as well as the pre- and post-processing. A sensitivity analysis is carried out to find out which parameters to adjust, and how, in order to achieve the optimum. Finally, the network is compared, in its peculiarities and its results, with Principal Component Analysis (PCA), the technique used for the same purpose by the BURN group of the Université Libre de Bruxelles. Moreover, with the desire to improve, we went further by trying to overcome the limits dictated by the rules of a legitimate comparison between PCA and CNN. Lastly, the author considers it necessary to provide the theoretical basis in order to enrich and support what has just been described; therefore, you will also find introductions and insights on MILD combustion, CFD of a combustion system, neural networks and related aspects.
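The PCA baseline mentioned above can be sketched via the singular value decomposition of the centered data matrix; this is a generic illustration of projecting onto the top-k principal components and reconstructing, not the BURN group's implementation.

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X onto the top-k principal components and reconstruct.

    Returns (scores, reconstruction). The right singular vectors of the
    centered data are the principal axes.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T          # low-dimensional representation
    recon = scores @ Vt[:k] + mu    # back-projection to the original space
    return scores, recon

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
scores, recon = pca_reduce(X, 5)
# Keeping all components reconstructs the data exactly (up to float error).
print(np.allclose(recon, X))  # True
```

A CNN autoencoder plays the same role as the `scores`/`recon` pair here, but with a nonlinear encoder and decoder in place of the linear projections.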
16

Chen, Tairui. "Going Deeper with Convolutional Neural Network for Intelligent Transportation." Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-theses/144.

Full text
Abstract:
Over the last several decades, computer vision researchers have been devoted to finding good features to solve different tasks: object recognition, object detection, object segmentation, activity recognition and so forth. Ideal features transform raw pixel intensity values into a representation in which these computer vision problems are easier to solve. Recently, deep features from convolutional neural networks (CNNs) have attracted many researchers working on problems in computer vision. In the supervised setting, these hierarchies are trained to solve specific problems by minimizing an objective function for different tasks. More recently, features learned from large-scale image datasets have proved to be very effective and generic for many computer vision tasks; features learned from a recognition task can be used in an object detection task. This work aims to uncover the principles that lead to these generic feature representations in transfer learning, which does not need to train on the dataset again but transfers the rich features a CNN learned from the ImageNet dataset. We begin by summarizing some related prior work, particularly papers on object recognition, object detection and segmentation. We then introduce deep features to computer vision tasks in intelligent transportation systems. First, we apply deep features to object detection, especially vehicle detection. Second, to make full use of objectness proposals, we apply a proposal generator to road marking detection and recognition. Third, to fully understand the transportation situation, we introduce deep features into road scene understanding. We run experiments for each task on different public datasets and show that our framework is robust.
17

Franke, Cameron. "Autonomous Driving with a Simulation Trained Convolutional Neural Network." Scholarly Commons, 2017. https://scholarlycommons.pacific.edu/uop_etds/2971.

Full text
Abstract:
Autonomous vehicles will help society if they can easily support a broad range of driving environments, conditions, and vehicles. Achieving this requires reducing the complexity of the algorithmic system, easing the collection of training data, and verifying operation using real-world experiments. Our work addresses these issues by utilizing a reflexive neural network that translates images into steering and throttle commands. This network is trained using simulation data from Grand Theft Auto V, which we augment to reduce the number of simulation hours driven. We then validate our work using an RC car system through numerous tests. Our system successfully drives 98 of 100 laps of a track with multiple road types and difficult turns; it also successfully avoids collisions with another vehicle in 90% of the trials.
18

Samuelsson, Elin. "A Confidence Measure for Deep Convolutional Neural Network Regressors." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273967.

Full text
Abstract:
Deep convolutional neural networks can be trained to estimate gaze directions from eye images. However, such networks do not provide any information about the reliability of their predictions. As uncertainty estimates could enable more accurate and reliable gaze tracking applications, a method for confidence calculation was examined in this project. This method had to be computationally efficient for the gaze tracker to function in real time, without reducing the quality of the gaze predictions. Thus, several state-of-the-art methods were abandoned in favor of Mean-Variance Estimation, which uses an additional neural network for estimating uncertainties. This confidence network is trained based on the accuracy of the gaze rays generated by the primary network, i.e. the prediction network, for different eye images. Two datasets were used for evaluating the confidence network, including the effect of different design choices. A main conclusion was that the uncertainty associated with a predicted gaze direction depends on more factors than just the visual appearance of the eye image; a confidence network taking only this image as input can therefore never model the regression problem perfectly. Despite this, the results show that the network learns useful information. In fact, its confidence estimates outperform those from an established Monte Carlo method, where the uncertainty is estimated from the spread of gaze directions from several prediction networks in an ensemble.
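The Monte Carlo ensemble baseline mentioned above estimates uncertainty from the spread of predictions across several networks; a minimal sketch, with hypothetical yaw/pitch gaze values standing in for real network outputs:

```python
from statistics import mean, pstdev

def ensemble_uncertainty(predictions):
    """Mean prediction and spread across an ensemble, per output dimension.

    A larger spread indicates lower confidence, as in the ensemble
    baseline: disagreement between networks signals uncertainty.
    """
    dims = list(zip(*predictions))  # transpose: one tuple per dimension
    mu = [mean(d) for d in dims]
    sigma = [pstdev(d) for d in dims]
    return mu, sigma

# Hypothetical (yaw, pitch) gaze predictions from three prediction networks.
preds = [(0.10, -0.20), (0.12, -0.18), (0.11, -0.19)]
mu, sigma = ensemble_uncertainty(preds)
print([round(m, 3) for m in mu])  # [0.11, -0.19]
```

The appeal of Mean-Variance Estimation over this baseline is cost: one extra forward pass through a confidence network instead of one pass per ensemble member.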
19

Matuh, Delic Senad. "A Convolutional Neural Network for predicting HIV Integration Sites." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279796.

Full text
Abstract:
Convolutional neural networks are commonly used when training deep networks on time-independent data and have demonstrated positive results in predicting DNA binding sites for DNA-binding proteins. Based upon this success, this project intends to determine whether a convolutional neural network could predict possible HIV-B provirus integration sites. When exploring existing research, little information was found regarding the DNA sequences targeted by HIV for integration, and few, if any, studies have attempted to use artificial neural networks to identify these sequences and the integration sites themselves. Using data from the Retrovirus Integration Database, we train a convolutional artificial neural network to determine whether it can detect potential target sites for HIV integration. The analysis and results reveal that the created convolutional neural network is able to predict HIV integration sites in human DNA with an accuracy that exceeds that of a random binary classifier. When analyzing the datasets separated by the neural network, the relative distribution of the different nucleotides in the immediate vicinity of the HIV integration sites reveals that some nucleotides occur disproportionately less often at these sites than in randomly sampled human DNA.
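The abstract does not state how the nucleotide sequences are fed to the network; one-hot encoding is the standard input representation for sequence CNNs of this kind and might look like:

```python
def one_hot_dna(sequence):
    """One-hot encode a DNA string into a list of 4-vectors (A, C, G, T order),
    the usual input representation for a 1-D CNN over nucleotides."""
    table = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0],
             "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}
    return [table[base] for base in sequence.upper()]

# A short window around a candidate integration site.
print(one_hot_dna("ACGT"))
# [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

With this encoding, a window of N bases becomes an N-by-4 matrix, and the first convolutional layer's filters act as learned sequence-motif detectors.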
APA, Harvard, Vancouver, ISO, and other styles
20

Rochford, Matthew. "Visual Speech Recognition Using a 3D Convolutional Neural Network." DigitalCommons@CalPoly, 2019. https://digitalcommons.calpoly.edu/theses/2109.

Full text
Abstract:
Mainstream automatic speech recognition (ASR) makes use of audio data to identify spoken words; however, visual speech recognition (VSR) has recently been of increased interest to researchers. VSR is used when audio data is corrupted or missing entirely, and also to further enhance the accuracy of audio-based ASR systems. In this research, we present both a framework for building 3D feature cubes of lip data from videos and a 3D convolutional neural network (CNN) architecture for performing classification on a dataset of 100 spoken words, recorded in an uncontrolled environment. Our 3D-CNN architecture achieves a testing accuracy of 64%, comparable with recent works, but using an input data size that is up to 75% smaller. Overall, our research shows that 3D-CNNs can be successful in finding spatiotemporal features using unsupervised feature extraction and are a suitable choice for VSR-based systems.
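The "3D feature cubes" mentioned above are stacks of frames convolved jointly over time and the two spatial axes. A naive numpy illustration of that core operation (a generic sketch of the technique, not the thesis implementation):

```python
import numpy as np

# Naive valid-mode 3D cross-correlation (the "convolution" used in CNNs):
# the kernel slides jointly over time and both spatial axes of a frame stack.
def conv3d_valid(volume, kernel):
    """Apply one 3D filter to a (T, H, W) volume, returning the valid-mode response."""
    T, H, W = volume.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = np.sum(volume[i:i+t, j:j+h, k:k+w] * kernel)
    return out

cube = np.random.rand(10, 32, 32)                     # 10 frames of 32x32 lip crops
feat = conv3d_valid(cube, np.ones((3, 3, 3)) / 27.0)  # one averaging filter
```

A real 3D-CNN stacks many such filters per layer and learns the kernel weights; the loop form here only makes the sliding-window geometry explicit.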
APA, Harvard, Vancouver, ISO, and other styles
21

Qian, Songyue. "Using convolutional neural network to generate neuro image template." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1546620227038248.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Kaster, Joshua M. "Training Convolutional Neural Network Classifiers Using Simultaneous Scaled Supercomputing." University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1588973772607826.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Nyrönen, P. (Pekka). "Convolutional neural network based super-resolution for mobile devices." Master's thesis, University of Oulu, 2018. http://urn.fi/URN:NBN:fi:oulu-201812083250.

Full text
Abstract:
Super-resolution is a challenging problem of restoring details lost to diffraction in the image capturing process. Degradations from the environment and the imaging device increase its difficulty, and they are strongly present in mobile phone cameras. The latest promising approaches involve convolutional neural networks, but little testing has been done on degraded images. Also, the sizes of neural networks raise a question about their applicability on mobile devices. A wide review of published super-resolution neural networks is done. Four of the network architectures are chosen, and their TensorFlow models are trained and tested for output quality on high-quality and degraded images and compared against bicubic interpolation with sharpening. For the first time, MTF and CPIQ acutance responses are measured from their outputs after processing photographs of a resolution chart. Their execution times on a mobile device are measured for small image sizes and for typical phone camera photo sizes. It is shown that the networks are able to learn degradation resilience and that the quality of their results surpasses that of bicubic interpolation with sharpening. However, even the lightest models still take considerable time to process images on a mobile device. Moreover, it is shown that the current way of training and evaluating super-resolution neural networks with high-quality images is inadequate for practical purposes, and that degradations have to be incorporated into training data in order to overcome the problem.
APA, Harvard, Vancouver, ISO, and other styles
24

Lopes, André Teixeira. "Facial expression recognition using deep learning - convolutional neural network." Universidade Federal do Espírito Santo, 2016. http://repositorio.ufes.br/handle/10/4301.

Full text
Abstract:
Facial expression recognition has been an active research area in the past ten years, with growing application areas such as avatar animation, neuromarketing, and sociable robots. The recognition of facial expressions is not an easy problem for machine learning methods, since people can vary significantly in the way they show their expressions; even images of the same person in one expression can vary in brightness, background, and position. Hence, facial expression recognition is still a challenging problem. To address these problems, in this work we propose a facial expression recognition system that uses convolutional neural networks. Data augmentation and different preprocessing steps were studied together with various convolutional neural network architectures; the augmentation and preprocessing steps were used to help the network with feature selection. Experiments were carried out with three widely used databases (Cohn-Kanade, JAFFE, and BU3DFE), and cross-database validations (i.e., training on one database and testing on another) were also performed. The proposed approach has shown to be very effective, improving the state-of-the-art results in the literature and allowing real-time facial expression recognition with standard PCs.
APA, Harvard, Vancouver, ISO, and other styles
25

Khasgiwala, Anuj. "Word Recognition in Nutrition Labels with Convolutional Neural Network." DigitalCommons@USU, 2018. https://digitalcommons.usu.edu/etd/7101.

Full text
Abstract:
Nowadays, everyone is busy trying to maintain a balance between work and family as working hours increase day by day. In such a hectic life, people either ignore or do not give enough attention to a healthy diet. An imperative part of a healthy eating routine is the awareness and maintenance of nutritional information and the comprehension of how different foods and nutritional constituents influence our bodies. In the USA, as in many other countries, nutritional information is fundamentally conveyed to consumers through nutrition labels (NLs), which can be found on all packaged food products in the form of a nutrition table. However, it sometimes turns out to be challenging to utilize the information available in these NLs, even for consumers who are health conscious, as they may not be familiar with nutritional terms and may find it hard to relate nutritional information to their daily activities because of lack of time, motivation, or training. It is therefore essential to automate this information gathering and interpretation procedure by incorporating machine-learning-based algorithms to extract nutritional information from NLs, since this enhances the consumer's capacity to participate in continuous nutritional information gathering and analysis.
APA, Harvard, Vancouver, ISO, and other styles
26

Vikström, Joel. "Training a Convolutional Neural Network to Evaluate Chess Positions." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-263062.

Full text
Abstract:
Convolutional neural networks are typically applied to image analysis problems. We investigate whether a simple convolutional neural network can be trained to evaluate chess positions by predicting the evaluations of Stockfish (an existing chess engine). Publicly available data from lichess.org was used, and we obtained a final MSE of 863.48 and MAE of 12.18 on our test dataset (with labels ranging from -255 to +255). We conclude that a more capable model architecture is needed to achieve better results.
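The MSE and MAE figures quoted above are standard regression metrics over engine evaluations clipped to the label range. A small illustrative sketch (the data here is made up; only the metric definitions and the [-255, +255] clipping come from the abstract):

```python
import numpy as np

# Illustrative reproduction of the metrics: labels are engine evaluations
# clipped to [-255, +255]; the model is scored by mean squared / absolute error.
def mse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

labels = np.clip([300, -40, 12, -500], -255, 255)  # clipped engine evaluations
preds = np.array([250.0, -30.0, 0.0, -255.0])      # hypothetical network outputs
```

Note that an MAE of 12.18 on a +/-255 scale is small in absolute terms, which is why the thesis judges the architecture by both metrics together.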
APA, Harvard, Vancouver, ISO, and other styles
27

SAH, BIKASH KUMAR. "A NOVEL CONVOLUTIONAL NEURAL NETWORK FOR AIR POLLUTION FORECASTING." Thesis, DELHI TECHNOLOGICAL UNIVERSITY, 2021. http://dspace.dtu.ac.in:8080/jspui/handle/repository/18792.

Full text
Abstract:
Air pollution was a global problem a few decades back. It is still a problem and will continue to be one if not addressed appropriately. Various machine learning and deep learning approaches have been proposed for accurate prediction, estimation, and analysis of air pollution. We propose a novel five-layer one-dimensional convolutional neural network architecture, a deep learning approach, to forecast the PM2.5 concentration. We used the five-year air pollution dataset from 2010 to 2014 recorded by the US embassy in Beijing, China, taken from the UCI machine learning repository [19]. The dataset is in .csv format and consists of feature columns such as "Number," "year," "month," "day," "PM2.5," "PM10," "SO2," "dew," "temp," "pressure," "wind direction," "wind speed," "snow," and "rain," with a total of 43,324 rows. The model yields the best results in predicting PM2.5 levels, with an RMSE of 28.1309 and an MAE of 14.9727. Statistical analysis showed that our proposed prediction model outperformed traditional forecasting models such as DTR, SVR, and ANN for air pollution forecasting.
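One-dimensional CNN forecasters of this kind are usually fed fixed-length sliding windows of the pollutant series, each paired with the value a few steps ahead. A hedged sketch of that framing step (helper names are hypothetical, not from the thesis):

```python
import numpy as np

# Frame a univariate series into supervised (window -> future value) pairs,
# the usual input format for a 1D-CNN forecaster.
def make_windows(series, window, horizon=1):
    """Return (num_samples, window) inputs and scalar targets `horizon` steps ahead."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i:i + window])
        y.append(series[i + window + horizon - 1])
    return np.array(X), np.array(y)

pm25 = np.arange(100, 120, dtype=float)   # toy stand-in for hourly PM2.5 readings
X, y = make_windows(pm25, window=8)
```

Each row of `X` would be one channel of the network's input; multivariate variants stack the other feature columns as extra channels.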
APA, Harvard, Vancouver, ISO, and other styles
28

Zhang, Huizhen. "Alpha Matting via Residual Convolutional Grid Network." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39467.

Full text
Abstract:
Alpha matting is an important topic in computer vision. It has various applications, such as virtual reality, digital image and video editing, and image synthesis. Conventional approaches to alpha matting perform unsatisfactorily when they encounter complicated backgrounds and foregrounds, and they find it difficult to extract an accurate alpha matte when the foreground objects are transparent, semi-transparent, perforated, or hairy. Fortunately, the rapid development of deep learning techniques brings new possibilities for solving alpha matting problems. In this thesis, we propose a residual convolutional grid network for alpha matting, which is based on convolutional neural networks (CNNs) and can learn the alpha matte directly from the original image and its trimap. Our grid network consists of horizontal residual convolutional computation blocks and vertical upsampling/downsampling convolutional computation blocks. By choosing different paths along which to pass information, our network can not only retain the rich details of the image but also extract high-level abstract semantic information. The experimental results demonstrate that our method can solve matting problems that have plagued conventional matting methods for decades and outperforms the other state-of-the-art matting methods in quality and visual evaluation. The only method that performs slightly better than ours is the current best matting method; however, it requires three times as many trainable parameters as ours. Hence, our method is the best choice when computational complexity, memory usage, and matting performance are considered together.
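The matting task above is governed by the standard compositing equation, where every pixel I is a convex combination of a foreground color F and a background color B weighted by the alpha matte the network estimates. A minimal numpy sketch of that equation (not code from the thesis):

```python
import numpy as np

# The compositing model behind alpha matting: I = alpha * F + (1 - alpha) * B,
# applied per pixel with alpha broadcast over the color channels.
def composite(F, B, alpha):
    """Compose an image from foreground, background, and a per-pixel alpha matte."""
    a = alpha[..., None]                 # broadcast alpha over the color channels
    return a * F + (1.0 - a) * B

F = np.ones((2, 2, 3))                   # white foreground
B = np.zeros((2, 2, 3))                  # black background
alpha = np.array([[1.0, 0.5], [0.0, 0.25]])
I = composite(F, B, alpha)
```

Matting inverts this equation: given I and a trimap, recover alpha (and implicitly F and B), which is why fractional alpha values at hair and transparency are the hard cases.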
APA, Harvard, Vancouver, ISO, and other styles
29

Martell, Patrick Keith. "Hierarchical Auto-Associative Polynomial Convolutional Neural Networks." University of Dayton / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1513164029518038.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Plouet, Erwan. "Convolutional and dynamical spintronic neural networks." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASP120.

Full text
Abstract:
This thesis addresses the development of spintronic components for neuromorphic computing, a novel approach aimed at reducing the significant energy consumption of AI applications. The widespread adoption of AI, including very large-scale language models like ChatGPT, has led to increased energy demands, with data centers consuming about 1-2% of global power, a figure projected to double by 2030. Traditional hardware architectures, which separate memory and processing units, are not well suited to AI tasks, as neural networks require frequent access to large in-memory parameter sets, resulting in excessive energy dissipation. Neuromorphic computing, inspired by the human brain, merges memory and processing capabilities in the same device, potentially reducing energy use. Spintronics, which manipulates electron spin rather than charge, offers components that can operate at lower power and provide efficient processing solutions. The thesis is divided into two main parts. The first part focuses on the experimental implementation of a hybrid hardware-software convolutional neural network (CNN) using spintronic components. Spintronic synapses, which operate with radio-frequency signals, enable frequency multiplexing to reduce the need for numerous physical connections in neural networks. This research explores various designs of AMR spin-diode-based synapses, each with different specificities, and demonstrates the integration of these synapses into a hardware CNN. A significant achievement was the implementation of a spintronic convolutional layer within a CNN that, combined with a software fully-connected layer, successfully classified images from the FashionMNIST dataset with an accuracy of 88%, closely matching the performance of the pure-software equivalent network. Key findings include the development and precise control of spintronic synapses, the fabrication of synaptic chains for weighted summation in neural networks, and the successful implementation of a hybrid CNN with experimental spintronic components on a complex task. The second part of the thesis explores the use of spintronic nano-oscillators (STNOs) for processing time-dependent signals through their transient dynamics. STNOs exhibit nonlinear behaviors that can be exploited for complex tasks like time-series classification. A network of simulated STNOs was trained to discriminate between different types of time series, demonstrating superior performance compared to standard reservoir computing methods. We also proposed and evaluated a multilayer STNO network architecture for more complex tasks, such as classifying handwritten digits presented pixel by pixel. This architecture achieved an average accuracy of 89.83%, similar to an equivalent standard continuous-time recurrent neural network (CTRNN), indicating the potential of these networks to adapt to various dynamic tasks. Additionally, guidelines were established for matching device dynamics with input timescales, crucial for optimizing performance in networks of dynamic neurons. We demonstrated that multilayer networks of coupled STNOs can be effectively trained via backpropagation through time, highlighting the efficiency and scalability of spintronic neuromorphic computing. This research demonstrated that spintronic networks can be used to implement specific architectures and solve complex tasks. This paves the way for the creation of compact, low-power spintronic neural networks that could be an alternative for AI hardware, offering a sustainable solution to the growing energy demands of AI technologies.
APA, Harvard, Vancouver, ISO, and other styles
31

Vi, Margareta. "Object Detection Using Convolutional Neural Network Trained on Synthetic Images." Thesis, Linköpings universitet, Datorseende, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-153224.

Full text
Abstract:
Training data is the bottleneck when training convolutional neural networks: a larger dataset gives better accuracy but also requires longer training time. We show that fine-tuning neural networks on synthetically rendered images increases the mean average precision. This method was applied to two different datasets with five distinctive objects each. The first dataset consisted of random objects with different geometric shapes; the second contained objects used to assemble IKEA furniture. The best-performing neural network, trained on 5400 images, achieved a mean average precision of 0.81 on a test set sampled from a video sequence. We analyzed the impact of dataset size, batch size, number of training epochs, and network architecture. Using synthetic images to train CNNs is a promising path for object detection when access to large amounts of annotated image data is hard to come by.
APA, Harvard, Vancouver, ISO, and other styles
32

Fasth, Niklas, and Rasmus Hallblad. "Air Reconnaissance Analysis using Convolutional Neural Network-based Object Detection." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-48422.

Full text
Abstract:
The Swedish armed forces use the Single Source Intelligent Cell (SSIC), developed by Saab, for analysis of aerial reconnaissance video and report generation. The analysis can be time-consuming and demanding for a human operator, and identifying vehicles is an important part of the workflow. Artificial intelligence is widely used for analysis in many industries to aid or replace a human worker. In this paper, we investigate the possibility of aiding the human operator in air-reconnaissance data analysis, specifically object detection for finding cars in aerial images. Many state-of-the-art object detection models for vehicle detection in aerial images are based on a Convolutional Neural Network (CNN) architecture; models based on Faster R-CNN and SSD, both built on this architecture, are implemented. Comprehensive experiments are conducted with these models on two similar datasets, both consisting of aerial images with vehicles: the open Video Verification of Identity (VIVID) dataset and a confidential dataset provided by Saab. The initial experiments are conducted to find suitable configurations for the proposed models. Finally, an experiment is conducted to compare the performance of a human operator and a machine. The results from this work show that object detection can support air-reconnaissance image analysis with respect to inference time; the current performance of the object detectors makes them most suitable for applications where speed is more important than accuracy.
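Detection experiments of this kind are typically scored by intersection-over-union (IoU) between predicted and ground-truth boxes before any precision/recall computation. A generic illustration of that metric (not code from the thesis):

```python
# IoU: area of overlap divided by area of union for two axis-aligned boxes.
# A detection usually counts as correct when IoU exceeds a threshold (e.g. 0.5).
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, xb - xa) * max(0.0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

Both Faster R-CNN and SSD pipelines use this overlap test internally (for anchor matching and non-maximum suppression) as well as in evaluation.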
APA, Harvard, Vancouver, ISO, and other styles
33

Li, Yi-Hsiu, and 李易修. "Face Alignment Based on Modified Deep Convolutional Neural Networks." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/v79j68.

Full text
Abstract:
Master's thesis, National Taipei University of Technology, Department of Computer Science and Information Engineering. Many face-related applications, including face recognition, emotion detection, and medical cosmetology, rely on accurate facial feature information. Manual labeling is inefficient, unstable, and subjective, so an efficient automatic facial landmarking technique has become a crucial research topic. Current automatic facial feature extraction techniques based on deep learning networks are mostly applied to frontal facial landmarking. This thesis explores a deep convolutional neural network method for detecting 21 features on profile faces. Four major improvements to the model are proposed. First, a deeper network with larger input sizes and more convolution and pooling layers was used to better extract the features. Second, this thesis used not only color images but also gray-scale images as inputs to emphasize contours and edges. Third, we split the first- and second-layer models into five facial regions, and local models were used in subsequent training for better convergence under complex shapes. Fourth, some second-layer networks take non-square images as input, because a better-fitting capture region maximizes the feature response of that part. Experimental results substantiated the superiority of the proposed method. Compared with the original deep convolutional neural networks, the proposed model not only decreases the facial location deviation by 2.11 pixels but also increases the accuracy of facial landmarking by 38.14% under a 3-pixel (1.5% of facial height) error tolerance.
APA, Harvard, Vancouver, ISO, and other styles
34

Lai, Chen Yao, and 賴晨堯. "Using Convolution Neural Network to Identify the Modified Regions of Seam Carving." Thesis, 2019. http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107CGU05392010%22.&searchmode=basic.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

"Color Labeling via Convolutional Neural Network." 2016. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1292310.

Full text
Abstract:
Color serves as an important cue in many computer vision tasks. Nevertheless, obtaining accurate color descriptions from images of real-world scenes is non-trivial due to varying illumination conditions, view angles, and surface reflectance, so restoring distorted color is an important topic in real-world applications. In this study, we explore the problem of color description to achieve color constancy for images captured under uncalibrated conditions. To achieve this, we propose an end-to-end, pixel-to-pixel convolutional neural network (CNN) for color description that maps RGB values to predefined color labels. To explore the effectiveness of this pixelwise color labeling network, we discuss two case studies based on the proposed CNN model. We first address the challenging problem of object color description, focusing on pedestrian description in public spaces, which is heavily affected by illumination and view angles. To explore this problem, we propose to assign consistent color names to regions of a single object's surface. We make two contributions in this study: (1) we contribute a large-scale pedestrian color naming dataset with 14,213 hand-labeled images; (2) we adapt our color labeling CNN model for region-level pedestrian color naming. We demonstrate that the adapted Pedestrian Color Naming CNN (PCN-CNN) is superior to existing approaches in providing consistent color names for real-world pedestrian images. In addition, we show the effectiveness of the color descriptor extracted from PCN-CNN in complementing existing descriptors for the task of person re-identification. Moreover, we discuss a novel application that retrieves outfit matching and fashion (which can be difficult to describe with keywords) from just a user-provided color sketch. Besides this, we focus on the open problem of color recovery in color QR codes. In captured QR code images, chromatic distortion can make colors look totally different from the generated ones. We apply our color labeling CNN, as a learning-based classifier, to recover the distortion by mapping the printed colors to the predefined labels. Experimental results show that our method is superior to the baseline method in providing accurate recovered color labels for color QR code images. Cheng, Zhiyi. M.Phil. thesis, Chinese University of Hong Kong, 2016.
APA, Harvard, Vancouver, ISO, and other styles
36

Xiao, Bin, and 肖彬. "Epilepsy prediction with convolutional neural network." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/emhyjx.

Full text
Abstract:
Master's thesis, National Chiao Tung University, Institute of Computer Science and Engineering. Epilepsy is one of the most common brain diseases, and seizures can strike at any time, anywhere. The unpredictability of seizures is often considered the most problematic aspect of epilepsy by patients. A good seizure predictor can help patients reduce the burden of unpredictability and greatly improve their quality of life. Therefore, a central theme in epilepsy treatment is predicting seizures, so that patients can be warned before a seizure takes place. Electroencephalograms (EEGs) are recordings of the electrical potentials produced by the brain; EEG signals, together with patient behavior, have been used in the diagnosis of epilepsy for decades. Typically, researchers treat seizure prediction as a binary classification problem that aims to discriminate whether the cerebral state is preictal or interictal. In this work, we apply two of the simplest and most popular EEG signal processing methods, the Discrete Fourier Transform (DFT) and Principal Component Analysis (PCA), to generate features in the frequency and time domains, respectively. With these features as input, we propose a multi-view convolutional neural network model to solve the seizure prediction problem. Experimental results show that our approach outperforms other existing solutions. We also explore transfer learning to improve performance, and the experiments show that our solution benefits from it.
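The frequency-domain feature step described above usually amounts to taking the DFT of a short EEG window and summarizing the power in the standard clinical bands. A hedged sketch of that step (band definitions are the conventional ones; helper names are hypothetical, not from the thesis):

```python
import numpy as np

# Summarize one EEG channel window by its spectral power in the standard bands.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def band_powers(window, fs):
    """Return total DFT power per EEG band for one channel window sampled at fs Hz."""
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    power = np.abs(np.fft.rfft(window)) ** 2
    return {name: float(power[(freqs >= lo) & (freqs < hi)].sum())
            for name, (lo, hi) in BANDS.items()}

fs = 256.0
t = np.arange(0, 2, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t)   # a pure 10 Hz tone lands in the alpha band
feats = band_powers(eeg, fs)
```

Stacking such band-power vectors over channels and time windows yields the frequency-domain "view" that a multi-view CNN can consume alongside time-domain (e.g. PCA) features.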
APA, Harvard, Vancouver, ISO, and other styles
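Frequency-domain features like those described in the abstract above can be sketched in a few lines of plain Python. The window length, sampling rate, and EEG band edges below are illustrative assumptions, not the thesis's actual settings, and the naive DFT stands in for a real FFT routine.

```python
import cmath
import math

def dft_magnitudes(window):
    """Naive discrete Fourier transform; returns the magnitude spectrum."""
    n = len(window)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(window)))
            for k in range(n)]

def band_powers(window, fs, bands):
    """Sum spectral magnitudes falling inside each frequency band (Hz)."""
    n = len(window)
    mags = dft_magnitudes(window)
    return [sum(mags[k] for k in range(1, n // 2) if lo <= k * fs / n < hi)
            for lo, hi in bands]

# A one-second window sampled at 32 Hz containing a pure 8 Hz tone
fs = 32
window = [math.sin(2 * math.pi * 8 * t / fs) for t in range(fs)]
bands = [(0.5, 4), (4, 8), (8, 13), (13, 30)]  # delta, theta, alpha, beta
feats = band_powers(window, fs, bands)
# the alpha band (8-13 Hz) dominates for an 8 Hz tone
```

Such per-band energies, computed per channel and per window, would form the frequency-domain view fed to the multi-view CNN.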
37

Liu, Yu-Cheng, and 劉又誠. "Action Recognition Using Convolutional Neural Network." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/30475793234292847224.

Full text
Abstract:
Master<br>National Taiwan University<br>Graduate Institute of Communication Engineering<br>104<br>Multimedia plays an important role in human daily life, and hundreds of thousands of videos are uploaded to the Internet. Popular topics such as basketball and baseball games draw high click-through rates, so information retrieval techniques become important. Human action detection can be further applied to detect abnormal events and analyze activity. The dataset we use in our experiments contains human body actions and interactions with objects, such as jumping, clapping and drinking. In this thesis, we first use a convolutional neural network (CNN) to train a model, then extract the features of the training and testing data from that model. After obtaining the features, we use the temporal information between features in the same video clip to train a three-layer long short-term memory (LSTM) model. Finally, we take the last-layer feature vector of the LSTM, which summarizes the characteristics of the testing video features, as the decision scores. The results show that the accuracy of our structure is higher than that of several works proposed in recent years.
APA, Harvard, Vancouver, ISO, and other styles
38

Lee, Ssu-Rui, and 李思叡. "Image Denoising by Convolutional Neural Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/he56yv.

Full text
Abstract:
Master<br>National Tsing Hua University<br>Institute of Information Systems and Applications<br>107<br>Removing noise from images to improve image quality is a main challenge in image processing. Especially with the ubiquitous spread of computers, smartphones, the Internet, and social networks, image denoising becomes more and more important. In this work, we extend the results of Ulyanov et al. (2018) and introduce a competitive image denoising method based on the structural characteristics of convolutional neural networks (CNNs). Unlike most CNN-based methods, which need a large-scale dataset for training, our method looks at only one degraded image and removes the noise from that image itself. This method is not only an application of image denoising but also a viewpoint for visualizing the property and effect of each element in a convolutional neural network.
APA, Harvard, Vancouver, ISO, and other styles
39

WANG, SHENG-YUAN, and 王聖淵. "Convolutional Neural Network for Image Deblurring." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/gxr5xp.

Full text
Abstract:
Master<br>National Central University<br>Department of Computer Science and Information Engineering<br>106<br>In recent years, with the rise of deep learning in academia and industry, striking deep learning achievements and works have appeared every few months, proving that deep learning has many great effects on image applications. In this paper, convolutional neural networks are used as the main method to restore out-of-focus or blurred images to clear images. This paper proposes three network architectures: Auto_deblur, S-Net and AGDNet. When the image is only slightly damaged and blurred, S-Net is the better choice because it executes quickly. AGDNet, which integrates the concepts and advantages of the first two networks, works best when the image is more seriously degraded. In addition, this paper proposes an improved loss function for training the network so that the network output better fits real, clear images. Besides its good performance on deblurring, this architecture also works well for image super-resolution, image denoising and image restoration. The results show that this method performs better than other deep neural networks and other solutions commonly used in industry.
APA, Harvard, Vancouver, ISO, and other styles
40

CHEN, HUNG-PEI, and 陳虹霈. "Integrating Convolutional Neural Network and Recurrent Neural Network for Automatic Text Classification." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/4jqh8z.

Full text
Abstract:
Master<br>Soochow University<br>Department of Mathematics<br>108<br>With the rapid development of big data research, the demand for processing textual information is increasing. Text classification is still a hot research topic in the field of natural language processing. The traditional text mining process often uses the "Bag-of-Words" model, which discards the order of the words in a sentence and is mainly concerned with the frequency of occurrence of the words. TF-IDF (term frequency-inverse document frequency) is one of the feature extraction techniques commonly used in text mining and classification. Therefore, we combine a convolutional neural network and a recurrent neural network to take into account the semantics and order of the words in a sentence for text classification. We use the 20 Newsgroups collection as our test dataset. The result achieves an accuracy of 86.3% on the test set, an improvement of about 3% over the traditional model.
APA, Harvard, Vancouver, ISO, and other styles
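The TF-IDF weighting mentioned in the abstract above can be sketched directly; the toy corpus below is purely illustrative, and this uses the plain log(N/df) form of inverse document frequency (variants with smoothing also exist).

```python
import math
from collections import Counter

def tf_idf(docs):
    """TF-IDF weights for a list of tokenized documents."""
    n = len(docs)
    # document frequency: number of documents containing each term
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({term: (count / len(doc)) * math.log(n / df[term])
                        for term, count in tf.items()})
    return weights

docs = [["stock", "price", "rises"],
        ["stock", "market", "falls"],
        ["team", "wins", "game"]]
w = tf_idf(docs)
# "stock" occurs in two of the three documents, so it gets a lower
# weight than terms unique to a single document, such as "price"
```

Note how the weighting ignores word order entirely, which is exactly the limitation of bag-of-words features that the thesis's CNN-plus-RNN model is meant to address.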
41

(5931110), Durvesh Pathak. "Compressed Convolutional Neural Network for Autonomous Systems." Thesis, 2019.

Find full text
Abstract:
The word “Perception” seems intuitive, and maybe the most straightforward problem for the human brain, because as children we have been trained to classify images and detect objects; but for computers it can be a daunting task. Giving intuition and reasoning to a computer, which merely has the capability to accept and process commands, is a big challenge. However, recent leaps in hardware development, sophisticated software frameworks, and mathematical techniques have made it a little less daunting, if not easy. Various applications are built around the concept of “Perception”. These applications require substantial computational resources, expensive hardware, and some sophisticated software frameworks. Building an application for perception on an embedded system is an entirely different ballgame. An embedded system is a culmination of hardware, software and peripherals developed for specific tasks, with imposed constraints on memory and power. Therefore, applications should be developed with the memory and power constraints imposed by the nature of these systems in mind. Before 2012, problems related to “Perception” such as classification and object detection were solved using algorithms with manually engineered features. In recent years, however, instead of manually engineering the features, these features are learned through learning algorithms. The game-changing architecture of Convolutional Neural Networks proposed in 2012 by Alex K provided tremendous momentum in the direction of pushing neural networks for perception. This thesis is an attempt to develop a convolutional neural network architecture for embedded systems, i.e. an architecture that has a small model size and competitive accuracy. We recreate state-of-the-art architectures using the fire module concept to reduce the model size of the architecture. The proposed compact models are feasible for deployment on embedded devices such as the Bluebox 2.0.
Furthermore, attempts are made to integrate the compact Convolutional Neural Network with object detection pipelines.
APA, Harvard, Vancouver, ISO, and other styles
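As a rough illustration of why the fire-module approach described above shrinks models, compare the weight count of a plain 3×3 convolution with a SqueezeNet-style fire module; the channel sizes here are made-up examples, not the thesis's actual configuration.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution layer (biases ignored)."""
    return c_in * c_out * k * k

def fire_params(c_in, squeeze, expand):
    """SqueezeNet-style fire module: a 1x1 squeeze layer, then 1x1 and 3x3
    expand branches that each produce half of the output channels."""
    return (conv_params(c_in, squeeze, 1)            # squeeze layer
            + conv_params(squeeze, expand // 2, 1)   # 1x1 expand branch
            + conv_params(squeeze, expand // 2, 3))  # 3x3 expand branch

plain = conv_params(128, 128, 3)   # plain 3x3 convolution: 147,456 weights
fire = fire_params(128, 16, 128)   # fire module: 12,288 weights
```

The narrow squeeze layer cuts the input channel count seen by the expensive 3×3 filters, which is where most of the storage savings come from.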
42

Pathak, Durvesh. "Compressed convolutional neural network for autonomous systems." Thesis, 2018. http://hdl.handle.net/1805/17921.

Full text
Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)<br>The word “Perception” seems intuitive, and maybe the most straightforward problem for the human brain, because as children we have been trained to classify images and detect objects; but for computers it can be a daunting task. Giving intuition and reasoning to a computer, which merely has the capability to accept and process commands, is a big challenge. However, recent leaps in hardware development, sophisticated software frameworks, and mathematical techniques have made it a little less daunting, if not easy. Various applications are built around the concept of “Perception”. These applications require substantial computational resources, expensive hardware, and some sophisticated software frameworks. Building an application for perception on an embedded system is an entirely different ballgame. An embedded system is a culmination of hardware, software and peripherals developed for specific tasks, with imposed constraints on memory and power. Therefore, applications should be developed with the memory and power constraints imposed by the nature of these systems in mind. Before 2012, problems related to “Perception” such as classification and object detection were solved using algorithms with manually engineered features. In recent years, however, instead of manually engineering the features, these features are learned through learning algorithms. The game-changing architecture of Convolutional Neural Networks proposed in 2012 by Alex K [1] provided tremendous momentum in the direction of pushing neural networks for perception. This thesis is an attempt to develop a convolutional neural network architecture for embedded systems, i.e. an architecture that has a small model size and competitive accuracy. We recreate state-of-the-art architectures using the fire module concept to reduce the model size of the architecture.
The proposed compact models are feasible for deployment on embedded devices such as the Bluebox 2.0. Furthermore, attempts are made to integrate the compact Convolutional Neural Network with object detection pipelines.
APA, Harvard, Vancouver, ISO, and other styles
43

Reis, Afonso de Sá. "Accelerating the training of convolutional neural network." Master's thesis, 2019. https://hdl.handle.net/10216/122196.

Full text
Abstract:
The objective of this report is to implement a Convolutional Neural Network (CNN) on an FPGA, with a main focus on accelerating training, using Maxeler technology to compile higher-level code directly into hardware. Neural networks are one of the most commonly used models for all sorts of machine learning tasks. This type of network is mostly used for image recognition/generation, since a few layer types (convolutional, pooling) can be viewed as image operations that find features, which are then combined in the fully connected layer(s) and used to produce the output.
APA, Harvard, Vancouver, ISO, and other styles
44

Chen, Jun-Hao, and 陳俊豪. "Predict FX via Convolutional Neural Network (CNNs)." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/2n2fqy.

Full text
Abstract:
Master<br>National Taiwan University<br>Graduate Institute of Economics<br>105<br>Deep learning is an effective approach to solving image recognition problems, and people like to think intuitively from trading charts. This study used the characteristics of deep learning to train computers to imitate people's thinking from the trading chart. We proceed in three steps: 1. Before training, we pre-process our input data from quantitative data to images. 2. We use a convolutional neural network (CNN), a kind of deep learning, to train our trading model. 3. We evaluate the model performance by classification accuracy. With this approach, a trading model is obtained to help make trading strategies. The main application is designed to help clients automatically obtain personalized trading strategies.
APA, Harvard, Vancouver, ISO, and other styles
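Step 1 above, converting quantitative data to images, could look something like this minimal sketch; the rasterization scheme and the sample prices are assumptions for illustration only, not the thesis's actual pre-processing.

```python
def series_to_image(prices, height):
    """Rasterize a price series into a binary height-by-len(prices) grid."""
    lo, hi = min(prices), max(prices)
    span = (hi - lo) or 1.0            # avoid division by zero on flat series
    img = [[0] * len(prices) for _ in range(height)]
    for t, p in enumerate(prices):
        row = round((p - lo) / span * (height - 1))
        img[height - 1 - row][t] = 1   # row 0 is the top of the chart
    return img

prices = [1.0, 1.2, 1.1, 1.4, 1.3]
img = series_to_image(prices, 4)
# each column holds exactly one lit pixel marking that day's price level
```

The resulting grid plays the role of a chart image: the CNN then classifies such images instead of the raw numbers.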
45

Lee, Heng, and 李亨. "Convolutional Neural Network Accelerator with Vector Quantization." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/w7kr56.

Full text
Abstract:
Master<br>National Taiwan University<br>Graduate Institute of Electronics Engineering<br>107<br>Deep neural networks (DNNs) have demonstrated impressive performance in many edge computer vision tasks, driving an increasing demand for DNN accelerators on mobile and Internet of Things (IoT) devices. However, the massive power consumption and storage requirements make the hardware design challenging. In this paper, we introduce a DNN accelerator based on vector quantization (VQ), a model compression technique that reduces the network model size and computation cost simultaneously. Moreover, a specialized processing element (PE) is designed with various SRAM bank configurations as well as dataflows, such that it can support different codebook/kernel sizes and keep high utilization under small input or output channel numbers. Compared to the state of the art, the proposed accelerator architecture achieves a 3.94-times reduction in memory access and a 1.2-times reduction in latency for batch-one inference.
APA, Harvard, Vancouver, ISO, and other styles
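At its core, vector quantization as used above replaces each weight sub-vector with the index of its nearest codebook entry, trading a little accuracy for large storage savings. This minimal sketch, with a made-up codebook and vectors, shows the lookup; the thesis's actual codebook sizes and training procedure are not reflected here.

```python
def quantize(vector, codebook):
    """Index of the nearest codeword under squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(vector, codebook[i])))

# A tiny illustrative codebook of 2-D codewords
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]

# Each kernel sub-vector is stored as one small codebook index
# instead of several raw floating-point weights
kernels = [(0.9, 1.1), (0.1, -0.2), (-0.8, 0.9)]
indices = [quantize(k, codebook) for k in kernels]
# indices -> [1, 0, 2]
```

In hardware, the same idea lets the accelerator fetch short indices from SRAM and share the dot products of the few codewords across many kernels.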
46

Lin, Tian-Yi, and 林天翼. "Manga Character Clustering using Convolutional Neural Network." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/5s46b5.

Full text
Abstract:
Master<br>National Taiwan University<br>Graduate Institute of Computer Science and Information Engineering<br>106<br>As reading habits change, more and more manga books are digitized and can be read on a tablet or smartphone. However, most of them are just scanned from the printed version and treated as ordinary image files, and only a few studies focus on extracting information from manga pages automatically. In this thesis, we try to cluster the faces in manga books by their identities. We propose a method using a convolutional neural network and a clustering algorithm: we designed and trained a CNN to extract features from manga character faces, and used the extracted features and the spatial relations of the faces to cluster them. Our method achieved 76.92% accuracy on unseen manga books and characters.
APA, Harvard, Vancouver, ISO, and other styles
47

Yang, Shu-Sian, and 楊恕先. "Speech Recognition by Using Convolutional Neural Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/szuwyc.

Full text
Abstract:
Master<br>National Central University<br>Department of Electrical Engineering<br>107<br>This thesis develops a method for automatic speech recognition. In this method, we obtain the speech feature parameters through Mel-frequency cepstral coefficients (MFCCs) and input them to a convolutional neural network. The main difference between this convolutional neural network speech recognition method and traditional methods is that it does not need to establish an acoustic model. In Chinese, for example, it saves a lot of time by not building a large number of consonant and vowel models. After the speech feature parameters are obtained through the MFCCs, speech recognition is carried out by the convolutional neural network.
APA, Harvard, Vancouver, ISO, and other styles
48

Huang, Tzu-Hsuan, and 黃子軒. "A Convolutional Neural Network for Face Detection." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/b7g3ew.

Full text
Abstract:
Master<br>National Central University<br>Department of Mathematics<br>107<br>Motivated by the work of Viola and Jones [10] on object detection, in this thesis we propose a new convolutional neural network model for face detection, based on the study by Krizhevsky et al. [5]. The proposed model is not limited to black-and-white images or a fixed face size. This approach combines Keras with OpenCV to construct a neural network framework consisting of several convolutional layers and fully connected layers. We use a training set of about 150,000 color images, drawn from the CelebA [7] and ImageNet databases, to train the proposed neural network model. We then employ the trained neural network as a strong classifier, combined with a sliding window, to detect the faces in a given color image. We also use bilinear interpolation to design a zoom-in and zoom-out technique to deal with images containing faces that are too large or too small. Finally, a series of numerical experiments is performed to demonstrate the effectiveness of the proposed convolutional neural network model for face detection.
APA, Harvard, Vancouver, ISO, and other styles
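The sliding-window scan mentioned above amounts to enumerating candidate patches for the classifier; this tiny sketch (the image and window sizes are invented for the example) shows the enumeration the trained network would be applied to.

```python
def sliding_windows(width, height, win, stride):
    """Yield the top-left corner of every win-by-win window in the image."""
    for y in range(0, height - win + 1, stride):
        for x in range(0, width - win + 1, stride):
            yield (x, y)

# An 8x6 image scanned with a 4x4 window and stride 2 gives 3 x 2 windows
coords = list(sliding_windows(8, 6, 4, 2))
```

Combined with rescaling the image (the zoom-in/zoom-out step), the same fixed-size window can cover faces at many scales.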
49

Wang, Shi-Hao, and 王士豪. "Applying Convolutional Neural Network for Malware Detection." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/98tyh5.

Full text
Abstract:
Doctoral<br>National Sun Yat-sen University<br>Department of Information Management<br>106<br>Failure to detect malware at its very inception leaves room for it to pose significant threats and costs to the cyber security of individuals and organizations, and even of society and the nation. However, the rapid growth in the volume and diversity of malware renders conventional detection techniques that rely on feature extraction and comparison insufficient, making it very difficult for well-trained network administrators to identify malware, not to mention regular Internet users. The challenges in malware detection are exacerbated because the complexity of malware types and structures has also increased dramatically in recent years to include source code, binary files, shell scripts, Perl scripts, instructions, settings and others; such increased complexity makes misjudgment more likely. In order to increase malware detection efficiency and accuracy under a large volume and multiple types of malware, this dissertation adopts Convolutional Neural Networks (CNNs), one of the most successful deep learning techniques. The experiments show an accuracy rate of over 90% in distinguishing malicious from benign code. They also show that the CNN is effective at detecting both source code and binary code, and can further identify malware embedded in benign code, leaving malware no place to hide. This dissertation proposes a feasible solution for network administrators to efficiently identify malware at its very inception in today's severe network environment, so that information technology personnel can take protective actions in a timely manner and prepare for potential follow-up cyber attacks.
APA, Harvard, Vancouver, ISO, and other styles
50

Liou, Jhao-Yu, and 劉昭雨. "Using Convolutional Neural Network on Technical Analysis Indicators." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/32289485607677466547.

Full text
Abstract:
Master<br>National Dong Hwa University<br>Department of Computer Science and Information Engineering<br>105<br>Deep learning is a state-of-the-art artificial intelligence technology, and convolutional neural networks have been widely used in image recognition competitions. In the financial stock market, line charts of various technical indicators are often used to predict trends. This paper uses the convolutional neural network's excellent image recognition ability on line charts of various technical indicators, treating stock price prediction as a classification problem. Because the historical data of a single stock is too small, this paper collects data from stocks of the same kind and converts them into a line chart of each technical indicator as input. We then train the convolutional neural network, whose output is divided into two classes, rising and falling. We also design a profit strategy based on the convolutional neural network.
APA, Harvard, Vancouver, ISO, and other styles