Dissertations / Theses on the topic 'SLR; CNN; Artificial Neural Network'

Consult the top 28 dissertations / theses for your research on the topic 'SLR; CNN; Artificial Neural Network.'

1

Lind, Benjamin. "Artificial Neural Networks for Image Improvement." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-137661.

Full text
Abstract:
After a digital photo has been taken by a camera, it can be manipulated to be more appealing. Two common adjustments are reducing noise and increasing saturation, which are usually done by hand in an image-manipulation program, given time and skill. In this thesis, automatic image improvement based on artificial neural networks is explored and evaluated qualitatively and quantitatively. A new approach, which builds on an existing method for colorizing grayscale images, is presented and its performance compared both to simpler methods and to the state of the art in image denoising. Saturation is lowered and noise added to original images, which the methods receive as inputs to improve upon. The new method is shown to improve some images but not all, depending on the image and how it was modified before being given to the method.
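The two manual adjustments the abstract mentions, denoising and saturation, can be illustrated with a minimal pure-Python sketch. These are the hand-tuned baselines, not the thesis's neural-network method; the 3x3 mean filter and luma-based saturation scaling below are standard textbook operations chosen only for illustration.

```python
# Illustrative baselines only: a 3x3 mean filter for denoising a grayscale
# image (list of lists of floats) and a simple saturation adjustment for one
# RGB pixel. Not the learned approach evaluated in the thesis.

def mean_filter_3x3(img):
    """Denoise by replacing each pixel with the mean of its 3x3 neighborhood."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out

def adjust_saturation(rgb, factor):
    """Scale each channel's distance from the pixel's luma, then clamp to [0, 255]."""
    luma = 0.299 * rgb[0] + 0.587 * rgb[1] + 0.114 * rgb[2]
    return tuple(min(255.0, max(0.0, luma + factor * (c - luma))) for c in rgb)
```

A factor below 1 desaturates (as done to the test images), and a factor above 1 boosts saturation; a factor of exactly 1 leaves the pixel unchanged.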
APA, Harvard, Vancouver, ISO, and other styles
2

Hodges, Jonathan Lee. "Predicting Large Domain Multi-Physics Fire Behavior Using Artificial Neural Networks." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/86364.

Full text
Abstract:
Fire dynamics is a complex process involving multi-mode heat transfer, reacting fluid flow, and the reaction of combustible materials. High-fidelity predictions of fire behavior using computational fluid dynamics (CFD) models come at a significant computational cost, with simulation times often measured in hours, days, or even weeks. A newer simulation method is a machine learning approach that uses artificial neural networks (ANNs) to represent underlying connections between data and make predictions for new inputs. The field of image analysis has seen significant advancements in ANN performance by using feature-based layers in the network architecture. Inspired by these advancements, a generalized procedure to design ANNs that make spatially resolved predictions in multi-physics applications is presented and applied to different fire applications. A deep convolutional inverse graphics network (DCIGN) was developed to predict the two-dimensional spatially resolved spread of a wildland fire. The network uses an image stack corresponding to the spatially resolved landscape, weather, and current fire perimeter (which can be obtained from measurements) to predict the fire perimeter six hours in the future. A transpose convolutional neural network (TCNN) was developed to predict the spatially resolved thermal flow field in a compartment fire from coarse zone fire model predictions. The network uses thirty-five parameters describing the geometry of the room and the ventilation conditions to predict the full-field temperature and velocity throughout the room. The data used to train and test both networks were generated with high-fidelity CFD fire simulations. Overall, the ANN predictions of each network agree with simulation predictions for validation scenarios, and the ANNs evaluate 10,000x faster than the high-fidelity fire simulations. This work represents a first step in developing super-real-time full-field fire predictions for different applications.

Ph. D.

The National Fire Protection Association estimates the total cost of fire in the United States at $300 billion annually. In 2017 alone, there were 3,400 civilian fire fatalities, 14,670 civilian fire injuries, and an estimated $23 billion in direct property loss in the United States. Large-scale fires in the wildland urban interface (WUI) and in large buildings still represent a significant hazard to life, property, and the environment. Researchers and fire safety engineers often use computer simulations to predict the behavior of a fire to assist in reducing its hazard. Unfortunately, typical simulations of fire scenarios may take hours, days, or even weeks to run, which limits their use to small areas or sections of buildings. A newer method is a machine learning approach that uses artificial neural networks (ANNs) to represent underlying connections between data and make new predictions of fire behavior. Inspired by advancements in the field of image processing, this research developed a procedure to use machine learning to make rapid, high-resolution predictions of fire behavior. An ANN was developed to predict the perimeter of a wildland fire six hours in the future based on a set of images corresponding to the landscape, weather, and current fire perimeter, all of which can be obtained directly from measurements (US Geological Survey, Automated Surface Observation System, and satellites). In addition, an ANN was developed to predict high-resolution temperature and velocity fields within a floor of a building based on predictions from a coarse model. The data used to train and test these networks were generated with high-resolution fire simulations. Overall, the network predictions agree well with simulation predictions for new scenarios, and the models run 10,000x faster than the typical simulations. The work presented herein represents a first step in developing high-resolution computer simulations for different fire scenarios that run very quickly.
APA, Harvard, Vancouver, ISO, and other styles
3

Garbay, Thomas. "Zip-CNN." Electronic Thesis or Diss., Sorbonne université, 2023. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2023SORUS210.pdf.

Full text
Abstract:
Digital systems used for the Internet of Things (IoT) and embedded systems have seen increasing use in recent decades. Embedded systems based on microcontroller units (MCUs) solve various problems by collecting large amounts of data. Today, about 250 billion MCUs are in use, and projections for the coming years point to very strong growth. Artificial intelligence has seen a resurgence of interest since 2012: the use of Convolutional Neural Networks (CNNs) has helped solve many problems in computer vision and natural language processing. Implementing CNNs within embedded systems would greatly improve the exploitation of the collected data. However, the inference cost of a CNN makes its implementation within embedded systems challenging. This thesis focuses on exploring the solution space in order to assist the implementation of CNNs within embedded systems based on microcontrollers. For this purpose, the ZIP-CNN methodology is defined. It takes into account the embedded system and the CNN to be implemented, and provides the embedded designer with information on the impact of the CNN inference on the system, so that design choices can be explored with the objective of respecting the constraints of the targeted application. A model is defined to quantitatively estimate the latency, energy consumption, and memory space required to infer a CNN within an embedded target, whatever the topology of the CNN. This model takes into account algorithmic reductions such as knowledge distillation, pruning, and quantization. Implementing state-of-the-art CNNs within MCUs experimentally verified the accuracy of the different estimations. The models developed in this thesis democratize the implementation of CNNs within MCUs by assisting the designers of embedded systems. Moreover, the results open a path toward applying the developed models to other target hardware, such as multi-core architectures or FPGAs, and the estimation results are also exploitable in Neural Architecture Search (NAS) algorithms.
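ZIP-CNN's estimation models are not reproduced here, but the kind of quantity they predict can be illustrated with a first-order resource count for a single convolutional layer. The formulas below (multiply-accumulate counts and byte counts for 8-bit quantized values, valid padding) are a hypothetical simplification for illustration, not the thesis's model.

```python
# First-order cost of one conv layer on an MCU-class target: MAC count,
# weight memory, and activation memory. Hypothetical simplification, not
# the ZIP-CNN estimation model itself.

def conv2d_cost(h, w, c_in, c_out, k, stride=1, bytes_per_value=1):
    """Resource estimate for a k x k conv on an (h, w, c_in) input, no padding."""
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    macs = h_out * w_out * c_out * c_in * k * k      # one MAC per kernel tap
    weights = c_out * c_in * k * k + c_out           # kernels + biases
    activations = h_out * w_out * c_out              # output feature map
    return {"macs": macs,
            "weight_bytes": weights * bytes_per_value,
            "activation_bytes": activations * bytes_per_value,
            "out_shape": (h_out, w_out, c_out)}
```

Summing such estimates over a network's layers gives the rough footprint a designer would compare against an MCU's flash and SRAM budget.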
APA, Harvard, Vancouver, ISO, and other styles
4

Reiling, Anthony J. "Convolutional Neural Network Optimization Using Genetic Algorithms." University of Dayton / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1512662981172387.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Knutsson, Magnus, and Linus Lindahl. "A COMPARATIVE STUDY OF FFN AND CNN WITHIN IMAGE RECOGNITION : The effects of training and accuracy of different artificial neural network designs." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-17214.

Full text
Abstract:
Image recognition and classification are becoming more important as the need to process large numbers of images becomes more common. The aim of this thesis is to compare two types of artificial neural networks, the feed-forward network (FFN) and the convolutional neural network (CNN), to see how they compare on the task of image recognition. Six models of each type of neural network were created, differing in width, depth, and the activation function used for learning. This enabled the experiment to also examine whether these parameters had any effect on the rate at which a network learns and how the network design affected the validation accuracy of the models. The models were implemented using the Keras API, and trained and tested on the CIFAR-10 dataset. The results showed that, within the scope of this experiment, the CNN models were always preferable, as they achieved a statistically higher validation accuracy than their FFN counterparts.
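One reason CNNs tend to win comparisons like this on image data is parameter sharing, which a simple parameter count makes concrete. The layer sizes below (a 128-unit dense layer versus a 32-filter 3x3 conv layer on a CIFAR-10-sized input) are hypothetical illustrations, not the six models of the thesis.

```python
# Parameter counts for the first layer of each architecture on a CIFAR-10
# image (32 x 32 x 3). A dense layer connects every pixel to every unit;
# a conv layer reuses one small kernel across the whole image.

def dense_params(n_in, n_out):
    return n_in * n_out + n_out                 # weight matrix + biases

def conv_params(c_in, c_out, k):
    return c_in * c_out * k * k + c_out         # shared kernels + biases

cifar_pixels = 32 * 32 * 3                      # flattened CIFAR-10 image
ffn_first_layer = dense_params(cifar_pixels, 128)   # 393,344 parameters
cnn_first_layer = conv_params(3, 32, 3)             # 896 parameters
```

The conv layer achieves spatial coverage with roughly three orders of magnitude fewer parameters in its first layer, which is one factor behind the accuracy gap the experiments report.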
APA, Harvard, Vancouver, ISO, and other styles
6

Ďuriš, Denis. "Detekce ohně a kouře z obrazového signálu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-412968.

Full text
Abstract:
This diploma thesis deals with the detection of fire and smoke from an image signal. The approach combines a convolutional and a recurrent neural network: the machine learning models created in this work contain inception modules and long short-term memory blocks. The research part describes selected machine learning models used to solve the problem of fire detection in static and dynamic image data. As part of the solution, a dataset containing videos and still images was created to train the designed neural networks. The results of this approach are evaluated in the conclusion.
APA, Harvard, Vancouver, ISO, and other styles
7

Andersson, Viktor. "Semantic Segmentation : Using Convolutional Neural Networks and Sparse dictionaries." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139367.

Full text
Abstract:
The two main bottlenecks in using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network: the use of sparse dictionaries. A sparse dictionary optimized on domain-specific data can be seen as a set of intelligent feature-extracting filters, and this thesis investigates the effect of using such filters as kernels in the convolutional layers of the network. How do they affect the training time and final performance? The dataset used is the Cityscapes dataset, a library of 25,000 labeled road-scene images. The sparse dictionary was acquired using the K-SVD method. The filters were added to two different networks, one much deeper than the other, whose performance was tested individually. The results, presented for both networks, show that filter initialization is an important aspect that should be taken into consideration when training deep networks for semantic segmentation.
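K-SVD itself is too involved for a short sketch, but the final step of the initialization idea, turning learned dictionary atoms into convolution kernels, is simple enough to show. The flat-atom layout assumed here (one row per atom, each of length k*k) is an illustrative convention, not necessarily the layout used in the thesis.

```python
# Reshape each flat dictionary atom (length k*k) into a k x k kernel that
# could initialize a convolutional layer's filters. Row-major layout is an
# assumption for illustration.

def atoms_to_kernels(dictionary, k):
    """Convert a list of flat atoms into a list of k x k kernels."""
    kernels = []
    for atom in dictionary:
        assert len(atom) == k * k, "atom length must match kernel size"
        kernels.append([atom[row * k:(row + 1) * k] for row in range(k)])
    return kernels
```

In a framework such as Keras, the resulting arrays would be passed to the layer's kernel initializer in place of random initial weights.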
APA, Harvard, Vancouver, ISO, and other styles
8

Wilson, Brittany Michelle. "Evaluating and Improving the SEU Reliability of Artificial Neural Networks Implemented in SRAM-Based FPGAs with TMR." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8619.

Full text
Abstract:
Artificial neural networks (ANNs) are used in many types of computing applications. Traditionally, ANNs have been implemented in software, executing on CPUs and even GPUs, which capitalize on the parallelizable nature of ANNs. More recently, FPGAs have become a target platform for ANN implementations due to their relatively low cost, low power, and flexibility. Some safety-critical applications could benefit from ANNs, but these applications require a certain level of reliability. SRAM-based FPGAs are sensitive to single-event upsets (SEUs), which can lead to faults and errors in execution. However, there are techniques that can mask such SEUs and thereby improve the overall design reliability. This thesis evaluates the SEU reliability of neural networks implemented in SRAM-based FPGAs and investigates mitigation techniques against upsets for two case studies. The first was based on the LeNet-5 convolutional neural network and was used to test an implementation with both fault injection and neutron radiation experiments, demonstrating that our fault injection experiments could accurately evaluate the SEU reliability of the networks. SEU reliability was improved by selectively applying TMR to the most critical layers of the design, achieving a 35% improvement in reliability for a 6.6% increase in resources. The second was an existing neural network called BNN-PYNQ. While the base design was more sensitive to upsets than the CNN previously tested, the TMR technique improved its reliability by approximately 7× in fault injection experiments.
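The TMR technique applied in the thesis can be illustrated in miniature: triplicate a computation and take a bitwise majority vote of the three outputs, so a single upset-corrupted replica is outvoted. On an SRAM-based FPGA the voter is synthesized hardware, not software; the Python below only models the voting logic.

```python
# Triple modular redundancy in miniature: three redundant copies of an output
# are combined with a bitwise majority vote. A single-event upset flipping
# bits in one replica leaves the voted result unchanged.

def tmr_vote(a, b, c):
    """Bitwise majority of three redundant integer outputs."""
    return (a & b) | (a & c) | (b & c)
```

Selective TMR, as in the thesis, applies this triplication only to the most critical layers, trading a smaller resource increase for most of the reliability gain.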
APA, Harvard, Vancouver, ISO, and other styles
9

Mele, Matteo. "Convolutional Neural Networks for the Classification of Olive Oil Geographical Origin." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Find full text
Abstract:
This work proposes a deep learning approach to a multi-class classification problem. In particular, the project goal is to establish whether there is a connection between the molecular composition of olive oil and its geographical origin. To accomplish this, we implement a method to transform structured data into meaningful images (exploring the existing literature) and develop a fine-tuned Convolutional Neural Network able to perform the classification, together with a series of tailored techniques to improve the model.
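The abstract's transformation of structured data into "meaningful images" can be sketched as a vector-to-grid mapping. The row-major fill and min-max scaling used here are hypothetical stand-ins for the literature-derived transform the thesis actually implements.

```python
# Turn a feature vector (e.g. molecular-composition measurements) into a
# 2-D grid of values in [0, 1] that a CNN could consume as a one-channel
# image. Row-major fill with zero padding; an illustrative convention only.

def vector_to_image(features, width):
    """Min-max scale a feature vector and reshape it into rows of `width`."""
    lo, hi = min(features), max(features)
    scale = (hi - lo) or 1.0                     # avoid division by zero
    scaled = [(f - lo) / scale for f in features]
    scaled += [0.0] * (-len(scaled) % width)     # pad the last row with zeros
    return [scaled[i:i + width] for i in range(0, len(scaled), width)]
```

Real transforms in the literature instead place correlated features near each other so the CNN's local filters can exploit that structure; the uniform fill here is the simplest possible placeholder.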
APA, Harvard, Vancouver, ISO, and other styles
10

Bianchi, Eric Loran. "COCO-Bridge: Common Objects in Context Dataset and Benchmark for Structural Detail Detection of Bridges." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/87588.

Full text
Abstract:
Common Objects in Context for bridge inspection (COCO-Bridge) was introduced for use by unmanned aircraft systems (UAS) to assist in GPS-denied environments, flight planning, and detail identification and contextualization, but it has far-reaching applications such as augmented reality (AR) and other artificial intelligence (AI) platforms. COCO-Bridge is an annotated dataset which can be used to train a convolutional neural network (CNN) to identify specific structural details. Many annotated datasets have been developed to detect regions of interest in images for a wide variety of applications and industries. While some annotated datasets of structural defects (primarily cracks) have been developed, most efforts are individualized and focus on a small niche of the industry. This effort initiated a benchmark dataset with a focus on structural details. This research investigated the parameters required for detail identification and evaluated performance enhancements to the annotation process. The image dataset consisted of four structural details which are commonly reviewed and rated during bridge inspections: bearings, cover plate terminations, gusset plate connections, and out-of-plane stiffeners. This initial version of COCO-Bridge includes a total of 774 images: 10% for evaluation and 90% for training. Several models were used with the dataset to evaluate model overfitting and the performance enhancements from augmentation and the number of iteration steps. Methods to economize the predictive capabilities of the model without the addition of unique data were investigated to reduce the required number of training images. Results from model tests indicated the following: additional images, mirrored along the vertical axis, provided precision and accuracy enhancements; increasing computational step iterations improved predictive precision and accuracy; and the optimal confidence threshold for operation was 25%. Annotation recommendations and improvements were also discovered and documented as a result of the research.

MS

Common Objects in Context for bridge inspection (COCO-Bridge) was introduced to improve a drone-conducted bridge inspection process. Drones are a great tool for bridge inspectors because they bring flexibility and access to the inspection. However, drones have a notoriously difficult time operating near bridges, because the signal can be lost between the operator and the drone. COCO-Bridge is an image-based dataset that uses artificial intelligence (AI) as a solution to this particular problem, but it has applications in other facets of the inspection as well. This effort initiated a dataset with a focus on identifying specific parts of a bridge, or structural bridge elements. This would allow a drone to fly without explicit direction if the signal were lost, and also has the potential to extend its flight time. Extending flight time and operating autonomously are great advantages for drone operators and bridge inspectors. The output from COCO-Bridge would also help inspectors identify areas that are prone to defects by highlighting regions that require inspection. The image dataset consisted of 774 images used to detect four structural bridge elements which are commonly reviewed and rated during bridge inspections. The goal is to continue to increase the number of images and encompass more structural bridge elements in the dataset so that it may be used for all types of bridges. Methods to reduce the required number of images were investigated, because gathering images of structural bridge elements is challenging. The results from model tests helped build a roadmap for the expansion and best practices for developing a dataset of this type.
APA, Harvard, Vancouver, ISO, and other styles
11

Buratti, Luca. "Visualisation of Convolutional Neural Networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Find full text
Abstract:
Neural networks, and convolutional neural networks in particular, have recently demonstrated extraordinary results in various fields. Unfortunately, there is still no clear understanding of why these architectures work so well, and it is especially difficult to explain their behavior when they fail. This lack of clarity is what separates these models from being applied in concrete, safety-critical real-world scenarios such as healthcare or self-driving cars. For this reason, several studies have been carried out in recent years to create methods capable of explaining what is happening inside a neural network, or where the network is looking in order to make a given prediction. These techniques are the focus of this thesis and the bridge between the two case studies presented below. The aim of this work is therefore twofold: first, to use these methods to analyze, and thus understand how to improve, applications based on convolutional neural networks; and second, to use them to investigate the generalization capability of these architectures.
APA, Harvard, Vancouver, ISO, and other styles
12

Truzzi, Stefano. "Event classification in MAGIC through Convolutional Neural Networks." Doctoral thesis, Università di Siena, 2022. http://hdl.handle.net/11365/1216295.

Full text
Abstract:
The Major Atmospheric Gamma Imaging Cherenkov (MAGIC) telescopes are able to detect gamma rays from the ground, with energies beyond several tens of GeV, emitted by the most energetic known objects, including pulsar wind nebulae, active galactic nuclei, and gamma-ray bursts. Gamma rays and cosmic rays are detected by imaging the Cherenkov light produced by the charged superluminal leptons in the extended air shower originated when the primary particle interacts with the atmosphere. These Cherenkov flashes brighten the night sky for short times on the nanosecond scale. From the image topology and other observables, gamma rays can be separated from the unwanted cosmic rays, and thereafter the incoming direction and energy of the primary gamma rays can be reconstructed. The standard algorithm for gamma/hadron separation in MAGIC data analysis is the so-called Random Forest, which works on a parametrization of the stereo events based on the shower image parameters. Until a few years ago, these algorithms were limited by computational resources, but modern devices such as GPUs make it possible to work efficiently on the pixel-map information. Most neural network applications in the field perform training on Monte Carlo simulated data for the gamma-ray sample; this choice is prone to systematics arising from discrepancies between observational data and simulations. Instead, in this thesis I trained a known neural network scheme with observational data from a giant flare of the bright TeV blazar Mrk421 observed by MAGIC in 2013. With this method for gamma/hadron separation, the preliminary results compete with the standard MAGIC analysis based on Random Forest classification, which shows the potential of this approach for further improvement. This thesis first gives an introduction to high-energy astrophysics and astroparticle physics. The cosmic messengers are briefly reviewed, with a focus on photons; then astronomical sources of γ rays are described, followed by a description of the detection techniques. The second chapter describes the MAGIC analysis pipeline, from low-level data acquisition to high-level data, and details the MAGIC Instrument Response Functions; finally, the most important astronomical sources used in the standard MAGIC analysis are listed. The third chapter is devoted to deep neural network techniques, starting with a historical excursus on artificial intelligence followed by a description of machine learning; the basic principles behind an artificial neural network and the convolutional neural network used for this work are explained. The last chapter describes my original work, showing in detail the data selection and manipulation for training the Inception-ResNet-v2 convolutional neural network and the preliminary results obtained from four test sources.
APA, Harvard, Vancouver, ISO, and other styles
13

Petrocelli, Danilo. "Reti neurali convoluzionali per il riconoscimento facciale sul robot NAO." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017.

Find full text
Abstract:
Starting from a study of the current facial recognition system on the NAO robot, this thesis seeks an alternative solution, based on the recognition of facial features, that can be used to extend the recognition system currently in use by the robot. The existing solutions for face detection and recognition on the NAO are based on ready-to-use software libraries: the vision module AlFacedetect detects a face and subsequently recognizes it, while learning takes place through the Learn Face module of the Choregraphe software developed by Aldebaran Robotics. To find alternative solutions to these problems, the thesis presents the software developed and the tools used in each of the phases above, none of which relies on the NAO's built-in modules. The initial phase consisted of building the datasets, one for each facial feature considered, from which the training sets needed to train convolutional neural networks to classify the selected facial features were derived. For face detection, well-known algorithms from the literature, implemented in the OpenCV computer vision library, were used and dynamically adapted to the task at hand, that is, the facial feature to be located. In this way the potential and efficiency of convolutional neural networks in classification tasks could be fully exploited, guaranteeing high precision, in terms of accuracy, in the recognition phase. The final phase consisted of porting this work to the NAO robot, so that a given facial feature can be predicted from an image acquired through its camera.
APA, Harvard, Vancouver, ISO, and other styles
14

Ghibellini, Alessandro. "Trend prediction in financial time series: a model and a software framework." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24708/.

Full text
Abstract:
This research aims to build an autonomous support for traders, which in the future could be turned into an active ETF. The thesis places a strong focus on problem formulation and an accurate analysis of how the input and the length of the future horizon affect the results. I demonstrate that, using financial indicators already employed by professional traders every day and choosing a correct length for the future horizon, it is possible to reach interesting scores in forecasting future market states without an expensive deep learning approach: accuracy is around 90% in all the experiments, and the confusion matrices confirm the good accuracy scores. In particular, I used a 1D CNN. I also emphasize that classification appears to be the best approach for this type of prediction, in combination with proper management of unbalanced class weights; class imbalance is the norm in this setting, and without handling it the model reacts to inconsistent trend movements. Finally, I propose a framework, also applicable to other fields, that exploits the presence of domain experts and combines their information with ML/DL approaches.
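The "proper management of unbalanced class weights" the abstract stresses is commonly done with inverse-frequency weighting, sketched below. This is a standard heuristic (the same idea behind scikit-learn's "balanced" mode), not necessarily the author's exact weighting scheme.

```python
# Inverse-frequency class weights: each class gets weight n / (k * count),
# so rare market states (e.g. trend reversals) contribute as much to the
# training loss as common ones. Standard heuristic, shown for illustration.

from collections import Counter

def class_weights(labels):
    """Map each class label to its inverse-frequency weight."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}
```

The resulting dictionary can be passed, for example, as the `class_weight` argument when fitting a Keras model, so minority-class errors are penalized more heavily.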
APA, Harvard, Vancouver, ISO, and other styles
15

Dahl, Jonas. "Feature Selection for Sentiment Analysis of Swedish News Article Titles." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233534.

Full text
Abstract:
The aim of this study was to explore the possibilities of sentiment analysis of Swedish news article titles using machine learning approaches, and to find how the text is best represented under such conditions. Sentiment analysis has traditionally been conducted by part-of-speech tagging and counting word polarities, which performs well for large domains and in the absence of large sets of training data. For narrower domains and previously labeled data, supervised learning can be used. This thesis tested the performance of a convolutional neural network and a support vector machine on different sets of data. The data sets were constructed to represent various language features, including, for example, a simple unigram bag-of-words model storing word counts, a bigram bag-of-words model that includes the ordering of words, and an integer-vector summary of the title. The study concluded that each of the tested feature sets gave information about the sentiment to varying extents. The neural network approach with all feature sets combined performed better than the two annotators of the study. Despite the limited data set, overfitting did not seem to be a problem when using the features together.
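The unigram and bigram bag-of-words feature sets described above can be sketched in miniature with the standard library; vocabulary indexing and vectorization are omitted for brevity.

```python
# Two of the feature representations compared in the thesis, in miniature:
# unigram counts keep word frequencies, while bigram counts additionally
# capture local word order.

from collections import Counter

def unigram_bow(tokens):
    """Bag of words: count each token."""
    return Counter(tokens)

def bigram_bow(tokens):
    """Bag of bigrams: count each adjacent token pair."""
    return Counter(zip(tokens, tokens[1:]))
```

In practice these counts would be mapped onto a fixed vocabulary to form the input vectors fed to the SVM or the neural network.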
APA, Harvard, Vancouver, ISO, and other styles
16

Talevi, Luca, and Luca Talevi. "“Decodifica di intenzioni di movimento dalla corteccia parietale posteriore di macaco attraverso il paradigma Deep Learning”." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/17846/.

Full text
Abstract:
Invasive Brain Computer Interfaces (BCIs) can restore mobility to patients who have lost control of their limbs: this is achieved by decoding bioelectric signals recorded from cortical areas of interest in order to drive a prosthetic limb. Neural signal decoding is therefore a critical point in BCIs, requiring the development of high-performing, reliable and robust algorithms. These requirements are met in many fields by Deep Neural Networks, adaptive algorithms whose performance scales with the amount of data provided, in line with the growing number of electrodes in implants. Using signals pre-recorded from the cortex of two macaques during reach-to-grasp movements towards 5 different objects, I tested three basic, notable examples of DNNs (a multilayer dense network, a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN)) on the task of discriminating, continuously and in real time, the intention to move towards each object. In particular, I tested each model's ability to decode a generic movement intention (single-class), the performance of the best resulting network in discriminating between them (multi-class) with or without ensemble-learning methods, and its response to degradation of the input signal. To facilitate comparison, each network was built and hyperparameter-searched following common criteria. The CNN architecture obtained particularly interesting results, achieving F-scores above 0.6 and AUCs above 0.9 in the single-class case with half the parameters of the other networks, yet greater robustness. It also showed a quasi-linear response to signal degradation, free of unpredictable performance collapses. The DNNs employed proved high-performing and robust despite their simplicity, making purpose-designed architectures promising candidates for establishing a new state of the art in neuroprosthetic control.
APA, Harvard, Vancouver, ISO, and other styles
17

Šůstek, Martin. "Word2vec modely s přidanou kontextovou informací." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2017. http://www.nusl.cz/ntk/nusl-363837.

Full text
Abstract:
This thesis is concerned with explaining word2vec models. Even though word2vec was introduced only recently (2013), many researchers have already tried to extend, understand, or at least use the model, because it provides surprisingly rich semantic information. This information is encoded in an N-dimensional vector representation and can be recalled by performing algebraic operations on the vectors. In addition, I suggest modifications of the model in order to obtain different word representations. To achieve that, I use public picture datasets. This thesis also includes parts dedicated to a word2vec extension based on convolutional neural networks.
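The algebraic recall of semantic information mentioned above can be illustrated with the classic analogy test; the three-dimensional vectors below are hand-made toys for the sketch, not real learned word2vec embeddings.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 3-dim "embeddings" (illustrative only; real word2vec vectors are learned).
vecs = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}

# The classic analogy: king - man + woman should land nearest to queen.
target = vecs["king"] - vecs["man"] + vecs["woman"]
best = max((w for w in vecs if w != "king"), key=lambda w: cosine(target, vecs[w]))
```

With real word2vec vectors the same arithmetic recovers many such relations.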
APA, Harvard, Vancouver, ISO, and other styles
18

Lang, Matěj. "Detekce vad vláknitého materiálu užitím metod strojového učení." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-400649.

Full text
Abstract:
The aim of this master's thesis is to automate the detection of defects in fibrous materials. For more than fifty years, the company SILON has produced fine staple fibre from recycled PET bottles. This fibre is subsequently used in construction and in the automotive industry, but most often in feminine hygiene products and baby diapers. The company aims to produce the highest-quality product possible, so every batch is tested in a laboratory against several strict criteria. One of the tests measures the amount of defective fibres, such as tangled clumps of fibres or undrawn fibres, which are hard and break easily. The proposed system consists of an imaging bench that works as a scanner and captures a fibre sample placed between two glass plates. A series of tests with different lighting was carried out to verify the properties of Rhodamine, the dye used precisely to distinguish defects from the other fibres. These defects generally have a different molecular structure, to which the dye binds more readily. Because Rhodamine is a fluorescent dye, it is easier to recognize, for example, under UV light; this is the procedure used in manual detection. When imaging with a camera, a filter on the camera can help by blocking the excitation light and passing only the light emitted by the Rhodamine. Building the scanner also involved writing its control software: a custom library was created for motor control and an existing camera library was adapted. Both systems could then be operated through a single GUI that handled capturing an image of the whole plate. A series of images was captured with the scanner and had to be annotated so that a computer could learn to distinguish the defects. Annotation was done at the pixel level; each defect was marked in a graphics editor in a dedicated layer. An artificial neural network based on convolutions was used for the classification. The network is moreover fully convolutional, so its output is an image that should mark the defective pixels in the original. The results of the trained network are presented and discussed in the thesis.
The network was able to learn to recognize most defects and can detect and segment them reliably. It currently struggles with blurred defects at the edges of the field of view and with defects whose boundaries are less distinct in the input images. It should be noted that the customer is interested in a complete solution comprising the scanner and detection software, and development of the device will continue beyond the conclusion of this thesis.
APA, Harvard, Vancouver, ISO, and other styles
19

Drottsgård, Alexander, and Jens Andreassen. "Effektivisering av automatiserad igenkänning av registreringsskyltar med hjälp av artificiella neurala nätverk för användning inom smarta hem." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20803.

Full text
Abstract:
The concept of automated recognition and reading of license plates has evolved considerably in recent years, and the use of artificial neural networks has been introduced on a small scale with promising results. We looked into the possibility of using this in an automated garage-port system and implemented a prototype for testing. The traditional process for reading a license plate requires multiple steps, sometimes up to five. These steps each carry a margin of error, which aggregated can lead to over a 30% risk of failure.
In this paper we address this issue with the help of artificial neural networks. We developed a process with only two steps for reading a license plate: (1) localize the license plate, (2) read the characters on the plate. This halves the number of steps of the traditional process and reduces the risk of error by 13%. We performed a literature review to find the best-suited algorithm for localizing the license plate in our specific environment, with its constraints and possibilities in mind. We chose Faster R-CNN, an algorithm that uses multiple artificial neural networks. We used the Design and Creation method to implement a proof-of-concept prototype using our approach, which showed that this is possible in a real environment.
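The error-aggregation argument above (several sequential stages, each with its own error margin, compounding to over 30% failure risk) can be sketched numerically; the 7% per-step error rate below is an illustrative assumption, not a figure from the thesis.

```python
def pipeline_failure_risk(step_error_rates):
    """Probability that at least one stage of a sequential pipeline fails,
    assuming independent per-step error rates."""
    ok = 1.0
    for e in step_error_rates:
        ok *= (1.0 - e)
    return 1.0 - ok

# Five stages at ~7% error each aggregate to over 30% failure risk,
# while a two-stage pipeline at the same per-step rate stays far lower.
five_step = pipeline_failure_risk([0.07] * 5)
two_step = pipeline_failure_risk([0.07] * 2)
```

This is why collapsing the pipeline from five steps to two reduces the aggregate risk so sharply even when the per-step accuracy is unchanged.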
APA, Harvard, Vancouver, ISO, and other styles
20

Baronti, Mattia. "Identificazione e localizzazione del danno in strutture reticolari mediante modi di vibrare e reti neurali artificiali." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Find full text
Abstract:
This thesis aims to define an investigation method for identifying and assessing damage in truss structures using mode shapes and artificial neural networks. In civil engineering, Structural Health Monitoring (SHM) denotes the monitoring methodologies that allow structural anomalies to be detected, enabling continuous, automated assessment of a structure's state of health. Deep learning methodologies applied to SHM have been at the centre of scientific research in recent years, thanks to technological progress and the introduction of computing tools with considerable computational capacity, able to process large amounts of data. In this context, the proposed supervised-learning methodology is based on a convolutional neural network (CNN) that, given a dataset of dynamic information for the anticipated damage configurations, can recognize and classify the structural condition of a structure, identifying, localizing and quantifying any damage. The classification of the structural condition relies on network-specific training performed on large amounts of information generated analytically by a structural model. Processing the example data allows the network to automatically identify the relevant features of the problem and to predict the structural condition for previously unseen input data. In this analysis the Modal Assurance Criterion (MAC), a tool for comparing mode shapes, is used as the structural-health indicator. The damage-identification problem is evaluated, laying the groundwork for further analyses, by applying the method to two case studies concerning a planar and a spatial model, respectively.
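The Modal Assurance Criterion used as the health indicator compares two mode shapes; a minimal NumPy sketch follows, with hypothetical mode-shape vectors standing in for measured data.

```python
import numpy as np

def mac(phi_a, phi_b):
    """Modal Assurance Criterion between two mode-shape vectors:
    MAC = |phi_a^T phi_b|^2 / ((phi_a^T phi_a) (phi_b^T phi_b)).
    1.0 means fully consistent shapes, 0.0 means orthogonal shapes."""
    num = np.dot(phi_a, phi_b) ** 2
    den = np.dot(phi_a, phi_a) * np.dot(phi_b, phi_b)
    return float(num / den)

# Hypothetical shapes: MAC is insensitive to amplitude scaling,
# but sensitive to a change in shape such as damage might cause.
undamaged = np.array([1.0, 0.8, 0.3])
scaled = 2.0 * undamaged
damaged = np.array([1.0, 0.5, 0.9])
```

A matrix of MAC values between baseline and measured modes is the kind of input the CNN can classify.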
APA, Harvard, Vancouver, ISO, and other styles
21

Birindwa, Fleury. "Prestandajämförelse mellan Xception, InceptionV3 och MobileNetV2 för bildklassificering på nätpaneler." Thesis, Jönköping University, JTH, Datateknik och informatik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-51351.

Full text
Abstract:
In recent years, deep learning models have been used in almost all areas, from industry to academia, particularly for image classification. However, these models are huge, with millions of parameters, which makes them difficult to deploy to smaller devices with limited resources such as mobile phones.
This study addresses lightweight pre-trained convolutional neural network models that represent the state of the art in deep learning and whose size is suitable as a base model for mobile application development. The purpose of this study is to evaluate the performance of Xception, InceptionV3 and MobileNetV2 in order to facilitate the selection of a lightweight convolutional network as the base for developing mobile image-classification applications. To achieve this purpose, the models were implemented using the transfer learning method and designed to distinguish images of mesh panels from the company Troax. The study describes the method that allows knowledge to be transferred from an existing model to a new model, explains how the training and test processes were carried out, and analyses the results. The results showed that Xception reached 86% accuracy with a processing time of 10 minutes on 2000 training images and 1000 test images. Xception's performance was the best among these models. The difference between Xception and InceptionV3 was 10 percentage points in accuracy and 2 minutes in processing time; between Xception and MobileNetV2 it was 23 percentage points in accuracy and 3 minutes in processing time. The experiments showed that these models performed less well with fewer than 800 training images. Above 800 images, each model began to predict with over 70% accuracy.
APA, Harvard, Vancouver, ISO, and other styles
22

Heidari, Jawid. "Classifying Material Defects with Convolutional Neural Networks and Image Processing." Thesis, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-387797.

Full text
Abstract:
Fantastic progress has been made in the field of machine learning and deep neural networks in the last decade. Deep convolutional neural networks (CNNs) have been hugely successful in image classification and object detection. These networks can automate many industrial processes and increase efficiency. One such process is image classification with various CNN models. This thesis addressed two different approaches to the same problem. The first approach implemented two CNN models to classify images. The large pre-trained VGG model was retrained using so-called transfer learning, training only the top layers of the network. The other model was a smaller one with customized layers. The trained models are an end-to-end solution: the input is an image, and the output is a class score. The second strategy implemented several classical image-processing algorithms to detect the individual defects present in the pictures. This method worked as a rule-based object-detection algorithm. The Canny edge-detection algorithm, combined with two mathematical-morphology operations, formed the backbone of this strategy. Sandvik Coromant operators gathered the approximately 1000 microscopy images used in this thesis. Sandvik Coromant is a leading producer of high-quality metal cutting tools. During the manufacturing process, some unwanted defects occur in the products. These defects are analyzed by taking images with a conventional microscope with 100x and 1000x magnification. The three essential defect types investigated in this thesis are defined as Por, Macro and Slits. Experiments conducted during this thesis show that CNN models are a good approach to classifying impurities and defects in the metal industry; the potential is high. The validation accuracy reached circa 90 percent, and the final evaluation accuracy was around 95 percent, which is an acceptable result. The pre-trained VGG model reached a much higher accuracy than the customized model.
The Canny edge-detection algorithm, combined with dilation, erosion and contour detection, produced a good result: it detected the majority of the defects present in the images.
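The mathematical-morphology step paired with Canny edges can be illustrated in pure NumPy; in practice one would use OpenCV (cv2.Canny, cv2.dilate, cv2.erode), but this self-contained sketch shows why a closing operation (dilate then erode) helps contour detection find closed defect boundaries. The edge map below is a hand-made toy example.

```python
import numpy as np

def dilate(img, k=3):
    """Binary dilation with a k x k structuring element (pure-NumPy sketch)."""
    pad = k // 2
    p = np.pad(img, pad, constant_values=0)
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = p[i:i + k, j:j + k].max()
    return out

def erode(img, k=3):
    """Binary erosion with a k x k structuring element."""
    pad = k // 2
    p = np.pad(img, pad, constant_values=0)
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = p[i:i + k, j:j + k].min()
    return out

# A broken horizontal edge with a one-pixel gap, as Canny often produces.
edges = np.zeros((7, 7), dtype=np.uint8)
edges[3, 1:3] = 1
edges[3, 4:6] = 1

# Closing bridges the gap so a contour detector sees one connected boundary.
closed = erode(dilate(edges))
```

The nested loops are deliberately naive for clarity; OpenCV performs the same operations far faster.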
APA, Harvard, Vancouver, ISO, and other styles
23

Motembe, Dodi. "Investigation of hierarchical deep neural network structure for facial expression recognition." Diss., 2020. http://hdl.handle.net/10500/27389.

Full text
Abstract:
Facial expression recognition (FER) is still a challenging problem, and machines struggle to comprehend effectively the dynamic shifts in the facial expressions of human emotions. The existing systems that have proven effective consist of deeper network structures that need powerful and expensive hardware. The deeper the network, the longer the training and testing, and many systems use expensive GPUs to speed up the process. To remedy the above challenges while maintaining the main goal of improving the recognition accuracy, we create a generic hierarchical structure with variable settings. This generic structure has a hierarchy of three convolutional blocks, two dropout blocks and one fully connected block. From this generic structure we derived four different network structures to be investigated according to their performance. From each network structure case, we again derived six network structures by varying the parameters under analysis: the filter sizes of the convolutional maps and the max-pooling, as well as the number of convolutional maps. In total, we have 24 network structures to investigate, six per case. The results of many repeated experiments showed that case 1a emerged as the top performer of its group, while case 2a, case 3c and case 4c outperformed the others in their respective groups. A comparison of the winners of the four groups indicates that case 2a is the optimal structure with optimal parameters; its network structure outperformed the other group winners. The best network structure was chosen by considering the minimum, average and maximum accuracy over 15 repetitions of training and analysis of the results. All 24 proposed network structures were tested using two of the most widely used FER datasets, CK+ and JAFFE.
After repeated simulations, the results demonstrate that our inexpensive optimal network architecture achieved 98.11% accuracy on the CK+ dataset. We also tested our optimal network architecture on the JAFFE dataset, where the experimental results show 84.38% accuracy using just a standard CPU and simpler procedures. We also compared the four group winners with the performance of other existing FER models recorded recently in two studies that used the same two datasets, CK+ and JAFFE. Three of our four group winners (case 1a, case 2a and case 4c) recorded only 1.22% less than the accuracy of the top-performing model on the CK+ dataset, and two of our network structures, case 2a and case 3c, came in third, beating the other models on the JAFFE dataset.
Electrical and Mining Engineering
APA, Harvard, Vancouver, ISO, and other styles
24

Kalgaonkar, Priyank B. "AI on the Edge with CondenseNeXt: An Efficient Deep Neural Network for Devices with Constrained Computational Resources." Thesis, 2021. http://dx.doi.org/10.7912/C2/64.

Full text
Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
Research work presented within this thesis proposes a neoteric variant of deep convolutional neural network architecture, CondenseNeXt, designed specifically for ARM-based embedded computing platforms with constrained computational resources. CondenseNeXt is an improved version of CondenseNet, the baseline architecture whose roots can be traced back to ResNet. CondenseNeXt replaces the group convolutions in CondenseNet with depthwise separable convolutions and introduces group-wise pruning, a model-compression technique that prunes (removes) redundant and insignificant elements that either are irrelevant or do not affect the performance of the network upon disposition. Cardinality, a new dimension in addition to the existing spatial dimensions, and a class-balanced focal loss function, a weighting factor inversely proportional to the number of samples, have been incorporated into the design of CondenseNeXt's algorithm in order to relieve the harsh effects of pruning. Furthermore, extensive analyses of this novel CNN architecture were performed on three benchmark image datasets: CIFAR-10, CIFAR-100 and ImageNet, by deploying the trained weights onto an ARM-based embedded computing platform, the NXP BlueBox 2.0, for real-time image classification. The outputs are observed in real time in the RTMaps Remote Studio console to verify the correctness of the predicted classes. CondenseNeXt achieves state-of-the-art image classification performance on the three benchmark datasets, including CIFAR-10 (4.79% top-1 error), CIFAR-100 (21.98% top-1 error) and ImageNet (7.91% single-model, single-crop top-5 error), and up to a 59.98% reduction in forward FLOPs compared to CondenseNet. CondenseNeXt can also achieve a final trained model size of 2.9 MB, at the cost of a 2.26% loss in accuracy. It thus performs image classification on ARM-based computing platforms with outstanding efficiency, without requiring CUDA-enabled GPU support.
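The class-balanced weighting mentioned in this abstract (a factor inversely proportional to the number of samples per class) is commonly computed from the "effective number" of samples, as in the published class-balanced focal loss of Cui et al. (2019). A minimal sketch of that weighting follows; the class counts and beta value are illustrative assumptions, not figures from the thesis.

```python
def class_balanced_weights(samples_per_class, beta=0.999):
    """Class-balanced weighting: weight_c = (1 - beta) / (1 - beta^n_c),
    i.e. inversely proportional to the 'effective number' of samples in
    class c, so rare classes get larger loss weights."""
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    k = len(raw)
    s = sum(raw)
    # Normalize so the weights sum to the number of classes.
    return [w * k / s for w in raw]

# Hypothetical imbalanced dataset: a common, a moderate, and a rare class.
w = class_balanced_weights([5000, 500, 50])
```

These weights multiply the focal loss term per class, counteracting the imbalance that pruning can aggravate.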
APA, Harvard, Vancouver, ISO, and other styles
25

Niedermayer, Graeme. "Investigations of calorimeter clustering in ATLAS using machine learning." Thesis, 2017. https://dspace.library.uvic.ca//handle/1828/8970.

Full text
Abstract:
The Large Hadron Collider (LHC) at CERN is designed to search for new physics by colliding protons with a center-of-mass energy of 13 TeV. The ATLAS detector is a multipurpose particle detector built to record these proton-proton collisions. In order to improve sensitivity to new physics at the LHC, luminosity increases are planned for 2018 and beyond. With this greater luminosity comes an increase in the number of simultaneous proton-proton collisions per bunch crossing (pile-up). This extra pile-up has adverse effects on algorithms for clustering the ATLAS detector's calorimeter cells. These adverse effects stem from overlapping energy deposits originating from distinct particles and could lead to difficulties in accurately reconstructing events. Machine learning algorithms provide a new tool with the potential to improve clustering performance. Recent developments in computer science have given rise to a new set of machine learning algorithms that, in many circumstances, outperform more conventional algorithms. One of these algorithms, the convolutional neural network, has been shown to have impressive performance when identifying objects in 2D or 3D arrays. This thesis develops a convolutional neural network model for calorimeter cell clustering and compares it to the standard ATLAS clustering algorithm.
Graduate
APA, Harvard, Vancouver, ISO, and other styles
26

(10911822), Priyank Kalgaonkar. "AI on the Edge with CondenseNeXt: An Efficient Deep Neural Network for Devices with Constrained Computational Resources." Thesis, 2021.

Find full text
Abstract:
Research work presented within this thesis proposes a neoteric variant of deep convolutional neural network architecture, CondenseNeXt, designed specifically for ARM-based embedded computing platforms with constrained computational resources. CondenseNeXt is an improved version of CondenseNet, the baseline architecture whose roots can be traced back to ResNet. CondenseNeXt replaces the group convolutions in CondenseNet with depthwise separable convolutions and introduces group-wise pruning, a model-compression technique that prunes (removes) redundant and insignificant elements that either are irrelevant or do not affect the performance of the network upon disposition. Cardinality, a new dimension in addition to the existing spatial dimensions, and a class-balanced focal loss function, a weighting factor inversely proportional to the number of samples, have been incorporated into the design of CondenseNeXt's algorithm in order to relieve the harsh effects of pruning. Furthermore, extensive analyses of this novel CNN architecture were performed on three benchmark image datasets: CIFAR-10, CIFAR-100 and ImageNet, by deploying the trained weights onto an ARM-based embedded computing platform, the NXP BlueBox 2.0, for real-time image classification. The outputs are observed in real time in the RTMaps Remote Studio console to verify the correctness of the predicted classes. CondenseNeXt achieves state-of-the-art image classification performance on the three benchmark datasets, including CIFAR-10 (4.79% top-1 error), CIFAR-100 (21.98% top-1 error) and ImageNet (7.91% single-model, single-crop top-5 error), and up to a 59.98% reduction in forward FLOPs compared to CondenseNet. CondenseNeXt can also achieve a final trained model size of 2.9 MB, at the cost of a 2.26% loss in accuracy. It thus performs image classification on ARM-based computing platforms with outstanding efficiency, without requiring CUDA-enabled GPU support.
APA, Harvard, Vancouver, ISO, and other styles
27

(9811085), Anand Koirala. "Precision agriculture: Exploration of machine learning approaches for assessing mango crop quantity." Thesis, 2020. https://figshare.com/articles/thesis/Precision_agriculture_Exploration_of_machine_learning_approaches_for_assessing_mango_crop_quantity/13411625.

Full text
Abstract:
A machine vision based system is proposed to replace the current in-orchard manual estimates of mango fruit yield, to inform harvest resourcing and marketing. The state of the art in fruit detection was reviewed, highlighting the recent move from traditional image-segmentation methods to convolutional neural network (CNN) based deep learning methods. An experimental comparison of several deep learning based object-detection frameworks (single-shot detectors versus two-stage detectors) and several standard CNN architectures was undertaken for the detection of mango panicles and fruit in tree images. The machine vision system used images of individual trees captured at night from a moving platform mounted with a Global Navigation Satellite System (GNSS) receiver and an LED panel floodlight. YOLO, a single-shot object-detection framework, was re-designed and named MangoYOLO. MangoYOLO outperformed existing state-of-the-art deep learning object-detection frameworks in fruit detection time and accuracy, and was robust in use across different cultivars and cameras. MangoYOLO achieved an F1 score of 0.968 and an average precision of 0.983, and required just 70 ms per image (2048 × 2048 pixels) and 4417 MB of memory. The annotated image dataset was made publicly available. Approaches were trialled to relate the fruit counts from tree images to the actual harvest count at the individual-tree level. Machine vision based estimates of fruit load ranged from -11% to +14% of packhouse fruit counts. However, estimation of fruit yield (t/ha) requires estimation of fruit size as well as fruit number. A fruit-sizing app for smartphones was developed as an affordable in-field solution, based on segmentation of the fruit in the image using colour features and estimation of the camera-to-fruit-perimeter distance based on the use of fruit allometrics.
For mango fruit, RMSEs of 5.3 and 3.7 mm were achieved on length and width measurements under controlled lighting, and RMSEs of 5.5 and 4.6 mm were obtained in-field under ambient lighting. Further, estimation of harvest timing can be informed by an assessment of the spread of flowering. Deep learning object-detection methods were deployed to assess the number and development stage of mango panicles on tree, with methods implemented to deal with the different orientations of flower panicles in tree images. An R2 > 0.8 was achieved between the machine vision count of panicles in images and the in-field human count per tree. Similarly, a mean average precision of 69.1% was achieved for the classification of panicle stages. These machine vision systems form a foundation for the estimation of crop load and harvest timing, and for automated harvesting.
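The distance-based fruit sizing described above can be illustrated with the basic pinhole-camera relation; the function name and all numbers below are hypothetical, and the thesis's app additionally infers the camera-to-fruit distance from fruit allometrics rather than assuming it known.

```python
def fruit_dimension_mm(pixel_extent, distance_mm, focal_length_px):
    """Pinhole-camera sizing: real extent = pixel extent * distance / focal
    length (focal length expressed in pixels). A sketch of the geometric
    idea behind camera-to-fruit-distance based fruit sizing."""
    return pixel_extent * distance_mm / focal_length_px

# Hypothetical numbers: a fruit spanning 300 px, imaged at 500 mm
# with a lens whose focal length corresponds to 1500 px.
length_mm = fruit_dimension_mm(300, 500.0, 1500.0)
```

Doubling the distance at a fixed pixel extent doubles the estimated size, which is why accurate distance estimation dominates the sizing error.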
APA, Harvard, Vancouver, ISO, and other styles
28

(8771429), Ashley S. Dale. "3D OBJECT DETECTION USING VIRTUAL ENVIRONMENT ASSISTED DEEP NETWORK TRAINING." Thesis, 2021.

Find full text
Abstract:
An RGBZ synthetic dataset consisting of five object classes in a variety of virtual environments and orientations was combined with a small sample of real-world image data and used to train the Mask R-CNN (MR-CNN) architecture in a variety of configurations. When the MR-CNN architecture was initialized with MS COCO weights and the heads were trained with a mix of synthetic and real-world data, F1 scores improved in four of the five classes: the average maximum F1 score of all classes and all epochs for the networks trained with synthetic data is F1* = 0.91, compared to F1 = 0.89 for the networks trained exclusively with real data, and the standard deviation of the maximum mean F1 score for synthetically trained networks is σ*_F1 = 0.015, compared to σ_F1 = 0.020 for the networks trained exclusively with real data. Varying the backgrounds in the synthetic data was shown to have a negligible impact on F1 scores, opening the door to abstract backgrounds and minimizing the need for intensive synthetic-data fabrication. When the MR-CNN architecture was initialized with MS COCO weights and depth data was included in the training data, the network was shown to rely heavily on the initial convolutional input to feed features into the network, the image depth channel was shown to influence mask generation, and the image color channels were shown to influence object classification. A set of latent variables for a subset of the synthetic dataset was generated with a Variational Autoencoder and then analyzed using Principal Component Analysis and Uniform Manifold Approximation and Projection (UMAP). The UMAP analysis showed no meaningful distinction between real-world and synthetic data, and a small bias towards clustering based on image background.
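The latent-variable analysis described above can be sketched with its linear counterpart: projecting latent vectors onto principal components via SVD. The random latents below stand in for the thesis's actual VAE outputs; UMAP itself is a separate nonlinear method.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components using SVD.
    Components are ordered by decreasing explained variance."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Stand-in for VAE latent vectors: 100 samples with 8 latent dimensions.
rng = np.random.default_rng(0)
latents = rng.normal(size=(100, 8))
proj = pca_project(latents, n_components=2)
```

Plotting the two projected columns, colored by real-vs-synthetic origin, is the kind of inspection the thesis performs with UMAP.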
APA, Harvard, Vancouver, ISO, and other styles