
Dissertations / Theses on the topic 'YOLOv5s'



Consult the top 31 dissertations / theses for your research on the topic 'YOLOv5s.'




1

Oškera, Jan. "Detekce dopravních značek a semaforů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-432850.

Full text
Abstract:
The thesis focuses on modern methods for detecting traffic signs and traffic lights, both directly in traffic and in retrospective analysis. The core technique is convolutional neural networks (CNNs), specifically detectors of the YOLO family. The main goal of the thesis is to optimize the models' speed and accuracy as far as possible. It surveys suitable datasets; a number of real and synthetic datasets are used for training and testing, preprocessed with the Yolo mark tool. Training of the model was carried out at a computing centre belonging to the virtual organization MetaCentrum VO. To evaluate detector quality quantitatively, a program was created that reports the detector's success statistically and graphically using ROC curves and the COCO evaluation protocol. The resulting model achieved an average success rate of up to 81 %. The thesis also identifies the best choice of threshold across model versions, sizes, and IoU values. Extensions for mobile phones in TensorFlow Lite and Flutter were also created.
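The threshold analysis above rests on intersection over union (IoU) between predicted and ground-truth boxes. A minimal sketch of that metric (the `(x1, y1, x2, y2)` box format is an assumption for illustration, not the thesis's own code):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlapping boxes -> 1/7 ~ 0.1429
```

A detection counts as correct when its IoU with a ground-truth box exceeds the chosen threshold, which is why the thesis sweeps thresholds when comparing model versions.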
APA, Harvard, Vancouver, ISO, and other styles
2

Kharel, Subash. "POTHOLE DETECTION USING DEEP LEARNING AND AREA ASSESSMENT USING IMAGE MANIPULATION." OpenSIUC, 2021. https://opensiuc.lib.siu.edu/theses/2825.

Full text
Abstract:
Every year, drivers spend over $3 billion to repair vehicle damage caused by potholes. Beyond the financial cost, potholes frustrate drivers, and with the emerging development of automated vehicles, road safety designed with automation in mind is becoming a necessity. Deep learning techniques offer intelligent alternatives for reducing the losses caused by potholes. Because the world is connected in such a way that information can be shared almost instantly, pothole information can be communicated to other vehicles and to the Department of Transportation for necessary action. A significant number of research efforts have aimed to help detect potholes in pavements. In this thesis, we compare two object detection algorithms belonging to the two major classes, single-shot detectors and two-stage detectors, on our dataset. Comparing Faster R-CNN and YOLOv5, we conclude that potholes occupy a small portion of the image, which makes pothole detection with YOLOv5 less accurate than with Faster R-CNN; however, keeping detection speed in mind, we suggest that YOLOv5 is the better solution for this task. Using the YOLOv5 model and image processing techniques, we calculate the approximate area of potholes and visualize their shape. The information thus obtained can be used by the Department of Transportation to plan necessary construction work, and to warn drivers about the severity of potholes based on their shape and area.
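The area estimate described above can be approximated by counting segmented pixels and scaling by the ground resolution of the (rectified) image. This is a hypothetical sketch of that idea, not the author's code:

```python
def pothole_area_m2(mask, metres_per_pixel):
    """Approximate real-world area from a binary segmentation mask.

    mask: 2D iterable of 0/1 values marking pothole pixels.
    metres_per_pixel: ground sampling distance of the rectified image.
    """
    pixel_count = sum(sum(row) for row in mask)
    # Each pixel covers a metres_per_pixel x metres_per_pixel patch of road.
    return pixel_count * metres_per_pixel ** 2

mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
print(pothole_area_m2(mask, 0.01))  # 8 pixels at 1 cm/px -> 0.0008 m^2
```

The accuracy of such an estimate depends entirely on how well the pixel-to-metre scale is known, which is why the thesis pairs detection with image manipulation to recover the pothole's shape.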
APA, Harvard, Vancouver, ISO, and other styles
3

Borngrund, Carl. "Machine vision for automation of earth-moving machines : Transfer learning experiments with YOLOv3." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-75169.

Full text
Abstract:
This master thesis investigates the possibility of creating a machine vision solution for the automation of earth-moving machines. This research was done because, without some type of vision system, it will not be possible to create a fully autonomous earth-moving machine that can safely be used around humans or other machines. Cameras were used as the primary sensors as they are cheap, provide high resolution, and are the type of sensor that most closely mimics the human visual system. The purpose of this master thesis was to use existing real-time object detectors together with transfer learning and examine whether they can successfully be used to extract information in environments such as construction, forestry, and mining. The amount of data needed to successfully train a real-time object detector was also investigated. Furthermore, the thesis examines whether there are situations that are specifically difficult for the defined object detector, how reliable the object detector is, and finally how service-oriented architecture principles can be used to create deep learning systems. To investigate these questions, three data sets were created in which different properties were varied: light conditions, ground material, and dump truck orientation. The data sets were created using a toy dump truck together with a similarly sized wheel loader with a camera mounted on the roof of its cab. The first data set contained only indoor images where the dump truck was placed in different orientations but neither the light nor the ground material changed. The second data set contained images where the light source was kept constant, but the dump truck orientation and ground material changed. The last data set contained images where all properties were varied. The real-time object detector YOLOv3 was used to examine how a real-time object detector would perform depending on which of the three data sets it was trained on.
No matter the data set, it was possible to train a model to perform real-time object detection. Using an Nvidia 980 Ti, the inference time of the model was around 22 ms, which is more than enough to classify videos running at 30 fps. All three data sets converged to a training loss of around 0.10. The data set containing the most varied data, in which all properties were changed, performed considerably better, reaching a validation loss of 0.164, while the indoor data set, containing the least varied data, only reached a validation loss of 0.257. The size of the data set was also a factor in performance, although not as important as having varied data. The results also showed that all three data sets could reach an mAP score of around 0.98 using transfer learning.
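The real-time claim above is simple arithmetic: inference must fit within the frame period of the video stream. As a sketch:

```python
def can_keep_up(inference_ms, fps):
    """True if per-frame inference fits within the frame period of a stream."""
    frame_budget_ms = 1000.0 / fps
    return inference_ms <= frame_budget_ms

print(1000.0 / 22)          # ~45.5 detections per second of capacity
print(can_keep_up(22, 30))  # True: 22 ms fits the ~33.3 ms frame budget
print(can_keep_up(22, 60))  # False: a 60 fps stream needs <= ~16.7 ms
```

At 22 ms per frame the detector could in principle sustain about 45 fps, comfortably above the 30 fps of the video used in the thesis.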
APA, Harvard, Vancouver, ISO, and other styles
4

Melcherson, Tim. "Image Augmentation to Create Lower Quality Images for Training a YOLOv4 Object Detection Model." Thesis, Uppsala universitet, Signaler och system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-429146.

Full text
Abstract:
Research in the Arctic is of ever-growing importance, and modern technology is used in new ways to map and understand this very complex region and how it is affected by climate change. Here, animals and vegetation are tightly coupled with their environment in a fragile ecosystem, and when the environment undergoes rapid changes it risks damaging these ecosystems severely. Understanding what kind of data has the potential to be used in artificial intelligence can be important, as many research stations have data archives from decades of work in the Arctic. In this thesis, a YOLOv4 object detection model was trained on two classes of images to investigate the performance impact of disturbances in the training data set. An expanded data set was created by augmenting the initial data to contain various disturbances. A model was successfully trained on the augmented data set, and a correlation between worse performance and the presence of noise was detected, but changes in saturation and altered colour levels seemed to have less impact than expected. Reducing noise in gathered data is seemingly more important than enhancing images with lacking colour levels. Further investigation with a larger and more thoroughly processed data set is required to gain a clearer picture of the impact of the various disturbances.
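One of the disturbances studied above, additive noise, can be sketched by perturbing pixel intensities while keeping them in the valid 8-bit range. The parameters here are illustrative, not those used in the thesis:

```python
import random

def add_gaussian_noise(pixels, sigma=10.0, seed=0):
    """Return a noisy copy of a flat list of 0-255 pixel intensities."""
    rng = random.Random(seed)  # seeded for reproducible augmentation
    noisy = []
    for p in pixels:
        q = p + rng.gauss(0.0, sigma)
        # Clamp so the result stays a valid 8-bit intensity.
        noisy.append(max(0, min(255, round(q))))
    return noisy

clean = [0, 64, 128, 192, 255]
print(add_gaussian_noise(clean))  # same length, values still within 0-255
```

Applying such transforms to copies of the original images is how an "expanded" data set with controlled disturbances can be built.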
APA, Harvard, Vancouver, ISO, and other styles
5

Norling, Samuel. "Tree species classification with YOLOv3 : Classification of Silver Birch (Betula pendula) and Scots Pine (Pinus sylvestris)." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-260244.

Full text
Abstract:
Automation of tree species classification during a forest inventory could potentially provide greater efficiency and better results for forest companies and stakeholding agencies. This thesis investigates how well a state-of-the-art object detection system, YOLOv3, performs this classification task. A new image dataset with pictures of Silver Birches and Scots Pines, called LilljanNet, was created to train YOLOv3. After training YOLOv3 on half the dataset, we validated it by testing against the other half. The trained model scored a mean average precision above 0.99. Training was also done with smaller sets of training data, and these models all achieved a mean average precision above 0.95. The results are promising, and further research should test the approach on smartphones and drones.
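Mean average precision, the metric quoted above, averages per-class average precision, where each AP is the area under a precision-recall curve. A toy sketch of AP using the rectangle rule over recall increments (an illustrative simplification of the interpolated variants used in practice):

```python
def average_precision(points):
    """Area under a precision-recall curve.

    points: list of (recall, precision) pairs sorted by increasing recall.
    Uses the rectangle rule over recall increments.
    """
    ap, prev_recall = 0.0, 0.0
    for recall, precision in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Perfect detector: precision 1.0 at every recall level -> AP = 1.0
print(average_precision([(0.5, 1.0), (1.0, 1.0)]))
# Precision halves over the second half of recall -> AP = 0.75
print(average_precision([(0.5, 1.0), (1.0, 0.5)]))
```

An mAP above 0.99 thus means the detector keeps near-perfect precision even as it recovers nearly all ground-truth trees.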
APA, Harvard, Vancouver, ISO, and other styles
6

Ståhl, Sebastian. "A tracking framework for a dynamic non- stationary environment." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288955.

Full text
Abstract:
As the use of unmanned aerial vehicles (UAVs) grows in popularity across the globe, their fields of application are constantly expanding. This thesis researches the possibility of using a UAV to detect, track, and geolocate a target in a dynamic, non-stationary environment such as the sea. In this setting, the varying projection and apparent size of the target in the captured images can lead to ambiguous coordinate assignments. A framework based on a UAV, a monocular camera, a GPS receiver, and the UAV's inertial measurement unit (IMU) is developed to detect, track, and geolocate targets. An object detection model, YOLOv3, was retrained to detect boats in UAV footage; it was selected for its ability to detect targets of small apparent size and its speed. The kernelized correlation filter (KCF) is adopted as the visual tracking algorithm, selected for its speed and accuracy. Reinitialization of the tracker, combined with a periodic update of the tracked bounding box, improved the tracker's performance. A geolocation method is developed to continuously estimate the GPS coordinates of the target; these estimates are used by the flight control method already developed by the stakeholder Airpelago to control the UAV. The experimental results are promising for all models. Due to inaccurate data, the true accuracy of the geolocation method could not be determined: the average error calculated with the inaccurate data is 19.5 meters, but an in-depth analysis of the results indicates that the method is more accurate than this. Hence, it is assumed that the model can estimate the GPS coordinates of a target with an error significantly lower than 19.5 meters.
It is thus concluded that it is possible to detect, track, and geolocate a target in a dynamic, non-stationary environment such as the sea.
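The 19.5-metre figure above is an average over per-estimate position errors. One way such an error could be computed from pairs of GPS fixes is the equirectangular approximation, valid for short baselines (this is my assumption of the method; the thesis does not publish its code):

```python
import math

EARTH_RADIUS_M = 6371000.0

def error_m(est, true):
    """Approximate distance in metres between two (lat, lon) points in degrees.

    Equirectangular approximation - adequate for baselines of tens of metres.
    """
    lat1, lon1, lat2, lon2 = map(math.radians, (*est, *true))
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)  # shrink lon by latitude
    y = lat2 - lat1
    return EARTH_RADIUS_M * math.hypot(x, y)

def mean_error_m(estimates, truths):
    """Average geolocation error over paired estimates and ground truths."""
    return sum(error_m(e, t) for e, t in zip(estimates, truths)) / len(truths)

# One degree of latitude is ~111 km, so 0.0001 deg is ~11.1 m:
print(error_m((57.7000, 11.9000), (57.7001, 11.9000)))
```

For the sub-kilometre distances involved here, the approximation differs negligibly from the full haversine formula.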
APA, Harvard, Vancouver, ISO, and other styles
7

Ye, Fanjie. "A Method of Combining GANs to Improve the Accuracy of Object Detection on Autonomous Vehicles." Thesis, University of North Texas, 2020. https://digital.library.unt.edu/ark:/67531/metadc1752364/.

Full text
Abstract:
As technology in the field of computer vision matures, autonomous vehicles have developed rapidly in recent years. However, the camera-based object detection and classification tasks of autonomous vehicles may face problems when the vehicle is driving at relatively high speed. One problem is that the camera collects blurred photos at high speed, which may affect the accuracy of deep neural networks; another is that small objects far from the vehicle are difficult for networks to recognize. In this paper, we present a method that combines two kinds of GANs to solve these problems. We choose DeblurGAN as the base model to remove blur from images, and SRGAN to address the small-object detection problem. Because the total runtime of these two models is too long, we apply model compression to make the pipeline lighter. We then use YOLOv4 for object detection. Finally, we evaluate the whole model architecture and propose a version 2 based on DeblurGAN and ESPCN, which is faster than the previous one but may be less accurate.
APA, Harvard, Vancouver, ISO, and other styles
8

Wang, Chen. "2D object detection and semantic segmentation in the Carla simulator." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291337.

Full text
Abstract:
The subject of self-driving car technology has drawn growing interest in recent years. Many companies, such as Baidu and Tesla, have already introduced automatic driving features in their newest cars for driving in specific areas. However, many challenges remain on the way to fully autonomous cars. Tesla has had several severe accidents while autonomous driving functions were in use, which makes the public doubt self-driving technology. It is therefore necessary to use a simulator environment to help verify and refine algorithms for the perception, planning, and decision-making of autonomous vehicles before implementing them in real-world cars. This project aims to build a benchmark for implementing the whole self-driving system in software. The entire autonomous driving system has three main components: perception, planning, and control. This thesis focuses on two sub-tasks of the perception part: 2D object detection and semantic segmentation. All experiments are run in a simulator environment called CARLA (Car Learning to Act), an open-source platform for autonomous driving research. The CARLA simulator is built on the Unreal Engine 4 game engine and has a server-client architecture that provides a flexible Python API. 2D object detection uses the You Only Look Once (YOLOv4) algorithm, which incorporates recent deep learning techniques in both network structure and data augmentation to strengthen the network's ability to learn objects; YOLOv4 achieves higher accuracy and shorter inference time than other popular object detection algorithms. Semantic segmentation uses Efficient Networks for Computer Vision (ESPNetv2), a lightweight and power-efficient network that matches the performance of other semantic segmentation algorithms while using fewer network parameters and FLOPS. In this project, YOLOv4 and ESPNetv2 are implemented in the CARLA simulator.
The two modules work together to help the autonomous car understand the world. A minimal distance awareness application is implemented in the CARLA simulator to estimate the distance to vehicles ahead; it can serve as a basic collision avoidance function. Experiments were run on a single Nvidia GPU (RTX 2060) under Ubuntu 18.04.
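A distance-awareness function like the one described can be approximated from a single detection with the pinhole camera model: range scales with an object's known real height divided by its apparent height. The focal length and vehicle height below are illustrative assumptions, not values from the thesis:

```python
def distance_from_bbox(focal_px, real_height_m, bbox_height_px):
    """Pinhole-model range estimate from an object's apparent height.

    focal_px: camera focal length expressed in pixels.
    real_height_m: assumed real-world height of the detected vehicle.
    bbox_height_px: height of the detection bounding box in pixels.
    """
    return focal_px * real_height_m / bbox_height_px

# A 1.5 m tall car whose box is 100 px high, with an 800 px focal length:
print(distance_from_bbox(800, 1.5, 100))  # 12.0 metres
```

In a simulator like CARLA the true depth is available for comparison, which makes it a convenient testbed for validating such monocular estimates.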
APA, Harvard, Vancouver, ISO, and other styles
9

Mikulský, Petr. "Detekce pohybujících se objektů ve videu s využitím neuronových sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442377.

Full text
Abstract:
This diploma thesis deals with the detection of moving objects in video recordings using neural networks. The aim of the thesis was to detect road users in video recordings. A pre-trained YOLOv5 object detection model was used for the practical part. As part of the solution, an original dataset of traffic video recordings was created and annotated with the following classes: car, bus, van, motorcycle, truck, and trailer truck. The final version of this dataset comprises 5404 frames and 6467 annotated objects in total. After training, the YOLOv5 model achieved 0.995 mAP, 0.995 precision, and 0.986 recall on the dataset. All steps leading to the final form of the dataset are described in the concluding chapter.
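The precision and recall figures quoted above come from counts of true positives, false positives, and false negatives over the annotated objects. As a sketch:

```python
def precision_recall(tp, fp, fn):
    """Detection precision and recall from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Illustrative counts only: 986 of 1000 ground-truth objects found,
# with 5 spurious detections.
print(precision_recall(tp=986, fp=5, fn=14))
```

Precision answers "how many detections were correct", recall "how many annotated objects were found"; the thesis reports both alongside mAP because a detector can trade one for the other via its confidence threshold.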
APA, Harvard, Vancouver, ISO, and other styles
10

Roohi, Masood. "End-point detection of a deformable linear object from visual data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/21133/.

Full text
Abstract:
In the context of industrial robotics, manipulation of rigid objects has been studied quite deeply; handling deformable objects, however, is still a big challenge. Moreover, thanks to new techniques introduced in the object detection literature, employing visual data is becoming increasingly popular among researchers. This thesis studies how to exploit visual data for detecting the end-point of a deformable linear object. A deep learning model is trained to perform the object detection task. First, the basics of neural networks are studied to become familiar with the mechanics of object detection. Then, a state-of-the-art object detection algorithm, YOLOv3, is reviewed so it can be used to best effect. Next, the collection of visual data is explained, along with several points that can improve the data-gathering procedure. After clarifying the annotation process, the model is trained and then tested. The trained model localizes the end-point; this information can be used directly by the robot to perform tasks such as pick-and-place, or to obtain more information about the shape of the object.
APA, Harvard, Vancouver, ISO, and other styles
11

Svedberg, Malin. "Analys av inskannade arkiverade dokument med hjälp av objektdetektering uppbyggt på AI." Thesis, Högskolan i Gävle, Avdelningen för datavetenskap och samhällsbyggnad, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-32612.

Full text
Abstract:
Around the world there is a large number of historical documents that exist only on paper. Digitizing these documents simplifies, among other things, their storage and dissemination. When digitizing documents, it is usually not enough to simply scan them and store them as images; typically, one wants to handle the information the documents contain in various ways, for example to search for specific information or to sort the documents by their contents. There are different ways to digitize documents and extract the information on them. This study uses object detection of the YOLOv3 type to find and distinguish different regions on historical documents, in the form of old registration cards for old Swedish vehicles. The object detector is trained on a custom training dataset, with training performed via the Darknet framework. The study reports results in terms of recall, precision, and IoU for several object detection models trained on different training datasets and tested on a number of test datasets. The results are analysed with respect to, among other things, the size and colour of the training data and the amount of training.
APA, Harvard, Vancouver, ISO, and other styles
12

Yesudasu, Santheep. "Contribution à la manipulation de colis sous contraintes par un torse humanoïde : application à la dépalettisation autonome dans les entrepôts logistiques". Electronic Thesis or Diss., Normandie, 2024. https://theses.hal.science/tel-04874770.

Full text
Abstract:
This PhD thesis explores the development and implementation of URNik-AI, an AI-powered automated depalletizing system designed to handle cardboard boxes of varying sizes and weights using a dual-arm humanoid torso. The primary objective is to enhance the efficiency, accuracy, and reliability of industrial depalletizing tasks through the integration of advanced robotics, computer vision, and deep learning techniques. The URNik-AI system consists of two UR10 robotic arms equipped with six-axis force/torque sensors and gripper tool sets. An ASUS Xtion RGB-D camera is mounted on Dynamixel Pro H42 pan-tilt servos to capture high-resolution images and depth data. The software framework includes ROS Noetic, ROS 2, and the MoveIt framework, enabling seamless communication and coordination of complex movements. This system ensures high precision in detecting, grasping, and handling objects in diverse industrial environments. A significant contribution of this research is the implementation of deep learning models, such as YOLOv3 and YOLOv8, to enhance object detection and pose estimation capabilities. YOLOv3, trained on a dataset of 807 images, achieved F1-scores of 0.81 and 0.90 for single- and multi-face boxes, respectively. The YOLOv8 model further advanced the system's performance by providing keypoint and skeleton detection capabilities, which are essential for accurate grasping and manipulation.
The integration of point cloud data for pose estimation ensured precise localization and orientation of boxes. Comprehensive testing demonstrated the system's robustness, with high precision, recall, and mean average precision (mAP) metrics confirming its effectiveness. This thesis makes several significant contributions to the field of robotics and automation, including the successful integration of advanced robotics and AI technologies, the development of innovative object detection and pose estimation techniques, and the design of a versatile and adaptable system architecture.
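The F1-scores quoted above (0.81 and 0.90) are the harmonic mean of precision and recall:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative precision/recall values only (not from the thesis):
# a 0.78 / 0.84 detector is one way a 0.81 F1-score could arise.
print(round(f1_score(0.78, 0.84), 2))
```

Because the harmonic mean is dominated by the smaller of the two values, a high F1 guarantees that neither precision nor recall is weak, which matters for a grasping system that must neither miss boxes nor grasp at phantoms.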
APA, Harvard, Vancouver, ISO, and other styles
13

Hasanaj, Enis, Albert Aveler, and William Söder. "Cooperative edge deepfake detection." Thesis, Jönköping University, JTH, Avdelningen för datateknik och informatik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-53790.

Full text
Abstract:
Deepfakes are an emerging problem on social media, and for celebrities and political figures it can be devastating to their reputation if the technology ends up in the wrong hands. Creating deepfakes is becoming increasingly easy. Attempts have been made at detecting whether a face in an image is real or not, but training these machine learning models can be a very time-consuming process. This research proposes a solution for training deepfake detection models cooperatively on the edge, in order to evaluate whether the training process, among other things, can be made more efficient with this approach. The feasibility of edge training is evaluated by training machine learning models on several different types of iPhone devices. The models are trained using the YOLOv2 object detection system. To test whether the YOLOv2 object detection system is able to distinguish between real and fake human faces in images, several models are trained on a computer, each with either a different number of iterations or a different subset of data, since these factors have been identified as important to model performance. The performance of the models is evaluated by measuring their accuracy in detecting deepfakes. Additionally, the deepfake detection models trained on a computer are combined using the bagging ensemble method, in order to evaluate the feasibility of cooperatively training a deepfake detection model by combining several models. Results show that the proposed solution is not feasible, due to the time the training process takes on each mobile device. Additionally, each trained model is about 200 MB, and the size of the ensemble model grows linearly with each model added, so the ensemble can grow to several hundred gigabytes.
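The bagging ensemble described above combines the votes of independently trained detectors. A toy majority-vote sketch (the thesis's models classify faces as real or fake; here that decision is reduced to a boolean):

```python
def bagging_predict(models, sample):
    """Majority vote over an ensemble of classifiers.

    models: callables mapping a sample to True (deepfake) or False (real).
    """
    votes = sum(1 for model in models if model(sample))
    return votes > len(models) / 2

# Three toy 'models' that disagree on the same sample:
ensemble = [lambda s: True, lambda s: True, lambda s: False]
print(bagging_predict(ensemble, sample=None))  # True: two of three vote deepfake
```

The vote itself is cheap; the size problem the thesis identifies comes from having to ship every member's full weights, so the ensemble's storage footprint grows linearly with its member count.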
APA, Harvard, Vancouver, ISO, and other styles
14

Noman, Md Kislu. "Deep learning-based seagrass detection and classification from underwater digital images." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2023. https://ro.ecu.edu.au/theses/2648.

Full text
Abstract:
Deep learning is the most popular branch of machine learning and has achieved great success in many real-life applications. Deep learning algorithms, in particular Convolutional Neural Networks (CNNs), have rapidly become a method of choice for analysing seagrass image data. Deep learning-based seagrass classification and detection are very challenging due to limited labelled data, intraclass similarities between species, lighting conditions, and the complex shapes and structures of the underwater environment, which differ from objects in large-scale datasets. Light propagating through water is attenuated and selectively scattered, severely affecting the quality of underwater images; besides low contrast, colour distortion and bright specks degrade image quality. In this thesis, we focus on the problem of single- to multi-species seagrass classification and detection from underwater digital images. We investigated existing seagrass classification and detection models and systematically attempted to improve their performance by developing different models on several seagrass datasets. CNNs are a class of artificial neural networks commonly used in deep learning architectures for image recognition, object localization, and mapping tasks. CNN-based models are gaining popularity in seagrass identification and mapping due to their automatic feature extraction ability and higher performance than classical machine learning techniques. Making a deep learning-based model accessible to all domain users (not only computer vision experts or engineers) is also challenging, because CNN development requires architectural engineering and hyperparameter tuning. This thesis investigates the effective development of CNNs on multi-species seagrass datasets to minimise the requirement for architectural engineering and manual hyperparameter tuning.
This thesis develops a novel metaheuristic algorithm called Opposition-based Flow Direction Algorithm (OFDA) by leveraging the power of the Opposition-based learning technique into the Flow Direction Algorithm to tune and automate the development of CNNs. The proposed deep neuroevolutionary algorithm (OFDA-CNN) outperformed other eight popular optimisation-based neuroevolutionary algorithms on a newly developed multi-species seagrass dataset. The OFDA-CNN algorithm also outperformed the state-of-the-art multi-species seagrass classification performances on publicly available seagrass datasets. This thesis also proposes another novel metaheuristic algorithm called Boosted Atomic Orbital Search (BAOS) to optimize the architecture and tune the hyperparameter of a CNN. The proposed BAOS algorithm improved the search capability of the original version of the Atomic Orbital Search algorithm by incorporating the L´evy flight technique. The optimized deep neuroevolutionary (BAOS-CNN) algorithm achieved the highest accuracy among seven popular optimisation-based CNNs. The BAOS-CNN algorithm also outperformed the state-of-the-art multi-species seagrass classification performances. This thesis proposes also a two-stage semi-supervised framework for leveraging huge unlabelled seagrass data. We propose an EfficientNet-B5-based semi-supervised framework that leverages a large collection of unlabelled seagrass data with the guidance of a small, labelled seagrass dataset. We introduced a multi-species seagrass classifier based on EfficientNet-B5 that outperformed the state-of-the-art multi-species seagrass classification performances. This thesis also developed a two and half times larger multi-species dataset than the largest publicly available ‘DeepSeagrass’ dataset. To evaluate the performance of all the proposed models, we trained and tested them on the newly developed and some publicly available challenging seagrass datasets. 
Our rigorous experiments demonstrated how our models were capable of producing state-of-the-art performances of seagrass classification and detection in both single and multi-species scenarios.
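The opposition-based learning idea that OFDA builds on can be sketched briefly: a candidate solution is mirrored within the search bounds, and the fitter of the pair is kept. A minimal illustration, where the function names and the toy objective are ours rather than the thesis's:

```python
import numpy as np

def opposite(x, lower, upper):
    """Opposition-based learning: reflect a candidate within its bounds."""
    return lower + upper - x

def keep_better(x, lower, upper, objective):
    """Evaluate a candidate and its opposite; keep whichever scores lower."""
    x_opp = opposite(x, lower, upper)
    return x if objective(x) <= objective(x_opp) else x_opp

# Toy usage: minimise the sphere function on [0, 1]^3
rng = np.random.default_rng(0)
lower, upper = np.zeros(3), np.ones(3)
x = rng.uniform(lower, upper)
best = keep_better(x, lower, upper, lambda v: float(np.sum(v ** 2)))
```

In a full metaheuristic this pairing is applied to the whole population each generation, which is what lets the search cover the space faster than random restarts alone.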
APA, Harvard, Vancouver, ISO, and other styles
15

Uhrín, Peter. "Počítání unikátních aut ve snímcích." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2021. http://www.nusl.cz/ntk/nusl-445493.

Full text
Abstract:
Current systems for counting cars in parking lots usually use specialized equipment, such as barriers at the entrance. Such equipment is not suitable for free or residential parking areas, yet even in these car parks it can help to keep track of occupancy and other data. The system designed in this thesis uses the YOLOv4 model for visual detection of cars in photos. It then calculates an embedding vector for each vehicle, which describes the car and is compared over time to decide whether the car at a given parking spot has changed. This information is stored in a database and used to calculate various statistics such as total car count, average occupancy, or average stay time. These values can be retrieved through a REST API or viewed in the web application.
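The embedding-comparison step described above amounts to a similarity check between two appearance vectors; a minimal sketch using cosine similarity, where the threshold value is our illustrative assumption rather than the thesis's:

```python
import numpy as np

def same_vehicle(emb_a, emb_b, threshold=0.8):
    """Decide whether two appearance embeddings from the same parking
    spot likely show the same car (threshold is an assumed value)."""
    a = np.asarray(emb_a, dtype=float)
    b = np.asarray(emb_b, dtype=float)
    cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return cos >= threshold
```

If the check fails between two observations of the same spot, the system would count a new unique car and close out the previous stay.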
APA, Harvard, Vancouver, ISO, and other styles
16

Jacobzon, Gustaf. "Multi-site Organ Detection in CT Images using Deep Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279290.

Full text
Abstract:
When optimizing a controlled dose in radiotherapy, high-resolution spatial information about healthy organs in close proximity to the malignant cells is necessary in order to mitigate dispersion into these organs-at-risk. This information can be provided by deep volumetric segmentation networks, such as 3D U-Net. However, due to memory limitations of modern graphics processing units, it is not feasible to train a volumetric segmentation network on full image volumes, and subsampling the volume gives too coarse a segmentation. An alternative is to sample a region of interest from the image volume and train an organ-specific network. This approach requires knowledge of which region of the image volume should be sampled, which can be provided by a 3D object detection network. Typically the detection network is also region-specific, albeit for a larger region such as the thorax, and requires human assistance in choosing the appropriate network for a given region of the body. Instead, we propose a multi-site object detection network based on YOLOv3, trained on 43 different organs, which can operate on arbitrarily chosen axial patches in the body. Our model identifies the organs present (whole or truncated) in the image volume and can automatically sample a region from the input and feed it to the appropriate volumetric segmentation network. We train our model on four small (as few as 20 images) site-specific datasets in a weakly-supervised manner in order to handle the partially unlabeled nature of site-specific datasets. Our model generates organ-specific regions of interest that enclose 92% of the organs present in the test set.
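The detection-to-segmentation handoff described above comes down to cropping an axis-aligned region of interest around each detected organ before it reaches the organ-specific network; a schematic sketch, where the box format and the context margin are our assumptions:

```python
import numpy as np

def crop_roi(volume, box, margin=8):
    """Crop an organ-specific region of interest from a CT volume.

    `box` is (z0, y0, x0, z1, y1, x1) in voxel coordinates, as a 3D
    detector might output; `margin` adds context around the organ and
    the crop is clipped to the volume bounds.
    """
    z0, y0, x0, z1, y1, x1 = box
    lo = np.maximum([z0 - margin, y0 - margin, x0 - margin], 0)
    hi = np.minimum([z1 + margin, y1 + margin, x1 + margin], volume.shape)
    return volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
```

Clipping at the borders matters for truncated organs, which the detector is explicitly trained to report.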
APA, Harvard, Vancouver, ISO, and other styles
17

Ali, Hani, and Pontus Sunnergren. "Scenanalys - Övervakning och modellering." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-45036.

Full text
Abstract:
Autonomous vehicles can decrease traffic congestion and reduce the number of traffic-related accidents. As there will be millions of autonomous vehicles in the future, a better understanding of the environment will be required. This project aims to create an external automated traffic system that can detect and track 3D objects within a complex traffic situation and send these objects' behavior to a larger-scale project that 3D-models the traffic situation. The project utilizes the TensorFlow framework and the YOLOv3 algorithm, together with a camera to record traffic situations and a Linux-operated computer. A target-tracking system was evaluated using methods commonly applied to build automated traffic management systems. The final results show that the system is relatively unstable and can sometimes fail to recognize certain objects.
If more images are used for the training process, a more robust and much more reliable system could be developed using a similar methodology.
APA, Harvard, Vancouver, ISO, and other styles
18

Charvát, Michal. "System for People Detection and Localization Using Thermal Imaging Cameras." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-432478.

Full text
Abstract:
In today's world there is an ever-increasing demand for reliable automated mechanisms for detecting and localizing people for various purposes: from analysing visitor movement in museums, through smart-home control, to guarding dangerous areas such as railway-station platforms. We present a method for detecting and localizing people using low-cost FLIR Lepton 3.5 thermal cameras and Raspberry Pi 3B+ single-board computers. This project, which follows up on the earlier bachelor's project "Detection of people in a room using a low-cost thermal camera", newly supports modelling complex scenes with polygonal boundaries and multiple thermal cameras. In this thesis we present an improved control and capture library for the Lepton 3.5 camera, a new people-detection technique using the state-of-the-art real-time YOLO (You Only Look Once) object detector based on deep neural networks, a new automatically configurable thermal unit protected by a 3D-printed case for safe handling, and, last but not least, a detailed guide for installing the detection system in a new environment, together with further supporting tools and improvements. We demonstrate the results of the new system on an example analysis of people's movement in the National Museum in Prague.
APA, Harvard, Vancouver, ISO, and other styles
19

Lavado, Diana Martins. "Sorting Surgical Tools from a Clustered Tray - Object Detection and Occlusion Reasoning." Master's thesis, 2018. http://hdl.handle.net/10316/86257.

Full text
Abstract:
Project work of the Integrated Master's in Biomedical Engineering presented to the Faculty of Sciences and Technology. The main goal of this master's dissertation is to classify and localize surgical tools in a cluttered tray, as well as to perform occlusion reasoning to determine which tool should be removed first. These tasks are intended to be part of a multi-stage robotic system able to sort surgical tools after disinfection, in order to assemble surgical kits and, hopefully, optimize nurses' time in sterilization rooms so that they can focus on more complex tasks. Initially, several classical approaches were tested to obtain 2D templates of each type of surgical tool, such as Canny edges, Otsu's threshold, and the watershed algorithm. The idea was to place 2D Data Matrix codes on the surgical tools; whenever a code was detected, the respective template would be added to a virtual map, which would subsequently be compared with the original image to determine which tool was on top.
However, due to difficulties in acquiring a specific software package, a modern approach was used instead, resorting to the YOLO ("you only look once") deep learning neural network. To train the neural networks, a dataset was built and then published, along with the respective labels and an appropriate division into training and test groups. In total, five YOLOv2 neural networks were trained: one for object detection and classification and one for occlusion reasoning of each instrument type (a total of four). For object detection, cross-validation was also performed, and the YOLOv3 network was trained as well. A console application that applies the proposed algorithm was also developed. The first step is to run the object detector with either the trained YOLOv2 or YOLOv3 network, followed by sorting the detections in decreasing order of confidence score. The detections with the two highest confidence scores are then chosen, and the respective occlusion-reasoning neural networks are run. Finally, the best combination of confidence scores between object detection and occlusion reasoning determines which surgical tool should be removed first from the cluttered tray.
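The console application's selection logic (sort detections by confidence, take the top two, weigh detection and occlusion confidences together) can be sketched as below; the combination rule shown, a simple product of the two scores, is our assumption, since the abstract does not specify how the scores are combined:

```python
def pick_tool_to_remove(detections, occlusion_score):
    """detections: list of (tool_class, confidence) from the detector.
    occlusion_score: callable returning the class-specific occlusion
    network's confidence that the tool lies on top of the pile."""
    # Keep only the two most confident detections, as described above.
    top_two = sorted(detections, key=lambda d: d[1], reverse=True)[:2]
    # Combine detection and occlusion confidence (assumed: product).
    scored = [(cls, conf * occlusion_score(cls)) for cls, conf in top_two]
    return max(scored, key=lambda s: s[1])[0]
```

A tool outside the top two is never considered, even if its occlusion network would score it highly, which keeps the number of occlusion-network runs fixed per frame.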
APA, Harvard, Vancouver, ISO, and other styles
20

Lu, Yan-Ting, and 盧彥廷. "YOLORIS: YOLO for Real-time Instance Segmentation." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/yyu3py.

Full text
Abstract:
Master's thesis, National Chiao Tung University, Institute of Computer Science and Engineering. In recent years, image segmentation has become an important topic, used in many fields such as self-driving cars, computer vision, video tracking, and medical applications, and many researchers have devoted themselves to this challenge. Image segmentation classifies an image pixel-wise, enabling a machine to learn the locations and classes of the instances in an image. Predicting instances precisely, however, remains a critical issue: many papers address it, but most focus on accuracy rather than speed and require substantial hardware. In this thesis, we present a real-time, well-performing instance segmentation method based on YOLOv3 that requires only a single GPU.
APA, Harvard, Vancouver, ISO, and other styles
21

CHEN, WEI-LUN, and 陳威倫. "Downsized-YOLOv3 for SAR Imagery Ship Detection." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/535hpk.

Full text
Abstract:
Master's thesis, National Taipei University of Technology, Department of Electrical Engineering. Synthetic aperture radar (SAR) is a radar technology with superior penetration: the radar emits energy and receives the reflections after it reaches the surface. Compared with visible light, it easily penetrates clouds and is not affected by weather conditions. SAR supports a wide range of object detection and monitoring tasks, produces high-resolution images, and has been widely used on aircraft and spacecraft. This study used the ship and oil spill dataset (SOSD) and the SAR ship detection dataset (SSDD), enhancing the training samples to improve detection accuracy; the two datasets enable further verification and comparison. The deep learning method we use for ship target detection is YOLOv3 (You Only Look Once version 3). Compared with YOLOv2, it adds multi-scale feature fusion; although inference time increases slightly, accuracy on small targets improves from 94% to 97%.
APA, Harvard, Vancouver, ISO, and other styles
22

Min-ZhiJi and 紀旻志. "Optimization of YOLOv3 Inference Engine for Edge Device." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/7kj82c.

Full text
Abstract:
Master's thesis, National Cheng Kung University, Department of Electrical Engineering. For neural networks used on low-end edge devices, there are several approaches available, such as model compression, model quantization, and hardware accelerator design. However, the number of parameters in current NN (neural network) models keeps increasing, and current NN frameworks typically initialize the entire NN model at startup, so the memory requirement is very large. To reduce the memory requirement, we propose layer-wise memory management based on Darknet. NN models may, however, have complex network structures with residual or routing connections for better training results, so we also propose a layer-dependency counter mechanism. We name the modified framework MDFI (Micro Darknet for Inference). According to our experimental results, the average memory consumption of MDFI is reduced by 76% compared to Darknet, and the average processing time is reduced by 8%.
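The layer-dependency counter mechanism can be illustrated with a small reference-counting sketch (the structure and names below are ours, not MDFI's actual code): each layer's output buffer is freed as soon as its last consumer has run, which is what keeps peak memory low even with residual or routing connections.

```python
class Buffer:
    """Stand-in for a layer's output activations."""
    def __init__(self, name, n_consumers):
        self.name = name
        self.refcount = n_consumers  # layers that still need to read this
        self.alive = True

    def release_one(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.alive = False  # a real engine would free the memory here

def run(layers, inputs_of):
    """layers: (name, consumer_count) pairs in execution order.
    inputs_of: maps a layer name to the names of layers it consumes,
    which covers residual and routing connections too."""
    outputs = {}
    for name, n_consumers in layers:
        outputs[name] = Buffer(name, n_consumers)  # allocate lazily
        for src in inputs_of.get(name, []):
            outputs[src].release_one()  # dependency counter decrement
    return outputs
```

With a residual connection, the skip source keeps a positive counter across several layers and is released only after the merge layer runs, rather than at framework startup as in a whole-model initialization scheme.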
APA, Harvard, Vancouver, ISO, and other styles
23

LIAO, YI-CHIEN, and 廖宜健. "The Real-Time Pedestrian Detection with YOLOv3-Reduce." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/axd9mn.

Full text
Abstract:
Master's thesis, National Taipei University of Technology, Department of Electrical Engineering. In recent years, with the rapid development of computer vision, many different architectures have been proposed in the field of object detection, with deep neural network architectures as the mainstream. This study applies deep learning to pedestrian detection and uses the COCO (Common Objects in Context) database provided by Microsoft for training and evaluation. We optimize the YOLOv3 (You Only Look Once version 3) architecture, detect the target objects, and evaluate pedestrian images across different scenes. Compared with other deep learning methods such as SSD (Single Shot Multi-Box Detector), detection speed increases fivefold and average accuracy rises from 54% to 66%. We name this architecture YOLOv3-Reduce. By halving the number of convolution kernels and removing 20 convolution and shortcut layers, YOLOv3-Reduce increases detection speed by about 80% at 720p resolution, optimizing both the accuracy and the speed of pedestrian detection in current deep object detection networks.
APA, Harvard, Vancouver, ISO, and other styles
24

Liao, Szu-Yu, and 廖思羽. "Implementing 3D Semantic Maps by Visual SLAM Integrated with YOLOv3." Thesis, 2019. http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107NCHU5441095%22.&searchmode=basic.

Full text
Abstract:
Master's thesis, National Chung Hsing University, Department of Electrical Engineering. In recent years, Simultaneous Localization And Mapping (SLAM) has become an important topic in unmanned vehicle research. The sensors adopted by SLAM are mainly divided into lidar- and camera-based approaches. A camera is cost-efficient and easy to obtain, so it is extensively used in implementing SLAM. In particular, ORB-SLAM2 is a real-time, feature-point-based visual SLAM method that supports high-precision three-dimensional (3D) maps with monocular, stereo, and RGB-D cameras. However, it cannot assign semantic labels to objects observed in the environment, so an unmanned vehicle cannot learn the class of a detected object from an ORB-SLAM2 3D map. Although methods like semantic SLAM can make up for this deficiency, the bounding box of a region proposal generated by such an object detection method can vary depending on factors such as the angle or offset of the proposal itself. In this thesis, we construct a 3D semantic map by integrating ORB-SLAM2 with the deep learning object detection tool YOLOv3 (You Only Look Once v3). More specifically, we obtain the classes of detected objects with YOLOv3 at the same time ORB-SLAM2 inserts a new keyframe, then project these semantic labels onto the point cloud clusters to obtain a 3D semantic map. In particular, we remove duplicate recognition results by averaging the object size, calculate the coordinates of each object in the map according to the camera position, and output the obtained object labels together with their corresponding coordinates. Finally, we perform experiments on public datasets to demonstrate the correctness of the proposed method.
APA, Harvard, Vancouver, ISO, and other styles
25

(8786558), Mehul Nanda. "You Only Gesture Once (YouGo): American Sign Language Translation using YOLOv3." Thesis, 2020.

Find full text
Abstract:
The study focused on creating and proposing a model that could accurately and precisely predict the occurrence of an American Sign Language gesture for an alphabet in the English language using the You Only Look Once (YOLOv3) algorithm. The training dataset used for this study was custom created and further divided into clusters based on the uniqueness of the ASL sign; three diverse clusters were created. Each cluster was trained with the network known as Darknet. Testing was conducted using images and videos for the fully trained model of each cluster, and the Average Precision for each alphabet in each cluster and the Mean Average Precision for each cluster were noted. In addition, a Word Builder script was created. This script combined the trained models of all three clusters into a comprehensive system that builds words when supplied with images of English-language alphabets as depicted in ASL.
APA, Harvard, Vancouver, ISO, and other styles
26

Lee, Felix, and 李宏德. "The real-time pedestrian detection by embedded GPU with YOLOv3-mobile." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/7x69h5.

Full text
Abstract:
Master's thesis, National Taipei University of Technology, Department of Electrical Engineering. Pedestrian detection technology is widely used throughout industry, especially in self-driving car development. Ensuring pedestrian safety while driving and accurately determining the relative position of pedestrians are crucial to building a reliable and robust self-driving car; in the future, remote traffic monitoring can also be performed through the IoT (Internet of Things). Recently, DNNs (deep neural networks) have been demonstrated to be superior to other approaches to object detection in terms of performance and accuracy, and YOLO (You Only Look Once) is a state-of-the-art real-time object detection system based on DNNs. YOLO and YOLOv3 have been demonstrated to be among the fastest DNNs while still maintaining accuracy against other DNN models such as Fast R-CNN (Region-based Convolutional Neural Network) and SSD (Single Shot Multi-Box Detector). Although YOLOv3 can be used for pedestrian detection and achieves real-time performance on a powerful GPU card like the Pascal Titan X, it is still challenging to run YOLOv3 on an embedded GPU system like the Nvidia Jetson TX1; even its smaller version, YOLOv3-Tiny, attains only 18 FPS (frames per second). In this thesis, we first refine the deep learning framework code for embedded GPUs, for example by using asynchronous data transfer to improve code efficiency. Second, we propose an even smaller and faster network, "YOLOv3-mobile", which achieves 30 FPS with a comparable level of mAP (mean Average Precision) for real-time pedestrian detection on the TX1 platform and is adoptable for future embedded GPU platforms.
APA, Harvard, Vancouver, ISO, and other styles
27

YIN, I.-CHENG, and 印翊誠. "UAV Images Overlapping Regions Candidates based on YOLOv3 with Traditional Feature Matching." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/63797d.

Full text
Abstract:
Master's thesis, National Taipei University of Technology, Department of Electrical Engineering. Because Unmanned Aerial Vehicles (UAVs) can perform specific missions via remote control or automatic flight and can be equipped with sensing equipment for environmental investigations, they have gradually become one of the main tools for aerial mapping in recent years. This thesis identifies candidate overlapping regions in UAV images using a deep learning method, in order to save time on feature detection and matching over two full-size adjacent images and to assist timely image stitching. In the deep learning part, we find candidate overlapping regions in UAV images with You Only Look Once version 3 (YOLOv3), then use the Structural Similarity index (SSIM) to measure how similar the candidate regions are, filtering out mismatched regions and matching the overlapping ones to confirm the correspondence between overlapping regions of two adjacent images. A traditional feature extraction algorithm is then applied to extract and match features within the matched regions. In the experiments, we compare execution time and precision between our method and traditional feature extraction and matching: our method is 9 to 42 times faster than the traditional approach, and we use the Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and Normalized Cross-Correlation (NCC) to compare the precision of the two methods.
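Of the similarity measures used in the evaluation, normalized cross-correlation is the easiest to sketch; a minimal version over two equal-sized grayscale patches:

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Normalized cross-correlation between two equal-sized patches;
    returns a value in [-1, 1], with 1 meaning identical structure."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom else 0.0
```

Because the means are subtracted and the result is normalized, NCC is insensitive to uniform brightness and contrast changes between the two images, which is why it suits comparing overlapping regions taken at slightly different exposures.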
APA, Harvard, Vancouver, ISO, and other styles
28

Phong-PhuLe and 黎楓富. "Ball-Grid-Array Chip Defects Detection and Classification Using Patch-based Modified YOLOv3." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/vyywa9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Wei-ChungTseng and 曾微中. "Layer-wise Fixed Point Quantization for Deep Convolutional Neural Networks and Implementation of YOLOv3 Inference Engine." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/x46nq6.

Full text
Abstract:
Master's thesis, National Cheng Kung University, Institute of Computer and Communication Engineering. With the increasing popularity of mobile devices and the effectiveness of deep-learning-based algorithms, people are trying to put deep learning models on mobile devices, but this is limited by computational complexity and software overhead. We propose an efficient inference framework that fits resource-limited devices, about 1000 times smaller than TensorFlow in code size, together with a layer-wise quantization scheme that allows inference to be computed with fixed-point arithmetic. The fixed-point quantization scheme is more efficient than floating-point arithmetic, reducing power consumption to 8% of the original in a coarse-grained evaluation, shrinking model size to 25%~40% of the original, and keeping the Top-5 accuracy loss under 1% for AlexNet on ImageNet.
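A layer-wise fixed-point quantization step of the kind described can be sketched as follows; the symmetric 8-bit scheme and the max-based scaling rule are illustrative assumptions, not necessarily the thesis's exact scheme:

```python
import numpy as np

def quantize_layer(weights, bits=8):
    """Pick one scale per layer so the largest weight magnitude maps to
    the integer range, then round every weight to a fixed-point value."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(float(np.abs(weights).max()) / qmax, 1e-12)
    q = np.round(weights / scale).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Recover approximate floating-point weights for comparison."""
    return q.astype(np.float32) * scale
```

Choosing the scale per layer rather than per network is what "layer-wise" buys: layers with small weight ranges keep fine resolution instead of inheriting the scale of the largest layer.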
APA, Harvard, Vancouver, ISO, and other styles
30

Троян, Дмитро Віталійович. "Computer Control System for a Robot that Collects Tennis Balls" [Комп’ютерна система керування роботом для збору тенісних м’ячів]. Master's thesis, 2021. https://dspace.znu.edu.ua/jspui/handle/12345/5702.

Full text
Abstract:
Troian D. V. Computer control system for a robot that collects tennis balls: master's qualification thesis, speciality 121 "Software Engineering" / supervisor V. H. Verbytskyi. Zaporizhzhia: ZNU, 2021. 96 pp. The object of study is the set of IT components that ensure the operability of the system: navigation, motion physics, and a neural network for object recognition. The task of the research work is to study the possibilities of using this technology to implement a robot control system for collecting tennis balls on a court; the task also requires a neural network to detect objects on the map of the court. The purpose of the work is to create a software application that makes it possible to configure the system to control a wheeled robot in automatic mode. The research methods are synthesis and analysis. The result is a software system that enables the robot to collect balls on a tennis court. The field of application covers the entire sport of tennis, wherever spending human time collecting tennis balls is a problem.
APA, Harvard, Vancouver, ISO, and other styles
31

Velosa, José Filipe Góis. "Classification and processing of marine Images." Master's thesis, 2019. http://hdl.handle.net/10400.13/2662.

Full text
APA, Harvard, Vancouver, ISO, and other styles