
Journal articles on the topic 'MobileNet-SSD'


Consult the top 48 journal articles for your research on the topic 'MobileNet-SSD.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Li, Yiting, Haisong Huang, Qingsheng Xie, Liguo Yao, and Qipeng Chen. "Research on a Surface Defect Detection Algorithm Based on MobileNet-SSD." Applied Sciences 8, no. 9 (September 17, 2018): 1678. http://dx.doi.org/10.3390/app8091678.

Abstract:
This paper aims to achieve real-time and accurate detection of surface defects by using a deep learning method. For this purpose, the Single Shot MultiBox Detector (SSD) network was adopted as the meta-structure and combined with the MobileNet base convolutional neural network (CNN) to form MobileNet-SSD. A detection method for surface defects was then proposed based on MobileNet-SSD. Specifically, the structure of the SSD was optimized without sacrificing its accuracy, and the network structure and parameters were adjusted to streamline the detection model. The proposed method was applied to the detection of typical defects such as breaches, dents, burrs and abrasions on the sealing surface of a container in the filling line. The results show that our method can automatically detect surface defects more accurately and rapidly than lightweight network methods and traditional machine learning methods. The research results shed new light on defect detection in actual industrial scenarios.
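The efficiency that makes MobileNet attractive as an SSD backbone in work like this comes from replacing standard convolutions with depthwise separable ones. A minimal sketch of the parameter savings (the layer sizes below are illustrative, not taken from the paper):

```python
def conv_params(k, c_in, c_out):
    """Parameters of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

# Example: a 3x3 layer with 256 input and 256 output channels.
standard = conv_params(3, 256, 256)                   # 589824 parameters
separable = depthwise_separable_params(3, 256, 256)   # 67840 parameters
print(f"reduction: {standard / separable:.1f}x")      # reduction: 8.7x
```

This roughly 8-9x reduction per layer is what lets MobileNet-SSD run in real time on modest hardware.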
2

Husein, Amir Mahmud, Christopher Christopher, Andy Gracia, Rio Brandlee, and Muhammad Haris Hasibuan. "Deep Neural Networks Approach for Monitoring Vehicles on the Highway." SinkrOn 4, no. 2 (April 14, 2020): 163. http://dx.doi.org/10.33395/sinkron.v4i2.10553.

Abstract:
Vehicle classification and detection aims to extract information about certain types of vehicles from images or videos and is an important component of smart transportation systems. However, variation in vehicle size poses a challenge that has attracted many researchers. In this paper, we compare the one-stage detection methods YOLOv3 and MobileNet-SSD for direct vehicle detection on a highway vehicle video dataset recorded with two cellular devices on highways in Medan City, producing 42 videos. Both methods were evaluated by Mean Average Precision (mAP): YOLOv3 produced better accuracy, 81.9% compared with 67.9% for MobileNet-SSD, but the resulting detection video files are larger. MobileNet-SSD performs faster with smaller video output, but has difficulty detecting small objects.
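The mAP figures in comparisons like this rest on intersection-over-union (IoU) matching between predicted and ground-truth boxes. A minimal sketch (boxes as (x1, y1, x2, y2) tuples, an assumed convention):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A prediction matching ground truth with IoU >= 0.5 typically counts
# as a true positive when computing average precision.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.333...
```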
3

Biswas, Debojit, Hongbo Su, Chengyi Wang, Aleksandar Stevanovic, and Weimin Wang. "An automatic traffic density estimation using Single Shot Detection (SSD) and MobileNet-SSD." Physics and Chemistry of the Earth, Parts A/B/C 110 (April 2019): 176–84. http://dx.doi.org/10.1016/j.pce.2018.12.001.

4

Ramalingam, Balakrishnan, Vega-Heredia Manuel, Mohan Rajesh Elara, Ayyalusami Vengadesh, Anirudh Krishna Lakshmanan, Muhammad Ilyas, and Tan Jun Yuan James. "Visual Inspection of the Aircraft Surface Using a Teleoperated Reconfigurable Climbing Robot and Enhanced Deep Learning Technique." International Journal of Aerospace Engineering 2019 (September 12, 2019): 1–14. http://dx.doi.org/10.1155/2019/5137139.

Abstract:
Aircraft surface inspection includes detecting surface defects caused by corrosion and cracks, as well as stains from oil spills, grease, dirt sediments, etc. In the conventional aircraft surface inspection process, human visual inspection is performed, which is time-consuming and inefficient, whereas robots with onboard vision systems can inspect the aircraft skin safely, quickly, and accurately. This work proposes an aircraft surface defect and stain detection model using a reconfigurable climbing robot and an enhanced deep learning algorithm. A reconfigurable, teleoperated robot, named "Kiropter," is designed to capture the aircraft surface images with an onboard RGB camera. An enhanced SSD MobileNet framework is proposed for stain and defect detection from these images. A self-filtering-based periodic pattern detection filter has been included in the SSD MobileNet deep learning framework to achieve enhanced detection of stains and defects in the aircraft skin images. The model has been tested with real aircraft surface images acquired from a Boeing 737 and a compact aircraft's surface using the teleoperated robot. The experimental results prove that the enhanced SSD MobileNet framework achieves improved detection accuracy of aircraft surface defects and stains as compared to conventional models.
5

Nizar, Muhammad Hanif Ahmad, Chow Khuen Chan, Azira Khalil, Ahmad Khairuddin Mohamed Yusof, and Khin Wee Lai. "Real-time Detection of Aortic Valve in Echocardiography using Convolutional Neural Networks." Current Medical Imaging Formerly Current Medical Imaging Reviews 16, no. 5 (May 28, 2020): 584–91. http://dx.doi.org/10.2174/1573405615666190114151255.

Abstract:
Background: Valvular heart disease is a serious disease leading to mortality and increasing medical care costs. The aortic valve is the valve most commonly affected by this disease. Doctors rely on echocardiograms for diagnosing and evaluating valvular heart disease. However, echocardiographic images are of poorer quality than Computerized Tomography and Magnetic Resonance Imaging scans. This study proposes the development of Convolutional Neural Networks (CNN) that can function optimally during a live echocardiographic examination for detection of the aortic valve. An automated detection system in an echocardiogram will improve the accuracy of medical diagnosis and can provide further medical analysis from the resulting detection. Methods: Two detection architectures, Single Shot MultiBox Detector (SSD) and Faster Region-based Convolutional Neural Network (R-CNN), with various feature extractors were trained on echocardiography images from 33 patients. Thereafter, the models were tested on 10 echocardiography videos. Results: Faster R-CNN Inception v2 showed the highest accuracy (98.6%), followed closely by SSD MobileNet v2. In terms of speed, SSD MobileNet v2 incurred a loss of 46.81% in frames per second (fps) during real-time detection but still performed better than the other neural network models. Additionally, SSD MobileNet v2 used the least Graphics Processing Unit (GPU) resources, while Central Processing Unit (CPU) usage was relatively similar across all models. Conclusion: Our findings provide a foundation for implementing a convolutional detection system in echocardiography for medical purposes.
6

Barba-Guaman, Luis, José Eugenio Naranjo, and Anthony Ortiz. "Deep Learning Framework for Vehicle and Pedestrian Detection in Rural Roads on an Embedded GPU." Electronics 9, no. 4 (March 31, 2020): 589. http://dx.doi.org/10.3390/electronics9040589.

Abstract:
Object detection is one of the most fundamental and challenging problems in computer vision. Nowadays, dedicated embedded systems such as the NVIDIA Jetson family have emerged as a powerful strategy for delivering high processing capabilities. The aim of the present work is the recognition of objects in complex rural areas through an embedded system, as well as the verification of accuracy and processing time. For this purpose, a low-power embedded Graphics Processing Unit (Jetson Nano) has been selected, which allows multiple neural networks to run simultaneously and a computer vision algorithm to be applied for image recognition. The performance of deep learning networks such as ssd-mobilenet v1 and v2, pednet, multiped and ssd-inception v2 has been tested. Moreover, the accuracy and processing time were in some cases improved when all the models suggested in the research were applied. The pednet model provides high performance in pedestrian recognition; however, the ssd-mobilenet v2 and ssd-inception v2 models are better at detecting other objects, such as vehicles, in complex scenarios.
7

Evan, Meirista Wulandari, and Eko Syamsudin. "Recognition of Pedestrian Traffic Light using Tensorflow And SSD MobileNet V2." IOP Conference Series: Materials Science and Engineering 1007 (December 31, 2020): 012022. http://dx.doi.org/10.1088/1757-899x/1007/1/012022.

8

Finogeev, E., V. Gorbatsevich, A. Moiseenko, Y. Vizilter, and O. Vygolov. "KNOWLEDGE DISTILLATION USING GANS FOR FAST OBJECT DETECTION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B2-2020 (August 12, 2020): 583–88. http://dx.doi.org/10.5194/isprs-archives-xliii-b2-2020-583-2020.

Abstract:
In this paper, we propose a new method for knowledge distillation based on generative adversarial networks. A discriminator CNN is used as an adaptive knowledge-distillation loss. In the experiments, Single Shot MultiBox Detectors (SSD) based on MobileNet v2 and ShuffleNet v1 are used as student networks. Our tests showed AP and mAP improvements of more than 3% on PascalVOC and 1% on MS COCO compared with the baseline algorithm, without any architecture or dataset changes. The proposed approach is general and can be used not only with SSD but with any type of object detection algorithm.
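The idea of a discriminator acting as an adaptive distillation loss can be sketched as follows: a small discriminator learns to tell teacher feature maps from student ones, and the student is penalized when the discriminator is not fooled. This numpy illustration uses a logistic discriminator; all shapes, names, and the loss form are hypothetical simplifications, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(feat, w):
    """Logistic probability that a flattened feature map came from the teacher."""
    return 1.0 / (1.0 + np.exp(-feat @ w))

d = 64                        # flattened feature dimension (hypothetical)
w = rng.normal(size=d) * 0.1  # discriminator weights
teacher_feat = rng.normal(size=d)
student_feat = rng.normal(size=d)

# Discriminator objective: classify teacher features as 1, student as 0.
p_t = discriminator(teacher_feat, w)
p_s = discriminator(student_feat, w)
d_loss = -np.log(p_t) - np.log(1.0 - p_s)

# Adversarial distillation term for the student: fool the discriminator.
# This replaces a fixed L2 feature-mimic loss in plain distillation.
student_adv_loss = -np.log(p_s)

print(d_loss, student_adv_loss)
```

In training, the two losses would be minimized alternately, so the "distance" the student minimizes adapts as the discriminator improves.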
9

Kim, Woonki, Fatemeh Dehghan, and Seongwon Cho. "Vehicle License Plate Recognition System using SSD-Mobilenet and ResNet for Mobile Device." Korean Institute of Smart Media 9, no. 2 (June 30, 2020): 92–98. http://dx.doi.org/10.30693/smj.2020.9.2.92.

10

Zhang, Jindong, Jiabin Xu, Linyao Zhu, Kunpeng Zhang, Tong Liu, Donghui Wang, and Xue Wang. "An improved MobileNet-SSD algorithm for automatic defect detection on vehicle body paint." Multimedia Tools and Applications 79, no. 31-32 (June 8, 2020): 23367–85. http://dx.doi.org/10.1007/s11042-020-09152-6.

11

Li, Yange, Han Wei, Zheng Han, Jianling Huang, and Weidong Wang. "Deep Learning-Based Safety Helmet Detection in Engineering Management Based on Convolutional Neural Networks." Advances in Civil Engineering 2020 (September 19, 2020): 1–10. http://dx.doi.org/10.1155/2020/9703560.

Abstract:
Visual examination of the workplace and timely reminders of failure to wear a safety helmet are of particular importance for avoiding injuries of workers at the construction site. Video monitoring systems provide a large amount of unstructured image data on-site for this purpose, but a computer vision-based automatic solution is required for real-time detection. Although a growing body of literature has developed deep learning-based models to detect helmets for traffic surveillance, an appropriate solution for industry application is less discussed in view of the complex scenes on construction sites. In this regard, we develop a deep learning-based method for the real-time detection of safety helmets at the construction site. The presented method uses the SSD-MobileNet algorithm, which is based on convolutional neural networks. A dataset containing 3261 images of safety helmets, collected from two sources, i.e., manual capture from the video monitoring system at the workplace and open images obtained using web crawler technology, is established and released to the public. The image set is divided into a training set, validation set, and test set, with a sampling ratio of nearly 8:1:1. The experimental results demonstrate that the presented deep learning-based model using the SSD-MobileNet algorithm is capable of detecting failure to wear a helmet at the construction site, with satisfactory accuracy and efficiency.
12

Magalhães, Sandro Augusto, Luís Castro, Germano Moreira, Filipe Neves dos Santos, Mário Cunha, Jorge Dias, and António Paulo Moreira. "Evaluating the Single-Shot MultiBox Detector and YOLO Deep Learning Models for the Detection of Tomatoes in a Greenhouse." Sensors 21, no. 10 (May 20, 2021): 3569. http://dx.doi.org/10.3390/s21103569.

Abstract:
The development of robotic solutions for agriculture requires advanced perception capabilities that can work reliably in any crop stage. For example, to automate the tomato harvesting process in greenhouses, the visual perception system needs to detect the tomato in any life cycle stage (from flower to ripe tomato). The state-of-the-art for visual tomato detection focuses mainly on ripe tomatoes, which have a distinctive colour from the background. This paper contributes an annotated visual dataset of green and reddish tomatoes. This kind of dataset is uncommon and not otherwise available for research purposes. It will enable further developments in edge artificial intelligence for the in situ, real-time visual tomato detection required for the development of harvesting robots. Using this dataset, five deep learning models were selected, trained and benchmarked to detect green and reddish tomatoes grown in greenhouses. Considering our robotic platform specifications, only the Single-Shot MultiBox Detector (SSD) and YOLO architectures were considered. The results proved that the system can detect green and reddish tomatoes, even those occluded by leaves. SSD MobileNet v2 had the best performance when compared against SSD Inception v2, SSD ResNet 50, SSD ResNet 101 and YOLOv4 Tiny, reaching an F1-score of 66.15%, an mAP of 51.46% and an inference time of 16.44 ms on an NVIDIA Turing architecture platform, an NVIDIA Tesla T4 with 12 GB. YOLOv4 Tiny also had impressive results, mainly concerning inference times of about 5 ms.
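The F1-score reported above is the harmonic mean of precision and recall, which can be computed directly from detection counts. A quick sketch (the counts below are hypothetical, not the paper's data):

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for illustration only.
print(f"{f1_score(tp=80, fp=30, fn=50):.4f}")  # 0.6667
```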
13

Guo, Wentong, Hongyuan Fang, and Niannian Wang. "New Method of Airport Pavement Health Inspection Based on MobileNet-SSD and Mask R-CNN." Journal of Physics: Conference Series 1885, no. 2 (April 1, 2021): 022048. http://dx.doi.org/10.1088/1742-6596/1885/2/022048.

14

Wang, Chenglong, and Zhifeng Xiao. "Lychee Surface Defect Detection Based on Deep Convolutional Neural Networks with GAN-Based Data Augmentation." Agronomy 11, no. 8 (July 28, 2021): 1500. http://dx.doi.org/10.3390/agronomy11081500.

Abstract:
The performance of fruit surface defect detection is easily affected by factors such as noisy background and foliage occlusion. In this study, we choose lychee as a fruit type to investigate its surface quality. Lychees are hard to preserve and have to be stored at low temperatures to keep fresh. Additionally, the surface of lychees is subject to scratches and cracks during harvesting/processing. To explore the feasibility of the automation of defective surface detection for lychees, we build a dataset with 3743 samples divided into three categories, namely, mature, defects, and rot. The original dataset suffers an imbalanced distribution issue. To address it, we adopt a transformer-based generative adversarial network (GAN) as a means of data augmentation that can effectively enhance the original training set with more and diverse samples to rebalance the three categories. In addition, we investigate three deep convolutional neural network (DCNN) models, including SSD-MobileNet V2, Faster RCNN-ResNet50, and Faster RCNN-Inception-ResNet V2, trained under different settings for an extensive comparison study. The results show that all three models demonstrate consistent performance gains in mean average precision (mAP), with the application of GAN-based augmentation. The rebalanced dataset also reduces the inter-category discrepancy, allowing a DCNN model to be trained equally across categories. In addition, the qualitative results show that models trained under the augmented setting can better identify the critical regions and the object boundary, leading to gains in mAP. Lastly, we conclude that the most cost-effective model, SSD-MobileNet V2, presents a comparable mAP (91.81%) and a superior inference speed (102 FPS), suitable for real-time detection in industrial-level applications.
15

Supriadi, Muhammad Fadhlan, Ema Rachmawati, and Anditya Arifianto. "Pembangunan Aplikasi Mobile Pengenalan Objek Untuk Pendidikan Anak Usia Dini." Jurnal Teknologi Informasi dan Ilmu Komputer 8, no. 2 (March 25, 2021): 357. http://dx.doi.org/10.25126/jtiik.2021824363.

Abstract:
Mobile phone usage has become very close to early childhood life, giving rise to negative impacts on young children, especially reduced interaction with the world around them. One technology that can be developed on mobile phones is computer vision, and one of its uses is object recognition, which offers a solution for helping to recognize objects. This research builds an object-recognition system for household objects, deployed on a mobile phone, intended to help young children recognize the objects around them. MobileNet is a feature extractor with good performance that is light enough for mobile devices; its architecture consists of depthwise convolution layers and pointwise convolution layers for feature extraction. The experiments use the Single Shot MultiBox Detector (SSD) architecture as the detection method, with a model pre-trained on the COCO dataset and transfer learning applied to 20 types of objects commonly found inside the house. The experimental results indicate that MobileNetV2 achieves a better mean Average Precision (mAP) than MobileNetV1 and InceptionV2, at 99.34%.
16

Hu, Xun, Hong Li, Xinrong Li, and Chiyu Wang. "MobileNet-SSD MicroScope using adaptive error correction algorithm: real-time detection of license plates on mobile devices." IET Intelligent Transport Systems 14, no. 2 (February 1, 2020): 110–18. http://dx.doi.org/10.1049/iet-its.2019.0380.

17

Safadinho, David, João Ramos, Roberto Ribeiro, Vítor Filipe, João Barroso, and António Pereira. "UAV Landing Using Computer Vision Techniques for Human Detection." Sensors 20, no. 3 (January 22, 2020): 613. http://dx.doi.org/10.3390/s20030613.

Abstract:
The capability of drones to perform autonomous missions has led retail companies to use them for deliveries, saving time and human resources. In these services, the delivery depends on the Global Positioning System (GPS) to define an approximate landing point. However, the landscape can interfere with the satellite signal (e.g., tall buildings), reducing the accuracy of this approach. Changes in the environment can also invalidate the security of a previously defined landing site (e.g., irregular terrain, swimming pool). Therefore, the main goal of this work is to improve the process of goods delivery using drones, focusing on the detection of the potential receiver. We developed a solution that has been improved along its iterative assessment, composed of five test scenarios. The built prototype complements the GPS through Computer Vision (CV) algorithms, based on Convolutional Neural Networks (CNN), running in a Raspberry Pi 3 with a Pi NoIR Camera (i.e., No InfraRed: without an infrared filter). The experiments were performed with the models Single Shot Detector (SSD) MobileNet-V2 and SSDLite-MobileNet-V2. The best results were obtained in the afternoon, with the SSDLite architecture, for distances and heights between 2.5 and 10 m, with recalls of 59–76%. The results confirm that a low computing power, cost-effective system can perform aerial human detection, estimating the landing position without an additional visual marker.
18

Ramalingam, Balakrishnan, Rajesh Elara Mohan, Selvasundari Balakrishnan, Karthikeyan Elangovan, Braulio Félix Gómez, Thejus Pathmakumar, Manojkumar Devarassu, Madan Mohan Rayaguru, and Chanthini Baskar. "sTetro-Deep Learning Powered Staircase Cleaning and Maintenance Reconfigurable Robot." Sensors 21, no. 18 (September 18, 2021): 6279. http://dx.doi.org/10.3390/s21186279.

Abstract:
Staircase cleaning is a crucial and time-consuming task for maintenance of multistory apartments and commercial buildings. There are many commercially available autonomous cleaning robots in the market for building maintenance, but few of them are designed for staircase cleaning. A key challenge for automating staircase cleaning robots involves the design of Environmental Perception Systems (EPS), which assist the robot in determining and navigating staircases. This system also recognizes obstacles and debris for safe navigation and efficient cleaning while climbing the staircase. This work proposes an operational framework leveraging the vision based EPS for the modular re-configurable maintenance robot, called sTetro. The proposed system uses an SSD MobileNet real-time object detection model to recognize staircases, obstacles and debris. Furthermore, the model filters out false detection of staircases by fusion of depth information through the use of a MobileNet and SVM. The system uses a contour detection algorithm to localize the first step of the staircase and depth clustering scheme for obstacle and debris localization. The framework has been deployed on the sTetro robot using the Jetson Nano hardware from NVIDIA and tested with multistory staircases. The experimental results show that the entire framework takes an average of 310 ms to run and achieves an accuracy of 94.32% for staircase recognition tasks and 93.81% accuracy for obstacle and debris detection tasks during real operation of the robot.
19

Liu, Yi, Changyun Miao, Xianguo Li, and Guowei Xu. "Research on Deviation Detection of Belt Conveyor Based on Inspection Robot and Deep Learning." Complexity 2021 (February 25, 2021): 1–15. http://dx.doi.org/10.1155/2021/3734560.

Abstract:
The deviation of the conveyor belt is a common failure that affects the safe operation of the belt conveyor. In this paper, a deviation detection method for the belt conveyor, based on an inspection robot and deep learning, is proposed to detect deviation at any position along the belt. Firstly, the inspection robot captures the image, and the region of interest (ROI) containing the conveyor belt edge and the exposed idler is extracted by the optimized MobileNet SSD (OM-SSD). Secondly, the Hough line transform algorithm is used to detect the conveyor belt edge, and an elliptical arc detection algorithm based on template matching is proposed to detect the idler's outer edge. Finally, a geometric correction algorithm based on homography transformation is proposed to correct the coordinates of the detected edge points, and the deviation degree (DD) of the conveyor belt is estimated based on the corrected coordinates. The experimental results show that the proposed method can detect the deviation of the conveyor belt continuously, with an RMSE of 3.7 mm, an MAE of 4.4 mm, and an average time consumption of 135.5 ms. It improves the monitoring range, detection accuracy, reliability, robustness, and real-time performance of deviation detection for the belt conveyor.
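The geometric correction step maps detected edge coordinates through a homography. A minimal numpy sketch of applying a 3x3 homography matrix to image points (the matrix here is a made-up translation example, not the paper's calibration):

```python
import numpy as np

def apply_homography(H, pts):
    """Map N x 2 image points through a 3 x 3 homography matrix."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # back to Cartesian

# Example: a pure-translation homography shifts every point by (5, -3).
H = np.array([[1.0, 0.0,  5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0,  1.0]])
print(apply_homography(H, [[10, 10], [0, 0]]))  # [[15. 7.] [5. -3.]]
```

A real correction matrix would be estimated from known reference points on the conveyor frame; the division by the third homogeneous coordinate is what handles perspective distortion.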
20

Wei, Bingsheng, and Martin Barczyk. "Experimental Evaluation of Computer Vision and Machine Learning-Based UAV Detection and Ranging." Drones 5, no. 2 (May 9, 2021): 37. http://dx.doi.org/10.3390/drones5020037.

Abstract:
We consider the problem of vision-based detection and ranging of a target UAV using the video feed from a monocular camera onboard a pursuer UAV. Our previously published work in this area employed a cascade classifier algorithm to locate the target UAV, which was found to perform poorly in complex background scenes. We thus study the replacement of the cascade classifier algorithm with newer machine learning-based object detection algorithms. Five candidate algorithms are implemented and quantitatively tested in terms of their efficiency (measured as frames per second processing rate), accuracy (measured as the root mean squared error between ground truth and detected location), and consistency (measured as mean average precision) in a variety of flight patterns, backgrounds, and test conditions. Assigning relative weights of 20%, 40% and 40% to these three criteria, we find that when flying over a white background, the top three performers are YOLO v2 (76.73 out of 100), Faster RCNN v2 (63.65 out of 100), and Tiny YOLO (59.50 out of 100), while over a realistic background, the top three performers are Faster RCNN v2 (54.35 out of 100), SSD MobileNet v1 (51.68 out of 100) and SSD Inception v2 (50.72 out of 100), leading us to recommend Faster RCNN v2 as the overall solution. We then provide a roadmap for further work on integrating the object detector into our vision-based UAV tracking system.
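The 20/40/40 weighting of efficiency, accuracy and consistency amounts to a weighted sum over per-criterion scores. A sketch of that scoring (the normalization to a 0-100 scale per criterion is an assumption; the paper's exact scaling is not given here):

```python
def weighted_score(efficiency, accuracy, consistency,
                   weights=(0.20, 0.40, 0.40)):
    """Combine three per-criterion scores (assumed already scaled to 0-100)
    using the 20/40/40 weighting from the comparison."""
    scores = (efficiency, accuracy, consistency)
    return sum(w * s for w, s in zip(weights, scores))

# Hypothetical normalized scores for one detector (illustration only).
print(weighted_score(efficiency=90, accuracy=60, consistency=70))  # 70.0
```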
21

Židek, Kamil, Peter Lazorík, Ján Piteľ, and Alexander Hošovský. "An Automated Training of Deep Learning Networks by 3D Virtual Models for Object Recognition." Symmetry 11, no. 4 (April 5, 2019): 496. http://dx.doi.org/10.3390/sym11040496.

Abstract:
Small series production with a high level of variability is not suitable for full automation. So, a manual assembly process must be used, which can be improved by cooperative robots and assisted by augmented reality devices. The assisted assembly process needs reliable object recognition implementation. Currently used technologies with markers do not work reliably with objects without distinctive texture, for example, screws, nuts, and washers (single colored parts). The methodology presented in the paper introduces a new approach to object detection using deep learning networks trained remotely by 3D virtual models. Remote web application generates training input datasets from virtual 3D models. This new approach was evaluated by two different neural network models (Faster RCNN Inception v2 with SSD, MobileNet V2 with SSD). The main advantage of this approach is the very fast preparation of the 2D sample training dataset from virtual 3D models. The whole process can run in Cloud. The experiments were conducted with standard parts (nuts, screws, washers) and the recognition precision achieved was comparable with training by real samples. The learned models were tested by two different embedded devices with an Android operating system: Virtual Reality (VR) glasses, Cardboard (Samsung S7), and Augmented Reality (AR) smart glasses (Epson Moverio M350). The recognition processing delays of the learned models running in embedded devices based on an ARM processor and standard x86 processing unit were also tested for performance comparison.
22

Yuan, Ting, Lin Lv, Fan Zhang, Jun Fu, Jin Gao, Junxiong Zhang, Wei Li, Chunlong Zhang, and Wenqiang Zhang. "Robust Cherry Tomatoes Detection Algorithm in Greenhouse Scene Based on SSD." Agriculture 10, no. 5 (May 9, 2020): 160. http://dx.doi.org/10.3390/agriculture10050160.

Abstract:
The detection of cherry tomatoes in the greenhouse scene is of great significance for robotic harvesting. This paper presents a deep learning-based method for cherry tomato detection that reduces the influence of illumination, growth difference, and occlusion. In view of the greenhouse operating environment and the accuracy of deep learning, the Single Shot MultiBox Detector (SSD) was selected for its excellent anti-interference ability and its capacity to learn from datasets. The first step is to build datasets covering the various conditions in a greenhouse. According to the characteristics of cherry tomatoes, image samples with illumination change, image rotation and noise enhancement were used to expand the datasets. The training datasets were then used to train and construct the network model. To study the effect of the base network and the input size of the network, one contrast experiment was designed on the different base networks VGG16, MobileNet, and Inception V2, and another contrast experiment was conducted on changing the network input image size from 300 by 300 pixels to 512 by 512 pixels. Through analysis of the experimental results, it is found that Inception V2 is the best base network, with an average precision of 98.85% in the greenhouse environment. Compared with other detection methods, this method shows substantial improvement in cherry tomato detection.
23

Ramalingam, Balakrishnan, Anirudh Lakshmanan, Muhammad Ilyas, Anh Le, and Mohan Elara. "Cascaded Machine-Learning Technique for Debris Classification in Floor-Cleaning Robot Application." Applied Sciences 8, no. 12 (December 17, 2018): 2649. http://dx.doi.org/10.3390/app8122649.

Abstract:
Debris detection and classification is an essential function for autonomous floor-cleaning robots. It enables floor-cleaning robots to identify and avoid hard-to-clean debris, specifically large liquid spillage debris. This paper proposes a debris-detection and classification scheme for an autonomous floor-cleaning robot using a deep Convolutional Neural Network (CNN) and Support Vector Machine (SVM) cascaded technique. The SSD (Single-Shot MultiBox Detector) MobileNet CNN architecture is used for classifying the solid and liquid spill debris on the floor through the captured image. Then, the SVM model is employed for binary classification of liquid spillage regions based on size, which helps floor-cleaning devices to identify the larger liquid spillage debris regions, considered hard-to-clean debris in this work. The experimental results prove that the proposed technique can efficiently detect and classify the debris on the floor and achieves 95.5% classification accuracy. The cascaded approach takes approximately 71 milliseconds for the entire process of debris detection and classification, which implies that the proposed technique is suitable for deployment in real-time selective floor-cleaning applications.
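The cascade's second stage separates liquid spill regions by size. A toy sketch of the idea, with a plain area threshold standing in for the paper's SVM (the threshold value and function names are invented for illustration):

```python
def classify_spill(bbox, area_threshold=5000):
    """Label a detected liquid-spill bounding box as hard-to-clean
    (large) or routine (small) by its pixel area."""
    x1, y1, x2, y2 = bbox
    area = (x2 - x1) * (y2 - y1)
    return "hard_to_clean" if area >= area_threshold else "routine"

print(classify_spill((0, 0, 100, 100)))  # hard_to_clean (area 10000)
print(classify_spill((0, 0, 40, 40)))    # routine (area 1600)
```

An SVM plays the same role but learns the decision boundary from labeled region features instead of using a fixed cutoff.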
APA, Harvard, Vancouver, ISO, and other styles
24

Ou, Soobin, Huijin Park, and Jongwoo Lee. "Implementation of an Obstacle Recognition System for the Blind." Applied Sciences 10, no. 1 (December 30, 2019): 282. http://dx.doi.org/10.3390/app10010282.

Full text
Abstract:
The blind encounter commuting risks, such as failing to recognize and avoid obstacles while walking, but protective support systems are lacking. Acoustic signals at crosswalk lights are activated by button or remote control; however, these signals are difficult to operate and not always available (i.e., broken). Bollards are posts installed for pedestrian safety, but they can create dangerous situations in that the blind cannot see them. Therefore, we proposed an obstacle recognition system to assist the blind in walking safely outdoors; this system can recognize and guide the blind through two obstacles (crosswalk lights and bollards) with image training from the Google Object Detection application program interface (API) based on TensorFlow. The recognized results notify the blind through voice guidance playback in real time. The single shot multibox detector (SSD) MobileNet and faster region-convolutional neural network (R-CNN) models were applied to evaluate the obstacle recognition system; the latter model demonstrated better performance. Crosswalk lights were evaluated and found to perform better during the day than night. They were also analyzed to determine if a client could cross at a crosswalk, while the locations of bollards were analyzed by algorithms to guide the client by voice guidance.
APA, Harvard, Vancouver, ISO, and other styles
25

Sun, Chenfan, Wei Zhan, Jinhiu She, and Yangyang Zhang. "Object Detection from the Video Taken by Drone via Convolutional Neural Networks." Mathematical Problems in Engineering 2020 (October 13, 2020): 1–10. http://dx.doi.org/10.1155/2020/4013647.

Full text
Abstract:
The aim of this research is to show the implementation of object detection on drone videos using the TensorFlow object detection API. The research examines the recognition effect and performance of popular object detection algorithms and feature extractors for recognizing people, trees, cars, and buildings from real-world video frames taken by drones. The study found that different object detection algorithms applied to "normal" images (from an ordinary camera) differ in the number of detected instances, detection accuracy, and performance cost, and that applying the same algorithms to image data acquired by a drone yields different results. Object detection is a key part of realizing any robot's complete autonomy, and unmanned aerial vehicles (UAVs) are a very active area of this field. To explore the performance of state-of-the-art object detection algorithms on image data captured by UAVs, we conducted extensive experiments and compared two representative state-of-the-art convolutional object detection systems, SSD and Faster R-CNN, with MobileNet, GoogleNet/Inception, and ResNet50 base feature extractors.
APA, Harvard, Vancouver, ISO, and other styles
26

Li, Qing, Yingcheng Lin, and Wei He. "SSD7-FFAM: A Real-Time Object Detection Network Friendly to Embedded Devices from Scratch." Applied Sciences 11, no. 3 (January 25, 2021): 1096. http://dx.doi.org/10.3390/app11031096.

Full text
Abstract:
The high requirements for computing and memory are the biggest challenges in deploying existing object detection networks to embedded devices. Current lightweight object detectors directly use lightweight neural network architectures such as MobileNet or ShuffleNet pre-trained on large-scale classification datasets, which results in poor network structure flexibility and is not suitable for some specific scenarios. In this paper, we propose a lightweight object detection network, Single-Shot MultiBox Detector (SSD)7-Feature Fusion and Attention Mechanism (FFAM), which saves storage space and reduces the amount of calculation by reducing the number of convolutional layers. We offer a novel Feature Fusion and Attention Mechanism (FFAM) method to improve detection accuracy. Firstly, the FFAM method fuses high-level, semantically rich feature maps with low-level feature maps to improve the detection accuracy of small objects. A lightweight attention mechanism cascading channel and spatial attention modules is employed to enhance the target’s contextual information and guide the network to focus on its easy-to-recognize features. The SSD7-FFAM achieves 83.7% mean Average Precision (mAP), 1.66 MB of parameters, and a 0.033 s average running time on the NWPU VHR-10 dataset. The results indicate that the proposed SSD7-FFAM is more suitable for deployment to embedded devices for real-time object detection.
APA, Harvard, Vancouver, ISO, and other styles
27

Cob-Parro, Antonio Carlos, Cristina Losada-Gutiérrez, Marta Marrón-Romera, Alfredo Gardel-Vicente, and Ignacio Bravo-Muñoz. "Smart Video Surveillance System Based on Edge Computing." Sensors 21, no. 9 (April 23, 2021): 2958. http://dx.doi.org/10.3390/s21092958.

Full text
Abstract:
New processing methods based on artificial intelligence (AI) and deep learning are replacing traditional computer vision algorithms. The more advanced systems can process huge amounts of data in large computing facilities. In contrast, this paper presents a smart video surveillance system executing AI algorithms in low power consumption embedded devices. The computer vision algorithm, typical for surveillance applications, aims to detect, count and track people’s movements in the area. This application requires a distributed smart camera system. The proposed AI application allows detecting people in the surveillance area using a MobileNet-SSD architecture. In addition, using a robust Kalman filter bank, the algorithm can keep track of people in the video also providing people counting information. The detection results are excellent considering the constraints imposed on the process. The selected architecture for the edge node is based on a UpSquared2 device that includes a vision processor unit (VPU) capable of accelerating the AI CNN inference. The results section provides information about the image processing time when multiple video cameras are connected to the same edge node, people detection precision and recall curves, and the energy consumption of the system. The discussion of results shows the usefulness of deploying this smart camera node throughout a distributed surveillance system.
APA, Harvard, Vancouver, ISO, and other styles
28

Khan, Ali Haider, Muzammil Hussain, and Muhammad Kamran Malik. "Cardiac Disorder Classification by Electrocardiogram Sensing Using Deep Neural Network." Complexity 2021 (March 23, 2021): 1–8. http://dx.doi.org/10.1155/2021/5512243.

Full text
Abstract:
Cardiac disease is the leading cause of death worldwide. Cardiovascular diseases can be prevented if an effective diagnosis is made at the initial stages. The ECG test is the diagnostic assistant tool for screening of cardiac disorders. This research proposes a cardiac disorder detection system based on 12-lead ECG images. Healthcare institutes use various ECG equipment that presents results in nonuniform formats of ECG images, so the research study proposes a generalized methodology to process all formats of ECG. A Single Shot Detector (SSD) MobileNet v2-based deep neural network architecture was used for cardiovascular disease detection. The study focused on detecting four major cardiac abnormalities (i.e., myocardial infarction, abnormal heartbeat, previous history of MI, and the normal class), with 98% accuracy. The work is relatively rare with respect to its dataset: a collection of 11,148 standard 12-lead ECG images used in this study was manually collected from healthcare institutes and annotated by domain experts. The study achieved high accuracy in differentiating and detecting the four major cardiac abnormalities. Several cardiologists manually verified the proposed system’s accuracy and recommended that the proposed system can be used to screen for cardiac disorders.
APA, Harvard, Vancouver, ISO, and other styles
29

Wang, Shuyu, Mingxin Zhao, Runjiang Dou, Shuangming Yu, Liyuan Liu, and Nanjian Wu. "A Compact High-Quality Image Demosaicking Neural Network for Edge-Computing Devices." Sensors 21, no. 9 (May 8, 2021): 3265. http://dx.doi.org/10.3390/s21093265.

Full text
Abstract:
Image demosaicking has been an essential and challenging problem among the most crucial steps of image processing behind image sensors. Due to the rapid development of intelligent processors based on deep learning, several demosaicking methods based on convolutional neural networks (CNNs) have been proposed. However, with their large numbers of model parameters, it is difficult for these networks to run in real time on edge-computing devices. This paper presents a compact demosaicking neural network based on the UNet++ structure. The network inserts densely connected layer blocks and adopts Gaussian smoothing layers instead of down-sampling operations before the backbone network. The densely connected blocks can extract mosaic image features efficiently by utilizing the correlation between feature maps. Furthermore, the block adopts depthwise separable convolutions to reduce the model parameters; the Gaussian smoothing layer can expand the receptive field without down-sampling the image size and discarding image information. The size constraints on the input and output images can also be relaxed, and the quality of demosaicked images is improved. Experiment results show that the proposed network can improve the running speed by 42% compared with the fastest CNN-based method and achieve reconstruction quality comparable to it on four mainstream datasets. Besides, when we carry out inference on the demosaicked images with typical deep CNN networks, MobileNet v1 and SSD, the accuracy reaches 85.83% (top-5) and 75.44% (mAP), which is comparable to existing methods. The proposed network has the highest computing efficiency and lowest parameter count of all the methods, demonstrating that it is well suited for applications on modern edge computing devices.
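A same-size Gaussian smoothing layer of the kind described (blurring to grow the receptive field instead of down-sampling) can be sketched with separable 1-D filtering in NumPy; the sigma, kernel radius, and edge padding below are illustrative choices, not the paper's exact layer:

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def gaussian_smooth(img, sigma=1.0):
    """Same-size Gaussian smoothing of a 2-D map: blurs (enlarging the
    effective receptive field) without shrinking the feature map the way
    strided pooling or down-sampling would."""
    radius = int(3 * sigma)
    k = gaussian_kernel1d(sigma, radius)
    # Separable filtering: convolve rows, then columns, with edge padding
    # so no border information is discarded.
    pad = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
    return out

img = np.random.default_rng(1).random((32, 32))
smoothed = gaussian_smooth(img, sigma=1.5)
print(smoothed.shape)  # (32, 32): spatial size preserved, unlike 2x down-sampling
```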
APA, Harvard, Vancouver, ISO, and other styles
30

EL-Bana, Shimaa, Ahmad Al-Kabbany, and Maha Sharkas. "A Two-Stage Framework for Automated Malignant Pulmonary Nodule Detection in CT Scans." Diagnostics 10, no. 3 (February 28, 2020): 131. http://dx.doi.org/10.3390/diagnostics10030131.

Full text
Abstract:
This research is concerned with malignant pulmonary nodule detection (PND) in low-dose CT scans. Due to its crucial role in the early diagnosis of lung cancer, PND has considerable potential in improving the survival rate of patients. We propose a two-stage framework that exploits the ever-growing advances in deep neural network models, and that is comprised of a semantic segmentation stage followed by localization and classification. We employ the recently published DeepLab model for semantic segmentation, and we show that it significantly improves the accuracy of nodule detection compared to the classical U-Net model and its most recent variants. Using the widely adopted Lung Nodule Analysis dataset (LUNA16), we evaluate the performance of the semantic segmentation stage by adopting two network backbones, namely, MobileNet-V2 and Xception. We present the impact of various model training parameters and the computational time on the detection accuracy, featuring a 79.1% mean intersection-over-union (mIoU) and an 88.34% dice coefficient. This represents a mIoU increase of 60% and a dice coefficient increase of 30% compared to U-Net. The second stage involves feeding the output of the DeepLab-based semantic segmentation to a localization-then-classification stage. The second stage is realized using Faster RCNN and SSD, with an Inception-V2 as a backbone. On LUNA16, the two-stage framework attained a sensitivity of 96.4%, outperforming other recent models in the literature, including deep models. Finally, we show that adopting a transfer learning approach, particularly, the DeepLab model weights of the first stage of the framework, to infer binary (malignant-benign) labels on the Kaggle dataset for pulmonary nodules achieves a classification accuracy of 95.66%, which represents approximately 4% improvement over the recent literature.
APA, Harvard, Vancouver, ISO, and other styles
31

Yu, Jingrui, Roman Seidel, and Gangolf Hirtz. "OmniPD: One-Step Person Detection in Top-View Omnidirectional Indoor Scenes." Current Directions in Biomedical Engineering 5, no. 1 (September 1, 2019): 239–44. http://dx.doi.org/10.1515/cdbme-2019-0061.

Full text
Abstract:
We propose a one-step person detector for top-view omnidirectional indoor scenes based on convolutional neural networks (CNNs). While state-of-the-art person detectors reach competitive results on perspective images, the lack of CNN architectures and training data that follow the distortion of omnidirectional images makes current approaches inapplicable to our data. The method predicts bounding boxes of multiple persons directly in omnidirectional images without perspective transformation, which reduces pre- and post-processing overhead and enables real-time performance. The basic idea is to utilize transfer learning to fine-tune CNNs trained on perspective images, with data augmentation techniques, for detection in omnidirectional images. We fine-tune two variants of Single Shot MultiBox Detectors (SSDs): the first uses MobileNet v1 FPN as feature extractor (moSSD), the second ResNet50 v1 FPN (resSSD). Both models are pre-trained on the Microsoft Common Objects in Context (COCO) dataset. We fine-tune both models on the PASCAL VOC07 and VOC12 datasets, specifically on the class person. Random 90-degree rotation and random vertical flipping are used for data augmentation, in addition to the methods proposed by the original SSD. We reach an average precision (AP) of 67.3% with moSSD and 74.9% with resSSD on the evaluation dataset. To enhance the fine-tuning process, we add a subset of the HDA Person dataset and a subset of the PIROPO database and reduce the number of perspective images to PASCAL VOC07. The AP rises to 83.2% for moSSD and 86.3% for resSSD, respectively. The average inference speed is 28 ms per image for moSSD and 38 ms per image for resSSD using an Nvidia Quadro P6000. Our method is applicable to other CNN-based object detectors and can potentially generalize to detecting other objects in omnidirectional images.
APA, Harvard, Vancouver, ISO, and other styles
32

Shourov, Chowdhury Erfan, Mahasweta Sarkar, Arash Jahangiri, and Christopher Paolini. "Deep Learning Architectures for Skateboarder–Pedestrian Surrogate Safety Measures." Future Transportation 1, no. 2 (September 12, 2021): 387–413. http://dx.doi.org/10.3390/futuretransp1020022.

Full text
Abstract:
Skateboarding as a method of transportation has become prevalent, which has increased the occurrence and likelihood of pedestrian–skateboarder collisions and near-collision scenarios in shared-use roadway areas. Collisions between pedestrians and skateboarders can result in significant injury. New approaches are needed to evaluate shared-use areas prone to hazardous pedestrian–skateboarder interactions, and perform real-time, in situ (e.g., on-device) predictions of pedestrian–skateboarder collisions as road conditions vary due to changes in land usage and construction. A mechanism called the Surrogate Safety Measures for skateboarder–pedestrian interaction can be computed to evaluate high-risk conditions on roads and sidewalks using deep learning object detection models. In this paper, we present the first ever skateboarder–pedestrian safety study leveraging deep learning architectures. We view and analyze state of the art deep learning architectures, namely the Faster R-CNN and two variants of the Single Shot Multi-box Detector (SSD) model to select the correct model that best suits two different tasks: automated calculation of Post Encroachment Time (PET) and finding hazardous conflict zones in real-time. We also contribute a new annotated data set that contains skateboarder–pedestrian interactions that has been collected for this study. Both our selected models can detect and classify pedestrians and skateboarders correctly and efficiently. However, due to differences in their architectures and based on the advantages and disadvantages of each model, both models were individually used to perform two different set of tasks. Due to improved accuracy, the Faster R-CNN model was used to automate the calculation of post encroachment time, whereas to determine hazardous regions in real-time, due to its extremely fast inference rate, the Single Shot Multibox MobileNet V1 model was used. 
An outcome of this work is a model that can be deployed on low-cost, small-footprint mobile and IoT devices at traffic intersections with existing cameras to perform on-device inferencing for in situ Surrogate Safety Measurement (SSM), such as Time-To-Collision (TTC) and Post Encroachment Time (PET). SSM values that exceed a hazard threshold can be published to a Message Queuing Telemetry Transport (MQTT) broker, where messages are received by an intersection traffic signal controller for real-time signal adjustment, thus contributing to state-of-the-art vehicle and pedestrian safety at hazard-prone intersections.
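The Post Encroachment Time measure used above has a simple definition: the elapsed time between the first road user leaving a conflict zone and the second one entering it. A minimal sketch, with hypothetical timestamps and an illustrative hazard threshold:

```python
def post_encroachment_time(first_exit_t, second_entry_t):
    """PET: time between the first road user leaving a conflict zone and
    the second one entering it. A smaller PET indicates a nearer miss."""
    return second_entry_t - first_exit_t

# Hypothetical per-track presence intervals (enter, exit) for one conflict
# zone; timestamps in seconds, e.g. derived as frame_index / fps from the
# detector's per-frame outputs.
skateboarder_in_zone = [(2.0, 3.2)]
pedestrian_in_zone   = [(3.7, 5.0)]

pet = post_encroachment_time(skateboarder_in_zone[0][1], pedestrian_in_zone[0][0])
print(round(pet, 2))  # 0.5 seconds between exit and entry

HAZARD_THRESHOLD = 1.0  # illustrative threshold, not the paper's value
print(pet < HAZARD_THRESHOLD)  # True -> this interaction would be flagged
```

In the deployed system, a flagged value like this is what would be published to the MQTT broker.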
APA, Harvard, Vancouver, ISO, and other styles
33

Zhan, Wei, et al. "Electric Equipment Inspection on High Voltage Transmission Line Via Mobile Net-SSD." CONVERTER, July 9, 2021, 527–40. http://dx.doi.org/10.17762/converter.225.

Full text
Abstract:
Daily checks and inspections of electrical utilities on transmission lines, to find faults or malfunctions and analyze the data, are needed to ensure the normal state of electrical equipment, yet are genuinely difficult in many situations. Automated inspection of power transmission infrastructure by robots or drones is an indispensable way to assure the safety of power transmission, and detection and classification of objects in the power transmission infrastructure is the prerequisite for automatic inspection. In our experiments, we created dedicated datasets of electric equipment on power transmission lines for multi-object detection, covering data collection, preprocessing, and annotation. We conducted multiple experiments and compared state-of-the-art deep learning methods such as Faster R-CNN, Mask R-CNN, YOLO, and SSD, with MobileNet as the base feature extractor, to detect electric equipment on power transmission lines. For condition monitoring and diagnosis of important electric equipment on transmission lines, the proposed detection approach uses the Single-Shot MultiBox Detector (SSD), a powerful deep meta-architecture. The results show that our method can detect defects in electric equipment on high-voltage transmission lines more accurately and rapidly than lightweight network methods and traditional deep learning methods, shedding new light on defect detection in real scenarios. The main goal of our research is to demonstrate object detection for electric equipment inspection on high-voltage transmission lines in drone video using MobileNet-SSD object detection and recognition.
APA, Harvard, Vancouver, ISO, and other styles
34

"Insect Identification Among Deep Learning’s Meta-Architectures using Tensor Flow." International Journal of Engineering and Advanced Technology 9, no. 1 (October 30, 2019): 1910–14. http://dx.doi.org/10.35940/ijeat.a1031.109119.

Full text
Abstract:
Agriculture provides food for human existence, but insects damage the crops. Identifying an insect is a difficult process subject to expert opinion. In recent years, research using deep learning for object detection has become widespread and shows accurate results. This study compares three widely used deep learning meta-architectures (Faster R-CNN, SSD Inception, and SSD MobileNet) as object detectors for selected flying insects, namely Phyllophaga spp., Helicoverpa armigera, and Spodoptera litura. The proposed study focuses on the accuracy of the selected meta-architectures on a small insect dataset. The meta-architectures were tested in the same environment, and the Faster R-CNN meta-architecture performed best, with an accuracy of 95.33%.
APA, Harvard, Vancouver, ISO, and other styles
35

Danilov, Viacheslav, Olga Gerget, Kirill Klyshnikov, Evgeny Ovcharenko, and Alejandro Frangi. "Comparative Study of Deep Learning Models for Automatic Coronary Stenosis Detection in X-ray Angiography." Proceedings of the 30th International Conference on Computer Graphics and Machine Vision (GraphiCon 2020). Part 2, December 17, 2020, paper75–1—paper75–11. http://dx.doi.org/10.51130/graphicon-2020-2-3-75.

Full text
Abstract:
The article explores the application of a machine learning approach to detect both single-vessel and multivessel coronary artery disease from X-ray angiography. Since the interpretation of coronary angiography images requires interventional cardiologists to have considerable training, our study is aimed at analysing, training, and assessing the potential of existing object detectors for classifying and detecting coronary artery stenosis using angiographic imaging series. 100 patients who underwent coronary angiography at the Research Institute for Complex Issues of Cardiovascular Diseases were retrospectively enrolled in the study. To automate the medical data analysis, we examined and compared three models (SSD MobileNet V1, Faster-RCNN ResNet-50 V1, Faster-RCNN NASNet) with different architectures, network complexity, and numbers of weights. To compare the developed deep learning models, we used the mean Average Precision (mAP) metric, training time, and inference time. Testing results show that the training/inference time is directly proportional to the model complexity; thus, Faster-RCNN NASNet demonstrates the slowest inference time, with a mean inference time per image of 880 ms. In terms of accuracy, Faster-RCNN ResNet-50 V1 demonstrates the highest prediction accuracy, reaching a mAP of 0.92 on the validation dataset. SSD MobileNet V1 demonstrated the best inference time, with an inference rate of 23 frames per second.
APA, Harvard, Vancouver, ISO, and other styles
36

Murthy, Chintakindi Balaram, Mohammad Farukh Hashmi, and Avinash G. Keskar. "Optimized MobileNet + SSD: a real-time pedestrian detection on a low-end edge device." International Journal of Multimedia Information Retrieval, July 16, 2021. http://dx.doi.org/10.1007/s13735-021-00212-7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

"Object Detection and Classification for Autonomous Drones." International Journal of Recent Technology and Engineering 8, no. 6 (March 30, 2020): 3162–65. http://dx.doi.org/10.35940/ijrte.f8862.038620.

Full text
Abstract:
Detecting and classifying objects in a single frame that contains several objects is a cumbersome task. With the advancement of deep learning techniques, the rate of accuracy has increased significantly. This paper aims to implement a state-of-the-art custom algorithm for detection and classification of objects in a single frame, with the goal of attaining high accuracy with real-time performance. The proposed system utilizes the SSD architecture coupled with MobileNet to achieve maximum accuracy. The system is fast enough to detect and recognize multiple objects even at 30 FPS.
APA, Harvard, Vancouver, ISO, and other styles
38

Firmansyah, Muhammad Hafidh, Seok-Joo Koh, Wahyu Kurnia Dewanto, and Trismayanti Dwi Puspitasari. "Light-weight MobileNet for Fast Detection of COVID-19." Jurnal Teknologi Informasi dan Terapan 8, no. 1 (July 1, 2021). http://dx.doi.org/10.25047/jtit.v8i1.214.

Full text
Abstract:
The machine learning models based on Convolutional Neural Networks (CNNs) can be effectively used for detection and recognition of objects, such as Corona Virus Disease 19 (COVID-19). In particular, the MobileNet and Single Shot multi-box Detector (SSD) have recently been proposed as the machine learning model for object detection. However, there are still some challenges for deployment of such architectures on the embedded devices, due to the limited computational power. Another problem is that the accuracy of the associated machine learning model may be decreased, depending on the number of concerned parameters and layers. This paper proposes a light-weight MobileNet (LMN) architecture that can be used to improve the accuracy of the machine learning model, with a small number of layers and lower computation time, compared to the existing models. By experimentation, we show that the proposed LMN model can be effectively used for detection of COVID-19 virus. The proposed LMN can achieve the accuracy of 98% with the file size of 27.8 Mbits by replacing the standard CNN layers with separable convolutional layers.
APA, Harvard, Vancouver, ISO, and other styles
39

"Pembuatan Aplikasi Deteksi Objek Menggunakan TensorFlow Object Detection API dengan Memanfaatkan SSD MobileNet V2 Sebagai Model Pra - Terlatih." Jurnal Ilmiah Komputasi 19, no. 3 (March 30, 2020). http://dx.doi.org/10.32409/jikstik.19.3.68.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Fauzi, Willy Achmat, Supeno M. Susiki Nugroho, Eko Mulyanto Yuniarno, Wiwik Anggraeni, and Mauridhi Hery Purnomo. "Multiple Face Tracking using Kalman and Hungarian Algorithm to Reduce Face Recognition Computational Cost." JAREE (Journal on Advanced Research in Electrical Engineering) 5, no. 1 (April 1, 2021). http://dx.doi.org/10.12962/jaree.v5i1.191.

Full text
Abstract:
Currently, research in face recognition systems mainly utilizes deep learning to achieve high accuracy. With deep learning as the base platform, per-frame image processing to detect and recognize faces is computationally expensive, especially for video surveillance systems in which large numbers of mounted cameras simultaneously stream video data to the system. The idea behind this research is that the system does not need to recognize every occurrence of a face in every frame. We used MobileNet SSD to detect faces, a Kalman filter to predict face locations in the next frame when detection fails, and the Hungarian algorithm to maintain the identity of each face. Using our algorithm, the 87,832 faces that would otherwise need to be recognized were reduced to only 204, while running in a real-time scenario. This method is proven usable in surveillance systems by reducing computational cost.
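The identity-maintenance step of such a pipeline, assigning new detections to Kalman-predicted track positions so each face keeps its label, is a small assignment problem. A minimal sketch with hypothetical coordinates; brute-force enumeration stands in here for the Hungarian algorithm, which solves the same minimum-cost assignment in polynomial time:

```python
from itertools import permutations

def match_tracks(predicted, detected):
    """Assign detections to predicted track positions by minimising total
    squared distance. Brute force over permutations (fine for a handful of
    faces); the Hungarian algorithm solves this optimally in O(n^3)."""
    n = len(predicted)
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(detected)), n):
        cost = sum((predicted[i][0] - detected[j][0]) ** 2 +
                   (predicted[i][1] - detected[j][1]) ** 2
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return dict(enumerate(best))  # track index -> detection index

# Hypothetical face centres: Kalman predictions vs. new SSD detections.
predicted = [(100, 120), (300, 80)]
detected  = [(305, 83), (98, 118)]
print(match_tracks(predicted, detected))  # {0: 1, 1: 0}
```

Because each detection inherits an existing track identity, the expensive recognition network only needs to run when a new, unmatched track appears.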
APA, Harvard, Vancouver, ISO, and other styles
41

"Effective Deep Learning Based Architecture for Pedestrian Detection from Digital Images." International Journal of Engineering and Advanced Technology 9, no. 3 (February 29, 2020): 1498–508. http://dx.doi.org/10.35940/ijeat.b4225.029320.

Full text
Abstract:
This paper presents an efficient and fast deep learning algorithm based on neural networks for object detection and pedestrian detection. The technique, called MobileNet Single Shot Detector, is an extension of convolutional neural networks. It is based on depthwise separable convolutions, which build a lightweight deep convolutional network: a single filter is applied to each input channel, and the outputs are combined by a pointwise convolution. The Single Shot MultiBox Detector is a feed-forward convolutional network that is combined with MobileNet to give efficient and accurate results; MobileNet combined with the SSD and MultiBox technique makes it much faster than SSD alone. The accuracy of this technique is calculated over colored (RGB) images and also over infrared images, and its results are compared with those of a shallow machine learning feature-extraction-plus-classification technique, namely HOG plus SVM. The performance comparison between the proposed deep learning and shallow learning techniques was conducted over a benchmark dataset, with validation testing over our own dataset, in order to measure the efficiency of both algorithms and find an algorithm that works quickly and accurately for real-world pedestrian detection.
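The parameter savings from the depthwise separable scheme described above (one filter per input channel, then a 1x1 pointwise mix) are easy to verify by counting weights; the layer sizes below are illustrative, not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise: one k x k filter per input channel (k*k*c_in weights),
    then a 1x1 pointwise convolution mixing channels (c_in*c_out weights)."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 32, 64
standard = conv_params(k, c_in, c_out)                   # 18432
separable = depthwise_separable_params(k, c_in, c_out)   # 288 + 2048 = 2336
print(standard, separable, round(standard / separable, 1))
```

For this example layer the separable form uses roughly 8x fewer weights, which is where MobileNet's speed advantage over a standard-convolution SSD backbone comes from.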
APA, Harvard, Vancouver, ISO, and other styles
42

Thohari, Afandi Nur Aziz, and Rifki Adhitama. "Real-Time Object Detection For Wayang Punakawan Identification Using Deep Learning." JURNAL INFOTEL 11, no. 4 (December 30, 2019). http://dx.doi.org/10.20895/infotel.v11i4.455.

Full text
Abstract:
Indonesia is a country with a variety of cultures, one of which is wayang kulit. This traditional Javanese performing art must continue to be preserved so that it is known by future generations. There are many wayang figures in Indonesia, and the most famous are the punakawan, consisting of four characters: Semar, Gareng, Petruk, and Bagong. To preserve wayang punakawan for the next generation, this study creates a system able to identify punakawan objects in real time using deep learning technology. The method used is the Single Shot MultiBox Detector (SSD), a deep learning model with a good ability to classify data with three-dimensional structure such as real-time video. The SSD model with MobileNet layers requires only light computation, so it can run in a real-time system. Classifying objects requires two steps: a training process and a testing process. The training process took 28 hours over 100,000 iterations, and its result is the model used to identify objects. Based on the test results, detection accuracy was 98.86%, proving that the system can detect objects in real time accurately.
APA, Harvard, Vancouver, ISO, and other styles
43

Danilov, Viacheslav V., Kirill Yu Klyshnikov, Olga M. Gerget, Anton G. Kutikhin, Vladimir I. Ganyukov, Alejandro F. Frangi, and Evgeny A. Ovcharenko. "Real-time coronary artery stenosis detection based on modern neural networks." Scientific Reports 11, no. 1 (April 7, 2021). http://dx.doi.org/10.1038/s41598-021-87174-2.

Full text
Abstract:
Invasive coronary angiography remains the gold standard for diagnosing coronary artery disease, which may be complicated by both patient-specific anatomy and image quality. Deep learning techniques aimed at detecting coronary artery stenoses may facilitate the diagnosis. However, previous studies have failed to achieve superior accuracy and performance for real-time labeling. Our study is aimed at confirming the feasibility of real-time coronary artery stenosis detection using deep learning methods. To reach this goal we trained and tested eight promising detectors based on different neural network architectures (MobileNet, ResNet-50, ResNet-101, Inception ResNet, NASNet) using clinical angiography data of 100 patients. Three neural networks demonstrated superior results. The network based on Faster-RCNN Inception ResNet V2 is the most accurate: it achieved a mean Average Precision of 0.95 and an F1-score of 0.96, with the slowest prediction rate of 3 fps on the validation subset. The relatively lightweight SSD MobileNet V2 network proved itself the fastest, with a lower mAP of 0.83, an F1-score of 0.80, and a mean prediction rate of 38 fps. The model based on RFCN ResNet-101 V2 demonstrated an optimal accuracy-to-speed ratio: its mAP is 0.94 and F1-score 0.96, while the prediction speed is 10 fps. The resulting performance-accuracy balance of the modern neural networks has confirmed the feasibility of real-time coronary artery stenosis detection supporting the decision-making process of the Heart Team interpreting coronary angiography findings.
44

"Deep Learning-Based Embedded System for Carabao Mango (Mangifera Indica L.) Sorting." International Journal of Recent Technology and Engineering 8, no. 2 (July 30, 2019): 5456–62. http://dx.doi.org/10.35940/ijrte.b3754.078219.

Full text
Abstract:
This paper presents the design and development of an embedded system for ‘Carabao’ or Philippine mango sorting utilizing deep learning techniques. In particular, the proposed system initially takes as input a top-view image of the mango, which is subsequently rolled over to evaluate every side. The input images are processed by a Single Shot MultiBox Detector (SSD) MobileNet for mango detection and a Multi-Task Learning Convolutional Neural Network (MTL-CNN) for ripeness and basic-quality classification/sorting, running on an embedded computer, i.e. a Raspberry Pi 3. Our dataset, consisting of 2800 mango images derived from about 270 distinct mango fruits, was annotated for multiple classification tasks, namely basic quality (defective or good) and ripeness (green, semi-ripe, and ripe). The mango detection results achieved a total precision score of 0.92 and a mean average precision (mAP) of over 0.8 in the final checkpoint. The basic-quality classification accuracies were 0.98 and 0.92 for defective and good quality, respectively, while the ripeness classification accuracies for green, ripe, and semi-ripe were 1.0, 1.0, and 0.91, respectively. Overall, the results demonstrated the feasibility of our proposed embedded system for image-based Carabao mango sorting using deep learning techniques.
45

Parikh, Vishal, Jay Mehta, Saumyaa Shah, and Priyanka Sharma. "Comparative Analysis of Key-frame Extraction Techniques For Video Summarization." Recent Advances in Computer Science and Communications 13 (July 10, 2020). http://dx.doi.org/10.2174/2666255813999200710131444.

Full text
Abstract:
Background: Technological advancement has improved the quality of human life; it has also led to humans producing large amounts of data in the form of text, images, and videos. Hence there is a need for significant effort in devising methodologies for analyzing and summarizing this data to cope with space constraints. Video summaries can be generated either from keyframes or from skims/shots. Keyframe extraction here is based on deep learning object detection techniques. Various object detection algorithms have been reviewed for generating and selecting the best possible frames as keyframes. A set of frames is extracted from the original video sequence and, depending on the technique used, one or more frames of the set are chosen as keyframes, which then become part of the summarized video. The paper discusses the selection of various keyframe extraction techniques in detail. Methods: The research is focused on summary generation for office surveillance videos. The summary generation is based on various keyframe extraction techniques, for which training models like MobileNet, SSD, and YOLO were used. A comparative analysis of their efficiency showed YOLO giving better performance than the others. Keyframe selection techniques like sufficient content change, maximum frame coverage, minimum correlation, curve simplification, and clustering based on human presence in the frame have been implemented. Results: Variable- and fixed-length video summaries were generated and analyzed for each keyframe selection technique on office surveillance videos. The analysis shows that the output video obtained using the clustering and curve simplification approaches is compressed to about half the size of the actual video and requires considerably less storage space.
The technique that depends on the change of frame content between consecutive frames for keyframe selection produces the best output for office surveillance videos. Conclusion: In this paper, we discussed the process of generating a synopsis of a video to highlight the important portions and discard the trivial and redundant parts. First, we described various object detection algorithms, like YOLO and SSD, used in conjunction with neural networks like MobileNet to obtain the probabilistic score of an object present in the video. These algorithms generate the probability of a person being part of the image for every frame in the input video. The results of object detection are passed to keyframe extraction algorithms to obtain the summarized video. Our comparative analysis of keyframe selection techniques for office videos will help in determining which technique is preferable.
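The "sufficient content change" rule the abstract highlights keeps a frame as a keyframe only when it differs enough from the last keyframe kept. A minimal sketch on synthetic frames, assuming mean absolute pixel difference as the change measure and an illustrative threshold (the paper's exact metric and threshold are not specified here):

```python
import numpy as np

def select_keyframes(frames, threshold=0.2):
    """Keep frame i as a keyframe when its mean absolute pixel change
    relative to the last kept keyframe exceeds the threshold."""
    if not frames:
        return []
    keyframes = [0]                       # the first frame is always kept
    last = frames[0].astype(np.float32)
    for i, frame in enumerate(frames[1:], start=1):
        cur = frame.astype(np.float32)
        change = np.mean(np.abs(cur - last)) / 255.0   # normalize to 0..1
        if change > threshold:
            keyframes.append(i)
            last = cur                    # later frames compare to this keyframe
    return keyframes

# Synthetic video: five identical dark frames, then one bright frame.
frames = [np.zeros((4, 4), dtype=np.uint8)] * 5 + [np.full((4, 4), 255, dtype=np.uint8)]
print(select_keyframes(frames, threshold=0.2))  # → [0, 5]
```

In the surveillance setting described above, the per-frame person-detection scores could gate this check so that only frames containing people are eligible as keyframes.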
46

Xu, Jie. "A deep learning approach to building an intelligent video surveillance system." Multimedia Tools and Applications, October 7, 2020. http://dx.doi.org/10.1007/s11042-020-09964-6.

Full text
Abstract:
Abstract Recent advances in the field of object detection and face recognition have made it possible to develop practical video surveillance systems with embedded object detection and face recognition functionalities that are accurate and fast enough for commercial uses. In this paper, we compare some of the latest approaches to object detection and face recognition and provide reasons why they may or may not be amongst the best to be used in video surveillance applications in terms of both accuracy and speed. It is discovered that Faster R-CNN with Inception ResNet V2 is able to achieve some of the best accuracies while maintaining real-time rates. Single Shot Detector (SSD) with MobileNet, on the other hand, is incredibly fast and still accurate enough for most applications. As for face recognition, FaceNet with Multi-task Cascaded Convolutional Networks (MTCNN) achieves higher accuracy than advances such as DeepFace and DeepID2+ while being faster. An end-to-end video surveillance system is also proposed which could be used as a starting point for more complex systems. Various experiments have also been attempted on trained models with observations explained in detail. We finish by discussing video object detection and video salient object detection approaches which could potentially be used as future improvements to the proposed system.
47

"Effect of Various Activation Function on Steering Angle Prediction in CNN based Autonomous Vehicle System." International Journal of Engineering and Advanced Technology 9, no. 2 (December 30, 2019): 3806–11. http://dx.doi.org/10.35940/ijeat.b4017.129219.

Full text
Abstract:
Autonomous or self-driving vehicles are set to become the main mode of transportation for future generations. They are highly reliable, very safe, and always improving, as they never stop learning. Numerous systems are currently being developed based on various techniques like behavioural cloning and reinforcement learning. Almost all of these systems work in a similar way: the agent (vehicle) is completely aware of its immediate surroundings and makes future decisions based on its own historical experiences. The proposed work involves the design and implementation of a Convolutional Neural Network (CNN) enhanced with a new activation function. The proposed CNN is trained to take a picture of the road ahead as input and output the required angle of tilt of the steering wheel. The model is trained using the behavioural cloning method and thus learns to navigate from the experiences of a human agent. This method is very accurate and efficient. In this paper, for the detection of objects and vehicles by the autonomous vehicle, the existing TensorFlow Object Detection API is combined with a pretrained SSD MobileNet model. The paper also presents a detailed literature survey of various techniques that have been used for steering angle prediction and object detection in self-driving cars. In addition, the effect of activation functions like ReLU, Sigmoid, and ELU on the CNN model is analysed.
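The three activation functions compared in this abstract differ mainly in how they treat negative inputs: ReLU zeroes them, sigmoid squashes everything into (0, 1), and ELU decays smoothly toward -alpha. A minimal sketch of the standard definitions (alpha=1.0 is the common ELU default, assumed here; the paper's own activation is not reproduced):

```python
import math

def relu(x):
    # Zero for negative inputs, identity otherwise.
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real input into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def elu(x, alpha=1.0):
    # Identity for positive x; decays smoothly toward -alpha for negative x.
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

for f in (relu, sigmoid, elu):
    print(f.__name__, f(-2.0), f(0.0), f(2.0))
```

For a steering-angle regressor, the key practical difference is in the negative region: ELU keeps a nonzero gradient there, which is one reason such comparisons matter for training convergence.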
48

Diamanta, Deon, and Hapnes Toba. "Pendeteksian Citra Pengunjung Menggunakan Single Shot Detector untuk Analisis dan Prediksi Seasonality." Jurnal Teknik Informatika dan Sistem Informasi 7, no. 1 (April 24, 2021). http://dx.doi.org/10.28932/jutisi.v7i1.3329.

Full text
Abstract:
This study analyses a retail store with time series methods to obtain information about sales trends and seasonality by examining visitor data and total transaction data over a time period. Data on the number of customers who visit are obtained through CCTV video camera recordings placed at retail store X, together with the total transactions that occurred at retail store X. Visitor counting uses a deep learning method with the SSD (Single Shot Detector) object detection framework and the MobileNet architecture. The libraries used to count the number of customers visiting the store are OpenCV, Pandas, NumPy, Dlib, and Imutils. The number of customers visiting the store is then compared to the number of transactions occurring at the same time, yielding a conversion rate, from which the sales trends at any time can be observed. Time series analysis is also carried out to determine and analyse the patterns in the data over time and to predict what needs to be done in the future. Through this research, information was successfully obtained about seasonality patterns, the value and interpretation of retail conversion rates, and models for predicting the number of visitors and transactions; the hypothesis was tested with the Wilcoxon method, obtaining a p-value of 0.014, which indicates that the pattern of customer counts is not the same as the pattern of transaction data.
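The conversion rate this abstract compares against is simply the ratio of transactions to counted visitors in the same time window. A minimal sketch of that per-period comparison (the visitor and transaction counts below are made-up example data, not figures from the study):

```python
def conversion_rates(visitors, transactions):
    """Per-period conversion rate: transactions / visitors.

    Returns None for periods with no counted visitors, where the
    rate is undefined.
    """
    rates = []
    for v, t in zip(visitors, transactions):
        rates.append(round(t / v, 2) if v else None)
    return rates

# Hourly visitor counts (from the SSD MobileNet people counter)
# against transaction counts (from the point-of-sale log).
visitors = [40, 55, 0, 80]
transactions = [10, 11, 0, 20]
print(conversion_rates(visitors, transactions))  # → [0.25, 0.2, None, 0.25]
```

Aggregating these per-period rates by hour of day or day of week is one straightforward way to surface the seasonality patterns the study investigates.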