Journal articles on the topic 'Robust Event-based Object Classification'

Consult the top 50 journal articles for your research on the topic 'Robust Event-based Object Classification.'

You can also download the full text of each publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Liu, Changyu, Bin Lu, and Huiling Li. "Secure Access Control and Large Scale Robust Representation for Online Multimedia Event Detection." Scientific World Journal 2014 (2014): 1–12. http://dx.doi.org/10.1155/2014/219732.

Abstract:
We developed an online multimedia event detection (MED) system. However, a secure access control issue and a large-scale robust representation issue arise when traditional event detection algorithms are integrated into the online environment. For the first issue, we proposed a tree proxy-based and service-oriented access control (TPSAC) model based on the traditional role-based access control model. Verification experiments were conducted on the CloudSim simulation platform, and the results showed that the TPSAC model is suitable for the access control of dynamic online environments. For the second issue, inspired by the object-bank scene descriptor, we proposed a 1000-object-bank (1000OBK) event descriptor. Feature vectors of the 1000OBK were extracted from response pyramids of 1000 generic object detectors that were trained on standard annotated image datasets, such as the ImageNet dataset. A spatial bag-of-words tiling approach was then adopted to encode these feature vectors, bridging the gap between objects and events. Furthermore, we performed experiments on event classification with the challenging TRECVID MED 2012 dataset, and the results showed that the robust 1000OBK event descriptor outperforms state-of-the-art approaches.
2

Venkateswara Rao, N., G. Anil Kumar, and B. Harish. "HOG based object detection and classification." International Journal of Engineering & Technology 7, no. 3.3 (2018): 151. http://dx.doi.org/10.14419/ijet.v7i3.3.15585.

Abstract:
The intention of the project is to classify objects in the real world and to track them throughout their life spans. Object detection algorithms use feature extraction and learning algorithms to classify an object category. Our algorithm uses a combination of a histogram of oriented gradients (HOG) and a support vector machine (SVM) classifier to classify objects. Results have shown this to be a robust method for both classifying objects and tracking them in the real world in real time.
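A minimal sketch of the HOG-plus-linear-SVM pipeline this abstract describes, using scikit-image and scikit-learn; `load_dataset` is a hypothetical placeholder for image/label loading, and the parameters follow common defaults rather than anything stated in the paper.

```python
# Minimal HOG + linear SVM classifier (illustrative sketch, not the authors' code).
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

def hog_features(image, size=(128, 64)):
    """Resize to a fixed detection window and extract a HOG descriptor."""
    image = resize(image, size, anti_aliasing=True)
    return hog(image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

images, labels = load_dataset()  # hypothetical loader: grayscale arrays + class ids
X = np.stack([hog_features(im) for im in images])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)

clf = LinearSVC(C=1.0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```

For tracking, the same descriptor can be recomputed per frame on the detected window and matched to the previous frame's detections.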
3

Saikrishnan, Venkatesan, and Mani Karthikeyan. "Mayfly Optimization with Deep Learning-based Robust Object Detection and Classification on Surveillance Videos." Engineering, Technology & Applied Science Research 13, no. 5 (2023): 11747–52. http://dx.doi.org/10.48084/etasr.6231.

Abstract:
Surveillance videos are recordings captured by video recording devices for monitoring and securing an area or property. These videos are frequently used in applications involving law enforcement, security systems, retail analytics, and traffic monitoring. Surveillance videos can provide valuable visual information for analyzing patterns, identifying individuals or objects of interest, and detecting and investigating incidents. Object detection and classification in video surveillance involve the use of computer vision techniques to identify and categorize objects within the video footage. Object detection algorithms are employed to locate and identify objects within each frame. These algorithms use various techniques, namely bounding box regression, Convolutional Neural Networks (CNNs), and feature extraction, to detect objects of interest. This study presents the Mayfly Optimization with Deep Learning-based Robust Object Detection and Classification (MFODL-RODC) method for surveillance videos. The main aim of the MFODL-RODC technique lies in the accurate classification and recognition of objects in surveillance videos. To accomplish this, the MFODL-RODC method follows a two-step process consisting of object detection and object classification. It uses the EfficientDet object detector for the object detection process, while the classification of detected objects takes place using a Variational Autoencoder (VAE) model. The MFO algorithm is employed to enrich the performance of the VAE model. The simulation examination of the MFODL-RODC technique is performed on benchmark datasets. The extensive results accentuated the improved performance of the MFODL-RODC method over other existing algorithms, with an accuracy of 98.89%.
4

El Shair, Zaid, and Samir A. Rawashdeh. "High-Temporal-Resolution Object Detection and Tracking Using Images and Events." Journal of Imaging 8, no. 8 (2022): 210. http://dx.doi.org/10.3390/jimaging8080210.

Abstract:
Event-based vision is an emerging field of computer vision that offers unique properties, such as asynchronous visual output, high temporal resolutions, and dependence on brightness changes, to generate data. These properties can enable robust high-temporal-resolution object detection and tracking when combined with frame-based vision. In this paper, we present a hybrid, high-temporal-resolution object detection and tracking approach that combines learned and classical methods using synchronized images and event data. Off-the-shelf frame-based object detectors are used for initial object detection and classification. Then, event masks, generated per detection, are used to enable inter-frame tracking at varying temporal resolutions using the event data. Detections are associated across time using a simple, low-cost association metric. Moreover, we collect and label a traffic dataset using the hybrid sensor DAVIS 240c. This dataset is utilized for quantitative evaluation using state-of-the-art detection and tracking metrics. We provide ground truth bounding boxes and object IDs for each vehicle annotation. Further, we generate high-temporal-resolution ground truth data to analyze tracking performance at different temporal rates. Our approach shows promising results, with minimal performance deterioration at higher temporal resolutions (48–384 Hz) when compared with the baseline frame-based performance at 24 Hz.
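The inter-frame association step lends itself to a compact illustration: each tracked box is nudged to stay centred on the events that fall inside it, then greedily matched to the next frame's detections by IoU. The sketch below is a generic reconstruction under those assumptions, not the authors' implementation.

```python
# Event-guided box propagation + greedy IoU association (illustrative sketch).
import numpy as np

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def shift_box_by_events(box, events):
    """Re-centre a box on the recent events it contains.
    events: (N, 2) array of (x, y) event coordinates."""
    x1, y1, x2, y2 = box
    inside = events[(events[:, 0] >= x1) & (events[:, 0] <= x2) &
                    (events[:, 1] >= y1) & (events[:, 1] <= y2)]
    if len(inside) == 0:
        return box
    cx, cy = inside.mean(axis=0)
    w, h = (x2 - x1) / 2, (y2 - y1) / 2
    return (cx - w, cy - h, cx + w, cy + h)

def associate(tracks, detections, threshold=0.3):
    """Greedy IoU matching of propagated track boxes with new detections."""
    matches, used = [], set()
    for ti, tbox in enumerate(tracks):
        best, best_iou = None, threshold
        for di, dbox in enumerate(detections):
            if di not in used and iou(tbox, dbox) > best_iou:
                best, best_iou = di, iou(tbox, dbox)
        if best is not None:
            used.add(best)
            matches.append((ti, best))
    return matches
```

Because events arrive asynchronously, `shift_box_by_events` can be run many times between frames, which is what enables tracking above the frame rate.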
5

Song, Hanbin, Sanghyeop Yeo, Youngwan Jin, et al. "Short-Wave Infrared (SWIR) Imaging for Robust Material Classification: Overcoming Limitations of Visible Spectrum Data." Applied Sciences 14, no. 23 (2024): 11049. http://dx.doi.org/10.3390/app142311049.

Abstract:
This paper presents a novel approach to material classification using short-wave infrared (SWIR) imaging, aimed at applications where differentiating visually similar objects based on material properties is essential, such as in autonomous driving. Traditional vision systems, relying on visible spectrum imaging, struggle to distinguish between objects with similar appearances but different material compositions. Our method leverages SWIR’s distinct reflectance characteristics, particularly for materials containing moisture, and demonstrates a significant improvement in accuracy. Specifically, SWIR data achieved near-perfect classification results with an accuracy of 99% for distinguishing real from artificial objects, compared to 77% with visible spectrum data. In object detection tasks, our SWIR-based model achieved a mean average precision (mAP) of 0.98 for human detection and up to 1.00 for other objects, demonstrating its robustness in reducing false detections. This study underscores SWIR’s potential to enhance object recognition and reduce ambiguity in complex environments, offering a valuable contribution to material-based object recognition in autonomous driving, manufacturing, and beyond.
6

Guan, Yurong, Muhammad Aamir, Zhihua Hu, et al. "A Region-Based Efficient Network for Accurate Object Detection." Traitement du Signal 38, no. 2 (2021): 481–94. http://dx.doi.org/10.18280/ts.380228.

Abstract:
Object detection in images is an important task in image processing and computer vision. Many approaches are available for object detection. For example, there are numerous algorithms for object positioning and classification in images. However, current methods perform poorly and lack experimental verification. Thus, it is a fascinating and challenging issue to position and classify image objects. Drawing on the recent advances in image object detection, this paper develops a region-based efficient network for accurate object detection in images. To improve the overall detection performance, image object detection was treated as a twofold problem, involving object proposal generation and object classification. First, a framework was designed to generate high-quality, class-independent, accurate proposals. Then, these proposals, together with their input images, were imported into our network to learn convolutional features. To boost detection efficiency, the number of proposals was reduced by a network refinement module, leaving only a few eligible candidate proposals. After that, the refined candidate proposals were loaded into the detection module to classify the objects. The proposed model was tested on the test set of the famous PASCAL Visual Object Classes Challenge 2007 (VOC2007). The results clearly demonstrate that our model achieved robust overall detection efficiency over existing approaches using fewer or more proposals, in terms of recall, mean average best overlap (MABO), and mean average precision (mAP).
7

Guan, Yurong, Muhammad Aamir, Zhihua Hu, et al. "An Object Detection Framework Based on Deep Features and High-Quality Object Locations." Traitement du Signal 38, no. 3 (2021): 719–30. http://dx.doi.org/10.18280/ts.380319.

Abstract:
Object detection has long been a fundamental issue in computer vision. Despite being widely studied, it remains a challenging task in the current body of knowledge. Many researchers are eager to develop a more robust and efficient mechanism for object detection. In the extant literature, promising results are achieved by many novel approaches to object detection and classification. However, there is ample room to further enhance detection efficiency. Therefore, this paper proposes an image object detection and classification method using a deep neural network (DNN) based on high-quality object locations. The proposed method first derives high-quality, class-independent object proposals (locations) by computing multiple hierarchical segments with superpixels. Next, the proposals are ranked by region score, i.e., the number of contours wholly enclosed in the proposed region. After that, the top-ranking object proposals are adopted for post-classification by the DNN. During the post-classification, the network extracts the eigenvectors from the proposals and then maps the features with the softmax classifier, thereby determining the class of each object. The proposed method was found superior to traditional approaches through an evaluation on the Pascal VOC 2007 dataset.
8

Ghadi, Yazeed Yasin, Adnan Ahmed Rafique, Tamara al Shloul, Suliman A. Alsuhibany, Ahmad Jalal, and Jeongmin Park. "Robust Object Categorization and Scene Classification over Remote Sensing Images via Features Fusion and Fully Convolutional Network." Remote Sensing 14, no. 7 (2022): 1550. http://dx.doi.org/10.3390/rs14071550.

Abstract:
The latest visionary technologies have made an evident impact on remote sensing scene classification. Scene classification is one of the most challenging yet important tasks in understanding high-resolution aerial and remote sensing scenes. In this discipline, deep learning models, particularly convolutional neural networks (CNNs), have made outstanding accomplishments. Deep feature extraction from a CNN model is a frequently utilized technique in these approaches. Although CNN-based techniques have achieved considerable success, there is still ample room for improvement in their classification accuracies. Certainly, fusion with other features has the potential to extensively improve the performance of remote sensing scene classification. This paper, thus, offers an effective hybrid model based on the concept of feature-level fusion. We use the fuzzy C-means segmentation technique to appropriately segment various objects in the remote sensing images. The segmented regions of the image are then labeled using a Markov random field (MRF). After the segmentation and labeling of the objects, classical and CNN features are extracted and combined to classify the objects. After categorizing the objects, object-to-object relations are studied. Finally, these objects are transmitted to a fully convolutional network (FCN) for scene classification along with their relationship triplets. The experimental evaluation on three publicly available standard datasets reveals the phenomenal performance of the proposed system.
9

Lang, Graham K., and Peter Seitz. "Robust classification of arbitrary object classes based on hierarchical spatial feature-matching." Machine Vision and Applications 10, no. 3 (1997): 123–35. http://dx.doi.org/10.1007/s001380050065.

10

Konstantinov, A. V., S. R. Kirpichenko, and L. V. Utkin. "Generating Survival Interpretable Trajectories and Data." Doklady Mathematics 110, S1 (2024): S75–S86. https://doi.org/10.1134/s1064562424601999.

Abstract:
A new model for generating survival trajectories and data, based on applying an autoencoder of a specific structure, is proposed. It solves three tasks. First, it provides predictions in the form of the expected event time and the survival function for a new feature vector, based on the Beran estimator. Second, the model generates additional data based on a given training set to supplement the original dataset. Third, and most important, it generates a prototype time-dependent trajectory for an object, which characterizes how the object's features could be changed to achieve a different time to event. The trajectory can be viewed as a type of counterfactual explanation. The proposed model is robust during training and inference due to a specific weighting scheme incorporated into the variational autoencoder. The model also determines the censoring indicators of newly generated data by solving a classification task. The paper demonstrates the efficiency and properties of the proposed model using numerical experiments on synthetic and real datasets. The code of the algorithm implementing the proposed model is publicly available.
11

Kabir, Raihan, Yutaka Watanobe, Md Rashedul Islam, Keitaro Naruse, and Md Mostafizer Rahman. "Unknown Object Detection Using a One-Class Support Vector Machine for a Cloud–Robot System." Sensors 22, no. 4 (2022): 1352. http://dx.doi.org/10.3390/s22041352.

Abstract:
Inter-robot communication and high computational power are challenging issues for deploying indoor mobile robot applications with sensor data processing. Thus, this paper presents an efficient cloud-based multirobot framework with inter-robot communication and high computational power to deploy autonomous mobile robots for indoor applications. Deployment of usable indoor service robots requires uninterrupted movement and enhanced robot vision with a robust classification of objects and obstacles using vision sensor data in the indoor environment. However, state-of-the-art methods face degraded indoor object and obstacle recognition for multiobject vision frames and unknown objects in complex and dynamic environments. From these points of view, this paper proposes a new object segmentation model to separate objects from a multiobject robotic view-frame. In addition, we present a support vector data description (SVDD)-based one-class support vector machine for detecting unknown objects in an outlier detection fashion for the classification model. A cloud-based convolutional neural network (CNN) model with a SoftMax classifier is used for training and identification of objects in the environment, and an incremental learning method is introduced for adding unknown objects to the robot knowledge. A cloud–robot architecture is implemented using a Node-RED environment to validate the proposed model. A benchmarked object image dataset from an open resource repository and images captured from the lab environment were used to train the models. The proposed model showed good object detection and identification results. The performance of the model was compared with three state-of-the-art models and was found to outperform them. Moreover, the usability of the proposed system was enhanced by the unknown object detection, incremental learning, and cloud-based framework.
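The unknown-object gate described here can be prototyped with scikit-learn's one-class SVM, which for an RBF kernel is closely related to the SVDD formulation the paper uses. Synthetic feature vectors stand in for real embeddings in this sketch.

```python
# One-class SVM as an unknown-object detector (sketch; synthetic stand-in features).
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
known = rng.normal(0.0, 1.0, size=(500, 64))   # embeddings of known objects

# An RBF one-class SVM learns a boundary around the known-object distribution.
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(known)

def is_unknown(feature):
    """True when the feature falls outside the learned boundary (outlier)."""
    return ocsvm.predict(feature.reshape(1, -1))[0] == -1

print(is_unknown(rng.normal(0.0, 1.0, size=64)))  # in-distribution: likely False
print(is_unknown(rng.normal(6.0, 1.0, size=64)))  # far away: likely True
```

Objects flagged as unknown would then be queued for incremental learning, as the abstract describes.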
12

Baegizova, Aigulim, Ainur Jumagaliyeva, Venera Rystygulova, Galia Mukhamedrakhimova, and Zhanar Lamasheva. "The Use of Artificial Intelligence for Object Recognition Based on Neural Networks." Вестник КазАТК 136, no. 1 (2024): 246–59. https://doi.org/10.52167/1609-1817-2025-136-1-246-259.

Abstract:
This article is dedicated to the development and investigation of methods for automatic object recognition using artificial intelligence and convolutional neural networks. A standard dataset for object recognition, including images of various types, was used as the experimental base, allowing the creation of a model capable of effectively classifying objects under various conditions, such as changes in lighting, viewing angles, and image quality. The methodology includes data normalization and augmentation, stratified dataset splitting, and model architecture optimization to ensure a balance between computational efficiency and classification accuracy. The developed convolutional neural network features a layered architecture with activation functions and regularization methods, enabling the model to remain robust and adaptive. The results of the study demonstrated the high accuracy and reliability of the proposed model. The analysis of classification errors identified challenges related to the recognition of visually similar objects and proposed solutions to minimize them. This work emphasized the significance of using artificial intelligence and convolutional neural networks for automating object recognition, opening new perspectives for their application in intelligent systems and high-precision technologies.
13

Alsmadi, Mutasem K., Mohammed Tayfour, Raed A. Alkhasawneh, Usama Badawi, Ibrahim Almarashdeh, and Firas Haddad. "Robust features extraction for general fish classification." International Journal of Electrical and Computer Engineering (IJECE) 9, no. 6 (2019): 5192. http://dx.doi.org/10.11591/ijece.v9i6.pp5192-5204.

Abstract:
Image recognition can be plagued by many problems, including noise, overlap, distortion, errors in the outcomes of segmentation, and occlusion of objects within the image. Based on feature selection and combination theory between major extracted features, this study attempts to establish a system that can recognize fish objects within an image utilizing texture, anchor points, and statistical measurements. A generic fish classification is then executed through an innovative classification evaluation with a meta-heuristic algorithm known as the Memetic Algorithm (a Genetic Algorithm with Simulated Annealing) combined with a back-propagation algorithm (MA-B Classifier). Here, images of dangerous and non-dangerous fish are recognized. Images of dangerous fish are further recognized as belonging to the predatory or poison fish families, whereas non-dangerous fish are classified into the garden and food families. A total of 24 fish families were used in testing the proposed prototype, whereby each family encompasses a different number of species. The classification was successfully undertaken by the proposed prototype, with 400 distinct fish images used in the experimental tests: 250 for the training phase and 150 for the testing phase. The back-propagation algorithm and the proposed MA-B Classifier produced general recognition accuracy rates of 82.25% and 90%, respectively.
14

Guo, Yapeng, Yang Xu, Hongtao Cui, and Shunlong Li. "Vision-Based Multiscale Construction Object Detection under Limited Supervision." Structural Control and Health Monitoring 2024 (February 16, 2024): 1–13. http://dx.doi.org/10.1155/2024/1032674.

Abstract:
Contemporary multiscale construction object detection algorithms rely predominantly on fully supervised deep learning, requiring an arduous and time-consuming labeling process. This paper presents a novel semisupervised multiscale construction object detection (SS-MCOD) method that harnesses nearly infinite unlabeled images along with limited labels, achieving more accurate and robust detection results. SS-MCOD uses a deformable convolutional network (DCN)-based teacher-student joint learning framework. The DCN exploits deformable convolutions to extract and fuse multiscale construction object features. The teacher module generates pseudolabels for construction objects in unlabeled images, while the student module learns the location and classification of construction objects in both labeled images and unlabeled images with pseudolabels. Experimental validation using commonly used construction datasets demonstrates the accuracy and generalization performance of SS-MCOD. This research can provide insights for other detection tasks with limited labels in the construction domain.
15

Miao, Wei, Jiangrong Shen, Qi Xu, Timo Hamalainen, Yi Xu, and Fengyu Cong. "SpikingYOLOX: Improved YOLOX Object Detection with Fast Fourier Convolution and Spiking Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 2 (2025): 1465–73. https://doi.org/10.1609/aaai.v39i2.32137.

Abstract:
In recent years, with advancements in brain science, spiking neural networks (SNNs) have garnered significant attention. SNNs generate spikes that mimic neuronal transmission in the human brain, thereby significantly reducing computational costs through their event-driven nature during training. While deep SNNs have shown impressive performance on classification tasks, they still face challenges in more complex tasks such as object detection. In this paper, we propose SpikingYOLOX, which extends the structure of the original YOLOX by introducing signed spiking neurons and fast Fourier convolution (FFC). The designed ternary signed spiking neurons can generate three kinds of spikes to obtain more robust features in the deep layers of the backbone. Meanwhile, we integrate FFC with SNN modules to enhance object detection performance, because its global receptive field benefits the object detection task. Extensive experiments demonstrate that the proposed SpikingYOLOX achieves state-of-the-art performance among SNN-based object detection methods.
16

Liu, Yan, Zhu Zhuxngjie, Qiuhui Zhang, et al. "A New Method Based on Deep Convolutional Neural Networks for Object Detection and Classification." AATCC Journal of Research 8, no. 1_suppl (2021): 37–45. http://dx.doi.org/10.14504/ajr.8.s1.5.

Abstract:
Accurate object detection and classification has broad application in industrial tasks, such as fabric defect and invoice detection. Previous state-of-the-art methods such as SSD and Faster R-CNN usually need carefully adjusted anchor-box-related hyperparameters and perform poorly in special fields with large object size/ratio variations and complex background textures. In this study, we propose a new accurate, robust, and anchor-free method for automatic object detection and classification. First, we used a feature pyramid network (FPN) to merge feature maps of different scales extracted from a convolutional neural network (CNN), which allows easy and robust multi-scale feature fusion. Second, we built two subnets to generate candidate region proposals from the FPN outputs, followed by another CNN that determines the categories of the proposed regions.
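The FPN fusion step this abstract relies on merges backbone maps top-down through lateral 1x1 convolutions. Below is a minimal, generic FPN neck in PyTorch; the channel sizes are illustrative assumptions, and this is not the paper's network.

```python
# Minimal FPN-style top-down fusion of three backbone scales (PyTorch sketch).
import torch
from torch import nn
from torch.nn import functional as F

class FPNNeck(nn.Module):
    def __init__(self, in_chs=(256, 512, 1024), out_ch=256):
        super().__init__()
        # Lateral 1x1 convs project every scale to a common channel width.
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in in_chs)
        # 3x3 smoothing convs reduce upsampling aliasing on the merged maps.
        self.smooth = nn.ModuleList(nn.Conv2d(out_ch, out_ch, 3, padding=1)
                                    for _ in in_chs)

    def forward(self, c3, c4, c5):
        p5 = self.lateral[2](c5)
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        return [s(p) for s, p in zip(self.smooth, (p3, p4, p5))]

# Shape check with dummy feature maps at strides 8/16/32:
neck = FPNNeck()
c3, c4, c5 = (torch.randn(1, c, s, s) for c, s in ((256, 64), (512, 32), (1024, 16)))
print([p.shape for p in neck(c3, c4, c5)])
```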
17

Lv, Xuan, Zezhong Ma, and Qing Liu. "A Subblock Partition Of Multi-Layer Pattern Based Image Classification Approach." MATEC Web of Conferences 246 (2018): 03043. http://dx.doi.org/10.1051/matecconf/201824603043.

Abstract:
Since traditional partition approaches may construct very different representations of the same image when object locations change, a subblock partition of multi-layer pattern method for image representation is proposed. Saliency windows straddled by superpixels are used to partition the image into multi-layer pattern subblocks. All the subblocks are then combined into a third-order tensor. Comparison against the results of the image classification task of the Pascal VOC 2007 Challenge indicates that the proposed representation method is robust to varied object locations and achieves better performance than other approaches.
18

Rajaram, Dhivya, and Kogilavani Sivakumar. "Moving Objects Detection, Classification and Tracking of Video Streaming by Improved Feature Extraction Approach Using K-SVM." DYNA 97, no. 3 (2022): 274–80. http://dx.doi.org/10.6036/10304.

Abstract:
Computer vision plays a vital role in a variety of applications such as traffic surveillance, robotics, and human interaction devices. Video surveillance systems are designed to detect, track, and classify moving objects. Moving object detection, classification, and tracking in video streams pose various challenges that call for novel approaches. Existing work uses spatiotemporal feature analysis with a sample-consistency algorithm for moving object detection and classification, but it does not perform well on complex scenes. The binary masking representation of moving objects remains a challenging task. Video streams are partitioned into frames, shots, and scenes; the proposed work utilizes a kernel Support Vector Machine (K-SVM) learning technique for moving object detection and tracking, using the MIO-TCD dataset. Feature extraction is the major part of foreground and background analysis in the video stream, which utilizes vehicle-feature-based video data. The SURF (Speeded-Up Robust Features) descriptor is used to recognize and register objects and also for classification of moving objects. The optical flow method quantifies the relative motion of objects in the video streams: based on differences between partitioned frames, optical flow features track the object by measuring the pixels of the moving objects. The feature extraction process is improved by combining the feature class with the intensity level of the optical flow result, enabling gradient analysis with a first-order derivative function. The proposed method, implemented in MATLAB 2018a, achieves better recall, precision, and F-measure than the existing work.
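As a concrete illustration of the optical-flow step, the sketch below computes dense Farneback flow between two consecutive frames with OpenCV and bins the flow magnitudes into a simple motion histogram. The frame file names are placeholders, and this is not the authors' MATLAB code.

```python
# Dense optical flow as a motion feature (illustrative sketch; placeholder frames).
import cv2
import numpy as np

prev = cv2.cvtColor(cv2.imread("frame_000.png"), cv2.COLOR_BGR2GRAY)
curr = cv2.cvtColor(cv2.imread("frame_001.png"), cv2.COLOR_BGR2GRAY)

flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])

# Per-frame motion feature: a histogram of flow magnitudes, which could be
# concatenated with texture features before a K-SVM classifier.
motion_hist, _ = np.histogram(mag, bins=16, range=(0, 20))
```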
19

Cheng, Yun, Hai Tao Lang, Peng Yao, Rui Guo, Jian Ying Hu, and Jie Wang. "A Real Implementation of Dangerous Objects Capturing in Surveillance Web." Applied Mechanics and Materials 278-280 (January 2013): 1032–35. http://dx.doi.org/10.4028/www.scientific.net/amm.278-280.1032.

Abstract:
The main focus of our research is capturing dangerous objects when they appear under a surveillance camera again, after having performed dangerous activities elsewhere. Our solution is a two-phase method comprising object learning and object capturing under a classification framework. Samples of objects and non-objects are collected to train a classifier with libSVM in the object learning phase. In the object capturing phase, all moving objects are detected by background subtraction and then classified as dangerous or non-dangerous. To obtain an object representation robust to illumination, scale, rotation, etc., we fuse an HSV-space-based color feature with a multi-scale texture feature. Experimental results on real surveillance data validated the proposed method.
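The detection phase follows a standard recipe: a background subtractor proposes moving blobs, which are cropped and handed to the classifier. The sketch below uses OpenCV's MOG2 subtractor under that assumption; the video path is a placeholder, and `classify_blob` stands in for the trained libSVM model on fused color/texture features.

```python
# Background subtraction + per-blob classification hook (illustrative sketch).
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

def moving_object_boxes(frame, min_area=400):
    """Foreground mask -> cleaned blobs -> bounding boxes of moving objects."""
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

cap = cv2.VideoCapture("surveillance.mp4")  # placeholder input
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for (x, y, w, h) in moving_object_boxes(frame):
        crop = frame[y:y + h, x:x + w]
        # label = classify_blob(crop)  # hypothetical SVM on fused color + texture
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cap.release()
```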
20

Zeng, Shaoning, Bob Zhang, Jianping Gou, Yong Xu, and Wei Huang. "Fast and Robust Dictionary-based Classification for Image Data." ACM Transactions on Knowledge Discovery from Data 15, no. 6 (2021): 1–22. http://dx.doi.org/10.1145/3449360.

Abstract:
Dictionary-based classification has been promising for knowledge discovery from image data, due to its good performance and interpretable theoretical system. Dictionary learning effectively supports both small- and large-scale datasets, while its robustness and performance depend on the atoms of the dictionary most of the time. Empirically, using a large number of atoms helps obtain a robust classification, while robustness cannot be ensured with a small number of atoms. However, learning a huge dictionary dramatically slows down classification, especially on large-scale datasets. To address the problem, we propose a Fast and Robust Dictionary-based Classification (FRDC) framework, which fully utilizes the learned dictionary for classification by staging ℓ1- and ℓ2-norms to obtain a robust sparse representation. The new objective function, on the one hand, introduces an additional ℓ2-norm term upon the conventional ℓ1-norm optimization, which generates a more robust classification. On the other hand, the optimization based on both ℓ1- and ℓ2-norms is solved in two stages, which is much easier and faster than current solutions. In this way, even when using a dictionary of limited size, which ensures the classification runs very fast, it can still gain higher robustness for multiple types of image data. The optimization is then theoretically analyzed in a new formulation, close but distinct to elastic-net, to prove that it is crucial to improving performance under the premise of robustness. According to our extensive experiments on four image datasets for face and object classification, FRDC keeps generating a robust classification no matter whether a small or large number of atoms is used. This guarantees a fast and robust dictionary-based image classification. Furthermore, when simply using deep features extracted via popular pre-trained neural networks, it outperforms many state-of-the-art methods on the specific datasets.
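The classification rule can be illustrated with a single-stage elastic-net surrogate (the paper's staged ℓ1/ℓ2 solver is simplified here to one combined problem): code a test sample over the dictionary, then pick the class whose atoms reconstruct it best. Parameter values are illustrative, and this is not the FRDC implementation.

```python
# Sparse-representation classification with a combined l1+l2 penalty (sketch).
import numpy as np
from sklearn.linear_model import ElasticNet

def src_classify(D, atom_labels, x, alpha=0.05, l1_ratio=0.5):
    """D: (d, n_atoms) dictionary; atom_labels: (n_atoms,) class of each atom;
    x: (d,) test sample. Returns the class with the lowest residual."""
    # Solve min ||x - D c||^2 + alpha * (l1_ratio*|c|_1 + (1-l1_ratio)*|c|_2^2).
    en = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False,
                    max_iter=5000)
    en.fit(D, x)
    code = en.coef_
    best_cls, best_res = None, np.inf
    for cls in np.unique(atom_labels):
        part = code * (atom_labels == cls)       # keep only this class's atoms
        res = np.linalg.norm(x - D @ part)       # class-wise reconstruction error
        if res < best_res:
            best_cls, best_res = cls, res
    return best_cls
```

A smaller dictionary makes the elastic-net solve faster, which is the speed/robustness trade-off the abstract discusses.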
21

Xiao, Xingyuan, Linlong Jiang, Yaqun Liu, and Guozhen Ren. "Limited-Samples-Based Crop Classification Using a Time-Weighted Dynamic Time Warping Method, Sentinel-1 Imagery, and Google Earth Engine." Remote Sensing 15, no. 4 (2023): 1112. http://dx.doi.org/10.3390/rs15041112.

Abstract:
Reliable crop type classification supports the scientific basis for food security and sustainable agricultural development. However, a labor- and time-efficient crop classification method based on limited samples is still lacking. To this end, we used the Google Earth Engine (GEE) and Sentinel-1A/B SAR time series to develop eight crop classification strategies based on different sampling methods (central and scattered), different perspectives (object-based and pixel-based), and different classifiers (Time-Weighted Dynamic Time Warping (TWDTW) and Random Forest (RF)). We carried out 30 classifications with different samples for each strategy to classify the crop types at the North Dakota–Minnesota border in the U.S. We then compared their classification accuracies and assessed the accuracy sensitivity to sample size. The results found that TWDTW generally performed better than RF, especially for small-sample classification. Object-based classifications had higher accuracies than pixel-based classifications, and the object-based TWDTW had the highest accuracy. RF performed better in the scattered sampling strategy than in the central sampling strategy. TWDTW performed better than RF in distinguishing soybean and dry bean, which have similar curves. The accuracies improved for all eight classification strategies with increasing sample size; TWDTW was more robust, while RF was more sensitive to sample size change. RF required many more samples than TWDTW to achieve satisfactory accuracy, and it performed better than TWDTW when the sample size exceeded 50. The accuracy comparisons indicated that TWDTW has stronger temporal and spatial generalization capabilities and high potential for early, historical, and limited-samples-based crop type classification. The findings of our research are worthwhile contributions to the methodology and practice of crop type classification as well as sustainable agricultural development.
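TWDTW augments the usual dynamic time warping recursion with a time penalty, so matching samples far apart in the season costs more. A compact sketch follows, with an illustrative logistic weight rather than the exact parameterisation used in this paper.

```python
# Time-weighted DTW distance between two (time, value) series (illustrative sketch).
import numpy as np

def twdtw(a, b, ta, tb, alpha=0.1, beta=50.0):
    """a, b: value sequences; ta, tb: their acquisition days.
    alpha/beta shape the logistic time penalty (illustrative values)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dt = abs(ta[i - 1] - tb[j - 1])
            w = 1.0 / (1.0 + np.exp(-alpha * (dt - beta)))  # logistic time weight
            cost = abs(a[i - 1] - b[j - 1]) + w
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Nearest-prototype classification: assign the crop whose reference backscatter
# profile has the smallest TWDTW distance to the pixel/object time series.
```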
22

Sonawane, Sachin. "Robust Classification of Black-Eyed Peas Based on Segment Anything Model and Transfer Learning." Journal of Information Systems Engineering and Management 10, no. 21s (2025): 66–80. https://doi.org/10.52783/jisem.v10i21s.3290.

Abstract:
Evaluating the physical quality of harvested black-eyed peas is essential to ensure their products meet high standards. Carefully designed and optimized machine learning models can provide better quality evaluation. A hybrid neural network integrating EfficientNetV2B1 and Vision Transformer (ViT) to classify black-eyed peas is introduced in this work. One of the main challenges in accurate classification was segmenting objects in a clustered view. Inconsistent lighting, variations in sample size, random placement of the objects, and neighboring objects touching each other make the task difficult. We utilized the Segment Anything Model (SAM) to address the issue. SAM detected individual objects for our samples of weights up to 30 grams with 100% accuracy. We incorporated SAM with our custom object retrieval block to separate the segmented objects into images of size 224 × 224 × 3 pixels for classification purposes. We also used image augmentation with a stable diffusion method to balance the dataset. Stable diffusion generates high-quality and diverse images while preserving the original distribution. Subsequently, we experimented with five hybrid architectures, EfficientNetV2B1+ViT, MobileNetV2+ShuffleNetV2, ResNet50+DenseNet121, VGG16+ResNet18, and InceptionV3+MobileNetV2 with feature fusion addressed by the convolutional block attention module (CBAM). Our experimentation showed that the EfficientNetV2B1+ViT model outperformed other models. EfficientNetV2B1+ViT exploited depth-wise separable convolution and transformer-based models utilizing multi-head self-attention mechanisms. With hyperparameter optimization, EfficientNetV2B1+ViT achieved an impressive accuracy of 95.80% and a loss of 0.1256 across eight classes of sound-quality seeds, defects, and foreign contamination, highlighting its efficiency and robustness.
23

Qiu, Yuanyuan, Zhijie Xu, and Jianqin Zhang. "Patch-Based Auxiliary Node Classification for Domain Adaptive Object Detection." Electronics 13, no. 7 (2024): 1239. http://dx.doi.org/10.3390/electronics13071239.

Abstract:
Domain adaptive object detection (DAOD) aims to leverage labeled source domain data to train object detection models that can generalize well to unlabeled target domains. Recently, many researchers have considered implementing fine-grained pixel-level domain adaptation using graph representations. Existing methods construct semantically complete graphs and align them across domains via graph matching. This work introduced an auxiliary node classification task before domain alignment through graph matching, which utilizes the inherent information of graph nodes to classify them, in order to avoid suboptimal graph matching results caused by node class confusion. However, previous methods neglected the contextual information of graph nodes, leading to biased node classification and suboptimal graph matching. To solve this issue, we propose a novel patch-based auxiliary node classification method for DAOD. Unlike existing methods that use only the inherent information of nodes for node classification, our method exploits the local region information of nodes and employs multi-layer convolutional neural networks to learn the local region feature representation of nodes, enriching the node context information. Thus, accurate and robust node classification results are produced and the risk of class confusion is reduced. Moreover, we propose a progressive strategy to fuse the inherent features and the learned local region features of nodes, which ensures that the network can stably and reliably utilize local region features for accurate node classification. In this paper, we conduct abundant experiments on various DAOD scenarios and demonstrate that our proposed model outperforms existing works.
24

Thanki, Dhval, Dippal Israni, and Ashwin Makwana. "Glacier Mapping with Object based Image Analysis Method, Case Study of Mount Everest Region." Jurnal Kejuruteraan 31, no. 2 (2019): 215–20. http://dx.doi.org/10.17576/jkukm-2019-31(2)-05.

Abstract:
Substantial progress in geoinformatics systems in recent years has led to research on monitoring and mapping glaciers. Monitoring glaciers in mountain regions with traditional manual methods is crucial but time-consuming. Glaciers are melting because of global warming, and melting glaciers can cause calamities such as sea level rise, glacial lake outbursts, and avalanches. Objects on the glacier surface are hard to classify in multi-temporal monitoring data. This paper gives an insight into the importance of geospatial data and the object-based image analysis method for satellite image processing. Object-based image analysis offers advantages over traditional pixel-based analysis, as it is more robust to noise and exploits more image features. Spectral data with multiple bands is the backbone of glacier surveying and monitoring. A case study of the Mount Everest region (27°48′32″N, 86°54′47″E) is presented; the remotely sensed image needs to be acquired in a cloud-free environment. Object-based image classification is carried out in a recognition tool, and the step-by-step methodology of object-based classification, segmentation, and post-classification possibilities is discussed. Finally, the paper presents several spectral indexes, whose integration is useful for accurately classifying the different parts of terrain, lake, vegetation, and glacier.
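One widely used index of the kind this abstract alludes to is the normalized-difference snow index (NDSI), computed from the green and shortwave-infrared bands. The sketch below and its 0.4 threshold are common practice in snow/ice mapping, not values taken from this paper.

```python
# NDSI computation and a simple threshold mask (illustrative sketch).
import numpy as np

def ndsi(green, swir):
    """green, swir: float reflectance arrays. Returns NDSI in [-1, 1]."""
    return (green - swir) / (green + swir + 1e-9)  # epsilon avoids divide-by-zero

def snow_ice_mask(green, swir, threshold=0.4):
    """Pixels above the threshold are candidate snow/ice (glacier) areas."""
    return ndsi(green, swir) > threshold
```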
25

Ye, Hanchen, Yuyue Zhang, and Xiaoli Zhao. "Robust and Refined Salient Object Detection Based on Diffusion Model." Electronics 12, no. 24 (2023): 4962. http://dx.doi.org/10.3390/electronics12244962.

Abstract:
Salient object detection (SOD) networks are vulnerable to adversarial attacks. As adversarial training is computationally expensive for SOD, existing defense methods instead adopt a noise-against-noise strategy that disrupts adversarial perturbation and restores the image either in input or feature space. However, their limited learning capacity and the need for network modifications limit their applicability. In recent years, the popular diffusion model coincides with the existing defense idea and exhibits excellent purification performance, but there still remains an accuracy gap between the saliency results generated from the purified images and the benign images. In this paper, we propose a Robust and Refined (RoRe) SOD defense framework based on the diffusion model to simultaneously achieve adversarial robustness as well as improved accuracy for benign and purified images. Our proposed RoRe defense consists of three modules: purification, adversarial detection, and refinement. The purification module leverages the powerful generation capability of the diffusion model to purify perturbed input images to achieve robustness. The adversarial detection module utilizes the guidance classifier in the diffusion model for multi-step voting classification. By combining this classifier with a similarity condition, precise adversarial detection can be achieved, providing the possibility of regaining the original accuracy for benign images. The refinement module uses a simple and effective UNet to enhance the accuracy of purified images. The experiments demonstrate that RoRe achieves superior robustness over state-of-the-art methods while maintaining high accuracy for benign images. Moreover, RoRe shows good results against backward pass differentiable approximation (BPDA) attacks.
26

Kochari, Vijaylaxmi, Sanjeev S. Sannakki, Vijay S. Rajpurohit, and Mahesh G. Huddar. "Enhancing hyperspectral image object classification through robust feature extraction and spatial-spectral fusion using deep learning." Indonesian Journal of Electrical Engineering and Computer Science 37, no. 1 (2025): 279. http://dx.doi.org/10.11591/ijeecs.v37.i1.pp279-287.

Abstract:
Hyperspectral imaging (HSI) has gained significant attention in recent years due to its broad applications across agriculture, environmental monitoring, urban planning, infrastructure management, and defense and security for object detection and classification. Despite its potential, current methodologies face challenges such as insufficient feature extraction, noise interference, and inadequate spatial-spectral fusion, limiting classification accuracy and robustness. This study reviews advancements in HSI object detection and classification methodologies, emphasizing the role of machine-learning (ML) and deep-learning (DL) techniques. Hence, this work proposes a novel framework to address these challenges, prioritizing robust feature extraction, effective spatial-spectral fusion, and comprehensive noise removal mechanisms. By integrating DL techniques and training with HSI noisy data, this framework aims to enhance classification accuracy and robustness. The findings suggest that the proposed approach significantly improves the reliability and performance of HSI-based object classification systems. This research provides a pathway for future development in the domain, promising to elevate the effectiveness of HSI applications in real-world scenarios.
27

FERRAZ, CAROLINA TOLEDO, OSMANDO PEREIRA, MARCOS VERDINI ROSA, and ADILSON GONZAGA. "OBJECT RECOGNITION BASED ON BAG OF FEATURES AND A NEW LOCAL PATTERN DESCRIPTOR." International Journal of Pattern Recognition and Artificial Intelligence 28, no. 08 (2014): 1455010. http://dx.doi.org/10.1142/s0218001414550106.

Abstract:
Bag of Features (BoF) has gained a lot of interest in computer vision. A visual codebook based on robust appearance descriptors extracted from local image patches is an effective means of texture analysis and scene classification. This paper presents a new method for local feature description based on gray-level difference mapping, called the Mean Local Mapped Pattern (M-LMP). The proposed descriptor is robust to image scaling, rotation, illumination, and partial viewpoint changes. The training set is composed of rotated and scaled images with changes in illumination and viewpoint; the test set is composed of rotated and scaled images. The proposed descriptor captures small differences between image pixels more effectively than similar descriptors. In our experiments, we implemented an object recognition system based on the M-LMP and compared our results to the Center-Symmetric Local Binary Pattern (CS-LBP) and the Scale-Invariant Feature Transform (SIFT). The object classification results, analyzed in a BoF methodology, show that our descriptor performs better than these two previously published methods.
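The CS-LBP baseline mentioned above is simple enough to show in full: each pixel is encoded by thresholded gray-level differences of its four center-symmetric neighbour pairs. A minimal NumPy version follows (M-LMP itself differs in its mapping function, so this illustrates the family of descriptors rather than the paper's method).

```python
# Center-symmetric LBP codes for a grayscale image (illustrative sketch).
import numpy as np

def cs_lbp(image, threshold=0.01):
    """image: 2-D float array in [0, 1]. Returns one CS-LBP code per interior
    pixel, built from the 8-neighbourhood's 4 center-symmetric pairs."""
    i = image
    c = i[1:-1, 1:-1]
    pairs = [
        (i[0:-2, 0:-2], i[2:, 2:]),    # NW vs SE
        (i[0:-2, 1:-1], i[2:, 1:-1]),  # N  vs S
        (i[0:-2, 2:],   i[2:, 0:-2]),  # NE vs SW
        (i[1:-1, 2:],   i[1:-1, 0:-2]) # E  vs W
    ]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (p, q) in enumerate(pairs):
        code |= ((p - q) > threshold).astype(np.uint8) << bit
    return code  # 16 possible codes; histogram them per patch for a descriptor
```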
28

Bui, Quang-Thanh, Tien-Yin Chou, Thanh-Van Hoang, et al. "Gradient Boosting Machine and Object-Based CNN for Land Cover Classification." Remote Sensing 13, no. 14 (2021): 2709. http://dx.doi.org/10.3390/rs13142709.

Abstract:
In regular convolutional neural networks (CNN), fully-connected layers act as classifiers to estimate the probabilities for each instance in classification tasks. The accuracy of CNNs can be improved by replacing fully connected layers with gradient boosting algorithms. In this regard, this study investigates three robust classifiers, namely XGBoost, LightGBM, and Catboost, in combination with a CNN for a land cover study in Hanoi, Vietnam. The experiments were implemented using SPOT7 imagery through (1) image segmentation and extraction of features, including spectral information and spatial metrics, (2) normalization of attribute values and generation of graphs, and (3) using graphs as the input dataset to the investigated models for classifying six land cover classes, namely House, Bare land, Vegetation, Water, Impervious Surface, and Shadow. The results show that CNN-based XGBoost (Overall accuracy = 0.8905), LightGBM (0.8956), and CatBoost (0.8956) outperform the other methods used for comparison. It can be seen that the combination of object-based image analysis and CNN-based gradient boosting algorithms significantly improves classification accuracies and can be considered as alternative methods for land cover analysis.
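The replace-the-fully-connected-head idea is easy to prototype: export penultimate-layer CNN activations once, then fit a gradient-boosted classifier on them. Below is a sketch with XGBoost under that assumption; the `.npy` files are hypothetical exports, not artifacts of this paper.

```python
# Gradient boosting on pre-extracted CNN features (illustrative sketch).
import numpy as np
from xgboost import XGBClassifier

# Penultimate-layer CNN activations, one row per image object/segment
# (hypothetical exports produced by a separate feature-extraction step).
X_train = np.load("cnn_features_train.npy")
y_train = np.load("labels_train.npy")
X_test = np.load("cnn_features_test.npy")
y_test = np.load("labels_test.npy")

clf = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1,
                    objective="multi:softprob")
clf.fit(X_train, y_train)
print("overall accuracy:", (clf.predict(X_test) == y_test).mean())
```

LightGBM or CatBoost can be dropped into the same slot, which is exactly the comparison the study performs.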
29

Taher, Josef, Teemu Hakala, Anttoni Jaakkola, et al. "Feasibility of Hyperspectral Single Photon Lidar for Robust Autonomous Vehicle Perception." Sensors 22, no. 15 (2022): 5759. http://dx.doi.org/10.3390/s22155759.

Abstract:
Autonomous vehicle perception systems typically rely on single-wavelength lidar sensors to obtain three-dimensional information about the road environment. In contrast to cameras, lidars are unaffected by challenging illumination conditions, such as low light during night-time and various bidirectional effects changing the return reflectance. However, as many commercial lidars operate on a monochromatic basis, the ability to distinguish objects based on material spectral properties is limited. In this work, we describe the prototype hardware for a hyperspectral single photon lidar and demonstrate the feasibility of its use in an autonomous-driving-related object classification task. We also introduce a simple statistical model for estimating the reflectance measurement accuracy of single photon sensitive lidar devices. The single photon receiver frame was used to receive 30 spectral channels, each 12.3 nm wide, in the spectral band 1200–1570 nm, with a maximum channel-wise intensity of 32 photons. A varying number of frames was used to accumulate the signal photon count. Multiple objects covering 10 different categories of road environment, such as car, dry asphalt, gravel road, snowy asphalt, wet asphalt, wall, granite, grass, moss, and spruce tree, were included in the experiments. We test the influence of the number of spectral channels and the number of frames on the classification accuracy with a random forest classifier and find that the spectral information increases the classification accuracy in the high-photon-flux regime from 50% to 94% with 2 channels and 30 channels, respectively. In the low-photon-flux regime, the classification accuracy increases from 30% to 38% with 2 channels and 6 channels, respectively. Additionally, we visualize the data with the t-SNE algorithm and show that the photon shot noise in the single photon sensitive hyperspectral data contributes the most to the separability of material-specific spectral signatures. The results of this study support the use of hyperspectral single photon lidar data in more advanced object detection and classification methods, and motivate the development of advanced single photon sensitive hyperspectral lidar devices for use in autonomous vehicles and in robotics.
30

Anraeni, Siska, Muhid Mustari, Ramdaniah Ramdaniah, Nia Kurniati, and Syahrul Mubarak. "Innovative CNN approach for reliable chicken meat classification in the poultry industry." Bulletin of Social Informatics Theory and Application 8, no. 2 (2024): 226–35. https://doi.org/10.31763/businta.v8i2.686.

Abstract:
In response to the burgeoning need for advanced object recognition and classification, this research embarks on a journey harnessing the formidable capabilities of Convolutional Neural Networks (CNNs). The central aim of this study revolves around the precise identification and categorization of objects, with a specific focus on the critical task of distinguishing between fresh and spoiled chicken meat. This study's overarching objective is to craft a robust CNN-based classification model that excels in discriminating between objects. In the context of our research, we set out to create a model adept at distinguishing between fresh and rotten chicken meat. This endeavor holds immense potential in augmenting food safety and elevating quality control standards within the poultry industry. Our research methodology entails meticulous data collection, which includes acquiring high-resolution images of chicken meat. This meticulously curated dataset serves as the bedrock for both training and testing our CNN model. To optimize the model, we employ the 'adam' optimizer, while critical performance metrics, such as accuracy, precision, recall, and the F1-score, are methodically computed to evaluate the model's effectiveness. Our experimental findings unveil the remarkable success of our CNN model, with consistent accuracy, precision, and recall metrics all reaching an impressive pinnacle of 94%. These metrics underscore the model's excellence in the realm of object classification, with a particular emphasis on its proficiency in distinguishing between fresh and rotten chicken meat. In summation, our research concludes that the CNN model has exhibited exceptional prowess in the domains of object recognition and classification. The model's high accuracy signifies its precision in furnishing accurate predictions, while its elevated precision and recall values accentuate its effectiveness in differentiating between object classes. Consequently, the CNN model stands as a robust foundation for future strides in object classification technology. As we peer into the horizon of future research, myriad opportunities beckon. Our CNN model's applicability extends beyond chicken meat classification, inviting exploration across diverse domains. Furthermore, the model's refinement and adaptation for specific challenges represent an exciting avenue for future work, promising heightened performance across a broader spectrum of object recognition tasks.
31

Li, Erzhu, Alim Samat, Wei Liu, Cong Lin, and Xuyu Bai. "High-Resolution Imagery Classification Based on Different Levels of Information." Remote Sensing 11, no. 24 (2019): 2916. http://dx.doi.org/10.3390/rs11242916.

Abstract:
Detailed land use and land cover (LULC) information is important for land use surveys and applications related to the earth sciences. Therefore, LULC classification using very-high-resolution remotely sensed imagery has been a hot issue in the remote sensing community. However, it remains a challenge to successfully extract LULC information from very-high-resolution remotely sensed imagery, due to the difficulty of describing the individual characteristics of various LULC categories using single-level features. Traditional pixel-wise or spectral-spatial methods pay more attention to low-level feature representations of target LULC categories. In addition, deep convolutional neural networks offer great potential to extract high-level features to describe objects and have been successfully applied to scene understanding and classification. However, existing studies have paid little attention to constructing multi-level feature representations to better understand each category. In this paper, a multi-level feature representation framework is first designed to extract more robust feature representations for the complex LULC classification task using very-high-resolution remotely sensed imagery. To this end, spectral reflection, morphological profiles, and morphological attribute profiles are used to describe the pixel-level and neighborhood-level information. Furthermore, a novel object-based convolutional neural network (CNN) is proposed to extract scene-level information. The object-based CNN method combines the advantages of the object-based method and the CNN method and can perform multi-scale analysis at the scene level. Then, the random forest method is employed to carry out the final classification using the multi-level features. The proposed method was validated on three challenging remotely sensed images, including a hyperspectral image and two multispectral images with very high spatial resolution, and achieved excellent classification performance.
APA, Harvard, Vancouver, ISO, and other styles
34

Li, Zhiyi, Songtao Zhang, Zihan Fu, Fanlei Meng, and Lijuan Zhang. "Confidence-Feature Fusion: A Novel Method for Fog Density Estimation in Object Detection Systems." Electronics 14, no. 2 (2025): 219. https://doi.org/10.3390/electronics14020219.

Full text
Abstract:
Foggy weather poses significant challenges to outdoor computer vision tasks, such as object detection, by degrading image quality and reducing algorithm reliability. In this paper, we present a novel model for estimating fog density in outdoor scenes, aiming to enhance object detection performance under varying foggy conditions. Using a support vector machine (SVM) classification framework, the proposed model categorizes unknown images into distinct fog density levels based on both global and local fog-relevant features. Key features such as entropy, contrast, and dark channel information are extracted to quantify the effects of fog on image clarity and object visibility. Moreover, we introduce an innovative region selection method tailored to images without detectable objects, ensuring robust feature extraction. Evaluation on synthetic datasets with varying fog densities demonstrates a classification accuracy of 85.8%, surpassing existing methods in terms of correlation coefficients and robustness. Beyond accurate fog density estimation, this approach provides valuable insights into the impact of fog on object detection, contributing to safer navigation in foggy environments.
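A minimal sketch of this kind of fog-density classifier appears below, assuming common formulations of image entropy, RMS contrast, and the dark channel; the synthetic images and the exact feature definitions are our assumptions, not necessarily the authors'.

```python
# Sketch of fog-relevant feature extraction (entropy, contrast, dark
# channel) feeding an SVM fog-density classifier. Feature definitions
# are common formulations standing in for the paper's exact features.
import numpy as np
from scipy.ndimage import minimum_filter
from sklearn.svm import SVC

def fog_features(rgb):
    """rgb: float array in [0, 1] of shape (H, W, 3)."""
    gray = rgb.mean(axis=-1)
    counts, _ = np.histogram(gray, bins=256, range=(0.0, 1.0))
    p = counts / counts.sum()
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()              # image entropy
    contrast = gray.std()                          # RMS contrast
    dark = minimum_filter(rgb.min(axis=-1), size=15).mean()  # dark channel
    return [entropy, contrast, dark]

rng = np.random.default_rng(0)
# Toy stand-ins: brighter, lower-contrast images emulate denser fog.
X, y = [], []
for level in range(3):                             # three density classes
    for _ in range(20):
        img = rng.random((64, 64, 3)) * (1.0 - 0.3 * level) + 0.3 * level
        X.append(fog_features(img))
        y.append(level)

clf = SVC(kernel="rbf").fit(X, y)
print("predicted density level:", clf.predict([X[0]]))
```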
APA, Harvard, Vancouver, ISO, and other styles
35

Liu, Hengsong, and Tongle Duan. "Cross-Modal Collaboration and Robust Feature Classifier for Open-Vocabulary 3D Object Detection." Sensors 25, no. 2 (2025): 553. https://doi.org/10.3390/s25020553.

Full text
Abstract:
Multi-sensor fusion, such as LiDAR- and camera-based 3D object detection, is a key technology in autonomous driving and robotics. However, traditional 3D detection models are limited to recognizing predefined categories and struggle with unknown or novel objects. Given the complexity of real-world environments, research into open-vocabulary 3D object detection is essential. This paper therefore addresses two key issues in this area: how to localize and how to classify novel objects. We propose Cross-modal Collaboration and a Robust Feature Classifier to improve localization accuracy and classification robustness for novel objects. Cross-modal Collaboration involves collaborative localization between LiDAR and camera: 2D images provide preliminary regions of interest for novel objects in the 3D point cloud, while the 3D point cloud offers more precise positional information to the 2D images. Through iterative updates between the two modalities, the preliminary regions and positional information are refined, yielding accurate localization of novel objects. The Robust Feature Classifier aims to classify novel objects accurately. To prevent them from being misidentified as background or as incorrect categories, it maps the semantic vectors of new categories into multiple sets of visual features distinguished from the background, and it clusters these visual features around each individual semantic vector to maintain inter-class separability. Our method achieves state-of-the-art performance across various scenarios and datasets.
APA, Harvard, Vancouver, ISO, and other styles
36

Yang, Feng, Wentong Li, Haiwei Hu, Wanyi Li, and Peng Wang. "Multi-Scale Feature Integrated Attention-Based Rotation Network for Object Detection in VHR Aerial Images." Sensors 20, no. 6 (2020): 1686. http://dx.doi.org/10.3390/s20061686.

Full text
Abstract:
Accurate and robust detection of multi-class objects in very high resolution (VHR) aerial images plays a significant role in many real-world applications. Traditional detection methods have made remarkable progress with horizontal bounding boxes (HBBs) thanks to CNNs. However, HBB detection methods still exhibit limitations, including missed detections and redundant detection regions, especially for densely distributed and strip-like objects. In addition, large scale variations and diverse backgrounds pose further challenges. To address these problems, an effective region-based object detection framework named Multi-scale Feature Integration Attention Rotation Network (MFIAR-Net) is proposed for aerial images with oriented bounding boxes (OBBs), which promotes the integration of the inherent multi-scale pyramid features to generate a discriminative feature map. Meanwhile, a double-path feature attention network supervised by the mask information of the ground truth is introduced to guide the network to focus on object regions and suppress irrelevant noise. To boost rotation regression and classification performance, we present a robust Rotation Detection Network that generates an efficient OBB representation. Extensive experiments and comprehensive evaluations on two publicly available datasets demonstrate the effectiveness of the proposed framework.
APA, Harvard, Vancouver, ISO, and other styles
37

Wu, Liang, Pengyu Hao, Kaixuan Zhang, Qian Zhang, Ru Han, and Dekun Cao. "A Robust Star Identification Algorithm for Resident Space Object Surveillance." Photogrammetric Engineering & Remote Sensing 90, no. 9 (2024): 565–74. http://dx.doi.org/10.14358/pers.23-00086r2.

Full text
Abstract:
Star identification algorithms can be applied to resident space object (RSO) surveillance, which involves a large number of stars and false stars. This paper proposes an efficient, robust star identification algorithm for RSO surveillance based on a neural network. First, a feature called the equal-frequency binning radial feature (EFB-RF) is proposed for guide stars, and a shallow neural network is constructed for feature classification. The training set is then generated based on EFB-RF, and the remaining stars are identified using a residual star matching method. Simulation results show that the identification rate reaches 99.82% under 1 pixel of position noise and 99.54% with 5% false stars; when 15% of stars are missing, it still reaches 99.40%. The algorithm is verified on RSO surveillance.
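To make the feature concrete, here is a small sketch of an equal-frequency binning radial feature, assuming bin edges are calibrated from quantiles of simulated neighbor distances; the field-of-view size and bin count are illustrative assumptions, not the paper's parameters.

```python
# Sketch of an equal-frequency binning radial feature (EFB-RF) for a
# guide star: radial distances to neighboring stars are histogrammed
# with bin edges chosen so each bin holds roughly equal counts on
# average; the normalized histogram is the feature for a classifier.
import numpy as np

def efb_rf(guide_xy, neighbors_xy, edges):
    """Histogram of radial distances using equal-frequency bin edges."""
    r = np.linalg.norm(neighbors_xy - guide_xy, axis=1)
    hist, _ = np.histogram(r, bins=edges)
    return hist / max(len(r), 1)   # normalize so the feature sums to 1

rng = np.random.default_rng(1)
# Calibrate equal-frequency edges from many simulated neighbor distances.
sample_r = rng.random(10000) * 8.0        # distances within a toy 8-deg FOV
edges = np.quantile(sample_r, np.linspace(0, 1, 17))  # 16 bins

guide = np.array([0.0, 0.0])
neighbors = rng.normal(0.0, 3.0, size=(30, 2))
print(efb_rf(guide, neighbors, edges))
```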
APA, Harvard, Vancouver, ISO, and other styles
38

Liu, Guoqi, Yifei Dong, Ming Deng, and Yihang Liu. "Magnetostatic Active Contour Model with Classification Method of Sparse Representation." Journal of Electrical and Computer Engineering 2020 (July 1, 2020): 1–10. http://dx.doi.org/10.1155/2020/5438763.

Full text
Abstract:
The active contour model is widely used to segment images. In the classical magnetostatic active contour (MAC) model, the magnetic field is computed from points detected by an edge detector; however, noise and non-target points are always detected as well. Thus, MAC is not robust to noise, and the extracted objects may deviate from the real objects. In this paper, a magnetostatic active contour model with a sparse-representation classification method is proposed. First, rough edge information is obtained with edge detectors. Second, the extracted edge contours are divided by sparse classification into two parts: the target object part and the redundant part. Based on the classified target points, a new magnetic field is generated, and contours evolve under MAC to extract the target objects. Experimental results show that the proposed model decreases the influence of noise and yields robust segmentation results.
APA, Harvard, Vancouver, ISO, and other styles
39

Guo, Wenhua, Jiabao Gao, Yanbin Tian, Fan Yu, and Zuren Feng. "SAFS: Object Tracking Algorithm Based on Self-Adaptive Feature Selection." Sensors 21, no. 12 (2021): 4030. http://dx.doi.org/10.3390/s21124030.

Full text
Abstract:
Object tracking is one of the most challenging problems in the field of computer vision. In challenging scenarios such as illumination variation, occlusion, motion blur, and fast motion, existing algorithms can exhibit degraded performance. To make better use of the various features of the image, we propose an object tracking method based on a self-adaptive feature selection (SAFS) algorithm, which selects the most distinguishable feature sub-template to guide the tracking task. The similarity of each feature sub-template is calculated from the histogram of its features, and the distinguishability of each sub-template is measured from the similarity matrix based on the maximum a posteriori (MAP) criterion. This process transforms the sub-template selection task into a classification task between feature vectors, with a modified Jeffreys' entropy adopted as the discriminant metric; this also completes the update of the sub-template. Experiments on eight video sequences from the Visual Tracker Benchmark dataset evaluate the overall performance of SAFS against five baselines. The results demonstrate that SAFS can overcome the difficulties caused by scene changes and achieve robust object tracking.
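The divergence-based selection step can be sketched as follows, using the classical symmetric Jeffreys divergence as a stand-in for the paper's modified Jeffreys' entropy; the feature histograms below are synthetic.

```python
# Sketch of selecting the most distinguishable feature sub-template via
# a Jeffreys-type divergence between target and background histograms.
# The symmetric KL form below is the classical Jeffreys divergence and
# stands in for the paper's modified version.
import numpy as np

def jeffreys(p, q, eps=1e-12):
    """Symmetric KL divergence between two histograms."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum((p - q) * np.log(p / q)))

def hist(x, bins=16):
    h, _ = np.histogram(x, bins=bins, range=(0.0, 1.0))
    return h.astype(float)

rng = np.random.default_rng(0)
target = {"color": rng.beta(5, 2, 500), "texture": rng.random(500)}
background = {"color": rng.beta(2, 5, 500), "texture": rng.random(500)}

# The feature whose target/background histograms diverge most is the
# most discriminative sub-template for guiding the tracker.
scores = {name: jeffreys(hist(target[name]), hist(background[name]))
          for name in target}
best = max(scores, key=scores.get)
print(scores, "-> select:", best)
```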
APA, Harvard, Vancouver, ISO, and other styles
40

Wu, Yan, Jiqian Li, and Jing Bai. "Multiple Classifiers-Based Feature Fusion for RGB-D Object Recognition." International Journal of Pattern Recognition and Artificial Intelligence 31, no. 05 (2017): 1750014. http://dx.doi.org/10.1142/s0218001417500148.

Full text
Abstract:
RGB-D-based object recognition has been enthusiastically investigated in the past few years. RGB and depth images provide useful and complementary information, and fusing RGB and depth features can significantly increase the accuracy of object recognition. However, previous works simply treat the depth image as a fourth channel of the RGB image and concatenate the RGB and depth features, ignoring the different power of RGB and depth information for different objects. In this paper, a new method containing three different classifiers is proposed to fuse features extracted from RGB and depth images for RGB-D object recognition. First, an RGB classifier and a depth classifier are trained by cross-validation to obtain the accuracy difference between RGB and depth features for each object. Then a variant RGB-D classifier is trained with different initialization parameters for each class according to this accuracy difference, resulting in more robust classification performance. The proposed method is evaluated on two benchmark RGB-D datasets and achieves performance comparable with the state-of-the-art method.
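A simplified sketch of the reliability-weighted fusion idea follows; weighting predicted probabilities by cross-validated accuracy is our simplification of the paper's per-class initialization scheme, and the features are synthetic.

```python
# Sketch of the multi-classifier idea: estimate per-modality reliability
# by cross-validation, then weight RGB and depth predictions accordingly.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 300
y = rng.integers(0, 3, n)
X_rgb = rng.random((n, 64)) + y[:, None] * 0.20    # toy RGB features
X_depth = rng.random((n, 32)) + y[:, None] * 0.05  # toy (weaker) depth

rgb_clf = LogisticRegression(max_iter=1000).fit(X_rgb, y)
depth_clf = LogisticRegression(max_iter=1000).fit(X_depth, y)

# Cross-validated accuracy gauges how much each modality can be trusted.
w_rgb = cross_val_score(LogisticRegression(max_iter=1000), X_rgb, y, cv=5).mean()
w_depth = cross_val_score(LogisticRegression(max_iter=1000), X_depth, y, cv=5).mean()

proba = (w_rgb * rgb_clf.predict_proba(X_rgb)
         + w_depth * depth_clf.predict_proba(X_depth))
fused_pred = proba.argmax(axis=1)
print("fused training accuracy:", (fused_pred == y).mean())
```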
APA, Harvard, Vancouver, ISO, and other styles
41

Hu, Zhaohua, and Xiaoyi Shi. "Deep Directional Network for Object Tracking." Algorithms 11, no. 11 (2018): 178. http://dx.doi.org/10.3390/a11110178.

Full text
Abstract:
Existing object trackers are mostly based on correlation filtering and neural network frameworks. Correlation filtering is fast but has poor accuracy; a neural network can achieve high precision, but its large computational cost increases tracking time. To address this problem, we utilize a convolutional neural network (CNN) to learn object direction. We propose a CNN-based target direction classification network that takes a directional shortcut to the tracking target, unlike a particle filter, which searches for the target randomly. The network determines scale variation end-to-end and is robust on sequences with scale variation. In the pretraining stage, the Visual Object Tracking Challenges (VOT) dataset is used to train the network for positive/negative sample classification and direction classification. In the online tracking stage, a sliding-window operation uses the obtained directional information to determine the exact position of the object. The network evaluates only a single sample, which guarantees a low computational burden, and the positive and negative sample redetection strategies ensure that samples are not lost. One-pass evaluation (OPE) results on the object tracking benchmark (OTB) demonstrate that the algorithm is very robust and also faster than several deep trackers.
APA, Harvard, Vancouver, ISO, and other styles
42

Kar, Subhajit, Rajorshi Bhattacharya, Ramkrishna Das, Ylva Pihlström, and Megan O. Lewis. "Classification of Wolf–Rayet Stars Using Ensemble-based Machine Learning Algorithms." Astrophysical Journal 977, no. 2 (2024): 170. https://doi.org/10.3847/1538-4357/ad8dda.

Full text
Abstract:
We develop a robust machine learning classifier model utilizing the eXtreme-Gradient Boosting (XGB) algorithm for improved classification of Galactic Wolf–Rayet (WR) stars based on IR colors and positional attributes. For our study, we choose an extensive data set of 6555 stellar objects (from 2MASS and AllWISE data releases) lying in the Milky Way (MW) with available photometric magnitudes of different types, including WR stars. Our XGB classifier model can accurately (with an 86% detection rate) identify a sufficient number of WR stars against a large sample of non-WR sources. The XGB model outperforms other ensemble classifier models, such as Random Forest. Also, using the XGB algorithm, we develop a WR subtype classifier model that can differentiate the WR subtypes from the non-WR sources with a high model accuracy (>60%). Further, we apply both XGB-based models to a selection of 6457 stellar objects with unknown object types, detecting 58 new WR star candidates and predicting subtypes for 10 of them. The identified WR sources are mainly located in the local spiral arm of the MW and mostly lie in the solar neighborhood.
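The classifier side of such a study can be sketched with the xgboost package as below; the infrared color features and labels are synthetic stand-ins for the 2MASS/AllWISE photometry, and the hyperparameters are illustrative.

```python
# Sketch of an XGBoost classifier over infrared colors, in the spirit
# of the WR-star study. Features and labels are synthetic.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n = 2000
# Toy IR colors (J-H, H-Ks, W1-W2, W2-W3); WR stars (label 1) are offset.
y = (rng.random(n) < 0.1).astype(int)
X = rng.normal(0.0, 0.5, (n, 4)) + y[:, None] * np.array([0.4, 0.5, 0.3, 0.6])

clf = XGBClassifier(
    n_estimators=300, max_depth=4, learning_rate=0.1,
    # Re-weight the rare positive class, as WR stars are heavily
    # outnumbered by non-WR sources.
    scale_pos_weight=(y == 0).sum() / max((y == 1).sum(), 1),
    eval_metric="logloss")
clf.fit(X, y)
print("WR detection rate on training data:", clf.predict(X)[y == 1].mean())
```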
APA, Harvard, Vancouver, ISO, and other styles
43

Wang, Fang, Huitao Li, Kai Wang, Lichen Su, Jing Li, and Lili Zhang. "An Improved Object Detection Method for Underwater Sonar Image Based on PP-YOLOv2." Journal of Sensors 2022 (November 21, 2022): 1–12. http://dx.doi.org/10.1155/2022/5827499.

Full text
Abstract:
Forward-looking sonar is widely used to detect underwater obstacles and objects for navigational safety. Automatic sonar image recognition plays an important role in reducing staff workload and the subjective errors caused by visual fatigue. However, automatic object classification in forward-looking sonar is still lacking, owing to the small number of effective samples and low signal-to-noise ratios (SNR). This paper proposes an improved PP-YOLOv2 algorithm for real-time detection, called PPYOLO-T. Specifically, the proposed method first re-segments the sonar image according to different aspect ratios and filters the acoustic noise in various ways. Then, an attention mechanism is introduced to improve the network's feature extraction ability. Finally, a decoupled head is used to optimize the multi-object classification. Experimental results show that the proposed method effectively improves the accuracy of the multi-target detection task and meets the requirement of robust real-time detection for both raw and noisy sonar targets.
APA, Harvard, Vancouver, ISO, and other styles
44

Han, Jiaming, Yuqiang Ren, Jian Ding, Ke Yan, and Gui-Song Xia. "Few-Shot Object Detection via Variational Feature Aggregation." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 1 (2023): 755–63. http://dx.doi.org/10.1609/aaai.v37i1.25153.

Full text
Abstract:
As few-shot object detectors are often trained with abundant base samples and fine-tuned on few-shot novel examples, the learned models are usually biased to base classes and sensitive to the variance of novel examples. To address this issue, we propose a meta-learning framework with two novel feature aggregation schemes. More precisely, we first present a Class-Agnostic Aggregation (CAA) method, where the query and support features can be aggregated regardless of their categories. The interactions between different classes encourage class-agnostic representations and reduce confusion between base and novel classes. Based on the CAA, we then propose a Variational Feature Aggregation (VFA) method, which encodes support examples into class-level support features for robust feature aggregation. We use a variational autoencoder to estimate class distributions and sample variational features from distributions that are more robust to the variance of support examples. Besides, we decouple classification and regression tasks so that VFA is performed on the classification branch without affecting object localization. Extensive experiments on PASCAL VOC and COCO demonstrate that our method significantly outperforms a strong baseline (up to 16%) and previous state-of-the-art methods (4% in average).
APA, Harvard, Vancouver, ISO, and other styles
45

Ali, Ali Khudhair Abbas Ali, and Yıldız Aydın. "Vision Transformer-Based Approach: A Novel Method for Object Recognition." Karadeniz Fen Bilimleri Dergisi 15, no. 1 (2025): 560–76. https://doi.org/10.31466/kfbd.1620640.

Full text
Abstract:
This paper proposes a hybrid method to improve object recognition on low-quality, imbalanced datasets. The proposed method aims to enhance object recognition performance using the Vision Transformer (ViT) deep learning model and various classical machine learning classifiers (LightGBM, AdaBoost, ExtraTrees, and Logistic Regression). The Caltech-101 dataset used in the study is a low-resolution, noisy image dataset with class imbalance problems. Our method achieves better results by combining the feature extraction capabilities of the Vision Transformer model with the robust classification performance of classical machine learning classifiers. Experiments conducted on the Caltech-101 dataset demonstrate that the proposed method achieves a precision of 92.3%, a recall of 89.7%, and an accuracy of 95.5%, highlighting its effectiveness in addressing the challenges of object recognition on imbalanced datasets.
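One plausible way to wire up such a hybrid, assuming torchvision's pretrained ViT-B/16 as the frozen feature extractor and logistic regression as one of the classical heads, is sketched below; `train_imgs` and `train_labels` are hypothetical inputs, not names from the paper.

```python
# Sketch of the hybrid idea: freeze a Vision Transformer as a feature
# extractor and train a classical classifier on its embeddings.
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights
from sklearn.linear_model import LogisticRegression

weights = ViT_B_16_Weights.DEFAULT        # downloads pretrained weights
vit = vit_b_16(weights=weights).eval()
vit.heads = torch.nn.Identity()           # drop the classification head
preprocess = weights.transforms()

@torch.no_grad()
def embed(images):
    """images: list of PIL images -> (N, 768) ViT embeddings."""
    batch = torch.stack([preprocess(im) for im in images])
    return vit(batch).numpy()

# Usage (assuming `train_imgs` and `train_labels` are available):
# X = embed(train_imgs)
# clf = LogisticRegression(max_iter=1000).fit(X, train_labels)
```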
APA, Harvard, Vancouver, ISO, and other styles
46

Shi, Run, Chaoqun Wang, Gang Zhao, and Chunyan Xu. "SCA-MMA: Spatial and Channel-Aware Multi-Modal Adaptation for Robust RGB-T Object Tracking." Electronics 11, no. 12 (2022): 1820. http://dx.doi.org/10.3390/electronics11121820.

Full text
Abstract:
The RGB and thermal (RGB-T) object tracking task is challenging, especially under target changes caused by deformation, abrupt motion, background clutter, and occlusion. It is critical to exploit the complementary nature of visual RGB and thermal infrared data. In this work, we address the RGB-T object tracking task with a novel spatial- and channel-aware multi-modal adaptation (SCA-MMA) framework, which builds an adaptive feature learning process to better mine object-aware information in a unified network. For each modality, a spatial-aware adaptation mechanism is introduced to dynamically learn the location-based characteristics of specific tracking objects at multiple convolution layers. Further, a channel-aware multi-modal adaptation mechanism is proposed to adaptively learn the feature fusion/aggregation of the different modalities. To perform object tracking, we employ a binary classification module with two fully connected layers to predict the bounding boxes of specific targets. Comprehensive evaluations on the GTOT and RGBT234 datasets demonstrate the significant superiority of the proposed SCA-MMA for robust RGB-T object tracking: the precision rate (PR) and success rate (SR) reach 90.5%/73.2% on GTOT and 80.2%/56.9% on RGBT234, significantly higher than state-of-the-art algorithms.
APA, Harvard, Vancouver, ISO, and other styles
47

Thinh, Bui Van, Tran Anh Tuan, Ngo Quoc Viet, and Pham The Bao. "Content based video retrieval system using principal object analysis." Tạp chí Khoa học 14, no. 9 (2019): 24. http://dx.doi.org/10.54607/hcmue.js.14.9.291(2017).

Full text
Abstract:
Video retrieval is the problem of searching videos or clips based on content related to an input image or video. It remains challenging due to the diversity of video types, frame transitions, and camera positions; selecting an appropriate similarity measure is also an open question. We propose a content-based video retrieval system built from a few main steps that achieves good performance. From an input video, we extract keyframes and principal objects using the Segmentation of Aggregating Superpixels (SAS) algorithm. Speeded Up Robust Features (SURF) are then computed on those principal objects, and a bag-of-words model combined with SVM classification is applied to obtain the retrieval result. Our system is evaluated on over 300 videos spanning music, history, movies, sports, natural scenes, and TV programs.
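The feature-to-retrieval portion of such a pipeline can be sketched as below; ORB stands in for SURF (which requires a patented opencv-contrib build), and the random keyframes are toy stand-ins for frames extracted from real videos.

```python
# Sketch of the keyframe -> local features -> bag-of-words -> SVM
# pipeline. ORB replaces SURF here; the structure is the same.
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

orb = cv2.ORB_create(nfeatures=300)

def descriptors(gray):
    """Local descriptors for one grayscale keyframe."""
    _, des = orb.detectAndCompute(gray, None)
    return des if des is not None else np.zeros((0, 32), np.uint8)

def bow_histogram(des, codebook):
    """Normalized visual-word histogram for one keyframe."""
    words = codebook.predict(des.astype(np.float32)) if len(des) else []
    hist, _ = np.histogram(words, bins=np.arange(codebook.n_clusters + 1))
    return hist / max(hist.sum(), 1)

# Toy keyframes; real input would be keyframes extracted from videos.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, (120, 160), dtype=np.uint8) for _ in range(20)]
labels = [i % 2 for i in range(20)]

all_des = np.vstack([descriptors(f) for f in frames]).astype(np.float32)
codebook = KMeans(n_clusters=50, n_init=10, random_state=0).fit(all_des)

X = np.array([bow_histogram(descriptors(f), codebook) for f in frames])
clf = SVC(kernel="linear").fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```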
APA, Harvard, Vancouver, ISO, and other styles
48

WANG, DONG, GANG YANG, and HUCHUAN LU. "TRI-TRACKING: COMBINING THREE INDEPENDENT VIEWS FOR ROBUST VISUAL TRACKING." International Journal of Image and Graphics 12, no. 03 (2012): 1250021. http://dx.doi.org/10.1142/s0219467812500210.

Full text
Abstract:
Robust tracking is a challenging problem due to the intrinsic appearance variability of objects caused by in-plane or out-of-plane rotation, and due to changes in extrinsic factors such as illumination, occlusion, background clutter, and local blur. In this paper, we present a novel tri-tracking framework that combines different views (different models using independent features) for robust object tracking. The framework exploits a hybrid discriminative-generative model based on online semi-supervised learning. Only the first frame is needed for parameter initialization; tracking then proceeds automatically in the remaining frames, with the model updated online to capture changes in both object appearance and background. Our tri-tracking approach makes three main contributions. First, we propose a tracking framework that combines a generative model and a discriminative model, together with different cues that complement each other. Second, by introducing a third tracker, we provide a solution to the difficulty of combining two classification results in a co-training framework when they disagree. Third, we propose a principled way of combining different views based on their discriminative power. Experiments on several challenging videos demonstrate that the proposed tri-tracking framework is robust.
APA, Harvard, Vancouver, ISO, and other styles
49

Uçar, Ayşegül, Yakup Demir, and Cüneyt Güzeliş. "Object recognition and detection with deep learning for autonomous driving applications." SIMULATION 93, no. 9 (2017): 759–69. http://dx.doi.org/10.1177/0037549717709932.

Full text
Abstract:
Autonomous driving requires reliable and accurate detection and recognition of surrounding objects in real drivable environments. Although different object detection algorithms have been proposed, not all are robust enough to detect and recognize occluded or truncated objects. In this paper, we propose a novel hybrid Local Multiple system (LM-CNN-SVM) based on convolutional neural networks (CNNs) and support vector machines (SVMs), chosen for their powerful feature extraction capability and robust classification properties, respectively. In the proposed system, we first divide the whole image into local regions and employ multiple CNNs to learn local object features. Second, we select discriminative features using principal component analysis. These features are then fed into multiple SVMs, which apply both empirical and structural risk minimization, rather than using a single CNN directly, to increase the generalization ability of the classifier system. Finally, we fuse the SVM outputs. In addition, we use the pre-trained AlexNet and a new CNN architecture. We carry out object recognition and pedestrian detection experiments on the Caltech-101 and Caltech Pedestrian datasets. Comparisons with the best state-of-the-art methods show that the proposed system achieves better results.
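A compact sketch of the local-regions / PCA / multiple-SVM / fusion pipeline follows; the per-region "CNN features" are random placeholders rather than real AlexNet activations, and score averaging is one simple choice of fusion.

```python
# Sketch of the LM-CNN-SVM pipeline: split the image into local regions,
# extract CNN features per region, reduce with PCA, train one SVM per
# region, and fuse the SVM outputs by averaging decision scores.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, n_regions, feat_dim = 200, 4, 256
y = rng.integers(0, 2, n)
# Placeholder per-region CNN features, weakly correlated with the label.
regions = [rng.random((n, feat_dim)) + y[:, None] * 0.15
           for _ in range(n_regions)]

pcas, svms = [], []
for X in regions:
    pca = PCA(n_components=32).fit(X)        # keep dominant components
    svm = SVC(kernel="rbf").fit(pca.transform(X), y)
    pcas.append(pca)
    svms.append(svm)

# Fuse: average the per-region decision scores, then threshold at zero.
scores = np.mean([svm.decision_function(pca.transform(X))
                  for svm, pca, X in zip(svms, pcas, regions)], axis=0)
print("fused training accuracy:", ((scores > 0).astype(int) == y).mean())
```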
APA, Harvard, Vancouver, ISO, and other styles
50

BÁLYA, DÁVID. "CNN UNIVERSAL MACHINE AS CLASSIFICATION PLATFORM: AN ART-LIKE CLUSTERING ALGORITHM." International Journal of Neural Systems 13, no. 06 (2003): 415–25. http://dx.doi.org/10.1142/s0129065703001807.

Full text
Abstract:
Fast and robust classification of feature vectors is a crucial task in a number of real-time systems. A cellular neural/nonlinear network universal machine (CNN-UM) can be very efficient as a feature detector; the next step is to post-process the results for object recognition. This paper shows how a robust classification scheme based on adaptive resonance theory (ART) can be mapped onto the CNN-UM. Moreover, this mapping is general enough to include different types of feed-forward neural networks. The designed analogic CNN algorithm can classify the extracted feature vectors while keeping the advantages of ART networks, such as robust, plastic, and fault-tolerant behavior. An analogic algorithm is presented for unsupervised classification with tunable sensitivity and automatic new-class creation, and the algorithm is extended to supervised classification. The presented binary feature vector classification is implemented on existing standard CNN-UM chips for fast classification. The experimental evaluation shows promising performance, with 100% accuracy on the training set.
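The clustering behavior described here can be modeled in plain Python as an ART1-style procedure with a tunable vigilance parameter and automatic class creation; this is a sketch of the algorithm's logic, not the analogic CNN-UM implementation.

```python
# Sketch of ART1-style unsupervised clustering of binary feature
# vectors: a vigilance test decides whether an input resonates with an
# existing class prototype; otherwise a new class is created.
import numpy as np

def art1_cluster(vectors, vigilance=0.7):
    prototypes = []                 # one binary prototype per class
    assignments = []
    for v in vectors:
        placed = False
        for i, p in enumerate(prototypes):
            match = np.logical_and(v, p).sum() / max(v.sum(), 1)
            if match >= vigilance:                   # vigilance test passed
                prototypes[i] = np.logical_and(v, p) # fast-learning update
                assignments.append(i)
                placed = True
                break
        if not placed:                               # no resonance: new class
            prototypes.append(v.copy())
            assignments.append(len(prototypes) - 1)
    return assignments, prototypes

rng = np.random.default_rng(0)
data = rng.integers(0, 2, (10, 16)).astype(bool)
labels, protos = art1_cluster(data, vigilance=0.6)
print(labels, "->", len(protos), "classes")
```

Raising the vigilance parameter makes the match test stricter, so more classes are created automatically, which mirrors the tunable sensitivity the abstract describes.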
APA, Harvard, Vancouver, ISO, and other styles