
Journal articles on the topic '2D Convolution Neural Network (CNN)'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic '2D Convolution Neural Network (CNN).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Liao, Shengbin, Xiaofeng Wang, and ZongKai Yang. "A heterogeneous two-stream network for human action recognition." AI Communications 36, no. 3 (2023): 219–33. http://dx.doi.org/10.3233/aic-220188.

Abstract:
The most widely used two-stream architectures and building blocks for human action recognition in videos generally consist of 2D or 3D convolution neural networks. 3D convolution can capture motion information between video frames, which is essential for video classification. 3D convolution neural networks usually obtain better performance than 2D ones; however, they also increase the computational cost. In this paper, we propose a heterogeneous two-stream architecture which incorporates two convolutional networks. One stream uses a mixed convolution network (MCN), which inserts some 3D convolutions in the middle of 2D convolutions, to train on RGB frames; the other adopts a BN-Inception network to train on optical-flow frames. Considering the redundancy of neighboring video frames, we adopt a sparse sampling strategy to decrease the computational cost. Our architecture is trained and evaluated on the standard video action benchmarks HMDB51 and UCF101. Experimental results show our approach obtains state-of-the-art performance on HMDB51 (73.04%) and UCF101 (95.27%).
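To make the mixed 2D/3D idea concrete, here is a minimal PyTorch sketch of a block that applies 2D convolutions per frame and a 3D convolution across frames. It illustrates the general pattern only, not the authors' MCN; all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class Mixed2D3DBlock(nn.Module):
    """Toy sketch of a 2D/3D mixed block: 2D convs process each frame,
    a 3D conv then mixes information across neighbouring frames."""
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.spatial = nn.Conv2d(in_ch, mid_ch, kernel_size=3, padding=1)
        self.temporal = nn.Conv3d(mid_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):                                # x: (B, C, T, H, W) clip
        b, c, t, h, w = x.shape
        y = x.transpose(1, 2).reshape(b * t, c, h, w)    # fold time into batch
        y = torch.relu(self.spatial(y))                  # per-frame 2D conv
        y = y.reshape(b, t, -1, h, w).transpose(1, 2)    # back to (B, C', T, H, W)
        return torch.relu(self.temporal(y))              # 3D conv mixes frames

clip = torch.randn(2, 3, 8, 112, 112)    # batch of two 8-frame RGB clips
out = Mixed2D3DBlock(3, 16, 32)(clip)    # -> (2, 32, 8, 112, 112)
```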
2

Wang, Baiyang, Dongyue Huo, Yuyun Kang, and Jie Sun. "AGV Status Monitoring and Fault Diagnosis based on CNN." Journal of Physics: Conference Series 2281, no. 1 (2022): 012019. http://dx.doi.org/10.1088/1742-6596/2281/1/012019.

Abstract:
In order to solve the complexity and low accuracy of AGV fault detection systems, a convolutional neural network (CNN)-based status monitoring and fault diagnosis method for automatic guided vehicles (AGVs) is proposed. Firstly, the vibration signals of the core components of the AGV are converted into two-dimensional (2D) images. Secondly, the 2D images are input into a convolution neural network for training. Finally, the trained model is used to monitor the running status of the AGV and identify faults. The results show that the proposed method can effectively monitor the status of an AGV in operation.
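A common way to turn a 1D vibration signal into a 2D CNN input is to fold a fixed-length window into a square grayscale image. The NumPy sketch below shows this general preprocessing idea with an assumed window size; the paper's exact conversion may differ.

```python
import numpy as np

def signal_to_image(signal, size=64):
    """Fold a 1D vibration signal into a (size x size) grayscale image,
    a common preprocessing step before a 2D CNN (a sketch, not the
    authors' exact pipeline)."""
    n = size * size
    seg = signal[:n].astype(float)                              # size*size samples
    seg = (seg - seg.min()) / (seg.max() - seg.min() + 1e-12)   # scale to [0, 1]
    return (seg.reshape(size, size) * 255).astype(np.uint8)

img = signal_to_image(np.random.randn(8192))   # e.g. one vibration window
```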
3

Chidester, Benjamin, Tianming Zhou, Minh N. Do, and Jian Ma. "Rotation equivariant and invariant neural networks for microscopy image analysis." Bioinformatics 35, no. 14 (2019): i530–i537. http://dx.doi.org/10.1093/bioinformatics/btz353.

Abstract:
Motivation: Neural networks have been widely used to analyze high-throughput microscopy images. However, the performance of neural networks can be significantly improved by encoding known invariance for particular tasks. Highly relevant to the goal of automated cell phenotyping from microscopy image data is rotation invariance. Here we consider the application of two schemes for encoding rotation equivariance and invariance in a convolutional neural network, namely, the group-equivariant CNN (G-CNN) and a new architecture with simple, efficient conic convolution, for classifying microscopy images. We additionally integrate the 2D discrete Fourier transform (2D-DFT) as an effective means for encoding global rotational invariance. We call our new method the Conic Convolution and DFT Network (CFNet). Results: We evaluated the efficacy of CFNet and the G-CNN as compared to a standard CNN for several different image classification tasks, including simulated and real microscopy images of subcellular protein localization, and demonstrated improved performance. We believe CFNet has the potential to improve many high-throughput microscopy image analysis applications. Availability and implementation: Source code of CFNet is available at https://github.com/bchidest/CFNet. Supplementary information: Supplementary data are available at Bioinformatics online.
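For intuition, a far simpler device than G-CNNs or conic convolutions is to average a model's predictions over the four 90° rotations of the input, which is exactly invariant to the C4 rotation group. The PyTorch sketch below shows only this simpler idea, not CFNet's mechanism.

```python
import torch

def c4_invariant_logits(model, x):
    """Average predictions over the four 90-degree rotations of the input.
    This gives exact invariance to the C4 rotation group; it illustrates
    the idea of encoding rotational invariance, not the CFNet design."""
    logits = [model(torch.rot90(x, k, dims=(-2, -1))) for k in range(4)]
    return torch.stack(logits).mean(dim=0)
```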
4

Feng, Yuchao, Jianwei Zheng, Mengjie Qin, Cong Bai, and Jinglin Zhang. "3D Octave and 2D Vanilla Mixed Convolutional Neural Network for Hyperspectral Image Classification with Limited Samples." Remote Sensing 13, no. 21 (2021): 4407. http://dx.doi.org/10.3390/rs13214407.

Abstract:
Owing to their outstanding feature extraction capability, convolutional neural networks (CNNs) have been widely applied to hyperspectral image (HSI) classification problems and have achieved impressive performance. However, it is well known that 2D convolution fails to consider spectral information, while 3D convolution requires a huge computational cost. In addition, the cost of labeling and the limitation of computing resources make it urgent to improve the generalization performance of models with scarcely labeled samples. To relieve these issues, we design an end-to-end 3D octave and 2D vanilla mixed CNN, namely Oct-MCNN-HS, based on the typical 3D-2D mixed CNN (MCNN). Notably, two feature fusion operations are deliberately constructed to maximize discriminative features and practical performance: 2D vanilla convolution merges the feature maps generated by 3D octave convolutions along the channel direction, and homology shifting aggregates the information of pixels located at the same spatial position. Extensive experiments are conducted on four publicly available HSI datasets to evaluate the effectiveness and robustness of our model, and the results verify the superiority of Oct-MCNN-HS in both efficacy and efficiency.
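The channel-direction merge described above can be pictured as flattening the depth axis of a 3D feature volume into channels before a 2D convolution. A minimal PyTorch sketch with assumed shapes:

```python
import torch
import torch.nn as nn

# Sketch of the channel-direction merge (shapes are assumptions): a 3D conv
# output (B, C, D, H, W) is flattened to (B, C*D, H, W) so that a single
# 2D "vanilla" convolution can fuse all spectral slices at once.
feat3d = torch.randn(4, 8, 12, 25, 25)    # e.g. 8 maps x 12 spectral depths
b, c, d, h, w = feat3d.shape
merged = feat3d.reshape(b, c * d, h, w)   # stack depth along channels
fuse2d = nn.Conv2d(c * d, 64, kernel_size=3, padding=1)
out = fuse2d(merged)                      # (4, 64, 25, 25)
```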
5

M, Sowmya, Balasubramanian M, and Vaidehi K. "Human Behavior Classification using 2D – Convolutional Neural Network, VGG16 and ResNet50." Indian Journal of Science and Technology 16, no. 16 (2023): 1221–29. https://doi.org/10.17485/IJST/v16i16.199.

Abstract:
Objective: To develop a real-time application for human behavior classification using a 2-dimensional convolution neural network, VGG16 and ResNet50. Methods: This study provides a novel system which considers sitting, standing and walking as normal human behaviors. It consists of three major steps: dataset collection, training, and testing. In this work real-time images are used. The human behavior classification dataset contains 2271 training images and 539 testing images. Findings: The convolution neural network (CNN), VGG16 and ResNet50 are trained using images of normal human behavior. Novelty: The human behavior classification dataset is used in this work, and the experimental results have shown that ResNet50 outperformed VGG16 and 2D-CNN on human behavior classification with an accuracy of 99.72%. This work can detect the three normal behaviors of humans in an unconstrained laboratory environment. Keywords: Deep Learning; 2D Convolution Neural Network (CNN); Human Behavior Classification; ADAM Optimizer; VGG16; ResNet50
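A typical way to reproduce this ResNet50 transfer-learning setup is to swap the 1000-way ImageNet head for a 3-class head and fine-tune with Adam. The sketch below uses recent torchvision; all hyperparameters and shapes are illustrative, not the paper's.

```python
import torch
import torch.nn as nn
from torchvision import models

# Fine-tune an ImageNet-pretrained ResNet50 for the three behaviour classes
# (sitting, standing, walking); settings here are illustrative assumptions.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 3)               # replace 1000-way head
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # ADAM, per the keywords
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)    # stand-in for a training batch
labels = torch.randint(0, 3, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```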
6

Yuan, Q., Y. Ang, and H. Z. M. Shafri. "HYPERSPECTRAL IMAGE CLASSIFICATION USING RESIDUAL 2D AND 3D CONVOLUTIONAL NEURAL NETWORK JOINT ATTENTION MODEL." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIV-M-3-2021 (August 10, 2021): 187–93. http://dx.doi.org/10.5194/isprs-archives-xliv-m-3-2021-187-2021.

Abstract:
Hyperspectral image classification (HSIC) is a challenging task in remote sensing data analysis and has been applied in many domains for better identification and inspection of the Earth's surface by extracting spectral and spatial information. The combination of abundant spectral features and accurate spatial information can improve classification accuracy. However, many traditional methods are based on handcrafted features, which brings difficulties for multi-classification tasks due to intra-class spectral heterogeneity and inter-class similarity. Deep learning algorithms, especially the convolutional neural network (CNN), have been perceived as promising feature extractors and classifiers for processing hyperspectral remote sensing images. Although 2D CNN can extract spatial features, the specific spectral properties are not used effectively; 3D CNN has the capability for both, but its computational burden increases as layers are stacked. To address these issues, we propose a novel HSIC framework based on a residual CNN network that integrates the advantages of 2D and 3D CNN. First, 3D convolutions focus on extracting spectral features, with feature recalibration and refinement by a channel attention mechanism. A 2D depth-wise separable convolution approach with different kernel sizes concentrates on obtaining multi-scale spatial features and reducing model parameters. Furthermore, the residual structure optimizes back-propagation for network training. The results and analysis of extensive HSIC experiments show that the proposed residual 2D-3D CNN network can effectively extract spectral and spatial features and improve classification accuracy.
7

Gao, Wenqiang, Zhiyun Xiao, and Tengfei Bao. "Detection and Identification of Potato-Typical Diseases Based on Multidimensional Fusion Atrous-CNN and Hyperspectral Data." Applied Sciences 13, no. 8 (2023): 5023. http://dx.doi.org/10.3390/app13085023.

Abstract:
As one of the world's most crucial crops, the potato is an essential source of nutrition for human activities. However, several diseases pose a severe threat to the yield and quality of potatoes, so timely and accurate detection and identification of potato diseases are of great importance. Hyperspectral imaging has emerged as an essential tool that provides rich spectral and spatial distribution information and has been widely used in potato disease detection and identification. Nevertheless, prediction accuracy is often low when processing hyperspectral data using a one-dimensional convolutional neural network (1D-CNN). Additionally, conventional three-dimensional convolutional neural networks (3D-CNN) often require high hardware consumption while processing hyperspectral data. In this paper, we propose an Atrous-CNN network structure that fuses multiple dimensions to address these problems. The proposed structure combines the spectral information extracted by a 1D-CNN, the spatial information extracted by a 2D-CNN, and the spatial-spectral information extracted by a 3D-CNN. To enlarge the receptive field of the convolution kernel and reduce the loss of hyperspectral data, atrous (dilated) convolution is utilized in the 1D-CNN and 2D-CNN to extract data features. We tested the proposed structure on three real-world potato diseases and achieved recognition accuracy of up to 0.9987. The algorithm presented in this paper effectively extracts hyperspectral feature information using three CNNs of different dimensions, leading to higher recognition accuracy and reduced hardware consumption. Therefore, it is feasible to use the proposed network and hyperspectral image technology for potato plant disease identification.
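Atrous (dilated) convolution, as used above, spaces out the kernel taps to widen the receptive field at no extra parameter cost. A two-line PyTorch illustration (channel counts are assumptions):

```python
import torch.nn as nn

# A 3x3 kernel with dilation=2 covers a 5x5 neighbourhood; padding is chosen
# so the output keeps the input size (a generic sketch, not the paper's net).
atrous1d = nn.Conv1d(1, 16, kernel_size=3, dilation=2, padding=2)   # spectral branch
atrous2d = nn.Conv2d(1, 16, kernel_size=3, dilation=2, padding=2)   # spatial branch
```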
8

Leong, Mei Chee, Dilip K. Prasad, Yong Tsui Lee, and Feng Lin. "Semi-CNN Architecture for Effective Spatio-Temporal Learning in Action Recognition." Applied Sciences 10, no. 2 (2020): 557. http://dx.doi.org/10.3390/app10020557.

Abstract:
This paper introduces a fusion convolutional architecture for efficient learning of spatio-temporal features in video action recognition. Unlike 2D convolutional neural networks (CNNs), 3D CNNs can be applied directly to consecutive frames to extract spatio-temporal features. The aim of this work is to fuse the convolution layers from 2D and 3D CNNs to allow temporal encoding with fewer parameters than 3D CNNs. We adopt transfer learning from pre-trained 2D CNNs for spatial extraction, followed by temporal encoding, before connecting to 3D convolution layers at the top of the architecture. We construct our fusion architecture, semi-CNN, based on three popular models: VGG-16, ResNets and DenseNets, and compare its performance with their corresponding 3D models. Our empirical results on the action recognition dataset UCF-101 demonstrate that our fusion of 1D, 2D and 3D convolutions outperforms the 3D model of the same depth, with fewer parameters and reduced overfitting. Our semi-CNN architecture achieved an average 16–30% boost in top-1 accuracy when evaluated on input videos of 16 frames.
9

Liang, Lianhui, Shaoquan Zhang, Jun Li, Antonio Plaza, and Zhi Cui. "Multi-Scale Spectral-Spatial Attention Network for Hyperspectral Image Classification Combining 2D Octave and 3D Convolutional Neural Networks." Remote Sensing 15, no. 7 (2023): 1758. http://dx.doi.org/10.3390/rs15071758.

Abstract:
Traditional convolutional neural networks (CNNs) can be applied to obtain spectral-spatial feature information from hyperspectral images (HSIs). However, they often introduce significant redundant spatial feature information. The octave convolution network is frequently utilized instead of a traditional CNN to decrease the spatial redundancy of the network and extend its receptive field. However, 3D octave convolution-based approaches may introduce extensive parameters and complicate the network. To solve these issues, we propose a new HSI classification approach with a multi-scale spectral-spatial network-based framework that combines 2D octave and 3D CNNs. Our method, called MOCNN, first utilizes 2D octave convolution and 3D DenseNet branch networks with various convolutional kernel sizes to obtain complex spatial contextual feature information and spectral characteristics, separately. Moreover, channel and spectral attention mechanisms are applied to these two branch networks, respectively, to emphasize significant feature regions and certain important spectral bands that carry discriminative information for the categorization. Furthermore, a sample balancing strategy is applied to address the sample imbalance problem. Extensive experiments are undertaken on four HSI datasets, demonstrating that our MOCNN approach outperforms several other methods for HSI classification, especially in scenarios dominated by limited and imbalanced sample data.
10

Archana, D. "Brain Tumor Detection Using Convolution Neural Networks." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem46974.

Abstract:
Early diagnosis of brain tumors is important for improving patient prognoses; however, traditional diagnostic methods like biopsies require invasive surgical procedures. In this paper, we introduce two deep learning-based methods, a new two-dimensional Convolutional Neural Network (CNN) and a convolutional auto-encoder network, that enable the accurate classification of brain tumors from magnetic resonance imaging (MRI). A dataset of 7,000 T1-weighted contrast-enhanced MRI images was utilized, including glioma, meningioma, pituitary gland tumor, and normal brain samples. Preprocessing and augmentation procedures were carried out on the dataset in order to enhance the generalization ability of the models. The suggested 2D CNN architecture includes eight convolutional layers and four pooling layers, and utilizes batch normalization with a uniform 2×2 kernel across the network. The auto-encoder architecture combines feature extraction and classification by utilizing the last output of the encoder. Experimental results show that the 2D CNN achieved a training accuracy of 96.47% with an average recall of 95%, showing good performance and computational efficiency. The simplicity and effectiveness of the proposed CNN make it a promising, non-surgical and highly reliable tool for assisting radiologists in the diagnosis of brain tumors in real-time clinical applications.
11

Li, Jiaojiao, Ruxing Cui, Bo Li, Rui Song, Yunsong Li, and Qian Du. "Hyperspectral Image Super-Resolution with 1D–2D Attentional Convolutional Neural Network." Remote Sensing 11, no. 23 (2019): 2859. http://dx.doi.org/10.3390/rs11232859.

Abstract:
Hyperspectral image (HSI) super-resolution (SR) has great application value and has attracted broad attention. The hyperspectral single-image super-resolution (HSISR) task is especially difficult due to the unavailability of auxiliary high-resolution images. To tackle this challenging task, unlike existing learning-based HSISR algorithms, in this paper we propose a novel framework, a 1D-2D attentional convolutional neural network, which employs a separation strategy to extract spatial and spectral information and then fuse them gradually. More specifically, our network consists of two streams: a spatial one and a spectral one. The spectral stream is mainly composed of 1D convolutions to encode small changes in the spectrum, while 2D convolution, cooperating with an attention mechanism, is used in the spatial pathway to encode spatial information. Furthermore, a novel hierarchical side-connection strategy is proposed for effectively fusing spectral and spatial information. Compared with the typical 3D convolutional neural network (CNN), the 1D-2D CNN is easier to train and has fewer parameters. More importantly, our proposed framework can not only provide an effective solution to the HSISR problem, but also explore its potential in hyperspectral pansharpening. Experiments over widely used benchmarks on SISR and hyperspectral pansharpening demonstrate that the proposed method outperforms other state-of-the-art methods in both visual quality and quantitative measurements.
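The separation strategy can be sketched as two streams: 1D convolutions running along each pixel's spectrum and 2D convolutions running over the spatial plane, with a plain concatenation standing in for the paper's attention and side connections. A hedged PyTorch sketch with assumed sizes:

```python
import torch
import torch.nn as nn

class SpectralSpatialStreams(nn.Module):
    """Minimal two-stream sketch: 1D convs run along the spectral axis of
    each pixel, 2D convs run over the spatial plane (not the paper's full
    attention/side-connection design)."""
    def __init__(self, bands):
        super().__init__()
        self.spec = nn.Conv1d(1, 8, kernel_size=7, padding=3)     # along wavelengths
        self.spat = nn.Conv2d(bands, 8, kernel_size=3, padding=1)

    def forward(self, x):                                # x: (B, bands, H, W)
        b, c, h, w = x.shape
        s = x.permute(0, 2, 3, 1).reshape(b * h * w, 1, c)        # spectrum per pixel
        s = self.spec(s).reshape(b, h, w, 8, c).permute(0, 3, 4, 1, 2)
        s = s.mean(dim=2)                                # pool wavelengths -> (B, 8, H, W)
        return torch.cat([s, self.spat(x)], dim=1)       # fuse the two streams

out = SpectralSpatialStreams(31)(torch.randn(2, 31, 32, 32))      # (2, 16, 32, 32)
```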
12

Dai, Hualin, Yingli Yue, and Qi Liu. "Hyperspectral Image Classification Based on Hybrid Depth-Wise Separable Convolution and Dual-Branch Feature Fusion Network." Applied Sciences 15, no. 3 (2025): 1394. https://doi.org/10.3390/app15031394.

Abstract:
Recently, advancements in convolutional neural networks (CNNs) have significantly contributed to the advancement of hyperspectral image (HSI) classification. However, the problem of limited training samples is the primary obstacle to obtaining further improvements in HSI classification. Traditional methods relying solely on 2D-CNN for feature extraction underutilize the inter-band correlations of HSI, while methods based on 3D-CNN alone lead to an increase in training parameters. To solve these problems, we propose an HSI classification network based on hybrid depth-wise separable convolution and dual-branch feature fusion (HDCDF). The dual-branch structure is designed in HDCDF to simultaneously extract integrated spectral–spatial features and obtain complementary features via feature fusion. The proposed 2D depth-wise separable convolution attention (2D-DCAttention) block and hybrid residual blocks are applied to the dual branch, respectively, further extracting more representative and comprehensive features. Instead of full 3D convolutions, HDCDF uses hybrid 2D–3D depth-wise separable convolutions, offering computational efficiency. Experiments are conducted on three benchmark HSI datasets: Indian Pines, University of Pavia, and Salinas Valley. The experimental results show that the proposed method exhibits superior performance when training samples are extremely limited, outpacing the state-of-the-art method by an average of 2.03% in overall accuracy across the three datasets, which shows that HDCDF has considerable potential for HSI classification.
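The depth-wise separable building block mentioned above factors a full convolution into a per-channel spatial convolution plus a 1x1 pointwise convolution, which costs far fewer parameters. A generic PyTorch sketch (the same idea extends to Conv3d for the hybrid's 3D part):

```python
import torch.nn as nn

# groups=in_channels makes the first conv act on each channel independently
# (depthwise); the 1x1 conv then mixes channels (pointwise). A generic sketch.
def depthwise_separable_2d(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch),  # depthwise
        nn.Conv2d(in_ch, out_ch, kernel_size=1),                          # pointwise
    )
```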
13

Ren, Yun, Changren Zhu, and Shunping Xiao. "Deformable Faster R-CNN with Aggregating Multi-Layer Features for Partially Occluded Object Detection in Optical Remote Sensing Images." Remote Sensing 10, no. 9 (2018): 1470. http://dx.doi.org/10.3390/rs10091470.

Abstract:
Region-based convolutional networks have shown remarkable ability for object detection in optical remote sensing images. However, standard CNNs are inherently limited in modeling geometric transformations due to the fixed geometric structures in their building modules. To address this, we introduce a new module named deformable convolution that is integrated into the prevailing Faster R-CNN. By adding 2D offsets to the regular sampling grid of the standard convolution, it learns augmented spatial sampling locations in the modules from the target task without additional supervision. In our work, a deformable Faster R-CNN is constructed by substituting a deformable convolution layer for the standard convolution layer in the last network stage. Besides, top-down and skip connections are adopted to produce a single high-level feature map of fine resolution, on which the predictions are made. To make the model robust to occlusion, a simple yet effective data augmentation technique is proposed for training the convolutional neural network. Experimental results show that our deformable Faster R-CNN improves the mean average precision by a large margin on the SORSI and HRRS datasets.
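Deformable convolution is available off the shelf in torchvision: a small ordinary convolution predicts the 2D offsets (two per kernel tap) that deform the regular sampling grid. A minimal sketch with assumed channel sizes:

```python
import torch
from torchvision.ops import DeformConv2d

# A 3x3 kernel needs 2 * 3 * 3 = 18 offset channels (an x and a y shift per tap).
x = torch.randn(1, 64, 32, 32)
offset_pred = torch.nn.Conv2d(64, 2 * 3 * 3, kernel_size=3, padding=1)
deform = DeformConv2d(64, 128, kernel_size=3, padding=1)
y = deform(x, offset_pred(x))   # offsets are learned from the input itself
```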
14

Kondal, Easala Ravi, and Soubhagya Sankar Barpanda. "Hyperspectral image classification using Hyb-3D convolution neural network spectral partitioning." Indonesian Journal of Electrical Engineering and Computer Science 29, no. 1 (2023): 295–303. https://doi.org/10.11591/ijeecs.v29.i1.pp295-303.

Abstract:
Hyperspectral image classification (HSIC) for remote sensing imaging has achieved impressive results using artificial intelligence technology. In deep learning, convolution neural network (CNN) methods such as 2D-CNN and 3D-CNN are widely used to classify the spectral-spatial bands of hyperspectral images (HSI). The proposed Hybrid 3D-CNN (H3D-CNN) model framework extracts deeper features and predicts classification accuracy in supervised learning. The model narrows the gap between supervised and unsupervised learning and reduces the complexity and cost of previous models. The HSI classification analysis is carried out on the real-world Indian Pines and Salinas datasets, captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor, and achieves superior classification accuracy.
15

He, Chu, Zishan Shi, Tao Qu, Dingwen Wang, and Mingsheng Liao. "Lifting Scheme-Based Deep Neural Network for Remote Sensing Scene Classification." Remote Sensing 11, no. 22 (2019): 2648. http://dx.doi.org/10.3390/rs11222648.

Abstract:
Recently, convolutional neural networks (CNNs) have achieved impressive results on remote sensing scene classification, a fundamental problem for scene semantic understanding. However, convolution, the most essential operation in CNNs, restricts the development of CNN-based methods for scene classification: it is not efficient enough for high-resolution remote sensing images and is limited in extracting discriminative features due to its linearity. Thus, there has been growing interest in improving the convolutional layer. The hardware implementation of the JPEG2000 standard relies on the lifting scheme to perform the wavelet transform (WT). Compared with the convolution-based two-channel filter bank method of WT, the lifting scheme is faster, takes up less storage, and has the ability of nonlinear transformation. Therefore, the lifting scheme can be regarded as a better alternative implementation for convolution in vanilla CNNs. This paper introduces the lifting scheme into deep learning and addresses the problems that only fixed and finite wavelet bases can be replaced by the lifting scheme and that its parameters cannot be updated through backpropagation. This paper proves that any convolutional layer in vanilla CNNs can be substituted by an equivalent lifting scheme. A lifting scheme-based deep neural network (LSNet) is presented to promote network applications on computationally limited platforms and to utilize the nonlinearity of the lifting scheme to enhance performance. LSNet is validated on the CIFAR-100 dataset, where the overall accuracies increase by 2.48% and 1.38% in the 1D and 2D experiments, respectively. Experimental results on AID, one of the newest remote sensing scene datasets, demonstrate that 1D LSNet and 2D LSNet achieve 2.05% and 0.45% accuracy improvements compared with vanilla CNNs, respectively.
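For reference, a single fixed lifting step (here, the Haar wavelet) consists of a predict step and an update step on the even/odd split of a signal; LSNet's contribution is to make such steps learnable. A textbook NumPy sketch:

```python
import numpy as np

def haar_lifting(x):
    """One lifting step of the Haar wavelet: split into even/odd samples,
    predict the odd from the even, then update the even (a textbook
    example; learnable lifting as in LSNet replaces these fixed steps)."""
    even, odd = x[::2].astype(float), x[1::2].astype(float)
    detail = odd - even            # predict step: residual of the prediction
    approx = even + detail / 2     # update step: preserves the running average
    return approx, detail

approx, detail = haar_lifting(np.arange(8))   # approx = pairwise means
```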
16

Pushkar, Piyush, Rohan Khandare, Yasharth Prasad, Vishal Kumar, and Dr Megha Kadam. "Real Time Drowsiness Detection System Using CNN." International Journal for Research in Applied Science and Engineering Technology 11, no. 5 (2023): 3884–87. http://dx.doi.org/10.22214/ijraset.2023.49112.

Abstract:
Driver fatigue and rash driving are leading causes of road accidents, which result in the loss of valued life and decreased road traffic safety. Reliable and precise driver drowsiness solutions are essential to prevent accidents and increase road traffic safety. Various driver drowsiness detection systems have been developed using various technologies aimed at specific parameters for detecting the driver's tiredness. This research offers a unique multi-level distribution model for detecting driver drowsiness utilising convolution neural networks (CNNs). To detect the driver's behaviour and emotion, the driver's face pattern is processed with a 2D convolution neural network (CNN). The suggested model is built with OpenCV, and the experimental findings show that it recognises the driver's emotion and tiredness more efficiently than existing technologies.
17

Pushkar, Piyush, Rohan Khandare, Yasharth Prasad, and Vishal Kumar. "Real Time Drowsiness Detection System Using CNN." International Journal for Research in Applied Science and Engineering Technology 11, no. 1 (2023): 1487–90. http://dx.doi.org/10.22214/ijraset.2023.48847.

Abstract:
Driver fatigue and rash driving are leading causes of road accidents, which result in the loss of valued life and decreased road traffic safety. Reliable and precise driver drowsiness solutions are essential to prevent accidents and increase road traffic safety. Various driver drowsiness detection systems have been developed using various technologies aimed at specific parameters for detecting the driver's tiredness. This research offers a unique multi-level distribution model for detecting driver drowsiness utilising convolution neural networks (CNNs). To detect the driver's behaviour and emotion, the driver's face pattern is processed with a 2D convolution neural network (CNN). The suggested model is built with OpenCV, and the experimental findings show that it recognises the driver's emotion and tiredness more efficiently than existing technologies.
18

El Abady, Naglaa F., Hala H. Zayed, and Mohamed Taha. "Source printer identification using convolutional neural network and transfer learning approach." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 948–60. https://doi.org/10.11591/ijai.v13.i1.pp948-960.

Abstract:
In recent years, source printer identification has become increasingly important for detecting forged documents. A printer's distinguishing features are its fingerprints: each printer leaves a unique collection of fingerprints on every printed page. Source printer identification provides a model for identifying the source printer and classifying a questioned document into one of the printer classes. This paper proposes a new approach that trains three different methods on the dataset to choose the most accurate model for determining the printer's source. In the first, some pre-trained models are used as feature extractors, and a support vector machine (SVM) is used to classify the generated features. In the second, we construct a two-dimensional convolutional neural network (2D-CNN) to address the source printer identification (SPI) problem; instead of SoftMax, the 2D-CNN is employed as a feature extractor and an SVM as the classifier. This approach obtains 93.75% and 98.5% accuracy for the 2D-CNN and 2D-CNN-SVM, respectively; the SVM classifier enhanced the 2D-CNN accuracy by roughly 5% over the initial configuration. Finally, we fine-tuned 13 pre-trained CNN architectures on the dataset; among them, DarkNet-19 has the greatest accuracy at 99.2%. On the same dataset, the suggested approaches achieve better classification accuracy than other recently released algorithms.
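The 2D-CNN-SVM configuration follows a common pattern: use the network up to its penultimate layer as a fixed feature extractor and fit an SVM on those features instead of a SoftMax head. A hedged sketch with a stand-in backbone and placeholder labels:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

# Stand-in backbone for the trained 2D-CNN: everything before the classifier
# head is reused as a feature extractor (layers here are assumptions).
backbone = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
)
with torch.no_grad():
    feats = backbone(torch.randn(100, 1, 64, 64)).numpy()   # 100 scanned patches
labels = np.random.randint(0, 2, size=100)                  # placeholder printer IDs
svm = SVC(kernel="rbf").fit(feats, labels)                  # SVM replaces SoftMax
```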
19

Ibrahim, Alaa, Mohamed Waleed Fakhr, and Mohamed Farouk. "Secure CNN Computation Using Random Projection and Support Vector Regression." Journal of Advanced Research in Applied Sciences and Engineering Technology 65, no. 1 (2025): 209–25. https://doi.org/10.37934/araset.65.1.209225.

Abstract:
Convolutional Neural Networks (CNNs) are foundational in numerous machine learning applications, particularly in image processing, where they excel in identifying patterns within visual data. At the core of CNNs lies the 2D convolution operation, which is essential for extracting spatial features from images. However, when applied to sensitive data, such as in medical imaging or surveillance, preserving the privacy of both the input data and the convolutional filters is crucial. This paper introduces a novel approach to secure the 2D convolution operation in CNNs, leveraging random projection and machine learning techniques. By encrypting the input images and convolutional filters using random projection, the method ensures that the convolution feature maps are computed securely without exposing the underlying data. The proposed technique maintains the accuracy and efficiency of CNN while offering a privacy-preserving solution that is more computationally efficient than traditional methods such as Homomorphic Encryption (HE). Experimental results using synthetic Gaussian data demonstrate the feasibility and effectiveness of this approach in securely computing convolutions, making it a promising solution for protecting sensitive information in CNN-based applications. Additionally, the paper compares the proposed method with homomorphic encryption, showing that while both methods ensure data confidentiality, the random projection approach offers a more efficient solution with lower computational overhead.
20

Zhang, Yi, Fuzhou Liu, Jie Guan, and Yongli Zhu. "Time-frequency Fusion Method via Convolutional Neural Network for Partial Discharge Classification." Journal of Physics: Conference Series 2452, no. 1 (2023): 012014. http://dx.doi.org/10.1088/1742-6596/2452/1/012014.

Abstract:
To improve the accuracy of partial discharge (PD) pattern recognition by jointly using time-domain (TD) and frequency-domain (FD) information, a time-frequency (TF) fusion method via convolution neural network (CNN) is proposed in this paper. Firstly, PD signals are represented by PD waveform images and transformed into the envelope of the variational mode decomposition-based Hilbert marginal spectrum (VHMS). Secondly, a fusion network, FuNet, involving a two-dimensional CNN (2D-CNN), a 1D-CNN, and a multilayer perceptron (MLP), is established to join TF information. In FuNet, the 2D-CNN, which takes PD waveform images as input, and the 1D-CNN, which takes the envelope of the VHMS of the PD signal as input, are both improved by drawing on the complementary strengths of features from different convolution layers. The MLP then fuses the extracted TD and FD features and classifies the PD defects.
21

Lv, Shidong, Tao Long, Zhixian Hou, Liang Yan, and Zhenzhong Li. "3D CNN Hardware Circuit for Motion Recognition Based on FPGA." Journal of Physics: Conference Series 2363, no. 1 (2022): 012030. http://dx.doi.org/10.1088/1742-6596/2363/1/012030.

Abstract:
In recent years, the three-dimensional convolutional neural network (3D CNN) has been widely used in the fields of action recognition and video analysis. General-purpose processors struggle to achieve efficient, intensive computing, whereas deploying 3D CNNs on FPGAs offers low power consumption, high energy efficiency, and customizability, and has gradually become a popular choice for deploying convolutional neural networks in many embedded scenarios. This paper designs a small 3D convolutional neural network based on the classic 3D convolutional neural network C3D and uses general matrix multiplication (GEMM) to map the 3D convolution calculation to a 2D matrix multiplication calculation. The matrix is divided into blocks and transmitted to the FPGA through the AXI bus, and the multiplication of the block matrices is realized through a two-dimensional multiply-accumulate array. A System on Chip (SoC) architecture is built on the PYNQ platform, using an ARM Cortex-A9 as the process control core; the calculation of the entire matrix is completed under reasonable blocking and scheduling on the ARM. The IP core for matrix calculation is designed using High-Level Synthesis (HLS), and a corresponding parallel optimization scheme is given. Experiments verify that the prototype hardware acceleration circuit achieves low power consumption, high energy efficiency, and high-precision motion recognition while using fewer hardware resources.
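The GEMM mapping mentioned above is the classic im2col trick: gather all input patches into a matrix so the whole convolution becomes one matrix multiply, which hardware multiply-accumulate arrays handle well. A 2D NumPy sketch of the idea (the paper applies it to 3D convolution):

```python
import numpy as np

def conv2d_as_gemm(x, w):
    """Map a convolution to one matrix multiply (im2col + GEMM).
    x: (H, W) input, w: (k, k) kernel; 'valid' convolution, no padding."""
    k = w.shape[0]
    h, wd = x.shape[0] - k + 1, x.shape[1] - k + 1
    cols = np.stack([x[i:i + h, j:j + wd].ravel()            # values under tap (i, j)
                     for i in range(k) for j in range(k)])   # (k*k, h*wd) patch matrix
    return (w.ravel() @ cols).reshape(h, wd)                 # GEMM, then fold back

out = conv2d_as_gemm(np.random.rand(8, 8), np.random.rand(3, 3))   # (6, 6)
```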
22

Hu, Jinlong, Yuezhen Kuang, Bin Liao, Lijie Cao, Shoubin Dong, and Ping Li. "A Multichannel 2D Convolutional Neural Network Model for Task-Evoked fMRI Data Classification." Computational Intelligence and Neuroscience 2019 (December 31, 2019): 1–9. http://dx.doi.org/10.1155/2019/5065214.

Abstract:
Deep learning models have been successfully applied to the analysis of various functional MRI data. Convolutional neural networks (CNNs), a class of deep neural networks, have been found to excel at extracting local meaningful features based on their shared-weights architecture and space invariance characteristics. In this study, we propose M2D CNN, a novel multichannel 2D CNN model, to classify 3D fMRI data. The model uses sliced 2D fMRI data as input and integrates multichannel information learned by 2D CNN networks. We experimentally compared the proposed M2D CNN against several widely used models, including SVM, 1D CNN, 2D CNN, 3D CNN, and 3D separable CNN, with respect to their performance in classifying task-based fMRI data. We tested M2D CNN against six benchmark models in classifying a large number of time-series whole-brain imaging data based on a motor task in the Human Connectome Project (HCP). The results of our experiments demonstrate the following: (i) convolution operations in CNN models are advantageous for high-dimensional whole-brain imaging data classification, as all CNN models outperform SVM; (ii) 3D CNN models achieve higher accuracy than 2D CNN and 1D CNN models, but are computationally costly as an extra dimension is added to the input; (iii) the proposed M2D CNN model achieves the highest accuracy and alleviates data overfitting given its smaller number of parameters compared with 3D CNN.
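The multichannel-2D idea can be sketched by treating the slices of a 3D volume as the input channels of a 2D convolution, so the kernels stay 2D while every slice still contributes. Shapes below are illustrative, not the HCP data's:

```python
import torch
import torch.nn as nn

# A 3D volume (batch, 1, slices, H, W) is re-viewed so that slice = channel,
# letting ordinary 2D convolutions consume the whole volume at once.
volume = torch.randn(4, 1, 30, 64, 64)
as_channels = volume.squeeze(1)                  # (4, 30, 64, 64)
conv = nn.Conv2d(30, 16, kernel_size=3, padding=1)
feat = conv(as_channels)                         # 2D convs, multichannel input
```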
23

Wang, Hui, Yu Guo, and Zhengyou Wang. "Face-Based CNN on Triangular Mesh with Arbitrary Connectivity." Electronics 11, no. 15 (2022): 2466. http://dx.doi.org/10.3390/electronics11152466.

Abstract:
Applying convolutional neural networks (CNNs) to triangular meshes has always been a challenging task. Because of the complex structure of the meshes, most of the existing methods apply CNNs indirectly to them, and require complex preprocessing or transformation of the meshes. In this paper, we propose a novel face-based CNN, which can be directly applied to triangular meshes with arbitrary connectivity by defining face convolution and pooling. The proposed approach takes each face of the meshes as the basic element, similar to CNNs with pixels of 2D images. First, the intrinsic features of the faces are used as the input features of the network. Second, a sort convolution operation with adjustable convolution kernel sizes is constructed to extract the face features. Third, we design an approximately uniform pooling operation by learnable face collapse, which can be applied to the meshes with arbitrary connectivity, and we directly use its inverse operation as unpooling. Extensive experiments show that the proposed approach is comparable to, or can even outperform, state-of-the-art methods in mesh classification and mesh segmentation.
24

Fauzivy, Reggiswarashari, and Widya Sihwi Sari. "Speech emotion recognition using 2D-convolutional neural network." International Journal of Electrical and Computer Engineering (IJECE) 12, no. 6 (2022): 6594–601. https://doi.org/10.11591/ijece.v12i6.pp6594-6601.

Abstract:
This research proposes a speech emotion recognition model to predict human emotions using a convolutional neural network (CNN) by learning segmented audio of specific emotions. Speech emotion recognition utilizes features extracted from audio waves to learn speech emotion characteristics; one of them is the mel frequency cepstral coefficient (MFCC). The dataset plays a vital role in obtaining valuable results in model learning, hence this research leverages a combination of datasets. The model learns the combined dataset with audio segmentation and zero padding using a 2D-CNN. Audio segmentation and zero padding equalize the extracted audio features for learning the characteristics. The model achieves 83.69% accuracy in predicting seven emotions from the combined, segmented audio files: neutral, happy, sad, angry, fear, disgust, and surprise.
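A minimal version of this front end extracts MFCCs per segment and zero-pads them to a fixed frame count so every example has the same 2D shape. A hedged sketch using librosa, with all sizes assumed:

```python
import numpy as np
import librosa

# Stand-in for a loaded speech segment; in practice y, sr come from librosa.load.
sr = 22050
y = np.random.randn(3 * sr).astype(np.float32)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)    # (40, n_frames)

max_frames = 300                                      # assumed fixed width
padded = np.zeros((40, max_frames), dtype=mfcc.dtype)
frames = min(max_frames, mfcc.shape[1])
padded[:, :frames] = mfcc[:, :frames]                 # zero padding / truncation
x = padded[np.newaxis, np.newaxis]                    # (1, 1, 40, 300) 2D-CNN input
```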
25

Ma, Jie, and Lei Jiao. "Fault Diagnosis of Planetary Gear Based on FRWT and 2D-CNN." Mathematical Problems in Engineering 2022 (February 10, 2022): 1–14. http://dx.doi.org/10.1155/2022/4648653.

Abstract:
The fault signals of planetary gears are nonstationary and nonlinear, and it is difficult to extract weak fault features under strong background noise. This paper adopts a new filtering method, the fractional wavelet transform (FRWT), which improves noise reduction compared with the traditional fractional Fourier transform (FRFT), and combines it with a two-dimensional convolutional neural network (2D-CNN) for planetary gear fault diagnosis. Firstly, several intrinsic mode functions (IMFs) are obtained from the original vibration signal by AFSA-VMD decomposition, and the two components with the largest correlation coefficients are selected for signal reconstruction. Then, the reconstructed signal is filtered in the fractional wavelet domain. By analyzing the wavelet energy entropy of the filtered signal, a two-dimensional normalized energy characteristic matrix is constructed, and the two-dimensional features are input into the 2D convolution neural network model for training. The simulation results show that the training effect of this method is better than that of FRFT-2D-CNN, and verification on the test set shows that fault diagnosis of planetary gears can be realized accurately based on FRWT and 2D-CNN.
26

Gao, Hongmin, Shuo Lin, Yao Yang, Chenming Li, and Mingxiang Yang. "Convolution Neural Network Based on Two-Dimensional Spectrum for Hyperspectral Image Classification." Journal of Sensors 2018 (August 28, 2018): 1–13. http://dx.doi.org/10.1155/2018/8602103.

Abstract:
The inherent spectral characteristics of hyperspectral image (HSI) data need to be deeply mined. A convolution neural network (CNN) model based on a two-dimensional spectrum (2D spectrum) is proposed, building on the advantages of deep learning, to extract features and classify HSI. First of all, traditional data processing methods, which use a small-area pixel block or a one-dimensional spectral vector as the input unit, introduce much heterogeneous noise; the 2D-spectrum image method is proposed to solve this problem and make full use of spectral values and spatial information. Furthermore, a batch normalization (BN) algorithm is introduced to address internal covariate shift caused by changes in the distribution of input data and to expedite network training. Finally, a Softmax loss model is used to induce competition among the outputs and improve the performance of the CNN model. The HSI datasets in the experiments include Indian Pines, Salinas, Kennedy Space Center (KSC), and Botswana. Experimental results show that the overall accuracies of the 2D-spectrum CNN model reach 98.26%, 97.28%, 96.22%, and 93.64%, respectively, higher than the accuracies of the other traditional methods described in this paper. The proposed model achieves high target classification accuracy and efficiency.
27

Zhang, Erhu, Botao Xue, Fangzhou Cao, Jinghong Duan, Guangfeng Lin, and Yifei Lei. "Fusion of 2D CNN and 3D DenseNet for Dynamic Gesture Recognition." Electronics 8, no. 12 (2019): 1511. http://dx.doi.org/10.3390/electronics8121511.

Abstract:
Gesture recognition has been applied in many fields as it is a natural human–computer communication method. However, recognition of dynamic gesture is still a challenging topic because of complex disturbance information and motion information. In this paper, we propose an effective dynamic gesture recognition method by fusing the prediction results of a two-dimensional (2D) motion representation convolution neural network (CNN) model and three-dimensional (3D) dense convolutional network (DenseNet) model. Firstly, to obtain a compact and discriminative gesture motion representation, the motion history image (MHI) and pseudo-coloring technique were employed to integrate the spatiotemporal motion sequences into a frame image, before being fed into a 2D CNN model for gesture classification. Next, the proposed 3D DenseNet model was used to extract spatiotemporal features directly from Red, Green, Blue (RGB) gesture videos. Finally, the prediction results of the proposed 2D and 3D deep models were blended together to boost recognition performance. The experimental results on two public datasets demonstrate the effectiveness of our proposed method.
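The motion history image (MHI) used above compresses a frame sequence into a single image in which recently moving pixels appear brighter. A plain-NumPy sketch of one MHI update (thresholds and decay are illustrative assumptions):

```python
import numpy as np

def update_mhi(mhi, frame_diff, tau=255.0, decay=16.0, thresh=30):
    """One step of a motion history image: pixels that moved are set to
    full intensity tau, all others decay toward zero, so recent motion
    looks brighter (a sketch of the representation, not the paper's code)."""
    moving = np.abs(frame_diff) > thresh
    return np.where(moving, tau, np.maximum(mhi - decay, 0.0))

mhi = np.zeros((120, 160))
prev = np.random.rand(120, 160) * 255     # stand-ins for consecutive frames
cur = np.random.rand(120, 160) * 255
mhi = update_mhi(mhi, cur - prev)
```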
28

Liang, Ziyi. "The Happy and Sad Music Recognition Using Simple 2D CNN." Highlights in Science, Engineering and Technology 85 (March 13, 2024): 585–90. http://dx.doi.org/10.54097/nm34y669.

Abstract:
Artificial intelligence is currently developing rapidly, and deep learning is an important part of it; neural networks offer unique advantages for deep learning. Music, an important component of the human emotional world, has related applications in many fields. This paper focuses on classifying happy and sad music emotions. The article constructs a simple VGG-like neural network model using a 2D convolutional neural network with mel-spectrogram processing. Data files in WAV format are fed into the network for training and evaluated by observing the accuracy and loss function values. Ultimately, the model was able to perform some recognition but failed to achieve good recognition due to certain limitations. By constructing such a network, the simple VGG-like convolutional neural network model is explored for simple music emotion recognition, to test whether a robust model can be built.
29

Dong, Shidu, Zhi Liu, Huaqiu Wang, Yihao Zhang, and Shaoguo Cui. "A Separate 3D Convolutional Neural Network Architecture for 3D Medical Image Semantic Segmentation." Journal of Medical Imaging and Health Informatics 9, no. 8 (2019): 1705–16. http://dx.doi.org/10.1166/jmihi.2019.2797.

Abstract:
To exploit three-dimensional (3D) context information and improve 3D medical image semantic segmentation, we propose a separate 3D (S3D) convolution neural network (CNN) architecture. First, a two-dimensional (2D) CNN is used to extract the 2D features of each slice in the xy-plane of 3D medical images. Second, one-dimensional (1D) features reassembled from the 2D features in the z-axis are input into a 1D-CNN and are then classified feature-wise. Analysis shows that S3D-CNN has lower time complexity, fewer parameters and less memory space requirements than other 3D-CNNs with a similar structure. As an example, we extend the deep convolutional encoder–decoder architecture (SegNet) to S3D-SegNet for brain tumor image segmentation. We also propose a method based on priority queues and the dice loss function to address the class imbalance for medical image segmentation. The experimental results show the following: (1) S3D-SegNet extended from SegNet can improve brain tumor image segmentation. (2) The proposed imbalance accommodation method can increase the speed of training convergence and reduce the negative impact of the imbalance. (3) S3D-SegNet with the proposed imbalance accommodation method offers performance comparable to that of some state-of-the-art 3D-CNNs and experts in brain tumor image segmentation.
30

Mustafa Ahmed Othman Abo Mhara. "Complexity Neural Networks for Estimating Flood Process in Internet-of-Things Empowered Smart City." International Journal of Engineering and Management Research 10, no. 6 (2020): 118–29. http://dx.doi.org/10.31033/ijemr.10.6.16.

Abstract:
With the advancement of Internet of Things (IoT)-based water conservation computerization, hydrological data is increasingly enriched. Considering the ability of deep learning for complex feature extraction, we propose a flood process forecasting model based on a convolution neural network (CNN) with two-dimensional (2D) convolutional operations. First, we imported the spatial-temporal rainfall features of the Xixian basin. Subsequently, extensive experiments were carried out to determine the optimal hyperparameters of the proposed CNN flood forecasting model.
31

Wang, Xinyu, Le Sun, Chuhan Lu, and Baozhu Li. "A Novel Transformer Network with a CNN-Enhanced Cross-Attention Mechanism for Hyperspectral Image Classification." Remote Sensing 16, no. 7 (2024): 1180. http://dx.doi.org/10.3390/rs16071180.

Abstract:
Recently, with the remarkable advancements of deep learning in the field of image processing, convolutional neural networks (CNNs) have garnered widespread attention from researchers in the domain of hyperspectral image (HSI) classification. Moreover, due to the high performance demonstrated by the transformer architecture in classification tasks, there has been a proliferation of neural networks combining CNNs and transformers for HSI classification. However, the majority of the current methods focus on extracting spatial–spectral features from the HSI data of a single size for a pixel, overlooking the rich multi-scale feature information inherent to the data. To address this problem, we designed a novel transformer network with a CNN-enhanced cross-attention (TNCCA) mechanism for HSI classification. It is a dual-branch network that utilizes different scales of HSI input data to extract shallow spatial–spectral features using a multi-scale 3D and 2D hybrid convolutional neural network. After converting the feature maps into tokens, a series of 2D convolutions and dilated convolutions are employed to generate two sets of Q (queries), K (keys), and V (values) at different scales in a cross-attention module. This transformer with CNN-enhanced cross-attention explores multi-scale CNN-enhanced features and fuses them from both branches. Experimental evaluations conducted on three widely used hyperspectral image (HSI) datasets, under the constraint of limited sample size, demonstrate excellent classification performance of the proposed network.
32

Liu, Jing, Meiyi Wu, KangXin Li, and Yi Liu. "A Lightweight Network Based on Dynamic Split Pointwise Convolution Strategy for Hyperspectral Remote Sensing Images Classification." Remote Sensing 17, no. 5 (2025): 888. https://doi.org/10.3390/rs17050888.

Abstract:
For reducing the parameters and computational complexity of networks while improving the classification accuracy of hyperspectral remote sensing images (HRSIs), a dynamic split pointwise convolution (DSPC) strategy is presented, and a lightweight convolutional neural network (CNN), i.e., CSM-DSPCss-Ghost, is proposed based on DSPC. A channel switching module (CSM) and a dynamic split pointwise convolution Ghost (DSPC-Ghost) module are presented by combining the presented DSPC with channel shuffling and the Ghost strategy, respectively. CSM replaces the first expansion pointwise convolution in the MobileNetV2 bottleneck module to reduce the parameter number and relieve the increasing channel correlation caused by the original channel expansion pointwise convolution. DSPC-Ghost replaces the second pointwise convolution in the MobileNetV2 bottleneck module, which can further reduce the number of parameters based on DSPC and extract the depth spectral and spatial features of HRSIs successively. Finally, the CSM-DSPCss-Ghost bottleneck module is presented by introducing a squeeze excitation module and a spatial attention module after the CSM and the depthwise convolution, respectively. The presented CSM-DSPCss-Ghost network consists of seven successive CSM-DSPCss-Ghost bottleneck modules. Experiments on four measured HRSIs show that, compared with 2D CNN, 3D CNN, MobileNetV2, ShuffleNet, GhostNet, and Xception, CSM-DSPCss-Ghost can significantly improve classification accuracy and running speed while reducing the number of parameters.
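Channel shuffling, one ingredient above, interleaves channels across groups after grouped or split pointwise convolutions so information can mix between groups. The standard reshape-transpose-reshape sketch in PyTorch:

```python
import torch

def channel_shuffle(x, groups):
    """Interleave channels across groups: view (B, C, H, W) as
    (B, groups, C//groups, H, W), swap the two group dims, flatten back."""
    b, c, h, w = x.shape
    return (x.reshape(b, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(b, c, h, w))

y = channel_shuffle(torch.randn(1, 8, 4, 4), groups=2)
```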
33

Kang, Byungjin, Inho Park, Changmin Ok, and Sungho Kim. "ODPA-CNN: One Dimensional Parallel Atrous Convolution Neural Network for Band-Selective Hyperspectral Image Classification." Applied Sciences 12, no. 1 (2021): 174. http://dx.doi.org/10.3390/app12010174.

Abstract:
Recently, hyperspectral image (HSI) classification using deep learning has been actively studied using 2D and 3D convolution neural networks (CNN). However, these learn spatial information as well as spectral information. Such methods can increase classification accuracy but do not focus solely on the spectral information, which is a big advantage of HSI. In addition, the 1D-CNN, which learns only pure spectral information, has limitations because it uses only adjacent spectral information. In this paper, we propose a One-Dimensional Parallel Atrous Convolution Neural Network (ODPA-CNN) that learns, for HSI classification, not only adjacent spectral information but also spectral information at a certain distance. It extracts features in parallel to account for bands at varying distances. The proposed method excludes spatial information, such as the shape of an object, and performs HSI classification only with spectral information about the material of the object. Atrous convolution is not a convolution of adjacent spectral information, but a convolution between spectral information separated by a certain distance. We compare the proposed model with other models on various datasets, and also test it on data we collected ourselves. Experimental results show higher performance than some 3D-CNN models and other 1D-CNN methods. In addition, using datasets with randomized spatial information, the vulnerabilities of 3D-CNN are identified, and the proposed model is shown to be robust on datasets with little spatial information.
34

Ravi Kondal, Easala, and Soubhagya Sankar Barpanda. "Hyperspectral image classification using Hyb-3D convolution neural network spectral partitioning." Indonesian Journal of Electrical Engineering and Computer Science 29, no. 1 (2022): 295. http://dx.doi.org/10.11591/ijeecs.v29.i1.pp295-303.

Abstract:
Hyperspectral image classification (HSIC) for remote sensing imaging has achieved impressive results using artificial intelligence technology. In deep learning, convolution neural network (CNN) methods such as 2D-CNN and 3D-CNN are widely used to classify the spectral-spatial bands of hyperspectral images (HSI). The proposed Hybrid 3D-CNN (H3D-CNN) model framework extracts deeper features and predicts classification accuracy in supervised learning. The model narrows the gap between supervised and unsupervised learning and reduces the complexity and cost of previous models. The HSI classification analysis is carried out on the real-world Indian Pines and Salinas datasets, captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor, and achieves superior classification accuracy.
35

Han, Seungmin, Seokju Oh, and Jongpil Jeong. "Bearing Fault Diagnosis Based on Multiscale Convolutional Neural Network Using Data Augmentation." Journal of Sensors 2021 (February 20, 2021): 1–14. http://dx.doi.org/10.1155/2021/6699637.

Abstract:
Bearings are one of the most important parts of a rotating machine, and bearing failure can lead to mechanical failure, financial loss, and even personal injury. In recent years, various deep learning techniques have been used to diagnose bearing faults in rotating machines. However, deep learning suffers from data imbalance problems because it requires huge amounts of data; to solve this, we used data augmentation techniques. In addition, the convolutional neural network, one of the deep learning models, is capable of performing feature learning without prior knowledge. However, since conventional CNN-based fault diagnosis can only extract single-scale features, not only may useful information be lost, but domain shift problems may also occur. In this paper, we propose a Multiscale Convolutional Neural Network (MSCNN) to extract more powerful and differentiated features from raw signals. MSCNN can learn more powerful feature representations than a conventional CNN through multiscale convolution operations while reducing the number of parameters and training time. The proposed model produced better results than 2D-CNN and 1D-CNN models, validating its effectiveness.
36

Nourmohammadi, Farzaneh, Chetan Parmar, Elmar Wings, and Jaume Comellas. "Using Convolutional Neural Networks for Blocking Prediction in Elastic Optical Networks." Applied Sciences 14, no. 5 (2024): 2003. http://dx.doi.org/10.3390/app14052003.

Abstract:
This paper presents a study on connection-blocking prediction in Elastic Optical Networks (EONs) using Convolutional Neural Networks (CNNs). In EONs, connections are established and torn down dynamically to fulfill the instantaneous requirements of the users. The dynamic allocation of the connections may cause spectrum fragmentation and lead to network performance degradation as connection blocking increases. Predicting potential blocking situations can be helpful during EON operations. For example, this prediction could be used in real networks to trigger proper spectrum defragmentation mechanisms at suitable moments, thereby enhancing network performance. Extensive simulations over the well-known NSFNET (National Science Foundation Network) backbone network topology were run by generating realistic traffic patterns. The obtained results are later used to train the developed machine learning models, which allow the prediction of connection-blocking events. Resource use was continuously monitored and recorded during the process. Two different Convolutional Neural Network models, a 1D CNN (One-Dimensional Convolutional Neural Network) and 2D CNN (Two-Dimensional Convolutional Neural Network), are proposed as the predicting methods, and their behavior is compared to other conventional models based on an SVM (Support Vector Machine) and KNN (K Nearest Neighbors). The results obtained show that the proposed 2D CNN predicts blocking with the best accuracy (92.17%), followed by the SVM, the proposed 1D CNN, and KNN. Results suggest that 2D CNN can be helpful in blocking prediction and might contribute to increasing the efficiency of future EON networks.
APA, Harvard, Vancouver, ISO, and other styles
37

Qin, Yufeng, and Xianjun Shi. "Fault Diagnosis Method for Rolling Bearings Based on Two-Channel CNN under Unbalanced Datasets." Applied Sciences 12, no. 17 (2022): 8474. http://dx.doi.org/10.3390/app12178474.

Full text
Abstract:
Rolling bearings are critical components in industrial systems, and their timely and accurate fault diagnosis is closely tied to reliability and safety. Since the equipment usually operates in normal conditions, fault samples are scarce, and the resulting unbalanced data distribution leads to poor fault diagnosis ability. To address these problems, a two-channel convolutional neural network (TC-CNN) model is proposed. Firstly, the frequency spectrum of the vibration signal is extracted using the Fast Fourier Transform (FFT) and used as the input to a one-dimensional convolutional neural network (1D-CNN). Secondly, the time-frequency image of the vibration signal is extracted using the generalized S-transform (GST) and used as the input to a two-dimensional convolutional neural network (2D-CNN). Then, feature extraction in the convolution and pooling layers is performed in these two CNN channels, respectively. The feature vectors obtained from the two CNN models are stitched together in the fusion layer, and the fault classes are identified using an SVM classifier. Finally, using the rolling bearing experimental dataset of Case Western Reserve University (CWRU), the fault diagnosis performance of the proposed TC-CNN model under various data imbalance conditions is verified. In comparison with other related works, the experimental results demonstrate the superior fault diagnosis results and robustness of the method.
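A rough sketch of the two-channel layout: the FFT magnitude spectrum feeds a 1D CNN while a time-frequency image feeds a 2D CNN, and the flattened features are concatenated for a downstream SVM. Here a plain spectrogram stands in for the generalized S-transform, and all layer sizes are assumptions.

    import torch
    import torch.nn as nn

    spec_net = nn.Sequential(                  # 1D channel for the FFT spectrum
        nn.Conv1d(1, 16, 9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
        nn.Conv1d(16, 32, 9, padding=4), nn.ReLU(),
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    )
    tf_net = nn.Sequential(                    # 2D channel for the time-frequency image
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

    signal = torch.randn(8, 1, 1024)           # batch of vibration segments
    spectrum = torch.fft.rfft(signal, dim=-1).abs()[..., :512]
    tf_image = torch.stft(signal.squeeze(1), n_fft=64,
                          return_complex=True).abs().unsqueeze(1)

    fused = torch.cat([spec_net(spectrum), tf_net(tf_image)], dim=1)
    print(fused.shape)                         # (8, 64); fed to e.g. sklearn.svm.SVC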
APA, Harvard, Vancouver, ISO, and other styles
38

Chen, Linlin, Zhihui Wei, and Yang Xu. "A Lightweight Spectral–Spatial Feature Extraction and Fusion Network for Hyperspectral Image Classification." Remote Sensing 12, no. 9 (2020): 1395. http://dx.doi.org/10.3390/rs12091395.

Full text
Abstract:
Hyperspectral image (HSI) classification accuracy has been greatly improved by employing deep learning. The current research mainly focuses on how to build a deep network to improve the accuracy. However, these networks tend to be more complex and have more parameters, which makes the model difficult to train and easy to overfit. Therefore, we present a lightweight deep convolutional neural network (CNN) model called S2FEF-CNN. In this model, three S2FEF blocks are used for joint spectral-spatial feature extraction. Each S2FEF block uses a 1D spectral convolution to extract spectral features and a 2D spatial convolution to extract spatial features, and then fuses the spectral and spatial features by multiplication. Instead of using a fully connected layer, two pooling layers follow the three blocks for dimension reduction, which further reduces the training parameters. We compared our method with some state-of-the-art deep-network-based HSI classification methods on three commonly used hyperspectral datasets. The results show that our network can achieve comparable classification accuracy with significantly reduced parameters compared to the above deep networks, which reflects its potential advantages in HSI classification.
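A hedged sketch of one S2FEF-style block: a 1D spectral convolution and a 2D spatial convolution over the same hyperspectral patch, fused by elementwise multiplication. Below they are written as (k, 1, 1) and (1, 3, 3) 3D kernels; the activations and channel counts are assumptions.

    import torch
    import torch.nn as nn

    class S2FEFBlock(nn.Module):
        def __init__(self, ch):
            super().__init__()
            self.spectral = nn.Conv3d(ch, ch, (7, 1, 1), padding=(3, 0, 0))  # 1D, along bands
            self.spatial = nn.Conv3d(ch, ch, (1, 3, 3), padding=(0, 1, 1))   # 2D, per band

        def forward(self, x):                  # x: (B, ch, bands, H, W)
            return torch.sigmoid(self.spectral(x)) * torch.relu(self.spatial(x))

    x = torch.randn(2, 4, 100, 9, 9)           # 100-band patches, 4 feature maps
    print(S2FEFBlock(4)(x).shape)              # torch.Size([2, 4, 100, 9, 9])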
APA, Harvard, Vancouver, ISO, and other styles
39

Nguyen, Thien An, and Jaejin Lee. "A Nonlinear Convolutional Neural Network-Based Equalizer for Holographic Data Storage Systems." Applied Sciences 13, no. 24 (2023): 13029. http://dx.doi.org/10.3390/app132413029.

Full text
Abstract:
Central data systems require mass storage systems for big data from many fields and devices. Several technologies have been proposed to meet this demand. Holographic data storage (HDS) is at the forefront of data storage innovation and exploits the extraordinary characteristics of light to encode and retrieve two-dimensional (2D) data from holographic volume media. Nevertheless, a formidable challenge exists in the form of 2D interference that is a by-product of hologram dispersion during data retrieval and is a substantial barrier to the reliability and efficiency of HDS systems. To solve these problems, an equalizer and target are applied to HDS systems. However, in previous studies, the equalizer acted only as a linear convolution filter for the received signal. In this study, we propose a nonlinear equalizer using a convolutional neural network (CNN) for HDS systems. Using a CNN-based equalizer, the received signal can be nonlinearly converted into the desired signal with higher accuracy. In the experiments, our proposed model achieved a gain of approximately 2.5 dB in contrast to conventional models.
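The equalizer can be pictured as a small fully convolutional network that maps a received 2D data page to an estimate of the stored page. A minimal sketch, with assumed depth, widths, and page size:

    import torch
    import torch.nn as nn

    equalizer = nn.Sequential(                 # nonlinear mapping, page size preserved
        nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 1, 3, padding=1),
    )
    received_page = torch.randn(1, 1, 64, 64)  # page corrupted by 2D interference
    print(equalizer(received_page).shape)      # torch.Size([1, 1, 64, 64])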
APA, Harvard, Vancouver, ISO, and other styles
40

Tabian, I., H. Fu, and Zahra Sharif Khodaei. "Impact Detection on Composite Plates Based on Convolution Neural Network." Key Engineering Materials 827 (December 2019): 476–81. http://dx.doi.org/10.4028/www.scientific.net/kem.827.476.

Full text
Abstract:
This paper presents a novel Convolutional Neural Network (CNN) based metamodel for impact detection and characterization in a Structural Health Monitoring (SHM) application. The signals recorded by PZT sensors during various impact events on a composite plate are used as inputs to the CNN to detect and locate impact events. The input of the metamodel consists of 2D images constructed from the signals recorded by a network of sensors. The metamodel was then trained and tested on a composite plate. The results show that the CNN-based metamodel is capable of detecting impacts with more than 98% accuracy. In addition, the network was capable of detecting impacts in other regions of the panel, on which it was not trained but which had a similar geometric configuration. The accuracy in this case was also above 98%, showing the scalability of this method to large complex structures with repeating zones, such as composite stiffened panels.
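One plausible construction of the 2D input, consistent with the abstract but not necessarily the authors' exact encoding, is stacking the time series from the sensor network as image rows. A sketch with an assumed sensor count, window length, and number of candidate impact zones:

    import torch
    import torch.nn as nn

    n_sensors, n_samples = 8, 512
    signals = torch.randn(4, n_sensors, n_samples)  # 4 impact events
    images = signals.unsqueeze(1)                   # (4, 1, 8, 512), one image per event

    cnn = nn.Sequential(
        nn.Conv2d(1, 16, (3, 9), padding=(1, 4)), nn.ReLU(), nn.MaxPool2d((1, 4)),
        nn.Conv2d(16, 32, (3, 9), padding=(1, 4)), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(32, 9),             # e.g. 9 candidate impact zones
    )
    print(cnn(images).shape)                        # torch.Size([4, 9])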
APA, Harvard, Vancouver, ISO, and other styles
41

Sumida, Iori, Taiki Magome, Hideki Kitamori, et al. "Deep convolutional neural network for reduction of contrast-enhanced region on CT images." Journal of Radiation Research 60, no. 5 (2019): 586–94. http://dx.doi.org/10.1093/jrr/rrz030.

Full text
Abstract:
This study aims to produce non-contrast computed tomography (CT) images using a deep convolutional neural network (CNN) for imaging. Twenty-nine patients were selected. CT images were acquired without and with a contrast enhancement medium. The transverse images were divided into 64 × 64 pixels. This resulted in 14 723 patches in total for both non-contrast and contrast-enhanced CT image pairs. The proposed CNN model comprises five two-dimensional (2D) convolution layers with one shortcut path. For comparison, the U-net model, which comprises five 2D convolution layers interleaved with pooling and unpooling layers, was used. Training was performed in 24 patients and, for testing of trained models, another 5 patients were used. For quantitative evaluation, 50 regions of interest (ROIs) were selected on the reference contrast-enhanced image of the test data, and the mean pixel value of the ROIs was calculated. The mean pixel values of the ROIs at the same location on the reference non-contrast image and the predicted non-contrast image were calculated and those values were compared. Regarding the quantitative analysis, the difference in mean pixel value between the reference contrast-enhanced image and the predicted non-contrast image was significant (P < 0.0001) for both models. Significant differences in pixels (P < 0.0001) were found using the U-net model; in contrast, there was no significant difference using the proposed CNN model when comparing the reference non-contrast images and the predicted non-contrast images. Using the proposed CNN model, the contrast-enhanced region was satisfactorily reduced.
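The described topology reads directly as code: five 2D convolution layers with a single shortcut path, mapping a 64 × 64 contrast-enhanced patch to a non-contrast estimate. The filter counts below are assumptions.

    import torch
    import torch.nn as nn

    class ShortcutCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.c1 = nn.Conv2d(1, 32, 3, padding=1)
            self.mid = nn.Sequential(               # three inner convolution layers
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            )
            self.c5 = nn.Conv2d(32, 1, 3, padding=1)

        def forward(self, x):                       # x: (B, 1, 64, 64) CT patches
            h = torch.relu(self.c1(x))
            return self.c5(h + self.mid(h))         # the single shortcut path

    print(ShortcutCNN()(torch.randn(2, 1, 64, 64)).shape)  # torch.Size([2, 1, 64, 64])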
APA, Harvard, Vancouver, ISO, and other styles
42

Lan, Qiang, Zelong Wang, Mei Wen, Chunyuan Zhang, and Yijie Wang. "High Performance Implementation of 3D Convolutional Neural Networks on a GPU." Computational Intelligence and Neuroscience 2017 (2017): 1–8. http://dx.doi.org/10.1155/2017/8348671.

Full text
Abstract:
Convolutional neural networks have proven to be highly successful in applications such as image classification, object tracking, and many other tasks based on 2D inputs. Recently, researchers have started to apply convolutional neural networks to video classification, which constitutes a 3D input and requires far larger amounts of memory and much more computation. FFT based methods can reduce the amount of computation, but this generally comes at the cost of an increased memory requirement. On the other hand, the Winograd Minimal Filtering Algorithm (WMFA) can reduce the number of operations required and thus can speed up the computation, without increasing the required memory. This strategy was shown to be successful for 2D neural networks. We implement the algorithm for 3D convolutional neural networks and apply it to a popular 3D convolutional neural network which is used to classify videos and compare it to cuDNN. For our highly optimized implementation of the algorithm, we observe a twofold speedup for most of the 3D convolution layers of our test network compared to the cuDNN version.
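The saving WMFA offers is easiest to see in the 1D case F(2,3), which the 2D and 3D variants nest: two outputs of a 3-tap filter from four multiplications instead of six. A small NumPy check with the standard F(2,3) transform matrices:

    import numpy as np

    BT = np.array([[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]], float)
    G = np.array([[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]], float)
    AT = np.array([[1, 1, 1, 0], [0, 1, -1, -1]], float)

    d = np.array([1.0, 2.0, 3.0, 4.0])         # input tile
    g = np.array([0.5, -1.0, 2.0])             # 3-tap filter
    y_winograd = AT @ ((G @ g) * (BT @ d))     # only 4 elementwise multiplications
    y_direct = np.array([d[0:3] @ g, d[1:4] @ g])
    print(np.allclose(y_winograd, y_direct))   # True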
APA, Harvard, Vancouver, ISO, and other styles
43

Bashi, Omar I. Dallal, Husamuldeen K. Hameed, Yasir Mahmood Al Kubaiaisi, and Ahmad H. Sabry. "Development of object detection from point clouds of a 3D dataset by Point-Pillars neural network." Eastern-European Journal of Enterprise Technologies 2, no. 9 (122) (2023): 26–33. http://dx.doi.org/10.15587/1729-4061.2023.275155.

Full text
Abstract:
Deep learning algorithms are able to automatically handle point clouds over a broad range of 3D imaging implementations. They have applications in advanced driver assistance systems, perception and robot navigation, scene classification, surveillance, stereo vision, and depth estimation. According to prior studies, the detection of objects from point clouds of a 3D dataset with acceptable accuracy is still a challenging task. The Point-Pillars technique is used in this work to detect a 3D object employing 2D convolutional neural network (CNN) layers. The Point-Pillars architecture includes a learnable encoder that uses PointNets to learn a representation of point clouds structured as vertical columns (pillars), and it operates a 2D CNN to decode the predictions and generate 3D bounding boxes for various object labels like pedestrians, trucks, and cars. This study aims to detect objects from point clouds of a 3D dataset using the Point-Pillars neural network architecture, which makes it possible to detect 3D objects by means of 2D CNN layers. The method includes producing a sparse pseudo-image from a point cloud using a feature encoder, using a 2D convolution backbone to process the pseudo-image into a high-level representation, and using detection heads to regress and detect 3D bounding boxes. This work utilizes ground-truth data augmentation as well as additional global augmentation methods to add further diversity to the training data. The obtained results demonstrated that the average orientation similarity (AOS) and average precision (AP) were 0.60989 and 0.61157 for trucks, and 0.74377 and 0.75569 for cars.
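The pillar-to-pseudo-image step can be sketched as scattering per-pillar feature vectors back to their (x, y) grid cells, after which an ordinary 2D CNN backbone takes over. The grid size, channel count, and pillar count below are illustrative.

    import torch
    import torch.nn as nn

    C, H, W = 64, 128, 128                     # feature channels and BEV grid size
    n_pillars = 500
    features = torch.randn(n_pillars, C)       # per-pillar output of the encoder
    xs = torch.randint(0, W, (n_pillars,))     # each pillar's grid coordinates
    ys = torch.randint(0, H, (n_pillars,))

    pseudo_image = torch.zeros(1, C, H, W)
    pseudo_image[0, :, ys, xs] = features.t()  # scatter pillars into the 2D grid

    backbone = nn.Sequential(                  # plain 2D CNN over the pseudo-image
        nn.Conv2d(C, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
    )
    print(backbone(pseudo_image).shape)        # torch.Size([1, 128, 32, 32])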
APA, Harvard, Vancouver, ISO, and other styles
44

Pan, Xinyu, Chen Zang, Wanxuan Lu, Guiyuan Jiang, and Qian Sun. "FSFF-Net: A Frequency-Domain Feature and Spatial-Domain Feature Fusion Network for Hyperspectral Image Classification." Electronics 14, no. 11 (2025): 2234. https://doi.org/10.3390/electronics14112234.

Full text
Abstract:
In hyperspectral image (HSI) classification, each pixel is assigned to a specific land cover type, which is critical for applications in environmental monitoring, agriculture, and urban planning. Convolutional neural networks (CNNs) and Transformers have become widely adopted due to their exceptional feature extraction capabilities. However, the local receptive field of CNNs limits their ability to capture global context, while Transformers, though effective in modeling long-range dependencies, introduce computational overhead. To address these challenges, we propose a frequency-domain and spatial-domain feature fusion network (FSFF-Net) for HSI classification, which reduces computational complexity while capturing global features. The FSFF-Net consists of a frequency-domain transformer (FDformer) and a depthwise-convolution-based parallel encoder structure. The FDformer replaces the self-attention mechanism in traditional Vision Transformers with a three-step process: a two-dimensional discrete Fourier transform (2D-DFT), an adaptive filter, and a two-dimensional inverse Fourier transform (2D-IDFT). The 2D-DFT and 2D-IDFT convert images between the spatial and frequency domains. The adaptive filter adaptively retains important frequency components, removes redundant ones, and assigns weights to different frequency components. This module not only reduces computational overhead by decreasing the number of parameters but also mitigates the limitations of CNNs by capturing complementary frequency-domain features, which enhance the spatial-domain features for improved classification. In parallel, depthwise convolution is employed to capture spatial-domain features. The network then integrates the frequency-domain features from the FDformer and the spatial-domain features from the depthwise convolution through a feature fusion module. The experimental results demonstrate that our method is efficient and robust for HSI classification, achieving overall accuracies of 98.03%, 99.57%, 97.05%, and 98.40% on the Indian Pines, Pavia University, Salinas, and Houston 2013 datasets, respectively.
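The three-step module (2D-DFT, adaptive filter, 2D-IDFT) can be sketched as a learnable elementwise complex filter between a forward and an inverse FFT, in the spirit of Fourier-domain token mixing. The real-FFT layout and initialization below are assumptions.

    import torch
    import torch.nn as nn

    class FourierFilter(nn.Module):
        def __init__(self, h, w, dim):
            super().__init__()
            # one complex weight per frequency component and channel
            self.weight = nn.Parameter(torch.randn(dim, h, w // 2 + 1, 2) * 0.02)

        def forward(self, x):                  # x: (B, dim, h, w)
            f = torch.fft.rfft2(x, norm="ortho")            # 2D DFT
            f = f * torch.view_as_complex(self.weight)      # adaptive filter
            return torch.fft.irfft2(f, s=x.shape[-2:], norm="ortho")  # 2D IDFT

    x = torch.randn(2, 32, 15, 15)             # spatial feature maps of a patch
    print(FourierFilter(15, 15, 32)(x).shape)  # torch.Size([2, 32, 15, 15])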
APA, Harvard, Vancouver, ISO, and other styles
45

Chang, Yang-Lang, Tan-Hsu Tan, Wei-Hong Lee, et al. "Consolidated Convolutional Neural Network for Hyperspectral Image Classification." Remote Sensing 14, no. 7 (2022): 1571. http://dx.doi.org/10.3390/rs14071571.

Full text
Abstract:
The performance of hyperspectral image (HSI) classification is highly dependent on spatial and spectral information, and is heavily affected by factors such as data redundancy and insufficient spatial resolution. To overcome these challenges, many convolutional neural network (CNN) methods, especially 2D-CNN-based ones, have been proposed for HSI classification. However, these methods produce inferior results compared to 3D-CNN-based methods. On the other hand, the high computational complexity of 3D-CNN-based methods is still a major concern that needs to be addressed. Therefore, this study introduces a consolidated convolutional neural network (C-CNN) to overcome the aforementioned issues. The proposed C-CNN comprises a three-dimensional CNN (3D-CNN) joined with a two-dimensional CNN (2D-CNN). The 3D-CNN is used to represent spatial-spectral features from the spectral bands, and the 2D-CNN is used to learn abstract spatial features. Principal component analysis (PCA) is first applied to the original HSIs before they are fed to the network, to reduce the redundancy of the spectral bands. Moreover, image augmentation techniques including rotation and flipping have been used to increase the number of training samples and reduce the impact of overfitting. The proposed C-CNN trained using the augmented images is named C-CNN-Aug. Additionally, both Dropout and L2 regularization techniques have been used to further reduce the model complexity and prevent overfitting. The experimental results show that the proposed model can provide the optimal trade-off between accuracy and computational time compared to other related methods, using the Indian Pines, Pavia University, and Salinas Scene hyperspectral benchmark datasets.
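The PCA preprocessing step reads, in code, as flattening the cube to (pixels, bands), reducing the spectral dimension, and reshaping back. The sizes below mimic Indian Pines; the retained band count is an assumption.

    import numpy as np
    from sklearn.decomposition import PCA

    H, W, bands, kept = 145, 145, 200, 30      # Indian Pines-like dimensions
    cube = np.random.rand(H, W, bands).astype(np.float32)

    flat = cube.reshape(-1, bands)             # (pixels, bands)
    reduced = PCA(n_components=kept).fit_transform(flat)
    cube_pca = reduced.reshape(H, W, kept)     # input to the 3D-CNN stage
    print(cube_pca.shape)                      # (145, 145, 30)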
APA, Harvard, Vancouver, ISO, and other styles
46

Oh, Seokjin, Jiyong An, and Kyeong-Sik Min. "Area-Efficient Mapping of Convolutional Neural Networks to Memristor Crossbars Using Sub-Image Partitioning." Micromachines 14, no. 2 (2023): 309. http://dx.doi.org/10.3390/mi14020309.

Full text
Abstract:
Memristor crossbars can be very useful for realizing edge-intelligence hardware, because the neural networks implemented by memristor crossbars can save significantly more computing energy and layout area than the conventional CMOS (complementary metal–oxide–semiconductor) digital circuits. One of the important operations used in neural networks is convolution. For performing the convolution by memristor crossbars, the full image should be partitioned into several sub-images. By doing so, each sub-image convolution can be mapped to small-size unit crossbars, of which the size should be defined as 128 × 128 or 256 × 256 to avoid the line resistance problem caused by large-size crossbars. In this paper, various convolution schemes with 3D, 2D, and 1D kernels are analyzed and compared in terms of neural network performance and overlapping overhead. The neural network simulations indicate that the 2D + 1D kernels can perform the sub-image convolution using a much smaller number of unit crossbars with less rate loss than the 3D kernels. When the CIFAR-10 dataset is tested, the mapping of sub-image convolution of 2D + 1D kernels to crossbars shows that the number of unit crossbars can be reduced by almost 90% and 95%, respectively, for 128 × 128 and 256 × 256 crossbars, compared with the 3D kernels. Meanwhile, the rate loss of 2D + 1D kernels can be less than 2%. To further improve the neural network's performance, the 2D + 1D kernels can be combined with 3D kernels in one neural network. When the normalized ratio of 2D + 1D layers is around 0.5, the neural network's performance indicates very little rate loss compared to when the normalized ratio of 2D + 1D layers is zero. However, the number of unit crossbars for the normalized ratio = 0.5 can be reduced by half compared with that for the normalized ratio = 0.
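Under one plausible reading of the 2D + 1D factorization — a k × k spatial kernel followed by a length-C channel kernel in place of a joint k × k × C kernel — the weight (and hence crossbar cell) saving is easy to tabulate:

    # hypothetical weight counts for one output feature map
    def weights_3d(k, c):
        return k * k * c                       # joint spatial-channel kernel

    def weights_2d_plus_1d(k, c):
        return k * k + c                       # factorized 2D + 1D form

    for k, c in [(3, 64), (3, 128)]:
        print(k, c, weights_3d(k, c), weights_2d_plus_1d(k, c))
    # 3 64 576 73    -> roughly 8x fewer weights to map onto crossbar cells
    # 3 128 1152 137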
APA, Harvard, Vancouver, ISO, and other styles
47

Ramalingappa, Likhitha, and Aswathnarayan Manjunatha. "Power quality event classification using complex wavelets phasor models and customized convolution neural network." International Journal of Electrical and Computer Engineering (IJECE) 12, no. 1 (2022): 22. http://dx.doi.org/10.11591/ijece.v12i1.pp22-31.

Full text
Abstract:
The origins and triggers of power quality (PQ) events must be identified in advance in order to take preventive steps to enhance power quality. To this end, it is important to identify, localize, and classify PQ events to determine the causes and origins of PQ disturbances. In this paper, a novel algorithm is presented to classify voltage variations into six different PQ events using space phasor model (SPM) diagrams, dual-tree complex wavelet transform (DTCWT) sub-bands, and a convolution neural network (CNN) model. The input voltage data is converted into SPM data, the SPM data is transformed using a 2D DTCWT into low-pass and high-pass sub-bands, and these are simultaneously processed by the 2D CNN model to classify PQ events. In the proposed method, a CNN model based on GoogLeNet is trained to classify PQ events with the default configuration, as in the deep network designer in the MATLAB environment. The proposed algorithm achieves higher accuracy with reduced training time compared with reported PQ event classification methods.
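The space phasor model behind the 2D diagrams follows directly from the defining formula v = (2/3)(v_a + a*v_b + a^2*v_c) with a = exp(j*2*pi/3); for a balanced sinusoidal input the trajectory is the unit circle. A NumPy sketch:

    import numpy as np

    t = np.linspace(0, 0.1, 1000)              # 0.1 s of three-phase voltage
    f = 50.0
    va = np.sin(2 * np.pi * f * t)
    vb = np.sin(2 * np.pi * f * t - 2 * np.pi / 3)
    vc = np.sin(2 * np.pi * f * t + 2 * np.pi / 3)

    a = np.exp(2j * np.pi / 3)                 # 120-degree rotation operator
    spm = 2.0 / 3.0 * (va + a * vb + a * a * vc)
    diagram = np.stack([spm.real, spm.imag])   # 2D trajectory fed to DTCWT + CNN
    print(diagram.shape, np.allclose(np.abs(spm), 1.0))  # (2, 1000) True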
APA, Harvard, Vancouver, ISO, and other styles
48

Ran, Tonghuan, Guangfeng Shi, Zhuo Zhang, Yuhao Pan, and Haiyang Zhu. "Hyperspectral Image Classification Method Based on Morphological Features and Hybrid Convolutional Neural Networks." Applied Sciences 14, no. 22 (2024): 10577. http://dx.doi.org/10.3390/app142210577.

Full text
Abstract:
The exploitation of the spatial and spectral characteristics of hyperspectral remote sensing images (HRSIs) is crucial for the high-precision classification of earth observation targets. Convolutional neural networks (CNNs) have good classification performance and are widely used. Herein, a morphological processing (MP)-based HRSI classification method and a 3D–2D CNN are proposed to improve HRSI classification accuracy. Principal component analysis is performed to reduce the dimensionality of the HRSI cube, and MP is implemented to extract the spectral–spatial features of the low-dimensional HRSI cube. The extracted features are concatenated with the low-dimensional HRSI cube, and the designed 3D–2D CNN framework completes the classification task. Residual connections and an attention mechanism are added to the CNN structure to prevent vanishing gradients, and the scale of the control parameters of the model structure is optimized to guarantee the model's feature extraction ability. The CNN structure uses multiscale convolution, involving depthwise separable convolution, which effectively reduces the amount of parameter computation. Two classic datasets (Indian Pines and Pavia University) and a self-made dataset (My Dataset) are used to compare the performance of this method with existing classification techniques. The proposed method effectively improves classification accuracy while keeping the classification time short.
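Morphological processing of this kind commonly takes the form of an opening/closing profile over a principal-component band, stacked as extra CNN input channels. A SciPy sketch, assuming that form and the structuring-element sizes:

    import numpy as np
    from scipy import ndimage

    band = np.random.rand(145, 145)            # one PCA band of the HRSI cube
    profile = [band]
    for size in (3, 5, 7):                     # growing structuring elements
        profile.append(ndimage.grey_opening(band, size=(size, size)))
        profile.append(ndimage.grey_closing(band, size=(size, size)))
    mp_features = np.stack(profile)            # (7, 145, 145) input channels
    print(mp_features.shape)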
APA, Harvard, Vancouver, ISO, and other styles