Journal articles on the topic '3D-Convolutional Neural Network (3D-CNN)'

Consult the top 50 journal articles for your research on the topic '3D-Convolutional Neural Network (3D-CNN).'

1

Chang, Yang-Lang, Tan-Hsu Tan, Wei-Hong Lee, et al. "Consolidated Convolutional Neural Network for Hyperspectral Image Classification." Remote Sensing 14, no. 7 (2022): 1571. http://dx.doi.org/10.3390/rs14071571.

Abstract:
The performance of hyperspectral image (HSI) classification is highly dependent on spatial and spectral information, and is heavily affected by factors such as data redundancy and insufficient spatial resolution. To overcome these challenges, many convolutional neural networks (CNN) especially 2D-CNN-based methods have been proposed for HSI classification. However, these methods produced insufficient results compared to 3D-CNN-based methods. On the other hand, the high computational complexity of the 3D-CNN-based methods is still a major concern that needs to be addressed. Therefore, this study introduces a consolidated convolutional neural network (C-CNN) to overcome the aforementioned issues. The proposed C-CNN is comprised of a three-dimension CNN (3D-CNN) joined with a two-dimension CNN (2D-CNN). The 3D-CNN is used to represent spatial–spectral features from the spectral bands, and the 2D-CNN is used to learn abstract spatial features. Principal component analysis (PCA) was firstly applied to the original HSIs before they are fed to the network to reduce the spectral bands redundancy. Moreover, image augmentation techniques including rotation and flipping have been used to increase the number of training samples and reduce the impact of overfitting. The proposed C-CNN that was trained using the augmented images is named C-CNN-Aug. Additionally, both Dropout and L2 regularization techniques have been used to further reduce the model complexity and prevent overfitting. The experimental results proved that the proposed model can provide the optimal trade-off between accuracy and computational time compared to other related methods using the Indian Pines, Pavia University, and Salinas Scene hyperspectral benchmark datasets.
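The preprocessing described in this abstract (PCA over the spectral bands, then rotation and flipping for augmentation) can be sketched in NumPy. This is only an illustration under stated assumptions: the function names `pca_reduce_bands` and `augment`, the cube size, and the number of retained components are hypothetical and are not taken from the paper.

```python
import numpy as np

def pca_reduce_bands(cube, n_components):
    """Project an (H, W, B) hyperspectral cube onto its top principal components."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)  # one row per pixel (copy)
    pixels -= pixels.mean(axis=0)                    # center each spectral band
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    reduced = pixels @ vt[:n_components].T
    return reduced.reshape(h, w, n_components)

def augment(patch):
    """Return four rotated copies and two flipped copies of an (H, W, C) patch."""
    rotations = [np.rot90(patch, k, axes=(0, 1)) for k in range(4)]
    flips = [np.flip(patch, axis=0), np.flip(patch, axis=1)]
    return rotations + flips

# Synthetic stand-in for a hyperspectral image (sizes are assumptions).
rng = np.random.default_rng(0)
cube = rng.normal(size=(8, 8, 30))
reduced = pca_reduce_bands(cube, 5)
views = augment(reduced)
```

The reduced cube keeps the spatial layout and shrinks only the spectral axis, which is the kind of redundancy reduction the abstract describes applying before the network sees the data.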
2

Lv, Shidong, Tao Long, Zhixian Hou, Liang Yan, and Zhenzhong Li. "3D CNN Hardware Circuit for Motion Recognition Based on FPGA." Journal of Physics: Conference Series 2363, no. 1 (2022): 012030. http://dx.doi.org/10.1088/1742-6596/2363/1/012030.

Abstract:
In recent years, three-dimensional convolutional neural network (3D CNN) has been widely used in the fields of action recognition and video analysis. The general purpose processors are difficult to achieve efficient and intensive computing, and the deployment of 3D CNN based on FPGA has the advantages of low power consumption, high energy efficiency, and customizability, and has gradually become a hot choice for deploying convolutional neural networks in many embedded scenarios. This paper designs a small 3D convolutional neural network based on the classic 3D convolutional neural network C3D, and uses the general matrix multiplication (GEMM) to map the 3D convolution calculation to the 2D matrix multiplication calculation. The matrix is divided into blocks and transmitted to the FPGA through the AXI bus, and the multiplication operation of the block matrix is realized through a two-dimensional multiply-accumulate array. A System on Chip (SoC) architecture is built on the PYNQ platform, using ARM Cortex-A9 as the process control core, and the calculation of the entire matrix is completed under reasonable block and scheduling on the ARM. The IP core of matrix calculation is designed using High Level Synthesis (HLS), and the corresponding parallel optimization scheme is given. Experiments have verified that the prototype design of the hardware acceleration circuit achieves low power consumption, high energy efficiency, and high precision motion recognition while using less hardware resources.
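The GEMM mapping mentioned in this abstract can be illustrated in software: each 3D receptive field is unfolded into a column, so the convolution becomes one matrix product, which can then be computed block by block. The sketch below assumes stride 1, no padding, and a block size of 4. All function names are hypothetical, and it says nothing about the paper's actual FPGA design.

```python
import numpy as np

def im2col_3d(x, kd, kh, kw):
    """Unfold a (C, D, H, W) volume into a (C*kd*kh*kw, n_positions) matrix."""
    c, d, h, w = x.shape
    od, oh, ow = d - kd + 1, h - kh + 1, w - kw + 1
    cols = np.empty((c * kd * kh * kw, od * oh * ow), dtype=x.dtype)
    idx = 0
    for z in range(od):
        for y in range(oh):
            for xx in range(ow):
                cols[:, idx] = x[:, z:z + kd, y:y + kh, xx:xx + kw].ravel()
                idx += 1
    return cols, (od, oh, ow)

def blocked_matmul(a, b, block=4):
    """Multiply a @ b tile by tile, accumulating partial products per tile."""
    m, k = a.shape
    n = b.shape[1]
    out = np.zeros((m, n), dtype=np.result_type(a, b))
    for i in range(0, m, block):
        for j in range(0, n, block):
            for p in range(0, k, block):
                out[i:i + block, j:j + block] += (
                    a[i:i + block, p:p + block] @ b[p:p + block, j:j + block]
                )
    return out

def conv3d_gemm(x, weights):
    """Valid, stride-1 3D convolution (cross-correlation) as a matrix product."""
    n_out, _, kd, kh, kw = weights.shape
    cols, out_shape = im2col_3d(x, kd, kh, kw)
    w_mat = weights.reshape(n_out, -1)  # one flattened kernel per row
    return blocked_matmul(w_mat, cols).reshape(n_out, *out_shape)

def conv3d_direct(x, weights):
    """Reference implementation with explicit loops, used only for checking."""
    n_out, _, kd, kh, kw = weights.shape
    _, d, h, w = x.shape
    out = np.zeros((n_out, d - kd + 1, h - kh + 1, w - kw + 1))
    for o in range(n_out):
        for z in range(out.shape[1]):
            for y in range(out.shape[2]):
                for xx in range(out.shape[3]):
                    patch = x[:, z:z + kd, y:y + kh, xx:xx + kw]
                    out[o, z, y, xx] = np.sum(patch * weights[o])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 5, 6, 6))        # (channels, depth, height, width)
w = rng.normal(size=(4, 2, 3, 3, 3))     # (out_channels, channels, kd, kh, kw)
out = conv3d_gemm(x, w)
reference = conv3d_direct(x, w)
```

The tiled loop in `blocked_matmul` mirrors, in spirit only, the idea of sending matrix blocks to a multiply-accumulate array; on real hardware the block size would be chosen to fit the available on-chip resources.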
3

SENJAWATI, RINDU TEGAR, ESMERALDA CONTESSA DJAMAL, and FATAN KASYIDI. "Identifikasi Emosi Melalui Sinyal EEG menggunakan 3D-Convolutional Neural Network." ELKOMIKA: Jurnal Teknik Energi Elektrik, Teknik Telekomunikasi, & Teknik Elektronika 12, no. 2 (2024): 417. http://dx.doi.org/10.26760/elkomika.v12i2.417.

Abstract:
Emotions play an important role in human interaction through appropriate responses. Inappropriate responses indicate a mental disorder, so identification of emotions is required. Identification can be done using the electrical signal activity of the brain recorded with an electroencephalogram (EEG). Because the EEG signal in each channel is a data sequence, the channels are combined into a multi-channel matrix representation so that the order of the data is preserved. A matrix that combines information from all three dimensions (channel × frequency × time) can describe the complexity of the EEG signal, allowing recognition of brain activity patterns within specific frequency ranges as they evolve over time. To capture this information, feature extraction is needed so that the features represent the emotion variables. Extraction is done in the frequency domain (4-45 Hz) and the time domain using the Short Time Fourier Transform (STFT), followed by identification using a 3D Convolutional Neural Network (CNN). Experiments using the 3D CNN resulted in an accuracy of 65.45 with the Adamax weight correction technique.
Keywords: emotion, EEG signal, multi-channel, STFT, 3D-CNN
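The channel × frequency × time representation described above can be sketched with a basic STFT in NumPy. The sampling rate, frame length, hop size, Hann window, and the function names `stft_magnitude` and `eeg_to_volume` are all assumptions made for illustration. Only the 4-45 Hz band comes from the abstract.

```python
import numpy as np

def stft_magnitude(signal, frame_len, hop):
    """Magnitude STFT of a 1D signal (Hann window); returns (freq_bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T

def eeg_to_volume(eeg, fs, frame_len, hop, f_lo=4.0, f_hi=45.0):
    """Turn (channels, samples) EEG into a (channels, frequency, time) array."""
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    keep = (freqs >= f_lo) & (freqs <= f_hi)  # restrict to the 4-45 Hz band
    return np.stack([stft_magnitude(ch, frame_len, hop)[keep] for ch in eeg])

# Two synthetic "channels": pure 10 Hz and 20 Hz tones (assumed test signals).
fs = 128
t = np.arange(fs * 4) / fs
eeg = np.stack([np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 20 * t)])
volume = eeg_to_volume(eeg, fs, frame_len=128, hop=64)
```

A volume of this shape is what a 3D convolution can slide over, since neighboring entries are related along all three axes.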
4

Yin, Junjie, Ningning Huang, Jing Tang, and Meie Fang. "Recognition of 3D Shapes Based on 3V-DepthPano CNN." Mathematical Problems in Engineering 2020 (January 30, 2020): 1–11. http://dx.doi.org/10.1155/2020/7584576.

Abstract:
This paper proposes a convolutional neural network (CNN) with three branches based on the three-view drawing principle and depth panorama for 3D shape recognition. The three-view drawing principle provides three key views of a 3D shape. A depth panorama contains the complete 2.5D information of each view. 3V-DepthPano CNN is a CNN system with three branches designed for depth panoramas generated from the three key views. This recognition system, i.e., 3V-DepthPano CNN, applies a three-branch convolutional neural network to aggregate the 3D shape depth panorama information into a more compact 3D shape descriptor to implement the classification of 3D shapes. Furthermore, we adopt a fine-tuning technique on 3V-DepthPano CNN and extract shape features to facilitate the retrieval of 3D shapes. The proposed method implements a good tradeoff state between higher accuracy and training time. Experiments show that the proposed 3V-DepthPano CNN with 3 views obtains approximate accuracy to MVCNN with 12/80 views. But the 3V-DepthPano CNN frame takes much shorter time to obtain depth panoramas and train the network than MVCNN. It is superior to all other existing advanced methods for both classification and shape retrieval.
5

Jiang, Haiyang, Yaozong Pan, Jian Zhang, and Haitao Yang. "Battlefield Target Aggregation Behavior Recognition Model Based on Multi-Scale Feature Fusion." Symmetry 11, no. 6 (2019): 761. http://dx.doi.org/10.3390/sym11060761.

Abstract:
In this paper, our goal is to improve the recognition accuracy of battlefield target aggregation behavior while maintaining the low computational cost of spatio-temporal depth neural networks. To this end, we propose a novel 3D-CNN (3D Convolutional Neural Networks) model, which extends the idea of multi-scale feature fusion to the spatio-temporal domain, and enhances the feature extraction ability of the network by combining feature maps of different convolutional layers. In order to reduce the computational complexity of the network, we further improved the multi-fiber network, and finally established an architecture—3D convolution Two-Stream model based on multi-scale feature fusion. Extensive experimental results on the simulation data show that our network significantly boosts the efficiency of existing convolutional neural networks in the aggregation behavior recognition, achieving the most advanced performance on the dataset constructed in this paper.
6

Dong, Shidu, Zhi Liu, Huaqiu Wang, Yihao Zhang, and Shaoguo Cui. "A Separate 3D Convolutional Neural Network Architecture for 3D Medical Image Semantic Segmentation." Journal of Medical Imaging and Health Informatics 9, no. 8 (2019): 1705–16. http://dx.doi.org/10.1166/jmihi.2019.2797.

Abstract:
To exploit three-dimensional (3D) context information and improve 3D medical image semantic segmentation, we propose a separate 3D (S3D) convolution neural network (CNN) architecture. First, a two-dimensional (2D) CNN is used to extract the 2D features of each slice in the xy-plane of 3D medical images. Second, one-dimensional (1D) features reassembled from the 2D features in the z-axis are input into a 1D-CNN and are then classified feature-wise. Analysis shows that S3D-CNN has lower time complexity, fewer parameters and less memory space requirements than other 3D-CNNs with a similar structure. As an example, we extend the deep convolutional encoder–decoder architecture (SegNet) to S3D-SegNet for brain tumor image segmentation. We also propose a method based on priority queues and the dice loss function to address the class imbalance for medical image segmentation. The experimental results show the following: (1) S3D-SegNet extended from SegNet can improve brain tumor image segmentation. (2) The proposed imbalance accommodation method can increase the speed of training convergence and reduce the negative impact of the imbalance. (3) S3D-SegNet with the proposed imbalance accommodation method offers performance comparable to that of some state-of-the-art 3D-CNNs and experts in brain tumor image segmentation.
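A rough way to see why splitting a 3D convolution into a 2D stage and a 1D stage can save parameters is to count weights for a single layer. The layer layout below (one k × k convolution in the xy-plane followed by one length-k convolution along z, biases ignored) is an assumption for illustration and is not the paper's analysis. The second function shows one possible way to reassemble per-slice 2D features into 1D sequences along z.

```python
import numpy as np

def conv3d_params(c_in, c_out, k):
    """Weights in one full k x k x k 3D convolution layer (no bias)."""
    return c_in * c_out * k ** 3

def separated_params(c_in, c_out, k):
    """Weights in a k x k 2D convolution followed by a length-k 1D convolution."""
    return c_in * c_out * k ** 2 + c_out * c_out * k

def slices_to_z_sequences(features):
    """Reassemble (Z, H, W, F) per-slice features into (H*W, Z, F) sequences."""
    z, h, w, f = features.shape
    return np.transpose(features, (1, 2, 0, 3)).reshape(h * w, z, f)

full = conv3d_params(64, 64, 3)
split = separated_params(64, 64, 3)

# Hypothetical feature maps: 5 slices of 4 x 4 positions with 8 features each.
feats = np.arange(5 * 4 * 4 * 8, dtype=float).reshape(5, 4, 4, 8)
sequences = slices_to_z_sequences(feats)
```

Each row of `sequences` follows one xy position through all slices, which is the form a 1D convolution along z would consume.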
7

Avula, Sri Lasya. "Efficient 3D Medical Image Segmentation using CoTr: Bridging CNN and Transformer." International Journal for Research in Applied Science and Engineering Technology 11, no. 5 (2023): 4748–54. http://dx.doi.org/10.22214/ijraset.2023.52686.

Abstract:
Neural networks are a subset of machine learning, and they are at the heart of deep learning algorithms. Before CNNs, identifying objects in images relied on time-consuming manual feature extraction methods. The superior performance of convolutional neural networks when dealing with images, speech, or audio signals sets them apart from other neural networks. Convolutional neural networks (CNNs) have become the de facto standard for 3D medical image segmentation. Due to the inductive bias of locality and weight sharing inherent in convolutional operations, these networks lose the ability to model long-range dependency. In this study, a novel framework is presented for accurately segmenting 3D medical images based on the combination of a convolutional neural network and a transformer (CoTr). This framework allows us to construct CNNs for extracting feature representations, and Vision Transformers for modelling long-range dependency on the extracted feature maps. As a self-attention device, the transformer performs a global operation in which it draws on information from across the whole system in order to make a decision.
8

Hu, Jinlong, Yuezhen Kuang, Bin Liao, Lijie Cao, Shoubin Dong, and Ping Li. "A Multichannel 2D Convolutional Neural Network Model for Task-Evoked fMRI Data Classification." Computational Intelligence and Neuroscience 2019 (December 31, 2019): 1–9. http://dx.doi.org/10.1155/2019/5065214.

Abstract:
Deep learning models have been successfully applied to the analysis of various functional MRI data. Convolutional neural networks (CNN), a class of deep neural networks, have been found to excel at extracting local meaningful features based on their shared-weights architecture and space invariance characteristics. In this study, we propose M2D CNN, a novel multichannel 2D CNN model, to classify 3D fMRI data. The model uses sliced 2D fMRI data as input and integrates multichannel information learned from 2D CNN networks. We experimentally compared the proposed M2D CNN against several widely used models including SVM, 1D CNN, 2D CNN, 3D CNN, and 3D separable CNN with respect to their performance in classifying task-based fMRI data. We tested M2D CNN against six models as benchmarks to classify a large number of time-series whole-brain imaging data based on a motor task in the Human Connectome Project (HCP). The results of our experiments demonstrate the following: (i) convolution operations in the CNN models are advantageous for high-dimensional whole-brain imaging data classification, as all CNN models outperform SVM; (ii) 3D CNN models achieve higher accuracy than 2D CNN and 1D CNN model, but 3D CNN models are computationally costly as any extra dimension is added in the input; (iii) the M2D CNN model proposed in this study achieves the highest accuracy and alleviates data overfitting given its smaller number of parameters as compared with 3D CNN.
9

Chen, Jiangcheng, Sheng Bi, George Zhang, and Guangzhong Cao. "High-Density Surface EMG-Based Gesture Recognition Using a 3D Convolutional Neural Network." Sensors 20, no. 4 (2020): 1201. http://dx.doi.org/10.3390/s20041201.

Abstract:
High-density surface electromyography (HD-sEMG) and deep learning technology are becoming increasingly used in gesture recognition. Based on electrode grid data, information can be extracted in the form of images that are generated with instant values of multi-channel sEMG signals. In previous studies, image-based, two-dimensional convolutional neural networks (2D CNNs) have been applied in order to recognize patterns in the electrical activity of muscles from an instantaneous image. However, 2D CNNs with 2D kernels are unable to handle a sequence of images that carry information concerning how the instantaneous image evolves with time. This paper presents a 3D CNN with 3D kernels to capture both spatial and temporal structures from sequential sEMG images and investigates its performance on HD-sEMG-based gesture recognition in comparison to the 2D CNN. Extensive experiments were carried out on two benchmark datasets (i.e., CapgMyo DB-a and CSL-HDEMG). The results show that, where the same network architecture is used, 3D CNN can achieve a better performance than 2D CNN, especially for CSL-HDEMG, which contains the dynamic part of finger movement. For CapgMyo DB-a, the accuracy of 3D CNN was 1% higher than 2D CNN when the recognition window length was equal to 40 ms, and was 1.5% higher when equal to 150 ms. For CSL-HDEMG, the accuracies of 3D CNN were 15.3% and 18.6% higher than 2D CNN when the window length was equal to 40 ms and 150 ms, respectively. Furthermore, 3D CNN achieves a competitive performance in comparison to the baseline methods.
10

Polat, Huseyin, and Homay Danaei Mehr. "Classification of Pulmonary CT Images by Using Hybrid 3D-Deep Convolutional Neural Network Architecture." Applied Sciences 9, no. 5 (2019): 940. http://dx.doi.org/10.3390/app9050940.

Abstract:
Lung cancer is the most common cause of cancer-related deaths worldwide. Hence, the survival rate of patients can be increased by early diagnosis. Recently, machine learning methods on Computed Tomography (CT) images have been used in the diagnosis of lung cancer to accelerate the diagnosis process and assist physicians. However, in conventional machine learning techniques, applying handcrafted feature extraction methods to CT images is a complicated process. Hence, deep learning, an effective area of machine learning that extracts features automatically, could minimize the feature extraction effort. In this study, two Convolutional Neural Network (CNN)-based models were proposed as deep learning methods to diagnose lung cancer on lung CT images. To investigate the performance of the two proposed models (Straight 3D-CNN with conventional softmax and hybrid 3D-CNN with Radial Basis Function (RBF)-based SVM), altered versions of two well-known CNN architectures (3D-AlexNet and 3D-GoogleNet) were considered. Experimental results showed that the performance of the two proposed models surpassed 3D-AlexNet and 3D-GoogleNet. Furthermore, the proposed hybrid 3D-CNN with SVM achieved more satisfying results (91.81%, 88.53% and 91.91% for accuracy rate, sensitivity and precision respectively) compared to straight 3D-CNN with softmax in the diagnosis of lung cancer.
11

Xu, Weiting, Xingcheng Han, Yingliang Zhao, et al. "Research on Underwater Acoustic Target Recognition Based on a 3D Fusion Feature Joint Neural Network." Journal of Marine Science and Engineering 12, no. 11 (2024): 2063. http://dx.doi.org/10.3390/jmse12112063.

Abstract:
In the context of a complex marine environment, extracting and recognizing underwater acoustic target features using ship-radiated noise present significant challenges. This paper proposes a novel deep neural network model for underwater target recognition, which integrates 3D Mel frequency cepstral coefficients (3D-MFCC) and 3D Mel features derived from ship audio signals as inputs. The model employs a serial architecture that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network. It replaces the traditional CNN with a multi-scale depthwise separable convolutional network (MSDC) and incorporates a multi-scale channel attention mechanism (MSCA). The experimental results demonstrate that the average recognition rate of this method reaches 87.52% on the DeepShip dataset and 97.32% on the ShipsEar dataset, indicating a strong classification performance.
12

Mu, Guo, and Liu. "A Multi-Scale and Multi-Level Spectral-Spatial Feature Fusion Network for Hyperspectral Image Classification." Remote Sensing 12, no. 1 (2020): 125. http://dx.doi.org/10.3390/rs12010125.

Abstract:
Extracting spatial and spectral features through deep neural networks has become an effective means of classification of hyperspectral images. However, most networks rarely consider the extraction of multi-scale spatial features and cannot fully integrate spatial and spectral features. In order to solve these problems, this paper proposes a multi-scale and multi-level spectral-spatial feature fusion network (MSSN) for hyperspectral image classification. The network uses the original 3D cube as input data and does not need to use feature engineering. In the MSSN, using different scale neighborhood blocks as the input of the network, the spectral-spatial features of different scales can be effectively extracted. The proposed 3D–2D alternating residual block combines the spectral features extracted by the three-dimensional convolutional neural network (3D-CNN) with the spatial features extracted by the two-dimensional convolutional neural network (2D-CNN). It not only achieves the fusion of spectral features and spatial features but also achieves the fusion of high-level features and low-level features. Experimental results on four hyperspectral datasets show that this method is superior to several state-of-the-art classification methods for hyperspectral images.
13

Lu, Xiaofei, and Shouwang Li. "Design of 3D Environment Combining Digital Image Processing Technology and Convolutional Neural Network." Advances in Multimedia 2024 (January 12, 2024): 1–12. http://dx.doi.org/10.1155/2024/5528497.

Abstract:
As virtual reality technology advances, 3D environment design and modeling have garnered increasing attention. Applications in networked virtual environments span urban planning, industrial design, and manufacturing, among other fields. However, existing 3D modeling methods exhibit high reconstruction error precision, limiting their practicality in many domains, particularly environmental design. To enhance 3D reconstruction accuracy, this study proposes a digital image processing technology that combines binocular camera calibration, stereo correction, and a convolutional neural network (CNN) algorithm for optimization and improvement. By employing the refined stereo-matching algorithm, a 3D reconstruction model was developed to augment 3D environment design and reconstruction accuracy while optimizing the 3D reconstruction effect. An experiment using the ShapeNet dataset demonstrated that the evaluation indices—Chamfer distance (CD), Earth mover’s distance (EMD), and intersection over union—of the model constructed in this study outperformed those of alternative methods. After incorporating the CNN module in the ablation experiment, CD and EMD increased by an average of 0.1 and 0.06, respectively. This validates that the proposed CNN module effectively enhances point cloud reconstruction accuracy. Upon adding the CNN module, the CD index and EMD index in the dataset increased by an average of 0.34 and 0.54, respectively. These results indicate that the proposed CNN module exhibits strong predictive capabilities for point cloud coordinates. Furthermore, the model demonstrates good generalization performance.
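Two of the evaluation indices named in this abstract, Chamfer distance and intersection over union, are easy to state in code. The sketch below uses one common definition of the Chamfer distance (sum of the two mean squared nearest-neighbor distances). The paper may use a different variant or scaling, so this is an assumption. The function names are hypothetical.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    # Mean nearest-neighbor distance from a to b, plus from b to a.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def voxel_iou(a, b):
    """Intersection over union of two boolean occupancy grids of equal shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both grids empty: treat as a perfect match
    return np.logical_and(a, b).sum() / union

cloud = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
shifted = cloud + np.array([0.5, 0.0, 0.0])
cd = chamfer_distance(cloud, shifted)
```

Under this definition two identical point sets score 0, and the pairwise distance matrix makes the cost grow with the product of the two set sizes.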
14

Nayak, Omprakash, Hrishikesh Khandare, Nikhil Kumar Parida, Ramnivas Giri, Rekh Ram Janghel, and Himanshu Govil. "Hyperspectral Image Classification using Hybrid Deep Convolutional Neural Network." Journal of Physics: Conference Series 2273, no. 1 (2022): 012028. http://dx.doi.org/10.1088/1742-6596/2273/1/012028.

Abstract:
Hyperspectral images (HSI) are becoming widely popular due to the evolution of satellite imagery and camera technology. Remote sensing, which is closely related to HSI, has also gained popularity. HSI possesses a wide variety of spatial and spectral features. However, HSI also has a considerable amount of useless or redundant data. This redundant data causes a lot of trouble during classification, as it possesses a huge range in contrast to RGB. Traditional classification techniques do not apply efficiently to HSI. Even if the traditional techniques are somehow applied to it, the results produced are inefficient and undesirable. Convolutional neural networks (CNNs), which are widely used for image classification, have their fair share of trouble when dealing with HSI: 2D CNNs are not very efficient, and 3D CNNs increase the computational complexity. To overcome these issues, a new hybrid CNN approach is used, which combines a 2D CNN with a 3D CNN and uses a sigmoid activation function at the output layer to generate the desired output. The dataset used is the Indian Pines dataset, with a sigmoid classifier for classification. The method achieves an overall accuracy of 99.34%, an average accuracy of 99.27%, and a kappa of 99.25%.
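The three scores reported here (overall accuracy, average accuracy, and kappa) are all derived from a confusion matrix. The following sketch uses the standard definitions. The function name `classification_scores` and the small example matrix are hypothetical and unrelated to the paper's results.

```python
import numpy as np

def classification_scores(confusion):
    """Overall accuracy, average per-class accuracy, and Cohen's kappa.

    confusion[i, j] counts samples of true class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=np.float64)
    total = confusion.sum()
    oa = np.trace(confusion) / total                        # overall accuracy
    per_class = np.diag(confusion) / confusion.sum(axis=1)  # recall per class
    aa = per_class.mean()                                   # average accuracy
    # Agreement expected by chance from the row and column marginals.
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa

# Made-up three-class confusion matrix for illustration.
cm = [[50, 2, 0],
      [3, 45, 2],
      [0, 5, 43]]
oa, aa, kappa = classification_scores(cm)
```

Kappa discounts the agreement that would occur by chance, so it is typically lower than the overall accuracy unless the classifier is perfect.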
15

Dong, Min, Zhenglin Fang, Yongfa Li, Sheng Bi, and Jiangcheng Chen. "AR3D: Attention Residual 3D Network for Human Action Recognition." Sensors 21, no. 5 (2021): 1656. http://dx.doi.org/10.3390/s21051656.

Abstract:
At present, in the field of video-based human action recognition, deep neural networks are mainly divided into two branches: the 2D convolutional neural network (CNN) and 3D CNN. However, 2D CNN’s temporal and spatial feature extraction processes are independent of each other, which means that it is easy to ignore the internal connection, affecting the performance of recognition. Although 3D CNN can extract the temporal and spatial features of the video sequence at the same time, the parameters of the 3D model increase exponentially, resulting in the model being difficult to train and transfer. To solve this problem, this article is based on 3D CNN combined with a residual structure and attention mechanism to improve the existing 3D CNN model, and we propose two types of human action recognition models (the Residual 3D Network (R3D) and Attention Residual 3D Network (AR3D)). Firstly, in this article, we propose a shallow feature extraction module and improve the ordinary 3D residual structure, which reduces the parameters and strengthens the extraction of temporal features. Secondly, we explore the application of the attention mechanism in human action recognition and design a 3D spatio-temporal attention mechanism module to strengthen the extraction of global features of human action. Finally, in order to make full use of the residual structure and attention mechanism, an Attention Residual 3D Network (AR3D) is proposed, and its two fusion strategies and corresponding model structure (AR3D_V1, AR3D_V2) are introduced in detail. Experiments show that the fused structure shows different degrees of performance improvement compared to a single structure.
16

Ruan, Kun, Shun Zhao, Xueqin Jiang, et al. "A 3D Fluorescence Classification and Component Prediction Method Based on VGG Convolutional Neural Network and PARAFAC Analysis Method." Applied Sciences 12, no. 10 (2022): 4886. http://dx.doi.org/10.3390/app12104886.

Abstract:
Three-dimensional fluorescence is currently studied by methods such as parallel factor analysis (PARAFAC), fluorescence regional integration (FRI), and principal component analysis (PCA). There are also many studies combining convolutional neural networks at present, but no single method is recognized as the most effective among those combining convolutional neural networks and 3D fluorescence analysis. Based on this, we took some samples from the actual environment for measuring 3D fluorescence data and obtained a batch of public datasets from the internet species. Firstly, we preprocessed the data (including two steps of PARAFAC analysis and CNN dataset generation), and then we proposed a 3D fluorescence classification method and a components fitting method based on VGG16 and VGG11 convolutional neural networks. The VGG16 network is used for the classification of 3D fluorescence data with a training accuracy of 99.6% (the same as that of the PCA + SVM method, 99.6%). Among the component maps fitting networks, we comprehensively compared the improved LeNet network, the improved AlexNet network, and the improved VGG11 network, and finally selected the improved VGG11 network as the component maps fitting network. In training the improved VGG11 network, we used the MSE loss function and cosine similarity to judge the merit of the model. The MSE loss of the network training reached 4.6 × 10⁻⁴ (characterizing the variability between the training results and the actual results), and the cosine similarity, used as the accuracy criterion, reached 0.99 (comparing the training results with the actual results). The network performance is excellent. The experiments demonstrate that the convolutional neural network has a great application in 3D fluorescence analysis.
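The two criteria used to judge the fitting network, MSE and cosine similarity, can be written in a few lines. This is a generic sketch with hypothetical function names and made-up arrays. It does not reproduce the paper's training setup.

```python
import numpy as np

def mse(predicted, target):
    """Mean squared error between two arrays of the same shape."""
    predicted = np.asarray(predicted, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return np.mean((predicted - target) ** 2)

def cosine_similarity(predicted, target):
    """Cosine of the angle between two arrays, flattened to vectors."""
    p = np.asarray(predicted, dtype=np.float64).ravel()
    t = np.asarray(target, dtype=np.float64).ravel()
    return float(p @ t / (np.linalg.norm(p) * np.linalg.norm(t)))

# Made-up "component map" and a slightly perturbed prediction.
target_map = np.array([[0.0, 0.2], [0.8, 1.0]])
predicted_map = target_map + 0.01
error = mse(predicted_map, target_map)
similarity = cosine_similarity(predicted_map, target_map)
```

MSE is sensitive to the absolute scale of the error, while cosine similarity only measures whether the two maps have the same shape up to a scale factor, which is why reporting both is informative.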
17

Yu, Run, Youqing Luo, Haonan Li, et al. "Three-Dimensional Convolutional Neural Network Model for Early Detection of Pine Wilt Disease Using UAV-Based Hyperspectral Images." Remote Sensing 13, no. 20 (2021): 4065. http://dx.doi.org/10.3390/rs13204065.

Abstract:
As one of the most devastating disasters to pine forests, pine wilt disease (PWD) has caused tremendous ecological and economic losses in China. An effective way to prevent large-scale PWD outbreaks is to detect and remove the damaged pine trees at the early stage of PWD infection. However, early infected pine trees do not show obvious changes in morphology or color in the visible wavelength range, making early detection of PWD tricky. Unmanned aerial vehicle (UAV)-based hyperspectral imagery (HI) has great potential for early detection of PWD. However, the commonly used methods, such as the two-dimensional convolutional neural network (2D-CNN), fail to simultaneously extract and fully utilize the spatial and spectral information, whereas the three-dimensional convolutional neural network (3D-CNN) is able to collect this information from raw hyperspectral data. In this paper, we applied the residual block to 3D-CNN and constructed a 3D-Res CNN model, the performance of which was then compared with that of 3D-CNN, 2D-CNN, and 2D-Res CNN in identifying PWD-infected pine trees from the hyperspectral images. The 3D-Res CNN model outperformed the other models, achieving an overall accuracy (OA) of 88.11% and an accuracy of 72.86% for detecting early infected pine trees (EIPs). Using only 20% of the training samples, the OA and EIP accuracy of 3D-Res CNN can still achieve 81.06% and 51.97%, which is superior to the state-of-the-art method in the early detection of PWD based on hyperspectral images. Collectively, 3D-Res CNN was more accurate and effective in early detection of PWD. In conclusion, 3D-Res CNN is proposed for early detection of PWD in this paper, making the prediction and control of PWD more accurate and effective. This model can also be applied to detect pine trees damaged by other diseases or insect pests in the forest.
18

Liu, Lu, and Guobao Feng. "Polarimetric SAR image classification using 3D generative adversarial network." MATEC Web of Conferences 336 (2021): 08012. http://dx.doi.org/10.1051/matecconf/202133608012.

Abstract:
In this paper, a new architecture of three-dimensional deep convolutional generative adversarial network (3D-DCGAN) is specially defined to solve the unstable training problem of GAN and make full use of the information involved in polarimetric data. Firstly, a data cube with the nine components of the polarimetric coherency matrix is directly used as the input features of the DCGAN. After that, a 3D convolutional model is designed as the components of the generator and discriminator to construct the 3D-DCGAN, which exploits the effective feature extraction capability of the 3D convolutional neural network (CNN). Finally, the parameters of the network are fine-tuned to realize the polarimetric SAR image classification. The experimental results show the feasibility and efficiency of the proposed method.
19

Chen, Dong, and Young Hoon Joo. "A Novel Approach to 3D-DOA Estimation of Stationary EM Signals Using Convolutional Neural Networks." Sensors 20, no. 10 (2020): 2761. http://dx.doi.org/10.3390/s20102761.

Abstract:
This paper proposes a novel three-dimensional direction-of-arrival (3D-DOA) estimation method for electromagnetic (EM) signals using convolutional neural networks (CNN) in a Gaussian or non-Gaussian noise environment. First of all, in the presence of Gaussian noise, four output covariance matrices of the uniform triangular array (UTA) are normalized and then fed into four neural networks for 1D-DOA estimation with identical parameters in parallel; then four 1D-DOA estimations of the UTA can be obtained, and finally, the 3D-DOA estimation could be obtained through post-processing. Secondly, in the presence of non-Gaussian noise, the array output covariance matrices are normalized by the infinity-norm and then processed in Gaussian noise environment; the infinity-norm normalization could effectively suppress impulsive outliers and then provide appropriate input features for the neural network. In addition, the outputs of the neural network are controlled by a signal monitoring network to avoid misjudgments. Comprehensive simulations demonstrate that in Gaussian or non-Gaussian noise environment, the proposed method is superior and effective in computation speed and accuracy in 1D-DOA and 3D-DOA estimations, and the signal monitoring network could also effectively control the neural network outputs. Consequently, we can conclude that CNN has better generalization ability in DOA estimation.
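The infinity-norm normalization mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which form of the infinity-norm is used, so the code below assumes the standard induced matrix infinity-norm (largest absolute row sum); the function names and the 2x2 matrix are hypothetical and not taken from the paper.

```python
def infinity_norm(matrix):
    """Induced matrix infinity-norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in matrix)


def normalize_by_infinity_norm(matrix):
    """Scale every entry so that the matrix has infinity-norm 1."""
    norm = infinity_norm(matrix)
    if norm == 0:
        raise ValueError("cannot normalize an all-zero matrix")
    return [[x / norm for x in row] for row in matrix]


# Hypothetical 2x2 Hermitian covariance matrix whose entries have been
# inflated by an impulsive outlier (illustrative values only).
cov = [[400 + 0j, 120 - 160j],
       [120 + 160j, 100 + 0j]]

normalized = normalize_by_infinity_norm(cov)
```

After normalization no entry has magnitude above 1, whatever the scale of the outlier, which is one plausible reading of why this step gives the network better-conditioned inputs.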
20

Ding, Bo, Lei Tang, and Yong-jun He. "An Efficient 3D Model Retrieval Method Based on Convolutional Neural Network." Complexity 2020 (June 11, 2020): 1–14. http://dx.doi.org/10.1155/2020/9050459.

Abstract:
Recently, 3D model retrieval based on views has become a research hotspot. In this method, 3D models are represented as a collection of 2D projective views, which allows deep learning techniques to be used for 3D model classification and retrieval. However, current methods need improvements in both accuracy and efficiency. To solve these problems, we propose a new 3D model retrieval method, which includes index building and model retrieval. In the index building stage, 3D models in the library are projected to generate a large number of views, and then representative views are selected and input into a well-learned convolutional neural network (CNN) to extract features. Next, the features are organized according to their labels to build indexes. In this stage, the views used for representing 3D models are reduced substantially while keeping enough information about the 3D models. This method reduces the number of similarity matching operations by 87.8%. In retrieval, the 2D views of the input model are classified into a category with the CNN and a voting algorithm, and then only the features of one category rather than all categories are chosen to perform similarity matching. In this way, the search space for retrieval is reduced. In addition, the number of views used for retrieval is gradually increased. Once there is enough evidence to determine a 3D model, the retrieval process is terminated early. The variable view matching method further reduces the number of similarity matching operations by 21.4%. Experiments on the rigid 3D model datasets ModelNet10 and ModelNet40 and the nonrigid 3D model dataset McGill10 show that the proposed method has achieved retrieval accuracy rates of 94%, 92%, and 100%, respectively.
21

Kharrat, Ahmed, and Mahmoud Neji. "Segmentation of Brain Tumors Using Three-Dimensional Convolutional Neural Network on MRI Images 3D MedImg-CNN." International Journal of Cognitive Informatics and Natural Intelligence 15, no. 4 (2021): 1–17. http://dx.doi.org/10.4018/ijcini.20211001.oa4.

Abstract:
We consider the problem of fully automatic brain tumor segmentation in MR images containing glioblastomas. We propose a three-dimensional convolutional neural network (3D MedImg-CNN) approach which achieves high performance while being extremely efficient, a balance that existing methods have struggled to achieve. Our 3D MedImg-CNN is formed directly on the raw image modalities and thus learns a characteristic representation directly from the data. We propose a new cascaded architecture with two pathways that each model normal details in tumors. Fully exploiting the convolutional nature of our model also allows us to segment a complete cerebral image in one minute. The performance of the proposed 3D MedImg-CNN with CNN segmentation method is computed using the Dice similarity coefficient (DSC). In experiments on the 2013, 2015 and 2017 BraTS challenge datasets, we show that our approach is among the most powerful methods in the literature, while also being very effective.
22

Manimegalai, P., R. Suresh Kumar, Prajoona Valsalan, R. Dhanagopal, P. T. Vasanth Raj, and Jerome Christhudass. "3D Convolutional Neural Network Framework with Deep Learning for Nuclear Medicine." Scanning 2022 (July 16, 2022): 1–9. http://dx.doi.org/10.1155/2022/9640177.

Abstract:
Though artificial intelligence (AI) has been used in nuclear medicine for more than 50 years, more progress has been made in deep learning (DL) and machine learning (ML), which have driven the development of new AI abilities in the field. ANNs are used in both deep learning and machine learning in nuclear medicine. Alternatively, if 3D convolutional neural network (CNN) is used, the inputs may be the actual images that are being analyzed, rather than a set of inputs. In nuclear medicine, artificial intelligence reimagines and reengineers the field’s therapeutic and scientific capabilities. Understanding the concepts of 3D CNN and U-Net in the context of nuclear medicine provides for a deeper engagement with clinical and research applications, as well as the ability to troubleshoot problems when they emerge. Business analytics, risk assessment, quality assurance, and basic classifications are all examples of simple ML applications. General nuclear medicine, SPECT, PET, MRI, and CT may benefit from more advanced DL applications for classification, detection, localization, segmentation, quantification, and radiomic feature extraction utilizing 3D CNNs. An ANN may be used to analyze a small dataset at the same time as traditional statistical methods, as well as bigger datasets. Nuclear medicine’s clinical and research practices have been largely unaffected by the introduction of artificial intelligence (AI). Clinical and research landscapes have been fundamentally altered by the advent of 3D CNN and U-Net applications. Nuclear medicine professionals must now have at least an elementary understanding of AI principles such as neural networks (ANNs) and convolutional neural networks (CNNs).
23

TAŞPINAR, Gürcan, and Nalan ÖZKURT. "3D CNN Based Automatic Diagnosis of ADHD Using fMRI Volumes." Deu Muhendislik Fakultesi Fen ve Muhendislik 25, no. 73 (2023): 1–8. http://dx.doi.org/10.21205/deufmd.2023257301.

Abstract:
Attention deficit hyperactivity disorder (ADHD) is one of the most common mental health disorders, and it is especially threatening to the academic performance of children. Its neurobiological diagnosis is essential for clinicians to treat ADHD patients properly. Along with machine learning algorithms, neuroimaging technologies, especially functional magnetic resonance imaging, are increasingly used as biomarkers in attention deficit hyperactivity disorder. Machine learning methods have also become popular recently. This study presents an optimized 3-dimensional convolutional neural network to classify functional magnetic resonance imaging volumes into two classes to assist experts in diagnosing ADHD. To demonstrate the importance of extracting 3D relationships of data, the method has been tested on ADHD-200 public datasets and its performance on the hold-out testing datasets has been evaluated. Then the network performance has been compared with several recent ADHD detection convolutional neural networks in the literature. It has been observed that the proposed network has a promising performance.
24

Wang, Yu, Shuyang Ma, and Xuanjing Shen. "A Novel Video Face Verification Algorithm Based on TPLBP and the 3D Siamese-CNN." Electronics 8, no. 12 (2019): 1544. http://dx.doi.org/10.3390/electronics8121544.

Abstract:
In order to reduce the computational consumption of the training and the testing phases of video face recognition methods based on a global statistical method and a deep learning network, a novel video face verification algorithm based on a three-patch local binary pattern (TPLBP) and the 3D Siamese convolutional neural network is proposed in this paper. The proposed method takes the TPLBP texture feature which has excellent performance in face analysis as the input of the network. In order to extract the inter-frame information of the video, the texture feature maps of the multi-frames are stacked, and then a shallow Siamese 3D convolutional neural network is used to realize dimension reduction. The similarity of high-level features of the video pair is solved by the shallow Siamese 3D convolutional neural network, and then mapped to the interval of 0 to 1 by linear transformation. The classification result can be obtained with the threshold of 0.5. Through an experiment on the YouTube Face database, the proposed algorithm got higher accuracy with less computational consumption than baseline methods and deep learning methods.
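The final decision step described above (a similarity score mapped linearly to the interval 0 to 1 and compared with a threshold of 0.5) might look like the sketch below. The abstract does not name the similarity measure, so cosine similarity is an assumption here, as are the function names and the toy feature vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify_pair(features_a, features_b, threshold=0.5):
    """Map the similarity linearly from [-1, 1] to [0, 1], then threshold it."""
    score = (cosine_similarity(features_a, features_b) + 1.0) / 2.0
    return score, score > threshold


# Toy high-level features for two video pairs (illustrative values only).
same_score, same_decision = verify_pair([1.0, 0.0], [1.0, 0.0])
diff_score, diff_decision = verify_pair([1.0, 0.0], [-1.0, 0.0])
```

Identical feature vectors map to a score of 1 and opposite vectors to 0, so the 0.5 threshold separates positively correlated pairs from negatively correlated ones under this assumed measure.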
25

Kiranpure, Ayush. "Cyclone Intensity Prediction Using Deep Learning on INSAT-3D IR Imagery: A Comparative Analysis." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 04 (2025): 1–9. https://doi.org/10.55041/ijsrem45392.

Abstract:
This study investigates the effectiveness of deep learning techniques in accurately estimating tropical cyclone intensity using infrared (IR) imagery from the INSAT-3D satellite. We assess the performance of three models: a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), and a hybrid CNN-RNN model. We compare them against traditional machine learning methods such as Support Vector Machines (SVM) and Random Forests (RF). Results demonstrate that deep learning models significantly outperform traditional approaches, with the CNN-RNN model achieving the highest accuracy. These findings highlight the potential of deep learning to enhance early warning systems for extreme weather events. Keywords: Deep Learning, Machine Learning, Preprocessing, CNN, INSAT-3D Images
26

Ullah, Fath U. Min, Amin Ullah, Khan Muhammad, Ijaz Ul Haq, and Sung Wook Baik. "Violence Detection Using Spatiotemporal Features with 3D Convolutional Neural Network." Sensors 19, no. 11 (2019): 2472. http://dx.doi.org/10.3390/s19112472.

Abstract:
The worldwide utilization of surveillance cameras in smart cities has enabled researchers to analyze a gigantic volume of data to ensure automatic monitoring. An enhanced security system in smart cities, schools, hospitals, and other surveillance domains is mandatory for the detection of violent or abnormal activities to avoid any casualties which could cause social, economic, and ecological damages. Automatic detection of violence for quick actions is very significant and can efficiently assist the concerned departments. In this paper, we propose a triple-staged end-to-end deep learning violence detection framework. First, persons are detected in the surveillance video stream using a light-weight convolutional neural network (CNN) model to reduce and overcome the voluminous processing of useless frames. Second, a sequence of 16 frames with detected persons is passed to 3D CNN, where the spatiotemporal features of these sequences are extracted and fed to the Softmax classifier. Furthermore, we optimized the 3D CNN model using an open visual inference and neural networks optimization toolkit developed by Intel, which converts the trained model into intermediate representation and adjusts it for optimal execution at the end platform for the final prediction of violent activity. After detection of a violent activity, an alert is transmitted to the nearest police station or security department to take prompt preventive actions. We found that our proposed method outperforms the existing state-of-the-art methods for different benchmark datasets.
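The second stage above feeds sequences of 16 frames to the 3D CNN. A minimal sketch of how a frame stream could be grouped into such clips follows; the abstract does not say whether clips overlap or how leftover frames are handled, so non-overlapping clips with the remainder dropped are assumptions, and `make_clips` is a hypothetical helper.

```python
def make_clips(frames, clip_len=16):
    """Group frames into consecutive, non-overlapping clips of clip_len frames.

    Trailing frames that do not fill a whole clip are dropped (an assumption).
    """
    usable = len(frames) - len(frames) % clip_len
    return [frames[i:i + clip_len] for i in range(0, usable, clip_len)]


# Stand-in for 40 frames that passed the person-detection stage;
# integers are used here in place of real image arrays.
frames = list(range(40))
clips = make_clips(frames)
```

With 40 input frames this yields two full clips and discards the last eight frames; a sliding window with overlap would be an equally plausible design.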
27

Gao, Zhiyong, and Jianhong Xiang. "Real-time 3D Object Detection Using Improved Convolutional Neural Network Based on Image-driven Point Cloud." Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) 14, no. 8 (2021): 826–36. http://dx.doi.org/10.2174/2352096514666211026142721.

Abstract:
Background: While detecting the object directly from the 3D point cloud, the natural 3D patterns and invariance of 3D data are often obscure. Objective: In this work, we aimed at studying the 3D object detection from discrete, disordered and sparse 3D point clouds. Methods: The CNN comprises the frustum sequence module, 3D instance segmentation module SNET, 3D point cloud transformation module T-NET, and 3D boundary box estimation module ENET. The search space of the object is determined by the frustum sequence module. The instance segmentation of the point cloud is performed by the 3D instance segmentation module. The 3D coordinates of the object are confirmed by the transformation module and the 3D bounding box estimation module. Results: Evaluated on KITTI benchmark dataset, our method outperforms state of the art by remarkable margins while having real-time capability. Conclusion: We achieve real-time 3D object detection by proposing an improved Convolutional Neural Network (CNN) based on image-driven point clouds.
28

Espinosa-Bernal, Osmar Antonio, Jesús Carlos Pedraza-Ortega, Marco Antonio Aceves-Fernandez, et al. "Quasi/Periodic Noise Reduction in Images Using Modified Multiresolution-Convolutional Neural Networks for 3D Object Reconstructions and Comparison with Other Convolutional Neural Network Models." Computers 13, no. 6 (2024): 145. http://dx.doi.org/10.3390/computers13060145.

Abstract:
The modeling of real objects digitally is an area that has generated a high demand due to the need to obtain systems that are able to reproduce 3D objects from real objects. To this end, several techniques have been proposed to model objects in a computer, with the fringe profilometry technique being the one that has been most researched. However, this technique has the disadvantage of generating Moire noise that ends up affecting the accuracy of the final 3D reconstructed object. In order to try to obtain 3D objects as close as possible to the original object, different techniques have been developed to attenuate the quasi/periodic noise, namely the application of convolutional neural networks (CNNs), a method that has been recently applied for restoration and reduction and/or elimination of noise in images applied as a pre-processing in the generation of 3D objects. For this purpose, this work is carried out to attenuate the quasi/periodic noise in images acquired by the fringe profilometry technique, using a modified CNN-Multiresolution network. The results obtained are compared with the original CNN-Multiresolution network, the UNet network, and the FCN32s network and a quantitative comparison is made using the Image Mean Square Error E (IMMS), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Profile (MSE) metrics.
29

Chen, Boyu, Zhihao Zhang, Nian Liu, Yang Tan, Xinyu Liu, and Tong Chen. "Spatiotemporal Convolutional Neural Network with Convolutional Block Attention Module for Micro-Expression Recognition." Information 11, no. 8 (2020): 380. http://dx.doi.org/10.3390/info11080380.

Abstract:
A micro-expression is defined as an uncontrollable muscular movement shown on the human face when one is trying to conceal or repress one's true emotions. Many researchers have applied the deep learning framework to micro-expression recognition in recent years. However, few have introduced the human visual attention mechanism to micro-expression recognition. In this study, we propose a three-dimensional (3D) spatiotemporal convolutional neural network with the convolutional block attention module (CBAM) for micro-expression recognition. First, image sequences were input to a medium-sized convolutional neural network (CNN) to extract visual features. Afterwards, the network learned to allocate the feature weights in an adaptive manner with the help of a convolutional block attention module. The method was tested on spontaneous micro-expression databases (Chinese Academy of Sciences Micro-expression II (CASME II), Spontaneous Micro-expression Database (SMIC)). The experimental results show that the 3D CNN with convolutional block attention module outperformed other algorithms in micro-expression recognition.
30

Yuan, Q., Y. Ang, and H. Z. M. Shafri. "HYPERSPECTRAL IMAGE CLASSIFICATION USING RESIDUAL 2D AND 3D CONVOLUTIONAL NEURAL NETWORK JOINT ATTENTION MODEL." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIV-M-3-2021 (August 10, 2021): 187–93. http://dx.doi.org/10.5194/isprs-archives-xliv-m-3-2021-187-2021.

Abstract:
Hyperspectral image classification (HSIC) is a challenging task in remote sensing data analysis, which has been applied in many domains for better identification and inspection of the earth surface by extracting spectral and spatial information. The combination of abundant spectral features and accurate spatial information can improve classification accuracy. However, many traditional methods are based on handcrafted features, which brings difficulties for multi-classification tasks due to spectral intra-class heterogeneity and inter-class similarity. The deep learning algorithm, especially the convolutional neural network (CNN), has been perceived as a promising feature extractor and classifier for processing hyperspectral remote sensing images. Although 2D CNN can extract spatial features, the specific spectral properties are not used effectively. 3D CNN can exploit them, but its computational burden increases as layers are stacked. To address these issues, we propose a novel HSIC framework based on the residual CNN network by integrating the advantages of 2D and 3D CNN. First, 3D convolutions focus on extracting spectral features with feature recalibration and refinement by a channel attention mechanism. The 2D depth-wise separable convolution approach with different size kernels concentrates on obtaining multi-scale spatial features and reducing model parameters. Furthermore, the residual structure optimizes the back-propagation for network training. The results and analysis of extensive HSIC experiments show that the proposed residual 2D-3D CNN network can effectively extract spectral and spatial features and improve classification accuracy.
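The parameter saving attributed to depth-wise separable convolution can be checked with a short count. The sketch below uses the standard weight-count formulas for a bias-free 2D layer; the channel and kernel sizes are illustrative and are not taken from the paper.

```python
def standard_conv2d_params(c_in, c_out, k):
    """Weights in a standard k x k 2D convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_conv2d_params(c_in, c_out, k):
    """Weights in a depth-wise k x k convolution plus a 1 x 1 point-wise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = standard_conv2d_params(64, 128, 3)
separable = depthwise_separable_conv2d_params(64, 128, 3)
```

For this example layer the separable form needs 8,768 weights against 73,728 for the standard convolution, roughly an eightfold reduction.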
31

Pastor, Francisco, Juan M. Gandarias, Alfonso J. García-Cerezo, and Jesús M. Gómez-de-Gabriel. "Using 3D Convolutional Neural Networks for Tactile Object Recognition with Robotic Palpation." Sensors 19, no. 24 (2019): 5356. http://dx.doi.org/10.3390/s19245356.

Abstract:
In this paper, a novel method of active tactile perception based on 3D neural networks and a high-resolution tactile sensor installed on a robot gripper is presented. A haptic exploratory procedure based on robotic palpation is performed to get pressure images at different grasping forces that provide information not only about the external shape of the object, but also about its internal features. The gripper consists of two underactuated fingers with a tactile sensor array in the thumb. A new representation of tactile information as 3D tactile tensors is described. During a squeeze-and-release process, the pressure images read from the tactile sensor are concatenated, forming a tensor that contains information about the variation of pressure matrices along with the grasping forces. These tensors are used to feed a 3D Convolutional Neural Network (3D CNN) called 3D TactNet, which is able to classify the grasped object through active interaction. Results show that the 3D CNN performs better and provides better recognition rates with less training data.
32

Mayasari, Dita Ayu, Ihtifazhuddin Hawari, Sheba Atma Dwiyanti, et al. "Convolutional neural network for assisting accuracy of personalized clavicle bone implant designs." International Journal of Electrical and Computer Engineering (IJECE) 14, no. 3 (2024): 3208. http://dx.doi.org/10.11591/ijece.v14i3.pp3208-3219.

Abstract:
The clavicle is a long bone that tends to be frequently fractured in the midshaft region. The plate and screw fixing method is mainly applied to address this issue. This study aims to construct a clavicle bone implant design with a consideration to achieve a high accuracy and high-quality surface between the plate and the clavicle surface. The computational tomography scanning (CT-scan) image series data were processed using a convolutional neural network (CNN) to classify the clavicle image. The CNN outcomes were gathered as three-dimensional (3D) volume data of clavicle bone. This 3D model was then proposed for the plate design. The CNN testing results of 97.4% for the image clavicle bones classification, whereas the prints of the 3D model from clavicle bone and its plate and screw design reveal compatibility between the bone surface and the plate surface. Overall, the CNN application to the series of CT images could ease the classification of clavicle bone images that would precisely construct the 3D model of clavicle bone and its suitable clavicle bone plate design. This study could contribute as a guideline for other bone plate areas that need to fit the patient’s bone geometry.
34

Adhithyaa, N., A. Tamilarasi, D. Sivabalaselvamani, and L. Rahunathan. "Face Positioned Driver Drowsiness Detection Using Multistage Adaptive 3D Convolutional Neural Network." Information Technology and Control 52, no. 3 (2023): 713–30. http://dx.doi.org/10.5755/j01.itc.52.3.33719.

Abstract:
Accidents due to driver drowsiness are increasing at an alarming rate across all countries, and it has become necessary to identify driver drowsiness to reduce accident rates. Researchers have applied many machine learning and deep learning techniques, especially CNN variants, to drowsiness detection, but these are dangerous to use in real time, as the designs fail due to high computational complexity, low evaluation accuracy and low reliability. In this article, we introduce a multistage adaptive 3D-CNN model with multi-expressive features for Driver Drowsiness Detection (DDD) with special attention to system complexity and performance. The proposed architecture is divided into five cascaded stages: (1) a three-level Convolutional Neural Network (CNN) for driver face positioning; (2) 3D-CNN-based Spatio-Temporal (ST) learning to extract 3D features from face-positioned stacked samples; (3) State Understanding (SU) to train 3D-CNN-based drowsiness models; (4) feature fusion using the ST and SU stages; (5) the Drowsiness Detection stage. The proposed system extracts ST values from the face-positioned images and then merges them with the SU results from each state understanding sub-model to create conditional driver facial features for the final Drowsiness Detection (DD) model. The final DD model is trained offline and deployed online; results show that the developed model performs well when compared to others and is additionally capable of handling Indian conditions. This method is applied (trained and evaluated) using two different datasets: our own Kongu Engineering College Driver Drowsiness Detection (KEC-DDD) dataset and the National Tsing Hua University Driver Drowsiness Detection (NTHU-DDD) benchmark dataset. The proposed system trained with the KEC-DDD dataset produces accuracies of 77.45% and 75.91% on the evaluation sets of the KEC-DDD and NTHU-DDD datasets, respectively, and is capable of detecting driver drowsiness from 256×256 resolution images at 39.6 fps at an average of 400 execution seconds.
35

Pagès, Guillaume, Benoit Charmettant, and Sergei Grudinin. "Protein model quality assessment using 3D oriented convolutional neural networks." Bioinformatics 35, no. 18 (2019): 3313–19. http://dx.doi.org/10.1093/bioinformatics/btz122.

Abstract:
Motivation: Protein model quality assessment (QA) is a crucial and yet open problem in structural bioinformatics. The current best methods for single-model QA typically combine results from different approaches, each based on different input features constructed by experts in the field. Then, the prediction model is trained using a machine-learning algorithm. Recently, with the development of convolutional neural networks (CNN), the training paradigm has changed. In computer vision, the expert-developed features have been significantly overpassed by automatically trained convolutional filters. This motivated us to apply a three-dimensional (3D) CNN to the problem of protein model QA. Results: We developed Ornate (Oriented Routed Neural network with Automatic Typing), a novel method for single-model QA. Ornate is a residue-wise scoring function that takes as input 3D density maps. It predicts the local (residue-wise) and the global model quality through a deep 3D CNN. Specifically, Ornate aligns the input density map, corresponding to each residue and its neighborhood, with the backbone topology of this residue. This circumvents the problem of ambiguous orientations of the initial models. Also, Ornate includes automatic identification of atom types and dynamic routing of the data in the network. Established benchmarks (CASP 11 and CASP 12) demonstrate the state-of-the-art performance of our approach among single-model QA methods. Availability and implementation: The method is available at https://team.inria.fr/nano-d/software/Ornate/. It consists of a C++ executable that transforms molecular structures into volumetric density maps, and a Python code based on the TensorFlow framework for applying the Ornate model to these maps. Supplementary information: Supplementary data are available at Bioinformatics online.
36

Gao, Junlong, and Yucheng Wei. "Depression Level Assessment based on 3D CNN and Facial Expression Videos." Advances in Engineering Technology Research 10, no. 1 (2024): 382. http://dx.doi.org/10.56028/aetr.10.1.382.2024.

Abstract:
Depression has a severe impact on people's daily lives and work, and it may even lead to suicide. Computer vision-based methods are promising for providing more effective and objective assistance in the clinical diagnosis of depression. In this article, to compare the performance of different 3D convolutional neural networks in assessing depression levels, we tested 3D VGGNet18, 3D GoogleNet, 3D EfficientNetB7, and 3D MobileNetV3 networks based on the AVEC2013 and AVEC2014 datasets. Experimental results showed that the 3D MobileNetV3 network achieved the best evaluation results, with MAE=7.35 and RMSE=9.16 on the AVEC2013 dataset, and MAE=7.19 and RMSE=9.08 on the AVEC2014 dataset. Compared with other existing methods, 3D ResNet18 demonstrated excellent performance.
37

Jiao, Bo. "Application Analysis of Virtual Simulation Network Model Based on 3D-CNN in Animation Teaching." Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications 15, no. 4 (2024): 11–24. http://dx.doi.org/10.58346/jowua.2024.i4.002.

Abstract:
With the continuous progress of educational technology, the application of animation in teaching has gradually become an effective means to improve learning experience and effectiveness. This study focuses on the application of the 3D-CNN virtual simulation network model in animation teaching, aiming to analyze in depth the impact of this model on the learning process and its potential advantages in improving learning effectiveness. 3D-CNN stands for Three-Dimensional Convolutional Neural Network. Unlike the traditional 2D-CNN, the 3D-CNN is specifically designed for processing 3D data, such as video and volume data. This type of neural network is very useful in processing temporal and spatial information, and is therefore widely used in fields such as video analysis and image processing. By reviewing the history and development of animation in the field of teaching, and based on previous research, a 3D-CNN virtual simulation network model is proposed, and its basic principles and applications in virtual simulation are analyzed. Through experimental analysis of teaching effectiveness, quantitative and qualitative indicators for evaluating academic performance were identified, revealing the specific impact of the model on learning outcomes. Compared with previous research, we find that the 3D-CNN virtual simulation network model has achieved good practical application effects in animation teaching. Based on the survey results on student participation and interest, the potential mechanisms of virtual simulation in improving student interest and active participation were explored. This study provides a reference for promoting the integration of animation teaching and 3D-CNN virtual simulation network models in the field of education.
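To make the difference from a 2D-CNN concrete, the sketch below implements a single 3D convolution (strictly a cross-correlation, as in most deep learning libraries) over a small video-like volume in plain Python. It is a naive illustration, not the model from this paper; the function name and the toy data are hypothetical.

```python
def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a (depth, height, width) volume without padding."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    output = []
    for z in range(depth - kd + 1):
        plane = []
        for y in range(height - kh + 1):
            row = []
            for x in range(width - kw + 1):
                acc = 0.0
                # Sum over the temporal (dz) and both spatial (dy, dx) axes.
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        output.append(plane)
    return output


# Toy clip: 4 frames of 3 x 3 pixels, all ones; 2 x 2 x 2 averaging kernel.
volume = [[[1.0] * 3 for _ in range(3)] for _ in range(4)]
kernel = [[[0.125] * 2 for _ in range(2)] for _ in range(2)]
features = conv3d_valid(volume, kernel)
```

Because the kernel also extends along the first (time or depth) axis, each output value mixes information from neighbouring frames, which a 2D kernel applied frame by frame cannot do.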
38

Alnaim, Norah, Maysam Abbod, and Rafiq Swash. "Recognition of Holoscopic 3D Video Hand Gesture Using Convolutional Neural Networks." Technologies 8, no. 2 (2020): 19. http://dx.doi.org/10.3390/technologies8020019.

Abstract:
The convolutional neural network (CNN) algorithm is one of the efficient techniques to recognize hand gestures. In human–computer interaction, a human gesture is a non-verbal communication mode, as users communicate with a computer via input devices. In this article, 3D micro hand gesture recognition disparity experiments are proposed using CNN. This study includes twelve 3D micro hand motions recorded for three different subjects. The system is validated by an experiment that is implemented on twenty different subjects of different ages. The results are analysed and evaluated based on execution time, training, testing, sensitivity, specificity, positive and negative predictive value, and likelihood. The CNN training results show an accuracy as high as 100%, which represents superior performance in all factors. On the other hand, the validation results average about 99% accuracy. The CNN algorithm has proven to be the most accurate classification tool for micro gesture recognition.
39

Rao, Chengping, and Yang Liu. "Three-dimensional convolutional neural network (3D-CNN) for heterogeneous material homogenization." Computational Materials Science 184 (November 2020): 109850. http://dx.doi.org/10.1016/j.commatsci.2020.109850.

40

Kim, Harin, Sung Woo Joo, Yeon Ho Joo, and Jungsun Lee. "S152. DIAGNOSTIC CLASSIFICATION OF SCHIZOPHRENIA USING 3D CONVOLUTIONAL NEURAL NETWORK WITH RESTING-STATE FUNCTIONAL MRI." Schizophrenia Bulletin 46, Supplement_1 (2020): S94. http://dx.doi.org/10.1093/schbul/sbaa031.218.

Abstract:
Background: Several machine-learning (ML) algorithms have been deployed in the diagnostic classification of schizophrenia. Compared to other ML methods, the 3D convolutional neural network (CNN) has the advantage of learning complex and subtle patterns in data and preserving spatial information, which makes it a more suitable tool for brain imaging data. Although resting-state functional MRI (rsfMRI) data has been used in previous ML studies relating to the diagnostic classification of schizophrenia, a limited number of studies have been conducted using resting-state functional connectivity resulting from group independent component analysis (ICA) and dual regression. The objective of this study was to investigate whether a successful diagnostic classification of schizophrenia vs. healthy controls could be achieved by the 3D CNN using resting-state networks in which areas with a significant group difference in activity existed.
Methods: T1 and rsfMRI data were collected in 46 patients with recent-onset schizophrenia and 22 healthy controls. In the pre-processing steps of rsfMRI, the ICA-based automatic removal of motion artifacts was applied to subject-level ICA results and the resulting rsfMRI data were temporally concatenated for group ICA and dual regression. The executive control and auditory networks had areas with significantly higher activity in the control group compared with the patient group. The independent components (ICs) corresponding to the executive control and auditory networks were used as input for the 3D CNN model, which was developed to discriminate the schizophrenia patients from the healthy controls.
Results: The 3D CNN model using the executive control and auditory networks as inputs showed classification accuracies of approximately 65–70% and error rates of approximately 30–35%.
Discussion: Our findings suggest that the 3D CNN model using rsfMRI data can be useful for learning patterns implicated in schizophrenia and identifying discriminative patterns of schizophrenia in brain imaging data.
41

Jain, Nidhi, and Prasadu Peddi. "Facial Expression Identification using Two-Stream Convolutional Neural Networks (TSCNNs) and Inception 3D Convolutional Neural Network (CNN)." International Journal of Renewable Energy Exchange 11, no. 5 (2023): 113–22. http://dx.doi.org/10.58443/ijrex.11.5.2023.113-122.

42

Bashi, Omar I. Dallal, Husamuldeen K. Hameed, Yasir Mahmood Al Kubaiaisi, and Ahmad H. Sabry. "Development of object detection from point clouds of a 3D dataset by Point-Pillars neural network." Eastern-European Journal of Enterprise Technologies 2, no. 9 (122) (2023): 26–33. http://dx.doi.org/10.15587/1729-4061.2023.275155.

Abstract:
Deep learning algorithms are able to automatically handle point clouds over a broad range of 3D imaging implementations. They have applications in advanced driver assistance systems, perception and robot navigation, scene classification, surveillance, stereo vision, and depth estimation. According to prior studies, the detection of objects from point clouds of a 3D dataset with acceptable accuracy is still a challenging task. The Point-Pillars technique is used in this work to detect a 3D object employing 2D convolutional neural network (CNN) layers. Point-Pillars architecture includes a learnable encoder to use Point-Nets for learning a demonstration of point clouds structured with vertical columns (pillars). The Point-Pillars architecture operates a 2D CNN to decode the predictions, create network estimations, and create 3D envelop boxes for various object labels like pedestrians, trucks, and cars. This study aims to detect objects from point clouds of a 3D dataset by Point-Pillars neural network architecture that makes it possible to detect a 3D object by means of 2D convolutional neural network (CNN) layers. The method includes producing a sparse pseudo-image from a point cloud using a feature encoder, using a 2D convolution backbone to process the pseudo-image into high-level, and using detection heads to regress and detect 3D bounding boxes. This work utilizes an augmentation for ground truth data as well as additional augmentations of global data methods to include further diversity in the data training and associating packs. The obtained results demonstrated that the average orientation similarity (AOS) and average precision (AP) were 0.60989, 0.61157 for trucks, and 0.74377, 0.75569 for cars.
44

Kim, Sungkyu, Tae-Seong Kim, and Won Hee Lee. "Accelerating 3D Convolutional Neural Network with Channel Bottleneck Module for EEG-Based Emotion Recognition." Sensors 22, no. 18 (2022): 6813. http://dx.doi.org/10.3390/s22186813.

Abstract:
Deep learning-based emotion recognition using EEG has received increasing attention in recent years. The existing studies on emotion recognition show great variability in their employed methods including the choice of deep learning approaches and the type of input features. Although deep learning models for EEG-based emotion recognition can deliver superior accuracy, it comes at the cost of high computational complexity. Here, we propose a novel 3D convolutional neural network with a channel bottleneck module (CNN-BN) model for EEG-based emotion recognition, with the aim of accelerating the CNN computation without a significant loss in classification accuracy. To this end, we constructed a 3D spatiotemporal representation of EEG signals as the input of our proposed model. Our CNN-BN model extracts spatiotemporal EEG features, which effectively utilize the spatial and temporal information in EEG. We evaluated the performance of the CNN-BN model in the valence and arousal classification tasks. Our proposed CNN-BN model achieved an average accuracy of 99.1% and 99.5% for valence and arousal, respectively, on the DEAP dataset, while significantly reducing the number of parameters by 93.08% and FLOPs by 94.94%. The CNN-BN model with fewer parameters based on 3D EEG spatiotemporal representation outperforms the state-of-the-art models. Our proposed CNN-BN model with a better parameter efficiency has excellent potential for accelerating CNN-based emotion recognition without losing classification performance.
45

Wang, Haiying. "Recognition of Wrong Sports Movements Based on Deep Neural Network." Revue d'Intelligence Artificielle 34, no. 5 (2020): 663–71. http://dx.doi.org/10.18280/ria.340518.

Abstract:
During physical education (PE), the teaching quality is severely affected by problems like nonstandard technical movements or wrong demonstrative movements. High-speed photography can capture instantaneous movements that cannot be recognized with the naked eye. Therefore, this technology has been widely used to judge the sprint movements in track and field competitions, and assess the quality of artistic gymnastics. Inspired by three-dimensional (3D) image analysis, this paper proposes a method to recognize standard and wrong demonstrative sports movements, based on a 3D convolutional neural network (CNN) and graph theory. Firstly, a 3D posture perception strategy for demonstrative sports movements was constructed based on video sequences. Next, the authors provided the framework of the recognition system for standard and wrong demonstrative sports movements. After that, a 3D CNN was established to distinguish between standard and wrong demonstrative sports movements. The proposed method was proved effective and superior through experiments. The research results provide a good reference for the application of 3D image analysis in the recognition of other body behaviors and movements.
46

Cui, Liyuan, Shanhua Han, Shouliang Qi, Yang Duan, Yan Kang, and Yu Luo. "Deep symmetric three-dimensional convolutional neural networks for identifying acute ischemic stroke via diffusion-weighted images." Journal of X-Ray Science and Technology 29, no. 4 (2021): 551–66. http://dx.doi.org/10.3233/xst-210861.

Abstract:
BACKGROUND: Acute ischemic stroke (AIS) results in high morbidity, disability, and mortality. Early and automatic diagnosis of AIS can help clinicians administer the appropriate interventions. OBJECTIVE: To develop a deep symmetric 3D convolutional neural network (DeepSym-3D-CNN) for automated AIS diagnosis via diffusion-weighted imaging (DWI) images. METHODS: This study includes 190 study subjects (97 AIS and 93 Non-AIS) by collecting both DWI and Apparent Diffusion Coefficient (ADC) images. 3D DWI brain images are split into left and right hemispheres and input into two paths. A map with 125×253×14×12 features is extracted by each path of Inception Modules. After the features computed from two paths are subtracted through L-2 normalization, four multi-scale convolution layers produce the final prediction. Three comparative models using DWI images including MedicalNet with transfer learning, Simple DeepSym-3D-CNN (each 3D Inception Module is replaced by a simple 3D-CNN layer), and L-1 DeepSym-3D-CNN (L-2 normalization is replaced by L-1 normalization) are constructed. Moreover, using ADC images and the combination of DWI and ADC images as inputs, the performance of DeepSym-3D-CNN is also investigated. Performance levels of all three models are evaluated by 5-fold cross-validation and the values of area under ROC curve (AUC) are compared by DeLong’s test. RESULTS: DeepSym-3D-CNN achieves an accuracy of 0.850 and an AUC of 0.864. DeLong’s test of AUC values demonstrates that DeepSym-3D-CNN significantly outperforms other comparative models (p < 0.05). The highlighted regions in the feature maps of DeepSym-3D-CNN spatially match with AIS lesions. Meanwhile, DeepSym-3D-CNN using DWI images presents a significantly higher AUC than that using either ADC images or DWI-ADC images based on DeLong’s test (p < 0.05).
CONCLUSIONS: DeepSym-3D-CNN is a potential method for automatically identifying AIS via DWI images and can be extended to other diseases with asymmetric lesions.
47

Suryakanth, B., and S. A. Hari Prasad. "3D CNN-Residual Neural Network Based Multimodal Medical Image Classification." WSEAS TRANSACTIONS ON BIOLOGY AND BIOMEDICINE 19 (October 31, 2022): 204–14. http://dx.doi.org/10.37394/23208.2022.19.22.

Abstract:
Multimodal medical imaging has become very common in the area of biomedical imaging. Medical image classification has been used to extract useful data from multimodality medical image data. Magnetic resonance imaging (MRI) and computed tomography (CT) are some of the imaging methods. Different imaging technologies provide different imaging information for the same part. Traditional ways of illness classification are effective, but in today's environment, 3D images are used to identify diseases. In comparison to 1D and 2D images, 3D images offer a much clearer view. The proposed method uses a 3D Residual Convolutional Neural Network (CNN ResNet) for 3D image classification. Various methods are available for classifying the disease, like clustering, KNN, and ANN. Traditional techniques are not trained to classify 3D images, so an advanced approach is introduced in the proposed method to predict the 3D images. Initially, the multimodal 2D medical image data is taken. This 2D input image is turned into 3D image data because 3D images give more information than 2D image data. Then the 3D CT and MRI images are fused, and the combined image is filtered using guided filtering for further processing. The fused image is then augmented. Finally, this fused image is fed to the 3DCNN ResNet for classification purposes. The 3DCNN ResNet classifies the image data and produces the output as five different stages of the disease. The proposed method achieves an accuracy of 98%. Thus the designed model has predicted the stage of the disease in an effective manner.
48

Al-Khuzaie, Maryam I. Mousa, and Waleed A. Mahmoud Al-Jawher. "Enhancing Brain Tumor Classification with a Novel Three-Dimensional Convolutional Neural Network (3D-CNN) Fusion Model." Journal Port Science Research 7, no. 3 (2024): 254–67. http://dx.doi.org/10.36371/port.2024.3.5.

Abstract:
Three-dimensional convolutional neural networks (3D CNNs) have been widely applied to analyze brain tumour (BT) images to better understand the disease's progression. It is well-known that training a 3D-CNN is computationally expensive and carries a risk of overfitting due to the small sample size available in the medical imaging field. Here, we proposed a novel 2D-3D approach by converting a 2D brain image to a 3D fused image using a gradient of the image Learnable Weighted. By the 2D-to-3D conversion, the proposed model can easily forward the fused 3D image through a pre-trained 3D model while achieving better performance over different 3D baselines. We used VGG16 for feature extraction in the implementation as it outperformed other 3D CNN backbones. We further showed that the weights of the slices are location-dependent, and the model performance relies on the 3D-to-2D fusion view, with the best outcomes from the coronal view. With the new approach, we increased the accuracy to 0.88, compared with conventional 3D CNNs, for classifying brain tumour images. The novel 2D-3D model may have profound implications for future timely BT diagnosis in clinical settings.
49

Qing, Yuhao, and Wenyi Liu. "Hyperspectral Image Classification Based on Multi-Scale Residual Network with Attention Mechanism." Remote Sensing 13, no. 3 (2021): 335. http://dx.doi.org/10.3390/rs13030335.

Abstract:
In recent years, image classification on hyperspectral imagery utilizing deep learning algorithms has attained good results. Thus, spurred by that finding and to further improve the deep learning classification accuracy, we propose a multi-scale residual convolutional neural network model fused with an efficient channel attention network (MRA-NET) that is appropriate for hyperspectral image classification. The suggested technique comprises a multi-staged architecture, where initially the spectral information of the hyperspectral image is reduced into a two-dimensional tensor, utilizing a principal component analysis (PCA) scheme. Then, the constructed low-dimensional image is input to our proposed ECA-NET deep network, which exploits the advantages of its core components, i.e., multi-scale residual structure and attention mechanisms. We evaluate the performance of the proposed MRA-NET on three publicly available hyperspectral datasets and demonstrate that, overall, the classification accuracy of our method is 99.82%, 99.81%, and 99.37%, respectively, which is higher than the corresponding accuracy of current networks such as 3D convolutional neural network (CNN), three-dimensional residual convolution structure (RES-3D-CNN), and space–spectrum joint deep network (SSRN).
50

S, Hemnath, and Geetha Ramalingam. "Comparing the Performance of Accuracy Using 3D CNN Model with the Fixed Spatial Transform With 3D CNN Model for the Detection of Pulmonary Nodules." E3S Web of Conferences 399 (2023): 09003. http://dx.doi.org/10.1051/e3sconf/202339909003.

Abstract:
Aim: The research study aims to assess the accuracy of pulmonary nodule detection using a convolutional neural network (CNN). It compares the Novel 3D CNN-fixed spatial transform algorithm with the Novel 3D CNN Model algorithm for accurate detection. Materials and Methods: The information for this study was obtained from the Kaggle website. The samples were taken into consideration as (N=20) for 3D CNN-fixed spatial transform and (N=20) for 3D CNN Model; according to the clinical. com, total sample size calculation was performed. Python software is used for accurate detection. Threshold Alpha is 0.05 %, G power is 80% and the enrollment ratio is set to 1. Result: This research study found that the 3D CNN, with 89.29% accuracy, is preferred over the 3D CNN with fixed spatial transform, which gives 78.5% accuracy, with a significance value (p=0.001), (p<0.05) and a 95% confidence interval. There is statistical significance between the two groups. Conclusion: The mean value of 3D CNN-fixed spatial transform is 78.5% and that of Novel 3D CNN is 89.29%. Novel 3D CNN appears to give better accuracy than 3D CNN-fixed spatial transform.