Academic literature on the topic '2D Convolution Neural Network (CNN)'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic '2D Convolution Neural Network (CNN).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Journal articles on the topic "2D Convolution Neural Network (CNN)"

1

Liao, Shengbin, Xiaofeng Wang, and ZongKai Yang. "A heterogeneous two-stream network for human action recognition." AI Communications 36, no. 3 (2023): 219–33. http://dx.doi.org/10.3233/aic-220188.

Full text
Abstract:
The most widely used two-stream architectures and building blocks for human action recognition in videos generally consist of 2D or 3D convolutional neural networks. 3D convolution can capture motion information between video frames, which is essential for video classification. 3D convolutional neural networks usually obtain better performance than 2D ones; however, they also increase computational cost. In this paper, we propose a heterogeneous two-stream architecture which incorporates two convolutional networks. One uses a mixed convolution network (MCN), which inserts some 3D convolutions in the middle of 2D convolutions, to train on RGB frames; the other adopts the BN-Inception network to train on optical-flow frames. Considering the redundancy of neighboring video frames, we adopt a sparse sampling strategy to decrease the computational cost. Our architecture is trained and evaluated on the standard video action benchmarks HMDB51 and UCF101. Experimental results show our approach obtains state-of-the-art performance on HMDB51 (73.04%) and UCF101 (95.27%).
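The sparse sampling strategy the abstract mentions can be sketched as segment-based sampling, i.e. one frame drawn from each of K equal-length segments of the video (the segment count and frame numbers below are invented for illustration, not taken from the paper):

```python
import numpy as np

def sparse_sample(num_frames, num_segments, seed=0):
    """Split a video into equal segments and pick one random frame index
    from each, reducing redundancy between neighbouring frames."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(0, num_frames, num_segments + 1, dtype=int)
    return [int(rng.integers(lo, hi)) for lo, hi in zip(edges[:-1], edges[1:])]

indices = sparse_sample(num_frames=300, num_segments=8)
print(indices)  # 8 frame indices, one per ~37-frame segment
```

Only these sampled frames are forwarded through the network, which is where the computational saving comes from.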
APA, Harvard, Vancouver, ISO, and other styles
2

Wang, Baiyang, Dongyue Huo, Yuyun Kang, and Jie Sun. "AGV Status Monitoring and Fault Diagnosis based on CNN." Journal of Physics: Conference Series 2281, no. 1 (2022): 012019. http://dx.doi.org/10.1088/1742-6596/2281/1/012019.

Full text
Abstract:
In order to address the complexity and low accuracy of AGV fault detection systems, a convolutional neural network (CNN)-based status monitoring and fault diagnosis method for automatic guided vehicles (AGVs) is proposed. Firstly, the vibration signals of the core components of the AGV are converted into two-dimensional (2D) images. Secondly, the 2D images are input into a convolutional neural network for training. Finally, the trained model is used to monitor the running status of the AGV and identify faults. The results show that the proposed method can effectively monitor the status of an AGV in operation.
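The signal-to-image conversion step can be sketched as folding a window of the 1D vibration signal into a square grayscale image (the 64x64 size and min-max scaling are assumptions; the paper's exact mapping may differ):

```python
import numpy as np

def signal_to_image(signal, size=64):
    """Fold one window of a 1D vibration signal into a size x size
    grayscale image, min-max scaled to the 0..255 range."""
    window = signal[: size * size].reshape(size, size)
    lo, hi = window.min(), window.max()
    scaled = (window - lo) / (hi - lo + 1e-12) * 255.0
    return scaled.astype(np.uint8)

# Stand-in vibration signal; a real AGV signal would come from a sensor.
sig = np.sin(np.linspace(0, 100 * np.pi, 64 * 64))
img = signal_to_image(sig)
```

The resulting image can then be fed to any standard 2D CNN classifier.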
APA, Harvard, Vancouver, ISO, and other styles
3

Chidester, Benjamin, Tianming Zhou, Minh N. Do, and Jian Ma. "Rotation equivariant and invariant neural networks for microscopy image analysis." Bioinformatics 35, no. 14 (2019): i530–i537. http://dx.doi.org/10.1093/bioinformatics/btz353.

Full text
Abstract:
Motivation: Neural networks have been widely used to analyze high-throughput microscopy images. However, the performance of neural networks can be significantly improved by encoding known invariance for particular tasks. Highly relevant to the goal of automated cell phenotyping from microscopy image data is rotation invariance. Here we consider the application of two schemes for encoding rotation equivariance and invariance in a convolutional neural network, namely, the group-equivariant CNN (G-CNN), and a new architecture with simple, efficient conic convolution, for classifying microscopy images. We additionally integrate the 2D-discrete-Fourier transform (2D-DFT) as an effective means for encoding global rotational invariance. We call our new method the Conic Convolution and DFT Network (CFNet). Results: We evaluated the efficacy of CFNet and G-CNN as compared to a standard CNN for several different image classification tasks, including simulated and real microscopy images of subcellular protein localization, and demonstrated improved performance. We believe CFNet has the potential to improve many high-throughput microscopy image analysis applications. Availability and implementation: Source code of CFNet is available at: https://github.com/bchidest/CFNet. Supplementary information: Supplementary data are available at Bioinformatics online.
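The idea of building rotation invariance into a network can be illustrated with group pooling in miniature: correlate the input with all four 90-degree rotations of one filter and take the maximum, which is unchanged under 90-degree rotations of the input. This is a toy stand-in, not the paper's conic convolution or 2D-DFT scheme:

```python
import numpy as np

def rot_invariant_feature(img, filt):
    """Correlate the image with the four 90-degree rotations of one filter
    and max-pool over them; rotating the input by a multiple of 90 degrees
    only permutes the four responses, so the max is invariant."""
    responses = [float(np.sum(img * np.rot90(filt, k))) for k in range(4)]
    return max(responses)

img = np.eye(5)
filt = np.arange(25.0).reshape(5, 5)
print(rot_invariant_feature(img, filt) == rot_invariant_feature(np.rot90(img), filt))
```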
APA, Harvard, Vancouver, ISO, and other styles
4

Feng, Yuchao, Jianwei Zheng, Mengjie Qin, Cong Bai, and Jinglin Zhang. "3D Octave and 2D Vanilla Mixed Convolutional Neural Network for Hyperspectral Image Classification with Limited Samples." Remote Sensing 13, no. 21 (2021): 4407. http://dx.doi.org/10.3390/rs13214407.

Full text
Abstract:
Owing to the outstanding feature extraction capability, convolutional neural networks (CNNs) have been widely applied in hyperspectral image (HSI) classification problems and have achieved an impressive performance. However, it is well known that 2D convolution suffers from the absent consideration of spectral information, while 3D convolution requires a huge amount of computational cost. In addition, the cost of labeling and the limitation of computing resources make it urgent to improve the generalization performance of the model with scarcely labeled samples. To relieve these issues, we design an end-to-end 3D octave and 2D vanilla mixed CNN, namely Oct-MCNN-HS, based on the typical 3D-2D mixed CNN (MCNN). It is worth mentioning that two feature fusion operations are deliberately constructed to climb the top of the discriminative features and practical performance. That is, 2D vanilla convolution merges the feature maps generated by 3D octave convolutions along the channel direction, and homology shifting aggregates the information of the pixels locating at the same spatial position. Extensive experiments are conducted on four publicly available HSI datasets to evaluate the effectiveness and robustness of our model, and the results verify the superiority of Oct-MCNN-HS both in efficacy and efficiency.
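The "merge along the channel direction" step that bridges the 3D and 2D parts of a mixed CNN is, at its core, a reshape that folds the spectral-band axis into the channel axis; the tensor sizes here are invented for illustration:

```python
import numpy as np

# Output of a 3D conv stage: (channels, spectral bands, height, width).
feat3d = np.arange(8 * 12 * 5 * 5, dtype=float).reshape(8, 12, 5, 5)

# Fold bands into channels so a 2D convolution can consume the tensor:
# (channels, bands, H, W) -> (channels * bands, H, W).
feat2d = feat3d.reshape(8 * 12, 5, 5)

print(feat2d.shape)  # (96, 5, 5)
```

A subsequent 2D convolution then mixes spectral and spatial information jointly across the 96 merged channels.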
APA, Harvard, Vancouver, ISO, and other styles
5

M, Sowmya, Balasubramanian M, and Vaidehi K. "Human Behavior Classification using 2D – Convolutional Neural Network, VGG16 and ResNet50." Indian Journal of Science and Technology 16, no. 16 (2023): 1221–29. https://doi.org/10.17485/IJST/v16i16.199.

Full text
Abstract:
Abstract <strong>Objective:</strong>&nbsp;To develop a real-time application for human behavior classification using 2- Dimensional Convolution Neural Network, VGG16 and ResNet50.&nbsp;<strong>Methods:</strong>&nbsp;This study provides a novel system which considers sitting, standing and walking as normal human behaviors. It consists of three major steps: dataset collection, training, and testing. In this work real time images are used. In human behavior classification dataset there are 2271 trained images and 539 testing images.&nbsp;<strong>Findings:</strong>&nbsp;The Convolution Neural Network (CNN), VGG16 and ResNet50 are trained using human normal behavior images.&nbsp;<strong>Novelty:</strong>&nbsp;The dataset namely human behavior classification dataset is used in this work and the experimental results has shown that on human behavior classification ResNet50 has outperformed with accuracy of 99.72% compared to VGG16 and 2D-CNN. This work can detect the three normal behaviors of humans in an unconstrained laboratory environment. <strong>Keywords:</strong> Deep Learning; 2D Convolution Neural Network (CNN); Human Behavior Classification; ADAM Optimizer; VGG16; ResNet50
APA, Harvard, Vancouver, ISO, and other styles
6

Yuan, Q., Y. Ang, and H. Z. M. Shafri. "HYPERSPECTRAL IMAGE CLASSIFICATION USING RESIDUAL 2D AND 3D CONVOLUTIONAL NEURAL NETWORK JOINT ATTENTION MODEL." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIV-M-3-2021 (August 10, 2021): 187–93. http://dx.doi.org/10.5194/isprs-archives-xliv-m-3-2021-187-2021.

Full text
Abstract:
Hyperspectral image classification (HSIC) is a challenging task in remote sensing data analysis, which has been applied in many domains for better identification and inspection of the earth's surface by extracting spectral and spatial information. The combination of abundant spectral features and accurate spatial information can improve classification accuracy. However, many traditional methods are based on handcrafted features, which brings difficulties for multi-classification tasks due to spectral intra-class heterogeneity and inter-class similarity. Deep learning algorithms, especially the convolutional neural network (CNN), have been perceived as promising feature extractors and classifiers for processing hyperspectral remote sensing images. Although a 2D CNN can extract spatial features, the specific spectral properties are not used effectively; a 3D CNN can capture both, but the computational burden increases as layers are stacked. To address these issues, we propose a novel HSIC framework based on a residual CNN network that integrates the advantages of 2D and 3D CNNs. First, 3D convolutions focus on extracting spectral features, with feature recalibration and refinement by a channel attention mechanism. A 2D depth-wise separable convolution approach with different kernel sizes concentrates on obtaining multi-scale spatial features and reducing model parameters. Furthermore, the residual structure optimizes back-propagation for network training. The results and analysis of extensive HSIC experiments show that the proposed residual 2D-3D CNN network can effectively extract spectral and spatial features and improve classification accuracy.
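The parameter saving behind the abstract's depth-wise separable convolutions is easy to verify by counting weights; the channel and kernel sizes below are arbitrary examples, not the paper's layer sizes:

```python
# Parameter counts (ignoring biases) for one layer mapping 64 input
# feature maps to 128 output maps with a 3x3 kernel.
c_in, c_out, k = 64, 128, 3

standard  = c_in * c_out * k * k      # dense 3x3 conv: 73728 weights
depthwise = c_in * k * k              # one 3x3 filter per input channel
pointwise = c_in * c_out              # 1x1 conv that mixes channels
separable = depthwise + pointwise     # 576 + 8192 = 8768 weights

print(standard, separable)  # 73728 8768 -> roughly 8.4x fewer parameters
```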
APA, Harvard, Vancouver, ISO, and other styles
7

Gao, Wenqiang, Zhiyun Xiao, and Tengfei Bao. "Detection and Identification of Potato-Typical Diseases Based on Multidimensional Fusion Atrous-CNN and Hyperspectral Data." Applied Sciences 13, no. 8 (2023): 5023. http://dx.doi.org/10.3390/app13085023.

Full text
Abstract:
As one of the world's most crucial crops, the potato is an essential source of nutrition for human activities. However, several diseases pose a severe threat to the yield and quality of potatoes, so timely and accurate detection and identification of potato diseases are of great importance. Hyperspectral imaging has emerged as an essential tool that provides rich spectral and spatial distribution information and has been widely used in potato disease detection and identification. Nevertheless, the accuracy of prediction is often low when processing hyperspectral data using a one-dimensional convolutional neural network (1D-CNN). Additionally, conventional three-dimensional convolutional neural networks (3D-CNN) often require high hardware consumption while processing hyperspectral data. In this paper, we propose an Atrous-CNN network structure that fuses multiple dimensions to address these problems. The proposed structure combines the spectral information extracted by a 1D-CNN, the spatial information extracted by a 2D-CNN, and the spatial-spectral information extracted by a 3D-CNN. To enlarge the receptive field of the convolution kernel and reduce the loss of hyperspectral data, atrous (dilated) convolution is utilized in the 1D-CNN and 2D-CNN to extract data features. We tested the proposed structure on three real-world potato diseases and achieved recognition accuracy of up to 0.9987. The algorithm presented in this paper effectively extracts hyperspectral data feature information using three CNNs of different dimensionality, leading to higher recognition accuracy and reduced hardware consumption. Therefore, it is feasible to use the proposed network and hyperspectral image technology for potato plant disease identification.
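Atrous (dilated) convolution enlarges the receptive field without adding parameters by inserting gaps between kernel taps; a minimal 1D sketch (not the paper's implementation):

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Valid 1D convolution with 'holes': dilation d makes a k-tap kernel
    span (k - 1) * d + 1 input samples instead of k."""
    k = len(w)
    span = (k - 1) * dilation + 1
    return np.array([
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])

x = np.arange(10.0)
print(dilated_conv1d(x, [1.0, 1.0, 1.0], dilation=1))  # kernel spans 3 samples
print(dilated_conv1d(x, [1.0, 1.0, 1.0], dilation=3))  # same kernel spans 7 samples
```

With dilation 3, each output already aggregates information 6 samples apart, which is the receptive-field gain the abstract relies on.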
APA, Harvard, Vancouver, ISO, and other styles
8

Leong, Mei Chee, Dilip K. Prasad, Yong Tsui Lee, and Feng Lin. "Semi-CNN Architecture for Effective Spatio-Temporal Learning in Action Recognition." Applied Sciences 10, no. 2 (2020): 557. http://dx.doi.org/10.3390/app10020557.

Full text
Abstract:
This paper introduces a fusion convolutional architecture for efficient learning of spatio-temporal features in video action recognition. Unlike 2D convolutional neural networks (CNNs), 3D CNNs can be applied directly to consecutive frames to extract spatio-temporal features. The aim of this work is to fuse the convolution layers from 2D and 3D CNNs to allow temporal encoding with fewer parameters than 3D CNNs. We adopt transfer learning from pre-trained 2D CNNs for spatial extraction, followed by temporal encoding, before connecting to 3D convolution layers at the top of the architecture. We construct our fusion architecture, semi-CNN, based on three popular models: VGG-16, ResNets and DenseNets, and compare the performance with their corresponding 3D models. Our empirical results, evaluated on the action recognition dataset UCF-101, demonstrate that our fusion of 1D, 2D and 3D convolutions outperforms the 3D model of the same depth, with fewer parameters and reduced overfitting. Our semi-CNN architecture achieved an average boost of 16–30% in top-1 accuracy when evaluated on input videos of 16 frames.
APA, Harvard, Vancouver, ISO, and other styles
9

Liang, Lianhui, Shaoquan Zhang, Jun Li, Antonio Plaza, and Zhi Cui. "Multi-Scale Spectral-Spatial Attention Network for Hyperspectral Image Classification Combining 2D Octave and 3D Convolutional Neural Networks." Remote Sensing 15, no. 7 (2023): 1758. http://dx.doi.org/10.3390/rs15071758.

Full text
Abstract:
Traditional convolutional neural networks (CNNs) can be applied to obtain spectral-spatial feature information from hyperspectral images (HSIs). However, they often introduce significant redundant spatial feature information. The octave convolution network is frequently utilized instead of a traditional CNN to decrease the network's redundant spatial information and extend its receptive field. However, 3D octave convolution-based approaches may introduce extensive parameters and complicate the network. To solve these issues, we propose a new HSI classification approach with a multi-scale spectral-spatial network-based framework that combines 2D octave and 3D CNNs. Our method, called MOCNN, first utilizes 2D octave convolution and 3D DenseNet branch networks with various convolutional kernel sizes to obtain complex spatial contextual feature information and spectral characteristics, separately. Moreover, channel and spectral attention mechanisms are applied, respectively, to these two branch networks to emphasize significant feature regions and certain important spectral bands that carry discriminative information for the categorization. Furthermore, a sample balancing strategy is applied to address the sample imbalance problem. Extensive experiments are undertaken on four HSI datasets, demonstrating that our MOCNN approach outperforms several other methods for HSI classification, especially in scenarios dominated by limited and imbalanced sample data.
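The channel attention mechanism mentioned above can be sketched in a squeeze-and-excitation style: pool each channel to a scalar, squash it to (0, 1), and rescale the channel. The real gating network has learned weights; this toy version gates directly on the pooled activations:

```python
import numpy as np

def channel_attention(feat):
    """Rescale each channel of a (C, H, W) feature map by a sigmoid gate
    computed from its global average (a minimal attention sketch)."""
    pooled = feat.mean(axis=(1, 2))            # (C,) channel descriptors
    gate = 1.0 / (1.0 + np.exp(-pooled))       # sigmoid gate per channel
    return feat * gate[:, None, None]

feat = np.random.default_rng(1).normal(size=(4, 8, 8))
out = channel_attention(feat)
print(out.shape)  # (4, 8, 8) -- same shape, channels re-weighted
```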
APA, Harvard, Vancouver, ISO, and other styles
10

Archana, D. "Brain Tumor Detection Using Convolution Neural Networks." International Journal of Scientific Research in Engineering and Management 9, no. 5 (2025): 1–9. https://doi.org/10.55041/ijsrem46974.

Full text
Abstract:
Early diagnosis of brain tumors is important for improving patient prognoses; however, traditional diagnostic methods like biopsies require invasive surgical procedures. In this paper, we introduce two deep learning-based methods—a new two-dimensional Convolutional Neural Network (CNN) and a convolutional auto-encoder network—that enable the accurate classification of brain tumors from magnetic resonance imaging (MRI). A data set of 7,000 T1-weighted contrast-enhanced MRI images was utilized, including glioma, meningioma, pituitary gland tumor, and normal brain samples. Preprocessing and augmentation procedures were carried out on the data set in order to enhance the generalization ability of the models. The suggested architecture of the 2D CNN includes eight convolutional layers, four pooling layers, and utilizes batch normalization with a uniform 2×2 kernel across the network. The auto-encoder architecture combines feature extraction and classification by utilizing the last output of the encoder. Experimental results show that the 2D CNN achieved a training accuracy of 96.47% with an average recall of 95%, showing good performance and efficiency in computation. The simplicity and effectiveness of the proposed CNN make it a promising tool for real-time clinical applications, offering a non-surgical and highly reliable tool for the assistance of radiologists in the diagnosis of brain tumors.
APA, Harvard, Vancouver, ISO, and other styles
More sources

Dissertations / Theses on the topic "2D Convolution Neural Network (CNN)"

1

Kapoor, Rishika. "Malaria Detection Using Deep Convolution Neural Network." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1613749143868579.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Shuvo, Md Kamruzzaman. "Hardware Efficient Deep Neural Network Implementation on FPGA." OpenSIUC, 2020. https://opensiuc.lib.siu.edu/theses/2792.

Full text
Abstract:
In recent years, there has been a significant push to implement Deep Neural Networks (DNNs) on edge devices, which requires power- and hardware-efficient circuits to carry out the intensive matrix-vector multiplication (MVM) operations. This work presents hardware-efficient MVM implementation techniques using bit-serial arithmetic and a novel MSB-first computation circuit. The proposed designs take advantage of the pre-trained network weight parameters, which are already known at the design stage. Thus, partial computation results can be pre-computed and stored in look-up tables, and the MVM results can then be computed in a bit-serial manner without using multipliers. The proposed circuit implementation for the convolution filters and rectified linear activation function used in deep neural networks conducts computation in an MSB-first bit-serial manner. It can predict early whether the outcome of a filter computation will be negative and subsequently terminate the remaining computation to save power. The benefits of the proposed MVM implementation techniques are demonstrated by comparing the proposed design with a conventional implementation. The proposed circuit is implemented on an FPGA and shows significant power and performance improvements compared to conventional designs implemented on the same FPGA.
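The LUT-based, multiplier-free MVM idea is classic distributed arithmetic; here is a software sketch of one dot product that consumes the input bits MSB-first, as the thesis describes (the early-termination trick for predicted-negative outcomes is omitted):

```python
def da_dot(weights, x, bits=8):
    """Dot product without multipliers: precompute all 2^N subset sums of
    the fixed weights into a LUT, then shift-accumulate one input bit-plane
    at a time, most significant bit first (distributed arithmetic)."""
    n = len(weights)
    # LUT entry p holds the sum of weights[j] for which bit j of p is set.
    lut = [sum(weights[j] for j in range(n) if (p >> j) & 1)
           for p in range(2 ** n)]
    acc = 0
    for b in range(bits - 1, -1, -1):                  # MSB-first bit planes
        pattern = sum(((x[j] >> b) & 1) << j for j in range(n))
        acc = (acc << 1) + lut[pattern]                # shift, then accumulate
    return acc

w = [3, -2, 5, 1]        # weights known at design time -> LUT is precomputable
x = [17, 4, 250, 9]      # unsigned 8-bit inputs
print(da_dot(w, x))      # 1302, matching the direct dot product
```

In hardware, the LUT lookup plus shift-accumulate replaces N multipliers with one small memory and an adder.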
APA, Harvard, Vancouver, ISO, and other styles
3

Ďuriš, Denis. "Detekce ohně a kouře z obrazového signálu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-412968.

Full text
Abstract:
This diploma thesis deals with the detection of fire and smoke from the image signal. The approach of this work uses a combination of convolutional and recurrent neural network. Machine learning models created in this work contain inception modules and blocks of long short-term memory. The research part describes selected models of machine learning used in solving the problem of fire detection in static and dynamic image data. As part of the solution, a data set containing videos and still images used to train the designed neural networks was created. The results of this approach are evaluated in conclusion.
APA, Harvard, Vancouver, ISO, and other styles
4

Andersson, Viktor. "Semantic Segmentation : Using Convolutional Neural Networks and Sparse dictionaries." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139367.

Full text
Abstract:
The two main bottlenecks in using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network, introducing the use of sparse dictionaries. A sparse dictionary optimized on domain-specific data can be seen as a set of intelligent feature-extracting filters. This thesis investigates the effect of using such filters as kernels in the convolutional layers of the neural network: how do they affect the training time and final performance? The dataset used here is the Cityscapes dataset, a library of 25,000 labeled road-scene images. The sparse dictionary was acquired using the K-SVD method. The filters were added to two different networks whose performance was tested individually, one architecture being much deeper than the other. The results, presented for both networks, show that filter initialization is an important aspect which should be taken into consideration when training deep networks for semantic segmentation.
APA, Harvard, Vancouver, ISO, and other styles
5

Sparr, Henrik. "Object detection for a robotic lawn mower with neural network trained on automatically collected data." Thesis, Uppsala universitet, Datorteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444627.

Full text
Abstract:
Machine vision is a hot research topic, with findings being published at a high pace and more and more companies currently developing automated vehicles. Robotic lawn mowers are also increasing in popularity, but most mowers still use relatively simple methods for cutting the lawn. No previous work has been published on machine learning networks that improve between cutting sessions by automatically collecting data and then using it for training. A data acquisition pipeline and neural network architecture that could help the mower avoid collisions were therefore developed. Nine neural networks were tested, of which a convolutional one reached the highest accuracy. The performance of the data acquisition routine and the networks shows that it is possible to design an object detection model that improves between runs.
APA, Harvard, Vancouver, ISO, and other styles
6

Pradels, Léo. "Efficient CNN inference acceleration on FPGAs : a pattern pruning-driven approach." Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS087.

Full text
Abstract:
CNN-based deep learning models provide state-of-the-art performance in image and video processing tasks, particularly for image enhancement or classification. However, these models are computationally and memory-intensive, making them unsuitable for real-time constraints on embedded FPGA systems. As a result, it is essential to compress these CNNs and to design accelerator architectures for inference that integrate compression in a hardware-software co-design approach. While software optimizations like pruning have been proposed, they often lack the structured approach needed for effective accelerator integration. To address these limitations, this thesis focuses on accelerating CNNs on FPGAs while complying with real-time constraints on embedded systems. This is achieved through several key contributions. First, it introduces pattern pruning, which imposes structure on network sparsity, enabling efficient hardware acceleration with minimal accuracy loss due to compression. Second, a scalable accelerator for CNN inference is presented, which adapts its architecture based on input performance criteria, FPGA specifications, and the target CNN model architecture. An efficient method for integrating pattern pruning within the accelerator and a complete flow for CNN acceleration are proposed. Finally, improvements in network compression are explored through Shift&Add quantization, which modifies FPGA computation methods while maintaining baseline network accuracy.
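Pattern pruning, as described above, constrains every kernel to a small fixed mask rather than zeroing weights arbitrarily, which keeps the sparsity structured and hardware-friendly. A toy sketch with one made-up 3x3 pattern (real designs select from a small set of patterns per kernel):

```python
import numpy as np

# One fixed 4-entry pattern; which entries survive is the "pattern".
pattern = np.array([[0, 1, 0],
                    [1, 1, 1],
                    [0, 0, 0]], dtype=bool)

rng = np.random.default_rng(0)
kernels = rng.normal(size=(16, 3, 3))       # 16 dense 3x3 conv kernels
pruned = np.where(pattern, kernels, 0.0)    # same mask applied to every kernel

# Every kernel now has at most 4 nonzeros, all in known positions,
# so the accelerator can skip the remaining multiply-accumulates.
print(int((pruned != 0).sum(axis=(1, 2)).max()))
```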
APA, Harvard, Vancouver, ISO, and other styles
7

Jafari, Muhammad Reza. "Persian Sign Gesture Translation to English Spoken Language on Smartphone." Thesis, Delhi Technological University, 2020. http://dspace.dtu.ac.in:8080/jspui/handle/repository/18787.

Full text
Abstract:
The hearing impaired and others with verbal challenges face difficulty communicating with society; sign language represents their communication, for example numbers or phrases. Communication becomes a challenge with people from other countries using different languages. Additionally, sign language differs from one country to another; learning one sign language does not mean learning all sign languages. Translating a word from a sign language to a spoken language is a challenge, and translating that word into another language is an even bigger challenge. In such cases two interpreters are needed: one from the sign language to the source spoken language, and one from the source language to the target language. There is ample research on sign recognition, yet this work focuses on translating gestures from one language to another. In this study, a smartphone approach to sign language recognition is proposed, because smartphones are available worldwide. Smartphones are limited in computational power, so a client-server application is proposed in which most processing tasks are done on the server side. In this client-server system, the client is a smartphone application that captures images of the sign gestures to be recognized and sends them to a server. In turn, the server processes the data and returns the translated sign to the client. On the server side, where most of the sign recognition takes place, the background of the sign image is removed and set to black in the Hue, Saturation, Value (HSV) color space. The sign gesture is then isolated by detecting the largest connected component in the frame. The extracted features are binary pixel maps, and a Convolutional Neural Network (CNN) is used to classify the sign images. After classification, the letter for a given sign is assigned, and a word is created from the sequence of letters. The word is translated to the target language, in this case English, and the result is returned to the client application.
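The HSV background-removal step can be sketched per pixel with the standard library's colorsys; the hue range standing in for "hand" pixels below is invented for illustration:

```python
import colorsys

def black_out_background(pixels, h_lo=0.0, h_hi=0.1):
    """Keep pixels whose hue falls inside [h_lo, h_hi] (a made-up skin-tone
    range) and set everything else to black, mimicking HSV background removal."""
    out = []
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        out.append((r, g, b) if h_lo <= h <= h_hi else (0, 0, 0))
    return out

pixels = [(210, 160, 120), (40, 90, 200)]  # skin-toned pixel, blue background pixel
print(black_out_background(pixels))        # the blue pixel becomes (0, 0, 0)
```

In the real pipeline the same test runs over every pixel of the frame before the largest connected component is extracted.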
APA, Harvard, Vancouver, ISO, and other styles
8

Abidi, Azza. "Investigating Deep Learning and Image-Encoded Time Series Approaches for Multi-Scale Remote Sensing Analysis in the context of Land Use/Land Cover Mapping." Electronic Thesis or Diss., Université de Montpellier (2022-....), 2024. http://www.theses.fr/2024UMONS007.

Full text
Abstract:
In this thesis, the potential of machine learning (ML) in enhancing the mapping of complex Land Use and Land Cover (LULC) patterns using Earth Observation data is explored. Traditionally, mapping methods relied on manual and time-consuming classification and interpretation of satellite images, which are susceptible to human error. However, the application of ML, particularly through neural networks, has automated and improved the classification process, resulting in more objective and accurate results. Additionally, the integration of Satellite Image Time Series (SITS) data adds a temporal dimension to spatial information, offering a dynamic view of the Earth's surface over time. This temporal information is crucial for accurate classification and informed decision-making in various applications. The precise and current LULC information derived from SITS data is essential for guiding sustainable development initiatives, resource management, and mitigating environmental risks. The LULC mapping process using ML involves data collection, preprocessing, feature extraction, and classification using various ML algorithms.
Two main classification strategies for SITS data have been proposed: pixel-level and object-based approaches. While both approaches have shown effectiveness, they also pose challenges, such as the inability to capture contextual information in pixel-based approaches and the complexity of segmentation in object-based approaches. To address these challenges, this thesis aims to implement a method based on multi-scale information to perform LULC classification, coupling spectral and temporal information through a combined pixel-object methodology and applying a methodological approach to efficiently represent multivariate SITS data, with the aim of reusing the large amount of research advances proposed in the field of computer vision.
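The combined pixel-object strategy described in the abstract can be sketched in a few lines. This is an illustrative NumPy sketch under assumed array shapes, not the thesis's actual method: simple per-band temporal statistics stand in for learned spectral-temporal features, and the helper names (`temporal_features`, `pixel_object_features`) are hypothetical.

```python
import numpy as np

def temporal_features(pixel_series):
    # pixel_series: (T, B) array with T acquisition dates and B spectral bands.
    # Simple per-band temporal statistics stand in for learned features.
    return np.concatenate([pixel_series.mean(axis=0),
                           pixel_series.max(axis=0),
                           pixel_series.std(axis=0)])

def pixel_object_features(image_series, segments):
    # image_series: (T, H, W, B) satellite image time series;
    # segments: (H, W) integer object labels from a prior segmentation.
    # Per-pixel temporal features are averaged inside each object,
    # coupling pixel-level spectral-temporal detail with object context.
    feats = {}
    for label in np.unique(segments):
        pix = image_series[:, segments == label, :]        # (T, N_pix, B)
        per_pixel = np.stack([temporal_features(pix[:, i, :])
                              for i in range(pix.shape[1])])
        feats[label] = per_pixel.mean(axis=0)              # (3 * B,)
    return feats
```

The resulting per-object descriptors can then feed any downstream classifier; in the thesis the features come from deep networks rather than hand-crafted statistics.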
APA, Harvard, Vancouver, ISO, and other styles
9

Šůstek, Martin. "Word2vec modely s přidanou kontextovou informací." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2017. http://www.nusl.cz/ntk/nusl-363837.

Full text
Abstract:
This thesis is concerned with the explanation of word2vec models. Even though word2vec was introduced recently (2013), many researchers have already tried to extend, understand, or at least use the model, because it provides surprisingly rich semantic information. This information is encoded in an N-dimensional vector representation and can be recalled by performing operations over the vector algebra. In addition, I suggest model modifications in order to obtain different word representations. To achieve that, I use public picture datasets. This thesis also includes parts dedicated to a word2vec extension based on convolutional neural networks.
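The "operations over the vector algebra" mentioned in the abstract refer to the well-known analogy property of word2vec embeddings (e.g. king − man + woman lands near queen). A toy sketch with hand-made 4-dimensional vectors, not trained embeddings, illustrates the mechanism:

```python
import numpy as np

# Hand-made 4-dim "embeddings" for illustration only; real word2vec
# vectors are learned from text and typically have hundreds of dimensions.
emb = {
    "king":  np.array([0.9, 0.8, 0.1, 0.0]),
    "queen": np.array([0.9, 0.1, 0.8, 0.0]),
    "man":   np.array([0.1, 0.9, 0.1, 0.0]),
    "woman": np.array([0.1, 0.1, 0.9, 0.0]),
    "apple": np.array([0.0, 0.0, 0.0, 1.0]),
}

def most_similar(vec, exclude):
    # nearest neighbour by cosine similarity, skipping the query words
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in emb if w not in exclude),
               key=lambda w: cos(emb[w], vec))

target = emb["king"] - emb["man"] + emb["woman"]
print(most_similar(target, exclude={"king", "man", "woman"}))  # queen
```

With a trained model, libraries such as gensim expose the same operation through a `most_similar` query over the full vocabulary.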
APA, Harvard, Vancouver, ISO, and other styles
10

Marek, Jan. "Rekonstrukce chybějících části obličeje pomocí neuronové sítě." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-433506.

Full text
Abstract:
The aim of this thesis is to create a neural network capable of reconstructing faces from photographs in which part of the face is covered by a mask. The concepts used in the development of convolutional neural networks and generative adversarial networks are presented, followed by the concepts used in neural networks specifically for the reconstruction of facial photographs. A generative adversarial network model is introduced that uses a combination of gated convolutional layers and multi-scale blocks and is capable of realistically filling in the facial regions covered by a mask.
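The gated convolutional layers mentioned in the abstract compute a learned soft mask alongside each feature map, which is what lets an inpainting network discount masked pixels. A minimal single-channel NumPy sketch (not the thesis's implementation; real models use learned multi-channel kernels and train both filters end to end):

```python
import numpy as np

def conv2d(x, k):
    # plain 'valid' 2D cross-correlation of a single-channel image x with kernel k
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + kh, j:j + kw] * k).sum()
    return out

def gated_conv2d(x, k_feat, k_gate):
    # gated convolution: a sigmoid gate in (0, 1) scales the feature response,
    # letting the network learn to down-weight masked or invalid pixels
    feature = np.tanh(conv2d(x, k_feat))
    gate = 1.0 / (1.0 + np.exp(-conv2d(x, k_gate)))
    return feature * gate
```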
APA, Harvard, Vancouver, ISO, and other styles
More sources

Book chapters on the topic "2D Convolution Neural Network (CNN)"

1

Girde, Surbhi, Reema Roychaudhary, and Aashay Wanjari. "Driver drowsiness detection using Convolution Neural Network (CNN)." In Technological Innovations & Applications in Industry 4.0. CRC Press, 2024. http://dx.doi.org/10.1201/9781003567653-47.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Gosai, Dhruvi, Shraddha Vyas, Sanjay Patel, Prasann Barot, and Krishna Suthar. "Handwritten Signature Verification Using Convolution Neural Network (CNN)." In Advancements in Smart Computing and Information Security. Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-23092-9_8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Yadav, Rohit, Sagar Pande, and Aditya Khamparia. "Breast Cancer Classification Using Convolution Neural Network (CNN)." In Communications in Computer and Information Science. Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-3660-8_27.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Sahu, Madhusmita, and Rasmita Dash. "A Survey on Deep Learning: Convolution Neural Network (CNN)." In Smart Innovation, Systems and Technologies. Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-6202-0_32.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Tayel, Mazhar B., Mohamed-Amr A. Mokhtar, and Ahmed F. Kishk. "Breast Cancer Diagnosis Using Histopathology and Convolution Neural Network CNN Method." In International Conference on Innovative Computing and Communications. Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-2821-5_49.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Srivastava, Shriansh, J. Priyadarshini, Sachin Gopal, Sanchay Gupta, and Har Shobhit Dayal. "Optical Character Recognition on Bank Cheques Using 2D Convolution Neural Network." In Advances in Intelligent Systems and Computing. Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-13-1822-1_55.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Vani, K. S., Rupesh Sapkota, Sparsh Shrestha, and Srujan B. "Creating a 3D Model from 2D Images Using Convolution Neural Network." In Advances in Intelligent Systems and Computing. Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-15-8443-5_56.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Cheng, Li, Xiao-Yuan Jing, Xiaoke Zhu, et al. "A Hybrid 2D and 3D Convolution Based Recurrent Network for Video-Based Person Re-identification." In Neural Information Processing. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-04167-0_40.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Naga Ajay Kumar, D. P., A. Sai Ganesh, B. Kaushik, M. Shailaja, and D. Mohan. "Traffic Sign Recognition and Classification with Convolution Neural Network (CNN) and OpenCV." In Lecture Notes in Networks and Systems. Springer Nature Singapore, 2024. https://doi.org/10.1007/978-981-97-7360-2_19.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Deb, Chandan Kumar, Ayon Tarafdar, Md Ashraful Haque, et al. "Convolution Neural Network (CNN)-Based Live Pig Weight Estimation in Controlled Imaging Platform." In Communication and Intelligent Systems. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2079-8_8.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "2D Convolution Neural Network (CNN)"

1

Wijesundara, Chiran, Guangpeng Xu, James Berry, Abigail Stressinger, and Tim Thomay. "Experimental higher-order photon state classification assisted by machine learning." In Latin America Optics and Photonics Conference. Optica Publishing Group, 2024. https://doi.org/10.1364/laop.2024.tu1a.6.

Full text
Abstract:
We classify experimentally determined higher-order photon states using a novel machine-learning model based on a 2D convolutional neural network (CNN), enabling rapid classification of multiphoton Fock states up to |3⟩.
APA, Harvard, Vancouver, ISO, and other styles
2

Wahyudi, Wahyudi, and Guruh Fajar Shidik. "Edible and Poisonous Mushroom Classification using Convolution Neural Network (CNN)." In 2024 International Seminar on Application for Technology of Information and Communication (iSemantic). IEEE, 2024. https://doi.org/10.1109/isemantic63362.2024.10762192.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Madhura, R., S. Nikkath Bushra, A. Syed Ismail, and Haleema PK. "Oil Spill Detection Using 2D Convolution Neural Network and Generative Adversarial Network." In 2024 International Conference on Expert Clouds and Applications (ICOECA). IEEE, 2024. http://dx.doi.org/10.1109/icoeca62351.2024.00140.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Mohamed, Omar, Ahmed Mohamed, and Ahmed Alenany. "Heart Rate Estimation from Facial Videos Using 2D Convolution Neural Network." In 2024 Intelligent Methods, Systems, and Applications (IMSA). IEEE, 2024. http://dx.doi.org/10.1109/imsa61967.2024.10652792.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Gogoi, Lakhipriya, and Md Anwar Hussain. "Classification of lung CT scan images using 2D convolution neural network (2D CNN)." In RECENT ADVANCES IN INDUSTRY 4.0 TECHNOLOGIES. AIP Publishing, 2023. http://dx.doi.org/10.1063/5.0175565.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Xu, Jinglin, Xiangsen Zhang, Wenbin Li, Xinwang Liu, and Junwei Han. "Joint Multi-view 2D Convolutional Neural Networks for 3D Object Classification." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/443.

Full text
Abstract:
Three-dimensional (3D) object classification is widely involved in various computer vision applications, e.g., autonomous driving and simultaneous localization and mapping, and has attracted considerable attention in the community. However, solving 3D object classification by directly employing 3D convolutional neural networks (CNNs) generally suffers from high computational cost. Besides, existing view-based methods do not fully explore the content relationships between views. To this end, this work proposes a novel multi-view framework that jointly uses multiple 2D-CNNs to capture discriminative information with relationships, together with a new multi-view loss fusion strategy, in an end-to-end manner. Specifically, we utilize multiple 2D views of a 3D object as input and integrate the intra-view and inter-view information of each view through the view-specific 2D-CNN and a series of modules (outer product, view pair pooling, 1D convolution, and fully connected transformation). Furthermore, we design a novel view ensemble mechanism that selects several discriminative and informative views to jointly infer the category of a 3D object. Extensive experiments demonstrate that the proposed method is able to outperform current state-of-the-art methods on 3D object classification. More importantly, this work provides a new way to improve 3D object classification from the perspective of fully utilizing well-established 2D-CNNs.
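Part of the module chain named in the abstract (outer product of view pairs, then view pair pooling) can be roughly sketched as follows. The shapes and random "features" are illustrative stand-ins for the view-specific 2D-CNN outputs, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 6, 8                        # 6 rendered views, 8-dim descriptor per view
views = rng.normal(size=(V, D))    # stand-in for view-specific 2D-CNN outputs

# inter-view interactions: outer product of every view pair, flattened,
# then element-wise max over all pairs ("view pair pooling")
pairs = [np.outer(views[i], views[j]).ravel()
         for i in range(V) for j in range(i + 1, V)]
pooled = np.max(np.stack(pairs), axis=0)   # (D * D,) fused descriptor
print(pooled.shape)
```

In the paper this fused descriptor is further processed by a 1D convolution and a fully connected transformation before classification.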
APA, Harvard, Vancouver, ISO, and other styles
7

Chudobova, Michaela, Jan Kubicek, Radomir Scurek, and Marek Hutter. "IMPLEMENTATION OF INTELLIGENT BIOMETRIC SYSTEM FOR FACE DETECTION AND CLASSIFICATION." In 22nd SGEM International Multidisciplinary Scientific GeoConference 2022. STEF92 Technology, 2022. http://dx.doi.org/10.5593/sgem2022/2.1/s07.06.

Full text
Abstract:
This article deals with the design and implementation of an intelligent biometric system that allows the detection and classification of a person's face from static image data and creates a system for evaluating its reliability. In its introductory part, it theoretically describes applied biometrics and biometric systems for security identification and user verification, and also covers the theory and description of algorithms for human face detection and recognition. Subsequently, the authors use the MATLAB programming language, which is highly optimized for modern processors and memory architectures, to focus on the implementation and testing of a biometric system using Viola-Jones algorithms and a convolutional neural network with a pre-trained network NetNet. Convolutional neural networks (CNNs) are the most recognized and popular deep-learning neural networks, which are based on layers that perform two-dimensional (2D) convolution of input data with learned filters. In the final part there is a discussion in which, based on the results of testing, the robustness and efficiency of the proposed intelligent biometric system are objectively evaluated. The results allow for the continued development of other pre-trained artificial neural networks and variable implementations for facial recognition, as well as other applications, such as the recognition of potentially dangerous people.
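The Viola-Jones detector used in this article owes its speed to the integral image, which turns any rectangular Haar-feature sum into a four-lookup operation. A minimal sketch of that core idea (the article's implementation is in MATLAB; this Python version only illustrates the mechanism):

```python
import numpy as np

def integral_image(img):
    # ii[y, x] holds the sum of img[:y, :x]; the extra zero row and
    # column keep the corner lookups free of boundary checks
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, y0, x0, y1, x1):
    # sum of img[y0:y1, x0:x1] in O(1) via four lookups; this is why
    # Viola-Jones can evaluate thousands of Haar features per window cheaply
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
```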
APA, Harvard, Vancouver, ISO, and other styles
8

Ghebrechristos, Henok, and Gita Alaghband. "3D Convolution for Proactive Defense Against Localized Adversary Attacks." In 12th International Conference on Soft Computing, Artificial Intelligence and Applications. Academy & Industry Research Collaboration Center, 2023. http://dx.doi.org/10.5121/csit.2023.132414.

Full text
Abstract:
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks (CNNs), to adversarial attacks and presents a proactive training technique designed to counter them. We introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations. When combined with 3D convolution and deep curriculum learning optimization (CLO), it significantly improves the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10 and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing accuracy improvements over previous techniques. The results indicate that the combination of volumetric input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating adversarial training.
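The abstract does not spell out the volumization algorithm, so the following is a hypothetical sketch of one plausible 2D-to-3D transform: quantizing intensities into depth bins and one-hot encoding them into an occupancy volume that a 3D convolution (e.g. a Conv3d layer) could then consume. The function name and binning scheme are assumptions, not the paper's method.

```python
import numpy as np

def volumize(img, depth=8):
    # img: (H, W) image with intensities in [0, 1).
    # Quantize each pixel into one of `depth` intensity bins and one-hot
    # the bin index along a new leading axis, producing a (depth, H, W)
    # occupancy volume. Each pixel occupies exactly one depth slice.
    bins = np.clip((img * depth).astype(int), 0, depth - 1)
    return (bins[None, :, :] == np.arange(depth)[:, None, None]).astype(float)
```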
APA, Harvard, Vancouver, ISO, and other styles
9

Dai, Guoxian, Jin Xie, and Yi Fang. "Siamese CNN-BiLSTM Architecture for 3D Shape Representation Learning." In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/93.

Full text
Abstract:
Learning a 3D shape representation from a collection of its rendered 2D images has been extensively studied. However, existing view-based techniques have not yet fully exploited the information among all the views of projections. In this paper, by employing recurrent neural network to efficiently capture features across different views, we propose a siamese CNN-BiLSTM network for 3D shape representation learning. The proposed method minimizes a discriminative loss function to learn a deep nonlinear transformation, mapping 3D shapes from the original space into a nonlinear feature space. In the transformed space, the distance of 3D shapes with the same label is minimized, otherwise the distance is maximized to a large margin. Specifically, the 3D shapes are first projected into a group of 2D images from different views. Then convolutional neural network (CNN) is adopted to extract features from different view images, followed by a bidirectional long short-term memory (LSTM) to aggregate information across different views. Finally, we construct the whole CNN-BiLSTM network into a siamese structure with contrastive loss function. Our proposed method is evaluated on two benchmarks, ModelNet40 and SHREC 2014, demonstrating superiority over the state-of-the-art methods.
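The contrastive loss described in the abstract has a standard form: it pulls same-label pairs together and pushes different-label pairs apart up to a margin. A minimal sketch, where the symbols `d`, `same`, and `margin` are the conventional ones rather than notation taken from the paper:

```python
def contrastive_loss(d, same, margin=1.0):
    # d: distance between the two embedded shapes; same: 1 if they share a label.
    # Same-label pairs are penalized by d^2 (pulled toward d = 0); different-label
    # pairs are penalized only while their distance is below the margin.
    return same * d ** 2 + (1 - same) * max(margin - d, 0.0) ** 2
```

In the paper, `d` is the distance between the CNN-BiLSTM embeddings produced by the two branches of the siamese network.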
APA, Harvard, Vancouver, ISO, and other styles
10

Gangopadhyay, Tryambak, Anthony Locurto, Paige Boor, James B. Michael, and Soumik Sarkar. "Characterizing Combustion Instability Using Deep Convolutional Neural Network." In ASME 2018 Dynamic Systems and Control Conference. American Society of Mechanical Engineers, 2018. http://dx.doi.org/10.1115/dscc2018-9208.

Full text
Abstract:
Detecting the transition to an impending instability is important to initiate effective control in a combustion system. As one of the early applications of characterizing thermoacoustic instability using deep neural networks, we train our proposed deep convolutional neural network (CNN) model on sequential image frames extracted from high-speed flame videos by inducing instability in the system following a particular protocol: varying the acoustic length. We leverage the sound pressure data to define a non-dimensional instability measure used for applying an inexpensive but noisy labeling technique to train our supervised 2D CNN model. We attempt to detect the onset of instability in a transient dataset where instability is induced by a different protocol. With the continuous variation of the control parameter, we can successfully detect the critical transition to a state of high combustion instability, demonstrating the robustness of our proposed detection framework, which is independent of the instability-inducing protocol.
APA, Harvard, Vancouver, ISO, and other styles