Journal articles on the topic 'Spatial-temporal CNN'

Consult the top 50 journal articles for your research on the topic 'Spatial-temporal CNN.'

1

Zhao, Zhen, Ze Li, Fuxin Li, and Yang Liu. "CNN-LSTM Based Traffic Prediction Using Spatial-temporal Features." Journal of Physics: Conference Series 2037, no. 1 (2021): 012065. http://dx.doi.org/10.1088/1742-6596/2037/1/012065.

2

Horváth, András, and Tamás Roska. "Detection of Spatial-Temporal Events with Delayed CNN Templates." IEICE Proceeding Series 2 (March 17, 2014): 378–81. http://dx.doi.org/10.15248/proc.2.378.

3

Liu, Yumin, Zheyun Zhao, Shuai Zhang, and Uk Jung. "Identification of Abnormal Processes with Spatial-Temporal Data Using Convolutional Neural Networks." Processes 8, no. 1 (2020): 73. http://dx.doi.org/10.3390/pr8010073.

Abstract:
Identifying abnormal process operation with spatial-temporal data remains an important and challenging work in many practical situations. Although spatial-temporal data identification has been extensively studied in some domains, such as public health, geological condition, and environment pollution, the challenge associated with designing accurate and convenient recognition schemes is very rarely addressed in modern manufacturing processes. This paper proposes a general recognition framework for identifying abnormal process with spatial-temporal data by employing a convolutional neural networ
4

Wang, Changyuan, Ting Yan, and Hongbo Jia. "Spatial-Temporal Feature Representation Learning for Facial Fatigue Detection." International Journal of Pattern Recognition and Artificial Intelligence 32, no. 12 (2018): 1856018. http://dx.doi.org/10.1142/s0218001418560189.

Abstract:
In order to reduce the serious problems caused by the operators’ fatigue, we propose a novel network model, Convolutional Neural Network and Long Short-Term Memory Network (CNN-LSTM), for fatigue detection in the inter-frame images of video sequences, which mainly consists of CNN and LSTM network. Firstly, in order to improve the accuracy of the deep network structure, the Viola–Jones detection algorithm and the Kernelized Correlation Filter (KCF) tracking algorithm are used in the face detection to normalize the size of the inter-frame images of video sequences. Secondly, we use the CNN and t
5

Wang, Zengkai. "Spatial-Temporal Feature-Based Sports Video Classification." International Journal of Ambient Computing and Intelligence 12, no. 4 (2021): 79–97. http://dx.doi.org/10.4018/ijaci.2021100105.

Abstract:
Video classification has been an active research field of computer vision in last few years. Its main purpose is to produce a label that is relevant to the video given its frames. Unlike image classification, which takes still pictures as input, the input of video classification is a sequence of images. The complex spatial and temporal structures of video sequence incur understanding and computation difficulties, which should be modeled to improve the video classification performance. This work focuses on sports video classification but can be expanded into other applications. In this paper, t
6

Meshkini, K., F. Bovolo, and L. Bruzzone. "A 3D CNN Approach for Change Detection in HR Satellite Image Time Series Based on a Pretrained 2D CNN." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B3-2022 (May 30, 2022): 143–50. http://dx.doi.org/10.5194/isprs-archives-xliii-b3-2022-143-2022.

Abstract:
Over recent decades, Change Detection (CD) has been intensively investigated due to the availability of High Resolution (HR) multi-spectral multi-temporal remote sensing images. Deep Learning (DL) based methods such as Convolutional Neural Network (CNN) have recently received increasing attention in CD problems demonstrating high potential. However, most of the CNN-based CD methods are designed for bi-temporal image analysis. Here, we propose a Three-Dimensional (3D) CNN-based CD approach that can effectively deal with HR image time series and process spatial-spectral-temporal featur
7

Wu, Honggang, Jiabi Niu, Yongqiang Li, Yinsheng Wang, and Daohong Qiu. "Landslide Susceptibility Prediction Based on a CNN–LSTM–SAM–Attention Hybrid Model." Applied Sciences 15, no. 13 (2025): 7245. https://doi.org/10.3390/app15137245.

Abstract:
Accurate prediction of landslide susceptibility is a key component of disaster risk reduction and early warning systems. Traditional landslide susceptibility prediction methods often face challenges in capturing complex nonlinear and spatio-temporal relationships inherent in geospatial data. In this study, we propose a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Spatial Attention Mechanism (SAM) hybrid deep learning model designed for spatial landslide susceptibility prediction. The model is trained on a comprehensive dataset comprising 19,898 samples, constructed from l
8

Li, Jianing, Shiliang Zhang, and Tiejun Huang. "Multi-Scale 3D Convolution Network for Video Based Person Re-Identification." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 8618–25. http://dx.doi.org/10.1609/aaai.v33i01.33018618.

Abstract:
This paper proposes a two-stream convolution network to extract spatial and temporal cues for video based person ReIdentification (ReID). A temporal stream in this network is constructed by inserting several Multi-scale 3D (M3D) convolution layers into a 2D CNN network. The resulting M3D convolution network introduces a fraction of parameters into the 2D CNN, but gains the ability of multi-scale temporal feature learning. With this compact architecture, M3D convolution network is also more efficient and easier to optimize than existing 3D convolution networks. The temporal stream further invol
9

Yang, Hao, Chunfeng Yuan, Li Zhang, Yunda Sun, Weiming Hu, and Stephen J. Maybank. "STA-CNN: Convolutional Spatial-Temporal Attention Learning for Action Recognition." IEEE Transactions on Image Processing 29 (2020): 5783–93. http://dx.doi.org/10.1109/tip.2020.2984904.

10

Liu, Yiqiang, Luming Shen, Xinghui Zhu, Yangfan Xie, and Shaofang He. "Spectral Data-Driven Prediction of Soil Properties Using LSTM-CNN-Attention Model." Applied Sciences 14, no. 24 (2024): 11687. https://doi.org/10.3390/app142411687.

Abstract:
Accurate prediction of soil properties is essential for sustainable land management and precision agriculture. This study presents an LSTM-CNN-Attention model that integrates temporal and spatial feature extraction with attention mechanisms to improve predictive accuracy. Utilizing the LUCAS soil dataset, the model analyzes spectral data to estimate key soil properties, including organic carbon (OC), nitrogen (N), calcium carbonate (CaCO3), and pH (in H2O). The Long Short-Term Memory (LSTM) component captures temporal dependencies, the Convolutional Neural Network (CNN) extracts spatial featur
11

Yang, Shaojun, Shangping Zhong, and Kaizhi Chen. "W-WaveNet: A multi-site water quality prediction model incorporating adaptive graph convolution and CNN-LSTM." PLOS ONE 19, no. 3 (2024): e0276155. http://dx.doi.org/10.1371/journal.pone.0276155.

Abstract:
Water quality prediction is of great significance in pollution control, prevention, and management. Deep learning models have been applied to water quality prediction in many recent studies. However, most existing deep learning models for water quality prediction are used for single-site data, only considering the time dependency of water quality data and ignoring the spatial correlation among multi-sites. This research defines and analyzes the non-aligned spatial correlations that exist in multi-site water quality data. Then deploy spatial-temporal graph convolution to process water quality d
12

Hwang, Bor-Jiunn, Hui-Hui Chen, Chaur-Heh Hsieh, and Deng-Yu Huang. "Gaze Tracking Based on Concatenating Spatial-Temporal Features." Sensors 22, no. 2 (2022): 545. http://dx.doi.org/10.3390/s22020545.

Abstract:
Based on experimental observations, there is a correlation between time and consecutive gaze positions in visual behaviors. Previous studies on gaze point estimation usually use images as the input for model trainings without taking into account the sequence relationship between image data. In addition to the spatial features, the temporal features are considered to improve the accuracy in this paper by using videos instead of images as the input data. To be able to capture spatial and temporal features at the same time, the convolutional neural network (CNN) and long short-term memory (LSTM)
13

Mekouar, Youssef, Imad Saleh, and Mohammed Karim. "GreenNav: Spatiotemporal Prediction of CO2 Emissions in Paris Road Traffic Using a Hybrid CNN-LSTM Model." Network 5, no. 1 (2025): 2. https://doi.org/10.3390/network5010002.

Abstract:
In a global context where reducing the carbon footprint has become an urgent necessity, this article presents a hybrid CNN-LSTM prediction model to estimate CO2 emission rates of Paris road traffic using spatio-temporal data. Our hybrid prediction model relies on a real-time road traffic database that we built by fusing several APIs and datasets. In particular, we trained two specialized models: a CNN to extract spatial patterns and an LSTM to capture temporal dynamics. By merging their outputs, we leverage both spatial and temporal dependencies, ensuring more accurate predictions. Thus, this
14

Sun, Tuo, Chenwei Yang, Ke Han, Wanjing Ma, and Fan Zhang. "Bidirectional Spatial–Temporal Network for Traffic Prediction with Multisource Data." Transportation Research Record: Journal of the Transportation Research Board 2674, no. 8 (2020): 78–89. http://dx.doi.org/10.1177/0361198120927393.

Abstract:
Urban traffic congestion has an obvious spatial and temporal relationship and is relevant to real traffic conditions. Traffic speed is a significant parameter for reflecting congestion of road networks, which is feasible to predict. Traditional traffic forecasting methods have poor accuracy for complex urban road networks, and do not take into account weather and other multisource data. This paper proposes a convolutional neural network (CNN)-based bidirectional spatial–temporal network (CNN-BDSTN) using traffic speed and weather data by crawling electric map information. In CNN-BDSTN, the spa
15

Zhang, Yang, Ziwen Wei, Zhihua Liu, Xiaolong Wu, and Junchao Qian. "Posture Monitoring of Patients in Radiotherapy Scenarios Based on Stacked Grayscale 3-Channel Images." JUCS - Journal of Universal Computer Science 31, no. 6 (2025): 648–65. https://doi.org/10.3897/jucs.130186.

Abstract:
Purpose: Incorrect patient positioning during radiotherapy can significantly impact treatment efficacy and pose potential risks. This study aims to develop a model that can rapidly and effectively monitor the patient’s postures during radiotherapy sessions using real-time video. Methods: The neural network utilized in this research employed a two-stream architecture, consisting of spatial and temporal streams. For the spatial stream, RGB frames from the videos were directly used as input. In the temporal stream, representative frames were extracted from the video to construct stacked gra
16

Afrasiabi, Mahlagha, Hassan Khotanlou, and Theo Gevers. "Spatial-temporal dual-actor CNN for human interaction prediction in video." Multimedia Tools and Applications 79, no. 27-28 (2020): 20019–38. http://dx.doi.org/10.1007/s11042-020-08845-2.

17

Cserey, György, András Falus, and Tamás Roska. "Immune response inspired spatial-temporal target detection algorithms with CNN-UM." International Journal of Circuit Theory and Applications 34, no. 1 (2006): 21–47. http://dx.doi.org/10.1002/cta.341.

18

Yao, Xiuzhen, Tianwen Li, Peng Ding, et al. "Emotion Classification Based on Transformer and CNN for EEG Spatial–Temporal Feature Learning." Brain Sciences 14, no. 3 (2024): 268. http://dx.doi.org/10.3390/brainsci14030268.

Abstract:
Objectives: The temporal and spatial information of electroencephalogram (EEG) signals is crucial for recognizing features in emotion classification models, but it excessively relies on manual feature extraction. The transformer model has the capability of performing automatic feature extraction; however, its potential has not been fully explored in the classification of emotion-related EEG signals. To address these challenges, the present study proposes a novel model based on transformer and convolutional neural networks (TCNN) for EEG spatial–temporal (EEG ST) feature learning to automatic e
19

Censi, Alessandro Michele, Dino Ienco, Yawogan Jean Eudes Gbodjo, Ruggero Gaetano Pensa, Roberto Interdonato, and Raffaele Gaetano. "Attentive Spatial Temporal Graph CNN for Land Cover Mapping From Multi Temporal Remote Sensing Data." IEEE Access 9 (2021): 23070–82. http://dx.doi.org/10.1109/access.2021.3055554.

20

Pandey, Garima, Abhishek Kumar Karn, and Manish Jha. "Human Activity Recognition Using CNN-LSTM-GRU Model." International Research Journal on Advanced Engineering Hub (IRJAEH) 2, no. 04 (2024): 889–94. http://dx.doi.org/10.47392/irjaeh.2024.0125.

Abstract:
Human Activity Recognition (HAR) is a fundamental task in the field of computer vision and machine learning, with applications spanning from healthcare monitoring to human-computer interaction. This research paper presents a novel approach to HAR utilizing a hybrid model combining Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, referred to as the VGG-LSTM model. The proposed VGG-LSTM model leverages the power of deep learning to address the challenges associated with HAR, including capturing spatial features and modeling temporal dependencies in complex human
21

Zeng, Chunyan, Shuai Kong, Zhifeng Wang, Kun Li, and Yuhao Zhao. "Digital Audio Tampering Detection Based on Deep Temporal–Spatial Features of Electrical Network Frequency." Information 14, no. 5 (2023): 253. http://dx.doi.org/10.3390/info14050253.

Abstract:
In recent years, digital audio tampering detection methods by extracting audio electrical network frequency (ENF) features have been widely applied. However, most digital audio tampering detection methods based on ENF have the problems of focusing on spatial features only, without effective representation of temporal features, and do not fully exploit the effective information in the shallow ENF features, which leads to low accuracy of audio tamper detection. Therefore, this paper proposes a new method for digital audio tampering detection based on the deep temporal–spatial feature of ENF. To
22

Zhen, Hao, Dongxiao Niu, Min Yu, Keke Wang, Yi Liang, and Xiaomin Xu. "A Hybrid Deep Learning Model and Comparison for Wind Power Forecasting Considering Temporal-Spatial Feature Extraction." Sustainability 12, no. 22 (2020): 9490. http://dx.doi.org/10.3390/su12229490.

Abstract:
The inherent intermittency and uncertainty of wind power have brought challenges in accurate wind power output forecasting, which also cause tricky problems in the integration of wind power to the grid. In this paper, a hybrid deep learning model bidirectional long short term memory-convolutional neural network (BiLSTM-CNN) is proposed for short-term wind power forecasting. First, the grey correlation analysis is utilized to select the inputs for forecasting model; Then, the proposed hybrid model extracts multi-dimension features of inputs to predict the wind power from the temporal-spatial pe
23

Wu, Ruowu, Yandan Liang, Lianlei Lin, and Zongwei Zhang. "Spatiotemporal Multivariate Weather Prediction Network Based on CNN-Transformer." Sensors 24, no. 23 (2024): 7837. https://doi.org/10.3390/s24237837.

Abstract:
Weather prediction is of great significance for human daily production activities, global extreme climate prediction, and environmental protection of the Earth. However, the existing data-based weather prediction methods cannot adequately capture the spatial and temporal evolution characteristics of the target region, which makes it difficult for the existing methods to meet practical application requirements in terms of efficiency and accuracy. Changes in weather involve both strongly correlated spatial and temporal continuation relationships, and at the same time, the variables interact with
24

Bao, Yin-Xin, Quan Shi, Qin-Qin Shen, and Yang Cao. "Spatial-Temporal 3D Residual Correlation Network for Urban Traffic Status Prediction." Symmetry 14, no. 1 (2021): 33. http://dx.doi.org/10.3390/sym14010033.

Abstract:
Accurate traffic status prediction is of great importance to improve the security and reliability of the intelligent transportation system. However, urban traffic status prediction is a very challenging task due to the tight symmetry among the Human–Vehicle–Environment (HVE). The recently proposed spatial–temporal 3D convolutional neural network (ST-3DNet) effectively extracts both spatial and temporal characteristics in HVE, but ignores the essential long-term temporal characteristics and the symmetry of historical data. Therefore, a novel spatial–temporal 3D residual correlation network (ST-
25

Zhou, C., J. Li, H. Shen, and Q. Yuan. "Multi-Temporal SAR Image Despeckling Based a Convolutional Neural Network." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-5-2020 (August 3, 2020): 101–7. http://dx.doi.org/10.5194/isprs-annals-v-5-2020-101-2020.

Abstract:
Speckle noise is an intrinsic property of Synthetic Aperture Radar (SAR) imagery, which affects the quality of image. Single-temporal despeckling methods usually pay attention to the utilization of spatial information, but sometimes due to lack of sufficient information, the despeckling image is too smooth or loses some information about edge details. However, multi-temporal SAR images can provide extra information for despeckling resulting in better performance. Therefore, in this paper, we proposed a novel multi-temporal SAR despeckling method based on a convolutional neural network
26

Abdlrazg, Bassma A. Awad, Sumaia Masoud, and Mnal M. Ali. "Human Action Detection Using A hybrid Architecture of CNN and Transformer." International Science and Technology Journal 34, no. 1 (2024): 1–15. http://dx.doi.org/10.62341/bsmh2119.

Abstract:
This work presents a Deep learning and Vision Transformer hybrid sequence model for the classification and identification of Human Motion Actions. The deep learning model works by extracting Spatial-temporal features from the features of every video, and then we use a CNN model that takes these inputs as spatial features map from videos and outputs them as a sequence of features. These sequences will be temporally fed into the Vision Transformer (ViT) which classifies the videos used into 7 different classes: Jump, Walk, Wave1, wave2, Bend, Jack, and powerful jump. The model was trained and te
27

Bálya, David. "Sudden Global Spatial-Temporal Change Detection and Its Applications." Journal of Circuits, Systems and Computers 12, no. 06 (2003): 845–56. http://dx.doi.org/10.1142/s0218126603001173.

Abstract:
We are watching the news on TV: the change of the background tells us when a new story begins. A glance at the clock and we can clearly see what time it is. These are special spatial-temporal episodes caused by ballistic eye movements and sudden optical changes. In this paper we give a useful definition for generalized sudden global change events, present its main properties and give an algorithm to recognize them in any video-flow. The proposed algorithm is implemented on a standard Cellular Nonlinear Network Universal Machine (CNN-UM). The processing time of the detection is roughly one mill
28

Chen, Zhigang, Haotian Peng, and Yongxin Su. "Nonintrusive load disaggregation by fusion of graph signal processing and CNN." Journal of Physics: Conference Series 2853, no. 1 (2024): 012061. http://dx.doi.org/10.1088/1742-6596/2853/1/012061.

Abstract:
Nonintrusive load disaggregation techniques play a pivotal role in power system planning and decision making for demand response. The sparsity of low frequency total power data features, coupled with the model’s challenge in effectively extracting and utilizing spatial and temporal correlations in diverse load data, presents obstacles to achieving high precision disaggregation in nonintrusive load disaggregation. To address these challenges, this paper proposes a disaggregation framework that integrates graph signal processing (GSP) with a convolutional neural network (CNN). In this f
29

Tong, Runze, Yue Zhang, Hongfeng Chen, and Honghai Liu. "Learn the Temporal-Spatial Feature of sEMG via Dual-Flow Network." International Journal of Humanoid Robotics 16, no. 04 (2019): 1941004. http://dx.doi.org/10.1142/s0219843619410044.

Abstract:
Surface electromyography (sEMG) signals have been widely used in human–machine interaction, providing more nature control expedience for external devices. However, due to the instability of sEMG, it is hard to extract consistent sEMG patterns for motion recognition. This paper proposes a dual-flow network to extract the temporal-spatial feature of sEMG for gesture recognition. The proposed network model uses convolutional neural network (CNN) and long short-term memory methods (LSTM) to, respectively, extract the spatial feature and temporal feature of sEMG, simultaneously. These features extr
30

Lu, Peng, Yaqin Zhao, and Yuan Xu. "A Two-Stream CNN Model with Adaptive Adjustment of Receptive Field Dedicated to Flame Region Detection." Symmetry 13, no. 3 (2021): 397. http://dx.doi.org/10.3390/sym13030397.

Abstract:
Convolutional neural networks (CNN) have yielded state-of-the-art performance in image segmentation. Their application in video surveillance systems can provide very useful information for extinguishing fire in time. The current studies mostly focused on CNN-based flame image classification and have achieved good accuracy. However, the research of CNN-based flame region detection is extremely scarce due to the bulky network structures and high hardware configuration requirements of the state-of-the-art CNN models. Therefore, this paper presents a two-stream convolutional neural network for fla
31

Miao, Yunqi, Jungong Han, Yongsheng Gao, and Baochang Zhang. "ST-CNN: Spatial-Temporal Convolutional Neural Network for crowd counting in videos." Pattern Recognition Letters 125 (July 2019): 113–18. http://dx.doi.org/10.1016/j.patrec.2019.04.012.

32

Zhang, Peng, Tao Zhuo, Wei Huang, Kangli Chen, and Mohan Kankanhalli. "Online object tracking based on CNN with spatial-temporal saliency guided sampling." Neurocomputing 257 (September 2017): 115–27. http://dx.doi.org/10.1016/j.neucom.2016.10.073.

33

Lan, Yi. "A Hybrid CNN-LSTM Model for Stock Price Prediction with Spatial and Temporal Dependencies." Applied and Computational Engineering 155, no. 1 (2025): 236–42. https://doi.org/10.54254/2755-2721/2025.gl23570.

Abstract:
Stock price prediction is crucial in financial decision-making and investment strategies, significantly influencing investors' profitability and market stability. This paper aims to systematically review and evaluate Machine Learning (ML) and Deep Learning (DL) methodologies, primarily focusing on Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs) for stock price forecasting. A hybrid CNN-LSTM model is proposed to enhance predictive accuracy. Specifically, the CNN component initially extracts essential spatial features from historical financial data, while the LSTM
34

Li, Qianjing, Jia Tian, and Qingjiu Tian. "Deep Learning Application for Crop Classification via Multi-Temporal Remote Sensing Images." Agriculture 13, no. 4 (2023): 906. http://dx.doi.org/10.3390/agriculture13040906.

Abstract:
The combination of multi-temporal images and deep learning is an efficient way to obtain accurate crop distributions and so has drawn increasing attention. However, few studies have compared deep learning models with different architectures, so it remains unclear how a deep learning model should be selected for multi-temporal crop classification, and the best possible accuracy is. To address this issue, the present work compares and analyzes a crop classification application based on deep learning models and different time-series data to exploit the possibility of improving crop classification
35

Ullah, Hayat, and Arslan Munir. "Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework." Journal of Imaging 9, no. 7 (2023): 130. http://dx.doi.org/10.3390/jimaging9070130.

Abstract:
Vision-based human activity recognition (HAR) has emerged as one of the essential research areas in video analytics. Over the last decade, numerous advanced deep learning algorithms have been introduced to recognize complex human actions from video streams. These deep learning algorithms have shown impressive performance for the video analytics task. However, these newly introduced methods either exclusively focus on model performance or the effectiveness of these models in terms of computational efficiency, resulting in a biased trade-off between robustness and computational efficiency in the
36

Mekruksavanich, Sakorn, Wikanda Phaphan, and Anuchit Jitpattanakul. "Epileptic seizure detection in EEG signals via an enhanced hybrid CNN with an integrated attention mechanism." Mathematical Biosciences and Engineering 22, no. 1 (2024): 73–105. https://doi.org/10.3934/mbe.2025004.

Abstract:
Epileptic seizures, a prevalent neurological condition, necessitate precise and prompt identification for optimal care. Nevertheless, the intricate characteristics of electroencephalography (EEG) signals, noise, and the want for real-time analysis require enhancement in the creation of dependable detection approaches. Despite advances in machine learning and deep learning, capturing the intricate spatial and temporal patterns in EEG data remains challenging. This study introduced a novel deep learning framework combining a convolutional neural network (CNN), bidirectional gated recurr
37

Kolipaka, Venkata Rama Rao, and Anupama Namburu. "Integrating Temporal Fluctuations in Crop Growth with Stacked Bidirectional LSTM and 3D CNN Fusion for Enhanced Crop Yield Prediction." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 9 (2023): 376–83. http://dx.doi.org/10.17762/ijritcc.v11i9.8543.

Abstract:
Optimizing farming methods and guaranteeing a steady supply of food depend critically on accurate predictions of crop yields. The dynamic temporal changes that occur during crop growth are generally ignored by conventional crop growth models, resulting in less precise projections. Using a stacked bidirectional Long Short-Term Memory (LSTM) structure and a 3D Convolutional Neural Network (CNN) fusion, we offer a novel neural network model that accounts for temporal oscillations in the crop growth process. The 3D CNN efficiently recovers spatial and temporal features from the crop development da
38

Alharkan, Hamad, Shabana Habib, and Muhammad Islam. "Solar Power Prediction Using Dual Stream CNN-LSTM Architecture." Sensors 23, no. 2 (2023): 945. http://dx.doi.org/10.3390/s23020945.

Abstract:
The integration of solar energy with a power system brings great economic and environmental benefits. However, the high penetration of solar power is challenging due to the operation and planning of the existing power system owing to the intermittence and randomicity of solar power generation. Achieving accurate predictions for power generation is important to provide high-quality electric energy for end-users. Therefore, in this paper, we introduce a deep learning-based dual-stream convolutional neural network (CNN) and long short-term memory (LSTM) network followed by a self-attention mechan
39

Lin, Shaofu, Yuying Zhang, Xiliang Liu, Qiang Mei, Xiaoying Zhi, and Xingjia Fei. "Incorporating the Third Law of Geography with Spatial Attention Module–Convolutional Neural Network–Transformer for Fine-Grained Non-Stationary Air Quality Predictive Learning." Mathematics 12, no. 10 (2024): 1457. http://dx.doi.org/10.3390/math12101457.

Abstract:
Accurate air quality prediction is paramount in safeguarding public health and addressing air pollution control. However, previous studies often ignore the geographic similarity among different monitoring stations and face challenges in dynamically capturing different spatial–temporal relationships between stations. To address this, an air quality predictive learning approach incorporating the Third Law of Geography with SAM–CNN–Transformer is proposed. Firstly, the Third Law of Geography is incorporated to fully consider the geographical similarity among stations via a variogram and spatial c
40

Yang, Honglei, Youfeng Liu, Qing Han, et al. "Improved Landslide Deformation Prediction Using Convolutional Neural Network–Gated Recurrent Unit and Spatial–Temporal Data." Remote Sensing 17, no. 4 (2025): 727. https://doi.org/10.3390/rs17040727.

Abstract:
As one of the major forms of geological disaster, landslides cause huge casualties and economic losses in China every year. Given the importance of landslide prediction, it is a challenging task due to difficulties in efficiently leveraging the spatial–temporal information for enhanced prediction. This paper presents a novel spatial–temporal enhanced CNN-GRU model to improve landslide predictions with the following contributions. First, this paper explicitly models the spatial correlation in the dataset and constructs a spatial–temporal time-sequence deformation prediction model that greatly i
41

Baral, Rojina, Sanjivan Satyal, and Anisha Pokhrel. "CNN-Transformer Based Speech Emotion Detection." Journal of Advanced College of Engineering and Management 10 (March 11, 2025): 135–45. https://doi.org/10.3126/jacem.v10i1.76324.

Abstract:
In this study, a parallel network technique trained on the Ryerson Audio-Visual Dataset of Speech and Song (RAVDESS) was used to perform an autonomous speech emotion recognition (SER) challenge to categorize four distinct emotions. To capture both spatial and temporal data, the architecture comprised attention-based networks with CNN-based networks that ran in tandem. Additive White Gaussian Noise (AWGN) was used as an augmentation technique for multiple folds to improve the model’s generalization. The model’s input was MFCC, which was created from the raw audio data. The MFCC were represented a
42

Reddy, Mr G. Sekhar, A. Sahithi, P. Harsha Vardhan, and P. Ushasri. "Conversion of Sign Language Video to Text and Speech." International Journal for Research in Applied Science and Engineering Technology 10, no. 5 (2022): 159–64. http://dx.doi.org/10.22214/ijraset.2022.42078.

Abstract:
Sign language recognition (SLR) is a significant and promising technique to facilitate communication for hearing-impaired people. Here, we are dedicated to finding an efficient solution to the gesture recognition problem. This work develops a sign language (SL) recognition framework with deep neural networks, which directly transcribes videos of SL signs to words. We propose a novel approach using video sequences that contain both temporal and spatial features. So, we have used two different models to train both the temporal as well as spatial features. To train the m
43

Xu, Jie, Haoliang Wei, Linke Li, Qiuru Fu, and Jinhong Guo. "Video Description Model Based on Temporal-Spatial and Channel Multi-Attention Mechanisms." Applied Sciences 10, no. 12 (2020): 4312. http://dx.doi.org/10.3390/app10124312.

Abstract:
Video description plays an important role in the field of intelligent imaging technology. Attention perception mechanisms are extensively applied in video description models based on deep learning. Most existing models use a temporal-spatial attention mechanism to enhance the accuracy of models. Temporal attention mechanisms can obtain the global features of a video, whereas spatial attention mechanisms obtain local features. Nevertheless, because each channel of the convolutional neural network (CNN) feature maps has certain spatial semantic information, it is insufficient to merely divide th
44

Yang, Zizhen, Wei Li, Fang Yuan, et al. "Hybrid CNN-BiLSTM-MHSA Model for Accurate Fault Diagnosis of Rotor Motor Bearings." Mathematics 13, no. 3 (2025): 334. https://doi.org/10.3390/math13030334.

Abstract:
Rotor motor fault diagnosis in Unmanned Aerial Vehicles (UAVs) presents significant challenges under variable speeds. Recent advances in deep learning offer promising solutions. To address challenges in extracting spatial, temporal, and hierarchical features from raw vibration signals, a hybrid CNN-BiLSTM-MHSA model is developed. This model leverages Convolutional Neural Networks (CNNs) to identify spatial patterns, a Bidirectional Long Short-Term Memory (BiLSTM) network to capture long- and short-term temporal dependencies, and a Multi-Head Self-Attention (MHSA) mechanism to highlight essenti
45

Shreya, Shankar. "A CNN-LSTM hybrid model for parkinson's disease detection from handwritten spirals using transfer learning." i-manager’s Journal on Image Processing 12, no. 2 (2025): 16. https://doi.org/10.26634/jip.12.2.21905.

Abstract:
Parkinson's Disease (PD) is a progressive neurological disorder that significantly affects motor skills, typically altering a person's handwriting. This work investigates deep learning-based approaches for classifying Parkinson's Disease using images of handwritten spiral drawings. The study began with a transfer learning approach using EfficientNet, as well as a CNN-LSTM architecture that combines convolutional and recurrent layers for spatial-temporal feature modeling. However, both approaches individually yielded suboptimal results, each achieving only 50% classification accuracy. To overc
46

Gao, Song, Dingzhuo Zhang, Zhaoming Tang, and Hongyan Wang. "Deep Fusion of Skeleton Spatial–Temporal and Dynamic Information for Action Recognition." Sensors 24, no. 23 (2024): 7609. http://dx.doi.org/10.3390/s24237609.

Abstract:
Focusing on the issue of the low recognition rates achieved by traditional deep-information-based action recognition algorithms, an action recognition approach was developed based on skeleton spatial–temporal and dynamic features combined with a two-stream convolutional neural network (TS-CNN). Firstly, the skeleton’s three-dimensional coordinate system was transformed to obtain coordinate information related to relative joint positions. Subsequently, this relevant joint information was encoded as a color texture map to construct the spatial–temporal feature descriptor of the skeleton. Further
47

Althamary, Ibrahim, Rubbens Boisguene, and Chih-Wei Huang. "Enhanced Multi-Task Traffic Forecasting in Beyond 5G Networks: Leveraging Transformer Technology and Multi-Source Data Fusion." Future Internet 16, no. 5 (2024): 159. http://dx.doi.org/10.3390/fi16050159.

Abstract:
Managing cellular networks in the Beyond 5G (B5G) era is a complex and challenging task requiring advanced deep learning approaches. Traditional models focusing on internet traffic (INT) analysis often fail to capture the rich temporal and spatial contexts essential for accurate INT predictions. Furthermore, these models do not account for the influence of external factors such as weather, news, and social trends. This study proposes a multi-source CNN-RNN (MSCR) model that leverages a rich dataset, including periodic, weather, news, and social data to address these limitations. This model ena
48

Fu, Zhongjun, Yuhui Wang, Lei Zhou, Keyang Li, and Hang Rao. "Partial Discharge Recognition of Transformers Based on Data Augmentation and CNN-BiLSTM-Attention Mechanism." Electronics 14, no. 1 (2025): 193. https://doi.org/10.3390/electronics14010193.

Abstract:
Partial discharge (PD) is a commonly encountered discharge-related fault in transformers. Due to the unique characteristics of the environment where PD occurs, challenges such as difficulty in data acquisition and scarcity of samples arise. Convolutional neural networks (CNNs) are widely used in pattern recognition because of their strong feature extraction capabilities. To improve the recognition accuracy of PD models, this paper integrates CNN, bidirectional long short-term memory (BiLSTM), and an attention mechanism. In the proposed model, CNN is employed to extract local spatial and tempor
49

Carpentier, B., A. Masse, E. Lavergne, and C. Sannier. "BENCHMARKING OF CONVOLUTIONAL NEURAL NETWORK APPROACHES FOR VEGETATION LAND COVER MAPPING." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B2-2021 (June 28, 2021): 915–22. http://dx.doi.org/10.5194/isprs-archives-xliii-b2-2021-915-2021.

Abstract:
Satellite Image Time Series (SITS) are becoming available at high spatial, spectral and temporal resolutions across the globe by the latest remote sensing sensors. These series of images can be highly valuable when exploited by classification systems to produce frequently updated and accurate land cover maps. The richness of spectral, spatial and temporal features in SITS is a promising source of data for developing better classification algorithms. However, machine learning methods such as Random Forests (RF), despite their fruitful application to SITS to produce land cover maps, ar
50

N., Atansuyi. "Hybrid Deep Learning Models for Gait Recognition: A Comparative Analysis of CNN, CNN-LSTM, and HOA Techniques." International Journal for Research in Applied Science and Engineering Technology 13, no. 7 (2025): 1831–38. https://doi.org/10.22214/ijraset.2025.73234.

Abstract:
Gait recognition is a critical biometric technique with applications in surveillance, healthcare, and security. This study proposes a hybrid deep learning framework combining Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and the Hippopotamus Optimization Algorithm (HOA) for robust gait recognition. By leveraging spatial feature extraction, temporal dynamics, and metaheuristic hyperparameter optimization, the proposed HOA-CNN-LSTM model achieves superior performance. Experimental results on the TUM-GAID dataset show that the hybrid model outperforms standalone CNN and CNN-