To see the other types of publications on this topic, follow the link: RGB-D Image.

Journal articles on the topic 'RGB-D Image'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'RGB-D Image.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.

1

Uddin, Md Kamal, Amran Bhuiyan, and Mahmudul Hasan. "Fusion in Dissimilarity Space Between RGB-D and Skeleton for Person Re-Identification." International Journal of Innovative Technology and Exploring Engineering 10, no. 12 (2021): 69–75. http://dx.doi.org/10.35940/ijitee.l9566.10101221.

Full text
Abstract:
Person re-identification (Re-id) is one of the important tools of video surveillance systems, which aims to recognize an individual across the multiple disjoint sensors of a camera network. Despite the recent advances in RGB camera-based person re-identification methods under normal lighting conditions, Re-id researchers fail to take advantage of modern RGB-D sensor-based additional information (e.g. depth and skeleton information). When traditional RGB-based cameras fail to capture the video under poor illumination conditions, RGB-D sensor-based additional information can be advantageous to
APA, Harvard, Vancouver, ISO, and other styles
2

Uddin, Md Kamal, Amran Bhuiyan, and Mahmudul Hasan. "Fusion in Dissimilarity Space Between RGB-D and Skeleton for Person Re-Identification." International Journal of Innovative Technology and Exploring Engineering (IJITEE) 10, no. 12 (2021): 69–75. https://doi.org/10.35940/ijitee.L9566.10101221.

Full text
Abstract:
Person re-identification (Re-id) is one of the important tools of video surveillance systems, which aims to recognize an individual across the multiple disjoint sensors of a camera network. Despite the recent advances in RGB camera-based person re-identification methods under normal lighting conditions, Re-id researchers fail to take advantage of modern RGB-D sensor-based additional information (e.g. depth and skeleton information). When traditional RGB-based cameras fail to capture the video under poor illumination conditions, RGB-D sensor-based additional information can be advantageous to
APA, Harvard, Vancouver, ISO, and other styles
3

Li, Hengyu, Hang Liu, Ning Cao, et al. "Real-time RGB-D image stitching using multiple Kinects for improved field of view." International Journal of Advanced Robotic Systems 14, no. 2 (2017): 172988141769556. http://dx.doi.org/10.1177/1729881417695560.

Full text
Abstract:
This article concerns the problems of a defective depth map and limited field of view of Kinect-style RGB-D sensors. An anisotropic diffusion based hole-filling method is proposed to recover invalid depth data in the depth map. The field of view of the Kinect-style RGB-D sensor is extended by stitching depth and color images from several RGB-D sensors. By aligning the depth map with the color image, the registration data calculated by registering color images can be used to stitch depth and color images into a depth and color panoramic image concurrently in real time. Experiments show that the
APA, Harvard, Vancouver, ISO, and other styles
4

Wu, Yan, Jiqian Li, and Jing Bai. "Multiple Classifiers-Based Feature Fusion for RGB-D Object Recognition." International Journal of Pattern Recognition and Artificial Intelligence 31, no. 05 (2017): 1750014. http://dx.doi.org/10.1142/s0218001417500148.

Full text
Abstract:
RGB-D-based object recognition has been enthusiastically investigated in the past few years. RGB and depth images provide useful and complementary information. Fusing RGB and depth features can significantly increase the accuracy of object recognition. However, previous works simply take the depth image as the fourth channel of the RGB image and concatenate the RGB and depth features, ignoring the different power of RGB and depth information for different objects. In this paper, a new method which contains three different classifiers is proposed to fuse features extracted from RGB image a
APA, Harvard, Vancouver, ISO, and other styles
5

Kitzler, Florian, Norbert Barta, Reinhard W. Neugschwandtner, Andreas Gronauer, and Viktoria Motsch. "WE3DS: An RGB-D Image Dataset for Semantic Segmentation in Agriculture." Sensors 23, no. 5 (2023): 2713. http://dx.doi.org/10.3390/s23052713.

Full text
Abstract:
Smart farming (SF) applications rely on robust and accurate computer vision systems. An important computer vision task in agriculture is semantic segmentation, which aims to classify each pixel of an image and can be used for selective weed removal. State-of-the-art implementations use convolutional neural networks (CNN) that are trained on large image datasets. In agriculture, publicly available RGB image datasets are scarce and often lack detailed ground-truth information. In contrast to agriculture, other research areas feature RGB-D datasets that combine color (RGB) with additional distanc
APA, Harvard, Vancouver, ISO, and other styles
6

Zheng, Huiming, and Wei Gao. "End-to-End RGB-D Image Compression via Exploiting Channel-Modality Redundancy." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 7 (2024): 7562–70. http://dx.doi.org/10.1609/aaai.v38i7.28588.

Full text
Abstract:
As a kind of 3D data, RGB-D images have been extensively used in object tracking, 3D reconstruction, remote sensing mapping, and other tasks. In the realm of computer vision, the significance of RGB-D images is progressively growing. However, the existing learning-based image compression methods usually process RGB images and depth images separately, which cannot entirely exploit the redundant information between the modalities, limiting the further improvement of the Rate-Distortion performance. With the goal of overcoming the defect, in this paper, we propose a learning-based dual-branch RGB
APA, Harvard, Vancouver, ISO, and other styles
7

Peroš, Josip, Rinaldo Paar, Vladimir Divić, and Boštjan Kovačić. "Fusion of Laser Scans and Image Data—RGB+D for Structural Health Monitoring of Engineering Structures." Applied Sciences 12, no. 22 (2022): 11763. http://dx.doi.org/10.3390/app122211763.

Full text
Abstract:
A novel method for structural health monitoring (SHM) using RGB+D data has recently been proposed. RGB+D data are created by fusing image and laser scan data, where the D channel represents the distance, interpolated from laser scanner data. The RGB channel represents image data obtained by an image sensor integrated in the robotic total station (RTS) telescope, or on top of the telescope, i.e., an image-assisted total station (IATS). Images can also be obtained by conventional cameras, or cameras integrated with RTS (different kinds of prototypes). An RGB+D image combines the advantages of the two measuri
APA, Harvard, Vancouver, ISO, and other styles
8

Yan, Zhiqiang, Hongyuan Wang, Qianhao Ning, and Yinxi Lu. "Robust Image Matching Based on Image Feature and Depth Information Fusion." Machines 10, no. 6 (2022): 456. http://dx.doi.org/10.3390/machines10060456.

Full text
Abstract:
In this paper, we propose a robust image feature extraction and fusion method to effectively fuse image feature and depth information and improve the registration accuracy of RGB-D images. The proposed method directly splices the image feature point descriptors with the corresponding point cloud feature descriptors to obtain the fusion descriptor of the feature points. The fusion feature descriptor is constructed based on the SIFT, SURF, and ORB feature descriptors and the PFH and FPFH point cloud feature descriptors. Furthermore, the registration performance based on fusion features is tested
APA, Harvard, Vancouver, ISO, and other styles
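The abstract above describes building a fused descriptor by directly splicing an image feature descriptor (SIFT, SURF, or ORB) with the corresponding point cloud feature descriptor (PFH or FPFH). A minimal sketch of that splicing step, assuming precomputed descriptors are available as NumPy arrays with one row per matched keypoint; the per-modality L2 normalization is a common practice added here for illustration, not a detail taken from the paper:

```python
import numpy as np

def fuse_descriptors(image_desc, cloud_desc):
    """Concatenate per-keypoint image descriptors (e.g. SIFT, 128-D) with
    per-keypoint point cloud descriptors (e.g. FPFH, 33-D)."""
    assert image_desc.shape[0] == cloud_desc.shape[0], "one row per keypoint"
    # Normalize each modality separately so neither dominates the distance
    # metric, then splice them side by side into one fusion descriptor.
    img = image_desc / (np.linalg.norm(image_desc, axis=1, keepdims=True) + 1e-12)
    pcd = cloud_desc / (np.linalg.norm(cloud_desc, axis=1, keepdims=True) + 1e-12)
    return np.hstack([img, pcd])

# Example: 5 keypoints with SIFT-like (128-D) and FPFH-like (33-D) descriptors.
rng = np.random.default_rng(0)
fused = fuse_descriptors(rng.random((5, 128)), rng.random((5, 33)))
print(fused.shape)  # (5, 161)
```

Matching can then run on the fused rows with any nearest-neighbor search, which is how a concatenated descriptor plugs into a standard RGB-D registration pipeline.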
9

Yuan, Yuan, Zhitong Xiong, and Qi Wang. "ACM: Adaptive Cross-Modal Graph Convolutional Neural Networks for RGB-D Scene Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9176–84. http://dx.doi.org/10.1609/aaai.v33i01.33019176.

Full text
Abstract:
RGB image classification has achieved significant performance improvement with the resurgence of deep convolutional neural networks. However, mono-modal deep models for RGB images still have several limitations when applied to RGB-D scene recognition. 1) Images for scene classification usually contain more than one typical object with flexible spatial distribution, so object-level local features should also be considered in addition to the global scene representation. 2) Multi-modal features in RGB-D scene classification are still under-utilized. Simply combining these modal-specific features suff
APA, Harvard, Vancouver, ISO, and other styles
10

Wang, Z., T. Li, L. Pan, and Z. Kang. "SCENE SEMANTIC SEGMENTATION FROM INDOOR RGB-D IMAGES USING ENCODE-DECODER FULLY CONVOLUTIONAL NETWORKS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W7 (September 12, 2017): 397–404. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w7-397-2017.

Full text
Abstract:
With increasing attention for the indoor environment and the development of low-cost RGB-D sensors, indoor RGB-D images are easily acquired. However, scene semantic segmentation is still an open area, which restricts indoor applications. The depth information can help to distinguish the regions which are difficult to be segmented out from the RGB images with similar color or texture in the indoor scenes. How to utilize the depth information is the key problem of semantic segmentation for RGB-D images. In this paper, we propose an Encode-Decoder Fully Convolutional Networks for RGB-D image clas
APA, Harvard, Vancouver, ISO, and other styles
11

Kanda, Takuya, Kazuya Miyakawa, Jeonghwang Hayashi, et al. "Locating Mechanical Switches Using RGB-D Sensor Mounted on a Disaster Response Robot." Electronic Imaging 2020, no. 6 (2020): 16–1. http://dx.doi.org/10.2352/issn.2470-1173.2020.6.iriacv-016.

Full text
Abstract:
To achieve one of the tasks required for disaster response robots, this paper proposes a method for locating the points of 3D structured switches to be pressed by the robot in disaster sites, using RGB-D images acquired by a Kinect sensor attached to our disaster response robot. Our method consists of the following five steps: 1) Obtain RGB and depth images using an RGB-D sensor. 2) Detect the bounding box of the switch area from the RGB image using YOLOv3. 3) Generate 3D point cloud data of the target switch by combining the bounding box and the depth image. 4) Detect the center position of the switch button f
APA, Harvard, Vancouver, ISO, and other styles
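Step 3 of the pipeline above, generating a 3D point cloud for a detected bounding box from the depth image, amounts to back-projecting each valid depth pixel through the pinhole camera model. A hedged sketch of that step; the Kinect-like intrinsics (fx, fy, cx, cy) and the mean-of-cloud stand-in for the button center are illustrative assumptions, not values or methods from the paper:

```python
import numpy as np

def bbox_to_point_cloud(depth, bbox, fx, fy, cx, cy):
    """Back-project depth pixels inside a bounding box (x0, y0, x1, y1)
    into 3D camera coordinates (meters), skipping missing depth values."""
    x0, y0, x1, y1 = bbox
    patch = depth[y0:y1, x0:x1]          # depth values inside the box
    vs, us = np.mgrid[y0:y1, x0:x1]      # matching pixel coordinates
    valid = patch > 0                    # zero depth means no measurement
    z = patch[valid]
    x = (us[valid] - cx) * z / fx        # pinhole back-projection
    y = (vs[valid] - cy) * z / fy
    return np.column_stack([x, y, z])

# Synthetic flat depth map 1 m from the camera, Kinect-like intrinsics.
depth = np.full((480, 640), 1.0)
cloud = bbox_to_point_cloud(depth, (300, 200, 340, 240), 525.0, 525.0, 319.5, 239.5)
print(cloud.shape)            # (1600, 3): one point per pixel in the 40x40 box
center = cloud.mean(axis=0)   # crude stand-in for step 4's button-center search
```

The real detector in the abstract crops the box from a YOLOv3 detection on the RGB image; here the box coordinates are supplied directly to keep the sketch self-contained.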
12

Dai, Xinxin, Ran Zhao, Pengpeng Hu, and Adrian Munteanu. "KD-Net: Continuous-Keystroke-Dynamics-Based Human Identification from RGB-D Image Sequences." Sensors 23, no. 20 (2023): 8370. http://dx.doi.org/10.3390/s23208370.

Full text
Abstract:
Keystroke dynamics is a soft biometric based on the assumption that humans always type in uniquely characteristic manners. Previous works mainly focused on analyzing the key press or release events. Unlike these methods, we explored a novel visual modality of keystroke dynamics for human identification using a single RGB-D sensor. In order to verify this idea, we created a dataset dubbed KD-MultiModal, which contains 243.2 K frames of RGB images and depth images, obtained by recording a video of hand typing with a single RGB-D sensor. The dataset comprises RGB-D image sequences of 20 subjects
APA, Harvard, Vancouver, ISO, and other styles
13

Lv, Ying, and Wujie Zhou. "Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency." Computational Intelligence and Neuroscience 2020 (November 20, 2020): 1–9. http://dx.doi.org/10.1155/2020/8841681.

Full text
Abstract:
Visual saliency prediction for RGB-D images is more challenging than that for their RGB counterparts. Additionally, very few investigations have been undertaken concerning RGB-D-saliency prediction. The proposed study presents a method based on a hierarchical multimodal adaptive fusion (HMAF) network to facilitate end-to-end prediction of RGB-D saliency. In the proposed method, hierarchical (multilevel) multimodal features are first extracted from an RGB image and depth map using a VGG-16-based two-stream network. Subsequently, the most significant hierarchical features of the said RGB image a
APA, Harvard, Vancouver, ISO, and other styles
14

Tang, Shengjun, Qing Zhu, Wu Chen, et al. "ENHANCED RGB-D MAPPING METHOD FOR DETAILED 3D MODELING OF LARGE INDOOR ENVIRONMENTS." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences III-1 (June 2, 2016): 151–58. http://dx.doi.org/10.5194/isprsannals-iii-1-151-2016.

Full text
Abstract:
RGB-D sensors are novel sensing systems that capture RGB images along with pixel-wise depth information. Although they are widely used in various applications, RGB-D sensors have significant drawbacks with respect to 3D dense mapping of indoor environments. First, they only allow a measurement range with a limited distance (e.g., within 3 m) and a limited field of view. Second, the error of the depth measurement increases with increasing distance to the sensor. In this paper, we propose an enhanced RGB-D mapping method for detailed 3D modeling of large indoor environments by combini
APA, Harvard, Vancouver, ISO, and other styles
15

Tang, Shengjun, Qing Zhu, Wu Chen, et al. "ENHANCED RGB-D MAPPING METHOD FOR DETAILED 3D MODELING OF LARGE INDOOR ENVIRONMENTS." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences III-1 (June 2, 2016): 151–58. http://dx.doi.org/10.5194/isprs-annals-iii-1-151-2016.

Full text
Abstract:
RGB-D sensors are novel sensing systems that capture RGB images along with pixel-wise depth information. Although they are widely used in various applications, RGB-D sensors have significant drawbacks with respect to 3D dense mapping of indoor environments. First, they only allow a measurement range with a limited distance (e.g., within 3 m) and a limited field of view. Second, the error of the depth measurement increases with increasing distance to the sensor. In this paper, we propose an enhanced RGB-D mapping method for detailed 3D modeling of large indoor environments by combini
APA, Harvard, Vancouver, ISO, and other styles
16

Li, Shipeng, Di Li, Chunhua Zhang, Jiafu Wan, and Mingyou Xie. "RGB-D Image Processing Algorithm for Target Recognition and Pose Estimation of Visual Servo System." Sensors 20, no. 2 (2020): 430. http://dx.doi.org/10.3390/s20020430.

Full text
Abstract:
This paper studies the control performance of visual servoing systems under planar and RGB-D cameras. The contribution of this paper is to strengthen the performance indicators of the visual servoing system, such as real-time operation and accuracy, through rapid identification of target RGB-D images and precise measurement in the depth direction. Firstly, color images acquired by the RGB-D camera are segmented based on optimized normalized cuts. Next, the gray scale is restored according to the histogram feature of the target image. Then, the obtained 2D graphics depth information and the enhanced
APA, Harvard, Vancouver, ISO, and other styles
17

Tu, Shuqin (涂淑琴), Yueju Xue (薛月菊), Yun Liang (梁云), Ning Huang (黄宁), and Xiao Zhang (张晓). "Review on RGB-D Image Classification." Laser & Optoelectronics Progress 53, no. 6 (2016): 060003. http://dx.doi.org/10.3788/lop53.060003.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Lin, Wei-Yang, Chih-Fong Tsai, Pei-Chen Wu, and Bo-Rong Chen. "Image retargeting using RGB-D camera." Multimedia Tools and Applications 74, no. 9 (2014): 3155–70. http://dx.doi.org/10.1007/s11042-013-1776-2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

B L, Sunil Kumar, and Sharmila Kumari M. "RGB-D FACE RECOGNITION USING LBP-DCT ALGORITHM." Applied Computer Science 17, no. 3 (2021): 73–81. http://dx.doi.org/10.35784/acs-2021-22.

Full text
Abstract:
Face recognition is one of the applications in image processing that recognizes or verifies an individual's identity. 2D images are used to identify the face, but the problem is that this kind of image is very sensitive to changes in lighting and various angles of view. Images captured by 3D and stereo cameras can also be used for recognition, but fairly long processing times are needed. RGB-D images produced by Kinect are used as a new, alternative approach to 3D images. Such cameras cost less and can be used in any situation and any environment. This paper shows the face recognition
APA, Harvard, Vancouver, ISO, and other styles
20

Du, Qinsheng, Yingxu Bian, Jianyu Wu, Shiyan Zhang, and Jian Zhao. "Cross-Modal Adaptive Interaction Network for RGB-D Saliency Detection." Applied Sciences 14, no. 17 (2024): 7440. http://dx.doi.org/10.3390/app14177440.

Full text
Abstract:
The salient object detection (SOD) task aims to automatically detect the most prominent areas observed by the human eye in an image. Since RGB images and depth images contain different information, how to effectively integrate cross-modal features in the RGB-D SOD task remains a major challenge. Therefore, this paper proposes a cross-modal adaptive interaction network (CMANet) for the RGB-D salient object detection task, which consists of a cross-modal feature integration module (CMF) and an adaptive feature fusion module (AFFM). These modules are designed to integrate and enhance multi-scale
APA, Harvard, Vancouver, ISO, and other styles
21

Xu, Chi, Jun Zhou, Wendi Cai, Yunkai Jiang, Yongbo Li, and Yi Liu. "Robust 3D Hand Detection from a Single RGB-D Image in Unconstrained Environments." Sensors 20, no. 21 (2020): 6360. http://dx.doi.org/10.3390/s20216360.

Full text
Abstract:
Three-dimensional hand detection from a single RGB-D image is an important technology which supports many useful applications. Practically, it is challenging to robustly detect human hands in unconstrained environments because the RGB-D channels can be affected by many uncontrollable factors, such as light changes. To tackle this problem, we propose a 3D hand detection approach which improves the robustness and accuracy by adaptively fusing the complementary features extracted from the RGB-D channels. Using the fused RGB-D feature, the 2D bounding boxes of hands are detected first, and then th
APA, Harvard, Vancouver, ISO, and other styles
22

Aubry, Sophie, Sohaib Laraba, Joëlle Tilmanne, and Thierry Dutoit. "Action recognition based on 2D skeletons extracted from RGB videos." MATEC Web of Conferences 277 (2019): 02034. http://dx.doi.org/10.1051/matecconf/201927702034.

Full text
Abstract:
In this paper, a methodology to recognize actions from RGB videos is proposed which takes advantage of recent breakthroughs in deep learning. Following the development of Convolutional Neural Networks (CNNs), research was conducted on the transformation of skeletal motion data into 2D images. In this work, a solution is proposed requiring only RGB videos instead of RGB-D videos. This work is based on multiple works studying the conversion of RGB-D data into 2D images. From a video stream (RGB images), a two-dimensional skeleton of 18 joints for each detected body is extrac
APA, Harvard, Vancouver, ISO, and other styles
23

Kostusiak, Aleksander, and Piotr Skrzypczyński. "Enhancing Visual Odometry with Estimated Scene Depth: Leveraging RGB-D Data with Deep Learning." Electronics 13, no. 14 (2024): 2755. http://dx.doi.org/10.3390/electronics13142755.

Full text
Abstract:
Advances in visual odometry (VO) systems have benefited from the widespread use of affordable RGB-D cameras, improving indoor localization and mapping accuracy. However, older sensors like the Kinect v1 face challenges due to depth inaccuracies and incomplete data. This study compares indoor VO systems that use RGB-D images, exploring methods to enhance depth information. We examine conventional image inpainting techniques and a deep learning approach, utilizing newer depth data from devices like the Kinect v2. Our research highlights the importance of refining data from lower-quality sensors,
APA, Harvard, Vancouver, ISO, and other styles
24

Sun, Qingbo. "Research on RGB-D image recognition technology based on feature fusion and machine learning." Journal of Physics: Conference Series 2031, no. 1 (2021): 012022. http://dx.doi.org/10.1088/1742-6596/2031/1/012022.

Full text
Abstract:
The three-dimensional RGB-D image contains not only the color and texture information of a two-dimensional image, but also the surface geometry information of the target. This article analyzes RGB-D image recognition methods, including stereo vision technology, structured light technology, etc. By studying the application of RGB-D image recognition technology against the background of feature fusion and machine learning, the purpose is to improve the richness of image recognition content and provide a reference for the smooth development of follow-up work.
APA, Harvard, Vancouver, ISO, and other styles
25

Chen, Liang Chia, and Nguyen Van Thai. "Real-Time 3-D Mapping for Indoor Environments Using RGB-D Cameras." Advanced Materials Research 579 (October 2012): 435–44. http://dx.doi.org/10.4028/www.scientific.net/amr.579.435.

Full text
Abstract:
For three-dimensional (3-D) mapping, 3-D laser scanners and stereo camera systems have so far been widely used due to their high measurement range and accuracy. For stereo camera systems, establishing corresponding point pairs between two images is one crucial step for reconstructing depth information. However, mapping approaches using laser scanners are still restricted by the serious constraint of accurate image registration and mapping. In recent years, time-of-flight (ToF) cameras have been used for mapping tasks, providing high frame rates while preserving a compact size, but they lack in measureme
APA, Harvard, Vancouver, ISO, and other styles
26

Peng, M., W. Wan, Y. Xing, et al. "INTEGRATING DEPTH AND IMAGE SEQUENCES FOR PLANETARY ROVER MAPPING USING RGB-D SENSOR." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-3 (April 30, 2018): 1369–74. http://dx.doi.org/10.5194/isprs-archives-xlii-3-1369-2018.

Full text
Abstract:
An RGB-D camera allows the capture of depth and color information at high data rates, which makes it possible and beneficial to integrate depth and image sequences for planetary rover mapping. The proposed mapping method consists of three steps. First, the strict projection relationship among 3D space, depth data, and visual texture data is established based on the imaging principle of the RGB-D camera. Then, an extended bundle adjustment (BA) based SLAM method with integrated 2D and 3D measurements is applied to the image network for high-precision pose estimation. Next, as the interior and exterior
APA, Harvard, Vancouver, ISO, and other styles
27

Wang, Liang, and Zhiqiu Wu. "RGB-D SLAM with Manhattan Frame Estimation Using Orientation Relevance." Sensors 19, no. 5 (2019): 1050. http://dx.doi.org/10.3390/s19051050.

Full text
Abstract:
Due to image noise, image blur, and inconsistency between depth data and the color image, the accuracy and robustness of the pairwise spatial transformation computed by matching extracted features of detected key points in existing sparse Red Green Blue-Depth (RGB-D) Simultaneous Localization And Mapping (SLAM) algorithms are poor. Considering that most indoor environments follow the Manhattan World assumption and the Manhattan Frame can be used as a reference to compute the pairwise spatial transformation, a new RGB-D SLAM algorithm is proposed. It first performs the Manhattan Frame Estimation
APA, Harvard, Vancouver, ISO, and other styles
28

Zhao, Bohu, Lebao Li, and Haipeng Pan. "Non-Local Means Hole Repair Algorithm Based on Adaptive Block." Applied Sciences 14, no. 1 (2023): 159. http://dx.doi.org/10.3390/app14010159.

Full text
Abstract:
RGB-D cameras provide depth and color information and are widely used in 3D reconstruction and computer vision. In the majority of existing RGB-D cameras, a considerable portion of depth values is often lost due to severe occlusion or limited camera coverage, thereby adversely impacting the precise localization and three-dimensional reconstruction of objects. In this paper, to address the issue of poor-quality depth images captured by RGB-D cameras, a depth image hole repair algorithm based on non-local means is proposed first, leveraging the structural similarities between grayscale and de
APA, Harvard, Vancouver, ISO, and other styles
29

Chi, Chen Tung, Shih Chien Yang, and Yin Tien Wang. "Calibration of RGB-D Sensors for Robot SLAM." Applied Mechanics and Materials 479-480 (December 2013): 677–81. http://dx.doi.org/10.4028/www.scientific.net/amm.479-480.677.

Full text
Abstract:
This paper presents a calibration procedure for a Kinect RGB-D sensor and its application to robot simultaneous localization and mapping (SLAM). The calibration procedure consists of two stages: in the first stage, the RGB image is aligned with the depth image using bilinear interpolation; in the second stage, the distorted RGB image is corrected. The calibrated RGB-D sensor is used as the sensing device for robot navigation in an unknown environment. In SLAM tasks, speeded-up robust features (SURF) are detected from the RGB image and used as landmarks in the environment map. The
APA, Harvard, Vancouver, ISO, and other styles
30

Jiang, Ming-xin, Chao Deng, Ming-min Zhang, Jing-song Shan, and Haiyan Zhang. "Multimodal Deep Feature Fusion (MMDFF) for RGB-D Tracking." Complexity 2018 (November 28, 2018): 1–8. http://dx.doi.org/10.1155/2018/5676095.

Full text
Abstract:
Visual tracking is still a challenging task due to occlusion, appearance changes, complex motion, etc. We propose a novel RGB-D tracker based on multimodal deep feature fusion (MMDFF) in this paper. The MMDFF model consists of four deep Convolutional Neural Networks (CNNs): a motion-specific CNN, an RGB-specific CNN, a depth-specific CNN, and an RGB-depth correlated CNN. The depth image is encoded into three channels which are sent into the depth-specific CNN to extract deep depth features. The optical flow image is calculated for every frame and then fed to the motion-specific CNN to learn deep motion features
APA, Harvard, Vancouver, ISO, and other styles
31

Jung, Geunho, Yong-Yuk Won, and Sang Min Yoon. "Computational Large Field-of-View RGB-D Integral Imaging System." Sensors 21, no. 21 (2021): 7407. http://dx.doi.org/10.3390/s21217407.

Full text
Abstract:
The integral imaging system has received considerable research attention because it can be applied to real-time three-dimensional image displays with a continuous view angle without supplementary devices. Most previous approaches place a physical micro-lens array in front of the image, where each lens looks different depending on the viewing angle. A computational integral imaging system with a virtual micro-lens arrays has been proposed in order to provide flexibility for users to change micro-lens arrays and focal length while reducing distortions due to physical mismatches with the lens arr
APA, Harvard, Vancouver, ISO, and other styles
32

Wang, Huiqun, Di Huang, Kui Jia, and Yunhong Wang. "Hierarchical Image Segmentation Ensemble for Objectness in RGB-D Images." IEEE Transactions on Circuits and Systems for Video Technology 29, no. 1 (2019): 93–103. http://dx.doi.org/10.1109/tcsvt.2017.2776220.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Liu, Weiyu, and Nan Di. "RSCS6D: Keypoint Extraction-Based 6D Pose Estimation." Applied Sciences 15, no. 12 (2025): 6729. https://doi.org/10.3390/app15126729.

Full text
Abstract:
In this work, we propose an improved network, RSCS6D, for 6D pose estimation from RGB-D images by extracting keypoint-based point clouds. Our key insight is that keypoint cloud can reduce data redundancy in 3D point clouds and accelerate the convergence of convolutional neural networks. First, we employ a semantic segmentation network on the RGB image to obtain mask images containing positional information and per-pixel labels. Next, we introduce a novel keypoint cloud extraction algorithm that combines RGB and depth images to detect 2D keypoints and convert them into 3D keypoints. Specificall
APA, Harvard, Vancouver, ISO, and other styles
34

Zeng, Hui, Bin Yang, Xiuqing Wang, Jiwei Liu, and Dongmei Fu. "RGB-D Object Recognition Using Multi-Modal Deep Neural Network and DS Evidence Theory." Sensors 19, no. 3 (2019): 529. http://dx.doi.org/10.3390/s19030529.

Full text
Abstract:
With the development of low-cost RGB-D (Red Green Blue-Depth) sensors, RGB-D object recognition has attracted more and more researchers' attention in recent years. The deep learning technique has become popular in the field of image analysis and has achieved competitive results. To make full use of the effective identification information in the RGB and depth images, we propose a multi-modal deep neural network and a DS (Dempster-Shafer) evidence theory based RGB-D object recognition method. First, the RGB and depth images are preprocessed and two convolutional neural networks are trained, res
APA, Harvard, Vancouver, ISO, and other styles
35

Feng, Guanyuan, Lin Ma, and Xuezhi Tan. "Visual Map Construction Using RGB-D Sensors for Image-Based Localization in Indoor Environments." Journal of Sensors 2017 (2017): 1–18. http://dx.doi.org/10.1155/2017/8037607.

Full text
Abstract:
RGB-D sensors capture RGB images and depth images simultaneously, which makes it possible to acquire the depth information at pixel level. This paper focuses on the use of RGB-D sensors to construct a visual map which is an extended dense 3D map containing essential elements for image-based localization, such as poses of the database camera, visual features, and 3D structures of the building. Taking advantage of matched visual features and corresponding depth values, a novel local optimization algorithm is proposed to achieve point cloud registration and database camera pose estimation. Next,
APA, Harvard, Vancouver, ISO, and other styles
36

Hacking, Chris, Nitesh Poona, Nicola Manzan, and Carlos Poblete-Echeverría. "Investigating 2-D and 3-D Proximal Remote Sensing Techniques for Vineyard Yield Estimation." Sensors 19, no. 17 (2019): 3652. http://dx.doi.org/10.3390/s19173652.

Full text
Abstract:
Vineyard yield estimation provides the winegrower with insightful information regarding the expected yield, facilitating managerial decisions to achieve maximum quantity and quality and assisting the winery with logistics. The use of proximal remote sensing technology and techniques for yield estimation has produced limited success within viticulture. In this study, 2-D RGB and 3-D RGB-D (Kinect sensor) imagery were investigated for yield estimation in a vertical shoot positioned (VSP) vineyard. Three experiments were implemented, including two measurement levels and two canopy treatments. The
APA, Harvard, Vancouver, ISO, and other styles
37

Salazar, Isail, Said Pertuz, and Fabio Martínez. "Multi-modal RGB-D Image Segmentation from Appearance and Geometric Depth Maps." TecnoLógicas 23, no. 48 (2020): 143–61. http://dx.doi.org/10.22430/22565337.1538.

Full text
Abstract:
Classical image segmentation algorithms exploit the detection of similarities and discontinuities of different visual cues to define and differentiate multiple regions of interest in images. However, due to the high variability and uncertainty of image data, producing accurate results is difficult. In other words, segmentation based just on color is often insufficient for a large percentage of real-life scenes. This work presents a novel multi-modal segmentation strategy that integrates depth and appearance cues from RGB-D images by building a hierarchical region-based representation, i.e., a
APA, Harvard, Vancouver, ISO, and other styles
38

Sudharshan Duth, P., and M. Mary Deepa. "Color detection in RGB-modeled images using MATLAB." International Journal of Engineering & Technology 7, no. 2.31 (2018): 29. http://dx.doi.org/10.14419/ijet.v7i2.31.13391.

Full text
Abstract:
This research work introduces a method of using color thresholds to identify colors in two-dimensional images in MATLAB, using the RGB color model to recognize the color preferred by the user in the picture. The method converts a 3-D RGB image into a grayscale image, then subtracts the two images to obtain a 2-D black-and-white picture, filters the noise pixels using a median filter, labels connected components in the digital picture, and uses the bounding box and its properties to calculate the metric for every mark
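The pipeline this abstract describes (grayscale conversion, channel subtraction, thresholding, connected-component labeling, bounding boxes) can be sketched roughly as follows. The threshold is a placeholder rather than a value from the paper, and the median-filter denoising step is omitted for brevity:

```python
import numpy as np

def detect_red_regions(rgb, thresh=50):
    """Rough sketch of the color-threshold pipeline: find reddish regions."""
    gray = rgb.mean(axis=2)                     # 3-D RGB -> grayscale
    diff = rgb[:, :, 0].astype(float) - gray    # emphasize the red channel
    mask = diff > thresh                        # 2-D black-and-white picture
    return label_regions(mask)

def label_regions(mask):
    """4-connected component labeling via flood fill; returns bounding boxes."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    boxes, current = [], 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and labels[i, j] == 0:
                current += 1
                stack = [(i, j)]
                rmin = rmax = i
                cmin = cmax = j
                while stack:
                    r, c = stack.pop()
                    if 0 <= r < h and 0 <= c < w and mask[r, c] and labels[r, c] == 0:
                        labels[r, c] = current
                        rmin, rmax = min(rmin, r), max(rmax, r)
                        cmin, cmax = min(cmin, c), max(cmax, c)
                        stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
                boxes.append((rmin, cmin, rmax, cmax))   # (row0, col0, row1, col1)
    return boxes
```

A real implementation would typically use library routines for the filtering and labeling steps; the flood fill here just makes the sketch self-contained.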
APA, Harvard, Vancouver, ISO, and other styles
39

Heravi, Hamed, Roghaieh Aghaeifard, Ali Rahimpour Jounghani, Afshin Ebrahimi, and Masumeh Delgarmi. "EXTRACTING FEATURES OF THE HUMAN FACE FROM RGB-D IMAGES TO PLAN FACIAL SURGERIES." Biomedical Engineering: Applications, Basis and Communications 32, no. 06 (2020): 2050042. http://dx.doi.org/10.4015/s1016237220500428.

Full text
Abstract:
Biometric identification of the human face is a pervasive subject that deals with a wide range of disciplines, such as image processing, computer vision, pattern recognition, artificial intelligence, and cognitive psychology. Extracting key facial points for developing software and commercial devices for facial surgery analysis is one of the most challenging fields in computer image and vision processing. Many studies have developed a variety of techniques to extract facial features from color and gray images. In recent years, using depth information has opened up new approaches to researchers in t
APA, Harvard, Vancouver, ISO, and other styles
40

Na, Myung Hwan, Wan Hyun Cho, Sang Kyoon Kim, and In Seop Na. "Automatic Weight Prediction System for Korean Cattle Using Bayesian Ridge Algorithm on RGB-D Image." Electronics 11, no. 10 (2022): 1663. http://dx.doi.org/10.3390/electronics11101663.

Full text
Abstract:
Weighing the Hanwoo (Korean cattle) is very important for Korean beef producers, who must sell the Hanwoo at the right time. Recently, research has been conducted on automatically predicting the weight of Hanwoo from images alone, building on advances in deep learning and image recognition. In this paper, we propose a method for the automatic weight prediction of Hanwoo using the Bayesian ridge algorithm on RGB-D images. The proposed system consists of three parts: segmentation, extraction of features, and estimation of the weight of Korean cattle from a given RGB-D image. Th
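The Bayesian ridge estimator named in the abstract has a closed-form posterior mean when the prior and noise precisions are held fixed. A minimal sketch of that core computation (the paper's feature extraction and hyperparameter estimation are not reproduced, and the toy data below are hypothetical):

```python
import numpy as np

def bayesian_ridge_mean(X, y, alpha=1.0, beta=1.0):
    """Posterior mean of Bayesian linear regression with fixed precisions.

    alpha: precision of the zero-mean Gaussian prior on the weights
    beta:  precision of the Gaussian observation noise
    (scikit-learn's BayesianRidge additionally estimates alpha and beta
    from the data.)
    """
    d = X.shape[1]
    S_inv = alpha * np.eye(d) + beta * X.T @ X      # posterior precision
    return beta * np.linalg.solve(S_inv, X.T @ y)   # posterior mean weights

# Hypothetical example: weight (kg) regressed on one shape feature
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([100.0, 200.0, 300.0, 400.0])
w = bayesian_ridge_mean(X, y, alpha=1e-6, beta=1.0)
```

With a nearly flat prior (tiny alpha), the posterior mean approaches the ordinary least-squares solution, here a slope of about 100 kg per feature unit.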
APA, Harvard, Vancouver, ISO, and other styles
41

Kong, Yuqiu, He Wang, Lingwei Kong, Yang Liu, Cuili Yao, and Baocai Yin. "Absolute and Relative Depth-Induced Network for RGB-D Salient Object Detection." Sensors 23, no. 7 (2023): 3611. http://dx.doi.org/10.3390/s23073611.

Full text
Abstract:
Detecting salient objects in complicated scenarios is a challenging problem. Besides semantic features from the RGB image, spatial information from the depth image also provides sufficient cues about the object. Therefore, it is crucial to rationally integrate RGB and depth features for the RGB-D salient object detection task. Most existing RGB-D saliency detectors modulate RGB semantic features with absolute depth values. However, they ignore the appearance contrast and structure knowledge indicated by relative depth values between pixels. In this work, we propose a depth-induced network
APA, Harvard, Vancouver, ISO, and other styles
42

Liu, Zhengyi, Tengfei Song, and Feng Xie. "RGB-D image saliency detection from 3D perspective." Multimedia Tools and Applications 78, no. 6 (2018): 6787–804. http://dx.doi.org/10.1007/s11042-018-6319-4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Hong, Sungjin, and Myounggyu Kim. "A Framework for Human Body Parts Detection in RGB-D Image." Journal of Korea Multimedia Society 19, no. 12 (2016): 1927–35. http://dx.doi.org/10.9717/kmms.2016.19.12.1927.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Ouloul, M. I., Z. Moutakki, K. Afdel, and A. Amghar. "An Efficient Face Recognition Using SIFT Descriptor in RGB-D Images." International Journal of Electrical and Computer Engineering (IJECE) 5, no. 6 (2015): 1227. http://dx.doi.org/10.11591/ijece.v5i6.pp1227-1233.

Full text
Abstract:
Automatic face recognition has evolved considerably in the last decade, owing to its wide use in security systems. Most facial recognition approaches use 2D images, but the problem is that this type of image is very sensitive to illumination and lighting changes. Another approach uses 3D and stereo cameras, but it is rarely used because it requires a relatively long processing time. A new approach has arisen in this field based on RGB-D images produced by the Kinect; this type of camera costs less and can be used in any environment and under an
APA, Harvard, Vancouver, ISO, and other styles
45

Chen, Songnan, Mengxia Tang, Ruifang Dong, and Jiangming Kan. "Encoder–Decoder Structure Fusing Depth Information for Outdoor Semantic Segmentation." Applied Sciences 13, no. 17 (2023): 9924. http://dx.doi.org/10.3390/app13179924.

Full text
Abstract:
The semantic segmentation of outdoor images is the cornerstone of scene understanding and plays a crucial role in the autonomous navigation of robots. Although RGB-D images can provide additional depth information for improving the performance of semantic segmentation tasks, current state-of-the-art methods directly use ground truth depth maps for depth information fusion, which relies on highly developed and expensive depth sensors. Aiming to solve this problem, we propose a self-calibrated RGB-D image semantic segmentation neural network model based on an improved residual network without
APA, Harvard, Vancouver, ISO, and other styles
46

Yan, Hailong, Wenqi Wu, Zhenghua Deng, Junjian Huang, Zhizhang Li, and Luting Zhang. "Image Inpainting for 3D Reconstruction Based on the Known Region Boundaries." Mathematics 10, no. 15 (2022): 2761. http://dx.doi.org/10.3390/math10152761.

Full text
Abstract:
A point cloud is a collection of 3D coordinates of objects in a 3D scene. Generally, the points in a point cloud represent the outer surface of an object. Point clouds are widely used in 3D reconstruction applications in various fields. When obtaining point cloud data from RGB-D images, if part of the information in the RGB-D images is lost or damaged, the point cloud data will be hollow or too sparse, which hinders its subsequent use. Based on the boundary of the region to be repaired, we propose to repair the damaged image and synthesize the complete point cloud
APA, Harvard, Vancouver, ISO, and other styles
47

Banchajarurat, Chanikan, Khwantri Saengprachatanarug, Nattpol Damrongplasit, and Chanat Ratanasumawong. "Volume estimation of cassava using consumer-grade RGB-D camera." E3S Web of Conferences 187 (2020): 02002. http://dx.doi.org/10.1051/e3sconf/202018702002.

Full text
Abstract:
Mismanagement during postharvest handling of cassava can degrade the quality of the product and depreciate its selling price considerably. This study examined the feasibility of using an RGB-depth camera to measure the quality of cassava roots in a non-destructive, fast, and cost-effective manner. A methodology to estimate the volume of cassava roots of the Kasetsart 50 variety was the focus of this study. RGB-D images were collected from 60 cassava samples, with each sample photographed from 6 different orientations. An image processing model and a point cloud model were used to find the volume from depth
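One simple way to turn a top-down depth map into a volume estimate — a generic technique, not necessarily the models used in the paper — is to integrate per-pixel columns between the object surface and the background, scaling each pixel's footprint by its depth:

```python
import numpy as np

def volume_from_depth(depth, background, fx, fy):
    """Estimate object volume (m^3) from a top-down depth map.

    depth, background: depth maps in meters; fx, fy: focal lengths in
    pixels. A pixel's physical footprint at distance z is (z/fx)*(z/fy),
    so it grows quadratically with depth.
    """
    height = np.clip(background - depth, 0.0, None)   # column height per pixel
    footprint = (depth / fx) * (depth / fy)           # pixel area at the surface
    return float(np.sum(height * footprint))

# Example: a flat 10 cm slab filling a 100 x 100 view, seen from 0.9 m
depth = np.full((100, 100), 0.9)
background = np.full((100, 100), 1.0)
vol = volume_from_depth(depth, background, fx=100.0, fy=100.0)
```

Point-cloud-based approaches (e.g., convex hulls or voxel grids over the merged views) would refine this for irregular roots.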
APA, Harvard, Vancouver, ISO, and other styles
48

Zhang, Heng, Zhenqiang Wen, Yanli Liu, and Gang Xu. "Edge Detection from RGB-D Image Based on Structured Forests." Journal of Sensors 2016 (2016): 1–10. http://dx.doi.org/10.1155/2016/5328130.

Full text
Abstract:
This paper looks into a fundamental problem in computer vision: edge detection. We propose a new edge detector using structured random forests as the classifier, which can make full use of RGB-D image information from the Kinect. Before classification, an adaptive bilateral filter is used to denoise the depth image. As data sources, 13 channels of information are computed from the RGB-D image. In order to train the random forest classifier, an approximate measure of the information gain is used. All the structured labels at a given node are mapped to a discrete set of lab
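The per-pixel channel stack the abstract mentions can be sketched as follows. This is an illustrative subset (color, gradients, depth), not the paper's exact 13 channels:

```python
import numpy as np

def rgbd_channels(rgb, depth):
    """Stack a few per-pixel feature channels from an RGB-D pair.

    Illustrative subset only: 3 color channels, grayscale gradient
    magnitude, raw depth, and depth gradient magnitude (6 channels).
    """
    gray = rgb.mean(axis=2)
    gy, gx = np.gradient(gray)    # image gradients, row and column direction
    dy, dx = np.gradient(depth)
    channels = [rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2],
                np.hypot(gx, gy),   # color-edge strength
                depth,
                np.hypot(dx, dy)]   # depth-edge strength
    return np.stack(channels, axis=-1)
```

A structured forest would then be trained on patches of such a channel stack, with each leaf predicting a local edge mask.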
APA, Harvard, Vancouver, ISO, and other styles
49

Lin, Jiaying, Yuen-Hei Yeung, Shuquan Ye, and Rynson W. H. Lau. "Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 5 (2025): 5254–61. https://doi.org/10.1609/aaai.v39i5.32558.

Full text
Abstract:
Glass surfaces are becoming increasingly ubiquitous as modern buildings tend to use a lot of glass panels. This, however, poses substantial challenges to the operations of autonomous systems such as robots, self-driving cars, and drones, as the glass panels can become transparent obstacles to navigation. Existing works attempt to exploit various cues, including glass boundary context or reflections, as a prior. However, they are all based on input RGB images. We observe that the transmission of 3D depth sensor light through glass surfaces often produces blank regions in the depth maps, which c
APA, Harvard, Vancouver, ISO, and other styles
50

Roman-Rivera, Luis-Rogelio, Israel Sotelo-Rodríguez, Jesus Carlos Pedraza-Ortega, Marco Antonio Aceves-Fernandez, Juan Manuel Ramos-Arreguín, and Efrén Gorrostieta-Hurtado. "Reduced Calibration Strategy Using a Basketball for RGB-D Cameras." Mathematics 10, no. 12 (2022): 2085. http://dx.doi.org/10.3390/math10122085.

Full text
Abstract:
RGB-D cameras produce depth and color information commonly used in the 3D reconstruction and computer vision areas. Different cameras of the same model usually produce images with different calibration errors. The color and depth layers usually require calibration to minimize alignment errors, adjust precision, and improve data quality in general. Standard calibration protocols for RGB-D cameras require a controlled environment that allows operators to take many RGB and depth image pairs as input for calibration frameworks, making the calibration protocol challenging to implement without idea
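A spherical target such as a basketball is typically located in the depth data by fitting a sphere to its surface points. An algebraic least-squares fit — a generic technique, not necessarily the paper's exact procedure — can be sketched as:

```python
import numpy as np

def fit_sphere(points):
    """Algebraic least-squares sphere fit to an N x 3 point array.

    Rewrites |p - c|^2 = r^2 as the linear model
    |p|^2 = 2 c . p + (r^2 - |c|^2) and solves it with least squares.
    Returns (center, radius).
    """
    A = np.hstack([2 * points, np.ones((len(points), 1))])
    b = np.sum(points ** 2, axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)
    return center, radius
```

Because the model is linear in the unknowns, no initial guess is needed; a geometric refinement step could follow for noisy depth data.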
APA, Harvard, Vancouver, ISO, and other styles