
Journal articles on the topic '3D Human Pose Estimation'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic '3D Human Pose Estimation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Wei, Guoqiang, Cuiling Lan, Wenjun Zeng, and Zhibo Chen. "View Invariant 3D Human Pose Estimation." IEEE Transactions on Circuits and Systems for Video Technology 30, no. 12 (December 2020): 4601–10. http://dx.doi.org/10.1109/tcsvt.2019.2928813.

2

Bao, Wenxia, Zhongyu Ma, Dong Liang, Xianjun Yang, and Tao Niu. "Pose ResNet: 3D Human Pose Estimation Based on Self-Supervision." Sensors 23, no. 6 (March 12, 2023): 3057. http://dx.doi.org/10.3390/s23063057.

Abstract:
The accurate estimation of a 3D human pose is of great importance in many fields, such as human–computer interaction, motion recognition and automatic driving. In view of the difficulty of obtaining 3D ground truth labels for 3D pose estimation datasets, we take 2D images as the research object in this paper and propose a self-supervised 3D pose estimation model called Pose ResNet. ResNet50 is used as the basic network to extract features. First, a convolutional block attention module (CBAM) is introduced to refine the selection of significant pixels. Then, a waterfall atrous spatial pooling (WASP) module is used to capture multi-scale contextual information from the extracted features and increase the receptive field. Finally, the features are input into a deconvolution network to acquire the volumetric heatmap, which is then processed by a soft argmax function to obtain the coordinates of the joints. In addition to the two learning strategies of transfer learning and synthetic occlusion, a self-supervised training method is used in this model, in which 3D labels are constructed by epipolar geometry transformation to supervise the training of the network. Without the need for 3D ground truths for the dataset, an accurate 3D human pose can be estimated from a single 2D image. The results show a mean per joint position error (MPJPE) of 74.6 mm without the need for 3D ground truth labels. Compared with other approaches, the proposed method achieves better results.
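The heatmap-to-coordinates step this abstract describes, a soft argmax over a volumetric heatmap, can be sketched as follows. This is an illustrative NumPy version under an assumed tensor shape of `(joints, depth, height, width)`, not the authors' code:

```python
import numpy as np

def soft_argmax_3d(volume_heatmap):
    """Recover continuous (x, y, z) joint coordinates from a volumetric
    heatmap as a softmax-weighted expectation (the soft-argmax trick).

    volume_heatmap: array of shape (J, D, H, W), one volume per joint.
    Returns: array of shape (J, 3) with (x, y, z) in voxel units.
    """
    J, D, H, W = volume_heatmap.shape
    flat = volume_heatmap.reshape(J, -1)
    flat = flat - flat.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(flat) / np.exp(flat).sum(axis=1, keepdims=True)
    probs = probs.reshape(J, D, H, W)

    xs, ys, zs = np.arange(W), np.arange(H), np.arange(D)
    x = (probs.sum(axis=(1, 2)) * xs).sum(axis=1)      # marginal over width
    y = (probs.sum(axis=(1, 3)) * ys).sum(axis=1)      # marginal over height
    z = (probs.sum(axis=(2, 3)) * zs).sum(axis=1)      # marginal over depth
    return np.stack([x, y, z], axis=1)
```

Because the expectation is differentiable, unlike a hard argmax, it can sit at the end of a network and be trained end to end.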
3

Nguyen, Hung-Cuong, Thi-Hao Nguyen, Rafal Scherer, and Van-Hung Le. "Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications." Sensors 22, no. 14 (July 20, 2022): 5419. http://dx.doi.org/10.3390/s22145419.

Abstract:
Three-dimensional human pose estimation is widely applied in sports, robotics, and healthcare. In the past five years, CNN-based studies of 3D human pose estimation have been numerous and have yielded impressive results. However, studies often focus only on improving the accuracy of the estimation results. In this paper, we propose a fast, unified end-to-end model for estimating 3D human pose, called YOLOv5-HR-TCM (YOLOv5-HRet-Temporal Convolution Model). Our proposed model is based on the 2D-to-3D lifting approach to 3D human pose estimation and takes care of each step in the estimation process: person detection, 2D human pose estimation, and 3D human pose estimation. The proposed model combines best practices at each stage, and is evaluated on the Human 3.6M dataset and compared with other methods at each step. The method achieves high accuracy without sacrificing processing speed; the whole pipeline runs at 3.146 FPS on a low-end computer. In particular, we propose a sports scoring application based on the deviation angle between the estimated 3D human posture and the standard (reference) origin. The average deviation angle evaluated on the Human 3.6M dataset (Protocol #1) is 8.2 degrees.
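The deviation-angle score mentioned at the end can be illustrated as the mean angle between corresponding limb direction vectors of the estimated and reference poses. A hypothetical sketch, where the joint indexing and limb pairs are assumptions rather than the paper's definition:

```python
import numpy as np

def limb_deviation_angles(pred, ref, limbs):
    """Average angle (degrees) between corresponding limb direction
    vectors of an estimated and a reference 3D pose.

    pred, ref: (J, 3) arrays of joint coordinates.
    limbs: list of (parent, child) joint-index pairs.
    """
    angles = []
    for p, c in limbs:
        u = pred[c] - pred[p]                  # estimated limb direction
        v = ref[c] - ref[p]                    # reference limb direction
        cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        angles.append(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
    return float(np.mean(angles))
```

A scoring application would then threshold or grade this mean angle per exercise repetition.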
4

Yin, He, Chang Lv, and Yeqin Shao. "3D Human Pose Estimation Based on Transformer." Journal of Physics: Conference Series 2562, no. 1 (August 1, 2023): 012067. http://dx.doi.org/10.1088/1742-6596/2562/1/012067.

Abstract:
Currently, 3D human pose estimation has gradually become a popular topic. Although various models based on deep neural networks have produced excellent performance, they still ignore the existence of multiple feasible pose solutions and suffer from a relatively fixed input length. To solve these issues, a coordinate transformer encoder based on the 2D pose is constructed to generate multiple feasible pose solutions, and multi-to-one pose mapping is employed to generate a reliable pose. A temporal transformer encoder is used to exploit the temporal dependencies of consecutive pose sequences, which avoids the relatively fixed input length imposed by temporal dilated convolution. Extensive experiments indicate that our model achieves promising performance.
5

Jiang, Longkui, Yuru Wang, and Weijia Li. "Regress 3D human pose from 2D skeleton with kinematics knowledge." Electronic Research Archive 31, no. 3 (2023): 1485–97. http://dx.doi.org/10.3934/era.2023075.

Abstract:
3D human pose estimation is a hot topic in the field of computer vision. It provides data support for tasks such as pose recognition, human tracking and action recognition, and is therefore widely applied in fields such as advanced human-computer interaction and intelligent monitoring. Estimating a 3D human pose from a single 2D image is an ill-posed problem and is likely to cause low prediction accuracy, due to self-occlusion and depth ambiguity. This paper develops two types of human kinematics to improve estimation accuracy. First, taking the 2D human body skeleton sequence obtained by a 2D human body pose detector as input, a temporal convolutional network is proposed to exploit the periodicity of movement in the temporal domain. Second, geometrical prior knowledge is introduced into the model to constrain the estimated pose to fit general kinematics knowledge. The experiments are conducted on the Human3.6M and MPI-INF-3DHP (Max Planck Institut Informatik 3D Human Pose) datasets, and the proposed model shows better generalization ability than the baseline and state-of-the-art models.
6

Sun, Jun, Mantao Wang, Xin Zhao, and Dejun Zhang. "Multi-View Pose Generator Based on Deep Learning for Monocular 3D Human Pose Estimation." Symmetry 12, no. 7 (July 4, 2020): 1116. http://dx.doi.org/10.3390/sym12071116.

Abstract:
In this paper, we study the problem of monocular 3D human pose estimation based on deep learning. Due to single-view limitations, monocular human pose estimation cannot avoid the inherent occlusion problem. A common remedy is multi-view 3D pose estimation. However, single-view images cannot be used directly in multi-view methods, which greatly limits practical applications. To address these issues, we propose a novel end-to-end network for monocular 3D human pose estimation. First, we propose a multi-view pose generator to predict multi-view 2D poses from the 2D poses in a single view. Secondly, we propose a simple but effective data augmentation method for generating multi-view 2D pose annotations, since existing datasets (e.g., Human3.6M) do not contain a large number of 2D pose annotations in different views. Thirdly, we employ a graph convolutional network to infer a 3D pose from multi-view 2D poses. Experiments conducted on public datasets verify the effectiveness of our method, and ablation studies show that it improves the performance of existing 3D pose estimation networks.
7

Wang, Jinbao, Shujie Tan, Xiantong Zhen, Shuo Xu, Feng Zheng, Zhenyu He, and Ling Shao. "Deep 3D human pose estimation: A review." Computer Vision and Image Understanding 210 (September 2021): 103225. http://dx.doi.org/10.1016/j.cviu.2021.103225.

8

Wu, Jianzhai, Dewen Hu, Fengtao Xiang, Xingsheng Yuan, and Jiongming Su. "3D human pose estimation by depth map." Visual Computer 36, no. 7 (September 3, 2019): 1401–10. http://dx.doi.org/10.1007/s00371-019-01740-4.

9

Kim, Jong-Wook, Jin-Young Choi, Eun-Ju Ha, and Jae-Ho Choi. "Human Pose Estimation Using MediaPipe Pose and Optimization Method Based on a Humanoid Model." Applied Sciences 13, no. 4 (February 20, 2023): 2700. http://dx.doi.org/10.3390/app13042700.

Abstract:
Seniors who live alone at home are at risk of falling and injuring themselves and, thus, may need a mobile robot that monitors and recognizes their poses automatically. Even though deep learning methods are actively evolving in this area, they have limitations in estimating poses that are absent or rare in training datasets. For a lightweight approach, an off-the-shelf 2D pose estimation method, a more sophisticated humanoid model, and a fast optimization method are combined to estimate joint angles for 3D pose estimation. As a novel idea, the depth ambiguity problem of 3D pose estimation is solved by adding a loss function on the deviation of the center of mass from the center of the supporting feet and penalty functions on the appropriate joint angle rotation range. To verify the proposed pose estimation method, six daily poses were estimated with a mean joint coordinate difference of 0.097 m and an average angle difference per joint of 10.017 degrees. In addition, to confirm practicality, videos of exercise activities and a scene of a person falling were filmed, and the joint angle trajectories were produced as the 3D estimation results. The optimized execution time per frame was measured at 0.033 s on a single-board computer (SBC) without GPU, showing the feasibility of the proposed method as a real-time system.
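The balance prior described here, keeping the center of mass over the supporting feet, could look like the following minimal sketch. The per-joint mass weights, the z-up axis convention, and averaging the foot joints are illustrative assumptions rather than the paper's exact loss:

```python
import numpy as np

def com_deviation_loss(joints, masses, foot_idx):
    """Squared horizontal distance between the body's center of mass and
    the center of the supporting feet (z is assumed to be vertical).

    joints: (J, 3) joint positions; masses: (J,) per-joint mass weights;
    foot_idx: indices of the supporting-foot joints.
    """
    w = masses / masses.sum()
    com = (w[:, None] * joints).sum(axis=0)          # mass-weighted mean
    support = joints[list(foot_idx)].mean(axis=0)    # center of the feet
    d = com[:2] - support[:2]                        # horizontal (x, y) only
    return float(d @ d)
```

Added to the joint-angle optimization objective, this term pushes otherwise depth-ambiguous solutions toward physically balanced standing poses.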
10

Xia, Hailun, and Tianyang Zhang. "Self-Attention Network for Human Pose Estimation." Applied Sciences 11, no. 4 (February 18, 2021): 1826. http://dx.doi.org/10.3390/app11041826.

Abstract:
Estimating the positions of human joints from monocular RGB images has been a challenging task in recent years. Despite great progress in human pose estimation with convolutional neural networks (CNNs), a central problem remains: the relationships and constraints of human structures, such as their symmetric relations, are not well exploited by previous CNN-based methods. Considering the effectiveness of combining local and nonlocal consistencies, we propose an end-to-end self-attention network (SAN) to alleviate this issue. In SANs, attention-driven, long-range dependency modeling is adopted between joints to compensate for local content and mine details from all feature locations. To enable an SAN for both 2D and 3D pose estimation, we also design a compatible, effective and general joint learning framework that mixes data of different dimensions. We evaluate the proposed network on challenging benchmark datasets. The experimental results show that our method achieves competitive results on the Human3.6M, MPII and COCO datasets.
11

El Kaid, Amal, Denis Brazey, Vincent Barra, and Karim Baïna. "Top-Down System for Multi-Person 3D Absolute Pose Estimation from Monocular Videos." Sensors 22, no. 11 (May 28, 2022): 4109. http://dx.doi.org/10.3390/s22114109.

Abstract:
Two-dimensional (2D) multi-person pose estimation and three-dimensional (3D) root-relative pose estimation from a monocular RGB camera have made significant progress recently. Yet, real-world applications require depth estimations and the ability to determine the distances between people in a scene. Therefore, it is necessary to recover the 3D absolute poses of several people. However, this is still a challenge when using cameras from single points of view. Furthermore, the previously proposed systems typically required a significant amount of resources and memory. To overcome these restrictions, we herein propose a real-time framework for multi-person 3D absolute pose estimation from a monocular camera, which integrates a human detector, a 2D pose estimator, a 3D root-relative pose reconstructor, and a root depth estimator in a top-down manner. The proposed system, called Root-GAST-Net, is based on modified versions of GAST-Net and RootNet networks. The efficiency of the proposed Root-GAST-Net system is demonstrated through quantitative and qualitative evaluations on two benchmark datasets, Human3.6M and MuPoTS-3D. On all evaluated metrics, our experimental results on the MuPoTS-3D dataset outperform the current state-of-the-art by a significant margin, and can run in real-time at 15 fps on the Nvidia GeForce GTX 1080.
12

Gu, Lanqing, and Yu Wang. "3D human pose estimation based on negative exponential reduction Gaussian kernel." Journal of Physics: Conference Series 2400, no. 1 (December 1, 2022): 012011. http://dx.doi.org/10.1088/1742-6596/2400/1/012011.

Abstract:
Human pose estimation has become an important research direction in the field of motion recognition. 3D human pose estimation adds depth information to 2D pose estimation and is more widely applicable. In this paper, the weight of each voxel is calculated in a 3D discrete space by projecting the joint-point heatmap, to directly estimate the 3D human pose. To improve the accuracy of 3D human pose estimation, the size of the variable Gaussian kernel of the heatmap is reduced by a negative exponential during training. Dilated convolution with a small kernel is used to replace the large convolution kernel, reducing the large computational overhead of detecting key points in a discrete 3D space. Experimental results show that this method is effective and can accurately estimate the 3D pose from multi-view images.
13

Wang, Jue, and Zhigang Luo. "Pointless Pose: Part Affinity Field-Based 3D Pose Estimation without Detecting Keypoints." Electronics 10, no. 8 (April 13, 2021): 929. http://dx.doi.org/10.3390/electronics10080929.

Abstract:
Human pose estimation finds its application in an extremely wide domain and is therefore never pointless. We propose in this paper a new approach that, unlike any prior one that we are aware of, bypasses the 2D keypoint detection step based on which the 3D pose is estimated, and is thus pointless. Our motivation is rather straightforward: 2D keypoint detection is vulnerable to occlusions and out-of-image absences, in which case the 2D errors propagate to 3D recovery and deteriorate the results. To this end, we resort to explicitly estimating the human body regions of interest (ROI) and their 3D orientations. Even if a portion of the human body, like the lower arm, is partially absent, the predicted orientation vector pointing from the upper arm will take advantage of the local image evidence and recover the 3D pose. This is achieved, specifically, by deforming a skeleton-shaped puppet template to fit the estimated orientation vectors. Despite its simple nature, the proposed approach yields truly robust and state-of-the-art results on several benchmarks and in-the-wild data.
14

Chang, Ju Yong, and Kyoung Mu Lee. "2D–3D pose consistency-based conditional random fields for 3D human pose estimation." Computer Vision and Image Understanding 169 (April 2018): 52–61. http://dx.doi.org/10.1016/j.cviu.2018.02.004.

15

Liu, Huan, Jian Wu, and Rui He. "Center point to pose: Multiple views 3D human pose estimation for multi-person." PLOS ONE 17, no. 9 (September 13, 2022): e0274450. http://dx.doi.org/10.1371/journal.pone.0274450.

Abstract:
3D human pose estimation has always been an important task in computer vision, especially in crowded scenes where multiple people interact with each other. There are many state-of-the-art methods for object detection based on a single view. However, recovering the locations of people is complicated in crowded and occluded scenes due to the lack of depth information in a single view, which leads to a lack of robustness. Multi-view human pose estimation for multiple people has therefore become an effective approach. Previous multi-view 3D human pose estimation methods can be attributed to a strategy of associating the joints of the same person from 2D pose estimation. However, incompleteness and noise in the 2D poses are inevitable, and the joint association itself is challenging. To solve this issue, we propose a CTP (Center Point to Pose) network based on multiple views which operates directly in 3D space. The 2D joint features from all cameras are projected into a 3D voxel space. Our CTP network regresses the center of a person as the location and a 3D bounding box as the activity area of that person, then estimates a detailed 3D pose for each bounding box. Besides, our CTP network is Non-Maximum-Suppression-free at the stage of regressing the center of a person, which makes it more efficient and simpler. Our method performs competitively on several public datasets, which shows the efficacy of our center-point-to-pose representation.
16

Manesco, João Renato Ribeiro, Stefano Berretti, and Aparecido Nilceu Marana. "DUA: A Domain-Unified Approach for Cross-Dataset 3D Human Pose Estimation." Sensors 23, no. 17 (August 22, 2023): 7312. http://dx.doi.org/10.3390/s23177312.

Abstract:
Human pose estimation is an important computer vision problem whose goal is to estimate the human body through its joints. Currently, methods that employ deep learning techniques excel at 2D human pose estimation. However, the use of 3D poses can bring more accurate and robust results. Since 3D pose labels can only be acquired in restricted scenarios, fully convolutional methods tend to perform poorly on the task. One strategy to solve this problem is to use 2D pose estimators and estimate 3D poses in two steps from 2D pose inputs. Due to database acquisition constraints, the performance improvement of this strategy can only be observed in controlled environments; domain adaptation techniques can therefore be used to increase the generalization capability of the system by inserting information from synthetic domains. In this work, we propose a novel method called the Domain Unified approach, aimed at solving pose misalignment problems in a cross-dataset scenario through a combination of three modules on top of the pose estimator: a pose converter, an uncertainty estimator, and a domain classifier. Our method led to a 44.1 mm (29.24%) error reduction when training with the SURREAL synthetic dataset and evaluating on Human3.6M, relative to a no-adaptation scenario, achieving state-of-the-art performance.
17

Zhang, Dejun, Yiqi Wu, Mingyue Guo, and Yilin Chen. "Deep Learning Methods for 3D Human Pose Estimation under Different Supervision Paradigms: A Survey." Electronics 10, no. 18 (September 15, 2021): 2267. http://dx.doi.org/10.3390/electronics10182267.

Abstract:
The rise of deep learning technology has broadly promoted the practical application of artificial intelligence in production and daily life. In computer vision, many human-centered applications, such as video surveillance, human-computer interaction and digital entertainment, rely heavily on accurate and efficient human pose estimation techniques. Inspired by the remarkable achievements of learning-based 2D human pose estimation, numerous research studies are devoted to 3D human pose estimation via deep learning. Against this backdrop, this paper provides an extensive survey of recent literature on deep learning methods for 3D human pose estimation, to display the development of these studies, track the latest research trends, and analyze the characteristics of the devised methods. The literature is reviewed along the general pipeline of 3D human pose estimation, which consists of human body modeling, learning-based pose estimation, and regularization for refinement. Unlike existing reviews of the same topic, this paper focuses on deep learning-based methods. Learning-based pose estimation is discussed in two categories, single-person and multi-person, each further categorized by data type into image-based and video-based methods. Moreover, due to the significance of data for learning-based methods, this paper surveys 3D human pose estimation methods according to a taxonomy of supervision forms. Finally, this paper also lists the current, widely used datasets and compares the performance of the reviewed methods. Based on this survey, it can be concluded that each branch of 3D human pose estimation starts with fully supervised methods, and there is still much room for multi-person pose estimation based on other supervision forms, from both images and video. Besides the significant development of 3D human pose estimation via deep learning, the inherent ambiguity and occlusion problems remain challenging issues that need to be better addressed.
18

Nguyen, Tuong Thanh, Van-Hung Le, Duy-Long Duong, Thanh-Cong Pham, and Dung Le. "3D Human Pose Estimation in Vietnamese Traditional Martial Art Videos." Journal of Advanced Engineering and Computation 3, no. 3 (September 30, 2019): 471. http://dx.doi.org/10.25073/jaec.201933.252.

Abstract:
Preserving, maintaining and teaching traditional martial arts are very important activities in social life, helping to preserve national culture and providing exercise and self-defense skills for practitioners. However, traditional martial arts involve many different postures, and the activities of the body and body parts are diverse. Estimating the actions of the human body still poses many challenges, such as accuracy and occlusion. In this paper, we survey several strong studies from recent years on 3-D human pose estimation. Statistical tables have been compiled by year, and typical results of these studies on the Human 3.6m dataset are summarized. We also present a comparative study of 3-D human pose estimation from a single image, based on methods that use a Convolutional Neural Network (CNN) for 2-D pose estimation and then map the 2-D results into 3-D space using a 3-D pose library. The CNN models are trained on benchmark datasets such as the MSCOCO Keypoints Challenge dataset [1], Human 3.6m [2], the MPII dataset [3] and LSP [4], [5]. Finally, we publish a dataset of Vietnamese traditional martial arts in Binh Dinh province for evaluating 3-D human pose estimation. Quantitative results are presented and evaluated.
19

Rapczyński, Michał, Philipp Werner, Sebastian Handrich, and Ayoub Al-Hamadi. "A Baseline for Cross-Database 3D Human Pose Estimation." Sensors 21, no. 11 (May 28, 2021): 3769. http://dx.doi.org/10.3390/s21113769.

Abstract:
Vision-based 3D human pose estimation approaches are typically evaluated on datasets that are limited in diversity regarding many factors, e.g., subjects, poses, cameras, and lighting. However, for real-life applications, it would be desirable to create systems that work under arbitrary conditions (“in-the-wild”). To advance towards this goal, we investigated the commonly used datasets HumanEva-I, Human3.6M, and Panoptic Studio, discussed their biases (that is, their limitations in diversity), and illustrated them in cross-database experiments (for which we used a surrogate for roughly estimating in-the-wild performance). For this purpose, we first harmonized the differing skeleton joint definitions of the datasets, reducing the biases and systematic test errors in cross-database experiments. We further proposed a scale normalization method that significantly improved generalization across camera viewpoints, subjects, and datasets. In additional experiments, we investigated the effect of using more or fewer cameras, training with multiple datasets, applying a proposed anatomy-based pose validation step, and using OpenPose as the basis for the 3D pose estimation. The experimental results showed the usefulness of the joint harmonization, of the scale normalization, and of augmenting virtual cameras to significantly improve cross-database and in-database generalization. At the same time, the experiments showed that there were dataset biases that could not be compensated for and that call for new datasets covering more diversity. We discussed our results and promising directions for future work.
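Scale normalization is a key ingredient of such cross-database baselines. A generic sketch, not necessarily the authors' exact formulation, is to center the pose at the root joint and divide by the mean bone length:

```python
import numpy as np

def scale_normalize(pose, edges, root=0):
    """Remove subject size and camera distance from a 3D pose by
    root-centering it and rescaling so the mean bone length is 1.

    pose: (J, 3) joint coordinates; edges: (parent, child) index pairs
    defining the skeleton's bones; root: index of the root joint.
    """
    bone_lengths = np.array([np.linalg.norm(pose[c] - pose[p])
                             for p, c in edges])
    return (pose - pose[root]) / bone_lengths.mean()
```

Poses of a short and a tall subject, or the same subject seen from different camera distances, then live on a comparable scale before training or evaluation.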
20

Huang, Xiaoshan, Jun Huang, and Zengming Tang. "3D Human Pose Estimation With Spatial Structure Information." IEEE Access 9 (2021): 35947–56. http://dx.doi.org/10.1109/access.2021.3062426.

21

Chen, Guowei. "3D Human Pose Estimation Based on Transformer Algorithm." Mobile Information Systems 2022 (August 29, 2022): 1–9. http://dx.doi.org/10.1155/2022/6858822.

Abstract:
Human pose estimation (HPE) is a fundamental problem in computer vision and the basis of applied research in many fields; it can be used for virtual fitting, fashion analysis, behavior analysis, human-computer interaction, and pedestrian detection. The purpose of HPE is to use image processing and machine learning methods to find the positions and types of the joints of people in pictures. There are two main difficulties in HPE. First, complex human images require the model to learn a highly nonlinear mapping, and learning this mapping is extremely difficult. Second, such a highly nonlinear mapping must be learned with a high-complexity model, which requires a lot of computational overhead. In this context, this paper studies 3D HPE based on the transformer. We review the research status of HPE and provide a theoretical basis for designing the transformer-based 3D HPE model in this paper. We introduce the technical principles and optimization schemes of CNNs and transformers and propose a 3D HPE model based on the transformer. We used two datasets, COCO and MPII, performed a number of experiments to find the best parameters for model development, and then assessed the model's performance. The experimental findings suggest that the strategy described in this study outperforms the other methods on both datasets: the average precision (AP) of our model reaches 79% on the COCO dataset, and it achieves a PCKh-0.5 score of 81.5% on the MPII dataset.
22

Gholami, Mohsen, Ahmad Rezaei, Helge Rhodin, Rabab Ward, and Z. Jane Wang. "Self-supervised 3D human pose estimation from video." Neurocomputing 488 (June 2022): 97–106. http://dx.doi.org/10.1016/j.neucom.2022.02.076.

23

Xia, Hailun, and Meng Xiao. "3D Human Pose Estimation With Generative Adversarial Networks." IEEE Access 8 (2020): 206198–206. http://dx.doi.org/10.1109/access.2020.3037829.

24

Belagiannis, Vasileios, Sikandar Amin, Mykhaylo Andriluka, Bernt Schiele, Nassir Navab, and Slobodan Ilic. "3D Pictorial Structures Revisited: Multiple Human Pose Estimation." IEEE Transactions on Pattern Analysis and Machine Intelligence 38, no. 10 (October 1, 2016): 1929–42. http://dx.doi.org/10.1109/tpami.2015.2509986.

25

Véges, Márton, Viktor Varga, and András Lőrincz. "3D human pose estimation with siamese equivariant embedding." Neurocomputing 339 (April 2019): 194–201. http://dx.doi.org/10.1016/j.neucom.2019.02.029.

26

Ji, Xiaopeng, Qi Fang, Junting Dong, Qing Shuai, Wen Jiang, and Xiaowei Zhou. "A survey on monocular 3D human pose estimation." Virtual Reality & Intelligent Hardware 2, no. 6 (December 2020): 471–500. http://dx.doi.org/10.1016/j.vrih.2020.04.005.

27

Ershadi-Nasab, Sara, Erfan Noury, Shohreh Kasaei, and Esmaeil Sanaei. "Multiple human 3D pose estimation from multiview images." Multimedia Tools and Applications 77, no. 12 (September 4, 2017): 15573–601. http://dx.doi.org/10.1007/s11042-017-5133-8.

28

朱, 志玮. "Graph Attention Based Monocular 3D Human Pose Estimation." Artificial Intelligence and Robotics Research 12, no. 02 (2023): 143–53. http://dx.doi.org/10.12677/airr.2023.122017.

29

Guan, Shannan, Haiyan Lu, Linchao Zhu, and Gengfa Fang. "PoseGU: 3D human pose estimation with novel human pose generator and unbiased learning." Computer Vision and Image Understanding 233 (August 2023): 103715. http://dx.doi.org/10.1016/j.cviu.2023.103715.

30

Li, Yang, Kan Li, Shuai Jiang, Ziyue Zhang, Congzhentao Huang, and Richard Yi Da Xu. "Geometry-Driven Self-Supervised Method for 3D Human Pose Estimation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11442–49. http://dx.doi.org/10.1609/aaai.v34i07.6808.

Abstract:
The neural network based approach for 3D human pose estimation from monocular images has attracted growing interest. However, annotating 3D poses is a labor-intensive and expensive process. In this paper, we propose a novel self-supervised approach to avoid the need of manual annotations. Different from existing weakly/self-supervised methods that require extra unpaired 3D ground-truth data to alleviate the depth ambiguity problem, our method trains the network only relying on geometric knowledge without any additional 3D pose annotations. The proposed method follows the two-stage pipeline: 2D pose estimation and 2D-to-3D pose lifting. We design the transform re-projection loss that is an effective way to explore multi-view consistency for training the 2D-to-3D lifting network. Besides, we adopt the confidences of 2D joints to integrate losses from different views to alleviate the influence of noises caused by the self-occlusion problem. Finally, we design a two-branch training architecture, which helps to preserve the scale information of re-projected 2D poses during training, resulting in accurate 3D pose predictions. We demonstrate the effectiveness of our method on two popular 3D human pose datasets, Human3.6M and MPI-INF-3DHP. The results show that our method significantly outperforms recent weakly/self-supervised approaches.
APA, Harvard, Vancouver, ISO, and other styles
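The transform re-projection loss described in the abstract above can be illustrated numerically: a 3D pose predicted in one camera's frame is rigidly transformed into a second view, re-projected to 2D, and compared against that view's observed 2D joints, weighted by their detection confidences. The following is a minimal NumPy sketch under a simple pinhole model, not the authors' implementation; the function names and signatures are assumptions for demonstration.

```python
import numpy as np

def project(pose_3d, f=1.0):
    """Perspective-project camera-space 3D joints (J, 3) to 2D (J, 2)."""
    return f * pose_3d[:, :2] / pose_3d[:, 2:3]

def transform_reprojection_loss(pred_3d_v1, obs_2d_v2, conf_2d_v2, R, t):
    """Rigidly transform the view-1 3D prediction into view 2, re-project it,
    and penalize the confidence-weighted distance to the observed 2D pose."""
    pose_in_v2 = pred_3d_v1 @ R.T + t      # view-1 camera frame -> view-2
    reproj_2d = project(pose_in_v2)        # re-project into view 2
    per_joint = np.linalg.norm(reproj_2d - obs_2d_v2, axis=1)
    # 2D-joint confidences down-weight noisy (e.g. self-occluded) joints
    return float(np.sum(conf_2d_v2 * per_joint) / np.sum(conf_2d_v2))
```

When the observed 2D pose is exactly the re-projection of the transformed prediction, the loss is zero, which is the multi-view consistency the training objective enforces.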
31

Chang, Inho, Min-Gyu Park, Je Woo Kim, and Ju Hong Yoon. "Absolute 3D Human Pose Estimation Using Noise-Aware Radial Distance Predictions." Symmetry 15, no. 1 (December 22, 2022): 25. http://dx.doi.org/10.3390/sym15010025.

Full text
Abstract:
We present a simple yet effective pipeline for absolute three-dimensional (3D) human pose estimation from two-dimensional (2D) joint keypoints, namely, the 2D-to-3D human pose lifting problem. Our method comprises two simple baseline networks, a 3D conversion function, and a correction network. The former two networks predict the root distance and the root-relative joint distance simultaneously. Given the input and predicted distances, the 3D conversion function recovers the absolute 3D pose, and the correction network reduces 3D pose noise caused by input uncertainties. Furthermore, to cope with input noise implicitly, we adopt a Siamese architecture that enforces the consistency of features between two training inputs, i.e., ground truth 2D joint keypoints and detected 2D joint keypoints. Finally, we experimentally validate the advantages of the proposed method and demonstrate its competitive performance over state-of-the-art absolute 2D-to-3D pose-lifting methods.
APA, Harvard, Vancouver, ISO, and other styles
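The 3D conversion function the abstract above refers to, which recovers an absolute pose from 2D keypoints plus predicted radial (camera-to-joint) distances, can be illustrated with a pinhole camera: each pixel defines a unit viewing ray, which is scaled by the predicted distance. A hedged NumPy sketch, not the paper's code; the function name and intrinsics parameters are assumptions.

```python
import numpy as np

def backproject_radial(kpts_2d, radial_dist, f, cx, cy):
    """Recover absolute camera-space 3D joints from 2D pixel keypoints
    and per-joint radial (camera-to-joint) distances."""
    rays = np.stack([(kpts_2d[:, 0] - cx) / f,
                     (kpts_2d[:, 1] - cy) / f,
                     np.ones(len(kpts_2d))], axis=1)
    rays /= np.linalg.norm(rays, axis=1, keepdims=True)  # unit viewing rays
    return rays * radial_dist[:, None]                   # scale by distance
```

Note the distinction from depth-based lifting: the radial distance is the Euclidean camera-to-joint distance, not the z-coordinate, so the ray must be normalized before scaling.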
32

Duan, Chengpeng, Bingliang Hu, Wei Liu, and Jie Song. "Motion Capture for Sporting Events Based on Graph Convolutional Neural Networks and Single Target Pose Estimation Algorithms." Applied Sciences 13, no. 13 (June 27, 2023): 7611. http://dx.doi.org/10.3390/app13137611.

Full text
Abstract:
Human pose estimation refers to accurately estimating the position of the human body and its joints from a single RGB image. It serves as the basis for several computer vision tasks, such as human tracking, 3D reconstruction, and autonomous driving, so improving its accuracy has significant implications for the advancement of computer vision. This paper addresses the limitations of single-branch networks in pose estimation and presents a top-down single-target pose estimation approach based on multi-branch self-calibrating networks combined with graph convolutional neural networks. The study focuses on two aspects: human body detection and human body pose estimation. Human body detection targets athletes appearing in sports competitions and is followed by pose estimation, for which two methods are considered: coordinate regression and heatmap-based estimation. To improve the accuracy of the heatmap-based method, the high-resolution feature map output by HRNet is deconvolved, improving single-target pose estimation accuracy.
APA, Harvard, Vancouver, ISO, and other styles
33

Kundu, Jogendra Nath, Siddharth Seth, Rahul M V, Mugalodi Rakesh, Venkatesh Babu Radhakrishnan, and Anirban Chakraborty. "Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11312–19. http://dx.doi.org/10.1609/aaai.v34i07.6792.

Full text
Abstract:
Estimation of 3D human pose from monocular image has gained considerable attention, as a key step to several human-centric applications. However, generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable, as these models often perform unsatisfactorily on unseen in-the-wild environments. Though weakly-supervised models have been proposed to address this shortcoming, performance of such models relies on availability of paired supervision on some related task, such as 2D pose or multi-view image pairs. In contrast, we propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions. Our pose estimation framework relies on a minimal set of prior knowledge that defines the underlying kinematic 3D structure, such as skeletal joint connectivity information with bone-length ratios in a fixed canonical scale. The proposed model employs three consecutive differentiable transformations namely forward-kinematics, camera-projection and spatial-map transformation. This design not only acts as a suitable bottleneck stimulating effective pose disentanglement, but also yields interpretable latent pose representations avoiding training of an explicit latent embedding to pose mapper. Furthermore, devoid of unstable adversarial setup, we re-utilize the decoder to formalize an energy-based loss, which enables us to learn from in-the-wild videos, beyond laboratory settings. Comprehensive experiments demonstrate our state-of-the-art unsupervised and weakly-supervised pose estimation performance on both Human3.6M and MPI-INF-3DHP datasets. Qualitative results on unseen environments further establish our superior generalization ability.
APA, Harvard, Vancouver, ISO, and other styles
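The forward-kinematics transformation that the framework above uses as its first differentiable step can be sketched for a simple kinematic chain: each joint is placed by rotating its fixed-length bone offset within the accumulated rotation of its parent chain. This is an illustrative NumPy version under assumed conventions (bones extend along the negative y-axis of their local frame; `parents[j] < j` so parents are processed first), not the authors' implementation.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def forward_kinematics(parents, bone_lengths, rotations):
    """Place each joint by rotating its fixed-length bone offset within the
    accumulated rotation of its kinematic chain (root at the origin)."""
    n = len(parents)
    joints = np.zeros((n, 3))
    global_rot = [np.eye(3)] * n
    for j in range(1, n):
        p = parents[j]
        global_rot[j] = global_rot[p] @ rotations[j]   # accumulate rotations
        offset = global_rot[j] @ np.array([0.0, -bone_lengths[j], 0.0])
        joints[j] = joints[p] + offset
    return joints
```

Because bone-length ratios are fixed in a canonical scale, only the per-joint rotations are free parameters, which is what makes the transformation a useful bottleneck for pose disentanglement.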
34

Xu, Chenxin, Siheng Chen, Maosen Li, and Ya Zhang. "Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (May 18, 2021): 3013–21. http://dx.doi.org/10.1609/aaai.v35i4.16409.

Full text
Abstract:
We propose a novel method based on teacher-student learning framework for 3D human pose estimation without any 3D annotation or side information. To solve this unsupervised-learning problem, the teacher network adopts pose-dictionary-based modeling for regularization to estimate a physically plausible 3D pose. To handle the decomposition ambiguity in the teacher network, we propose a cycle-consistent architecture promoting a 3D rotation-invariant property to train the teacher network. To further improve the estimation accuracy, the student network adopts a novel graph convolution network for flexibility to directly estimate the 3D coordinates. Another cycle-consistent architecture promoting 3D rotation-equivariant property is adopted to exploit geometry consistency, together with knowledge distillation from the teacher network to improve the pose estimation performance. We conduct extensive experiments on Human3.6M and MPI-INF-3DHP. Our method reduces the 3D joint prediction error by 11.4% compared to state-of-the-art unsupervised methods and also outperforms many weakly-supervised methods that use side information on Human3.6M. Code will be available at https://github.com/sjtuxcx/ITES.
APA, Harvard, Vancouver, ISO, and other styles
35

Wang, Hao, Ming-hui Sun, Hao Zhang, and Li-yan Dong. "LHPE-nets: A lightweight 2D and 3D human pose estimation model with well-structural deep networks and multi-view pose sample simplification method." PLOS ONE 17, no. 2 (February 23, 2022): e0264302. http://dx.doi.org/10.1371/journal.pone.0264302.

Full text
Abstract:
Cross-view 3D human pose estimation models have made significant progress: through multi-view fusion they complete the task of locating human joints and modeling the skeleton in 3D. The multi-view 2D pose estimation part of such a model is very important, but its training cost is also very high, since deep learning networks are used to generate heatmaps for each view. Therefore, in this article, we tested several deep learning networks for pose estimation tasks, including MobileNetV2, MobileNetV3, EfficientNetV2 and ResNet. Then, based on the performance and drawbacks of these networks, we built multiple deep learning networks with better performance, which we call LHPE-nets; they mainly comprise a Low-Span network and an RDNS network. LHPE-nets use a network structure with evenly distributed channels, inverted residuals, external residual blocks and a framework for processing small-resolution samples, reaching training saturation faster. We also designed a static pose sample simplification method for 3D pose data, which enables low-cost sample storage and makes the samples convenient for models to read. In the experiments, we used several recent models and two public evaluation metrics. The results show the advantages of this work in fast start-up and network lightness: training converges about 1-5 epochs faster than ResNet-34, and estimation accuracy improves for approximately 60% of the joints, with overall human pose estimation error more than 7 mm lower than that of the other networks. The experiments analyze in detail the model's network size, fast start-up, and 2D and 3D pose estimation performance; compared with other pose estimation models, its performance reaches a level suitable for application.
APA, Harvard, Vancouver, ISO, and other styles
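The per-view heatmaps that the 2D stage of such cross-view models produces are typically decoded into joint coordinates; a common differentiable decoder is the soft-argmax, which takes the softmax-weighted average of pixel coordinates. A minimal NumPy sketch, offered as general background rather than as LHPE-nets' actual decoder:

```python
import numpy as np

def soft_argmax_2d(heatmap):
    """Decode a joint location from a 2D heatmap as the softmax-weighted
    average of pixel coordinates (a differentiable alternative to argmax)."""
    h, w = heatmap.shape
    probs = np.exp(heatmap - heatmap.max())  # stable softmax over all pixels
    probs /= probs.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    return np.array([np.sum(probs * xs), np.sum(probs * ys)])  # (x, y)
```

With a sharply peaked heatmap the result coincides with the hard argmax, while diffuse heatmaps yield sub-pixel estimates between candidate locations.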
36

Bin Sulong, Ghazali, and M. Randles. "Computer Vision Using Pose Estimation." Wasit Journal of Computer and Mathematics Science 2, no. 1 (March 31, 2023): 85–92. http://dx.doi.org/10.31185/wjcm.111.

Full text
Abstract:
Pose estimation involves estimating the position and orientation of objects in a 3D space, and it has applications in areas such as robotics, augmented reality, and human-computer interaction. There are several methods for pose estimation, including model-based, feature-based, direct, hybrid, and deep learning-based methods. Each method has its own strengths and weaknesses, and the choice of method depends on the specific requirements of the application, object being estimated, and available data. Advancements in computer vision and machine learning have made it possible to achieve high accuracy and robustness in pose estimation, allowing for the development of a wide range of innovative applications. Pose estimation will continue to be an important area of research and development, and we can expect to see further improvements in the accuracy and robustness of pose estimation methods in the future.
APA, Harvard, Vancouver, ISO, and other styles
37

Sun, Haixun, Yanyan Zhang, Yijie Zheng, Jianxin Luo, and Zhisong Pan. "G2O-Pose: Real-Time Monocular 3D Human Pose Estimation Based on General Graph Optimization." Sensors 22, no. 21 (October 30, 2022): 8335. http://dx.doi.org/10.3390/s22218335.

Full text
Abstract:
Monocular 3D human pose estimation calculates a 3D human pose from monocular images or videos. It still faces challenges due to the lack of depth information. Traditional methods have tried to resolve this ambiguity by building a pose dictionary or using temporal information, but they are too slow for real-time application. In this paper, we propose a real-time method named G2O-pose, which runs fast without greatly sacrificing accuracy. In our work, we regard the 3D human pose as a graph and solve the problem by general graph optimization (G2O) under multiple constraints. The constraints are implemented by algorithms including 3D bone proportion recovery, human orientation classification, and reverse joint correction and suppression. When the depth of the human body does not change much, our method outperforms previous non-deep-learning methods in running speed, with only a slight decrease in accuracy.
APA, Harvard, Vancouver, ISO, and other styles
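The 3D bone proportion recovery constraint named in the abstract above can be illustrated as rescaling a noisy skeleton's bones to canonical length ratios while preserving each bone's direction. A hedged NumPy sketch under assumed conventions (joints ordered so parents precede children; ratios expressed relative to the first bone), not the paper's G2O formulation:

```python
import numpy as np

def recover_bone_proportions(joints, parents, canonical_ratios):
    """Rescale each bone of a noisy 3D skeleton to canonical length ratios
    (relative to the first bone) while keeping every bone's direction."""
    fixed = joints.copy()
    root_bone = 1  # bone (parents[1] -> joint 1) defines the unit length
    unit = np.linalg.norm(joints[root_bone] - joints[parents[root_bone]])
    for j in range(1, len(parents)):   # assumes parents[j] < j
        p = parents[j]
        direction = joints[j] - joints[p]
        direction /= np.linalg.norm(direction)      # keep bone direction
        fixed[j] = fixed[p] + direction * canonical_ratios[j] * unit
    return fixed
```

Constraints of this shape are attractive for graph optimization because they involve only pairs of connected joints, keeping the factor graph sparse.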
38

Wu, Haiping, and Bin Xiao. "3D Human Pose Estimation via Explicit Compositional Depth Maps." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 12378–85. http://dx.doi.org/10.1609/aaai.v34i07.6923.

Full text
Abstract:
In this work, we tackle the problem of estimating 3D human pose in camera space from a monocular image. First, we propose to use densely-generated limb depth maps to ease the learning of body joints depth, which are well aligned with image cues. Then, we design a lifting module from 2D pixel coordinates to 3D camera coordinates which explicitly takes the depth values as inputs, and is aligned with camera perspective projection model. We show our method achieves superior performance on large-scale 3D pose datasets Human3.6M and MPI-INF-3DHP, and sets the new state-of-the-art.
APA, Harvard, Vancouver, ISO, and other styles
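The lifting module described above, which maps 2D pixel coordinates plus depth values to 3D camera coordinates in agreement with the camera perspective projection model, amounts to inverting the pinhole projection. A minimal NumPy sketch with assumed intrinsics parameters (focal length `f`, principal point `cx`, `cy`), not the authors' network:

```python
import numpy as np

def lift_pixels_to_camera(kpts_2d, depths, f, cx, cy):
    """Invert the pinhole projection: map pixel keypoints plus per-joint
    depth values Z to camera-space coordinates (X, Y, Z)."""
    x = (kpts_2d[:, 0] - cx) * depths / f   # X = (u - cx) * Z / f
    y = (kpts_2d[:, 1] - cy) * depths / f   # Y = (v - cy) * Z / f
    return np.stack([x, y, depths], axis=1)
```

Making the lifting explicit in this way, rather than regressing 3D coordinates directly, keeps the prediction consistent with the camera geometry by construction.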
39

Zhou, Lu, Yingying Chen, Jinqiao Wang, and Hanqing Lu. "Progressive Bi-C3D Pose Grammar for Human Pose Estimation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 13033–40. http://dx.doi.org/10.1609/aaai.v34i07.7004.

Full text
Abstract:
In this paper, we propose a progressive pose grammar network learned with Bi-C3D (Bidirectional Convolutional 3D) for human pose estimation. Exploiting the dependencies among human body parts proves effective in handling problems such as complex articulation and occlusion. We therefore propose two articulated grammars learned with Bi-C3D to model the relationships of the human joints and exploit the contextual information of human body structure. First, a local multi-scale Bi-C3D kinematics grammar is proposed to promote message passing among locally related joints; this multi-scale kinematics grammar excavates the different levels of human context learned by the network. Moreover, a global sequential grammar is put forward to capture the long-range dependencies among human body joints. The whole procedure can be regarded as a local-to-global progressive refinement process. Without bells and whistles, our method achieves competitive performance on both the MPII and LSP benchmarks compared with previous methods, confirming the feasibility and effectiveness of C3D in information interactions.
APA, Harvard, Vancouver, ISO, and other styles
40

Dai, Shiming, Wei Liu, Wenji Yang, Lili Fan, and Jihao Zhang. "Cascaded Hierarchical CNN for RGB-Based 3D Hand Pose Estimation." Mathematical Problems in Engineering 2020 (July 15, 2020): 1–13. http://dx.doi.org/10.1155/2020/8432840.

Full text
Abstract:
3D hand pose estimation can provide basic information about gestures, which has important significance in the fields of Human-Machine Interaction (HMI) and Virtual Reality (VR). In recent years, 3D hand pose estimation from a single depth image has made great research achievements due to the development of depth cameras. However, 3D hand pose estimation from a single RGB image is still a highly challenging problem. In this work, we propose a novel four-stage cascaded hierarchical CNN (4CHNet), which leverages a hierarchical network to decompose hand pose estimation into finger pose estimation and palm pose estimation, separately extracts finger features and palm features, and finally fuses them to estimate the 3D hand pose. Compared with direct estimation methods, the hand feature information extracted by the hierarchical network is more representative. Furthermore, concatenating the various stages of the network for end-to-end training allows each stage to benefit the others and improve progressively. The experimental results on two public datasets demonstrate that our 4CHNet can significantly improve the accuracy of 3D hand pose estimation from a single RGB image.
APA, Harvard, Vancouver, ISO, and other styles
41

Zhang, Siqi, Chaofang Wang, Wenlong Dong, and Bin Fan. "A Survey on Depth Ambiguity of 3D Human Pose Estimation." Applied Sciences 12, no. 20 (October 20, 2022): 10591. http://dx.doi.org/10.3390/app122010591.

Full text
Abstract:
Depth ambiguity is one of the main challenges of three-dimensional (3D) human pose estimation (HPE). The recent strategies of disambiguating have brought significant progress and remarkable breakthroughs in the field of 3D human pose estimation (3D HPE). This survey extensively reviews the causes and solutions of the depth ambiguity. The solutions are systematically classified into four categories: camera parameter constraints, temporal consistency constraints, kinematic constraints, and image cues constraints. This paper summarizes the performance comparison, challenges, main frameworks, and evaluation metrics, and discusses some promising future research directions.
APA, Harvard, Vancouver, ISO, and other styles
42

Gärtner, Erik, Aleksis Pirinen, and Cristian Sminchisescu. "Deep Reinforcement Learning for Active Human Pose Estimation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 10835–44. http://dx.doi.org/10.1609/aaai.v34i07.6714.

Full text
Abstract:
Most 3D human pose estimation methods assume that input – be it images of a scene collected from one or several viewpoints, or from a video – is given. Consequently, they focus on estimates leveraging prior knowledge and measurement by fusing information spatially and/or temporally, whenever available. In this paper we address the problem of an active observer with freedom to move and explore the scene spatially – in ‘time-freeze’ mode – and/or temporally, by selecting informative viewpoints that improve its estimation accuracy. Towards this end, we introduce Pose-DRL, a fully trainable deep reinforcement learning-based active pose estimation architecture which learns to select appropriate views, in space and time, to feed an underlying monocular pose estimator. We evaluate our model using single- and multi-target estimators, with strong results in both settings. Our system further learns automatic stopping conditions in time and transition functions to the next temporal processing step in videos. In extensive experiments with the Panoptic multi-view setup, and for complex scenes containing multiple people, we show that our model learns to select viewpoints that yield significantly more accurate pose estimates compared to strong multi-view baselines.
APA, Harvard, Vancouver, ISO, and other styles
43

Sagawa, Yuichi, Masamichi Shimosaka, Taketoshi Mori, and Tomomasa Sato. "Fast Online Human Pose Estimation via 3D Voxel Data." Journal of the Robotics Society of Japan 26, no. 8 (2008): 913–24. http://dx.doi.org/10.7210/jrsj.26.913.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Shen, Jianfeng, Wenming Yang, and Qingmin Liao. "Part template: 3D representation for multiview human pose estimation." Pattern Recognition 46, no. 7 (July 2013): 1920–32. http://dx.doi.org/10.1016/j.patcog.2013.01.001.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Li, Yimeng, Jun Xiao, Di Xie, Jian Shao, and Jinlong Wang. "Adversarial learning for viewpoints invariant 3D human pose estimation." Journal of Visual Communication and Image Representation 58 (January 2019): 374–79. http://dx.doi.org/10.1016/j.jvcir.2018.11.021.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Hofmann, M., and D. M. Gavrila. "Multi-view 3D Human Pose Estimation in Complex Environment." International Journal of Computer Vision 96, no. 1 (May 1, 2011): 103–24. http://dx.doi.org/10.1007/s11263-011-0451-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Rogez, Grégory, and Cordelia Schmid. "Image-Based Synthesis for Deep 3D Human Pose Estimation." International Journal of Computer Vision 126, no. 9 (March 19, 2018): 993–1008. http://dx.doi.org/10.1007/s11263-018-1071-9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Wan, Yunchong, Yunpeng Song, and Ligang Liu. "3D Human Pose Estimation Based on Volumetric Joint Coordinates." Journal of Computer-Aided Design & Computer Graphics 34, no. 09 (September 1, 2022): 1411–19. http://dx.doi.org/10.3724/sp.j.1089.2022.19167.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Liu, Shuqin. "Discrete point 3D reconstruction algorithm based human pose estimation." Microprocessors and Microsystems 82 (April 2021): 103806. http://dx.doi.org/10.1016/j.micpro.2020.103806.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Zhang, Xiaoyan, Zhengchun Zhou, Ying Han, Hua Meng, Meng Yang, and Sutharshan Rajasegarar. "Deep learning-based real-time 3D human pose estimation." Engineering Applications of Artificial Intelligence 119 (March 2023): 105813. http://dx.doi.org/10.1016/j.engappai.2022.105813.

Full text
APA, Harvard, Vancouver, ISO, and other styles