Dissertations / Theses on the topic '3D Human Pose Estimation'
Budaraju, Sri Datta. "Unsupervised 3D Human Pose Estimation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291435.
The thesis proposes an unsupervised representation-learning method to predict a 3D pose from a 2D skeleton using a VAE-GAN (Variational Autoencoder Generative Adversarial Network) hybrid neural network. The method learns to lift poses from 2D to 3D using self-supervision and adversarial learning techniques. It uses neither images, heatmaps, 3D pose annotations, paired/unpaired 2D-to-3D skeletons, a priori 3D information, synthetic 2D skeletons, multiple views, nor temporal information. The 2D skeleton input is taken by a VAE that encodes it in a latent space and then decodes the latent representation to a 3D pose. The 3D pose is then reprojected to 2D for constrained, self-supervised optimization against the input 2D pose. In parallel, the 3D pose is also randomly rotated and reprojected to 2D to generate a new 2D view for unconstrained adversarial optimization using a discriminator network. The combination of the optimizations on the original and the new 2D view of the predicted 3D pose results in realistic 3D pose generation. The results in the thesis show that the encoding-decoding process of the VAE addresses the challenge of erroneous and incomplete skeletons from 2D detection networks as input, and that the variance of the VAE can be modified to obtain multiple plausible 3D poses for a given 2D input. Furthermore, the latent representation can be used for cross-modal training and several downstream applications. The results on the Human3.6M dataset are better than those of previous unsupervised approaches, with less model complexity, while addressing several obstacles to scaling the task to real-world applications.
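As a toy illustration of the lift-reproject-rotate cycle this abstract describes, the numpy sketch below stands in for the learned VAE decoder with a random-depth placeholder; `lift_2d_to_3d`, `random_y_rotation`, and the orthographic `reproject` are illustrative assumptions, not the thesis code:

```python
import numpy as np

rng = np.random.default_rng(0)

def lift_2d_to_3d(pose_2d):
    """Stand-in for the VAE decoder: append a plausible depth per joint.
    (In the thesis, depth comes from the learned latent representation.)"""
    depth = rng.normal(0.0, 0.1, size=(pose_2d.shape[0], 1))
    return np.hstack([pose_2d, depth])

def reproject(pose_3d):
    """Orthographic projection back to 2D (drop the depth axis)."""
    return pose_3d[:, :2]

def random_y_rotation(pose_3d):
    """Rotate the skeleton about the vertical axis to synthesise a new view."""
    a = rng.uniform(0, 2 * np.pi)
    R = np.array([[np.cos(a), 0, np.sin(a)],
                  [0, 1, 0],
                  [-np.sin(a), 0, np.cos(a)]])
    return pose_3d @ R.T

pose_2d = rng.normal(size=(17, 2))   # e.g. 17 COCO-style joints
pose_3d = lift_2d_to_3d(pose_2d)

# Self-supervised branch: reprojection must reproduce the input 2D pose.
recon_loss = np.mean((reproject(pose_3d) - pose_2d) ** 2)

# Adversarial branch: a rotated reprojection yields a novel 2D view
# that a discriminator would score for realism.
novel_view_2d = reproject(random_y_rotation(pose_3d))

print(recon_loss)  # 0.0: orthographic reprojection reproduces the input exactly
```

The zero reconstruction loss here is an artifact of the orthographic toy projection; the point is only to show which quantity each of the two optimization branches constrains.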
Wang, Jianquan. "A Human Kinetic Dataset and a Hybrid Model for 3D Human Pose Estimation." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/41437.
Gong, Wenjuan. "3D Motion Data aided Human Action Recognition and Pose Estimation." Doctoral thesis, Universitat Autònoma de Barcelona, 2013. http://hdl.handle.net/10803/116189.
In this work, we explore human action recognition and pose estimation problems. Different from traditional works of learning from 2D images or video sequences and their annotated output, we seek to solve the problems with additional 3D motion capture information, which helps to fill the gap between 2D image features and human interpretations.
Yu, Tsz-Ho. "Classification and pose estimation of 3D shapes and human actions." Thesis, University of Cambridge, 2014. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.708443.
Darby, John. "3D Human Motion Tracking and Pose Estimation using Probabilistic Activity Models." Thesis, Manchester Metropolitan University, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.523145.
Borodulina, A. (Anastasiia). "Application of 3D human pose estimation for motion capture and character animation." Master's thesis, University of Oulu, 2019. http://jultika.oulu.fi/Record/nbnfioulu-201906262670.
Burenius, Magnus. "Human 3D Pose Estimation in the Wild : using Geometrical Models and Pictorial Structures." Doctoral thesis, KTH, Datorseende och robotik, CVAP, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-138136.
Mehta, Dushyant [Verfasser]. "Real-time 3D human body pose estimation from monocular RGB input / Dushyant Mehta." Saarbrücken : Saarländische Universitäts- und Landesbibliothek, 2020. http://d-nb.info/1220691135/34.
Norman, Jacob. "3D Pose Estimation in the Context of Grip Position for pHRI." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-55166.
Fathollahi Ghezelghieh, Mona. "Estimation of Human Poses Categories and Physical Object Properties from Motion Trajectories." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6835.
Full textHossain, Mir Rayat Imtiaz. "Understanding the sources of error for 3D human pose estimation from monocular images and videos." Thesis, University of British Columbia, 2017. http://hdl.handle.net/2429/63808.
Full textScience, Faculty of
Computer Science, Department of
Graduate
SARMADI, HAMID. "Human Detection and Pose Estimation in aMulti-camera System : Using a 3D version of Pictorial Structures." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-142372.
Carbonera Luvizon, Diogo. "Apprentissage automatique pour la reconnaissance d'action humaine et l'estimation de pose à partir de l'information 3D." Thesis, Cergy-Pontoise, 2019. http://www.theses.fr/2019CERG1015.
3D human action recognition is a challenging task due to the complexity of human movements and to the variety of poses and actions performed by distinct subjects. Recent technologies based on depth sensors can provide 3D human skeletons at low computational cost, which is useful information for action recognition. However, such low-cost sensors are restricted to controlled environments and frequently output noisy data. Meanwhile, convolutional neural networks (CNNs) have shown significant improvements on both action recognition and 3D human pose estimation from RGB images. Despite being closely related problems, the two tasks are frequently handled separately in the literature. In this work, we analyze the problem of 3D human action recognition in two scenarios: first, we explore spatial and temporal features from human skeletons, which are aggregated by a shallow metric-learning approach. In the second scenario, we not only show that precise 3D poses are beneficial to action recognition, but also that both tasks can be efficiently performed by a single deep neural network while still achieving state-of-the-art results. Additionally, we demonstrate that end-to-end optimization using poses as an intermediate constraint leads to significantly higher accuracy on the action task than separate learning. Finally, we propose a new scalable architecture for simultaneous real-time 3D pose estimation and action recognition, which offers a range of performance-speed trade-offs with a single multimodal and multitask training procedure.
Benzine, Abdallah. "Estimation de poses 3D multi-personnes à partir d'images RGB." Thesis, Sorbonne université, 2020. http://www.theses.fr/2020SORUS103.
3D human pose estimation from monocular RGB images is the process of locating human joints from an image or a sequence of images. It provides rich geometric and motion information about the human body. Most existing 3D pose estimation approaches assume that the image contains only one person, fully visible; such a scenario is not realistic. In real-life conditions, several people interact and tend to occlude each other, which makes 3D pose estimation even more ambiguous and complex. The work carried out during this thesis focused on single-shot estimation of multi-person 3D poses from monocular RGB images. We first proposed a bottom-up approach for predicting multi-person 3D poses that first predicts the 3D coordinates of all the joints present in the image and then uses a grouping process to predict full 3D skeletons. In order to be robust in cases where the people in the image are numerous and far from the camera, we developed PandaNet, which is based on an anchor representation and integrates a process for ignoring anchors ambiguously associated with ground truths, as well as automatic weighting of the losses. Finally, PandaNet is completed with an Absolute Distance Estimation Module (ADEM). The combination of these two models, called Absolute PandaNet, allows the prediction of absolute human 3D poses expressed in the camera frame.
Duncan, Kester. "Scene-Dependent Human Intention Recognition for an Assistive Robotic System." Scholar Commons, 2014. https://scholarcommons.usf.edu/etd/5009.
Jack, Dominic. "Deep learning approaches for 3D inference from monocular vision." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/204267/1/Dominic_Jack_Thesis.pdf.
Regia Corte, Fabiola. "Studio ed implementazione di un modello di Human Pose Estimation 3D. Analisi tecnica della posizione del corpo dell'atleta durante un match di Tennis." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Find full textAmin, Sikandar [Verfasser], Bernd [Akademischer Betreuer] Radig, Darius [Gutachter] Burschka, and Bernd [Gutachter] Radig. "Multi-view Part-based Models for 3D Human Pose Estimation in Real-World Scenes / Sikandar Amin ; Gutachter: Darius Burschka, Bernd Radig ; Betreuer: Bernd Radig." München : Universitätsbibliothek der TU München, 2018. http://d-nb.info/1171425422/34.
Rydén, Anna, and Amanda Martinsson. "Evaluation of 3D motion capture data from a deep neural network combined with a biomechanical model." Thesis, Linköpings universitet, Institutionen för medicinsk teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176543.
Xu, Wanxin. "Affect-Preserving Visual Privacy Protection." UKnowledge, 2018. https://uknowledge.uky.edu/ece_etds/122.
Zendjebil, Imane. "Localisation 3D basée sur une approche de suppléance multi-capteurs pour la Réalité Augmentée Mobile en Milieu Extérieur." Phd thesis, Université d'Evry-Val d'Essonne, 2010. http://tel.archives-ouvertes.fr/tell-00541366.
Blanc Beyne, Thibault. "Estimation de posture 3D à partir de données imprécises et incomplètes : application à l'analyse d'activité d'opérateurs humains dans un centre de tri." Thesis, Toulouse, INPT, 2020. http://www.theses.fr/2020INPT0106.
In a context of studying stress and ergonomics at work for the prevention of musculoskeletal disorders, the company Ebhys wants to develop a tool for analyzing the activity of human operators in a waste sorting center by measuring ergonomic indicators. To cope with the uncontrolled environment of the sorting center, these indicators are measured from depth images. An ergonomic study allows us to define the indicators to be measured: zones of movement of the operator's hands and zones of angulation of certain joints of the upper body. They are therefore indicators that can be obtained from an analysis of the operator's 3D pose. The software for computing the indicators is thus composed of three steps: a first part segments the operator from the rest of the scene to ease the 3D pose estimation, a second part estimates the operator's 3D pose, and a third part uses the operator's 3D pose to compute the ergonomic indicators. First, we propose an algorithm that extracts the operator from the rest of the depth image. To do this, we use a first automatic segmentation based on static background removal and selection of a moving element given its position and size. This first segmentation allows us to train a neural network that improves the results. The network is trained using the segmentations obtained from the first automatic segmentation, from which the best-quality samples are automatically selected during training. Next, we build a neural network model to estimate the operator's 3D pose. We propose a study that allows us to find a light and optimal model for 3D pose estimation on synthetic depth images, which we generate numerically. However, while this network gives outstanding performance on synthetic depth images, it is not directly applicable to the real depth images that we acquired in an industrial context.
To overcome this issue, we finally build a module that transforms the synthetic depth images into more realistic ones. This image-to-image translation model modifies the style of the depth image without changing its content, keeping the 3D pose of the operator from the synthetic source image unchanged in the translated realistic depth frames. These more realistic depth images are then used to re-train the 3D pose estimation neural network, to finally obtain a convincing 3D pose estimation on depth images acquired in real conditions, and to compute the ergonomic indicators.
Murali, Ram Subramanian. "Pose Estimation and 3D Reconstruction for 3D Dispensing." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288533.
Dispensing of fluids is an important part of many industrial applications, such as 3D printing and surface mounting. In a majority of these applications today, dispensing takes place on flat, two-dimensional surfaces (or substrates). In the future, dispensing will increasingly be done on arbitrary three-dimensional objects, which is why the position of the dispensing head relative to the whole three-dimensional substrate will be required. This need drives the adoption of vision-based five- or six-axis robot manipulators, instead of existing solutions with hard-coded spatial descriptions of the object that the dispensing head should follow. Given a three-dimensional body, a CAD description of it, and a path on the body to be followed by a dispensing head, this thesis aims to answer the following industrial question: how should the dispensing path be adapted if the body is displaced from the CAD model's position? The most important requirement of the application is high accuracy of the deposited volume, which requires high positioning accuracy, down to 100 µm with respect to the intended CAD-defined dispensing path. High positioning accuracy of the dispensing head also guarantees good positioning accuracy of the dispensed fluid on the substrate. Achieving high dispensing accuracy requires robust volumetric scanning (3D reconstruction) of the object that resembles the CAD model, and a high-accuracy robot manipulator. The work in this thesis is limited to the three-dimensional reconstruction and the adjustment of the dispensing path based on a change in the object's position. Other application requirements are low total cycle time and low equipment cost.
The thesis aims to: i) investigate different types of marker-based 3D pose estimation with a low-cost consumer RGB-D camera, ii) generate a 3D reconstruction of the object for each type of pose estimation, and iii) find the object's pose displacement from that of the CAD model. Checkerboards and ArUco markers are used as fiducial markers. Different types of pose estimation exploiting RGB or combined RGB-D techniques are used. A Truncated Signed Distance Function (TSDF) is used for surface reconstruction. Iterative Closest Point (ICP) is used to find the pose adjustment between the CAD description and the reconstruction. Tests are performed for different object shapes in different positions and orientations. The reconstruction, the accuracy of the dispensing-path adjustment, and the cycle time are then evaluated.
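The CAD-to-reconstruction alignment step that ICP iterates can be illustrated with the closed-form Kabsch fit below, a minimal numpy sketch assuming correspondences are already known; the pipeline described in this entry additionally runs TSDF reconstruction and repeated nearest-neighbour matching around this update:

```python
import numpy as np

def kabsch(src, dst):
    """Rigid transform (R, t) minimising ||R @ src_i + t - dst_i||.
    This is the closed-form update inside each ICP iteration."""
    c_src, c_dst = src.mean(0), dst.mean(0)
    H = (src - c_src).T @ (dst - c_dst)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t

# CAD model points vs. a displaced "reconstruction" of the same part.
cad = np.random.default_rng(1).normal(size=(50, 3))
angle = np.deg2rad(10)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                   [np.sin(angle),  np.cos(angle), 0],
                   [0, 0, 1]])
t_true = np.array([0.02, -0.01, 0.005])          # small pose displacement
scan = cad @ R_true.T + t_true

R, t = kabsch(cad, scan)
print(np.allclose(R, R_true), np.allclose(t, t_true))  # True True
```

The recovered (R, t) is exactly the "pose displacement from the CAD model" the thesis measures; real scans would add noise and unknown correspondences, which is what the full ICP loop handles.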
Peñate, Sánchez Adrián. "3D pose estimation in complex environments." Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/406085.
Although there has been remarkable progress in the pose estimation literature, a number of limitations remain when existing algorithms must be applied in everyday applications, especially in uncontrolled environments. This thesis addresses some of these limitations: computing the pose for uncalibrated cameras, computing the pose without knowing the correspondence between 2D and 3D points, computing the pose when interest points are unreliable, and computing the pose using only depth data. The problems addressed, and consequently the contributions made, are analysed in increasing order of complexity; at each new stage of the thesis, the constraints on obtaining the camera's 3D pose are tightened. The thesis consists of four parts, whose contributions to the field of Computer Vision we outline below. The first contribution provides a technique for obtaining the pose of an uncalibrated perspective camera that is more robust and accurate than existing ones. By reformulating the perspective equations used in calibrated methods and studying their numerical stability, an extended formulation is obtained that offers a closed-form solution to the problem and greater stability in the presence of noise. The second contribution addresses the fact that most algorithms rely on a set of 2D-3D correspondences, a task that generally involves extracting and matching interest points. This thesis develops an algorithm that estimates the point correspondences and the camera pose jointly.
By solving both problems jointly, the steps to take can be optimized much better than by solving them separately, and the publications arising from this work show the inherent advantages of this approach. The third contribution provides a solution for estimating the camera pose in extreme situations where image quality is severely degraded. This is made possible by learning from high-quality data and 3D models of the environment and the objects present. The approach is based on the notion that detectors learned from high-quality data can recognize objects in the worst circumstances, since they know in depth what defines the object in question. The fourth contribution is a pose estimation method that requires no color information, only depth. Through a definition of local volumetric appearance and dense feature extraction on the depth image, a method is obtained that is comparable in accuracy to the state of the art but an order of magnitude faster. Together, these contributions to 3D pose estimation have enabled improvements in tools for 3D reconstruction, robot vision, and relocalization in 3D maps. All contributions have been published in reputable international journals and conferences in the area.
Pitteri, Giorgia. "3D Object Pose Estimation in Industrial Context." Thesis, Bordeaux, 2020. http://www.theses.fr/2020BORD0202.
3D object detection and pose estimation are of primary importance for tasks such as robotic manipulation and augmented reality, and they have been the focus of intense research in recent years. Methods relying on depth data acquired by depth cameras are robust; unfortunately, active depth sensors are power-hungry or sometimes cannot be used, so it is often desirable to rely on color images. When training machine learning algorithms that aim at estimating objects' 6D poses from images, many challenges arise, especially in an industrial context, which requires handling objects with symmetries and generalizing to unseen objects, i.e. objects never seen by the networks during training. In this thesis, we first analyse the link between the symmetries of a 3D object and its appearance in images. Our analysis explains why symmetrical objects can be a challenge when training machine learning algorithms to predict their 6D pose from images. We then propose an efficient and simple solution that relies on the normalization of the pose rotation. This approach is general and can be used with any 6D pose estimation algorithm. Then, we address the second main challenge: generalization to unseen objects. Many recent methods for 6D pose estimation are robust and accurate, but their success can be attributed to supervised machine learning: for each new object, these methods have to be retrained on many different images of this object, which are not always available. Even if domain-transfer methods allow, at least to some extent, for training such methods with synthetic images instead of real ones, such training sessions take time, and it is highly desirable to avoid them in practice. We propose two methods to handle this problem. The first relies only on the objects' geometries and focuses on objects with prominent corners, which covers a large number of industrial objects.
We first learn to detect object corners of various shapes in images and to predict their 3D poses, using training images of a small set of objects. To detect a new object in a given image, we first identify its corners from its CAD model; we also detect the corners visible in the image and predict their 3D poses. We then introduce a RANSAC-like algorithm that robustly and efficiently detects and estimates the object's 3D pose by matching its corners on the CAD model with their detected counterparts in the image. The second method overcomes the limitations of the first, as it requires neither objects with specific corners nor the offline selection of corners on the CAD model. It combines deep learning and 3D geometry, relying on an embedding of the local 3D geometry to match CAD models to input images. For points on the surface of objects, this embedding can be computed directly from the CAD model; for image locations, we learn to predict it from the image itself. This establishes correspondences between 3D points on the CAD model and 2D locations in the input images. However, many of these correspondences are ambiguous, as many points may have similar local geometries. We also show that we can use Mask R-CNN in a class-agnostic way to detect new objects without retraining, and thus drastically limit the number of possible correspondences. We can then robustly estimate a 3D pose from these discriminative correspondences using a RANSAC-like algorithm.
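The RANSAC-like corner-matching idea can be sketched with a toy numpy version over synthetic 3D-3D correspondences; `rigid_fit` and `ransac_pose` are illustrative names and the setup is an assumption, since the actual method matches CAD-model corners against image detections:

```python
import numpy as np

rng = np.random.default_rng(2)

def rigid_fit(src, dst):
    """Closed-form (Kabsch) rigid fit mapping src onto dst."""
    cs, cd = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def ransac_pose(model_pts, scene_pts, iters=200, thresh=0.05):
    """Sample 3-point correspondence hypotheses; keep the pose with most inliers."""
    best, best_inliers = None, 0
    for _ in range(iters):
        idx = rng.choice(len(model_pts), size=3, replace=False)
        R, t = rigid_fit(model_pts[idx], scene_pts[idx])
        err = np.linalg.norm(model_pts @ R.T + t - scene_pts, axis=1)
        n = int((err < thresh).sum())
        if n > best_inliers:
            best, best_inliers = (R, t), n
    return best, best_inliers

corners = rng.normal(size=(20, 3))            # "corners" on the CAD model
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random ground-truth rotation
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1
t_true = np.array([0.3, -0.1, 0.2])
scene = corners @ Q.T + t_true                # detected corner poses
scene[:5] += rng.normal(1.0, 1.0, size=(5, 3))  # 5 wrong matches (outliers)

(R, t), n_inliers = ransac_pose(corners, scene)
print(n_inliers)   # the 15 uncorrupted matches survive as inliers
```

The key property shown here is robustness: hypotheses drawn from the 5 wrong matches score poorly, so the consensus pose is recovered from the clean correspondences alone.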
Madadi, Meysam. "Human segmentation, pose estimation and applications." Doctoral thesis, Universitat Autònoma de Barcelona, 2017. http://hdl.handle.net/10803/457900.
Automatically analyzing humans in photographs or videos has great potential for applications in computer vision, including medical diagnosis, sports, entertainment, movie editing and surveillance, to name a few. Body, face and hand are the most studied components of humans. The body has many variabilities in shape and clothing, along with high degrees of freedom in pose. The face has many muscles, causing many visible deformations, besides variable shape and hair style. The hand is a small, fast-moving object with high degrees of freedom. Adding human characteristics to all the aforementioned variabilities makes human analysis quite a challenging task. In this thesis, we developed human segmentation in different modalities. In a first scenario, we segmented the human body and hand in depth images using example-based shape warping. We developed a shape descriptor based on shape context and class probabilities of shape regions to extract nearest neighbors, and considered rigid affine alignment versus non-rigid iterative shape warping. In a second scenario, we segmented faces in RGB images using convolutional neural networks (CNNs). We modeled a conditional random field with recurrent neural networks; in our model, the pairwise kernels are not fixed but learned during training. We trained the network end-to-end using adversarial networks, which improved hair segmentation by a large margin. We also worked on 3D hand pose estimation in depth images. In a generative approach, we fitted a finger model separately for each finger based on our example-based rigid hand segmentation, minimizing an energy function based on overlapping area, depth discrepancy and finger collisions. We also applied linear models in joint trajectory space to refine occluded joints based on the error of the visible joints and the trajectory smoothness of the invisible ones. In a CNN-based approach, we developed a tree-structured network to train specific features for each finger and fused them for global pose consistency.
We also formulated physical and appearance constraints as loss functions. Finally, we developed a number of applications, including human soft-biometrics measurement and garment retexturing. We also generated several datasets in this thesis, covering human segmentation, synthetic hand poses, garment retexturing and Italian gestures.
Gkagkos, Polydefkis. "3D Human Pose and Shape-aware Modelling." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-285922.
The focus of this thesis is the task of estimating a human 3D pose while simultaneously accounting for the person's shape in an image. To render human poses and body shapes we use a recently proposed statistical model, SMPL [1]. We train a neural network to estimate the pose and shape of a person in an image, and then use an optimization procedure to further improve these estimates. The network is trained by incorporating the improved estimates into an objective function together with the initial estimates. This strategy is based on SPIN [2]. We extend the method with an optimization procedure that incorporates multiple views and sums the error over all of them. The motivation for our method is to explore whether it can better guide the network's training. To make our network generalize better, we train on seven datasets simultaneously and achieve accuracy comparable to similar methods in related work. We also perform several experiments to verify the effectiveness of our method.
Johnson, Samuel Alan. "Articulated human pose estimation in natural images." Thesis, University of Leeds, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.598026.
Oleinikov, Georgii. "Towards human pose estimation in video sequences." Thesis, University of British Columbia, 2014. http://hdl.handle.net/2429/45767.
La Gorce, Martin de. "Model-based 3D hand pose estimation from monocular video." Thesis, Châtenay-Malabry, Ecole centrale de Paris, 2009. http://www.theses.fr/2009ECAP0045/document.
In this thesis we propose two methods to automatically recover a full description of the 3D motion of a hand from a monocular video sequence. Using the information provided by the video, our aim is to determine the full set of kinematic parameters required to describe the pose of the hand's skeleton: the angles associated with each joint, and the global position and orientation of the wrist. The problem is extremely challenging: the hand has many degrees of freedom, and self-occlusions are ubiquitous, which makes it difficult to estimate occluded or partially occluded hand parts. We introduce two novel methods of increasing complexity that improve, to a certain extent, the state of the art for the monocular hand-tracking problem. Both are model-based methods in which a hand model is fitted to the image. This process is guided by an objective function that defines an image-based measure of the hand projection given the model parameters, and the fitting is achieved through an iterative refinement technique based on gradient descent that minimizes the objective function. The two methods differ mainly in the choice of the hand model and of the cost function. The first relies on a hand model made of ellipsoids and a simple discrepancy measure based on global color distributions of the hand and the background. The second uses a triangulated surface model with texture and shading, and exploits a robust distance between the synthetic and observed images as the discrepancy measure. While computing the gradient of the discrepancy measure, particular attention is given to terms related to changes of visibility of the surface near self-occlusion boundaries, which are neglected in existing formulations. Our hand-tracking method is not real-time, so interactive applications are not yet possible; increases in computational power and improvements to our method may make real-time performance attainable.
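The iterative gradient-descent fitting loop common to both methods can be illustrated on a toy 2D model; here a rigid point set stands in for the articulated hand model, and a simple sum-of-squares discrepancy replaces the image-based objective, both of which are illustrative assumptions:

```python
import numpy as np

def project(params, model):
    """Toy 'hand model': rotate and translate a 2D point set."""
    a, tx, ty = params
    R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    return model @ R.T + np.array([tx, ty])

def objective(params, model, observed):
    """Discrepancy between the rendered model and the observation."""
    return np.sum((project(params, model) - observed) ** 2)

def numeric_grad(f, p, eps=1e-6):
    """Central-difference gradient of f at p."""
    g = np.zeros_like(p)
    for i in range(len(p)):
        d = np.zeros_like(p)
        d[i] = eps
        g[i] = (f(p + d) - f(p - d)) / (2 * eps)
    return g

model = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
true_params = np.array([0.4, 0.5, -0.3])
observed = project(true_params, model)   # synthetic "image evidence"

p = np.zeros(3)                          # initial pose guess
for _ in range(500):                     # iterative refinement
    p -= 0.05 * numeric_grad(lambda q: objective(q, model, observed), p)

print(np.round(p, 3))                    # recovers true_params
```

The thesis works with analytic gradients of a much richer discrepancy (including visibility terms at self-occlusion boundaries); the descent loop itself has the same shape as this sketch.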
Cao, Hui, Yoshinori Takeuchi, Tetsuya Matsumoto, Hiroaki Kudo, and Noboru Ohnishi. "Recovering Human Pose by Collaborative Generative Models Estimation." INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE, 2005. http://hdl.handle.net/2237/10377.
Zhu, Aichun. "Articulated human pose estimation in images and video." Thesis, Troyes, 2016. http://www.theses.fr/2016TROY0013/document.
Human pose estimation is a challenging problem in computer vision and shares all the difficulties of object detection. This thesis focuses on the problems of human pose estimation in still images and video, including the diversity of appearances, changes in scene illumination and confounding background clutter. To tackle these problems, we build a robust model consisting of the following components. First, top-down and bottom-up methods are combined to estimate human pose: we extend the Pictorial Structures (PS) model to cooperate with an annealed particle filter (APF) for robust multi-view pose estimation. Second, we propose an upper-body-based multiple mixture parts (MMP) model for human pose estimation that consists of two stages. The pre-estimation stage has three steps: upper-body detection, model category estimation for the upper body, and full model selection for pose estimation. In the estimation stage, we address the problem of a variety of human poses and activities. Finally, a Deep Convolutional Neural Network (DCNN) is introduced for human pose estimation: a Local Multi-Resolution Convolutional Neural Network (LMR-CNN) is proposed to learn the representation of each body part, and an LMR-CNN-based hierarchical model is defined to meet the structural complexity of limb parts. The experimental results demonstrate the effectiveness of the proposed model.
Navaratnam, Ramanan. "Probabilistic human body pose estimation from monocular images." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.612174.
Sandhu, Romeil Singh. "Statistical methods for 2D image segmentation and 3D pose estimation." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/37245.
Thayananthan, Arasanathan. "Template-based pose estimation and tracking of 3D hand motion." Thesis, University of Cambridge, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.613782.
Full text
Cetinkaya, Guven. "A Comparative Study On Pose Estimation Algorithms Using Visual Data." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614109/index.pdf.
Full textnamely, Orthogonal Iterations, POSIT, DLT and Efficient PnP are compared. Moreover, two other well-known algorithms that solve the correspondence and pose problems simultaneously
Soft POSIT and Blind- PnP are also compared in the scope of this thesis study. In the first step of the simulations, synthetic data is formed using a realistic motion scenario and the algorithms are compared using this data. In the next step, real images captured by a calibrated camera for an object with known 3D model are exploited. The simulation results indicate that POSIT algorithm performs the best among the algorithms requiring point correspondences. Another result obtained from the experiments is that Soft-POSIT algorithm can be considered to perform better than Blind-PnP algorithm.
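Among the correspondence-based algorithms compared, DLT is the simplest to write down: it estimates the 3x4 camera projection matrix linearly from 3D-2D point correspondences. A minimal numpy sketch (an illustration, not the thesis implementation), assuming at least six noiseless correspondences:

```python
import numpy as np

def dlt_projection_matrix(X, x):
    """Estimate a 3x4 projection matrix P (up to scale) from n >= 6
    correspondences: 3D points X (n, 3) and 2D image points x (n, 2).
    Builds the standard DLT system A p = 0 and takes the right singular
    vector of A with the smallest singular value."""
    rows = []
    for Xw, u in zip(X, x):
        Xh = np.append(Xw, 1.0)  # homogeneous 3D point
        rows.append(np.concatenate([Xh, np.zeros(4), -u[0] * Xh]))
        rows.append(np.concatenate([np.zeros(4), Xh, -u[1] * Xh]))
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    return Vt[-1].reshape(3, 4)

def project(P, X):
    """Apply P to 3D points X (n, 3), returning 2D points (n, 2)."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    uvw = Xh @ P.T
    return uvw[:, :2] / uvw[:, 2:3]
```

With exact correspondences the recovered matrix equals the true one up to scale, so reprojections agree to numerical precision; with noisy data the same system is solved in a least-squares sense and usually refined iteratively.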
Brauer, Jürgen [author]. "Human Pose Estimation with Implicit Shape Models." Karlsruhe: KIT Scientific Publishing, 2014. http://www.ksp.kit.edu.
Full text
Burke, Michael Glen. "Fast upper body pose estimation for human-robot interaction." Thesis, University of Cambridge, 2015. https://www.repository.cam.ac.uk/handle/1810/256305.
Full text
Zhu, Youding. "Model-Based Human Pose Estimation with Spatio-Temporal Inferencing." The Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1242752509.
Full text
Krejov, Philip G. "Real time hand pose estimation for human computer interaction." Thesis, University of Surrey, 2016. http://epubs.surrey.ac.uk/809973/.
Full text
Jaeggli, Tobias. "Statistical models for human body pose estimation from videos." Konstanz: Hartung-Gorre, 2008. http://d-nb.info/991839315/04.
Full text
Gomez-Donoso, Francisco. "Contributions to 3D object recognition and 3D hand pose estimation using deep learning techniques." Doctoral thesis, Universidad de Alicante, 2020. http://hdl.handle.net/10045/110658.
Full textDevanne, Maxime. "3D human behavior understanding by shape analysis of human motion and pose." Thesis, Lille 1, 2015. http://www.theses.fr/2015LIL10138/document.
Full textThe emergence of RGB-D sensors providing the 3D structure of both the scene and the human body offers new opportunities for studying human motion and understanding human behaviors. However, the design and development of models for behavior recognition that are both accurate and efficient is a challenging task due to the variability of the human pose, the complexity of human motion and possible interactions with the environment. In this thesis, we first focus on the action recognition problem by representing human action as the trajectory of 3D coordinates of human body joints over the time, thus capturing simultaneously the body shape and the dynamics of the motion. The action recognition problem is then formulated as the problem of computing the similarity between shape of trajectories in a Riemannian framework. Experiments carried out on four representative benchmarks demonstrate the potential of the proposed solution in terms of accuracy/latency for a low-latency action recognition. Second, we extend the study to more complex behaviors by analyzing the evolution of the human pose shape to decompose the motion stream into short motion units. Each motion unit is then characterized by the motion trajectory and depth appearance around hand joints, so as to describe the human motion and interaction with objects. Finally, the sequence of temporal segments is modeled through a Dynamic Naive Bayesian Classifier. Experiments on four representative datasets evaluate the potential of the proposed approach in different contexts, including recognition and online detection of behaviors
Cabras, Paolo. "3D Pose estimation of continuously deformable instruments in robotic endoscopic surgery." Thesis, Strasbourg, 2016. http://www.theses.fr/2016STRAD007/document.
Full text
Knowing the 3D position of robotized instruments can be useful in a surgical context, e.g. for their automatic control or for gesture guidance. We propose two methods to infer the 3D pose of a single bending-section instrument equipped with colored markers, using only the images provided by the monocular camera embedded in the endoscope. A graph-based method is used to segment the markers. Their corners are extracted by detecting color transitions along Bézier curves fitted on edge points. These features are used to estimate the 3D pose of the instrument with an adaptive model that takes into account the mechanical play of the system. Since this method can be affected by model uncertainties, the image-to-3D function can instead be learned from a training set. We opted for two techniques, which we improved: Radial Basis Function Networks with Gaussian kernels and Locally Weighted Projection. The proposed methods are validated on a robotic experimental cell and on in-vivo sequences.
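A Gaussian-kernel RBF network of the kind mentioned can be fitted by regularized least squares over radial features. The sketch below is a generic baseline, not the improved variant from the thesis; center placement and the kernel width `sigma` are left as assumptions.

```python
import numpy as np

def rbf_features(X, centers, sigma):
    """Gaussian kernel responses of inputs X (n, din) at centers (k, din)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_rbf(X, Y, centers, sigma, reg=1e-8):
    """Fit the output weights of a Gaussian-kernel RBF network mapping
    X (n, din) to Y (n, dout) by regularized least squares."""
    Phi = rbf_features(X, centers, sigma)
    return np.linalg.solve(Phi.T @ Phi + reg * np.eye(len(centers)), Phi.T @ Y)

def predict_rbf(X, centers, sigma, W):
    """Evaluate the fitted network at new inputs X."""
    return rbf_features(X, centers, sigma) @ W
```

With the training inputs themselves as centers and a small regularizer, the network nearly interpolates the training targets; sparser center sets trade accuracy for speed.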
Schick, Alexander [Verfasser], and R. [Akademischer Betreuer] Stiefelhagen. "Human Pose Estimation with Supervoxels / Alexander Schick. Betreuer: R. Stiefelhagen." Karlsruhe : KIT-Bibliothek, 2014. http://d-nb.info/1051371317/34.
Full text
Dogan, Emre. "Human pose estimation and action recognition by multi-robot systems." Thesis, Lyon, 2017. http://www.theses.fr/2017LYSEI060/document.
Full text
Estimating human pose and recognizing human activities are important steps in many applications, such as human-computer interfaces (HCI), health care, smart conferencing, robotics, and security surveillance. Despite the ongoing effort in the domain, these tasks remain unsolved, particularly in unconstrained and non-cooperative environments. Pose estimation and activity recognition face many challenges under these conditions, such as occlusion or self-occlusion, variations in clothing, background clutter, the deformable nature of the human body and the diversity of human behaviors during activities. Using depth imagery has been a popular solution for addressing appearance- and background-related challenges, but it has a restricted application area due to its hardware limitations and fails to handle the remaining problems. Specifically, we considered action recognition scenarios where the position of the recording device is not fixed, which consequently require a method that is not affected by the viewpoint. As a second problem, we tackled the human pose estimation task in settings where multiple visual sensors are available and allowed to collaborate. In this thesis, we addressed these two related problems separately. In the first part, we focused on indoor action recognition from videos, considering complex activities. To this end, we explored several methodologies and eventually introduced a 3D spatio-temporal representation of a video sequence that is viewpoint independent. More specifically, we captured the movement of the person over time using a depth sensor and encoded it in 3D to represent the performed action with a single structure. A 3D feature descriptor was employed afterwards to build a codebook and classify the actions with the bag-of-words approach. In the second part, we concentrated on articulated pose estimation, which is often an intermediate step for activity recognition. Our motivation was to incorporate information from multiple sources and views and fuse it early in the pipeline to overcome the problem of self-occlusion, and eventually obtain robust estimations. To achieve this, we proposed a multi-view flexible mixture-of-parts model inspired by the classical pictorial structures methodology. In addition to the single-view appearance of the human body and its kinematic priors, we demonstrated that geometrical constraints and appearance-consistency parameters are effective for boosting the coherence between the viewpoints in a multi-view setting. Both proposed methods were evaluated on public benchmarks, showing that the use of view-independent representations and the integration of information from multiple viewpoints improve the performance of action recognition and pose estimation tasks, respectively.
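The bag-of-words encoding described above reduces, at test time, to assigning each local 3D descriptor to its nearest codeword and histogramming the assignments. A minimal sketch (building the codebook itself, e.g. with k-means, is assumed done elsewhere):

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Encode local descriptors (n, d) as a normalized histogram of
    nearest codewords from a codebook (k, d)."""
    # squared distances from every descriptor to every codeword
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = np.argmin(d2, axis=1)                 # hard assignment
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()                      # L1-normalize
```

The resulting fixed-length histogram can then be fed to any standard classifier, regardless of how many descriptors the sequence produced.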
Lu, Yao. "Human body tracking and pose estimation from monocular image sequences." Thesis, Curtin University, 2013. http://hdl.handle.net/20.500.11937/1665.
Full textGarau, Nicola. "Design of Viewpoint-Equivariant Networks to Improve Human Pose Estimation." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/345132.
Full textDerkach, Dmytro. "Spectrum analysis methods for 3D facial expression recognition and head pose estimation." Doctoral thesis, Universitat Pompeu Fabra, 2018. http://hdl.handle.net/10803/664578.
Full text
Facial analysis has attracted considerable research effort over the last decades, with a growing interest in improving the interaction and cooperation between people and computers. This makes it necessary for automatic systems to be able to react to things such as the head movements of a user or his/her emotions. Further, this should be done accurately and in unconstrained environments, which highlights the need for algorithms that can take full advantage of 3D data. Such systems could be useful in multiple domains, such as human-computer interaction, tutoring, interviewing, health care and marketing. In this thesis, we focus on two aspects of facial analysis: expression recognition and head pose estimation. In both cases, we specifically target the use of 3D data and present contributions that aim to identify meaningful representations of the facial geometry based on spectral decomposition methods:
1. We propose a spectral representation framework for facial expression recognition using exclusively 3D geometry, which allows a complete description of the underlying surface that can be further tuned to the desired level of detail. It is based on the decomposition of local surface patches into their spatial frequency components, much like a Fourier transform, which are related to intrinsic characteristics of the surface. We propose the use of Graph Laplacian Features (GLFs), which result from the projection of local surface patches onto a common basis obtained from the Graph Laplacian eigenspace. The proposed approach is tested on expression and Action Unit recognition, and the results confirm that the proposed GLFs produce state-of-the-art recognition rates.
2. We propose an approach for head pose estimation that allows modeling the underlying manifold that results from general rotations in 3D. We start by building a fully automatic system based on the combination of landmark detection and dictionary-based features, which obtained the best results in the FG2017 Head Pose Estimation Challenge. Then, we use tensor representation and higher-order singular value decomposition to separate the subspaces that correspond to each rotation factor and show that each of them has a clear structure that can be modeled with trigonometric functions. Such a representation provides a deep understanding of the data behavior and can be used to further improve the estimation of the head pose angles.
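The Graph Laplacian Features of item 1 amount to projecting a per-vertex signal from a surface patch onto the low-frequency end of the patch graph's Laplacian eigenbasis. A minimal dense-matrix sketch (an illustration, not the author's implementation):

```python
import numpy as np

def graph_laplacian_features(adj, signal, k):
    """Spectral descriptor of a per-vertex signal on a patch graph.

    adj:    (n, n) symmetric adjacency matrix of the surface patch graph.
    signal: (n,) per-vertex values (e.g. depth or coordinates of the patch).
    k:      number of low-frequency components to keep.
    Projects the signal onto the k eigenvectors of the combinatorial
    Laplacian L = D - A with the smallest eigenvalues."""
    L = np.diag(adj.sum(axis=1)) - adj
    _, eigvecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    return eigvecs[:, :k].T @ signal
```

Keeping only the first k coefficients acts like a low-pass filter on the patch, which is what lets the descriptor be tuned to the desired level of detail.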
Lecrosnier, Louis. "Estimation de pose multimodale - Approche robuste par les droites 2D et 3D." Thesis, Normandie, 2019. http://www.theses.fr/2019NORMR089.
Full text
Camera pose estimation consists in determining the position and the orientation of a camera with respect to a reference frame. In the context of mobile robotics, multimodality, i.e. the use of various sensor types, is often a requirement for solving complex tasks. However, knowing the orientation and position, i.e. the pose, of each sensor with respect to a common frame is generally necessary to benefit from multimodality. In this context, we present two major contributions in this PhD thesis. First, we introduce a pose estimation algorithm relying on 2D and 3D lines and a known vertical direction. Second, we present two outlier rejection and line pairing methods based on the well-known RANSAC algorithm. Our methods make use of the vertical direction to reduce the number of lines required to 2 and 1, i.e. RANSAC2 and RANSAC1. A robustness evaluation of our contributions is performed on simulated and real data. We show state-of-the-art results.
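Both RANSAC2 and RANSAC1 follow the standard hypothesize-and-verify loop, with the known vertical direction shrinking the minimal sample size. The generic loop can be sketched on a toy 2D line-fitting problem (the thesis's line-based pose model is not reproduced here):

```python
import numpy as np

def ransac_line(points, n_iters=200, thresh=0.1, rng=None):
    """Generic RANSAC loop, here fitting a 2D line y = a*x + b from
    2-point minimal samples: repeatedly fit on a random minimal sample,
    count inliers, and keep the model with the largest consensus set."""
    rng = rng or np.random.default_rng(0)
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if abs(x2 - x1) < 1e-9:
            continue  # skip degenerate (vertical) samples
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        resid = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = int((resid < thresh).sum())
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers
```

Reducing the minimal sample from 2 to 1, as the vertical-direction constraint allows in the thesis, shrinks the number of iterations needed for the same success probability, since each hypothesis is more likely to be outlier-free.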