To see the other types of publications on this topic, follow the link: Depth camera.

Dissertations / Theses on the topic 'Depth camera'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Depth camera.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations and theses from a wide variety of disciplines and organise your bibliography correctly.

1

Sjöholm, Daniel. "Calibration using a general homogeneous depth camera model." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-204614.

Full text
Abstract:
Accurately measuring distances in depth images is important for accurately reconstructing objects. However, depth measurement is a noisy process, and depth sensors can benefit from additional correction even after factory calibration. We regard the pair of depth sensor and image sensor as one single unit returning complete 3D information, relying on the more accurate image sensor for everything except the depth measurement. We present a new linear method of correcting depth distortion, using an empirical model built around the constraint of only modifying depth data while keeping planes planar. The depth distortion model is implemented and tested on the Intel RealSense SR300 camera. The results show that the model is viable and generally decreases depth measurement errors after calibration, with an average improvement in the 50 percent range on the tested data sets.
Att noggrant kunna mäta avstånd i djupbilder är viktigt för att kunna göra bra rekonstruktioner av objekt. Men denna mätprocess är brusig och dagens djupsensorer tjänar på ytterligare korrektion efter fabrikskalibrering. Vi betraktar paret av en djupsensor och en bildsensor som en enda enhet som returnerar komplett 3D information. 3D informationen byggs upp från de två sensorerna genom att lita på den mer precisa bildsensorn för allt förutom djupmätningen. Vi presenterar en ny linjär metod för att korrigera djupdistorsion med hjälp av en empirisk modell, baserad kring att enbart förändra djupdatan medan plana ytor behålls plana. Djupdistortionsmodellen implementerades och testades på kameratypen Intel RealSense SR300. Resultaten visar att modellen fungerar och i regel minskar mätfelet i djupled efter kalibrering, med en genomsnittlig förbättring kring 50 procent för de testade dataseten.
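As an illustration of the plane-based constraint described in this abstract (a minimal sketch, not the thesis's actual model; all names and values below are assumptions), one can calibrate a simple depth correction by imaging a flat wall, fitting a plane to the measured 3D points, and scaling each measured point onto that plane along its viewing ray:

    import numpy as np

    def fit_plane(points):
        """Least-squares plane n.x = d through Nx3 points; returns (n, d)."""
        centroid = points.mean(axis=0)
        _, _, vt = np.linalg.svd(points - centroid)
        n = vt[-1]                      # normal = direction of smallest variance
        return n, n @ centroid

    def depth_correction_factors(points):
        """Per-point multiplicative factor that moves each measured point onto
        the best-fit plane along the ray from the camera centre."""
        n, d = fit_plane(points)
        # The ray t * p hits the plane n.x = d at t = d / (n.p)
        return d / (points @ n)

    # Toy usage: noisy observations of a wall at z = 2 m
    pts = np.random.rand(500, 3) * [1.0, 1.0, 0.0] + [0.0, 0.0, 2.0]
    noisy = pts * (1 + 0.01 * np.random.randn(500, 1))
    corrected = noisy * depth_correction_factors(noisy)[:, None]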
APA, Harvard, Vancouver, ISO, and other styles
2

Jansson, Isabell. "Visualizing Realtime Depth Camera Configuration using Augmented Reality." Thesis, Linköpings universitet, Medie- och Informationsteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139934.

Full text
Abstract:
A Time-of-Flight camera from SICK IVP AB is used to monitor a region of interest, which means that the camera has to be configured: it has to be mounted correctly and be aware of the region of interest. The configuration process currently requires the user to manage the captured 3-dimensional data in a 2-dimensional environment, which obstructs the process and can cause configuration errors due to misinterpretation of the captured data. The aim of the thesis is to investigate the concept of using Augmented Reality as a tool for facilitating the configuration process of a Time-of-Flight camera and to evaluate whether Augmented Reality enhances the understanding of the process. In order to evaluate the concept, a prototype application is developed. The thesis report discusses the motivation and background of the work, the implementation as well as the results.
APA, Harvard, Vancouver, ISO, and other styles
3

Efstratiou, Panagiotis. "Skeleton Tracking for Sports Using LiDAR Depth Camera." Thesis, KTH, Medicinteknik och hälsosystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297536.

Full text
Abstract:
Skeletal tracking can be accomplished by deploying human pose estimation strategies. Deep learning has been shown to be the leading approach in this domain, and in combination with a light detection and ranging (LiDAR) depth camera the development of a markerless motion analysis software system appears feasible. The project utilizes a trained convolutional neural network to track humans performing sport activities and to provide feedback after biomechanical analysis. Implementations of four filtering methods, chosen with regard to the nature of the movement, are presented: a Kalman filter, a fixed-interval smoother, a Butterworth filter and a moving average filter. The software appears practicable in the field, evaluating videos at 30 Hz, as demonstrated on indoor cycling and hammer throwing events. A non-static camera behaves quite well against a standstill, upright person, with mean absolute errors of 8.32% and 6.46% relative to the left and right knee angle, respectively. An impeccable system would benefit not only the sports domain but also the health industry as a whole.
Skelettspårning kan åstadkommas med hjälp av metoder för uppskattning av mänsklig pose. Djupinlärningsmetoder har visat sig vara det främsta tillvägagångssättet och om man använder en djupkamera med ljusdetektering och varierande omfång verkar det vara möjligt att utveckla ett markörlöst system för rörelseanalysmjukvara. I detta projekt används ett tränat neuralt nätverk för att spåra människor under sportaktiviteter och för att ge feedback efter biomekanisk analys. Implementeringar av fyra olika filtreringsmetoder för mänskliga rörelser presenteras, kalman filter, utjämnare med fast intervall, butterworth och glidande medelvärde. Mjukvaran verkar vara användbar vid fälttester för att utvärdera videor vid 30Hz. Detta visas genom analys av inomhuscykling och släggkastning. En ickestatisk kamera fungerar ganska bra vid mätningar av en stilla och upprättstående person. Det genomsnittliga absoluta felet är 8.32% respektive 6.46% då vänster samt höger knävinkel användes som referens. Ett felfritt system skulle gynna såväl idrottssom hälsoindustrin.
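As a small illustration of the kind of filtering named in this abstract (a sketch, not the thesis code; the frame rate, cut-off frequency and simulated signal are assumptions), a noisy knee-angle series from a 30 Hz camera can be smoothed with a zero-phase Butterworth filter or a moving average:

    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 30.0                                   # camera frame rate (Hz)
    t = np.arange(0, 10, 1 / fs)
    true_angle = 90 + 30 * np.sin(2 * np.pi * 1.0 * t)      # simulated knee angle (deg)
    measured = true_angle + 2.0 * np.random.randn(t.size)   # tracking noise

    # Butterworth low-pass, applied forward and backward for zero phase lag
    b, a = butter(N=4, Wn=6.0 / (fs / 2))       # 4th order, 6 Hz cut-off (assumed)
    butter_smoothed = filtfilt(b, a, measured)

    # Simple moving-average filter over 5 frames
    moving_avg = np.convolve(measured, np.ones(5) / 5, mode="same")

    rmse = np.sqrt(np.mean((butter_smoothed - true_angle) ** 2))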
APA, Harvard, Vancouver, ISO, and other styles
4

Huotari, V. (Ville). "Depth camera based customer behaviour analysis for retail." Master's thesis, University of Oulu, 2015. http://urn.fi/URN:NBN:fi:oulu-201510292099.

Full text
Abstract:
In the 2000s, traditional shop-based retailing has had to adapt to competition from internet-based e-commerce. In contrast to traditional retail, e-commerce can gather an unprecedented amount of information about its customers and their behaviour. To enable behaviour-based analysis in traditional retailing, customers need to be tracked reliably through the store. One such tracking technology is the depth camera people tracking system developed at VTT Technical Research Centre of Finland Ltd. This study aims to use the aforementioned people tracking system's data to enable e-commerce-style behavioural analysis in physical retail locations. The study follows the design science research paradigm to construct a real-life artefact. The artefact designed and implemented is based on accumulated knowledge from a systematic literature review, an application domain analysis and iterative software engineering practices. The systematic literature review is used to understand what kind of performance evaluation is done in retail. These metrics are then analysed with respect to people tracking technologies in order to propose a conceptual framework for customer tracking in retail. From this, the artefact is designed, implemented and evaluated. Evaluation is done by a combination of requirement validation, field experiments and three distinct real-life field studies. The literature review found that retailing traditionally uses easily available performance metrics such as sales and profit. It was also clear that movement data, apart from traffic counting, has been unavailable to retail and thus is not often used as a quantifiable performance metric. As a result, this study presents one novel way to use customer movement as a store performance metric. The artefact constructed quantifies, visualises and analyses customer tracking data from the provided depth camera system, which is a new approach in the people tracking domain. The evaluation with real-life cases concludes that the artefact can indeed find and classify interesting behavioural patterns in customer tracking data.
APA, Harvard, Vancouver, ISO, and other styles
5

Januzi, Altin. "Triple-Camera Setups for Image-Based Depth Estimation." Thesis, Uppsala universitet, Institutionen för elektroteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-422717.

Full text
Abstract:
This study attempts to determine whether a three-camera setup can improve depth map retrieval compared to a two-camera setup. The study is based on three main ideas for exploiting information from the additional camera: three cameras on one axis, to exploit the benefits of both wide and short baselines; three cameras on different axes, to exploit vertical and horizontal scanning of the scene; and a third idea combining the two previous ones. More than three cameras would impose particular implications on the solution, and without sufficient theoretical justification the study was limited to the different possible three-camera configurations. As a practical connection was of interest, the study was further limited by the requirement to perform in real time. An implementation based on previous research was made to evaluate images of specific scenes. Pre-processing by Census transformation of the images, camera calibration and rectification of the different camera setups, and optimization by the SGM algorithm are part of the solution used to retrieve results for analysis. Two tests were then studied, first one with rendered images and then one with images from real cameras. From these tests it was noted that a three-camera configuration can improve the results significantly; further, if the third camera was placed on an axis perpendicular to the first camera pair, unique information was obtained which improves the result in specific cases. Using three cameras on the same axis showed no improvement when considering the error metrics BMP and BMPRE, but offers wider application uses, consistently providing better results than the worst pair.
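For readers unfamiliar with the Census transform mentioned above, the following minimal sketch (a generic textbook version with an assumed square window, not the thesis's implementation) encodes each pixel by comparing its neighbours against the centre; matching cost between two pixels is then the Hamming distance of their codes:

    import numpy as np

    def census_transform(img, window=3):
        """Census transform of a 2D grayscale image: each pixel becomes a bit
        string with a 1 wherever a neighbour is darker than the centre pixel."""
        r = window // 2
        h, w = img.shape
        out = np.zeros((h, w), dtype=np.uint64)
        padded = np.pad(img, r, mode="edge")
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if dy == 0 and dx == 0:
                    continue
                neighbour = padded[r + dy : r + dy + h, r + dx : r + dx + w]
                out = (out << np.uint64(1)) | (neighbour < img).astype(np.uint64)
        return out

    def hamming(a, b):
        """Matching cost between two Census codes."""
        return bin(int(a) ^ int(b)).count("1")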
APA, Harvard, Vancouver, ISO, and other styles
6

Rangappa, Shreedhar. "Absolute depth using low-cost light field cameras." Thesis, Loughborough University, 2018. https://dspace.lboro.ac.uk/2134/36224.

Full text
Abstract:
Digital cameras are increasingly used for measurement tasks within engineering scenarios, often as part of metrology platforms. Existing cameras are well equipped to provide 2D information about the fields of view (FOV) they observe, the objects within the FOV, and the accompanying environments. But for some applications these 2D results are not sufficient, specifically applications that require Z-dimensional data (depth data) along with the X and Y dimensional data. New camera system designs have previously been developed by integrating multiple cameras to provide 3D data, ranging from two-camera photogrammetry to multiple-camera stereo systems. Many earlier attempts to record 3D data on 2D sensors have been completed, and likewise many research groups around the world are currently working on camera technology, but from different perspectives: computer vision, algorithm development, metrology, etc. Plenoptic or light field camera technology was defined as a technique over 100 years ago but has remained dormant as a potential metrology instrument. Light field cameras utilize an additional Micro Lens Array (MLA) in front of the imaging sensor to create multiple viewpoints of the same scene and allow encoding of depth information. A small number of companies have explored the potential of light field cameras, but in the majority these have been aimed at domestic consumer photography, only ever recording scenes as relative-scale greyscale images. This research considers the potential for light field cameras to be used for world-scene metrology applications, specifically to record absolute coordinate data. Specific attention has been paid to a range of low-cost light field cameras in order to: understand the functional/behavioural characteristics of the optics; identify potential needs for optical and/or algorithm development; define sensitivity, repeatability and accuracy characteristics and limiting thresholds of use; and allow quantified 3D absolute-scale coordinate data to be extracted from the images. The novel outputs of this work are: an analysis of light field camera system sensitivity, leading to the definition of Active Zones (linear data generation, good data) and Inactive Zones (non-linear data generation, poor data); the development of bespoke calibration algorithms that remove radial/tangential distortion from the data captured using any MLA-based camera; and a light-field-camera-independent algorithm that allows the delivery of 3D coordinate data in absolute units within a well-defined measurable range from a given camera.
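The radial/tangential distortion referred to above is usually modelled with the generic Brown-Conrady equations; the sketch below shows that textbook model with placeholder coefficients, not the bespoke calibration developed in the thesis:

    import numpy as np

    def distort(xy, k1, k2, p1, p2):
        """Apply radial (k1, k2) and tangential (p1, p2) distortion to
        normalized image coordinates xy of shape (N, 2)."""
        x, y = xy[:, 0], xy[:, 1]
        r2 = x * x + y * y
        radial = 1 + k1 * r2 + k2 * r2 ** 2
        x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
        y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
        return np.column_stack([x_d, y_d])

    def undistort(xy_d, k1, k2, p1, p2, iters=10):
        """Invert the model by fixed-point iteration (adequate for mild distortion)."""
        xy = xy_d.copy()
        for _ in range(iters):
            xy = xy_d - (distort(xy, k1, k2, p1, p2) - xy)
        return xy

    pts = np.array([[0.1, 0.2], [-0.3, 0.05]])
    restored = undistort(distort(pts, 1e-1, 1e-2, 1e-3, 1e-3), 1e-1, 1e-2, 1e-3, 1e-3)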
APA, Harvard, Vancouver, ISO, and other styles
7

Nassir, Cesar. "Domain-Independent Moving Object Depth Estimation using Monocular Camera." Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233519.

Full text
Abstract:
Today, automotive companies across the world strive to create vehicles with fully autonomous capabilities. There are many benefits to developing autonomous vehicles, such as reduced traffic congestion, increased safety and reduced pollution. To achieve that goal there are many challenges ahead, one of them being visual perception. Estimating depth from a 2D image has been shown to be a key component for 3D recognition, reconstruction and segmentation. Estimating depth in an image from a monocular camera is an ill-posed problem, since the mapping from colour intensity to depth value is ambiguous. Depth estimation from stereo images has come far compared to monocular depth estimation and was initially what depth estimation relied on. However, being able to exploit monocular cues is necessary in scenarios where stereo depth estimation is not possible. We present a novel CNN, BiNet, inspired by ENet, to tackle depth estimation of moving objects in real time using only a monocular camera. It performs better than ENet on the Cityscapes dataset while adding only a small overhead in complexity.
I dag strävar bilföretag över hela världen för att skapa fordon med helt autonoma möjligheter. Det finns många fördelar med att utveckla autonoma fordon, såsom minskad trafikstockning, ökad säkerhet och minskad förorening, etc. För att kunna uppnå det målet finns det många utmaningar framåt, en av dem är visuell uppfattning. Att kunna uppskatta djupet från en 2D-bild har visat sig vara en nyckelkomponent för 3D-igenkännande, rekonstruktion och segmentering. Att kunna uppskatta djupet i en bild från en monokulär kamera är ett svårt problem eftersom det finns tvetydighet mellan kartläggningen från färgintensitet och djupvärde. Djupestimering från stereobilder har kommit långt jämfört med monokulär djupestimering och var ursprungligen den metod som man har förlitat sig på. Att kunna utnyttja monokulära bilder är dock nödvändig för scenarier när stereodjupuppskattning inte är möjligt. Vi har presenterat ett nytt nätverk, BiNet som är inspirerat av ENet, för att ta itu med djupestimering av rörliga objekt med endast en monokulär kamera i realtid. Det fungerar bättre än ENet med datasetet Cityscapes och lägger bara till en liten kostnad på komplexiteten.
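BiNet itself is not reproduced here, but the following minimal PyTorch sketch shows the general encoder-decoder pattern for per-pixel depth regression that such networks follow; the architecture and loss are illustrative assumptions, not the thesis's design:

    import torch
    import torch.nn as nn

    class TinyDepthNet(nn.Module):
        """Toy encoder-decoder that regresses one depth value per pixel."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
            )
        def forward(self, x):
            return self.decoder(self.encoder(x))

    net = TinyDepthNet()
    rgb = torch.randn(2, 3, 128, 256)            # batch of input images
    target = torch.rand(2, 1, 128, 256)          # ground-truth (inverse) depth
    loss = nn.functional.l1_loss(net(rgb), target)
    loss.backward()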
APA, Harvard, Vancouver, ISO, and other styles
8

Pinard, Clément. "Robust Learning of a depth map for obstacle avoidance with a monocular stabilized flying camera." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLY003/document.

Full text
Abstract:
Le drone orienté grand public est principalement une caméra volante, stabilisée et de bonne qualité. Ceux-ci ont démocratisé la prise de vue aérienne, mais avec leur succès grandissant, la notion de sécurité est devenue prépondérante.Ce travail s'intéresse à l'évitement d'obstacle, tout en conservant un vol fluide pour l'utilisateur.Dans ce contexte technologique, nous utilisons seulement une camera stabilisée, par contrainte de poids et de coût.Pour leur efficacité connue en vision par ordinateur et leur performance avérée dans la résolution de tâches complexes, nous utilisons des réseaux de neurones convolutionnels (CNN). Notre stratégie repose sur un systeme de plusieurs niveaux de complexité dont les premieres étapes sont de mesurer une carte de profondeur depuis la caméra. Cette thèse étudie les capacités d'un CNN à effectuer cette tâche.La carte de profondeur, étant particulièrement liée au flot optique dans le cas d'images stabilisées, nous adaptons un réseau connu pour cette tâche, FlowNet, afin qu'il calcule directement la carte de profondeur à partir de deux images stabilisées. Ce réseau est appelé DepthNet.Cette méthode fonctionne en simulateur avec un entraînement supervisé, mais n'est pas assez robuste pour des vidéos réelles. Nous étudions alors les possibilites d'auto-apprentissage basées sur la reprojection différentiable d'images. Cette technique est particulièrement nouvelle sur les CNNs et nécessite une étude détaillée afin de ne pas dépendre de paramètres heuristiques.Finalement, nous développons un algorithme de fusion de cartes de profondeurs pour utiliser DepthNet sur des vidéos réelles. Plusieurs paires différentes sont données à DepthNet afin d'avoir une grande plage de profondeurs mesurées
Consumer unmanned aerial vehicles (UAVs) are mainly flying cameras. They have democratized aerial footage, but with their success came security concerns. This work aims at improving UAV safety with obstacle avoidance, while keeping a smooth flight. In this context, we use only one stabilized camera, because of weight and cost incentives. For their robustness in computer vision and their capacity to solve complex tasks, we chose to use convolutional neural networks (CNN). Our strategy is based on incrementally learning tasks with increasing complexity, of which the first steps are to construct a depth map from the stabilized camera. This thesis focuses on studying the ability of CNNs to be trained for this task. In the case of stabilized footage, the depth map is closely linked to optical flow. We thus adapt FlowNet, a CNN known for optical flow, to output depth directly from two stabilized frames. This network is called DepthNet. This experiment succeeded with synthetic footage, but is not robust enough to be used directly on real videos. Consequently, we consider self-supervised training with real videos, based on differentiable image reprojection. This training method for CNNs being rather novel in the literature, a thorough study is needed in order not to depend too much on heuristics. Finally, we developed a depth fusion algorithm to use DepthNet efficiently on real videos. Multiple frame pairs are fed to DepthNet to obtain a large depth sensing range.
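To illustrate the link stated above between depth and optical flow for a stabilized (rotation-free) camera, the sketch below inverts the flow produced by a purely lateral translation; the focal length and displacement values are made up, and this is not DepthNet itself:

    import numpy as np

    def depth_from_lateral_flow(flow_x, focal_px, baseline_m):
        """For a camera translating sideways by baseline_m with no rotation,
        horizontal flow u = focal * t / Z, hence Z = focal * t / u."""
        flow_x = np.where(np.abs(flow_x) < 1e-6, np.nan, flow_x)   # avoid division by zero
        return focal_px * baseline_m / np.abs(flow_x)

    focal_px = 600.0        # assumed focal length in pixels
    baseline_m = 0.30       # assumed translation between the two stabilized frames
    flow_x = np.array([[18.0, 9.0], [6.0, 3.0]])    # horizontal flow in pixels
    depth = depth_from_lateral_flow(flow_x, focal_px, baseline_m)   # metres
    # e.g. 18 px of flow -> 600 * 0.3 / 18 = 10 m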
APA, Harvard, Vancouver, ISO, and other styles
9

Kuznetsova, Alina [Verfasser]. "Hand pose recognition using a consumer depth camera / Alina Kuznetsova." Hannover : Technische Informationsbibliothek (TIB), 2016. http://d-nb.info/1100290125/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Sandberg, David. "Model-Based Video Coding Using a Colour and Depth Camera." Thesis, Linköpings universitet, Datorseende, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-68737.

Full text
Abstract:
In this master thesis, a model-based video coding algorithm has been developed that uses input from a colour and depth camera, such as the Microsoft Kinect. Using a model-based representation of a video has several advantages over the commonly used block-based approach employed by the H.264 standard. For example, videos can be rendered in 3D, viewed from alternative viewpoints, and have objects inserted into them for augmented reality and user interaction. This master thesis demonstrates a very efficient way of encoding the geometry of a scene. The results of the proposed algorithm show that it can reach very low bitrates with results comparable to the H.264 standard.
I detta examensarbete har en modellbaserad videokodningsalgoritm utvecklats som använder data från en djup- och färgkamera, exempelvis Microsoft Kinect. Det finns flera fördelar med en modellbaserad representation av en video över den mer vanligt förekommande blockbaserade varianten, vilket används av bland annat H.264. Några exempel är möjligheten att rendera videon i 3D samt från alternativa vyer, placera in objekt i videon samt möjlighet för användaren att interagera med scenen. Detta examensarbete påvisar en väldigt effektiv metod för komprimering av scengeometri. Resultaten av den presenterade algoritmen visar att möjligheten att uppnå väldigt låg bithastighet med jämförelsebara resultat med H.264-standarden.
APA, Harvard, Vancouver, ISO, and other styles
11

Pinard, Clément. "Robust Learning of a depth map for obstacle avoidance with a monocular stabilized flying camera." Electronic Thesis or Diss., Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLY003.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Harding, Cressida M. "How far away is it? : depth estimation by a moving camera." Thesis, University of Canterbury. Electrical and Electronic Engineering, 2001. http://hdl.handle.net/10092/6157.

Full text
Abstract:
This thesis considers the challenge of autonomous robot navigation. Effective self-guiding robots are a tool applicable to many important and critical tasks, such as fire-fighting, transporting dangerous materials, even bomb disposal. In many cases the robots are even more useful if their method of guidance is passive and utilises common technology such as CCD cameras. Using biological models to inspire the design of such robots is an exhilarating approach to the problem and provides sensible and novel solutions. The method of determining distance to objects using the optical flow from sequences of camera images is well-known and many techniques for estimating optical flow have been proposed. This thesis explores those differential optical flow techniques which solve the aperture problem by using a window of pixels and a model of the structure of the optical flow within that window. It shows that a number of these methods can be incorporated into a general framework utilising a sum of basis functions over the window. A more or less complicated structure for the optical flow can be achieved by selecting a greater or fewer number of these basis functions. Certain choices of basis function correspond to published models, such as those of Lucas and Kanade (1981), Campani and Verri (1992), Schalkoff and McVey (1982), Nagle and Srinivasan (1996), and Waxman and Wohn (1985). A number of these models were compared over different image sequences, both real and synthetic, and the errors in each case quantified. This comparison shows that the best choice of model is dictated both by the size of the pixel window and also by the surface being viewed. A set of basis functions will cause a bias in the optical flow estimates if the surface structure is more complex than the model can fit. This causes errors in the location of the focus of expansion. A new method, only recently proposed for robot navigation, is known as volumetric stereo or voxel colouring. Most of the work performed in this area uses the method for computer graphics purposes, to produce photo-realistic scenes or images. It can also be used to produce accurate and detailed depth maps of a scene. Rather than using multiple pixels from a single camera, as optical flow does, it relies upon multiple camera observations of a single point. The camera observations of points in space are compared and those where the cameras agree are deemed to be surface points. The concepts behind this approach are explained, including a number of ways this method can reconstruct partially occluded objects. The emphasis then shifts to specific implementations for robot navigation. These include assumptions about camera motion and methods to speed up the calculation procedure. Results for real and synthetic sequences are shown and comparison is performed with optical flow, showing the volumetric technique is greatly superior in a number of important respects, not the least of which is accuracy. Finally, some important extensions to the algorithm are discussed. These extensions make it robust to three problems often ignored in computer vision: inaccurate calibration, variable lighting, and specular surfaces. The first of these is overcome by showing that the algorithm is capable of self-calibration, allowing it to substantially improve depth estimates in the case of inaccurate camera positions or rotations. By using a lighting-invariant colour model, the algorithm can successfully reconstruct depth, even when the sequence lighting is altered. 
Finally, the algorithm successfully reconstructs specularities in images at the same time as reconstructing the Lambertian regions. This is done by observing the pattern of intensity variation in the camera observations. Results for these situations are shown for real images sequences and the improvements are demonstrated quantitatively.
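As a concrete example of the windowed differential approach discussed above, the following is a textbook Lucas-Kanade sketch (not the thesis's generalized basis-function framework); it solves the least-squares system [Ix Iy] [u v]^T = -It over a pixel window:

    import numpy as np

    def lucas_kanade_flow(I0, I1, x, y, r=7):
        """Estimate the flow (u, v) at pixel (x, y) between frames I0 -> I1 using
        a (2r+1)^2 window and the brightness-constancy least-squares system."""
        Iy, Ix = np.gradient(I0.astype(float))
        It = I1.astype(float) - I0.astype(float)
        win = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
        A = np.column_stack([Ix[win].ravel(), Iy[win].ravel()])
        b = -It[win].ravel()
        (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
        return u, v

    # Toy usage: a Gaussian blob shifted by (2, 1) pixels between frames
    yy, xx = np.mgrid[0:64, 0:64]
    I0 = np.exp(-((xx - 30) ** 2 + (yy - 30) ** 2) / 50.0)
    I1 = np.exp(-((xx - 32) ** 2 + (yy - 31) ** 2) / 50.0)
    u, v = lucas_kanade_flow(I0, I1, 30, 30, r=7)   # roughly (2, 1)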
APA, Harvard, Vancouver, ISO, and other styles
13

Barandas, Marília da Silveira Gouveia. "Range of motion measurements based on depth camera for clinical rehabilitation." Master's thesis, Faculdade de Ciências e Tecnologia, 2013. http://hdl.handle.net/10362/11046.

Full text
Abstract:
Dissertation submitted to obtain the degree of Master in Biomedical Engineering
In clinical rehabilitation, biofeedback increases the patient's motivation, which makes it one of the most effective motor rehabilitation mechanisms. In this field it is very helpful for the patient, and even for the therapist, to know the level of success and performance of the training process. The study of human motion tracking can provide relevant information for this purpose. Existing lab-based three-dimensional (3D) motion capture systems are capable of providing this information in real time. However, these systems still present some limitations when used in rehabilitation processes involving biofeedback. A new depth camera - the Microsoft Kinect™ - was recently developed, overcoming the limitations associated with lab-based movement analysis systems. This depth camera is easy to use, inexpensive and portable. The aim of this work is to introduce into clinical practice a system for Range of Motion (ROM) measurements, using the Kinect™ sensor and providing real-time biofeedback. For this purpose, the ROM measurements were computed using the joint spatial coordinates provided by the official Microsoft Kinect™ Software Development Kit (SDK) and also using our own algorithm. The obtained results were compared with data from a triaxial accelerometer, used as reference. The upper-limb movements studied were abduction, flexion/extension and internal/external rotation with the arm at 90 degrees of elevation. With our algorithm the Mean Error (ME) was less than 1.5 degrees for all movements. Only in abduction did the Kinect™ Skeleton Tracking obtain comparable data; in the other movements its ME increased by an order of magnitude. Given the potential benefits, our method can be a useful tool for ROM measurements in clinics.
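A minimal sketch of the kind of joint-angle computation involved, using the skeleton joint positions a depth camera SDK provides; the joint names and the vertical reference are assumptions, not the thesis's algorithm:

    import numpy as np

    def abduction_angle(shoulder, elbow):
        """Shoulder abduction: angle (degrees) between the upper-arm vector and
        the downward vertical, from 3D joint positions in metres."""
        arm = np.asarray(elbow) - np.asarray(shoulder)
        down = np.array([0.0, -1.0, 0.0])          # assumed 'down' axis of the skeleton frame
        cosang = arm @ down / np.linalg.norm(arm)
        return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))

    shoulder = [0.10, 1.40, 2.00]                  # example joint coordinates
    elbow    = [0.38, 1.35, 2.00]
    print(abduction_angle(shoulder, elbow))        # ~80 degrees: arm nearly horizontal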
APA, Harvard, Vancouver, ISO, and other styles
14

Carraro, Marco. "Real-time RGB-Depth perception of humans for robots and camera networks." Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3426800.

Full text
Abstract:
This thesis deals with robot and camera network perception using RGB-Depth data. The goal is to provide efficient and robust algorithms for interacting with humans. For this reason, special care has been devoted to designing algorithms which can run in real time on consumer computers and embedded cards. The main contribution of this thesis is 3D pose estimation of the human body. We propose two novel algorithms which take advantage of the data stream of an RGB-D camera network and outperform the state of the art in both single-view and multi-view tests. While the first algorithm works on point cloud data, which is feasible also with no external light, the second one performs better, since it deals with multiple persons with negligible overhead and does not rely on synchronization between the different cameras in the network. The second contribution regards long-term people re-identification in camera networks. This is particularly challenging since we cannot rely on appearance cues if people are to be re-identified across different days. We address this problem by proposing a face-recognition framework based on a Convolutional Neural Network and a Bayesian inference system to re-assign the correct ID and person name to each new track. The third contribution is about Ambient Assisted Living. We propose a prototype of an assistive robot which periodically patrols a known environment, reporting unusual events such as people fallen on the ground. To this end, we developed a fast and robust approach which also works in dim scenes and is validated using a new publicly available RGB-D dataset recorded on board our open-source robot prototype. As a further contribution of this work, in order to boost research on these topics and to provide the greatest benefit to the robotics and computer vision communities, we have released most of the software implementations of the novel algorithms described in this work under open-source licenses.
Questa tesi tratta di percezione per robot autonomi e per reti di telecamere da dati RGB-Depth. L'obiettivo è quello di fornire algoritmi robusti ed efficienti per l'interazione con le persone. Per questa ragione, una particolare attenzione è stata dedicata allo sviluppo di soluzioni efficienti che possano essere eseguite in tempo reale su computer e schede grafiche consumer. Il contributo principale di questo lavoro riguarda la stima automatica della posa 3D del corpo delle persone presenti in una scena. Vengono proposti due algoritmi che sfruttano lo stream di dati RGB-Depth da una rete di telecamere andando a migliorare lo stato dell'arte sia considerando dati da singola telecamera che usando tutte le telecamere disponibili. Il secondo algoritmo ottiene risultati migliori in quanto riesce a stimare la posa di tutte le persone nella scena con overhead trascurabile e non richiede sincronizzazione tra i vari nodi della rete. Tuttavia, il primo metodo utilizza solamente nuvole di punti che sono disponibili anche in ambiente con poca luce nei quali il secondo algoritmo non raggiungerebbe gli stessi risultati. Il secondo contributo riguarda la re-identificazione di persone a lungo termine in reti di telecamere. Questo problema è particolarmente difficile in quanto non si può contare su feature di colore o che considerino i vestiti di ogni persona, in quanto si vuole che il riconoscimento funzioni anche a distanza di giorni. Viene proposto un framework che sfrutta il riconoscimento facciale utilizzando una Convolutional Neural Network e un sistema di classificazione Bayesiano. In questo modo, ogni qual volta viene generata una nuova traccia dal sistema di people tracking, la faccia della persona viene analizzata e, in caso di match, il vecchio ID viene riassegnato. Il terzo contributo riguarda l'Ambient Assisted Living. Abbiamo proposto e implementato un robot di assistenza che ha il compito di sorvegliare periodicamente un ambiente conosciuto, riportando eventi non usuali come la presenza di persone a terra. A questo fine, abbiamo sviluppato un approccio veloce e robusto che funziona anche in assenza di luce ed è stato validato usando un nuovo dataset RGB-Depth registrato a bordo robot. Con l'obiettivo di avanzare la ricerca in questi campi e per fornire il maggior beneficio possibile alle community di robotica e computer vision, come contributo aggiuntivo di questo lavoro, abbiamo rilasciato, con licenze open-source, la maggior parte delle implementazioni software degli algoritmi descritti in questo lavoro.
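The re-identification idea above - updating a belief over known identities from face-recognition scores - can be sketched with a simple Bayes update; this is illustrative only, and the thesis's actual CNN features and inference system are not reproduced:

    import numpy as np

    def update_identity_belief(prior, similarities, temperature=0.1):
        """One Bayes update of the belief over known identities.
        prior: (K,) probabilities; similarities: (K,) cosine similarities between
        the current face embedding and each identity's stored embedding."""
        likelihood = np.exp(np.asarray(similarities) / temperature)   # softmax-style likelihood
        posterior = prior * likelihood
        return posterior / posterior.sum()

    belief = np.full(3, 1 / 3)                       # three enrolled people, uniform prior
    for sims in [[0.82, 0.31, 0.28], [0.79, 0.35, 0.30]]:   # scores from two frames of a track
        belief = update_identity_belief(belief, sims)
    best = int(np.argmax(belief))                    # re-assign the track to this identity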
APA, Harvard, Vancouver, ISO, and other styles
15

Wang, Chong, and 王翀. "Joint color-depth restoration with kinect depth camera and its applications to image-based rendering and hand gesture recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/206343.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Sun, Yi. "Depth Estimation Methodology for Modern Digital Photography." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1563527854489549.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Wang, Beien. "3D Scintillation Positioning Method in a Breast-specific Gamma Camera." Thesis, KTH, Medicinsk teknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-176453.

Full text
Abstract:
In modern clinical practice, the gamma camera is one of the most important imaging modalities for tumour diagnosis. The standard technique uses scintillator-based gamma cameras equipped with a parallel-hole collimator to detect the planar position of γ-photon interactions (scintillations). However, the positioning is of insufficient resolution and linearity for breast imaging. With the aim of improving spatial resolution and positioning linearity, a new gamma camera configuration was described specifically for breast imaging. This breast-specific gamma camera was designed with the following technical features: a variable-angle slant-hole collimator; double SiPM array readout at the front and back sides of the scintillator; and diffusive reflectors at the edges around the scintillator. Because a slant-hole collimator was used, a new 3D scintillation positioning method was introduced and tested. The setup of the gamma detector was created in a Monte Carlo simulation toolkit, and a library of light distributions from known positions was acquired through optical simulation. Two library-based positioning algorithms, similarity comparison and maximum likelihood, were developed to estimate the 3D scintillation position by comparing the responses from simulated gamma interactions with the responses in the library. Results indicated that the planar spatial resolution and positioning linearity achieved with this gamma detector setup and positioning algorithm were higher than those of conventional gamma detectors. The depth-of-interaction estimation was also of high linearity and resolution. With the results presented, the gamma detector setup and positioning method are promising for future breast cancer diagnosis.
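A minimal sketch of library-based maximum-likelihood positioning as described above, assuming Poisson-distributed photon counts on each SiPM channel; the array names, sizes and units are made up:

    import numpy as np

    def ml_position(observed, library_responses, library_positions):
        """Pick the library entry whose expected SiPM response maximises the
        Poisson log-likelihood of the observed counts.
        observed: (C,) counts; library_responses: (N, C); library_positions: (N, 3)."""
        lam = np.clip(library_responses, 1e-9, None)
        # Poisson log-likelihood up to a term independent of the library entry
        loglik = (observed * np.log(lam) - lam).sum(axis=1)
        return library_positions[np.argmax(loglik)]

    rng = np.random.default_rng(0)
    positions = rng.uniform(0, 50, size=(1000, 3))       # known scintillation points (mm)
    responses = rng.uniform(5, 50, size=(1000, 64))      # simulated mean counts on 64 SiPMs
    observed = rng.poisson(responses[123])               # one measured event
    estimate = ml_position(observed, responses, positions)   # should be near positions[123]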
APA, Harvard, Vancouver, ISO, and other styles
18

Ye, Mao. "MONOCULAR POSE ESTIMATION AND SHAPE RECONSTRUCTION OF QUASI-ARTICULATED OBJECTS WITH CONSUMER DEPTH CAMERA." UKnowledge, 2014. http://uknowledge.uky.edu/cs_etds/25.

Full text
Abstract:
Quasi-articulated objects, such as human beings, are among the most commonly seen objects in our daily lives. Extensive research has been dedicated to 3D shape reconstruction and motion analysis for this type of object for decades. A major motivation is their wide range of applications, such as in entertainment, surveillance and health care. Most existing studies relied on one or more regular video cameras. In recent years, commodity depth sensors have become more and more widely available. The geometric measurements delivered by the depth sensors provide significantly valuable information for these tasks. In this dissertation, we propose three algorithms for monocular pose estimation and shape reconstruction of quasi-articulated objects using a single commodity depth sensor. These three algorithms achieve shape reconstruction with increasing levels of granularity and personalization. We then further develop a method for highly detailed shape reconstruction based on our pose estimation techniques. Our first algorithm takes advantage of a motion database acquired with an active marker-based motion capture system. This method combines pose detection through nearest-neighbor search with pose refinement via non-rigid point cloud registration. It is capable of accommodating different body sizes and achieves more than twice the accuracy of a previous state of the art on a publicly available dataset. The above algorithm performs frame-by-frame estimation and therefore is less prone to tracking failure. Nonetheless, it does not guarantee temporal consistency of either the skeletal structure or the shape, which could be problematic for some applications. To address this problem, we develop a real-time model-based approach for quasi-articulated pose and 3D shape estimation based on the Iterative Closest Point (ICP) principle, with several novel constraints that are critical for the monocular scenario. In this algorithm, we further propose a novel method for automatic body size estimation that enables it to accommodate different subjects. Due to its local-search nature, the ICP-based method can be trapped in local minima in the case of some complex and fast motions. To address this issue, we explore the potential of using a statistical model for soft point correspondence association. Towards this end, we propose a unified framework based on a Gaussian Mixture Model for joint pose and shape estimation of quasi-articulated objects. This method achieves state-of-the-art performance on various publicly available datasets. Based on our pose estimation techniques, we then develop a novel framework that achieves highly detailed shape reconstruction by only requiring the user to move naturally in front of a single depth sensor. Our experiments demonstrate reconstructed shapes with rich geometric details for various subjects with different apparel. Last but not least, we explore the applicability of our method in two real-world applications. First of all, we combine our ICP-based method with cloth simulation techniques for virtual try-on. Our system delivers the first promising 3D-based virtual clothing system. Secondly, we explore the possibility of extending our pose estimation algorithms to assist physical therapists in identifying their patients' movement dysfunctions that are related to injuries. Our preliminary experiments have demonstrated promising results by comparison with a gold-standard active marker-based commercial system.
Throughout the dissertation, we develop various state-of-the-art algorithms for pose estimation and shape reconstruction of quasi-articulated objects by leveraging the geometric information from depth sensors. We also demonstrate their great potentials for different real-world applications.
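A minimal rigid-ICP sketch of the kind of model fitting named above (generic point-to-point ICP with a closed-form SVD pose update; not the thesis's articulated formulation, and the toy data are invented):

    import numpy as np
    from scipy.spatial import cKDTree

    def icp_step(source, target):
        """One point-to-point ICP iteration: match each source point to its nearest
        target point, then solve the best rigid transform (R, t) in closed form."""
        matches = target[cKDTree(target).query(source)[1]]
        mu_s, mu_t = source.mean(axis=0), matches.mean(axis=0)
        H = (source - mu_s).T @ (matches - mu_t)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                 # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        return source @ R.T + t, R, t

    rng = np.random.default_rng(1)
    target = rng.normal(size=(500, 3))
    ang = 0.3                                    # misalign the source by a small rotation + shift
    Rz = np.array([[np.cos(ang), -np.sin(ang), 0],
                   [np.sin(ang),  np.cos(ang), 0],
                   [0, 0, 1]])
    source = target @ Rz.T + [0.2, 0.1, 0.0]
    for _ in range(20):                          # iterate until alignment converges
        source, R, t = icp_step(source, target)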
APA, Harvard, Vancouver, ISO, and other styles
19

Djikic, Addi. "Segmentation and Depth Estimation of Urban Road Using Monocular Camera and Convolutional Neural Networks." Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235496.

Full text
Abstract:
Deep learning for safe autonomous transport is rapidly emerging. Fast and robust perception for autonomous vehicles will be crucial for future navigation in urban areas with dense traffic and human interplay. Previous work focuses on extracting full-image depth maps or finding specific road features such as lanes. However, in urban environments lanes are not always present, and sensors such as LiDAR with 3D point clouds provide only a sparse depth perception of the road and require demanding algorithmic approaches. In this thesis we derive a novel convolutional neural network that we call AutoNet. It is designed as an encoder-decoder network for pixel-wise depth estimation of the drivable free space of urban roads, using only a monocular camera, and is handled as a supervised regression problem. AutoNet is also constructed as a classification network to solely classify and segment the drivable free space in real time with monocular vision, handled as a supervised classification problem, which proves to be a simpler and more robust solution than the regression approach. We also implement the state-of-the-art neural network ENet for comparison, which is designed for fast real-time semantic segmentation and high inference speed. The evaluation shows that AutoNet outperforms ENet on every performance metric, but is slower in terms of frame rate. Optimization techniques for increasing the frame rate of the network while maintaining robustness and performance are proposed for future work. All training and evaluation are done on the Cityscapes dataset. New ground-truth labels for road depth perception are created for training with a novel approach of fusing pre-computed depth maps with semantic labels. Data collection with a Scania vehicle, mounted with a monocular camera, is conducted to test the final models. The proposed AutoNet shows promising state-of-the-art performance with regard to road depth estimation as well as road classification.
Deep learning för säkra autonoma transportsystem framträder mer och mer inom forskning och utveckling. Snabb och robust uppfattning om miljön för autonoma fordon kommer att vara avgörande för framtida navigering inom stadsområden med stor trafiksampel. I denna avhandling härleder vi en ny form av ett neuralt nätverk som vi kallar AutoNet. Där nätverket är designat som en autoencoder för pixelvis djupskattning av den fria körbara vägytan för stadsområden, där nätverket endast använder sig av en monokulär kamera och dess bilder. Det föreslagna nätverket för djupskattning hanteras som ett regressions problem. AutoNet är även konstruerad som ett klassificeringsnätverk som endast ska klassificera och segmentera den körbara vägytan i realtid med monokulärt seende. Där detta är hanterat som ett övervakande klassificerings problem, som även visar sig vara en mer simpel och mer robust lösning för att hitta vägyta i stadsområden. Vi implementerar även ett av de främsta neurala nätverken ENet för jämförelse. ENet är utformat för snabb semantisk segmentering i realtid, med hög prediktions- hastighet. Evalueringen av nätverken visar att AutoNet utklassar ENet i varje prestandamätning för noggrannhet, men visar sig vara långsammare med avseende på antal bilder per sekund. Olika optimeringslösningar föreslås för framtida arbete, för hur man ökar nätverk-modelens bildhastighet samtidigt som man behåller robustheten.All träning och utvärdering görs på Cityscapes dataset. Ny data för träning samt evaluering för djupskattningen för väg skapas med ett nytt tillvägagångssätt, genom att kombinera förberäknade djupkartor med semantiska etiketter för väg. Datainsamling med ett Scania-fordon utförs även, monterad med en monoculär kamera för att testa den slutgiltiga härleda modellen. Det föreslagna nätverket AutoNet visar sig vara en lovande topp-presterande modell i fråga om djupuppskattning för väg samt vägklassificering för stadsområden.
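The ground-truth fusion step described above can be sketched as masking a pre-computed depth map with the road class from a semantic label image; the class id and ignore value are assumptions, not the thesis's exact convention:

    import numpy as np

    ROAD_ID = 0          # assumed 'road' class id in the semantic label image
    IGNORE = -1.0        # value marking pixels without road-depth supervision

    def make_road_depth_labels(depth_map, semantic_labels):
        """Keep depth only where the semantic label is 'road'; everything else is ignored."""
        road_depth = np.full_like(depth_map, IGNORE, dtype=np.float32)
        mask = semantic_labels == ROAD_ID
        road_depth[mask] = depth_map[mask]
        return road_depth

    depth = np.random.uniform(2.0, 80.0, size=(256, 512)).astype(np.float32)   # pre-computed depth (m)
    labels = np.random.randint(0, 19, size=(256, 512))                          # Cityscapes-style class ids
    gt = make_road_depth_labels(depth, labels)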
APA, Harvard, Vancouver, ISO, and other styles
20

Yuan, Qiantailang. "The Performance of the Depth Camera in Capturing Human Body Motion for Biomechanical Analysis." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235944.

Full text
Abstract:
Three-dimensional human movement tracking has long been an important topic in the medical and engineering fields. Complex camera systems such as Vicon can be used to retrieve very precise motion data. However, such systems are commercially oriented and come at a high cost. It is also tedious and cumbersome to wear the special markers and suits required for tracking. There is therefore a clear need for a cost-effective, markerless tool for motion tracking. Microsoft Kinect provides a promising solution with a vast variety of libraries, making quick development of 3D spatial modeling and analysis, such as a moving skeleton, possible. For example, the kinematics of the joints, such as acceleration, velocity and angle changes, can be deduced from the spatial position information acquired by the camera. In order to validate whether the Kinect system is sufficient for such analysis in practice, a micro-controller platform (Arduino) with an Intel® Curie™ IMU (Inertial Measurement Unit) module was developed. In particular, the velocity and Euler angles of joint movements, as well as head orientation, were measured and compared between the two systems. The goal of this paper is to present (i) the use of the Kinect depth sensor for data acquisition, (ii) post-processing of the retrieved data, and (iii) validation of the Kinect camera. Results show that the RMS error of the velocity tracking ranges from 1.78% to 23.34%, indicating good agreement between the two systems. The relative error of the angle tracking is between 4.0% and 24.3%. The head orientation results are difficult to analyse mathematically due to noise and invalid data from the camera caused by loss of tracking. Overall, the accuracy of joint movement tracked by the Kinect camera, particularly velocity, proved acceptable, and the depth camera was found to be an effective and cost-effective tool for kinematic measurement. A platform and workflow are now established, making future validation and application work possible when more advanced hardware becomes available.
Tre dimensionell rörelse spårning har alltid varit ett viktigt ämne inom medicinska och tekniska områden. Komplexa kamerasystem så som Vicon kan användas för att hämta exakta data för olika rörelser. Dessa system är dock mer kommersiellt orienterade, och är oftast dyra. Systemen är dessutom besvärliga eftersom man är tvungen att bära speciella dräkter med markörer, för att kunna spåra rörelser. Därav finns det ett stort intresse av att undersöka ett kostnadseffektivt och markörfria verktyg för rörelsespårning. Microsoft Kinect är en lovande lösning med en mängd olika bibliotek som möjliggör en snabb utveckling av 3D spatial modellering och analys. Från den spatiala positionsinformationen kan man få fram information om ledernas acceleration, hastighet och vinkelförändring. För att kunna validera om Kinect är passande för analysen, utvecklades en mikro-styrplattform Ardunino tillsammans med Intel R CurieTM IMU (tröghetsmätningsenhet). Hastigheten och Eulers vinkel vid rörelse av lederna, samt orienteringen av huvudet mättes och jämfördes mellan dessa två system. Målet med detta arbete är att presentera (i) användningen av Kinect Depth sensor för datainsamling, (ii) efterbehandling av inhämtad data, (iii) validering av Kinect Kamera. Resultatet visade att RMS-errorn av hastighetsspårningen varierade mellan 1.78% och 23.34%, vilket påvisar en god likhet mellan mätningarna av de två systemen. Det relativa felet i vinkelspårningen är mellan 4.0% och 24.3%. Resultatet för orienteringen av huvudet var svår att ta fram genom matematisk analys eftersom brus och invalid data från kameran uppstod pga förlust av spårning. Noggrannheten av ledrörelsen detekterad av Kinect kameran bevisas vara acceptabel, speciellt för hastighetsmätningar. Djupkameran har visat vara ett effektivt verktyg för kinematiks mätning som ett kostnadseffektivt alternativ. En plattform och arbetsflöde har tagits fram, vilket möjliggör validering och tillämpning när den avancerade hårdvaran är tillgänglig.
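A small sketch of the kind of post-processing mentioned above - differentiating tracked joint positions into velocity and comparing against a reference signal; the sampling rate and data are synthetic assumptions:

    import numpy as np

    fs = 30.0                                       # Kinect frame rate (Hz)
    t = np.arange(0, 5, 1 / fs)
    ref_velocity = 1.5 * np.abs(np.sin(2 * np.pi * 0.5 * t))        # reference (e.g. IMU), m/s

    # Positions as the depth camera might track them: integrated reference plus noise
    positions = np.cumsum(ref_velocity) / fs + 0.005 * np.random.randn(t.size)
    kinect_velocity = np.gradient(positions, 1 / fs)                 # finite-difference velocity

    rms_error = np.sqrt(np.mean((kinect_velocity - ref_velocity) ** 2))
    relative_error = rms_error / ref_velocity.mean() * 100           # percent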
APA, Harvard, Vancouver, ISO, and other styles
21

Bodesund, Fredrik. "Pose estimation of a VTOL UAV using IMU, Camera and GPS." Thesis, Linköpings universitet, Reglerteknik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-60641.

Full text
Abstract:
When an autonomous vehicle has a mission to perform, it is of high importance that the robot has good knowledge of its position. Without good knowledge of the position, it will not be able to navigate properly, and the data that it gathers, which could be of importance for the mission, might not be usable. A helicopter could, for example, be used to collect laser data of the terrain beneath it, which could be used to produce a 3D map of the terrain. If the knowledge of the position and orientation of the helicopter is poor, the collected laser data will be useless, since it is unknown what the laser actually measures. A successful solution to position and orientation (pose) estimation of an autonomous helicopter, using an inertial measurement unit (IMU), a camera and a GPS, is proposed in this thesis. The problem is to estimate the unknown pose using sensors that measure different physical quantities and give readings containing noise. An extended Kalman filter solution to the simultaneous localisation and mapping (SLAM) problem is used to fuse data from the different sensors and estimate the pose of the robot. The scale-invariant feature transform (SIFT) is used for feature extraction, and the unified inverse depth parametrisation (UIDP) model is used to parametrise the landmarks. The orientation of the robot is described by quaternions. To be able to evaluate the performance of the filter, an ABB industrial robot has been used as reference. The pose of the robot's end tool is known with high accuracy and gives a good ground truth against which the estimates can be evaluated. The results show that the algorithm performs well and that the pose is estimated with good accuracy.
När en autonom farkost skall utföra ett uppdrag är det av högsta vikt att den har god kännedom av sin position. Utan detta kommer den inte att kunna navigera och den data som den samlar in, relevant för uppdraget, kan vara oanvändbar. Till exempel skulle en helikopter kunna användas för att samla in laser data av terrängen under den, för att skapa en 3D karta av terrängen. Om kännedomen av helikopterns position och orientering är dålig kommer de insamlade lasermätningarna att vara oanvändbara eftersom det inte är känt vad lasern faktiskt mäter. I detta examensarbete presenteras en väl fungerande lösning för position och orienterings estimering av autonom helikopter med hjälp av en inertial measurement unit (IMU), en kamera och GPS. Problemet är att skatta positionen och orienteringen med hjälp av sensorer som mäter olika fysiska storheter och vars mätningar innehåller brus. En extended Kalman filter (EKF) lösning för simultaneous localisation and mapping (SLAM) problemet används för att fusionera data från de olika sensorerna och estimera positionen och orienteringen. För feature extrahering används scale invariant feature transform (SIFT) och för att parametrisera landmärken används unified inverse depth parametrisation (UIDP). Orienteringen av roboten beskrivs med hjälp av qvartinjoner. För att evaluera skattningarna har en ABB robot används som referens vid datainsamling. Då roboten har god kännedom om position och orientering av sitt främre verktyg gör detta att prestandan i filtret kan undersökas. Resultaten visar att algorithmen fungerar bra och att skattningar har hög noggrannhet.
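As an illustration of the SIFT feature-extraction step named above, a generic OpenCV sketch is shown below (it requires opencv-python 4.4 or later; the file names are placeholders, and this is not the thesis's EKF-SLAM pipeline):

    import cv2

    img0 = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # placeholder file names
    img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kp0, des0 = sift.detectAndCompute(img0, None)
    kp1, des1 = sift.detectAndCompute(img1, None)

    # Match descriptors and keep only unambiguous matches (Lowe's ratio test)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des0, des1, k=2)
    good = [p[0] for p in matches if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]

    # Image coordinates of matched landmarks, ready to feed a filter as measurements
    points0 = [kp0[m.queryIdx].pt for m in good]
    points1 = [kp1[m.trainIdx].pt for m in good]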
APA, Harvard, Vancouver, ISO, and other styles
22

Galardini, Luca. "Research on Methods for Processing Large-Baseline Stereo Camera Data for Long Range 3D Environmental Perception." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Find full text
Abstract:
Depth perception and 3D object recognition with stereo cameras have a variety of applications. For example, the technology makes it possible to sense objects and potential dangers around automated cars using passive and cheap sensors. However, sensing range scales with the camera baseline, and rigid camera mounting is only feasible for rather small-baseline setups. Thus, typical systems do not exceed sensing ranges of around 50 m. To enable higher ranges with baselines of about 1-2 m on moving and vibrating platforms, the non-rigidity of the setup (i.e. a variable relative camera orientation) must be considered. Hence, calibration and depth image processing must be extended with re-calibration for every stereo pair along the image sequence. Ideally, this re-calibration works in real time and on arbitrary image sequences; however, this must still be evaluated.
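The relationship behind the range limits discussed above can be made concrete with the standard rectified-stereo equations Z = f*B/d and dZ ~= Z^2 * dd / (f*B); the numerical values below are illustrative assumptions:

    def stereo_depth(focal_px, baseline_m, disparity_px):
        """Depth from disparity for a rectified stereo pair: Z = f * B / d."""
        return focal_px * baseline_m / disparity_px

    def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=0.5):
        """First-order depth uncertainty: dZ ~= Z^2 * dd / (f * B)."""
        return depth_m ** 2 * disparity_error_px / (focal_px * baseline_m)

    f = 1200.0                                    # focal length in pixels (assumed)
    for B in (0.12, 1.0, 2.0):                    # small rigid rig vs. large non-rigid baselines
        print(B, depth_error(f, B, depth_m=100.0))    # error at 100 m shrinks as B grows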
APA, Harvard, Vancouver, ISO, and other styles
23

Stattin, Sebastian. "Concurrent validity and reliability of a time-of-flight camera on measuring muscle’s mechanical properties during sprint running." Thesis, Umeå universitet, Avdelningen för idrottsmedicin, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-163191.

Full text
Abstract:
Recent advancements in 3D data gathering have made it possible to measure the distance to an object at different time stamps through the use of time-of-flight cameras. The purpose of this study was therefore to investigate the validity and reliability of a time-of-flight camera for measuring different mechanical sprint properties of the muscle. Fifteen male football players performed four 30 m maximal sprint bouts, which were simultaneously recorded with a time-of-flight camera and a 1080 Sprint device. By fitting an exponential function to the collected position- and velocity-time data from both devices, the following variables were derived and analyzed: maximal velocity (vmax), time constant (t), theoretical maximal force (F0), theoretical maximal velocity (V0), peak power output (Pmax), F-V mechanical profile (Sfv) and decrease in ratio of force (Drf). The results showed strong correlation in vmax along with a fairly small standard error of estimate (SEE) (r = 0.817, SEE = 0.27 m/s), while t displayed moderate correlation and relatively high SEE (r = 0.620, SEE = 0.12 s). Furthermore, moderate mean bias (>5%) was revealed for most of the variables, except for vmax and V0. The within-session reliability, assessed with the intraclass correlation coefficient (ICC) and standard error of measurement (SEM), ranged from excellent to poor, with Pmax displaying excellent reliability (ICC = 0.91, SEM = 72 W), while vmax demonstrated moderate reliability (ICC = 0.61, SEM = 0.26 m/s) and t poor reliability (ICC = 0.44, SEM = 0.11 s). In conclusion, these findings showed that, in its current state, the time-of-flight camera is not a reliable or valid device for estimating different mechanical properties of the muscle during sprint running using Samozino et al.'s computations. Further development is needed.
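A sketch of the exponential fit mentioned above, using the common mono-exponential sprint model v(t) = vmax * (1 - exp(-t/tau)); the data are synthetic, and the simplified force and power expressions (ignoring air resistance) are stated here as assumptions about the Samozino-style analysis, not as the thesis's exact computations:

    import numpy as np
    from scipy.optimize import curve_fit

    def sprint_velocity(t, vmax, tau):
        """Mono-exponential velocity-time model for a maximal sprint."""
        return vmax * (1 - np.exp(-t / tau))

    # Synthetic velocity-time data resembling a 30 m sprint measurement
    t = np.linspace(0, 4.5, 90)
    measured = sprint_velocity(t, 9.0, 1.25) + 0.15 * np.random.randn(t.size)

    (vmax, tau), _ = curve_fit(sprint_velocity, t, measured, p0=(8.0, 1.0))

    # Simplified macro-quantities for body mass m, neglecting air resistance:
    # F0 = m * vmax / tau (maximal horizontal force), Pmax = F0 * vmax / 4
    m = 75.0
    F0 = m * vmax / tau
    Pmax = F0 * vmax / 4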
APA, Harvard, Vancouver, ISO, and other styles
24

Aldrovandi, Lorenzo. "Depth estimation algorithm for light field data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
25

Sych, Alexey, and Олексій Сергійович Сич. "Image depth evaluation system by stream video." Thesis, National Aviation University, 2021. https://er.nau.edu.ua/handle/NAU/50762.

Full text
Abstract:
1. Depth map generation for 2D-to-3D conversion by short-term motion assisted color segmentation / Yu-Lin Chang, Chih-Ying Fang, Li-Fu Ding, Shao-Yi Chen, and Liang-Gee Chen - DSP/IC Design Lab, Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan. 2. Scharstein D., Szeliski R. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms // Int. Journal of Computer Vision 47. April-June 2002. PP. 7–42. 3. Development and study of an algorithm for computing the depth map of a stereo image / V.V. Voronin. 4. A method for estimating scene depth and the texture of hidden parts of an image. URL: https://neurohive.io/ru/papers/pokazat-to-chto-skryto-metod-ocenki-glubiny-i-nevidimyh-chastej-izobrazhenij/ (Last accessed: 11.01.2021).
One application of data processing is stereo vision, in which a three-dimensional scene is obtained using models that determine the depths of key points in images from a video sequence or from several images. Considering a person as an example, a two-dimensional image is formed on the retina, yet a person still perceives the depth of space, that is, has three-dimensional, stereoscopic vision. As a result, given data on the size of an object, the distance to it can be estimated, or it can be understood which of the objects is closer. When one object is in front of another and partially obscures it, the person perceives the front object as being at a closer distance. This has created the need to teach machines to do the same for various tasks. Based on the processing results, spatial information can be obtained for assessing the relief, obstacles while driving, etc. The algorithm is based on combining images of the same object, photographed or filmed with constant camera parameters and in the same focal plane from different angles, and allows information about the distance to the object to be obtained from perspective distortions (disparities).
Одним із додатків для обробки даних є стереобачення, в якому отримання тривимірної сцени базується на моделях для визначення глибини ключових точок зображень із відеопослідовності або декількох зображень. Якщо це розглядати як приклад з людиною, то на сітківці утворюється двовимірне зображення, але, незважаючи на це, людина сприймає глибину простору, тобто має тривимірне, стереоскопічне бачення. Як результат, за наявності даних про розмір об’єкта можна оцінити відстань до нього або зрозуміти, який з об’єктів знаходиться ближче. Коли один предмет перебуває перед іншим і частково затемнює його, людина сприймає передній предмет на більш близькій відстані. Через це виникла потреба навчити машинні пристрої робити це для різних завдань. На основі результатів обробки ви можете мати просторову інформацію для оцінки рельєфу, перешкод під час руху тощо. Цей алгоритм заснований на поєднанні зображень одного і того ж об'єкта, сфотографованих чи знятих на відео з постійними параметрами камери і в одній і тій же фокальній площині з різних кутів, дозволяє отримувати інформацію про відстань до об'єкта шляхом перспективних спотворень (розбіжностей).
APA, Harvard, Vancouver, ISO, and other styles
26

Basso, Filippo. "A non-parametric Calibration Algorithm for Depth Sensors Exploiting RGB Cameras." Doctoral thesis, Università degli studi di Padova, 2015. http://hdl.handle.net/11577/3424206.

Full text
Abstract:
Range sensors are common devices on modern robotic platforms. They endow the robot with information about the distance and shape of the objects in the sensors' field of view. In particular, the advent in the last few years of consumer RGB-D sensors such as the Microsoft Kinect has greatly fostered the development of depth-based algorithms for robotics. In fact, such sensors can provide a large quantity of data at a relatively low price. In this thesis three different calibration problems for depth sensors are tackled. The first original contribution to the state of the art is an algorithm to recover the axis of rotation of a 2D laser range finder (LRF) mounted on a rotating support. The key difference with other approaches is the use of kinematic point-plane constraints to estimate the pose of the LRF with respect to a static camera, and screw decomposition to recover the axis of rotation. The correct reconstruction of a small indoor environment after calibration validates the proposed algorithm. The second and most important original contribution of the thesis is a fully automatic two-step calibration algorithm for structured-light depth sensors (e.g. Kinect). The key novelty of this work is the separation of the depth error into two components, corrected with functions estimated on a per-pixel basis. This separation, validated by experimental observations, allows the number of parameters in the final non-linear minimization to be dramatically reduced and, consequently, the time for the solution to converge to the global minimum. The depth images of a test set corrected using the obtained calibration parameters are analyzed and compared to the ground truth. The comparison shows that they differ from the real ones only by an unpredictable noise. A qualitative analysis of the fusion between depth and RGB data further confirms the effectiveness of the approach. Moreover, a ROS package for both calibrating and correcting the Kinect data has been released as open source. The third contribution reported in the thesis is a new distributed calibration algorithm for networks composed of cameras and already-calibrated depth sensors. A ROS package implementing the proposed approach has been developed and is available for free as part of a big open source project for people tracking: OpenPTrack. The developed package is able to calibrate networks composed of a dozen sensors in real time (i.e., batch processing is not needed), exploiting plane-to-plane constraints and non-linear least squares optimization.
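The per-pixel idea described above can be illustrated with a much simpler stand-in: fitting an independent linear correction for every pixel against reference depths. The following sketch is not the thesis' two-component error model or its ROS implementation, only a toy least-squares version of pixel-wise depth correction on synthetic data.

```python
import numpy as np

def fit_pixelwise_linear_correction(measured, reference):
    """
    measured, reference: stacks of depth maps, shape (N, H, W).
    Fits z_ref ~ a*z_meas + b independently for every pixel (least squares).
    Illustrative per-pixel model only, not the thesis' two-component one.
    """
    N, H, W = measured.shape
    z = measured.reshape(N, -1)
    r = reference.reshape(N, -1)
    z_mean, r_mean = z.mean(0), r.mean(0)
    cov = ((z - z_mean) * (r - r_mean)).mean(0)
    var = ((z - z_mean) ** 2).mean(0) + 1e-9
    a = cov / var
    b = r_mean - a * z_mean
    return a.reshape(H, W), b.reshape(H, W)

def correct(depth, a, b):
    return a * depth + b

# Toy example with a synthetic gain/offset distortion on a 4x4 sensor.
rng = np.random.default_rng(0)
ref = rng.uniform(0.5, 3.0, size=(20, 4, 4))
meas = 1.02 * ref + 0.03 + rng.normal(0, 0.002, ref.shape)
a, b = fit_pixelwise_linear_correction(meas, ref)
print(np.abs(correct(meas, a, b) - ref).mean())   # small residual after correction
```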
I sensori di profondità sono dispositivi comuni sui robot moderni. Essi forniscono al robot informazioni sulla distanza e sulla forma degli oggetti nel loro campo di visione, permettendogli di agire di conseguenza. In particolare, l’arrivo negli ultimi anni di sensori RGB-D di consumo come Microsoft Kinect, ha favorito lo sviluppo di algoritmi per la robotica basati su dati di profondità. Di fatto, questi sensori sono in grado di generare una grande quantità di dati ad un prezzo relativamente basso. In questa tesi vengono affrontati tre diversi problemi riguardanti la calibrazione di sensori di profondità. Il primo contributo originale allo stato dell’arte è un algoritmo per stimare l’asse di rotazione di un laser range finder (LRF) 2D montato su un supporto rotante. La differenza chiave con gli altri approcci è l’utilizzo di vincoli punto-piano derivanti dalla cinematica per stimare la posizione del LRF rispetto ad una videocamera fissa, e l’uso di una screw decomposition per stimare l’asse di rotazione. La corretta ricostruzione di una stanza dopo la calibrazione valida l’algoritmo proposto. Il secondo e più importante contributo originale di questa tesi è un algoritmo completamente automatico per la calibrazione di sensori di profondità a luce strutturata (ad esempio Kinect). La chiave di questo lavoro è la separazione dell’errore di profondità in due componenti, entrambe corrette pixel a pixel. Questa separazione, validata da osservazioni sperimentali, permette di ridurre sensibilmente il numero di parametri nell’ottimizzazione finale e, di conseguenza, il tempo necessario affinché la soluzione converga al minimo globale. Il confronto tra le immagini di profondità di un test set, corrette con i parametri di calibrazione ottenuti, e quelle attese, dimostra che la differenza tra le due è solamente di una quantità casuale. Un’analisi qualitativa della fusione tra dati di profondità e RGB conferma ulteriormente l’efficacia dell’approccio. Inoltre, un pacchetto ROS per calibrare e correggere i dati generati da Kinect è disponibile open source. Il terzo contributo riportato nella tesi è un nuovo algoritmo distribuito per la calibrazione di reti composte da videocamere e sensori di profondità già calibrati. Un pacchetto ROS che implementa l’algoritmo proposto è stato rilasciato come parte di un grande progetto open source per il tracking di persone: OpenPTrack. Il pacchetto sviluppato è in grado di calibrare reti composte da una decina di sensori in tempo reale (non è necessario processare i dati in un secondo tempo), sfruttando vincoli piano-piano e un’ottimizzazione non lineare.
APA, Harvard, Vancouver, ISO, and other styles
27

Dey, Rohit. "MonoDepth-vSLAM: A Visual EKF-SLAM using Optical Flow and Monocular Depth Estimation." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1627666226301079.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Müller, Franziska [Verfasser]. "Real-time 3D hand reconstruction in challenging scenes from a single color or depth camera / Franziska Müller." Saarbrücken : Saarländische Universitäts- und Landesbibliothek, 2020. http://d-nb.info/1224883594/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Liao, Miao. "Single View Modeling and View Synthesis." UKnowledge, 2011. http://uknowledge.uky.edu/gradschool_diss/828.

Full text
Abstract:
This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture and display the 3D world in 2D, using mature technologies. However, there is always a strong desire to record and re-explore the 3D world in 3D. To achieve this goal, current approaches usually make use of a camera array, which suffers from tedious setup and calibration processes, as well as lack of portability, limiting its application to lab experiments. In this thesis, I try to produce 3D content using a single camera, making it as simple as shooting pictures. It requires a new front-end capture device rather than a regular camcorder, as well as more sophisticated algorithms. First, in order to capture highly detailed object surfaces, I designed and developed a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences and achieves 30 fps, which is necessary for capturing dynamic scenes. Based on the output color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. While the camera can only capture part of a whole object at any instance, partial surfaces are assembled together to form a complete 3D model by a novel warping algorithm. Inspired by the success of single view 3D modeling, I extended my exploration into 2D-3D video conversion that does not utilize a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos via view synthesis. It combines motion analysis with user interaction, aiming to transfer as much of the depth-inference work as possible from the user to the computer. I developed two new methods that analyze the optical flow in order to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist with user labeling work. In this thesis, I developed new algorithms to produce 3D content from a single camera. Depending on the input data, my algorithm can build high fidelity 3D models for dynamic and deformable objects if depth maps are provided. Otherwise, it can turn the video clips into stereoscopic video.
APA, Harvard, Vancouver, ISO, and other styles
30

Tahavori, F. "The application of a low-cost 3D depth camera for patient set-up and respiratory motion management in radiotherapy." Thesis, University of Surrey, 2017. http://epubs.surrey.ac.uk/813529/.

Full text
Abstract:
Respiratory motion induces uncertainty in External Beam Radiotherapy (EBRT), which can result in sub-optimal dose delivery to the target tissue and unwanted dose to normal tissue. The conventional approach to managing patient respiratory motion for EBRT within the area of abdominal-thoracic cancer is through the use of internal radiological imaging methods (e.g. Megavoltage imaging or Cone-Beam Computed Tomography) or via surrogate estimates of tumour position using external markers placed on the patient chest. This latter method uses tracking with video-based techniques, and relies on an assumed correlation or mathematical model between the external surrogate signal and the internal target position. The marker's trajectory can be used in both respiratory gating techniques and real-time tracking methods. Internal radiological imaging methods bring with them limited temporal resolution and an additional radiation burden, which can be addressed by external marker-based methods that carry no such issues. Moreover, by including multiple external markers and placing them closer to the internal target organs, the efficiency of correlation algorithms can be increased. However, the quality of such external monitoring methods is underpinned by the performance of the associated correlation model. Therefore, several new approaches to correlation modelling have been developed as part of this thesis and compared using publicly-available datasets. Highly competitive results have been obtained when compared against state-of-the-art methods. Marker-based methods also have the disadvantages of requiring manual set-up time for marker placement and patient positioning and potential issues with reproducibility of marker placement. This motivates the investigation of non-contact marker-free methods for use in EBRT, which is the main topic of this thesis. The Microsoft Kinect is used as an example of a low-cost consumer-grade 3D depth camera for capturing and analysing external respiratory motion. This thesis makes the first presentation of detailed studies of external respiratory motion captured using such low-cost technology and demonstrates its potential in a healthcare environment. Firstly, the fundamental performance of a range of Microsoft Kinect sensors is assessed for use in radiotherapy (and potentially other healthcare applications), in terms of static and dynamic performance using both phantoms and volunteers. Then external respiratory motion is captured using the above technology from a group of 32 healthy volunteers and Principal Component Analysis (PCA) is applied to a region of interest encompassing the complete anterior surface to demonstrate breathing style. This work demonstrates that this surface motion can be compactly described by the first two PCA eigenvectors. The reproducibility of subject-specific EBRT set-up using conventional laser-based alignment and marker-based Deep Inspiration Breath Hold (DIBH) methods is also studied using the Microsoft Kinect sensor. A cohort of five healthy female volunteers is repeatedly set up for left-sided breast cancer EBRT and multiple DIBH episodes are captured over five separate sessions representing multiple fractionated radiotherapy treatment sessions, but without dose delivery. This provided an independent assessment that subjects were set up and generally achieved variations within currently accepted margins of clinical practice.
Moreover, this work demonstrated the potential role of consumer-grade 3D depth camera technology as a possible replacement for marker-based set-up and DIBH management procedures. This brings with it the additional benefits of low cost and potential throughput benefits, as patient set-up could ultimately be fully automated with this technology, and DIBH could be independently monitored without requiring preparatory manual intervention.
APA, Harvard, Vancouver, ISO, and other styles
31

Lamprecht, Bernhard. "A testbed for vision based advanced driver assistance systems with special emphasis on multi-camera calibration and depth perception /." Aachen : Shaker, 2008. http://d-nb.info/990314847/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Chia-Chun Weng and 翁嘉駿. "Collision Detection Using Depth Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/q5v5bk.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Department of Mechanical Engineering
106
With the rise of industrial automation, collision avoidance systems have been developed to reduce the damage caused by machine impact, and collision confirmation systems have been developed to determine whether a collision has occurred. In view of this, demand for collision detection is increasing. Sensing methods can be divided into contact and non-contact methods, and this study uses a non-contact machine vision method to detect collisions with a depth camera. First, a depth camera is used to capture 3D point cloud images of the workspace from surrounding viewpoints. Then, the Iterative Closest Point (ICP) algorithm is used to align and merge the point clouds into a merged point cloud. The merged point cloud is divided into target object and obstacle point clouds based on Euclidean distance. The target object point cloud is expanded along the normal vector direction of each point. Finally, each point cloud is meshed and the GJK (Gilbert-Johnson-Keerthi distance) algorithm is used to perform collision detection. This study analyzes four cases: six-axis robot arm collision detection, gripper grip detection, mobile robot collision detection, and weapon arts judgment. The results show that the proposed method successfully divides the workspace into target objects, statically constructed obstacles, and dynamically detected obstacles for collision detection. This study also proposes an expansion method to perform collision prediction.
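A compact way to see the role of the expansion step is to treat it as a safety margin in a nearest-neighbour distance test between the target and obstacle clouds. The sketch below uses a KD-tree distance query in place of the meshing and GJK stages described in the abstract, so it is only a simplified illustration and the tolerances are invented.

```python
import numpy as np
from scipy.spatial import cKDTree

def min_separation(cloud_a, cloud_b):
    """Nearest-neighbour distance between two point clouds via a KD-tree query."""
    d, _ = cKDTree(cloud_b).query(cloud_a, k=1)
    return d.min()

def in_collision(target, obstacle, margin=0.02):
    """Flag a collision when any target point comes within `margin` of the obstacle;
    the margin plays the role of the normal-direction expansion in the abstract."""
    return min_separation(target, obstacle) < margin

# Toy example: a planar patch (z = 0) and a wall 15 mm away along +z.
rng = np.random.default_rng(1)
target = np.c_[rng.uniform(0, 0.1, (300, 2)), np.zeros(300)]
obstacle = np.c_[rng.uniform(0, 0.1, (300, 2)), np.full(300, 0.015)]
print(in_collision(target, obstacle, margin=0.02))   # True: closer than 20 mm
```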
APA, Harvard, Vancouver, ISO, and other styles
33

Chen, Zhi-Liang, and 陳致良. "Depth Camera - Assisted Indoor Localization Enhancement." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/88236027504172742228.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
101
This paper develops an approach for image triangulation from a point cloud. The approach can be divided into three parts: environment reconstruction, virtual image database establishment, and triangulation. While constructing the virtual image database, we can acquire extra localization information that traditional image-based localization lacks. When the camera is far from the scene or is occluded by objects, the accuracy of traditional SIFT localization may decrease. Our approach provides higher localization accuracy and coverage ratio by automatically choosing better camera angles and positions. In the experiments, we perform practical localization with both traditional SIFT localization and virtual image triangulation and compare the results.
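The triangulation step mentioned above can be illustrated with the standard linear (DLT) two-view triangulation of a single point. The camera intrinsics and poses in the sketch below are hypothetical and are not taken from the thesis.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """
    Linear (DLT) triangulation of one 3D point from two views.
    P1, P2: 3x4 projection matrices; x1, x2: pixel coordinates (u, v).
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Hypothetical calibrated views 0.5 m apart, both looking down +z.
K = np.array([[525.0, 0, 320.0], [0, 525.0, 240.0], [0, 0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0], [0]])])
X_true = np.array([0.2, -0.1, 2.0])
project = lambda P, X: (P @ np.append(X, 1.0))[:2] / (P @ np.append(X, 1.0))[2]
print(triangulate(P1, P2, project(P1, X_true), project(P2, X_true)))  # ~[0.2 -0.1 2.0]
```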
APA, Harvard, Vancouver, ISO, and other styles
34

Chiou, Yi-Wen, and 邱義文. "Depth Refinement for View Synthesis using Depth Sensor and RGB Camera." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/29737408822663940034.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Institute of Electronics
100
In recent years, three-dimensional (3D) video has become a trend, following the very popular science fiction movie Avatar released in 2009. Many 3D movies, TV sets and even mobile phones have been developed. View synthesis technology is an essential element of a 3D video system. The current technology adopted by the international MPEG committee for the 3DVC standard generates a virtual viewpoint scene using the received 2D views and their associated depth information. Therefore, depth information plays an important role in 3D view synthesis. In general, we can estimate the depth information from two 2D texture images of different viewpoints using various depth estimation methods; this approach is called passive depth estimation. Very often, this approach fails to provide accurate depth information in textureless regions. Furthermore, the occlusion regions, which always exist when two cameras are used, often lead to "hole" defects in the synthesized views. In this study, we adopt an active sensor, the Kinect, to capture the depth information. This active sensor provides fairly accurate depth information in textureless regions, and it can operate in real time. To generate a new or virtual view, we use a pair of Kinect sensors as the left and right cameras. The Kinect depth sensor can be treated as another camera, so we employ and improve some conventional techniques to estimate its camera parameters, calibrate its images (depth maps) and reduce artifacts. In the calibration step, we use the information between two texture images to estimate the 3D geometric relationship between the two Kinect sensors. Furthermore, because the depth sensor and the color camera are located at two different positions, we propose an "alignment" procedure to match the coordinates of the depth image and the texture image. In designing and implementing our alignment procedure, we use a disparity model and the Kinect SDK functions. Finally, we use the joint bilateral filter and color information to reduce noise and defects in the sensor-acquired depth map. Compared to the depth map estimated using the MPEG Depth Estimation Reference Software (DERS), the captured and processed depth map clearly provides more accurate depth information. At the end, we examine and compare the synthesized images using the original and the refined depth maps. The quality of the synthesized image obtained with the refined depth map is noticeably improved.
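The joint bilateral filter used in the final refinement step weights neighbouring depth samples by both spatial distance and colour similarity in a guide image, which smooths noise while preserving depth edges. Below is a small, unoptimised Python version run on a synthetic depth map; the window size and sigmas are arbitrary and the actual pipeline in the thesis is more involved.

```python
import numpy as np

def joint_bilateral_filter(depth, guide, radius=3, sigma_s=2.0, sigma_r=10.0):
    """
    Smooth a depth map with weights taken from a grayscale 'guide' image of
    the same size. Invalid depth pixels (value 0) get zero weight, so the
    filter also fills small holes from valid neighbours.
    """
    H, W = depth.shape
    pad = radius
    d = np.pad(depth.astype(np.float64), pad, mode="edge")
    g = np.pad(guide.astype(np.float64), pad, mode="edge")
    out = np.zeros((H, W), dtype=np.float64)
    acc_w = np.zeros((H, W), dtype=np.float64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            d_shift = d[pad + dy: pad + dy + H, pad + dx: pad + dx + W]
            g_shift = g[pad + dy: pad + dy + H, pad + dx: pad + dx + W]
            w = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2)
                       - (g_shift - guide) ** 2 / (2 * sigma_r ** 2))
            w *= (d_shift > 0)                     # ignore missing depth samples
            out += w * d_shift
            acc_w += w
    return np.where(acc_w > 0, out / np.maximum(acc_w, 1e-9), depth)

# Tiny synthetic example: a noisy plane with a hole, guided by a flat image.
depth = np.full((32, 32), 1000.0) + np.random.normal(0, 5, (32, 32))
depth[15:18, 15:18] = 0                            # simulated missing pixels
guide = np.full((32, 32), 128.0)
print(joint_bilateral_filter(depth, guide)[16, 16])  # ~1000 after filling
```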
APA, Harvard, Vancouver, ISO, and other styles
35

Tu, Chieh-Min, and 杜介民. "Depth Image Inpainting with RGB-D Camera." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/k4m42a.

Full text
Abstract:
Master's thesis
I-Shou University
Department of Computer Science and Information Engineering
103
Since Microsoft released the inexpensive Kinect sensor as a new natural user interface, stereoscopic imaging has moved from the earlier synthesis of multi-view color images to the synthesis of a color image and a depth image. However, the captured depth images may lose some depth values, so the stereoscopic effect is often poor. This thesis develops an object-based depth inpainting method based on a Kinect RGB-D camera. Firstly, background differencing, frame differencing and depth thresholding strategies are used as a basis for segmenting foreground objects from a dynamic background image. Then, the task of hole inpainting is divided into the background area and the foreground area, in which the background area is inpainted from a background depth image and the foreground area is inpainted with a best-fit neighborhood depth value. Experimental results show that this inpainting method is helpful for filling holes and improves the contour edges and image quality.
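The split between background and foreground hole filling can be sketched in a few lines: background holes are copied from a pre-captured background depth map, while foreground holes take the nearest valid foreground depth. This is only a rough stand-in for the "best-fit neighborhood depth value" rule in the abstract, and the toy example values are invented.

```python
import numpy as np
from scipy import ndimage

def inpaint_depth(depth, background_depth, fg_mask):
    """
    Fill missing depth (zeros): background holes are copied from a pre-captured
    background depth map; foreground holes take the nearest valid foreground value.
    """
    out = depth.astype(np.float64).copy()
    holes = out == 0
    # Background holes <- static background model.
    bg_holes = holes & ~fg_mask
    out[bg_holes] = background_depth[bg_holes]
    # Foreground holes <- nearest valid foreground neighbour.
    fg_valid = (~holes) & fg_mask
    if fg_valid.any():
        idx = ndimage.distance_transform_edt(~fg_valid, return_distances=False,
                                             return_indices=True)
        nearest = out[tuple(idx)]
        fg_holes = holes & fg_mask
        out[fg_holes] = nearest[fg_holes]
    return out

# Toy 3x3 example: zeros are holes, fg marks a foreground object.
depth = np.array([[0, 800, 800], [1200, 0, 800], [1200, 1200, 0]], float)
bg = np.full((3, 3), 800.0)
fg = np.array([[0, 0, 0], [1, 1, 0], [1, 1, 1]], bool)
print(inpaint_depth(depth, bg, fg))
```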
APA, Harvard, Vancouver, ISO, and other styles
36

Tsai, Sheng-Che, and 蔡昇哲. "Implementation Of 3D Cursor Using Depth Camera." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/20737575961640569497.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Communication Engineering
101
The 3D cursor system is a new idea in 3D user interfaces (UI). Rather than using a traditional 3D input device, the 3D cursor system uses a depth camera to let users control the cursor with their hands. It differs from a regular 2D cursor because the 3D cursor has one more attribute: depth. Since a 2D cursor only has two attributes, the 3D cursor can control objects more expertly and conveniently in a 3D virtual graphics world. This thesis focuses on how to design the 3D cursor system and discusses the implementation problems. It also covers some 3D UI design and 3D gesture design, presents the results of the implementation, and discusses the pros and cons at the end.
APA, Harvard, Vancouver, ISO, and other styles
37

Liou, Jia-Lin, and 劉嘉麟. "Noncontact Respiratory Volume Measurement Using Depth Camera." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/64011430563257356131.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
100
Breathing is important. In this study, a noncontact respiration measurement technique using a depth camera (Kinect) is developed to measure respiratory volume from the morphological changes of the chest wall region. The user's breathing status, i.e. respiratory rate, respiratory depth, and inhale-to-exhale ratio, and breathing method, i.e. thoracic breathing or abdominal breathing, can then be measured. For measuring the chest wall movements, a dynamic thoracic and abdominal region of interest (ROI) tracking technique is used. In this study, two frameworks for noncontact respiratory measurement are proposed. One is a single-sided depth camera system to measure respiratory volume in sitting and lying postures. The other is a double-sided depth camera system to measure respiratory volume in a standing posture. Through experiments, the system was evaluated in three different wearing conditions (naked, thin clothing, and thick coat) and three different postures (sitting, lying, and standing). In these experiments, the respiratory volumes measured by our method are compared with those from a reference device, a spirometer. Finally, three noncontact respiratory measurement applications are developed, including a Mao-Kung Ting system, an Intensive Care Unit (ICU) respiratory monitoring system, and a standing meditation system. In conclusion, a low-cost and easy-to-operate noncontact regional respiratory measurement system is developed, and the contribution of this study is a system that helps users become aware of their breathing conditions, working toward the ultimate goal of preventive medicine.
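A rough way to see how chest-wall depth changes become a volume signal is to integrate the per-pixel displacement, scaled by the per-pixel footprint, over the ROI and then count breathing peaks. The sketch below does this on a synthetic sinusoidal "chest"; the focal lengths, ROI and peak-spacing threshold are assumptions, not the calibration used in the thesis.

```python
import numpy as np
from scipy.signal import find_peaks

def volume_signal(depth_frames, roi, fx=580.0, fy=580.0):
    """
    Convert chest-wall depth changes inside a rectangular ROI into a relative
    volume signal. The pixel footprint scales with depth: area ~ z^2 / (fx * fy).
    depth_frames: (T, H, W) in metres; roi: (r0, r1, c0, c1).
    """
    r0, r1, c0, c1 = roi
    crop = depth_frames[:, r0:r1, c0:c1]
    ref = crop[0]                                   # baseline frame
    pixel_area = ref ** 2 / (fx * fy)               # m^2 per pixel at that depth
    displacement = ref - crop                       # outward chest motion (m)
    return (displacement * pixel_area).sum(axis=(1, 2))   # m^3, relative to frame 0

def respiratory_rate(volume, fps):
    peaks, _ = find_peaks(volume, distance=int(fps * 1.5))  # >=1.5 s between breaths
    duration_min = len(volume) / fps / 60.0
    return len(peaks) / duration_min

# Synthetic 30 s capture at 30 fps: 5 mm sinusoidal chest motion at 0.25 Hz.
t = np.arange(0, 30, 1 / 30)
frames = 1.0 - 0.005 * np.sin(2 * np.pi * 0.25 * t)[:, None, None] * np.ones((1, 40, 40))
print(respiratory_rate(volume_signal(frames, (5, 35, 5, 35)), fps=30))  # ~15-16 breaths/min
```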
APA, Harvard, Vancouver, ISO, and other styles
38

Chen, Hao-Yu, and 陳皓宇. "An Extrinsic Calibration for Depth Camera to Narrow Field of View Color Camera." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/47e5k2.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Networking and Multimedia
107
This work proposes a calibration method between a narrow field of view camera and a depth camera in an endoscope-like scenario. An endoscopy-like scenario has several properties, including a limited specular reflective surface and a camera with a narrow field of view. Instead of pushing the accuracy of the target marker with low-resolution data, we propose a solution with a loss function. The proposed loss function utilizes all of the 3-dimensional points of the checkerboard measured with the depth camera, and calculates the distance between the 3D positions projected onto the 2D image surface and the color image. The final re-projection error is improved to an average under 1 millimeter, and further training and evaluation of depth estimation algorithms can be performed.
APA, Harvard, Vancouver, ISO, and other styles
39

Hu, Jhen-Da, and 胡振達. "Hybrid Hand Gesture Recognition Based on Depth Camera." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/7febjn.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Institute of Multimedia Engineering
103
Hand gesture recognition (HGR) has become one of the most popular topics in recent years because hand gestures are one of the most natural and intuitive ways of communication between humans and machines. It is widely used in HCI (Human-Computer Interaction). In this paper, we propose a method for hand gesture recognition based on a depth camera. Firstly, the hand information within the depth image is separated from the background based on a specific depth range, and the contour of the hand is detected after segmentation. After that, we estimate the centroid of the hand, and the palm size is calculated using linear regression. Then, the finger states of the gesture are estimated from the hand contour information, and fingertips are estimated from a smoothed hand contour whose number of points is reduced by the Douglas-Peucker algorithm. Finally, we propose a gesture type estimation algorithm to determine which gesture is performed. Extensive experiments demonstrate that the accuracy rate of our method ranges from 84.35% to 99.55%, with a mean accuracy of 94.29%.
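The contour-reduction step mentioned above (Douglas-Peucker simplification) can be written down directly; fewer, more salient vertices make fingertip candidates easier to pick out. The following is a generic recursive implementation run on a synthetic zig-zag contour, not the thesis' own code.

```python
import numpy as np

def rdp(points, epsilon):
    """Douglas-Peucker polyline simplification (recursive form)."""
    points = np.asarray(points, dtype=float)
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    seg = end - start
    seg_len = np.linalg.norm(seg)
    if seg_len == 0:
        dists = np.linalg.norm(points - start, axis=1)
    else:
        # Perpendicular distance of every point to the chord start-end.
        dists = np.abs(seg[0] * (points[:, 1] - start[1])
                       - seg[1] * (points[:, 0] - start[0])) / seg_len
    idx = int(np.argmax(dists))
    if dists[idx] > epsilon:
        left = rdp(points[: idx + 1], epsilon)
        right = rdp(points[idx:], epsilon)
        return np.vstack([left[:-1], right])
    return np.vstack([start, end])

# Toy contour: a noisy zig-zag; simplification keeps only the salient corners.
xs = np.linspace(0, 10, 101)
ys = np.abs((xs % 4) - 2) + np.random.normal(0, 0.02, xs.size)
contour = np.c_[xs, ys]
print(len(rdp(contour, epsilon=0.1)))   # far fewer vertices than 101
```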
APA, Harvard, Vancouver, ISO, and other styles
40

Shiu, Hung-Wei, and 徐鴻煒. "3D Human Posture Tracking Based on Depth Camera." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/81555185244865050910.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Wu, Cheng-tsung, and 吳承宗. "A Depth-camera-cased Mixed Reality Interactive Table." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/90326863396371121985.

Full text
Abstract:
Master's thesis
National Central University
Institute of Computer Science and Information Engineering
100
This thesis combines a Microsoft Kinect with a projector to create a mixed reality interactive table. The interactive table provides two different modes: a touch screen mode and a mixed reality interactive music mode. Before implementing the two modes, the Kinect and the projector must be calibrated to obtain a coordinate transformation matrix. The purpose of the transformation matrix is to change the origin of the coordinate system from the Kinect to the upper-left corner of the projector's screen. The real-world point set of each frame is multiplied by this transformation matrix to obtain a new point set whose origin is the upper-left corner of the projector's screen. A top-view disparity map is then built from the converted point set.   In the touch screen mode, the system can recognize eight hand gestures. According to the hand gesture and the hand's height, the corresponding mouse instruction is decided, so the projector's screen can be turned into a touch screen. In the mixed reality interactive music mode, three-dimensional object recognition is provided. Users can exercise their creativity by composing blocks of arbitrary shape on the projector's screen and selecting a suitable instrument from 120 kinds of musical instruments for each object (block). Blocks, which previously offered only tactile and visual feedback, are thus coupled with hearing. The system recognizes the user-specified instrument object and draws 21 notes next to it. People can play more than one instrument at the same time, achieving an understanding of musical instruments, the fun of ensemble playing, infinite creative space and the stimulation of thinking.
APA, Harvard, Vancouver, ISO, and other styles
42

Yan, Chin-Hsien, and 顏欽賢. "THE RESEARCH OF FALL DETECTION BY DEPTH CAMERA." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/25038998937168589780.

Full text
Abstract:
Master's thesis
Tatung University
Department of Computer Science and Engineering
101
Home safety for the elderly is an important issue in today's society; many elderly people live alone and do not have all-day care from their loved ones. A fall may occur in the home environment, leaving the person unable to help themselves or receive first aid. In such cases, home safety detection equipment comes in handy: the system detects falls, tracks people dynamically, and reports back to nursing staff or relatives so they can come to the rescue in a timely manner. Such a system can help solve the safety problems of elderly people living alone at home. A fall detection system can detect falls and report a person's actions. Numerous academic studies have people wear sensors such as accelerometers, gyroscopes, level meters, heart rate monitors, or electromyography (EMG) measurement systems, use floor-mounted sensors, or use ordinary color cameras to capture body contours for fall posture detection. This study instead uses a depth camera to track the human body and find the 3D coordinates of 20 skeletal joint points, and builds a correction tool that transforms camera coordinates into world coordinates; the 20 skeleton points are then converted from camera coordinates to world coordinates. We use PCA (principal component analysis) to find the direction of the body and take the angle between the body's principal axis and the floor as the fall feature. This reduces the discomfort of wearing sensors and the cost of instruments for elderly users, and it does not affect the user's habits at home, so users can easily use this device for home safety protection. The depth camera can track the human body in low-light environments and can be used in any environment at home, improving the accuracy and efficiency of the system's recognition. The experimental results show that the system's fall detection achieves an accuracy rate of 98.86%.
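The PCA-based fall feature reduces to computing the angle between the skeleton's first principal axis and the floor plane and thresholding it. The sketch below assumes world coordinates with the y axis pointing up and uses an invented 30-degree threshold; it is only an illustration of the feature, not the thesis' classifier.

```python
import numpy as np

def body_axis_angle_deg(joints_3d):
    """
    joints_3d: (N, 3) world-coordinate skeleton joints (y axis pointing up).
    PCA gives the body's principal axis; the returned value is its angle with
    the floor plane (near 90 deg ~ upright, near 0 deg ~ lying down).
    """
    centered = joints_3d - joints_3d.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    axis = Vt[0] / np.linalg.norm(Vt[0])
    return np.degrees(np.arcsin(abs(axis[1])))     # angle w.r.t. horizontal plane

def is_fall(joints_3d, threshold_deg=30.0):
    return body_axis_angle_deg(joints_3d) < threshold_deg

# Toy skeletons: 20 joints spread vertically (standing) vs horizontally (fallen).
standing = np.c_[np.random.normal(0, 0.1, 20), np.linspace(0, 1.7, 20),
                 np.random.normal(0, 0.1, 20)]
fallen = np.c_[np.linspace(0, 1.7, 20), np.random.normal(0, 0.1, 20),
               np.random.normal(0, 0.1, 20)]
print(is_fall(standing), is_fall(fallen))          # False True
```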
APA, Harvard, Vancouver, ISO, and other styles
43

TSAI-JIE-SHIOU and 蔡杰修. "Gaze Direction Estimation Using Only A Depth Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/qmgq94.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Electronic Engineering
106
Over the years, gaze estimation has become more and more popular in computer vision and has been widely used in many applications. Many methods have been developed to solve the problem of determining where people are looking. However, current gaze estimation methods tend to be expensive, inconvenient, and invasive. This thesis presents a non-invasive approach to estimating the user's gaze direction based on a single consumer depth camera. The proposed method is able to estimate gaze directions under various lighting conditions and operating distances by using depth information. First, we use the contour of the chin to match the location of the head. Then, we employ facial geometric relationships to locate the eye regions as the ROI, cutting down the computational complexity. Next, we utilize the depth value differences between the nose and the eyes to locate the bridge of the nose, and then utilize the average distance from the eye center to the bridge of the nose to locate the eye centers, which are defined as reference points in the proposed method. After that, we increase the intensity of the ROI to make the dark features more obvious and estimate the location of the pupils. Finally, we obtain the gaze directions by analyzing the relative position between the pupils and the reference points. By using the contours and geometric relationships of the head, the proposed method can estimate the gaze direction without any color information. The performance of the proposed system was verified for five different users at three luminance levels and three testing distances. Experimental results show that the proposed method achieves an average accuracy of 80.1%, which is close to existing RGB-based methods.
APA, Harvard, Vancouver, ISO, and other styles
44

Liu, Yueh-Sheng, and 劉曰聖. "Depth Acquisition System Based on Occlusion-Aware Depth Estimation with Light-Field Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/9287j7.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Institute of Electro-Optical Engineering
106
Light-field cameras, or plenoptic cameras, have recently become available in consumer and industrial applications. Over the last decade, research has investigated the problem of highly accurate and real-time depth reconstruction from a plenoptic camera. However, the depth resolution of a plenoptic camera is very low due to the narrow baseline, and depth estimation is time-consuming because of high software complexity. In this thesis, a detailed analysis of the plenoptic camera is presented, and an optimal plenoptic camera with a specific lens design is introduced to improve depth resolution and FOV simultaneously. In addition, we propose an occlusion-aware depth estimation algorithm with an efficient framework. We assume that depth values are similar within color-homogeneous regions and that depth discontinuities occur at color edges and texture variation boundaries. Firstly, initial disparities are estimated at edge locations by enforcing photo-consistency only on the non-occluded region of the angular patch. Then a dense depth map is recovered from the sparse depth estimates with an edge-aware propagation. The proposed method is evaluated on synthetic and real-world datasets. Experimental results show that our algorithm outperforms other relevant algorithms, reducing runtime to 8.8% of theirs while keeping competitive depth map accuracy.
APA, Harvard, Vancouver, ISO, and other styles
45

Wong, Sin-Lung, and 黃新隆. "Real-Time Human Body Posture Estimation Using Depth Camera." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/25145582122553447840.

Full text
Abstract:
Master's thesis
National Chung Hsing University
Department of Electrical Engineering
102
This thesis proposes a real-time three dimensional (3D) human body posture estimation method using significant points extracted from two-dimensional (2D) human body contour and their depth information from a depth camera. The located human body significant points include the head, center of the body, the tips of the feet and the hands, the shoulders, the elbows, and the knees. This thesis uses the Kinect camera to capture both the color and depth image sequences at the same time. For human body segmentation, an Angle-Compensated segmentation method in the Red, Green, and Blue color space (AC-RGB) is used to segment a moving object from the background in the color image and reduce the influence of shadow. Segmentation of the human body using the depth information in the depth image is conducted as well. Segmentation results from the color and depth images are combined by using the OR operation. After the segmentation of the human body, 2D locations of the head and tips of the feet and the hands are obtained based on 2D contour convex points and body geometrical characteristics. When occlusion of the hand occurs, this thesis uses the depth information to locate the candidate region and the optical flow to find the tip of the hand. Two-dimensional locations of the elbows and the knees are obtained by using the skeleton pixels of a 2D body silhouette. After localization of the 2D significant points, the corresponding depth information is obtained from the depth image. The 3D locations of the significant points are sent to the Virtools software to reconstruct a virtual 3D human model. This thesis sets up a real-time virtual 3D human model system to verify the effectiveness of the proposed approach. To show the potential of the system, the system is also applied to some 3D interactive entertainment games.
APA, Harvard, Vancouver, ISO, and other styles
46

Wang, Jie-Hung, and 王傑鴻. "Simulation-Based 3D Pose Estimation Using a Depth Camera." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/d62482.

Full text
Abstract:
Master's thesis
National Taipei University of Technology
Graduate Institute of Manufacturing Technology
104
Object recognition and posture estimation play an important role in applications of automatic assembly and service robots. Today, the most popular recognition method uses geometric or statistical features to identify the object and estimate its posture by comparing the CAD model with a point cloud of the object obtained by scanning. However, the parallax effect of the Kinect's infrared sensing causes significant distortion of the point cloud. In order to improve the efficacy of feature comparison between distorted point cloud data and the object's CAD model, we propose a Kinect virtual simulation space comparison method that simulates the Kinect scanning situation by estimating the viewing angle, the object distance and the environmental intensity parameters. The point cloud similarity can be increased by this method, so object comparison and picking posture estimation can be achieved precisely and rapidly. Compared with previous research, the results of this study reveal that the accuracy of posture estimation and the computational efficiency increase significantly. This technique will help the fields of automatic assembly and service robots achieve more immediate and accurate applications.
APA, Harvard, Vancouver, ISO, and other styles
47

LU, HUNG-CHIH, and 盧泓志. "Structured Light Depth Camera Motion Blur Detection and Deblurring." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/42540991324647985208.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
103
Deblurring of 3D scenes captured by 3D sensors is a novel topic in computer vision. Motion blur occurs in a number of 3D sensors based on structured light techniques. We analyze the causes of motion blur in structured light depth cameras and design a novel algorithm that uses a speed cue and object models to deblur a 3D scene. The main idea is to use the 3D model of an object to replace the blurry object in the scene. Because we aim to deal with consecutive 3D frame sequences, i.e. 3D videos, an object model can be built in a frame where the object is not yet blurry. Our deblurring method can be divided into two parts: motion blur detection and motion blur removal. For the motion blur detection part, we use the speed cue to detect where the motion blur is. For the motion blur removal part, we first judge the type of the motion blur, and then apply the iterative closest point (ICP) algorithm in different ways according to the motion blur type. The proposed method is evaluated on real-world cases and successfully accomplishes motion blur detection and removal.
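The "speed cue" for blur detection can be approximated by estimating how far the object's points move between consecutive frames and comparing the implied speed with a threshold. The sketch below uses the median nearest-neighbour displacement; the frame rate and threshold are assumptions, not values from the thesis.

```python
import numpy as np
from scipy.spatial import cKDTree

def object_speed(prev_points, curr_points, frame_dt):
    """
    Approximate the object's speed between two captured frames as the median
    nearest-neighbour displacement divided by the frame interval.
    """
    d, _ = cKDTree(prev_points).query(curr_points, k=1)
    return np.median(d) / frame_dt

def is_motion_blurred(prev_points, curr_points, frame_dt=1 / 30,
                      speed_threshold=0.3):
    """Flag frames whose estimated speed exceeds a blur threshold (m/s)."""
    return object_speed(prev_points, curr_points, frame_dt) > speed_threshold

# Toy example: a point cloud translated 2 cm between consecutive 30 fps frames.
prev_pts = np.random.rand(500, 3)
curr_pts = prev_pts + np.array([0.02, 0.0, 0.0])
print(is_motion_blurred(prev_pts, curr_pts))   # 0.02 m / (1/30 s) = 0.6 m/s -> True
```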
APA, Harvard, Vancouver, ISO, and other styles
48

Lin, Yi-ta, and 林逸達. "3D Object Tracking and Recognition with RGB-Depth Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/yn2qzy.

Full text
Abstract:
Master's thesis
National Sun Yat-sen University
Department of Electrical Engineering
106
The main purpose of this paper is 3D object tracking using an RGB-D camera. In addition, the tracked object may be changed during the tracking phase, and our system can identify the new object. The work is composed of three phases. The first phase is off-line training. The second phase is on-line tracking. The third phase is identification of the new object. In the first phase, we create three 3D models of the tracked objects, a box, a cylinder and a sphere, and calculate the point pair features for each 3D model. These point pair features are stored in a database for later use. In the second phase, the RGB-D sensor is used to capture the real-world scene, and the point pair features of the scene are calculated in the same way as in the first phase. We then compare the scene's point pair features with the database to find where the 3D model is in the scene. However, this only gives an initial pose for the 3D model, so the Iterative Closest Point (ICP) algorithm is used to obtain a better pose. In the third phase, the tracked object is changed during tracking; our system can detect this situation from the scene, identify the new object, and keep tracking it using the method introduced in the second phase.
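The point pair feature referred to above is commonly defined, following Drost et al., as the distance between two oriented points plus three angles relating their normals and the connecting vector, quantised so it can serve as a hash-table key. The sketch below shows that generic definition only; the quantisation steps are arbitrary and the matching and voting stages are omitted.

```python
import numpy as np

def angle(a, b):
    cosang = np.clip(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)), -1.0, 1.0)
    return np.arccos(cosang)

def point_pair_feature(p1, n1, p2, n2):
    """
    Four-dimensional point pair feature
    F = (||d||, ang(n1, d), ang(n2, d), ang(n1, n2)), d = p2 - p1.
    """
    d = p2 - p1
    dist = np.linalg.norm(d)
    if dist == 0:
        return np.array([0.0, 0.0, 0.0, angle(n1, n2)])
    return np.array([dist, angle(n1, d), angle(n2, d), angle(n1, n2)])

def quantize(feature, dist_step=0.01, angle_step=np.radians(12)):
    """Discretise a feature so it can be used as a hash-table key."""
    return (int(feature[0] / dist_step),) + tuple(int(a / angle_step) for a in feature[1:])

# Toy example: two surface points 5 cm apart with perpendicular normals.
f = point_pair_feature(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]),
                       np.array([0.05, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))
print(f, quantize(f))
```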
APA, Harvard, Vancouver, ISO, and other styles
49

Gu, Kai-Da, and 古凱達. "Depth-of-Field Simulation Based on a Real Camera Model." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/22785912255652743121.

Full text
Abstract:
Master's thesis
National Chung Cheng University
Department of Electrical Engineering
95
Depth of field is an inherent physical phenomenon of optical systems. For animation technology and cinema special effects, this phenomenon is an indispensable factor. A number of approaches have been developed to generate depth of field effects with varying degrees of success, such as distributed ray tracing, image accumulation and layered depth. This thesis presents a new post-processing method for simulating depth of field based on a real camera model. Our algorithm takes advantage of layered depth information and the smoothing characteristic of the Gaussian filter. Different from previous approaches, we use a simple calibration method to find the relation between the distance from the camera lens to the objects and the degree of blur in the image. With these calibration results, we can simulate the appropriate blur according to the distance from the camera lens to the objects. On the other hand, we can also use this relation between object distance and image blur to estimate the real distance between objects and the camera lens. The result is accurate, and this distance estimation is useful in many applications.
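A layered post-processing depth-of-field effect of the kind discussed above can be sketched by blurring each depth layer with a blur radius taken from a distance-to-blur curve and compositing the layers. In the sketch below, the calibration curve `blur_sigma` is a hypothetical placeholder, not the curve measured in the thesis, and the layer count is arbitrary.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_sigma(distance_m, focus_m=1.5, gain=2.0):
    """Hypothetical calibration curve: blur grows with defocus |1/z - 1/z_focus|."""
    return gain * np.abs(1.0 / distance_m - 1.0 / focus_m)

def render_depth_of_field(image, depth, focus_m=1.5, n_layers=8):
    """
    Layered post-process DoF: split the scene into depth layers, blur each layer
    with its calibrated sigma and composite the result back together.
    """
    out = np.zeros_like(image, dtype=np.float64)
    filled = np.zeros(image.shape, dtype=bool)
    edges = np.linspace(depth.min(), depth.max(), n_layers + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (depth >= lo) & (depth <= hi) & ~filled
        sigma = blur_sigma(0.5 * (lo + hi), focus_m)
        blurred = gaussian_filter(image.astype(np.float64), sigma=sigma)
        out[mask] = blurred[mask]
        filled |= mask
    return out

# Toy grayscale scene: left half at 1.5 m (in focus), right half at 3 m (blurred).
img = np.tile((np.arange(64) % 2) * 255.0, (64, 1))
depth = np.where(np.arange(64) < 32, 1.5, 3.0) * np.ones((64, 64))
print(render_depth_of_field(img, depth).std())
```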
APA, Harvard, Vancouver, ISO, and other styles
50

Shih, Kuang-Tsu, and 施光祖. "High-Resolution Imaging and Depth Acquisition Using a Camera Array." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/un25de.

Full text
Abstract:
Doctoral dissertation
National Taiwan University
Graduate Institute of Communication Engineering
105
In this age where everyone can be a photographer with his or her smart phone, the pursuit of higher imaging quality has become more important and profitable than ever before. Among the quality metrics of images, resolution is often the one that people care about the most. Being one of the conventional approaches to increasing the image resolution, optics optimization is believed to have reached its bottleneck. As a consequence, researchers are turning to computational photography to seek a breakthrough. In this dissertation, we study the computational approach to high-resolution imaging based on multi-aperture systems such as a camera array or a lenslet array. This dissertation can be divided into two parts. The first part is dedicated to the analysis of existing approaches. Particularly, two approaches are inspected in depth: subpixel refocusing and reconstruction-based light field super-resolution. For subpixel refocusing, we show that a deconvolution step is missing in previous work and incorporating a deconvolution in the loop significantly enhances the sharpness of the results. We also conduct experiments to quantitatively analyze the effect of calibration error on subpixel refocusing and analyze the upper bound of the error for a targeted image quality. On the other hand, for reconstruction-based light field super-resolution, we show through experiments that the resolution gain obtainable by super-resolution does not increase boundlessly with the number of cameras and is ultimately limited by the size of the point spread function. In addition, we point out through experiments that there is a tradeoff between the obtainable resolution and the registration accuracy. The tradeoff is a fundamental limit of reconstruction-based approaches. In contrast to the analysis work in the first part, the second part of the dissertation describes our original solution: a computational photography system based on a camera array with mixed focal lengths. Our solution has two distinguishing features: it can generate an output image whose resolution is higher than 80% of the total captured pixels and a disparity map of the same resolution that contains the depth information about the scene. Our solution consists of optimized hardware and an image fusion algorithm. On the hardware side, we propose an approach to optimize the configuration of a camera array for high-resolution imaging using cameras with mixed focal lengths and non-parallel optical axes. On the software side, an algorithm is developed to integrate the low-resolution images captured by the proposed camera array into a high-resolution image without the blurry appearance problem of previous methods.
APA, Harvard, Vancouver, ISO, and other styles