Dissertations / Theses on the topic 'Depth camera'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Depth camera.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.
Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.
Sjöholm, Daniel. "Calibration using a general homogeneous depth camera model." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-204614.
Accurate distance measurement in depth images is important for making good reconstructions of objects. This measurement process is noisy, however, and today's depth sensors benefit from additional correction after factory calibration. We regard the pair of a depth sensor and an image sensor as a single unit that returns complete 3D information, built up from the two sensors by relying on the more precise image sensor for everything except the depth measurement. We present a new linear method for correcting depth distortion using an empirical model, based on altering only the depth data while keeping planar surfaces planar. The depth distortion model was implemented and tested on the Intel RealSense SR300. The results show that the model works and as a rule reduces the depth measurement error after calibration, with an average improvement of around 50 percent on the tested data sets.
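As an illustration of the general idea (not the thesis's homogeneous camera model), a global linear depth correction z' = a·z + b can be fitted from samples of a target at known distances; a minimal NumPy sketch with made-up numbers:

```python
import numpy as np

def fit_linear_depth_correction(measured, reference):
    """Fit a global linear correction z' = a*z + b by least squares.

    measured, reference: 1-D arrays of depth samples, e.g. from a
    planar target observed at several known distances.
    """
    A = np.stack([measured, np.ones_like(measured)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, reference, rcond=None)
    return a, b

def apply_correction(depth_map, a, b):
    corrected = a * depth_map + b
    corrected[depth_map == 0] = 0  # keep invalid pixels invalid
    return corrected

# Toy usage: simulated biased measurements of a flat wall at 1.5 m.
rng = np.random.default_rng(0)
true_z = np.full(1000, 1.5)                                  # metres
meas_z = 1.02 * true_z + 0.01 + rng.normal(0, 0.003, true_z.shape)
a, b = fit_linear_depth_correction(meas_z, true_z)
print(f"a={a:.4f}, b={b:.4f}")
```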
Jansson, Isabell. "Visualizing Realtime Depth Camera Configuration using Augmented Reality." Thesis, Linköpings universitet, Medie- och Informationsteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139934.
Efstratiou, Panagiotis. "Skeleton Tracking for Sports Using LiDAR Depth Camera." Thesis, KTH, Medicinteknik och hälsosystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297536.
Skeleton tracking can be accomplished using human pose estimation methods. Deep learning has proven to be the leading approach, and with a light detection and ranging (LiDAR) depth camera it appears feasible to develop a markerless motion analysis software system. In this project, a trained neural network is used to track people during sports activities and to provide feedback after biomechanical analysis. Implementations of four different filtering methods for human motion are presented: a Kalman filter, a fixed-interval smoother, a Butterworth filter, and a moving average. The software appears useful in field tests for evaluating videos at 30 Hz, as demonstrated through analyses of indoor cycling and hammer throwing. A non-static camera works fairly well when measuring a still, upright person. The mean absolute error is 8.32% and 6.46% when the left and right knee angles, respectively, are used as reference. An error-free system would benefit both the sports and the health industries.
Huotari, V. (Ville). "Depth camera based customer behaviour analysis for retail." Master's thesis, University of Oulu, 2015. http://urn.fi/URN:NBN:fi:oulu-201510292099.
Januzi, Altin. "Triple-Camera Setups for Image-Based Depth Estimation." Thesis, Uppsala universitet, Institutionen för elektroteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-422717.
Rangappa, Shreedhar. "Absolute depth using low-cost light field cameras." Thesis, Loughborough University, 2018. https://dspace.lboro.ac.uk/2134/36224.
Nassir, Cesar. "Domain-Independent Moving Object Depth Estimation using Monocular Camera." Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233519.
Today, car companies around the world strive to create vehicles with fully autonomous capabilities. There are many benefits to developing autonomous vehicles, such as reduced congestion, increased safety, and reduced pollution. To reach that goal there are many challenges ahead, one of which is visual perception. Being able to estimate depth from a 2D image has been shown to be a key component for 3D recognition, reconstruction, and segmentation. Estimating depth in an image from a monocular camera is a hard problem, since the mapping between colour intensity and depth value is ambiguous. Depth estimation from stereo images has come a long way compared to monocular depth estimation and was originally the method relied upon. However, being able to exploit monocular images is necessary in scenarios where stereo depth estimation is not possible. We present a new network, BiNet, inspired by ENet, to tackle depth estimation of moving objects using only a monocular camera in real time. It performs better than ENet on the Cityscapes dataset and adds only a small overhead in complexity.
Pinard, Clément. "Robust Learning of a depth map for obstacle avoidance with a monocular stabilized flying camera." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLY003/document.
Consumer unmanned aerial vehicles (UAVs) are mainly flying cameras. They democratized aerial footage, but with their success came safety concerns. This work aims at improving UAV safety with obstacle avoidance while keeping a smooth flight. In this context, we use only one stabilized camera, for weight and cost reasons. For their robustness in computer vision and their capacity to solve complex tasks, we chose to use convolutional neural networks (CNNs). Our strategy is based on incrementally learning tasks of increasing complexity, the first step of which is to construct a depth map from the stabilized camera. This thesis focuses on studying the ability of CNNs to be trained for this task. In the case of stabilized footage, the depth map is closely linked to optical flow. We thus adapt FlowNet, a CNN known for optical flow, to output depth directly from two stabilized frames; this network is called DepthNet. The experiment succeeded with synthetic footage, but is not robust enough to be used directly on real videos. Consequently, we consider self-supervised training with real videos, based on differentiably reprojecting images. This training method for CNNs being rather novel in the literature, a thorough study is needed in order not to depend too much on heuristics. Finally, we developed a depth fusion algorithm to use DepthNet efficiently on real videos: multiple frame pairs are fed to DepthNet to obtain a wide depth sensing range.
Kuznetsova, Alina [Verfasser]. "Hand pose recognition using a consumer depth camera / Alina Kuznetsova." Hannover : Technische Informationsbibliothek (TIB), 2016. http://d-nb.info/1100290125/34.
Sandberg, David. "Model-Based Video Coding Using a Colour and Depth Camera." Thesis, Linköpings universitet, Datorseende, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-68737.
In this thesis, a model-based video coding algorithm has been developed that uses data from a colour-and-depth camera such as the Microsoft Kinect. A model-based representation of a video has several advantages over the more common block-based variant used by, for example, H.264: the video can be rendered in 3D and from alternative views, objects can be inserted into the video, and the user can interact with the scene. This thesis demonstrates a very efficient method for compressing scene geometry. The results of the presented algorithm show that very low bit rates can be achieved with results comparable to the H.264 standard.
Harding, Cressida M. "How far away is it? : depth estimation by a moving camera." Thesis, University of Canterbury. Electrical and Electronic Engineering, 2001. http://hdl.handle.net/10092/6157.
Barandas, Marília da Silveira Gouveia. "Range of motion measurements based on depth camera for clinical rehabilitation." Master's thesis, Faculdade de Ciências e Tecnologia, 2013. http://hdl.handle.net/10362/11046.
In clinical rehabilitation, biofeedback increases the patient's motivation, which makes it one of the most effective motor rehabilitation mechanisms. In this field it is very helpful for the patient, and even for the therapist, to know the level of success and performance of the training process; the study of human motion tracking can provide relevant information for this purpose. Existing lab-based three-dimensional (3D) motion capture systems are capable of providing this information in real time, but they still present some limitations when used in rehabilitation processes involving biofeedback. A new depth camera, the Microsoft Kinect, was recently developed, overcoming the limitations associated with lab-based movement analysis systems: it is easy to use, inexpensive, and portable. The aim of this work is to introduce a system into clinical practice for range of motion (ROM) measurements, using the Kinect sensor and providing real-time biofeedback. For this purpose, the ROM measurements were computed using the joint spatial coordinates provided by the official Microsoft Kinect Software Development Kit (SDK) and also using our own developed algorithm. The obtained results were compared with triaxial accelerometer data, used as reference. The upper-limb movements studied were abduction, flexion/extension, and internal/external rotation with the arm at 90 degrees of elevation. With our algorithm the mean error (ME) was less than 1.5 degrees for all movements. Only in abduction did the Kinect Skeleton Tracking obtain comparable data; in the other movements the ME increased by an order of magnitude. Given the potential benefits, our method can be a useful tool for ROM measurements in clinics.
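For reference, a ROM-style joint angle can be computed directly from three tracked joint positions; a small sketch, assuming the skeleton tracker supplies 3D coordinates in metres (names and values are illustrative):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by 3D points a-b-c,
    e.g. shoulder-elbow-wrist coordinates from a skeleton tracker."""
    u, v = a - b, c - b
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0)))

# Example: an elbow bent at roughly 90 degrees.
shoulder = np.array([0.0, 1.4, 2.0])
elbow    = np.array([0.0, 1.1, 2.0])
wrist    = np.array([0.3, 1.1, 2.0])
print(joint_angle(shoulder, elbow, wrist))  # ~90.0
```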
Carraro, Marco. "Real-time RGB-Depth perception of humans for robots and camera networks." Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3426800.
This thesis deals with perception for autonomous robots and camera networks from RGB-Depth data. The goal is to provide robust and efficient algorithms for interacting with people; for this reason, particular attention has been devoted to efficient solutions that can run in real time on consumer computers and graphics cards. The main contribution of this work concerns the automatic estimation of the 3D body pose of the people in a scene. Two algorithms are proposed that exploit the RGB-Depth data stream from a camera network, improving the state of the art both when considering data from a single camera and when using all available cameras. The second algorithm achieves better results, since it estimates the pose of all people in the scene with negligible overhead and does not require synchronization between the network nodes; however, the first method uses only point clouds, which remain available in poorly lit environments where the second algorithm would not achieve the same results. The second contribution concerns long-term person re-identification in camera networks. This problem is particularly difficult, since one cannot rely on colour or clothing features, because recognition should still work days later. A framework is proposed that exploits face recognition using a convolutional neural network and a Bayesian classification system: whenever a new track is generated by the people tracking system, the person's face is analysed and, in case of a match, the old ID is reassigned. The third contribution concerns Ambient Assisted Living. We proposed and implemented an assistance robot that periodically patrols a known environment, reporting unusual events such as people lying on the ground. To this end, we developed a fast and robust approach that also works in the dark and was validated using a new RGB-Depth dataset recorded on board the robot. With the goal of advancing research in these fields and providing the greatest possible benefit to the robotics and computer vision communities, as an additional contribution we released most of the software implementations of the algorithms described in this work under open-source licences.
Wang, Chong, and 王翀. "Joint color-depth restoration with kinect depth camera and its applications to image-based rendering and hand gesture recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/206343.
Sun, Yi. "Depth Estimation Methodology for Modern Digital Photography." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1563527854489549.
Wang, Beien. "3D Scintillation Positioning Method in a Breast-specific Gamma Camera." Thesis, KTH, Medicinsk teknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-176453.
Ye, Mao. "MONOCULAR POSE ESTIMATION AND SHAPE RECONSTRUCTION OF QUASI-ARTICULATED OBJECTS WITH CONSUMER DEPTH CAMERA." UKnowledge, 2014. http://uknowledge.uky.edu/cs_etds/25.
Djikic, Addi. "Segmentation and Depth Estimation of Urban Road Using Monocular Camera and Convolutional Neural Networks." Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235496.
Deep learning for safe autonomous transport systems is emerging more and more in research and development. Fast and robust perception of the environment will be crucial for future navigation of autonomous vehicles in urban areas with dense traffic. In this thesis we derive a new form of neural network that we call AutoNet. The network is designed as an autoencoder for pixel-wise depth estimation of the free, drivable road surface in urban areas, using only a monocular camera and its images; the depth estimation is handled as a regression problem. AutoNet is also constructed as a classification network that classifies and segments the drivable road surface in real time with monocular vision, handled as a supervised classification problem, which turns out to be a simpler and more robust solution for finding road surface in urban areas. We also implement one of the leading networks, ENet, for comparison; ENet is designed for fast real-time semantic segmentation with high prediction speed. The evaluation shows that AutoNet outperforms ENet on every accuracy metric, but is slower in terms of frames per second; various optimizations are proposed as future work to increase the model's frame rate while maintaining robustness. All training and evaluation is done on the Cityscapes dataset. New training and evaluation data for road depth estimation is created with a new approach that combines precomputed depth maps with semantic road labels. Data collection with a Scania vehicle fitted with a monocular camera is also performed to test the final model. The proposed network AutoNet turns out to be a promising top-performing model for road depth estimation and road classification in urban areas.
Yuan, Qiantailang. "The Performance of the Depth Camera in Capturing Human Body Motion for Biomechanical Analysis." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235944.
Three-dimensional motion tracking has always been an important topic in the medical and engineering fields. Complex camera systems such as Vicon can be used to obtain accurate data for various movements, but such systems are commercially oriented, usually expensive, and cumbersome, since special suits with markers must be worn for tracking. There is therefore considerable interest in investigating cost-effective, marker-free motion tracking tools. The Microsoft Kinect is a promising solution, with a variety of libraries enabling rapid development of 3D spatial modelling and analysis. From the spatial position information, the acceleration, velocity, and angular change of the joints can be derived. To validate whether the Kinect is suitable for such analysis, a microcontroller platform was developed on Arduino together with an Intel® Curie™ IMU (inertial measurement unit). The velocity and Euler angles of the moving joints, as well as the orientation of the head, were measured and compared between the two systems. The goal of this work is to present (i) the use of the Kinect depth sensor for data acquisition, (ii) post-processing of the acquired data, and (iii) validation of the Kinect camera. The results showed that the RMS error of the velocity tracking varied between 1.78% and 23.34%, indicating good agreement between the two systems. The relative error of the angle tracking is between 4.0% and 24.3%. The head-orientation result was difficult to derive through mathematical analysis, since noise and invalid data arose from the camera due to loss of tracking. The accuracy of the joint motion detected by the Kinect camera proves to be acceptable, especially for velocity measurements, and the depth camera is shown to be an effective, cost-efficient tool for kinematic measurement. A platform and workflow have been established, enabling validation and application when more advanced hardware becomes available.
Bodesund, Fredrik. "Pose estimation of a VTOL UAV using IMU, Camera and GPS." Thesis, Linköpings universitet, Reglerteknik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-60641.
When an autonomous vehicle is to perform a mission, it is of utmost importance that it has good knowledge of its position. Without this it will not be able to navigate, and the data it collects for the mission may be useless. For example, a helicopter could be used to collect laser scans of the terrain below it to create a 3D terrain map; if the knowledge of the helicopter's position and orientation is poor, the collected laser measurements will be useless, since it is not known what the laser is actually measuring. This thesis presents a well-functioning solution for position and orientation estimation of an autonomous helicopter using an inertial measurement unit (IMU), a camera, and GPS. The problem is to estimate the position and orientation using sensors that measure different physical quantities and whose measurements contain noise. An extended Kalman filter (EKF) solution to the simultaneous localisation and mapping (SLAM) problem is used to fuse data from the different sensors and estimate the position and orientation. The scale-invariant feature transform (SIFT) is used for feature extraction, the unified inverse depth parametrisation (UIDP) is used to parametrise landmarks, and the orientation of the robot is described by quaternions. To evaluate the estimates, an ABB robot was used as reference during data collection: since the robot has good knowledge of the position and orientation of its end-effector, the performance of the filter can be examined. The results show that the algorithm works well and that the estimates are highly accurate.
Galardini, Luca. "Research on Methods for Processing Large-Baseline Stereo Camera Data for Long Range 3D Environmental Perception." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Stattin, Sebastian. "Concurrent validity and reliability of a time-of-flight camera on measuring muscle's mechanical properties during sprint running." Thesis, Umeå universitet, Avdelningen för idrottsmedicin, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-163191.
Aldrovandi, Lorenzo. "Depth estimation algorithm for light field data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.
Sych, Alexey, and Олексій Сергійович Сич. "Image depth evaluation system by stream video." Thesis, National Aviation University, 2021. https://er.nau.edu.ua/handle/NAU/50762.
One application of data processing is stereo vision, in which a three-dimensional scene is obtained from models that determine the depths of key image points in a video sequence or in several images. Consider the human example: a two-dimensional image is formed on the retina, yet a person perceives the depth of space, that is, has three-dimensional, stereoscopic vision. Given data on the size of an object, the distance to it can be estimated, or it can be determined which of several objects is closer; when one object is in front of another and partially occludes it, a person perceives the front object as closer. This motivates teaching machines to do the same for various tasks. Based on the processing results, spatial information can be obtained for assessing terrain, obstacles while driving, and so on. The algorithm is based on combining images of the same object, photographed or filmed with constant camera parameters in the same focal plane from different angles, and allows information about the distance to the object to be obtained from perspective distortions (disparities).
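For context, the classic disparity-to-depth relation Z = fB/d that underlies such stereo methods can be sketched with OpenCV's block matcher; the focal length, baseline, and file names below are assumptions:

```python
import cv2
import numpy as np

# Load a rectified stereo pair (file names are placeholders).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching returns disparity in 1/16-pixel fixed point.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# Depth from disparity: Z = f * B / d, with focal length f in pixels
# and baseline B in metres; both values here are assumed.
f_px, baseline_m = 700.0, 0.12
with np.errstate(divide="ignore", invalid="ignore"):
    depth = np.where(disparity > 0, f_px * baseline_m / disparity, 0.0)
```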
Basso, Filippo. "A non-parametric Calibration Algorithm for Depth Sensors Exploiting RGB Cameras." Doctoral thesis, Università degli studi di Padova, 2015. http://hdl.handle.net/11577/3424206.
Depth sensors are common devices on modern robots. They provide the robot with information on the distance and shape of the objects in its field of view, allowing it to act accordingly. In particular, the arrival in recent years of consumer RGB-D sensors such as the Microsoft Kinect has fostered the development of robotics algorithms based on depth data, since these sensors can generate a large amount of data at a relatively low price. This thesis addresses three different problems concerning depth sensor calibration. The first original contribution to the state of the art is an algorithm for estimating the rotation axis of a 2D laser range finder (LRF) mounted on a rotating support. The key difference from other approaches is the use of point-plane constraints derived from the kinematics to estimate the position of the LRF with respect to a fixed camera, and the use of a screw decomposition to estimate the rotation axis; the correct reconstruction of a room after calibration validates the proposed algorithm. The second and most important original contribution is a fully automatic algorithm for calibrating structured-light depth sensors (e.g. the Kinect). The key to this work is the separation of the depth error into two components, both corrected pixel by pixel. This separation, validated by experimental observations, significantly reduces the number of parameters in the final optimization and, consequently, the time needed for the solution to converge to the global minimum. A comparison between test-set depth images corrected with the obtained calibration parameters and the expected ones shows that the difference between the two is only a random quantity, and a qualitative analysis of the fusion between depth and RGB data further confirms the effectiveness of the approach. Moreover, a ROS package to calibrate and correct Kinect data is available open source. The third contribution is a new distributed algorithm for calibrating networks composed of cameras and already-calibrated depth sensors. A ROS package implementing the proposed algorithm has been released as part of a large open-source project for people tracking, OpenPTrack; the package can calibrate networks of about ten sensors in real time (no offline data processing is needed), exploiting plane-plane constraints and a non-linear optimization.
Dey, Rohit. "MonoDepth-vSLAM: A Visual EKF-SLAM using Optical Flow and Monocular Depth Estimation." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1627666226301079.
Müller, Franziska [Verfasser]. "Real-time 3D hand reconstruction in challenging scenes from a single color or depth camera / Franziska Müller." Saarbrücken : Saarländische Universitäts- und Landesbibliothek, 2020. http://d-nb.info/1224883594/34.
Liao, Miao. "Single View Modeling and View Synthesis." UKnowledge, 2011. http://uknowledge.uky.edu/gradschool_diss/828.
Tahavori, F. "The application of a low-cost 3D depth camera for patient set-up and respiratory motion management in radiotherapy." Thesis, University of Surrey, 2017. http://epubs.surrey.ac.uk/813529/.
Lamprecht, Bernhard. "A testbed for vision based advanced driver assistance systems with special emphasis on multi-camera calibration and depth perception." Aachen : Shaker, 2008. http://d-nb.info/990314847/04.
Weng, Chia-Chun, and 翁嘉駿. "Collision Detection Using Depth Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/q5v5bk.
Full text國立成功大學
機械工程學系
106
With the rise of industrial automation, collision avoidance systems have been developed to reduce the damage caused by machine impacts, and collision confirmation systems have been developed to determine whether a collision has occurred; demand for collision detection is therefore increasing. Sensing methods can be divided into contact and non-contact methods, and this study uses a non-contact machine vision method to detect collisions with a depth camera. First, depth cameras placed around the workspace capture 3D point-cloud images of it. The Iterative Closest Point (ICP) algorithm is then used to align and merge the point clouds into a single cloud, which is divided into target-object and obstacle point clouds based on Euclidean distance. The target-object point cloud is expanded along the normal direction of each point. Finally, each point cloud is meshed and the GJK (Gilbert-Johnson-Keerthi) distance algorithm is used to perform collision detection. This study analyzes four cases: six-axis robot arm collision detection, gripper grip detection, mobile robot collision detection, and weapon arts judgment. The results show that the proposed method successfully divides the workspace into target objects, statically constructed obstacles, and dynamically detected obstacles for collision detection, and that the proposed expansion method enables collision prediction.
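A rough sketch of this kind of pipeline using Open3D (an assumption; the thesis does not name its libraries), with the mesh-based GJK test replaced here by a nearest-neighbour distance check between clusters for brevity:

```python
import numpy as np
import open3d as o3d

def merge_scans(source, target, threshold=0.02):
    """Align one scan to another with point-to-point ICP and merge."""
    reg = o3d.pipelines.registration.registration_icp(
        source, target, threshold, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return target + source.transform(reg.transformation)

def check_collision(merged, target_labels, margin=0.01):
    """Cluster the merged cloud by Euclidean proximity (DBSCAN here)
    and flag a collision when the target clusters come within
    `margin` metres of any other cluster."""
    labels = np.array(merged.cluster_dbscan(eps=0.02, min_points=10))
    pts = np.asarray(merged.points)
    target = pts[np.isin(labels, target_labels)]
    obstacles = pts[(labels >= 0) & ~np.isin(labels, target_labels)]
    if len(target) == 0 or len(obstacles) == 0:
        return False
    obs_pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(obstacles))
    kd = o3d.geometry.KDTreeFlann(obs_pcd)
    # search_knn_vector_3d returns (count, indices, squared distances)
    dmin = min(np.sqrt(kd.search_knn_vector_3d(p, 1)[2][0]) for p in target)
    return dmin < margin
```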
Chen, Zhi-Liang, and 陳致良. "Depth Camera - Assisted Indoor Localization Enhancement." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/88236027504172742228.
Full text國立臺灣科技大學
資訊工程系
101
This paper develops an approach for image triangulation from a point cloud. The approach has three parts: environment reconstruction, virtual image database construction, and triangulation. While constructing the virtual image database, we acquire extra localization information that traditional image localization lacks. When the camera is far from the scene or is occluded by objects, traditional SIFT localization can lose accuracy; our approach provides higher localization accuracy and coverage by choosing better camera angles and positions automatically. In the experiments, we perform practical localization with both traditional SIFT localization and virtual image triangulation and compare the results.
Chiou, Yi-Wen, and 邱義文. "Depth Refinement for View Synthesis using Depth Sensor and RGB Camera." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/29737408822663940034.
Full text國立交通大學
電子研究所
100
In recent years, three-dimensional (3D) video has been a trend, following the very popular science fiction movie Avatar in 2009, and many 3D movies, TV sets, and even mobile phones have been developed. View synthesis technology is an essential element of a 3D video system. The current technology adopted by the international MPEG committee for the 3DVC standard generates a virtual viewpoint scene from the received 2D views and their associated depth information, so depth information plays an important role in 3D view synthesis. In general, depth can be estimated from two 2D texture images of different viewpoints using various depth estimation methods, an approach called passive depth estimation. Very often this approach fails to provide accurate depth in textureless regions, and the occlusion regions that always exist with two cameras often lead to "hole" defects in the synthesized views. In this study, we adopt an active sensor, the Kinect, to capture the depth information; this active sensor provides fairly accurate depth in textureless regions and operates in real time. To generate a new or virtual view, we use a pair of Kinect sensors as the left and right cameras. The Kinect depth sensor can be treated as another camera, so we employ and improve conventional techniques to estimate its camera parameters, calibrate its images (depth maps), and reduce artifacts. In the calibration step, we use the information between the two texture images to estimate the 3D geometric relationship between the two Kinect sensors. Furthermore, because the depth sensor and the color camera are located at different positions, we propose an alignment procedure that matches the coordinates of the depth image and the texture image, designed using a disparity model and the Kinect SDK functions. Finally, we use a joint bilateral filter and color information to reduce noise and defects in the sensor-acquired depth map. Compared with the depth map estimated by the MPEG Depth Estimation Reference Software (DERS), the captured and processed depth map clearly provides more accurate depth information. In the end, we examine and compare the views synthesized from the original and the refined depth maps, and the quality of the refined synthesized image is noticeably improved.
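The joint bilateral filtering step can be sketched with OpenCV's ximgproc module from opencv-contrib; the file names and filter parameters below are assumptions:

```python
import cv2
import numpy as np

# depth16: raw 16-bit depth map; color: 8-bit BGR image already
# aligned to the depth map (file names are placeholders).
depth16 = cv2.imread("depth.png", cv2.IMREAD_UNCHANGED)
color = cv2.imread("color.png")

depth = depth16.astype(np.float32)

# Guiding the filter with the color image snaps depth edges to
# color edges while smoothing noise in flat regions.
refined = cv2.ximgproc.jointBilateralFilter(
    color, depth, d=9, sigmaColor=25, sigmaSpace=9)
```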
Tu, Chieh-Min, and 杜介民. "Depth Image Inpainting with RGB-D Camera." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/k4m42a.
Full text義守大學
資訊工程學系
103
Since Microsoft released the cheap Kinect sensor as a new natural user interface, stereoscopic content generation has moved from the earlier synthesis of multi-view color images to synthesis from a color image and a depth image. However, captured depth images may lose some depth values, so the stereoscopic effect is often poor. This thesis develops an object-based depth inpainting method based on the Kinect RGB-D camera. First, background differencing, frame differencing, and depth thresholding strategies are used to segment foreground objects from a dynamic background image. The hole inpainting task is then divided into background and foreground areas: the background area is inpainted from a background depth image, and the foreground area is inpainted with a best-fit neighborhood depth value. Experimental results show that this inpainting method helps fill holes and improves contour edges and image quality.
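A toy version of such a background/foreground hole fill, with the best-fit neighborhood search simplified to the nearest valid pixel in the same row (all names are ours, not the thesis's):

```python
import numpy as np

def inpaint_depth(depth, background_depth, fg_mask):
    """Fill zero-valued (lost) depth pixels: background holes are
    copied from a reference background depth image; foreground
    holes take the nearest valid neighbour value in the same row."""
    out = depth.copy()
    holes = out == 0
    out[holes & ~fg_mask] = background_depth[holes & ~fg_mask]
    for r, c in zip(*np.where(holes & fg_mask)):
        row = out[r]
        valid = np.where(row > 0)[0]
        if valid.size:  # nearest valid pixel in the row
            out[r, c] = row[valid[np.argmin(np.abs(valid - c))]]
    return out
```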
Tsai, Sheng-Che, and 蔡昇哲. "Implementation Of 3D Cursor Using Depth Camera." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/20737575961640569497.
Full text國立臺灣大學
電信工程學研究所
101
The 3D cursor system is a new idea in 3D user interfaces (UI). Rather than using a traditional 3D input device, the 3D cursor system uses a depth camera to let users control the cursor with their hands. It differs from a regular 2D cursor in having one more attribute: depth. While a 2D cursor has only two attributes, a 3D cursor can manipulate objects expressively and conveniently in a 3D virtual world. This thesis focuses on how to design the 3D cursor system and discusses the implementation problems. It also covers some 3D UI and 3D gesture design, shows the implementation results, and discusses the pros and cons at the end.
Liou, Jia-Lin, and 劉嘉麟. "Noncontact Respiratory Volume Measurement Using Depth Camera." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/64011430563257356131.
Full text國立臺灣大學
資訊工程學研究所
100
Breathing is important. In this study, a noncontact respiration measurement technique using a depth camera (the Kinect) is developed to measure respiratory volume from the morphological changes of the chest wall region. The user's breathing status (respiratory rate, respiratory depth, and inhale-to-exhale ratio) and breathing method (thoracic or abdominal breathing) can then be measured. To measure chest wall movements, a dynamic thoracic and abdominal region-of-interest (ROI) tracking technique is used. Two frameworks for noncontact respiratory measurement are proposed: a single-sided depth camera system for measuring respiratory volume in sitting and lying postures, and a double-sided depth camera system for the standing posture. The system was evaluated in three clothing conditions (naked, thin clothing, and thick coat) and three postures (sitting, lying, and standing), with the measured respiratory volumes compared between our method and a reference device, a spirometer. Finally, three noncontact respiratory measurement applications are developed: a Mao-Kung Ting system, an Intensive Care Unit (ICU) respiratory monitoring system, and a standing meditation system. In conclusion, a low-cost and easy-to-operate noncontact regional respiratory measurement system is developed; the contribution of this study is a system that helps users become aware of their breathing conditions, working toward the ultimate goal of preventive medicine.
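The volume computation can be approximated by integrating per-pixel depth change over the chest ROI, converting pixels to metric area with the pinhole model; a simplified sketch of the idea, with assumed intrinsics fx and fy:

```python
import numpy as np

def tidal_volume(depth_ref, depth_now, roi, fx, fy):
    """Approximate volume change (litres) of the chest wall inside
    `roi` (a boolean mask) between two depth frames given in metres.
    At depth z, one pixel covers (z/fx) * (z/fy) square metres."""
    dz = (depth_ref - depth_now)[roi]        # chest rise, metres
    z = depth_ref[roi]
    pixel_area = (z / fx) * (z / fy)         # square metres per pixel
    return float(np.sum(dz * pixel_area) * 1000.0)  # m^3 -> litres
```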
Chen, Hao-Yu, and 陳皓宇. "An Extrinsic Calibration for Depth Camera to Narrow Field of View Color Camera." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/47e5k2.
Full text國立臺灣大學
資訊網路與多媒體研究所
107
This work proposes a calibration method between a narrow field-of-view camera and a depth camera in an endoscope-like scenario. Such a scenario has several properties, including limited specular reflective surfaces and a camera with a narrow field of view. Instead of pushing the accuracy of the target marker with low-resolution data, we propose a solution based on a loss function. The proposed loss function uses all of the 3D points of the checkerboard measured with the depth camera and computes the distance between the 3D positions projected onto the 2D image surface and the color image. The final reprojection error is improved to under 1 millimeter on average, enabling further training and evaluation of depth estimation algorithms.
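A generic version of such a reprojection-error minimization, using OpenCV projection and SciPy least squares (the thesis's exact parametrization is not given, so the names here are ours):

```python
import numpy as np
import cv2
from scipy.optimize import least_squares

def reprojection_residuals(params, pts3d_depth, pts2d_color, K, dist):
    """params = (rx, ry, rz, tx, ty, tz): extrinsics mapping points
    measured in the depth-camera frame into the color-camera frame."""
    rvec, tvec = params[:3], params[3:]
    proj, _ = cv2.projectPoints(pts3d_depth, rvec, tvec, K, dist)
    return (proj.reshape(-1, 2) - pts2d_color).ravel()

def calibrate_extrinsics(pts3d_depth, pts2d_color, K, dist):
    """pts3d_depth: Nx3 checkerboard corners from the depth camera;
    pts2d_color: Nx2 corners detected in the color image (assumed
    given); K, dist: color-camera intrinsics and distortion."""
    res = least_squares(
        reprojection_residuals, x0=np.zeros(6),
        args=(pts3d_depth.astype(np.float64), pts2d_color, K, dist))
    return res.x  # Rodrigues rotation vector and translation
```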
Hu, Jhen-Da, and 胡振達. "Hybrid Hand Gesture Recognition Based on Depth Camera." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/7febjn.
Full text國立交通大學
多媒體工程研究所
103
Hand gesture recognition (HGR) has become one of the most popular topics in recent years, because hand gestures are one of the most natural and intuitive ways for humans and machines to communicate, and it is widely used in human-computer interaction (HCI). In this paper, we propose a method for hand gesture recognition based on a depth camera. First, the hand information in the depth image is separated from the background using a specific depth range, and the hand contour is detected after segmentation. We then estimate the centroid of the hand and calculate the palm size using linear regression. The finger states of the gesture are estimated from the hand contour information, and fingertips are located on smoothed hand contours whose point count is reduced by the Douglas-Peucker algorithm. Finally, we propose a gesture-type estimation algorithm to determine which gesture is being made. Extensive experiments show that the accuracy of our method ranges from 84.35% to 99.55%, with a mean accuracy of 94.29%.
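The contour simplification and finger-counting steps are commonly written with OpenCV roughly as follows; the thresholds are illustrative and the thesis's exact rules differ:

```python
import cv2
import numpy as np

def count_fingers(hand_mask):
    """Rough finger counting from a binary hand mask: simplify the
    contour (Douglas-Peucker), then count convexity defects deep
    enough to be the valleys between extended fingers."""
    cnts, _ = cv2.findContours(hand_mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
    if not cnts:
        return 0
    cnt = max(cnts, key=cv2.contourArea)
    cnt = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
    hull = cv2.convexHull(cnt, returnPoints=False)
    if len(hull) < 4:
        return 0
    defects = cv2.convexityDefects(cnt, hull)
    if defects is None:
        return 0
    # defects[:, 0, 3] is the defect depth in 1/256-pixel units.
    deep = np.sum(defects[:, 0, 3] / 256.0 > 20)
    return int(deep) + 1 if deep else 0
```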
Shiu, Hung-Wei, and 徐鴻煒. "3D Human Posture Tracking Based on Depth Camera." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/81555185244865050910.
Wu, Cheng-tsung, and 吳承宗. "A Depth-camera-based Mixed Reality Interactive Table." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/90326863396371121985.
Full text國立中央大學
資訊工程研究所
100
This thesis combines a Microsoft Kinect with a projector to create a mixed-reality interactive table. The table provides two modes: a touch-screen mode and a mixed-reality interactive music mode. Before implementing the two modes, the Kinect and the projector must be calibrated to obtain a coordinate transformation matrix, which moves the origin of the coordinate system from the Kinect to the upper-left corner of the projector screen. Multiplying the real-world point set of each frame by this transformation matrix yields a new point set whose origin is the upper-left corner of the projector screen, and a top-view disparity map is then built from the converted points. In the touch-screen mode, the system recognizes eight hand gestures and uses the gesture and the hand's height to decide the mouse instruction, turning the projected screen into a touch screen. In the mixed-reality interactive music mode, the system provides three-dimensional object recognition: users can exercise their creativity by composing blocks of arbitrary shape on the projector screen and selecting a suitable instrument for the objects (blocks) from 120 kinds of musical instruments. Blocks, previously only tactile and visual, are thus coupled with hearing. The system recognizes the user-specified instrument object and draws 21 notes next to it, and several instruments can be played at the same time, offering an understanding of musical instruments, the fun of ensemble playing, and an open creative space that stimulates thinking.
Yan, Chin-Hsien, and 顏欽賢. "THE RESEARCH OF FALL DETECTION BY DEPTH CAMERA." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/25038998937168589780.
Full text大同大學
資訊工程學系(所)
101
Home safety for the elderly is an important issue in today's society: many elderly people live alone without all-day care from their loved ones, and a fall at home can leave a person unable to help themselves or obtain first aid. Home safety detection equipment can come in handy here: a system that detects falls, tracks people dynamically, and reports back to nursing staff or relatives in a timely manner can help solve the safety problems of elderly people living alone at home. Fall detection systems detect falls and report people's actions. Numerous academic studies rely on wearable sensors such as accelerometers, gyroscopes, level meters, heart-rate monitors, and electromyography (EMG) measurement systems, on floor sensors, or on body contours captured by ordinary color cameras to detect fall postures. This study instead uses a depth camera to track the human body and find the 3D coordinates of 20 joint points, and builds a calibration tool that converts camera coordinates into world coordinates, transforming the 20 skeleton points accordingly. We use PCA (principal component analysis) to find the direction of the body, and the angle between the body's principal axis and the floor serves as the fall feature. This reduces the discomfort of wearing sensors and the instrument cost for the elderly, and it does not affect users' habits at home, making the device easy to use for home safety protection. The depth camera can track the human body in low-light environments and can be used in any environment at home. Experimental results show that the system achieves a fall-detection accuracy of 98.86%.
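The PCA-based fall feature reduces to a single angle between the body's principal axis and the floor; a minimal sketch, assuming world coordinates with the floor in the plane y = 0 (the joint array layout is our assumption):

```python
import numpy as np

def body_tilt_angle(joints_world):
    """Angle (degrees) between the body's principal axis and the
    floor plane, from an Nx3 array of world-frame joint positions."""
    centered = joints_world - joints_world.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]                       # first principal component
    up = np.array([0.0, 1.0, 0.0])     # floor normal
    # Elevation of the axis above the floor plane: arcsin(|axis . up|).
    return np.degrees(np.arcsin(np.clip(abs(np.dot(axis, up)), 0.0, 1.0)))

# A standing skeleton gives roughly 90 degrees, a lying one near 0,
# so a small angle sustained over time can trigger a fall alarm.
```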
Tsai, Jie-Shiou, and 蔡杰修. "Gaze Direction Estimation Using Only A Depth Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/qmgq94.
Full text國立臺灣科技大學
電子工程系
106
Over the years, gaze estimation has become more and more popular in computer vision and is widely used in many applications. Many methods have been developed to determine where people are looking, but current gaze estimation methods tend to be expensive, inconvenient, and invasive. This thesis presents a non-invasive approach to estimating the user's gaze direction based on a single consumer depth camera. The proposed method can estimate gaze directions under various lighting conditions and operating distances by using depth information. First, we use the contour of the chin to match the location of the head. Then we employ facial geometric relationships to locate the eye regions as the ROI, cutting down the computational complexity. Next, we use the depth differences between the nose and the eyes to locate the bridge of the nose, and the average distance from the eye center to the nose bridge to locate the eye centers, which serve as reference points in the proposed method. After that, we increase the intensity of the ROI to make the dark-pupil features more obvious and estimate the locations of the pupils. Finally, we obtain the gaze directions by analyzing the relative position between the pupils and the reference points. By using head contours and geometric relationships, the proposed method can estimate gaze direction without any color information. The performance of the system was verified on five users at three luminance levels and three testing distances; experimental results show an average accuracy of 80.1%, close to existing RGB-based methods.
Liu, Yueh-Sheng, and 劉曰聖. "Depth Acquisition System Based on Occlusion-Aware Depth Estimation with Light-Field Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/9287j7.
Full text國立交通大學
光電工程研究所
106
Light-field cameras, or plenoptic cameras, have recently become available for consumer and industrial applications. Over the last decade, researchers have investigated highly accurate, real-time depth reconstruction from a plenoptic camera. However, the depth resolution of a plenoptic camera is very low due to the narrow baseline, and depth estimation is time-consuming because of the high software complexity. In this thesis, a detailed analysis of the plenoptic camera is presented, and an optimal plenoptic camera with a specific lens design is introduced to improve depth resolution and field of view simultaneously. In addition, we propose an occlusion-aware depth estimation algorithm within an efficient framework. We assume depth values are similar within color-homogeneous regions and that depth discontinuities occur at color edges and texture-variation boundaries. First, initial disparities are estimated at edge locations by enforcing photo-consistency only on the non-occluded region of the angular patch; a dense depth map is then recovered from the sparse depth estimates with edge-aware propagation. The proposed method is evaluated on synthetic and real-world datasets. Experimental results show that our algorithm reduces runtime to 8.8% of that of other relevant algorithms while keeping competitive depth-map accuracy.
Wong, Sin-Lung, and 黃新隆. "Real-Time Human Body Posture Estimation Using Depth Camera." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/25145582122553447840.
Full text國立中興大學
電機工程學系所
102
This thesis proposes a real-time three-dimensional (3D) human body posture estimation method using significant points extracted from the two-dimensional (2D) human body contour together with their depth information from a depth camera. The located significant points include the head, the center of the body, the tips of the feet and hands, the shoulders, the elbows, and the knees. The Kinect camera is used to capture color and depth image sequences at the same time. For human body segmentation, an angle-compensated segmentation method in the red, green, and blue color space (AC-RGB) is used to segment a moving object from the background in the color image and reduce the influence of shadow; segmentation using the depth information in the depth image is conducted as well, and the two results are combined with an OR operation. After segmentation, the 2D locations of the head and the tips of the feet and hands are obtained from 2D contour convex points and body geometrical characteristics. When occlusion of the hand occurs, the depth information is used to locate the candidate region and optical flow is used to find the tip of the hand. The 2D locations of the elbows and knees are obtained using the skeleton pixels of the 2D body silhouette. After localization of the 2D significant points, the corresponding depth information is obtained from the depth image, and the 3D locations of the significant points are sent to the Virtools software to reconstruct a virtual 3D human model. A real-time virtual 3D human model system is set up to verify the effectiveness of the proposed approach, and the system is also applied to some 3D interactive entertainment games to show its potential.
Wang, Jie-Hung, and 王傑鴻. "Simulation-Based 3D Pose Estimation Using a Depth Camera." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/d62482.
Full text國立臺北科技大學
製造科技研究所
104
Object recognition and posture estimation play an important role in automatic assembly and service robot applications. Today, the most popular recognition approach uses geometric or feature statistics to identify the object and estimates its posture by comparing the CAD model with a point cloud of the scanned object. However, the parallax effect of the Kinect's infrared sensing causes significant distortion in the point cloud. To improve the effectiveness of feature comparison between distorted point-cloud data and the object's CAD model, we propose a Kinect virtual simulation space comparison method, which simulates the Kinect scanning situation by estimating the viewing angle, object distance, and environmental intensity parameters. The point-cloud similarity can be increased by this method, so object comparison and picking-posture estimation can be achieved precisely and rapidly. Compared with previous research, the results of this study reveal that the accuracy of posture estimation and the computational efficiency increase significantly. This technique will help the automatic assembly and service robot fields toward more immediate and accurate applications.
Lu, Hung-Chih, and 盧泓志. "Structured Light Depth Camera Motion Blur Detection and Deblurring." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/42540991324647985208.
Full text國立臺灣大學
資訊工程學研究所
103
Deblurring of 3D scenes captured by 3D sensors is a novel topic in computer vision. Motion blur occurs in a number of 3D sensors based on structured-light techniques. We analyze the causes of motion blur in structured-light depth cameras and design a novel algorithm that uses a speed cue and object models to deblur a 3D scene. The main idea is to use the 3D model of an object to replace the blurry object in the scene. Because we aim to deal with consecutive 3D frame sequences, i.e., 3D videos, an object model can be built in a frame where the object is not yet blurry. Our deblurring method has two parts: motion blur detection and motion blur removal. For the detection part, we use the speed cue to find where the motion blur is. For the removal part, we first judge the type of motion blur and then apply the iterative closest point (ICP) algorithm in different ways according to the blur type. The proposed method is evaluated on real-world cases and successfully accomplishes motion blur detection and removal.
Lin, Yi-ta, and 林逸達. "3D Object Tracking and Recognition with RGB-Depth Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/yn2qzy.
Full text國立中山大學
電機工程學系研究所
106
The main purpose of this paper is 3D object tracking using an RGB-D camera. In addition, the object may be changed during the tracking phase, and our system can identify the new object. The method consists of three phases: off-line training, on-line tracking, and identification of a new object. In the first phase, we create 3D models of the tracked objects (a box, a cylinder, and a sphere), compute the point pair features of each model, and store them in a database for later use. In the second phase, the RGB-D sensor captures the real-world scene, and the scene's point pair features are computed as in the first phase; these are matched against the database to find where the 3D model lies in the scene. Since this gives only an initial pose, the Iterative Closest Point (ICP) algorithm is then used to obtain a better pose. In the third phase, the tracked object is changed during tracking; the system detects this from the scene, identifies the new object, and keeps tracking it with the method of the second phase.
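The point pair feature, in the spirit of the Drost et al. formulation that such methods typically follow, combines the pairwise distance with three normal angles; a small sketch:

```python
import numpy as np

def point_pair_feature(p1, n1, p2, n2):
    """PPF for an oriented point pair:
    F = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2))."""
    d = p2 - p1
    def angle(a, b):
        a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
        return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    return np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)

# Quantized model PPFs are stored in a hash table offline; at runtime,
# scene PPFs vote for matching model poses, and the best hypothesis
# is refined with ICP.
print(point_pair_feature(np.zeros(3), np.array([0., 0., 1.]),
                         np.array([0.1, 0., 0.]), np.array([0., 1., 0.])))
```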
Gu, Kai-Da, and 古凱達. "Depth-of-Field Simulation Based on a Real Camera Model." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/22785912255652743121.
Full text國立中正大學
電機工程所
95
Depth of field is an inherent physical phenomenon of optical systems, and it is an indispensable factor in animation technology and cinematic special effects. A number of approaches have been developed to generate depth-of-field effects with varying degrees of success, such as distributed ray tracing, image accumulation, and layered depth. This thesis presents a new post-processing method for simulating depth of field based on a real camera model. Our algorithm takes advantage of layered depth information and the smoothing characteristic of the Gaussian filter. Unlike previous approaches, we use a simple calibration method to find the relation between the distance from the camera lens to an object and the degree of blur in the image. With these calibration results, we can simulate the appropriate blur according to the object's distance from the lens. Conversely, the same relation between object distance and blur degree can be used to estimate the real distance between objects and the camera lens. The results are accurate, and such distance estimation is useful in many applications.
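A layered approximation of this idea: split the scene by depth and blur each layer with a Gaussian whose sigma comes from a calibrated distance-to-blur mapping (the mapping below is hypothetical, not the thesis's calibration):

```python
import cv2
import numpy as np

def blur_sigma(distance_m, focus_m, k=3.0):
    """Hypothetical calibrated distance-to-blur mapping: sigma grows
    with distance from the focal plane."""
    return k * abs(distance_m - focus_m)

def layered_dof(image, depth, focus_m, n_layers=8):
    """Split the scene into depth layers and blur each layer with a
    Gaussian whose sigma comes from the distance-blur relation."""
    out = np.zeros_like(image, dtype=np.float32)
    weight = np.zeros(depth.shape, dtype=np.float32)
    edges = np.linspace(depth.min(), depth.max(), n_layers + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = ((depth >= lo) & (depth <= hi)).astype(np.float32)
        sigma = max(blur_sigma((lo + hi) / 2.0, focus_m), 1e-3)
        blurred = cv2.GaussianBlur(image.astype(np.float32), (0, 0), sigma)
        m = cv2.GaussianBlur(mask, (0, 0), sigma)  # soften layer seams
        out += blurred * m[..., None]
        weight += m
    return (out / np.maximum(weight, 1e-6)[..., None]).astype(image.dtype)
```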
Shih, Kuang-Tsu, and 施光祖. "High-Resolution Imaging and Depth Acquisition Using a Camera Array." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/un25de.
Full text國立臺灣大學
電信工程學研究所
105
In this age where everyone can be a photographer with his or her smartphone, the pursuit of higher imaging quality has become more important and profitable than ever before. Among the quality metrics of images, resolution is often the one people care about the most. As one of the conventional approaches to increasing image resolution, optics optimization is believed to have reached its bottleneck; as a consequence, researchers are turning to computational photography to seek a breakthrough. In this dissertation, we study the computational approach to high-resolution imaging based on multi-aperture systems such as a camera array or a lenslet array. The dissertation can be divided into two parts. The first part is dedicated to the analysis of existing approaches; in particular, two approaches are inspected in depth: subpixel refocusing and reconstruction-based light field super-resolution. For subpixel refocusing, we show that a deconvolution step is missing in previous work and that incorporating a deconvolution in the loop significantly enhances the sharpness of the results. We also conduct experiments to quantitatively analyze the effect of calibration error on subpixel refocusing and analyze the upper bound of the error for a targeted image quality. For reconstruction-based light field super-resolution, on the other hand, we show through experiments that the resolution gain obtainable by super-resolution does not increase boundlessly with the number of cameras and is ultimately limited by the size of the point spread function. In addition, we point out through experiment that there is a tradeoff between the obtainable resolution and the registration accuracy; this tradeoff is a fundamental limit of reconstruction-based approaches. In contrast to the analysis work in the first part, the second part of the dissertation describes our original solution: a computational photography system based on a camera array with mixed focal lengths. Our solution has two distinguishing features: it can generate an output image whose resolution is higher than 80% of the total captured pixels, together with a disparity map of the same resolution that contains the depth information about the scene. The solution consists of optimized hardware and an image fusion algorithm. On the hardware side, we propose an approach to optimize the configuration of a camera array for high-resolution imaging using cameras with mixed focal lengths and non-parallel optical axes. On the software side, an algorithm is developed to integrate the low-resolution images captured by the proposed camera array into a high-resolution image without the blurry appearance problem of previous methods.