
Dissertations / Theses on the topic 'Image representation methods'


Consult the top 30 dissertations / theses for your research on the topic 'Image representation methods.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Chang, William. "Representation Theoretical Methods in Image Processing." Scholarship @ Claremont, 2004. https://scholarship.claremont.edu/hmc_theses/160.

Abstract:
Image processing refers to the various operations performed on pictures that are digitally stored as an aggregate of pixels. One can enhance or degrade the quality of an image, artistically transform the image, or even find or recognize objects within the image. This paper is concerned with image processing from a very mathematical perspective, involving representation theory. The approach traces back to Cooley and Tukey’s seminal paper on the Fast Fourier Transform (FFT) algorithm (1965). Recently, there has been a resurgence of interest in algebraic generalizations of this original algorithm with respect to different symmetry groups. My approach in the following chapters is as follows. First, I will give the necessary tools from representation theory to explain how to generalize the Discrete Fourier Transform (DFT). Second, I will introduce wreath products and their application to images. Third, I will show some results from applying some elementary filters and compression methods to the spectra of images. Fourth, I will attempt to generalize my method to noncyclic wreath product transforms and apply it to images and three-dimensional geometries.
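For orientation, the classical transform being generalized here is the DFT; a standard statement (background, not quoted from the thesis) is

```latex
\hat{f}(k) \;=\; \sum_{n=0}^{N-1} f(n)\, e^{-2\pi i k n / N},
\qquad k = 0, \dots, N-1 .
```

Viewing the index set as the cyclic group \(\mathbb{Z}/N\mathbb{Z}\) is what opens the door to replacing it with other symmetry groups, such as the wreath products the thesis studies.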
2

Karmakar, Priyabrata. "Effective and efficient kernel-based image representations for classification and retrieval." Thesis, Federation University Australia, 2018. http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/165515.

Abstract:
Image representation is a challenging task. In particular, in order to obtain better performance in different image processing applications such as video surveillance, autonomous driving, crime scene detection and automatic inspection, effective and efficient image representation is a fundamental need. The performance of these applications usually depends on how accurately images are classified into their corresponding groups or how precisely relevant images are retrieved from a database based on a query. Accuracy in image classification and precision in image retrieval depend on the effectiveness of image representation. Existing image representation methods have some limitations. For example, spatial pyramid matching, which is a popular method incorporating spatial information in image-level representation, has not been fully studied to date. In addition, the strengths of the pyramid match kernel and spatial pyramid matching have not been combined for better image matching. Kernel descriptors based on gradient, colour and shape overcome the limitations of histogram-based descriptors, but suffer from information loss, noise effects and high computational complexity. Furthermore, the combined performance of kernel descriptors has limitations related to computational complexity, higher dimensionality and lower effectiveness. Moreover, the potential of a global texture descriptor based on human visual perception has not been fully explored to date. Therefore, in this research project, effective and efficient kernel-based image representation methods are proposed to address the above limitations. An enhancement is made to spatial pyramid matching in terms of improved rotation invariance. This is done by investigating different partitioning schemes suitable for achieving rotation-invariant image representation and by proposing a weight function for appropriate level contribution in image matching. In addition, the strengths of the pyramid match kernel and the spatial pyramid are combined to enhance matching accuracy between images. The existing kernel descriptors are modified and improved to achieve greater effectiveness, minimal noise effects, lower dimensionality and lower computational complexity. A novel fusion approach is also proposed to combine the information related to all pixel attributes before the descriptor extraction stage. Existing kernel descriptors are based only on gradient, colour and shape information. In this research project, a texture-based kernel descriptor is proposed by modifying an existing popular global texture descriptor. Finally, all the contributions are evaluated in an integrated system. The performance of the proposed methods is qualitatively and quantitatively evaluated on two to four different publicly available image databases. The experimental results show that the proposed methods are more effective and efficient in image representation than existing benchmark methods.
Doctor of Philosophy
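To make the spatial pyramid idea concrete, here is a minimal sketch of the standard Lazebnik-style spatial pyramid matching kernel that this line of work builds on (function and parameter names are my own, not the thesis's): local features are quantised into visual words, per-cell histograms are computed at several grid resolutions, and weighted histogram intersections are summed.

```python
import numpy as np

def spatial_pyramid_kernel(words_a, pos_a, words_b, pos_b, n_words, n_levels=3):
    """Minimal spatial pyramid matching kernel (illustrative sketch).

    words_*: visual-word index of each local feature (1-D int arrays).
    pos_*:   (N, 2) feature coordinates, normalised to [0, 1].
    """
    def level_histogram(words, pos, level):
        cells = 2 ** level
        grid = np.minimum((pos * cells).astype(int), cells - 1)  # guard pos == 1.0
        cell = grid[:, 0] * cells + grid[:, 1]
        hist = np.zeros(cells * cells * n_words)
        np.add.at(hist, cell * n_words + words, 1.0)
        return hist

    top = n_levels - 1
    value = 0.0
    for level in range(n_levels):
        # Lazebnik-style weights: coarse levels count less than fine ones.
        weight = 1.0 / 2 ** top if level == 0 else 1.0 / 2 ** (top - level + 1)
        ha = level_histogram(words_a, pos_a, level)
        hb = level_histogram(words_b, pos_b, level)
        value += weight * np.minimum(ha, hb).sum()  # histogram intersection
    return value
```

The thesis's contributions correspond, in this sketch, to replacing the fixed rectangular grid in `level_histogram` with rotation-invariant partitioning schemes and replacing the fixed `weight` with a learned level-contribution function.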
3

Nygaard, Ranveig. "Shortest path methods in representation and compression of signals and image contours." Doctoral thesis, Norwegian University of Science and Technology, Department of Electronics and Telecommunications, 2000. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-1182.

Abstract:

Signal compression is an important problem encountered in many applications. Various techniques have been proposed over the years for addressing the problem. The focus of the dissertation is on signal representation and compression by the use of optimization theory, more specifically shortest path methods.

Several new signal compression algorithms are presented. They are based on the coding of line segments, which are used to approximate, and thereby represent, the signal. These segments are fitted in a way that is optimal given some constraints on the solution. By formulating the compression problem as a graph theory problem, shortest path methods can be applied in order to yield optimal compression with respect to the given constraints.

The approaches focused on in this dissertation mainly have their origin in ECG compression and are often referred to as time domain compression methods. Coding by time domain methods is based on the idea of extracting a subset of significant signal samples to represent the signal. The key to a successful algorithm is a good rule for determining the most significant samples. Between any two succeeding samples in the extracted sample set, different functions are applied in reconstruction of the signal. These functions are fitted in a way that guarantees minimal reconstruction error under the given constraints. Two main categories of compression schemes are developed:

1. Interpolating methods, which insist on equality between the original and reconstructed signal at the extracted points.

2. Non-interpolating methods, where the interpolation restriction is relaxed.

Both first and second order polynomials are used in reconstruction of the signal. An approach is also developed where multiple error measures are applied within one compression algorithm.

The approach of extracting the most significant samples is further developed by measuring the samples in terms of the number of bits needed to encode them. In this way, an approach is developed which is optimal in the rate-distortion sense.

Although the approaches developed are applicable to any type of signal, the focus of this dissertation is on the compression of electrocardiogram (ECG) signals and image contours. ECG signal compression has traditionally been…
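As an illustration of the graph formulation described above, the following sketch (my own simplification, not code from the dissertation) casts interpolating, first-order (piecewise linear) compression as a shortest path problem: nodes are sample indices, an edge i → j exists when the straight segment between samples i and j keeps the reconstruction error within a tolerance, and each edge costs one retained sample.

```python
import heapq
import numpy as np

def optimal_piecewise_linear(signal, eps):
    """Interpolating time-domain compression cast as a shortest path problem.

    Nodes are sample indices. An edge i -> j exists when the straight
    segment from (i, x[i]) to (j, x[j]) stays within eps of every sample
    in between. Each edge costs 1 (one retained sample), so the shortest
    path from the first to the last sample gives the smallest sample
    subset meeting the error constraint.
    """
    x = np.asarray(signal, dtype=float)
    n = len(x)
    dist, prev = [np.inf] * n, [-1] * n
    dist[0] = 0
    heap = [(0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue
        for j in range(i + 1, n):
            t = np.arange(i, j + 1)
            line = x[i] + (x[j] - x[i]) * (t - i) / (j - i)
            if np.abs(x[i:j + 1] - line).max() > eps:
                continue  # this segment violates the error tolerance
            if d + 1 < dist[j]:
                dist[j], prev[j] = d + 1, i
                heapq.heappush(heap, (d + 1, j))
    kept, k = [], n - 1  # backtrack the retained sample indices
    while k != -1:
        kept.append(k)
        k = prev[k]
    return kept[::-1]
```

With unit edge costs a breadth-first search would suffice; keeping Dijkstra makes it straightforward to swap in per-edge bit costs, which corresponds to the rate-distortion variant mentioned above. The naive inner loop makes this O(n³); practical implementations prune candidate edges.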

4

Sampaio, de Rezende Rafael. "New methods for image classification, image retrieval and semantic correspondence." Thesis, Paris Sciences et Lettres (ComUE), 2017. http://www.theses.fr/2017PSLEE068/document.

Abstract:
The problem of image representation is at the heart of computer vision. The choice of features extracted from an image changes according to the task we want to study. Large image retrieval databases demand a compressed global vector representing each image, whereas a semantic segmentation problem requires a clustering map of its pixels. Machine learning techniques are the main tool used for the construction of these representations. In this manuscript, we address the learning of visual features for three distinct problems: image retrieval, semantic correspondence and image classification. First, we study the dependency of a Fisher vector representation on the Gaussian mixture model used as its codewords. We introduce the use of multiple Gaussian mixture models for different backgrounds, e.g. different scene categories, and analyze the performance of these representations for object classification and the impact of scene category as a latent variable. Our second approach proposes an extension to the exemplar SVM feature encoding pipeline. We first show that, by replacing the hinge loss by the square loss in the ESVM cost function, similar results in image retrieval can be obtained at a fraction of the computational cost. We call this model the square-loss exemplar machine, or SLEM. Secondly, we introduce a kernelized SLEM variant which benefits from the same computational advantages but displays improved performance. We present experiments that establish the performance and efficiency of our methods using a large array of base feature representations and standard image retrieval datasets. Finally, we propose a deep neural network for the problem of establishing semantic correspondence. We employ object proposal boxes as elements for matching and construct an architecture that simultaneously learns the appearance representation and geometric consistency. We propose new geometric consistency scores tailored to the neural network’s architecture. Our model is trained on image pairs obtained from keypoints of a benchmark dataset and evaluated on several standard datasets, outperforming both recent deep learning architectures and previous methods based on hand-crafted features. We conclude the thesis by highlighting our contributions and suggesting possible future research directions.
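The computational appeal of the square loss is that training each exemplar classifier reduces to ridge regression, which has a closed form. A bare-bones sketch of this idea (my own illustration, omitting the centering, bias and kernel refinements the thesis describes):

```python
import numpy as np

def slem_encode(query, negatives, lam=1.0):
    """Square-loss exemplar machine (SLEM), bare-bones version: train a
    linear classifier separating one positive (the query) from a fixed
    pool of negatives. With the square loss this is ridge regression,
    so the weights come from a single linear solve.
    """
    X = np.vstack([query, negatives])                      # (1 + n, d)
    y = np.concatenate([[1.0], -np.ones(len(negatives))])  # +1 query, -1 negatives
    d = X.shape[1]
    # Closed form of the ridge problem: w = (X^T X + lam I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    return w  # used as the new representation of `query`
```

Since the negative pool is fixed across all exemplars, the Gram-type matrix can be precomputed and factored once and reused for every query, which is consistent with the fraction-of-the-cost training reported in the abstract.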
5

Budinich, Renato [Verfasser], Gerlind [Akademischer Betreuer] Plonka-Hoch, Gerlind [Gutachter] Plonka-Hoch, and Armin [Gutachter] Iske. "Adaptive Multiscale Methods for Sparse Image Representation and Dictionary Learning / Renato Budinich ; Gutachter: Gerlind Plonka-Hoch, Armin Iske ; Betreuer: Gerlind Plonka-Hoch." Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2019. http://d-nb.info/1175625396/34.

6

Jia, Yue [Verfasser], Timon [Akademischer Betreuer] Rabczuk, Klaus [Gutachter] Gürlebeck, and Alessandro [Gutachter] Reali. "Methods based on B-splines for model representation, numerical analysis and image registration / Yue Jia ; Gutachter: Klaus Gürlebeck, Alessandro Reali ; Betreuer: Timon Rabczuk." Weimar : Institut für Strukturmechanik, 2015. http://nbn-resolving.de/urn:nbn:de:gbv:wim2-20151210-24849.

7

Jia, Yue [Verfasser], Timon [Akademischer Betreuer] Rabczuk, Klaus [Gutachter] Gürlebeck, and Alessandro [Gutachter] Reali. "Methods based on B-splines for model representation, numerical analysis and image registration / Yue Jia ; Gutachter: Klaus Gürlebeck, Alessandro Reali ; Betreuer: Timon Rabczuk." Weimar : Institut für Strukturmechanik, 2015. http://d-nb.info/1116366770/34.

8

Sjöberg, Oscar. "Evaluating Image Compression Methods on Two Dimensional Height Representations." Thesis, Linköpings universitet, Informationskodning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-171227.

9

Wei, Qi. "Bayesian fusion of multi-band images : A powerful tool for super-resolution." Phd thesis, Toulouse, INPT, 2015. http://oatao.univ-toulouse.fr/14398/1/wei.pdf.

Abstract:
Hyperspectral (HS) imaging, which consists of acquiring the same scene in several hundred contiguous spectral bands (a three-dimensional data cube), has opened up a new range of relevant applications, such as target detection [MS02], classification [C.-03] and spectral unmixing [BDPD+12]. However, while HS sensors provide abundant spectral information, their spatial resolution is generally more limited. Thus, fusing the HS image with other highly resolved images of the same scene, such as multispectral (MS) or panchromatic (PAN) images, is an interesting problem. The problem of fusing a high spectral and low spatial resolution image with an auxiliary image of higher spatial but lower spectral resolution, also known as multi-resolution image fusion, has been explored for many years [AMV+11]. From an application point of view, this problem is also important, as motivated by recent national programs, e.g., the Japanese next-generation space-borne hyperspectral image suite (HISUI), which fuses co-registered MS and HS images acquired over the same scene under the same conditions [YI13]. Bayesian fusion allows for an intuitive interpretation of the fusion process via the posterior distribution. Since the fusion problem is usually ill-posed, the Bayesian methodology offers a convenient way to regularize the problem by defining an appropriate prior distribution for the scene of interest. The aim of this thesis is to study new multi-band image fusion algorithms to enhance the resolution of hyperspectral images. In the first chapter, a hierarchical Bayesian framework is proposed for multi-band image fusion by incorporating a forward model, statistical assumptions and a Gaussian prior for the target image to be restored. To derive Bayesian estimators associated with the resulting posterior distribution, two algorithms based on Monte Carlo sampling and an optimization strategy have been developed. In the second chapter, a sparse regularization using dictionaries learned from the observed images is introduced as an alternative to the naive Gaussian prior proposed in the first chapter. Identifying the supports jointly with the dictionaries circumvents the difficulty inherent in sparse coding. To minimize the target function, an alternating optimization algorithm has been designed, which accelerates the fusion process dramatically compared with the simulation-based method. In the third chapter, by exploiting intrinsic properties of the blurring and downsampling matrices, a much more efficient fusion method is proposed thanks to a closed-form solution for the Sylvester matrix equation associated with maximizing the likelihood. The proposed solution can be embedded into an alternating direction method of multipliers or a block coordinate descent method to incorporate different priors or hyper-priors for the fusion problem, allowing for Bayesian estimators. In the last chapter, a joint multi-band image fusion and unmixing scheme is proposed by combining the well-established linear spectral mixture model and the forward model. The joint fusion and unmixing problem is solved in an alternating optimization framework, mainly consisting of solving a Sylvester equation and projecting onto a simplex resulting from the non-negativity and sum-to-one constraints. Simulation results on synthetic and semi-synthetic images illustrate the advantages of the developed Bayesian estimators, both qualitatively and quantitatively.
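A Sylvester equation has the form AX + XB = Q and admits fast closed-form solvers, which is what makes the third chapter's approach efficient. The snippet below uses a generic toy system (the thesis derives its specific A, B and Q from the blurring and downsampling operators) to show how such an equation is solved in practice:

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Illustrative only: a random Sylvester system A X + X B = Q of the kind
# that arises when maximizing the fusion likelihood.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((7, 7))
Q = rng.standard_normal((5, 7))

X = solve_sylvester(A, B, Q)          # closed-form (Bartels-Stewart) solve
print(np.allclose(A @ X + X @ B, Q))  # True
```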
10

Slobodan, Dražić. "Shape Based Methods for Quantification and Comparison of Object Properties from Their Digital Image Representations." Phd thesis, Univerzitet u Novom Sadu, Fakultet tehničkih nauka u Novom Sadu, 2019. https://www.cris.uns.ac.rs/record.jsf?recordId=107871&source=NDLTD&language=en.

Abstract:
The thesis investigates the development, improvement and evaluation of methods for quantitative characterization of objects from their digital images, and similarity measures between digital images. Methods for quantitative characterization of objects from their digital images are increasingly used in applications in which error can have critical consequences, but the traditional methods for shape quantification are of low precision and accuracy. The thesis shows that the coverage of a pixel by a shape can be used to greatly improve the accuracy and precision with which digital images can be used to estimate the distance between a shape's two furthest points measured in a given direction. It is highly desirable that a distance measure between digital images can be related to a certain shape property, and morphological operations are used when defining a distance for this purpose. Still, distances defined in this manner turn out to be insufficiently sensitive to the relevant data representing shape properties in images. We show that the idea of adaptive mathematical morphology can be used successfully to overcome the problems related to the sensitivity of distances defined via morphological operations when comparing objects from their digital image representations.
11

Nain, Delphine. "Scale-based decomposable shape representations for medical image segmentation and shape analysis." Diss., Available online, Georgia Institute of Technology, 2006, 2006. http://etd.gatech.edu/theses/available/etd-11192006-184858/.

Abstract:
Thesis (Ph. D.)--Computing, Georgia Institute of Technology, 2007.
Aaron Bobick, Committee Chair ; Allen Tannenbaum, Committee Co-Chair ; Greg Turk, Committee Member ; Steven Haker, Committee Member ; W. Eric. L. Grimson, Committee Member.
12

Nyh, Johan. "From Snow White to Frozen : An evaluation of popular gender representation indicators applied to Disney’s princess films." Thesis, Karlstads universitet, Institutionen för geografi, medier och kommunikation, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-36877.

Abstract:
Simple content analysis methods, such as the Bechdel test and measuring the percentage of female talk time or characters, have seen a surge of attention from mainstream and social media over the last couple of years. The underlying assumptions are generally shared with the gender role socialization model, and consequently importance is attributed to the high degree to which impressions from media shape young children’s identification processes in particular. For young girls, the Disney Princess franchise (with Frozen included) stands out as the number one player commercially as well as in customer awareness. The lineup of Disney princesses spans from the passive and domestically working Snow White in 1937 to the independent and superpower-wielding princess Elsa in 2013, which makes the line of films an optimal test subject for evaluating the above-mentioned simple content analysis methods. As a control, a meta-study has been conducted on previous academic studies of the same range of films. The sampled research, within fields spanning from qualitative content analysis and semiotics to coded content analysis, all comes to the same conclusions regarding the general changes over time in representations of female characters. The objective of this thesis is to answer whether or not there is a correlation between these changes and those indicated by the simple content analysis methods, i.e. whether or not the simple popular methods are in general coherence with the more intricate academic methods.

Grade: VG (scale IG-VG)

13

Drira, Achraf. "Geoacoustic inversion : improvement and extension of the sources image method." Thesis, Brest, 2015. http://www.theses.fr/2015BRES0089/document.

Abstract:
This thesis aims at analyzing the signals emitted from a spherical omnidirectional source, reflected by a stratified sedimentary environment and recorded by a hydrophone array, in order to characterize marine sediments quantitatively at medium frequencies, i.e. between 1 and 10 kHz. The research developed in this manuscript provides a methodology to facilitate the estimation of the medium's geoacoustic parameters with the image source method, and some appropriate technical solutions to improve this recently developed inversion method. The image source method is based on a physical modeling of the reflection, by a stratified medium, of the wave emitted from a source under the Born approximation. As a result, the reflection of the wave on the layered medium can be represented by a set of image sources, symmetrical to the real source with respect to the interfaces, whose spatial positions are related to the sound speeds and the thicknesses of the layers. The study consists of two parts: signal processing and inversion of geoacoustic parameters. The first part of the work is focused on the development of the image source method. The original method was based on migration and semblance maps of the recorded signals to determine the input parameters of the inversion algorithm, which are travel times and arrival angles. To avoid this step, we propose to determine the travel times with the Teager-Kaiser energy operator (TKEO), and the arrival angles are estimated with a triangulation approach. The inversion model is then integrated, taking into account the possible deformation of the antenna. This part concludes with a new approach that combines TKEO and time-frequency representations in order to obtain a good estimation of the travel times in the case of noisy signals. For the modeling and geoacoustic inversion part, we first propose an accurate description of the forward model by introducing the concept of virtual image sources. This idea provides a deeper understanding of the developed approach. Then, we propose an extension of the image source method to the estimation of supplementary geoacoustic parameters: the density, the absorption coefficient, and the shear wave sound speed. This extension is based on the results of the original inversion (estimation of the number of layers, their thicknesses, and the pressure wave sound speeds) and on the use of the amplitudes of the reflected signals. These improvements and extensions of the image source method are illustrated by their application to both synthetic and real signals, the latter coming from tank and at-sea measurements. The obtained results are very satisfactory, from a computational point of view as well as for the quality of the provided estimations.
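The Teager-Kaiser energy operator used here for travel-time picking has a very simple discrete form; a minimal sketch follows (the operator's standard definition, with my own boundary handling):

```python
import numpy as np

def tkeo(x):
    """Discrete Teager-Kaiser energy operator:
    psi[x](n) = x(n)**2 - x(n-1) * x(n+1).
    Peaks in psi sharpen signal onsets, which is what makes it useful
    for picking the arrival times of reflected wavefronts.
    """
    x = np.asarray(x, dtype=float)
    psi = np.empty_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    psi[0], psi[-1] = psi[1], psi[-2]  # simple boundary handling
    return psi
```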
14

Rey, Otero Ives. "Anatomy of the SIFT method." Thesis, Cachan, Ecole normale supérieure, 2015. http://www.theses.fr/2015DENS0044/document.

Abstract:
This dissertation contributes an in-depth analysis of the SIFT method, the most popular and the first efficient image comparison method. SIFT is also the first method to propose a practical scale-space sampling and to put the theoretical scale invariance of scale space into practice. It associates with each image a list of scale-invariant (also rotation- and translation-invariant) features which can be used for comparison with other images. Because SIFT-style feature detectors have since been used in countless image processing applications, and because of an intimidating number of variants, studying an algorithm that was published more than a decade ago may be surprising. It seems, however, that not much has been done to really understand this central algorithm and to find out exactly what improvements we can hope for in the matter of reliable image matching methods. Our analysis of the SIFT algorithm is organized as follows. We focus first on the exact computation of the Gaussian scale-space which is at the heart of SIFT as well as most of its competitors. We provide a meticulous dissection of the complex chain of transformations that form the SIFT method and a presentation of every design parameter, from the extraction of invariant keypoints to the computation of feature vectors. Using this documented implementation, which permits varying all of its parameters, we define a rigorous simulation framework to find out whether the scale-space features are indeed correctly detected by SIFT, and which sampling parameters influence the stability of extracted keypoints. This analysis is extended to the influence of other crucial perturbations, such as errors in the amount of blur, aliasing and noise. This analysis demonstrates that, despite the fact that numerous methods claim to outperform the SIFT method, there is in fact limited room for improvement in methods that extract keypoints from a scale-space. The comparison of the many detectors proposed in SIFT's competitors is the subject of the last part of this thesis. The performance analysis of local feature detectors has been mainly based on the repeatability criterion. We show that this popular criterion is biased toward methods producing redundant (overlapping) descriptors. We therefore propose an amended evaluation metric and use it to revisit a classic benchmark. Under the amended repeatability criterion, SIFT is shown to outperform most of its more recent competitors. This last fact corroborates the unabating interest in SIFT and the necessity of a thorough scrutiny of this method.
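To fix ideas, here is a simplified sketch of the Gaussian scale-space construction that the first part of the analysis concerns (parameter names and the direct-blur simplification are mine; SIFT's published defaults are sigma0 = 1.6 and 3 scales per octave):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_scale_space(image, n_octaves=4, n_scales=3, sigma0=1.6):
    """Simplified SIFT-style Gaussian scale-space.

    Within an octave, blur grows as sigma0 * 2**(s / n_scales); each new
    octave starts from the 2x-blurred image downsampled by two. (A
    faithful implementation blurs incrementally and accounts for the
    blur already present in each octave's seed image.)
    """
    octaves = []
    base = image.astype(float)
    for _ in range(n_octaves):
        octave = [gaussian_filter(base, sigma0 * 2 ** (s / n_scales))
                  for s in range(n_scales + 3)]  # +3 extra scales for DoG extrema
        octaves.append(np.stack(octave))
        base = octave[n_scales][::2, ::2]        # seed of the next octave
    return octaves
```

Adjacent images in each octave are then subtracted to form the difference-of-Gaussians stack whose 3D extrema are the candidate keypoints; the thesis's sampling questions amount to how finely sigma and the spatial grid must be sampled for those extrema to be detected reliably.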
15

Lang, Heidi. "Understanding the hidden experience of head and neck cancer patients : a qualitative exploration of beliefs and mental images." Thesis, University of Dundee, 2010. https://discovery.dundee.ac.uk/en/studentTheses/c17cd584-34b2-46bc-8290-b8cd3e6ad2c4.

Abstract:
Patients’ beliefs about their illness are known to influence their experiences of illness, its psychological impact, their health behaviours, and overall health outcomes. Research into illness beliefs has typically involved written or oral methods, yet recent studies have suggested that patients’ beliefs about their illness may be embodied in visual form, in their mental images of the disease. Beliefs embedded in mental images may not be captured via traditional modes of assessment, and thus far the possible significance of this kind of ‘visual knowledge’ has been largely overlooked. Studies using visual methods to explore patients’ mental images suggest this is a viable and useful approach which may provide additional insights into their illness beliefs. Research of this kind is in its infancy, however, and there are several fundamental questions concerning the existence and nature of mental images, how best to access such images, and their relationship to illness beliefs, which are as yet unanswered. This thesis employed qualitative methods to address these issues and explore the significance of mental images within the context of head and neck cancer. It consists of three empirical phases – a methodological pilot study, a qualitative meta-synthesis, and a longitudinal study. The findings indicate that many patients do generate a mental image of their cancer, and this is significant in terms of their understanding of both the disease and its treatments. Images appear to enhance patients’ comprehension of what is going on inside their bodies, and may both reflect and influence illness beliefs. In this thesis these findings are considered with reference to the methodological issues intrinsic to researching mental images, and the implications for future research and clinical practice.
16

Boháč, Martin. "Zpracování obrazu při určování topografických parametrů povrchů." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2009. http://www.nusl.cz/ntk/nusl-228823.

Abstract:
This work deals with the determination of topographic parameters of a randomly rough surface with the help of the method of shearing interferometry, an optical method for determining surface roughness. The basic idea rests on the deformation of interference fringes, which are produced by the interference of identical, mutually translated monochromatic wavefronts. The wavefront is created after transmission through, or reflection from, the surface of a studied sample. The wavefronts create a picture with deformed interference fringes, which carries information about the character of the surface. This information can be extracted from the picture by image processing algorithms. The thesis was developed within research project MSM 0021630529, Intelligent Systems in Automation.
17

Nasser, Khalafallah Mahmoud Lamees. "A dictionary-based denoising method toward a robust segmentation of noisy and densely packed nuclei in 3D biological microscopy images." Electronic Thesis or Diss., Sorbonne université, 2019. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2019SORUS283.pdf.

Abstract:
Cells are the basic building blocks of all living organisms. All living organisms share life processes such as growth and development, movement, nutrition, excretion, reproduction, respiration and response to the environment. In cell biology research, understanding cell structure and function is essential for developing and testing new drugs. In addition, cell biology research provides a powerful tool to study embryo development. Furthermore, it helps the scientific research community to understand the effects of mutations and various diseases. Time-Lapse Fluorescence Microscopy (TLFM) is one of the most appreciated imaging techniques, used in live-cell imaging experiments to quantify various characteristics of cellular processes, i.e., cell survival, proliferation, migration, and differentiation. In TLFM imaging, not only spatial information is acquired, but also temporal information, obtained by repeated imaging of a labeled sample at specific time points, as well as spectral information, which produces up to five-dimensional (X, Y, Z + Time + Channel) images. Typically, the generated datasets consist of several (hundreds or thousands of) images, each containing hundreds to thousands of objects to be analyzed. To perform high-throughput quantification of cellular processes, nuclei segmentation and tracking should be performed in an automated manner. Nevertheless, nuclei segmentation and tracking are challenging tasks due to embedded noise, intensity inhomogeneity, shape variation, and weak nuclei boundaries. Although several nuclei segmentation approaches have been reported in the literature, dealing with embedded noise remains the most challenging part of any segmentation algorithm. We propose a novel 3D denoising algorithm, based on unsupervised dictionary learning and sparse representation, that both enhances very faint and noisy nuclei and simultaneously detects nuclei positions accurately. Furthermore, our method relies on a limited number of parameters, with only one being critical: the approximate size of the objects of interest. The framework of the proposed method comprises image denoising, nuclei detection, and segmentation. In the denoising step, an initial dictionary is constructed by selecting random patches from the raw image, and an iterative technique is then implemented to update the dictionary and obtain the final, less noisy one. Next, a detection map, based on the dictionary coefficients used to denoise the image, is used to detect marker points. Afterward, a thresholding-based approach is proposed to obtain the segmentation mask. Finally, a marker-controlled watershed approach is used to obtain the final nuclei segmentation result. We generate 3D synthetic images to study the effect of the few parameters of our method on cell nuclei detection and segmentation, and to understand the overall mechanism for selecting and tuning the significant parameters on the several datasets. These synthetic images have low contrast and a low signal-to-noise ratio, and they include touching spheres, conditions that simulate the characteristics of the real datasets. The proposed framework shows that integrating our denoising method with a classical segmentation method works properly even in the most challenging cases. To evaluate the performance of the proposed method, two datasets from the Cell Tracking Challenge are extensively tested. Across all datasets, the proposed method achieved very promising results, with 96.96% recall on the C. elegans dataset; moreover, on the Drosophila dataset, our method achieved a very high recall (99.3%).
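The denoise-by-dictionary mechanism described above can be illustrated with a 2D stand-in built from off-the-shelf scikit-learn components (my own simplified sketch with illustrative parameter values; the thesis works on 3D stacks and uses its own dictionary update):

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)

def dictionary_denoise(noisy, patch_size=(7, 7), n_atoms=64, alpha=1.0):
    """Learn a dictionary on random patches of the noisy image, sparse-code
    every patch, and rebuild the image by averaging the reconstructed
    overlapping patches."""
    train = extract_patches_2d(noisy, patch_size, max_patches=5000, random_state=0)
    train = train.reshape(len(train), -1)
    dico = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=alpha,
                                       random_state=0)
    dico.fit(train - train.mean(axis=1, keepdims=True))

    patches = extract_patches_2d(noisy, patch_size)
    flat = patches.reshape(len(patches), -1)
    means = flat.mean(axis=1, keepdims=True)
    codes = dico.transform(flat - means)              # sparse codes per patch
    recon = (codes @ dico.components_ + means).reshape(patches.shape)
    return reconstruct_from_patches_2d(recon, noisy.shape)
```

In the thesis's pipeline, the magnitudes of these per-patch codes additionally serve as the detection map from which nuclei markers are extracted before the watershed step.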
18

Liu, Yuan. "Représentation parcimonieuse basée sur la norme ℓ₀ Mixed integer programming for sparse coding : application to image denoising Incoherent dictionary learning via mixed-integer programming and hybrid augmented Lagrangian." Thesis, Normandie, 2019. http://www.theses.fr/2019NORMIR22.

Abstract:
In this monograph, we study the exact ℓ₀-based sparse representation problem. For the classical dictionary learning problem, the solution is obtained by iteratively processing two steps: sparse coding and dictionary updating. However, even the subproblem associated with sparse coding is non-convex and NP-hard. Our method for solving it is to reformulate the problem as a mixed integer quadratic program (MIQP). By introducing two optimization techniques, initialization via a proximal method and relaxation with augmented constraints, the algorithm is greatly sped up (and is thus called AcMIQP) and applied to image denoising, where it shows good performance. Moreover, the classical problem is extended to learning an incoherent dictionary. To deal with this problem, AcMIQP or a proximal method is used for sparse coding. For dictionary updating, the augmented Lagrangian method (ADMM) and the extended proximal alternating linearized minimization method are combined. This exact ℓ₀-based incoherent dictionary learning is applied to image recovery, illustrating improved performance with lower coherence.
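A standard way to cast ℓ₀-constrained sparse coding as an MIQP is the big-M reformulation below (a generic statement consistent with the approach described; the thesis's exact formulation may differ):

```latex
\min_{x \in \mathbb{R}^m,\; z \in \{0,1\}^m}
\ \tfrac{1}{2}\,\| y - D x \|_2^2
\quad \text{s.t.} \quad
-M z_i \le x_i \le M z_i \ \ (i = 1,\dots,m),
\qquad \sum_{i=1}^{m} z_i \le k .
```

Each binary variable z_i switches atom i on or off, the constant M bounds the coefficient magnitudes, and the cardinality constraint enforces ‖x‖₀ ≤ k, so a branch-and-bound MIQP solver returns the global optimum of the sparse coding step rather than the approximation a greedy method would give.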
19

YANG, MING-CHUN, and 楊明錞. "2D B-string representation and access methods of image database." Thesis, 1990. http://ndltd.ncl.edu.tw/handle/81798406311591577024.

20

Budinich, Renato. "Adaptive Multiscale Methods for Sparse Image Representation and Dictionary Learning." Doctoral thesis, 2018. http://hdl.handle.net/11858/00-1735-0000-002E-E55B-F.

21

Stein, Gideon P., and Amnon Shashua. "Direct Methods for Estimation of Structure and Motion from Three Views." 1996. http://hdl.handle.net/1721.1/5937.

Abstract:
We describe a new direct method for estimating structure and motion from the image intensities of multiple views. We extend the direct methods of Horn and Weldon to three views. Adding the third view enables us to solve for motion and compute a dense depth map of the scene directly from image spatio-temporal derivatives, in a linear manner, without first having to find point correspondences or compute optical flow. We describe the advantages and limitations of this method, which are then verified through simulation and experiments with real images.
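The starting point for such direct methods is the classical brightness constancy constraint, which ties the spatio-temporal derivatives to image motion without any correspondence search (standard background, not a formula quoted from the paper):

```latex
I_x\, u + I_y\, v + I_t = 0 .
```

Here (I_x, I_y, I_t) are the spatial and temporal image derivatives and (u, v) is the image motion field; substituting a parametric model of (u, v) in terms of camera motion and inverse depth yields one linear constraint per pixel, and the third view supplies enough constraints to solve for motion and depth together.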
22

Li, Kuan-Ying, and 李冠穎. "Representative Images Selection Methods for Video Clips." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/97957554842018606515.

Abstract:
Master's thesis
National Taipei University of Technology
Master's Program, Department of Computer Science and Information Engineering
ROC year 92 (2003)
When people choose representative pictures from a video, they spend a lot of time and energy. In this thesis, we design two methods for choosing representative pictures from video clips automatically. The methods fall into two categories in our system: the first is the Well-Selection Representative Images Algorithm, and the second is the Auto-Selection Representative Images Algorithm. In the Well-Selection method, the user inputs the desired number of representative pictures; the system then analyzes the video and divides it into scenes using a scene-change detection mechanism. After that, the system performs a spatial and temporal analysis of the scenes, uses Key Scene Allocation to find the key scenes according to the user's input number, and applies the Key Frame Extraction Algorithm to extract representative pictures from the key scenes. In the Auto-Selection method, the user does not need to input the number of selected pictures; the system extracts representative pictures automatically using scene-change detection and the Key Frame Extraction Algorithm. Users can look at the pictures and grasp the whole story of the video. The system was tested on many types of video, and the results are quite satisfactory. Users can apply the system to extract representative pictures from videos and share them with relatives and friends.
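The scene-change detection step both methods rely on can be illustrated with a toy histogram-difference detector (a hypothetical rule and threshold; this abstract does not specify the thesis's exact mechanism):

```python
import numpy as np

def detect_scene_changes(frames, n_bins=32, threshold=0.4):
    """Declare a cut when the colour-histogram difference between
    consecutive frames exceeds a threshold (total variation distance)."""
    cuts, prev = [], None
    for i, frame in enumerate(frames):  # frame: (H, W, 3) uint8 array
        hist, _ = np.histogramdd(frame.reshape(-1, 3).astype(float),
                                 bins=(n_bins,) * 3, range=[(0, 256)] * 3)
        hist /= hist.sum()
        if prev is not None and 0.5 * np.abs(hist - prev).sum() > threshold:
            cuts.append(i)  # frame i starts a new scene
        prev = hist
    return cuts
```

Key-frame extraction then picks one representative frame per detected scene, e.g. the frame closest to the scene's average histogram.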
23

Babu, T. Ravindra. "Large Data Clustering And Classification Schemes For Data Mining." Thesis, 2006. http://hdl.handle.net/2005/440.

Abstract:
Data Mining deals with extracting valid, novel, easily understood by humans, potentially useful and general abstractions from large data. Data is large when the number of patterns, the number of features per pattern, or both are large; such largeness is characterized by a size beyond the capacity of a computer's main memory. Data Mining is an interdisciplinary field involving database systems, statistics, machine learning, visualization and computational aspects, and the focus of data mining algorithms is scalability and efficiency. Large data clustering and classification is an important activity in Data Mining. Clustering algorithms are predominantly iterative, requiring multiple scans of the dataset, which is very expensive when the data is stored on disk. In the current work we propose different schemes that have both theoretical validity and practical utility in dealing with such large data. The schemes broadly encompass data compaction, classification, prototype selection, use of domain knowledge and hybrid intelligent systems. The proposed approaches can be broadly classified as (a) compressing the data in a non-lossy manner and clustering as well as classifying the patterns in their compressed form directly through a novel algorithm, (b) compressing the data in a lossy fashion such that a very high degree of compression and abstraction is obtained in terms of 'distinct subsequences', and classifying the data in this compressed form to improve prediction accuracy, (c) obtaining simultaneous prototype and feature selection with the help of incremental clustering, a lossy compression scheme and a rough set approach, (d) demonstrating that prototype selection and data-dependent techniques can reduce the number of comparisons in a multiclass classification scenario using SVMs, and (e) showing that, by making use of domain knowledge of the problem and the data under consideration, we obtain very high classification accuracy in fewer iterations with AdaBoost. The schemes have pragmatic utility. The prototype selection algorithm is incremental, requires a single scan of the dataset, and has linear time and space requirements. We provide results obtained with large, high-dimensional handwritten (hw) digit data. The compression algorithm is based on simple concepts; we demonstrate that classifying the compressed data reduces the computation time by a factor of 5, with the prediction accuracy on compressed and original data being exactly the same, 92.47%. With the proposed lossy compression scheme and pruning methods, we demonstrate that the prediction accuracy improves even when the number of distinct subsequences is reduced by a factor of 6 (690 to 106). Specifically, with the original data containing 690 distinct subsequences the classification accuracy is 92.47%, and with an appropriate choice of pruning parameters the number of distinct subsequences reduces to 106 with a corresponding classification accuracy of 92.92%. The best classification accuracy of 93.3% is obtained with 452 distinct subsequences. With the scheme of simultaneous feature and prototype selection, we improve the classification accuracy to 93.58%, better than that obtained with kNNC, while significantly reducing the number of features and prototypes, achieving a compaction of 45.1%.
In the case of hybrid schemes based on SVMs, prototypes and a domain-knowledge-based tree (KB-Tree), we demonstrate a 50% reduction in SVM training time and a reduction of about 30% in testing time compared to using the complete data, with the classification accuracy improving to 94.75%. With AdaBoost the classification accuracy is 94.48%, which is better than that obtained with NNC and kNNC on the entire data; the training time is reduced because prototypes are used instead of the complete data. Another important aspect of the work is the design of a KB-Tree (with a maximum depth of 4) that classifies 10-category data in just 4 comparisons. In addition to the hw data, we applied the schemes to Network Intrusion Detection Data (the 10% dataset of KDDCUP99) and demonstrated that the proposed schemes provide a lower overall cost than the reported values.
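The incremental, single-scan prototype selection this abstract mentions is in the spirit of leader clustering; a minimal sketch of that idea, with an assumed distance threshold, might look like the following (this is not the dissertation's exact algorithm):

    # Single-pass prototype selection in the spirit of leader clustering:
    # each pattern either joins an existing prototype within the distance
    # threshold or becomes a new prototype itself. The threshold is an assumption.
    import numpy as np

    def select_prototypes(X, threshold=4.0):
        prototypes = [X[0]]
        for x in X[1:]:                      # exactly one scan of the dataset
            dists = np.linalg.norm(np.array(prototypes) - x, axis=1)
            if dists.min() > threshold:      # no leader close enough: new prototype
                prototypes.append(x)
        return np.array(prototypes)

    # A query is then classified with NNC/kNNC against the prototypes instead
    # of the full dataset, which is where the reduction in comparisons comes from.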
APA, Harvard, Vancouver, ISO, and other styles
24

"Sparse Methods in Image Understanding and Computer Vision." Doctoral diss., 2013. http://hdl.handle.net/2286/R.I.17719.

Full text
Abstract:
Image understanding has been playing an increasingly crucial role in vision applications. Sparse models form an important component in image understanding, since the statistics of natural images reveal the presence of sparse structure. Sparse methods lead to parsimonious models, in addition to being efficient for large scale learning. In sparse modeling, data is represented as a sparse linear combination of atoms from a "dictionary" matrix. This dissertation focuses on understanding different aspects of sparse learning, thereby enhancing the use of sparse methods by incorporating tools from machine learning. With the growing need to adapt models for large scale data, it is important to design dictionaries that can model the entire data space and not just the samples considered. By exploiting the relation of dictionary learning to 1-D subspace clustering, a multilevel dictionary learning algorithm is developed and shown to outperform conventional sparse models in compressed recovery and image denoising. Theoretical aspects of learning such as algorithmic stability and generalization are considered, and ensemble learning is incorporated for effective large scale learning. In addition to building strategies for efficiently implementing 1-D subspace clustering, a discriminative clustering approach is designed to estimate the unknown mixing process in blind source separation. By exploiting the non-linear relation between image descriptors and allowing the use of multiple features, sparse methods can be made more effective in recognition problems. The idea of multiple kernel sparse representations is developed, and algorithms for learning dictionaries in the feature space are presented. Object recognition experiments on standard datasets show that the proposed approaches outperform other sparse coding-based recognition frameworks. Furthermore, a segmentation technique based on multiple kernel sparse representations is developed and successfully applied to automated brain tumor identification. Using sparse codes to define the relation between data samples can lead to a more robust graph embedding for unsupervised clustering. By performing discriminative embedding using sparse coding-based graphs, an algorithm for measuring the glomerular number in kidney MRI images is developed. Finally, approaches to building dictionaries for local sparse coding of image descriptors are presented and applied to object recognition and image retrieval.
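The core sparse model this abstract builds on, representing each sample as a sparse combination of learned dictionary atoms, can be sketched with off-the-shelf tools; scikit-learn stands in here for the dissertation's own multilevel and kernel dictionary algorithms, which this sketch does not implement:

    # Learn an overcomplete dictionary and sparse codes for toy "patch" data.
    # The data, sizes and sparsity level are illustrative assumptions.
    import numpy as np
    from sklearn.decomposition import DictionaryLearning

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 64))            # stand-in for image patches

    dl = DictionaryLearning(n_components=100,     # overcomplete: 100 atoms, 64 dims
                            transform_algorithm='omp',
                            transform_n_nonzero_coefs=5,
                            max_iter=10, random_state=0)
    codes = dl.fit_transform(X)                   # sparse codes, 5 nonzeros per sample
    D = dl.components_                            # the learned dictionary atoms
    print(codes.shape, D.shape, (codes != 0).sum(axis=1).mean())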
Dissertation/Thesis
Ph.D. Electrical Engineering 2013
APA, Harvard, Vancouver, ISO, and other styles
25

Berkels, Benjamin [author]. "Joint methods in imaging based on diffuse image representations / submitted by Benjamin Berkels." 2010. http://d-nb.info/1008748250/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Fuchs, Martin [author]. "Advanced methods for relightable scene representations in image space / submitted by Martin Fuchs." 2008. http://d-nb.info/996233679/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

(9187466), Bharath Kumar Comandur Jagannathan Raghunathan. "Semantic Labeling of Large Geographic Areas Using Multi-Date and Multi-View Satellite Images and Noisy OpenStreetMap Labels." Thesis, 2020.

Find full text
Abstract:
This dissertation addresses the problem of how to design a convolutional neural network (CNN) for giving semantic labels to the points on the ground given the satellite image coverage over the area and, for the ground truth, given the noisy labels in OpenStreetMap (OSM). This problem is made challenging by the fact that -- (1) Most of the images are likely to have been recorded from off-nadir viewpoints for the area of interest on the ground; (2) The user-supplied labels in OSM are frequently inaccurate and, not uncommonly, entirely missing; and (3) The size of the area covered on the ground must be large enough to possess any engineering utility. As this dissertation demonstrates, solving this problem requires that we first construct a DSM (Digital Surface Model) from a stereo fusion of the available images, and subsequently use the DSM to map the individual pixels in the satellite images to points on the ground. That creates an association between the pixels in the images and the noisy labels in OSM. The CNN-based solution we present yields a 4-8% improvement in the per-class segmentation IoU (Intersection over Union) scores compared to the traditional approaches that use the views independently of one another. The system we present is end-to-end automated, which facilitates comparing the classifiers trained directly on true orthophotos vis-à-vis first training them on the off-nadir images and subsequently translating the predicted labels to geographical coordinates. This work also presents, for arguably the first time, an in-depth discussion of large-area image alignment and DSM construction using tens of true multi-date and multi-view WorldView-3 satellite images on a distributed OpenStack cloud computing platform.
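The per-class IoU score this abstract reports can be computed as below; the label maps here are toy arrays, not the WorldView-3/OSM data of the dissertation:

    # Per-class Intersection over Union between predicted and true label maps.
    import numpy as np

    def per_class_iou(pred, truth, n_classes):
        ious = []
        for c in range(n_classes):
            inter = np.logical_and(pred == c, truth == c).sum()
            union = np.logical_or(pred == c, truth == c).sum()
            ious.append(inter / union if union else float('nan'))
        return ious

    pred  = np.array([[0, 1], [2, 2]])
    truth = np.array([[0, 1], [2, 1]])
    print(per_class_iou(pred, truth, n_classes=3))   # [1.0, 0.5, 0.5]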
APA, Harvard, Vancouver, ISO, and other styles
28

(6630578), Yellamraju Tarun. "n-TARP: A Random Projection based Method for Supervised and Unsupervised Machine Learning in High-dimensions with Application to Educational Data Analysis." Thesis, 2019.

Find full text
Abstract:
Analyzing the structure of a dataset is a challenging problem in high-dimensions as the volume of the space increases at an exponential rate and typically, data becomes sparse in this high-dimensional space. This poses a significant challenge to machine learning methods which rely on exploiting structures underlying data to make meaningful inferences. This dissertation proposes the n-TARP method as a building block for high-dimensional data analysis, in both supervised and unsupervised scenarios.

The basic element, n-TARP, consists of a random projection framework to transform high-dimensional data to one-dimensional data in a manner that yields point separations in the projected space. The point separation can be tuned to reflect classes in supervised scenarios and clusters in unsupervised scenarios. The n-TARP method finds linear separations in high-dimensional data. This basic unit can be used repeatedly to find a variety of structures. It can be arranged in a hierarchical structure like a tree, which increases the model complexity, flexibility and discriminating power. Feature space extensions combined with n-TARP can also be used to investigate non-linear separations in high-dimensional data.

The application of n-TARP to both supervised and unsupervised problems is investigated in this dissertation. In the supervised scenario, a sequence of n-TARP based classifiers with increasing complexity is considered. The point separations are measured by classification metrics like accuracy, Gini impurity or entropy. The performance of these classifiers on image classification tasks is studied. This study provides an interesting insight into the working of classification methods. The sequence of n-TARP classifiers yields benchmark curves that put in context the accuracy and complexity of other classification methods for a given dataset. The benchmark curves are parameterized by classification error and computational cost to define a benchmarking plane. This framework splits this plane into regions of "positive-gain" and "negative-gain" which provide context for the performance and effectiveness of other classification methods. The asymptotes of benchmark curves are shown to be optimal (i.e. at Bayes Error) in some cases (Theorem 2.5.2).

In the unsupervised scenario, the n-TARP method highlights the existence of many different clustering structures in a dataset. However, not all structures present are statistically meaningful. This issue is amplified when the dataset is small, as random events may yield sample sets that exhibit separations that are not present in the distribution of the data. Thus, statistical validation is an important step in data analysis, especially in high-dimensions. However, in order to statistically validate results, often an exponentially increasing number of data samples are required as the dimensions increase. The proposed n-TARP method circumvents this challenge by evaluating statistical significance in the one-dimensional space of data projections. The n-TARP framework also results in several different statistically valid instances of point separation into clusters, as opposed to a unique "best" separation, which leads to a distribution of clusters induced by the random projection process.

The distributions of clusters resulting from n-TARP are studied. This dissertation focuses on small sample high-dimensional problems. A large number of distinct clusters are found, which are statistically validated. The distribution of clusters is studied as the dimensionality of the problem evolves through the extension of the feature space using monomial terms of increasing degree in the original features, which corresponds to investigating non-linear point separations in the projection space.

A statistical framework is introduced to detect patterns of dependence between the clusters formed with the features (predictors) and a chosen outcome (response) in the data that is not used by the clustering method. This framework is designed to detect the existence of a relationship between the predictors and response. This framework can also serve as an alternative cluster validation tool.

The concepts and methods developed in this dissertation are applied to a real world data analysis problem in Engineering Education. Specifically, engineering students' Habits of Mind are analyzed. The data at hand is qualitative, in the form of text, equations and figures. To use the n-TARP based analysis method, the source data must be transformed into quantitative data (vectors). This is done by modeling it as a random process based on the theoretical framework defined by a rubric. Since the number of students is small, this problem falls into the small sample high-dimensions scenario. The n-TARP clustering method is used to find groups within this data in a statistically valid manner. The resulting clusters are analyzed in the context of education to determine what is represented by the identified clusters. The dependence of student performance indicators like the course grade on the clusters formed with n-TARP are studied in the pattern dependence framework, and the observed effect is statistically validated. The data obtained suggests the presence of a large variety of different patterns of Habits of Mind among students, many of which are associated with significant grade differences. In particular, the course grade is found to be dependent on at least two Habits of Mind: "computation and estimation" and "values and attitudes."
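A minimal sketch of the basic n-TARP unit as this abstract describes it, projecting high-dimensional data onto random directions and searching each 1-D projection for a two-cluster separation, might look like the following; the within-cluster-variance criterion and the threshold grid are assumptions standing in for the dissertation's own separation measure and statistical validation:

    # Random-projection clustering unit: project to 1-D along random unit
    # directions and keep the direction/threshold pair with the tightest
    # two-cluster split. Criterion and grid are illustrative assumptions.
    import numpy as np

    def ntarp_unit(X, n_projections=100, seed=0):
        rng = np.random.default_rng(seed)
        best = (np.inf, None, None)
        for _ in range(n_projections):
            w = rng.standard_normal(X.shape[1])
            w /= np.linalg.norm(w)
            z = X @ w                                  # 1-D projection
            for t in np.quantile(z, np.linspace(0.1, 0.9, 17)):
                left, right = z[z <= t], z[z > t]
                if len(left) < 2 or len(right) < 2:
                    continue
                score = left.var() + right.var()       # within-cluster spread
                if score < best[0]:
                    best = (score, w, t)
        return best   # (score, projection direction, threshold)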
APA, Harvard, Vancouver, ISO, and other styles
29

(11184732), Kumar Apurv. "E-scooter Rider Detection System in Driving Environments." Thesis, 2021.

Find full text
Abstract:
E-scooters are ubiquitous and their number keeps escalating, increasing their interactions with other vehicles on the road. E-scooter riders exhibit atypical behavior that differs enormously from that of other vulnerable road users, creating new challenges for vehicle active safety systems and automated driving functionalities. The detection of e-scooter riders by other vehicles is the first step in mitigating these risks. This research presents a novel vision-based system to differentiate between e-scooter riders and regular pedestrians, together with a benchmark dataset of e-scooter riders in natural environments. An efficient system pipeline built from two existing state-of-the-art convolutional neural networks (CNNs), You Only Look Once (YOLOv3) and MobileNetV2, performs detection of these vulnerable e-scooter riders.
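The two-stage pipeline this abstract outlines, a detector proposing person-like boxes followed by a MobileNetV2 classifier deciding rider versus pedestrian per crop, can be sketched as follows; torchvision's Faster R-CNN stands in for YOLOv3 here, and the rider classifier's weights are assumed to have been fine-tuned elsewhere, so this is not the thesis's exact system:

    # Stage 1: generic person detection. Stage 2: crop-level rider classification.
    import torch, torchvision
    from torchvision.transforms.functional import resized_crop

    detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights="DEFAULT").eval()                     # stand-in for YOLOv3
    classifier = torchvision.models.mobilenet_v2(num_classes=2).eval()
    # NOTE: classifier weights are random here; assumed fine-tuned elsewhere.

    @torch.no_grad()
    def detect_riders(image, person_label=1, score_thresh=0.7):
        # image: float CHW tensor in [0, 1]; COCO class 1 is "person"
        out = detector([image])[0]
        riders = []
        for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
            if label.item() != person_label or score < score_thresh:
                continue
            x1, y1, x2, y2 = box.int().tolist()
            crop = resized_crop(image, y1, x1, y2 - y1, x2 - x1, [224, 224])
            if classifier(crop.unsqueeze(0)).argmax(1).item() == 1:
                riders.append(box.tolist())           # class 1 assumed = rider
        return riders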
APA, Harvard, Vancouver, ISO, and other styles
30

Zhu, Jihai. "Low-complexity block dividing coding method for image compression using wavelets : a thesis presented in partial fulfillment of the requirements for the degree of Master of Engineering in Computer Systems Engineering at Massey University, Palmerston North, New Zealand." 2007. http://hdl.handle.net/10179/704.

Full text
Abstract:
Image coding plays a key role in multimedia signal processing and communications. JPEG2000 is the latest image coding standard; it uses the EBCOT (Embedded Block Coding with Optimal Truncation) algorithm. EBCOT exhibits excellent compression performance, but with high complexity. The need to reduce this complexity while maintaining performance similar to EBCOT has inspired a significant amount of research activity in the image coding community. Within the development of image compression techniques based on wavelet transforms, the EZW (Embedded Zerotree Wavelet) and SPIHT (Set Partitioning in Hierarchical Trees) algorithms have played an important role. The EZW algorithm was the first breakthrough in wavelet-based image coding. The SPIHT algorithm achieves similar performance to EBCOT, but with fewer features. Another very important algorithm is SBHP (Sub-band Block Hierarchical Partitioning), which attracted significant investigation during the JPEG2000 development process. In this thesis, the history of the development of the wavelet transform is reviewed, and implementation issues for wavelet transforms are discussed. The four main coding methods mentioned above for image compression using wavelet transforms are studied in detail, and, more importantly, the factors that affect coding efficiency are identified. The main contribution of this research is the introduction of a new low-complexity coding algorithm for image compression based on wavelet transforms. The algorithm is based on block dividing coding (BDC) with an optimised packet assembly. Our extensive simulation results show that the proposed algorithm outperforms JPEG2000 in lossless coding, even though a narrow gap remains in lossy coding situations.
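All four reviewed coders (EZW, SPIHT, SBHP, EBCOT), like the proposed BDC algorithm, operate on a multilevel 2-D wavelet decomposition; a minimal PyWavelets sketch of that common front end, with crude coefficient thresholding in place of the actual entropy-coding stages, might look like this:

    # Multilevel 2-D DWT with the biorthogonal 9/7 wavelet used by JPEG2000,
    # followed by naive coefficient thresholding. Image and threshold are toys.
    import numpy as np
    import pywt

    image = np.random.rand(64, 64)                   # stand-in for a real image
    coeffs = pywt.wavedec2(image, wavelet='bior4.4', level=3)
    arr, slices = pywt.coeffs_to_array(coeffs)

    arr[np.abs(arr) < 0.1] = 0                       # discard small coefficients
    kept = np.count_nonzero(arr) / arr.size
    recon = pywt.waverec2(
        pywt.array_to_coeffs(arr, slices, output_format='wavedec2'),
        wavelet='bior4.4')
    print(f"kept {kept:.1%} of coefficients, "
          f"max error {np.abs(recon - image).max():.3f}")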
APA, Harvard, Vancouver, ISO, and other styles