
Theses on the topic "Classification methods"

See the top 50 dissertations (bachelor's and doctoral theses) on the research topic "Classification methods".

Next to every source in the reference list there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic citation for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication as a .pdf and read its abstract online, if one is included in the metadata.

Browse theses from many scientific fields and compile a correct bibliography.

1

Jamain, Adrien. "Meta-analysis of classification methods". Thesis, Imperial College London, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.413686.

Full text
2

Chzhen, Evgenii. "Plug-in methods in classification". Thesis, Paris Est, 2019. http://www.theses.fr/2019PESC2027/document.

Full text
Abstract:
This manuscript studies several problems of constrained classification. In this framework, our goal is to construct an algorithm that performs as well as the best classifier obeying some desired property. Plug-in type classifiers are well suited to achieve this goal. Interestingly, it is shown that in several setups these classifiers can leverage unlabeled data, that is, they are constructed in a semi-supervised manner. Chapter 2 describes two particular settings of binary classification: classification with the F-score, and classification with equal opportunity. For both problems, semi-supervised procedures are proposed and their theoretical properties are established. In the case of the F-score, the proposed procedure is shown to be optimal in the minimax sense over a standard non-parametric class of distributions. In the case of classification with equal opportunity, the proposed algorithm is shown to be consistent in terms of misclassification risk, and its asymptotic fairness is established. Moreover, for this problem, the proposed procedure outperforms state-of-the-art algorithms in the field. Chapter 3 describes the setting of confidence-set multi-class classification. Again, a semi-supervised procedure is proposed and its nearly minimax optimality is established. It is additionally shown that no supervised algorithm can achieve a so-called fast rate of convergence. In contrast, the proposed semi-supervised procedure can achieve fast rates provided that the amount of unlabeled data is sufficiently large. Chapter 4 describes a setting of multi-label classification where one aims at minimizing the false negative error subject to almost-sure constraints on the classification rules. Two specific constraints are considered: sparse predictions, and predictions with control over false negative errors. For the former, a supervised algorithm is provided and it is shown that this algorithm can achieve fast rates of convergence. For the latter, it is shown that extra assumptions are necessary in order to obtain theoretical guarantees.
3

Gimati, Yousef M. T. "Bootstrapping techniques to improve classification methods". Thesis, University of Leeds, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.401072.

Full text
4

Kobayashi, Izumi. "Randomized ensemble methods for classification trees". Diss., Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2002. http://library.nps.navy.mil/uhtbin/hyperion-image/02sep%5FKobayashi.pdf.

Full text
Abstract:
Thesis (Ph. D. in Operations Research)--Naval Postgraduate School, September 2002.
Dissertation supervisor: Samuel E. Buttrey. Includes bibliographical references (p. 117-119). Also available online.
5

Baker, Jonathan Peter. "Methods of Music Classification and Transcription". BYU ScholarsArchive, 2012. https://scholarsarchive.byu.edu/etd/3330.

Full text
Abstract:
We begin with an overview of some signal processing terms and topics relevant to music analysis including facts about human sound perception. We then discuss common objectives of music analysis and existing methods for accomplishing them. We conclude with an introduction to a new method of automatically transcribing a piece of music from a digital audio signal.
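As a purely illustrative aside (not Baker's method), the first step of many transcription pipelines is estimating the dominant pitch of a short audio frame from its magnitude spectrum. A minimal sketch, using a synthetic 440 Hz tone in place of real audio:

```python
import numpy as np

# Synthesize one second of a 440 Hz sine, a stand-in for a real recording.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
frame = np.sin(2 * np.pi * 440.0 * t)

# Magnitude spectrum of the frame; rfft covers 0 .. Nyquist.
spectrum = np.abs(np.fft.rfft(frame))
freqs = np.fft.rfftfreq(frame.size, d=1.0 / sample_rate)

# The peak bin gives a crude fundamental-frequency estimate.
estimated_pitch = freqs[np.argmax(spectrum)]
```

Real transcription must also handle harmonics, polyphony, and note onsets, which is where the methods surveyed in the thesis come in.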
6

Clibbon, Alex P. "Methods of classification of the cardiotocogram". Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:550bb5ea-bee8-4eb8-95e2-f16c54d7cd68.

Full text
Abstract:
This thesis compares CTG classification techniques proposed in the literature and their potential extensions. A comparison between four previously assessed classifiers (AdaBoost, Artificial Neural Networks (ANN), Random Forest (RF), Support Vector Machine (SVM)) and two proposed classifiers (Bayesian ANN (BANN), Relevance Vector Machine) was conducted using a database of 7,568 cases and two open-source databases. The Random Forest achieved the highest average result and was proposed as a benchmark classifier. The proposal to use model certainty to introduce a third, unclassified, class was investigated using the BANN. An increase in classification accuracy was demonstrated; however, the proportion of cases in the unclassified class was too great to be of practical value. The information content of time series was explored using a Hidden Markov Model (HMM). The average performance of the HMM was comparable with that of the benchmark, with a smaller spread across validation folds, demonstrating that time-series information provides more stable estimates of class than stationary methods. Finally, a method of system identification was implemented. Significant differences between feature trends and histograms in low pH (< 7.1) and healthy pH (≥ 7.1) cases were observed. These features were used as classifier inputs, and achieved performance similar to existing feature sets. When these features were aligned according to the onset of stage 2 labour, three unique trend patterns were discovered.
7

Felldin, Markus. "Machine Learning Methods for Fault Classification". Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-183132.

Full text
Abstract:
This project, conducted at Ericsson AB, investigates the feasibility of implementing machine learning techniques to classify dump files for more efficient trouble report routing. The project focuses on supervised machine learning methods, in particular Bayesian statistics. It shows that a program utilizing Bayesian methods can achieve prediction accuracy well above random. It is therefore concluded that machine learning methods may indeed become a viable alternative to human classification of trouble reports in the near future.
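As an illustration of the kind of Bayesian text classifier evaluated in theses like this one (the Ericsson data and features are not public, so everything below is a hypothetical toy), a multinomial naive Bayes over token counts with Laplace smoothing:

```python
import math
from collections import Counter, defaultdict

# Toy "dump file" excerpts with hypothetical fault labels.
train = [
    ("segfault null pointer dereference", "memory"),
    ("out of memory allocation failed", "memory"),
    ("socket timeout connection refused", "network"),
    ("dns lookup failed network unreachable", "network"),
]

# Count token frequencies per class (multinomial naive Bayes).
class_docs = Counter(label for _, label in train)
token_counts = defaultdict(Counter)
for text, label in train:
    token_counts[label].update(text.split())

vocab = {tok for text, _ in train for tok in text.split()}

def classify(text):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    best, best_score = None, -math.inf
    for label in class_docs:
        total = sum(token_counts[label].values())
        score = math.log(class_docs[label] / len(train))
        for tok in text.split():
            score += math.log((token_counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best
```

A routing system would map the predicted class to the team that owns that fault category; the thesis's point is that even simple Bayesian models beat random assignment.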
8

Beghtol, Clare. "James Duff Brown's Subject Classification and Evaluation Methods for Classification Systems". dLIST, 2004. http://hdl.handle.net/10150/106250.

Full text
Abstract:
James Duff Brown (1862-1914), an important figure in librarianship in late nineteenth and early twentieth century England, made contributions in many areas of his chosen field. His Subject Classification (SC), however, has not received much recognition for its theoretical and practical contributions to bibliographic classification theory and practice in the twentieth century. This paper discusses some of the elements of SC that both did and did not inform future bibliographic classification work, considers some contrasting evaluation methods in the light of advances in bibliographic classification theory and practice and of commentaries on SC, and suggests directions for further research.
9

Ravindran, Sourabh. "Physiologically Motivated Methods For Audio Pattern Classification". Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14066.

Full text
Abstract:
Human-like performance by machines in tasks of speech and audio processing has remained an elusive goal. In an attempt to bridge the gap in performance between humans and machines there has been an increased effort to study and model physiological processes. However, the widespread use of biologically inspired features proposed in the past has been hampered mainly by either the lack of robustness across a range of signal-to-noise ratios or the formidable computational costs. In physiological systems, sensor processing occurs in several stages. It is likely the case that signal features and biological processing techniques evolved together and are complementary or well matched. It is precisely for this reason that modeling the feature extraction processes should go hand in hand with modeling of the processes that use these features. This research presents a front-end feature extraction method for audio signals inspired by the human peripheral auditory system. New developments in the field of machine learning are leveraged to build classifiers to maximize the performance gains afforded by these features. The structure of the classification system is similar to what might be expected in physiological processing. Further, the feature extraction and classification algorithms can be efficiently implemented using the low-power cooperative analog-digital signal processing platform. The usefulness of the features is demonstrated for tasks of audio classification, speech versus non-speech discrimination, and speech recognition. The low-power nature of the classification system makes it ideal for use in applications such as hearing aids, hand-held devices, and surveillance through acoustic scene monitoring.
10

Kim, Heeyoung. "Statistical methods for function estimation and classification". Diss., Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/44806.

Full text
Abstract:
This thesis consists of three chapters. The first chapter focuses on adaptive smoothing splines for fitting functions with varying roughness. In the first part of the first chapter, we study an asymptotically optimal procedure to choose the value of a discretized version of the variable smoothing parameter in adaptive smoothing splines. With the choice given by the multivariate version of the generalized cross validation, the resulting adaptive smoothing spline estimator is shown to be consistent and asymptotically optimal under some general conditions. In the second part, we derive the asymptotically optimal local penalty function, which is subsequently used for the derivation of the locally optimal smoothing spline estimator. In the second chapter, we propose a Lipschitz regularity based statistical model, and apply it to coordinate measuring machine (CMM) data to estimate the form error of a manufactured product and to determine the optimal sampling positions of CMM measurements. Our proposed wavelet-based model takes advantage of the fact that the Lipschitz regularity holds for the CMM data. The third chapter focuses on the classification of functional data which are known to be well separable within a particular interval. We propose an interval based classifier. We first estimate a baseline of each class via convex optimization, and then identify an optimal interval that maximizes the difference among the baselines. Our interval based classifier is constructed based on the identified optimal interval. The derived classifier can be implemented via a low-order-of-complexity algorithm.
11

Gretton, Arthur Lindsey. "Kernel methods for classification and signal separation". Thesis, University of Cambridge, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.615875.

Full text
12

Fong, Wai Lam. "Numerical methods for classification and image restoration". HKBU Institutional Repository, 2013. http://repository.hkbu.edu.hk/etd_ra/1488.

Full text
13

Varnavas, Andreas Soteriou. "Signal processing methods for EEG data classification". Thesis, Imperial College London, 2008. http://hdl.handle.net/10044/1/11943.

Full text
14

Lohr, Marisa. "Methods for the genetic classification of languages". Thesis, University of Cambridge, 1999. https://www.repository.cam.ac.uk/handle/1810/251688.

Full text
15

BRHANIE, BEKALU MULLU. "Multi-Label Classification Methods for Image Annotation". Thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-13725.

Full text
16

Saldanha, Richard A. "Graph-theoretic methods in discrimination and classification". Thesis, University of Oxford, 1998. https://ora.ox.ac.uk/objects/uuid:3a06dee1-00e9-4b56-be8e-e991a570ced6.

Full text
Abstract:
This thesis is concerned with the graphical modelling of multivariate data. The main aim of graphical modelling is to provide an easy to understand visual representation of, often complex, data relationships by fitting graphs to data. The graphs consist of nodes denoting random variables and connecting lines or edges are used to depict variable dependencies. Equivalently, the absence of particular edges in a graph describe conditional independencies between random variables. The resulting structure is called a conditional independence graph. The use of conditional independence graphs as a guide to discrete (mainly binary), normal and mixed conditional Gaussian model building is described. The problem of parameter estimation in fitting conditional Gaussian models is considered. A FORTRAN 77 program called CGM is developed and used to fit conditional Gaussian models. Submodel specification, model selection criteria and goodness-of-fit are explored. A procedure for discriminating between groups is constructed using fitted conditional Gaussian models. A Bayesian classification procedure is considered and is used to compute posterior classification probabilities. Standard bias-correcting error rates are used to test the performance of estimated classification rules. The graph-theoretic methodology described in this thesis is applied to a Scandinavian study of intrauterine foetal growth retardation also known as a small-for-gestational age (SGA) birth. Possible pre-pregnancy risk factors associated with SGA births are investigated using conditional independence graphs and an attempt is made to classify SGA births using fitted conditional Gaussian models.
17

Cope, James S. "Computational methods for the classification of plants". Thesis, Kingston University, 2014. http://eprints.kingston.ac.uk/28759/.

Full text
Abstract:
Plants are of fundamental importance to life on Earth. The shapes of leaves, petals and whole plants are of great significance to plant science, as they can help to distinguish between different species, to measure plant health, and even to model climate change. The current availability of botanists is increasingly failing to meet the growing demands for their expertise. These demands range from amateurs desiring help in identifying plants, to agricultural applications such as automated weeding systems, and to the cataloguing of biodiversity for conservation purposes. This thesis aims to help fill this gap by exploring computational techniques for the automated analysis and classification of plants from images of their leaves. The main objective is to provide novel techniques and the required framework for a robust, automated plant identification system. This involves, firstly, the accurate extraction of different features of the leaf and the generation of appropriate descriptors. One of the biggest challenges involved in working with plants is the high amount of variation that may occur within a species, and the high similarity that exists between some species. Algorithms are introduced which aim to allow accurate classification in spite of this. With many features of the leaf being available for use in classification, a suitable framework is required for combining them. An efficient method is proposed which selects, on a leaf-by-leaf basis, which of the leaf features are most likely to be of use. This decreases computational costs whilst increasing accuracy, by ignoring unsuitable features. Finally, a study is carried out looking at how professional botanists view leaf images. Much can be learnt from the behaviour of experts which can be applied to the task at hand. Eye-tracking technology is used to establish the difference between how botanists and non-botanists view leaf images, and preliminary work is performed towards utilizing this information in an automated system.
18

Hung, Jane Yen. "Making computer vision methods accessible for cell classification". Thesis, Massachusetts Institute of Technology, 2018. https://hdl.handle.net/1721.1/121894.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Chemical Engineering, 2018
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 107-113).
Computers are better than ever at extracting information from visual media like images, which are especially powerful in biology. The field of computer vision tries to take advantage of this fact and use computational algorithms to analyze image data and gain higher level understanding. Recent advances in machine learning such as deep learning based architectures have greatly expanded their potential. However, biologists often lack the training or means to use new software or algorithms, leading to slower or less complete results. This thesis focuses on developing different computer vision methods and software implementations for biological applications that are both easy to use and customizable. The first application is cardiomyocytes, which contain sarcomeric qualities that can be quantified with spectral analysis. Next, CellProfiler Analyst, an updated software application for interactive machine learning classification and feature analysis is described along with its use for classifying imaging flow cytometry data. Further software related advances include the first demonstration of a deep learning based model designed to classify biological images with a user-friendly interface. Finally, blood smear images of malaria-infected blood are examined using traditional machine learning based segmentation pipelines and using novel deep learning based object detection models. To entice further development of these types of object detection models, a software package for simpler object detection training and testing called Keras R-CNN is presented. The applications investigated here show how computer vision can be a viable option for biologists who want to take advantage of their image data.
19

Hofmeyr, David. "Projection methods for clustering and semi-supervised classification". Thesis, Lancaster University, 2016. http://eprints.lancs.ac.uk/87219/.

Full text
Abstract:
This thesis focuses on data projection methods for the purposes of clustering and semi-supervised classification, with a primary focus on clustering. A number of contributions are presented which address this problem in a principled manner; using projection pursuit formulations to identify subspaces which contain useful information for the clustering task. Projection methods are extremely useful in high dimensional applications, and situations in which the data contain irrelevant dimensions which can be counterinformative for the clustering task. The final contribution addresses high dimensionality in the context of a data stream. Data streams and high dimensionality have been identified as two of the key challenges in data clustering. The first piece of work is motivated by identifying the minimum density hyperplane separator in the finite sample setting. This objective is directly related to the problem of discovering clusters defined as connected regions of high data density, which is a widely adopted definition in non-parametric statistics and machine learning. A thorough investigation into the theoretical aspects of this method, as well as the practical task of solving the associated optimisation problem efficiently is presented. The proposed methodology is applied to both clustering and semi-supervised classification problems, and is shown to reliably find low density hyperplane separators in both contexts. The second and third contributions focus on a different approach to clustering based on graph cuts. The minimum normalised graph cut objective has gained considerable attention as relaxations of the objective have been developed, which make them solvable for reasonably well sized problems. This has been adopted by the highly popular spectral clustering methods. 
The second piece of work focuses on identifying the optimal subspace in which to perform spectral clustering, by minimising the second eigenvalue of the graph Laplacian for a graph defined over the data within that subspace. A rigorous treatment of this objective is presented, and an algorithm is proposed for its optimisation. An approximation method is proposed which allows this method to be applied to much larger problems than would otherwise be possible. An extension of this work deals with the spectral projection pursuit method for semi-supervised classification. The third body of work looks at minimising the normalised graph cut using hyperplane separators. This formulation allows for the exact normalised cut to be computed, rather than the spectral relaxation. It also allows for a computationally efficient method for optimisation. The asymptotic properties of the normalised cut based on a hyperplane separator are investigated, and shown to have similarities with the clustering objective based on low density separation. In fact, both the methods in the second and third works are shown to be connected with the first, in that all three have the same solution asymptotically, as their relative scaling parameters are reduced to zero. The final body of work addresses both problems of high dimensionality and incremental clustering in a data stream context. A principled statistical framework is adopted, in which clustering by low density separation again becomes the focal objective. A divisive hierarchical clustering model is proposed, using a collection of low density hyperplanes. The adopted framework provides well founded methodology for determining the number of clusters automatically, and also identifying changes in the data stream which are relevant to the clustering objective. It is apparent that no existing methods can make both of these claims.
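The spectral relaxation this abstract builds on can be sketched in a few lines: build a similarity graph, form the symmetric normalized Laplacian, and split the data by the sign of the eigenvector for the second-smallest eigenvalue. This is a generic illustration on synthetic blobs, not Hofmeyr's subspace-optimising method:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D blobs stand in for clusterable data.
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(4.0, 0.3, (20, 2))])

# Gaussian similarity graph over the points.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-sq_dists / 2.0)
np.fill_diagonal(W, 0.0)

# Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
d = W.sum(1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt

# The eigenvector of the second-smallest eigenvalue (the relaxed
# normalized cut) separates the two groups by sign.
eigvals, eigvecs = np.linalg.eigh(L)
labels = (eigvecs[:, 1] > 0).astype(int)
```

The thesis's contribution is to search for the subspace (projection) in which this second eigenvalue is smallest, rather than clustering in the ambient space as the sketch does.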
20

Bazin, Alexander Ian. "On probabilistic methods for object description and classification". Thesis, University of Southampton, 2006. https://eprints.soton.ac.uk/263161/.

Full text
Abstract:
This thesis extends the utility of probabilistic methods in two diverse domains: multimodal biometrics and machine inspection. The attraction for this approach is that it is easily understood by those using such a system; however the advantages extend beyond the ease of human utility. Probabilistic measures are ideal for combination since they are guaranteed to be within a fixed range and are generally well scaled. We describe the background to probabilistic techniques and critique common implementations used by practitioners. We then set out our novel probabilistic framework for classification and verification, discussing the various optimisations and placing this framework within a data fusion context. Our work on biometrics describes the complex system we have developed for collection of multimodal biometrics, including collection strategies, system components and the modalities employed. We further examine the performance of multimodal biometrics; particularly examining performance prediction, modality correlation and the use of imbalanced classifiers. We show the benefits from score fused multimodal biometrics, even in the imbalanced case and how the decidability index may be used for optimal weighting and performance prediction. In examining machine inspection we describe in detail the development of a complex system for the automated examination of ophthalmic contact lenses. We demonstrate the performance of this system and describe the benefits that complex image processing techniques and probabilistic methods can bring to this field. We conclude by drawing these two areas together, critically evaluating the work and describing further work that we feel is necessary in the field.
21

Wang, Rui. "Comparisons of Classification Methods in Efficiency and Robustness". The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1345564802.

Full text
22

He, Ping. "Classification methods and applications to mass spectral data". HKBU Institutional Repository, 2005. http://repository.hkbu.edu.hk/etd_ra/593.

Full text
23

Isaksson, Ola. "Classification of Flying Qualities with Machine Learning Methods". Thesis, KTH, Flygdynamik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302145.

Full text
Abstract:
The primary objective of this thesis is to evaluate the prospect of machine learning methods being used to classify flying qualities based on simulator data, with the focus on pitch maneuvers. If critical flying qualities can be identified earlier in the verification process, they can be given attention sooner, at a lower cost for design changes to the flight control system. Information from manned simulations with given flying quality levels is used to recreate the performed pitch maneuver in a desktop simulator. The generated flight data is represented by different measures in the classification, which are used to separately train and test the machine learning models against the given flying quality level. The models used are Logistic Regression, Support Vector Machines with radial basis function (RBF), linear and polynomial kernels, and Artificial Neural Networks. The results show that the classifiers correctly identify at least 80% of cases with critical flying qualities. The classification shows that statistical measures of the time signals and first-order time derivatives of pitch, roll and yaw rates are sufficient for classification within the scope of this thesis. The different machine learning models show no significant difference in performance within the scope of this thesis. In conclusion, machine learning methods show good potential for the classification of flying qualities, and could become an important tool for evaluating flying qualities across large numbers of simulations, in addition to manned simulations.
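As a toy illustration of this classification setup (the simulator data is not public, so the two summary features below — mean and standard deviation of pitch rate per maneuver — are hypothetical), logistic regression fitted by plain gradient descent:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical per-maneuver features: [mean pitch rate, std of pitch rate].
n = 100
ok = rng.normal([0.0, 0.5], 0.2, (n, 2))        # acceptable flying qualities
critical = rng.normal([0.0, 1.5], 0.2, (n, 2))  # oscillatory / critical
X = np.vstack([ok, critical])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Logistic regression via gradient descent on the log-loss.
Xb = np.hstack([X, np.ones((2 * n, 1))])  # append a bias column
w = np.zeros(3)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    w -= 0.1 * Xb.T @ (p - y) / len(y)

# Training accuracy at the 0.5 decision threshold.
accuracy = ((1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5) == y).mean()
```

The thesis compares this kind of linear model against kernel SVMs and neural networks on the same statistical features; on synthetic data this well separated, all of them classify nearly perfectly.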
Gli stili APA, Harvard, Vancouver, ISO e altri
24

Lamont, Morné Michael Connell. "Binary classification trees : a comparison with popular classification methods in statistics using different software". Thesis, Stellenbosch : Stellenbosch University, 2002. http://hdl.handle.net/10019.1/52718.

Testo completo
Abstract (sommario):
Thesis (MComm) -- Stellenbosch University, 2002.
ENGLISH ABSTRACT: Consider a data set with a categorical response variable and a set of explanatory variables. The response variable can have two or more categories and the explanatory variables can be numerical or categorical. This is a typical setup for a classification analysis, where we want to model the response based on the explanatory variables. Traditional statistical methods have been developed under certain assumptions such as: the explanatory variables are numeric only and/or the data follow a multivariate normal distribution. In practice such assumptions are not always met. Different research fields generate data that have a mixed structure (categorical and numeric) and researchers are often interested in using all these data in the analysis. In recent years robust methods such as classification trees have become the substitute for traditional statistical methods when the above assumptions are violated. Classification trees are not only an effective classification method, but offer many other advantages. The aim of this thesis is to highlight the advantages of classification trees. In the chapters that follow, the theory of and further developments on classification trees are discussed. This forms the foundation for the CART software which is discussed in Chapter 5, as well as other software in which classification tree modeling is possible. We will compare classification trees to parametric-, kernel- and k-nearest-neighbour discriminant analyses. A neural network is also compared to classification trees and finally we draw some conclusions on classification trees and how they compare with other methods.
AFRIKAANSE OPSOMMING: Beskou 'n datastel met 'n kategoriese respons veranderlike en 'n stel verklarende veranderlikes. Die respons veranderlike kan twee of meer kategorieë hê en die verklarende veranderlikes kan numeries of kategories wees. Hierdie is 'n tipiese opset vir 'n klassifikasie analise, waar ons die respons wil modelleer deur gebruik te maak van die verklarende veranderlikes. Tradisionele statistiese metodes is ontwikkel onder sekere aannames soos: die verklarende veranderlikes is slegs numeries en/of dat die data 'n meerveranderlike normaal verdeling het. In die praktyk word daar nie altyd voldoen aan hierdie aannames nie. Verskillende navorsingsvelde genereer data wat 'n gemengde struktuur het (kategories en numeries) en navorsers wil soms al hierdie data gebruik in die analise. In die afgelope jare het robuuste metodes soos klassifikasie bome die alternatief geword vir tradisionele statistiese metodes as daar nie aan bogenoemde aannames voldoen word nie. Klassifikasie bome is nie net 'n effektiewe klassifikasie metode nie, maar bied baie meer voordele. Die doel van hierdie werkstuk is om die voordele van klassifikasie bome uit te wys. In die hoofstukke wat volg word die teorie en verdere ontwikkelinge van klassifikasie bome bespreek. Hierdie vorm die fondament vir die CART sagteware wat bespreek word in Hoofstuk 5, asook ander sagteware waarin klassifikasie boom modelering moontlik is. Ons sal klassifikasie bome vergelyk met parametriese-, "kernel"- en "k-nearest-neighbour" diskriminant analise. 'n Neurale netwerk word ook vergelyk met klassifikasie bome en ten slotte word daar gevolgtrekkings gemaak oor klassifikasie bome en hoe dit vergelyk met ander metodes.
Gli stili APA, Harvard, Vancouver, ISO e altri
25

Randolph, Tami Rochele. "Image compression and classification using nonlinear filter banks". Diss., Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/13439.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
26

Gasanova, Tatiana [Verfasser]. "Novel methods for text preprocessing and classification / Tatiana Gasanova". Ulm : Universität Ulm. Fakultät für Ingenieurwissenschaften und Informatik, 2015. http://d-nb.info/1075568404/34.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
27

Wang, Wei. "Predictive modeling based on classification and pattern matching methods". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0019/MQ51498.pdf.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
28

Delmege, James W. "CLASS : a study of methods for coarse phonetic classification /". Online version of thesis, 1988. http://hdl.handle.net/1850/10449.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
29

Ding, Yunfei. "Application of Clustering and Classification Methods to Pattern Recognition". Thesis, University of Sheffield, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.511954.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
30

Nasser, Sara. "Fuzzy methods for meta-genome sequence classification and assembly". abstract and full text PDF (free order & download UNR users only), 2008. http://0-gateway.proquest.com.innopac.library.unr.edu/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3307706.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
31

Le, Truc Duc. "Machine Learning Methods for 3D Object Classification and Segmentation". Thesis, University of Missouri - Columbia, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=13877153.

Testo completo
Abstract (sommario):

Object understanding is a fundamental problem in computer vision and has been extensively researched in recent years thanks to the availability of powerful GPUs and labelled data, especially in the context of images. However, 3D object understanding is still not on par with its 2D counterpart, and deep learning for 3D has not been fully explored yet. In this dissertation, I work on two approaches, both of which advance the state-of-the-art results in 3D classification and segmentation.

The first approach, called MVRNN, is based on the multi-view paradigm. In contrast to MVCNN, which does not generate consistent results across different views, our MVRNN treats the multi-view images as a temporal sequence, correlates the features and generates coherent segmentation across different views. MVRNN demonstrated state-of-the-art performance on the Princeton Segmentation Benchmark dataset.

The second approach, called PointGrid, is a hybrid method which combines points with a regular grid structure. 3D points retain fine details but are irregular, which is challenging for deep learning methods. A volumetric grid is simple and has a regular structure, but does not scale well with data resolution. Our PointGrid, which is simple, allows the fine details to be consumed by normal convolutions under a coarser resolution grid. PointGrid achieved state-of-the-art performance on the ModelNet40 and ShapeNet datasets in 3D classification and object part segmentation.

Gli stili APA, Harvard, Vancouver, ISO e altri
32

Andrews, Suzanne L. D. (Suzanne Lois Denise). "A classification of carbon footprint methods used by companies". Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/51642.

Testo completo
Abstract (sommario):
Thesis (M. Eng. in Logistics)--Massachusetts Institute of Technology, Engineering Systems Division, 2009.
Includes bibliographical references (leaves 50-54).
The increasing concentration of greenhouse gases (GHG) in the atmosphere can be harmful to the environment. There is no single preferred method for measuring GHG output. How can a company classify and choose an appropriate method? This thesis offers a classification of current methods used by companies to measure their GHG output.
by Suzanne L. D. Andrews.
M.Eng.in Logistics
Gli stili APA, Harvard, Vancouver, ISO e altri
33

Olfert, Jason Scott. "On new methods of ultra-fine particle mass classification". Thesis, University of Cambridge, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.614186.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
34

Danielsson, Benjamin. "A Study on Text Classification Methods and Text Features". Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-159992.

Testo completo
Abstract (sommario):
When it comes to the task of classification, the data used for training is the most crucial part. It follows that how this data is processed and presented to the classifier plays an equally important role. This thesis attempts to investigate the performance of multiple classifiers depending on the features that are used, the type of classes to classify and the optimization of said classifiers. The classifiers of interest are support vector machines (SMO) and multilayer perceptrons (MLP); the features tested are word vector spaces and text complexity measures, along with principal component analysis (PCA) on the complexity measures. The features are created based on the Stockholm-Umeå-Corpus (SUC) and DigInclude, a dataset containing standard and easy-to-read sentences. For the SUC dataset the classifiers attempted to classify texts into nine different text categories, while for the DigInclude dataset the sentences were classified into either standard or simplified classes. The classification tasks on the DigInclude dataset showed poor performance in all trials. The SUC dataset showed the best performance when using SMO in combination with word vector spaces. Comparing the SMO classifier on the text complexity measures with and without PCA showed that the performance was largely unchanged between the two, although not using PCA gave slightly better performance.
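The PCA-versus-no-PCA comparison described above can be sketched as follows, with random numbers standing in for the text-complexity features (the feature counts, labels and component number are illustrative assumptions, not the thesis's actual data):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Synthetic stand-ins for 20 text-complexity measures per document.
X = rng.normal(size=(300, 20))
# Illustrative two-class labels driven by two of the measures.
y = (X[:, 0] - X[:, 1] > 0).astype(int)

svm_raw = SVC(kernel="linear")                                 # features as-is
svm_pca = make_pipeline(PCA(n_components=10), SVC(kernel="linear"))  # PCA first
acc_raw = cross_val_score(svm_raw, X, y, cv=5).mean()
acc_pca = cross_val_score(svm_pca, X, y, cv=5).mean()
```

Comparing `acc_raw` and `acc_pca` mirrors the thesis's finding that performance may change little when PCA is added.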
Gli stili APA, Harvard, Vancouver, ISO e altri
35

Newling, James. "Novel methods of supernova classification and type probability estimation". Master's thesis, University of Cape Town, 2011. http://hdl.handle.net/11427/11174.

Testo completo
Abstract (sommario):
Future photometric surveys will provide vastly more supernovae than have presently been observed, the majority of which will not be spectroscopically typed. Key to extracting information from these future datasets will be the efficient use of light-curves. In the first part of this thesis we introduce two methods for distinguishing type Ia supernovae from their contaminating counterparts, kernel density estimation and boosting. In the second half of this thesis we shift focus from classification to the related problem of type probability estimation, and ask how best to use type probabilities.
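A minimal sketch of kernel density estimation used as a classifier, in the spirit of the first method mentioned above: one density is fitted per class, and a point is assigned to the class with the higher estimated density. The two-dimensional "light-curve features" and the separation between populations are invented for the example:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(2)
# Two synthetic populations of light-curve summary features
# (stand-ins for type Ia vs. non-Ia supernovae).
X_ia = rng.normal(loc=0.0, size=(200, 2))
X_non = rng.normal(loc=2.5, size=(200, 2))

kde_ia = KernelDensity(bandwidth=0.5).fit(X_ia)
kde_non = KernelDensity(bandwidth=0.5).fit(X_non)

def classify(X):
    """Assign each row to the class with the higher estimated
    log-density (equal priors assumed)."""
    return np.where(kde_ia.score_samples(X) > kde_non.score_samples(X),
                    "Ia", "non-Ia")

preds = classify(np.vstack([X_ia, X_non]))
truth = np.array(["Ia"] * 200 + ["non-Ia"] * 200)
train_accuracy = float(np.mean(preds == truth))
```

The log-densities returned by `score_samples` can also be turned into type probabilities by normalising over classes, which connects to the second half of the thesis.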
Gli stili APA, Harvard, Vancouver, ISO e altri
36

Kelley, Edward T. II. "Comparative Analysis of Obesity Classification Methods in Aging Adults". Bowling Green State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1429283749.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
37

Towey, David John. "SPECT imaging and automatic classification methods in movement disorders". Thesis, Imperial College London, 2013. http://hdl.handle.net/10044/1/11182.

Testo completo
Abstract (sommario):
This work investigates neuroimaging as applied to movement disorders by the use of radionuclide imaging techniques. There are two focuses in this work: 1) The optimisation of the SPECT imaging process including acquisition and image reconstruction. 2) The development and optimisation of automated analysis techniques. The first part has included practical measurements of camera performance using a range of phantoms. Filtered back projection and iterative methods of image reconstruction were compared and optimised. Compensation methods for attenuation and scatter are assessed. Iterative methods are shown to improve image quality over filtered back projection for a range of image quality indexes. Quantitative improvements are shown when attenuation and scatter compensation techniques are applied, but at the expense of increased noise. The clinical acquisition and processing procedures were adjusted accordingly. A large database of clinical studies was used to compare commercially available DaTSCAN quantification software programs. A novel automatic analysis technique was then developed by combining Principal Component Analysis (PCA) and machine learning techniques (including Support Vector Machines and Naive Bayes). The accuracy of the various classification methods under different conditions is investigated and discussed. The thesis concludes that the described method can allow automatic classification of clinical images with accuracy equal to or greater than that of commercially available systems.
Gli stili APA, Harvard, Vancouver, ISO e altri
38

Gewehr, Jan Erik. "New Methods for the Prediction and Classification of Protein Domains". Diss., lmu, 2007. http://nbn-resolving.de/urn:nbn:de:bvb:19-80287.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
39

Kandaswamy, Krishna Kumar [Verfasser]. "Sequence function classification by machine learning methods / Krishna Kumar Kandaswamy". Lübeck : Zentrale Hochschulbibliothek Lübeck, 2012. http://d-nb.info/1023624257/34.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
40

Nilsson, Daniel. "Investigating the effect of microarray preprocessing methods on tumor classification". Thesis, Uppsala universitet, Statistiska institutionen, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-258951.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
41

Mohamed, Ghada. "Text classification in the BNC using corpus and statistical methods". Thesis, Lancaster University, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.658020.

Testo completo
Abstract (sommario):
The main part of this thesis sets out to develop a system of categories within a text typology. Although there exist many different approaches to the classification of text into categories, this research fills a gap in the literature, as most work on text classification is based on features external to the text such as the text's purpose, the aim of discourse, and the medium of communication. Text categories that have been set up based on some external features are not linguistically defined. In consequence, texts that belong to the same type are not necessarily similar in their linguistic forms. Even Biber's (1988) linguistically-oriented work was based on externally defined registers. Further, establishing text categories based on text-external features favours theoretical and qualitative approaches to text classification. These approaches can be seen as top-down approaches where external features are defined functionally in advance, and subsequently patterns of linguistic features are described in relation to each function. In such a case, the process of linking texts with a particular type is not done in a systematic way. In this thesis, I show how a text typology based on similarities in linguistic form can be developed systematically using a multivariate statistical technique; namely, cluster analysis. Following a review of various possible approaches to multivariate statistical analysis, I argue that cluster analysis is the most appropriate for systematising the study of text classification, because it has the distinctive feature of placing objects into distinct groupings based on their overall similarities across multiple variables. Cluster analysis identifies these groupings algorithmically. The objects to be clustered in my thesis are the written texts in the British National Corpus (BNC). I will make use of the written part only, since results of previous research which attempted to classify texts of this dataset were not very beneficial.
Takahashi (2006), for instance, identified merely a broad distinction between formal and informal styles in the written part; whereas in the spoken part, he could come up with insightful results. Thus, it seems justifiable to look at the part of the BNC which Takahashi found intractable, using a different multivariate technique, to see if this methodology allows patterns to emerge in the dataset. Further, there are two other reasons to use the written BNC. First, some studies (e.g. Akinnaso 1982; Chafe and Danielewicz 1987) suggest that distinctions between text varieties based on frequencies of linguistic features can be identified even within one mode of communication, i.e. writing. Second, analysing written text varieties has direct implications for pedagogy (Biber and Conrad 2009). The variables measured in the written texts of the BNC are linguistic features that have functional associations. However, any linguistic feature can be interpreted functionally; hence, we cannot say that there is an easy way to decide on a list of linguistic features to investigate text varieties. In this thesis, the list of linguistic features is informed by some aspects of Systemic Functional Theory (SFT) and characteristics identified in previous research on writing, as opposed to speech. SFT lends itself to the interpretation of how language is used through functional associations of linguistic features, treating meaning and form as two inseparable notions. This characteristic of SFT can be one source to inform my research to some extent, which assumes that a model of text-types can be established by investigating not only the linguistic features shared in each type, but also the functions served by these linguistic features in each type. However, there is no commitment in this study to aspects of SFT other than those I have discussed here.
Similarly, the linguistic features that reflect characteristics of speech and writing identified in previous research also have a crucial role in distinguishing between different texts. For instance, writing is elaborate, and this is associated with linguistic features such as subordinate clauses, prepositional phrases, adjectives, and so on. However, these characteristics do not only reflect the distinction between speech and writing; they can also distinguish between different spoken texts or different written texts (see Akinnaso 1982). Thus, the linguistic features seen as important from these two perspectives are included in my list of linguistic features. To make the list more principled and exhaustive, I also consult a comprehensive corpus-based work on the English language, along with some microscopic studies examining individual features in different registers. The linguistic features include personal pronouns, passive constructions, prepositional phrases, nominalisation, modal auxiliaries, adverbs, and adjectives. Computing a cluster analysis based on this data is a complex process with many steps. At each step, several alternative techniques are available. Choosing among the available techniques is a non-trivial decision, as multiple alternatives are in common use by statisticians. I demonstrate a process of testing several combinations of clustering methods in order to determine the most useful/stable clustering combination(s) for use in the classification of texts by their linguistic features. To test the robustness of the clustering algorithms and to validate the cluster analysis, I use three validation techniques for cluster analysis, namely the cophenetic coefficient, the adjusted Rand index, and the AU p-value. The findings of the cluster analysis represent a plausible attempt to systematise the study of the diversity of texts by means of automatic classification. Initially, the cluster analysis resulted in 16 clusters/text types.
However, a thorough investigation of those 16 clusters reveals that some clusters represent quite similar text types. Thus, it is possible to establish overall headings for similar types, reflecting their shared linguistic features. The resulting typology contains six major text types: persuasion, narration, informational narration, exposition, scientific exposition, and literary exposition. Cluster analysis thus proves to be a powerful tool for structuring the data, if used with caution. The way it is implemented in this study constitutes an advance in the field of text typology. Finally, a small-scale case study of the validity of the text typology is carried out. A questionnaire is used to find out whether and to what extent my taxonomy corresponds to native speakers' understanding of textual variability, that is, whether the taxonomy has some mental reality for native speakers of English. The results showed that native speakers of English, on the one hand, are good at explicitly identifying the grammatical features associated with scientific exposition and narration; but on the other hand, they are not so good at identifying the grammatical features associated with literary exposition and persuasion. The results also showed that participants seem to have difficulties in identifying grammatical features of informational narration. The results of this small-scale case study indicate that the text typology in my thesis is, to some extent, a phenomenon that native speakers are aware of, and thus we can justify placing our trust in the results - at least in their general pattern, if not in every detail.
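Two of the validation measures named in this abstract, the cophenetic correlation and the adjusted Rand index, can be computed directly with standard tools. The sketch below runs average-linkage hierarchical clustering on two well-separated synthetic groups (a toy stand-in for the BNC feature vectors):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet, fcluster
from scipy.spatial.distance import pdist
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(3)
# Two well-separated synthetic groups standing in for text types.
X = np.vstack([rng.normal(0.0, 0.3, size=(30, 5)),
               rng.normal(3.0, 0.3, size=(30, 5))])
truth = np.array([0] * 30 + [1] * 30)

d = pdist(X)                              # pairwise distances
Z = linkage(d, method="average")          # average-linkage dendrogram
coph_corr, _ = cophenet(Z, d)             # cophenetic correlation coefficient
labels = fcluster(Z, t=2, criterion="maxclust")  # cut into two clusters
ari = adjusted_rand_score(truth, labels)  # agreement with the true grouping
```

A cophenetic correlation near 1 indicates the dendrogram faithfully preserves the original distances; an adjusted Rand index near 1 indicates the cut recovers the true grouping.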
Gli stili APA, Harvard, Vancouver, ISO e altri
42

Oosthuizen, Surette. "Variable selection for kernel methods with application to binary classification". Thesis, Stellenbosch : University of Stellenbosch, 2008. http://hdl.handle.net/10019.1/1301.

Testo completo
Abstract (sommario):
Thesis (PhD (Statistics and Actuarial Science))—University of Stellenbosch, 2008.
The problem of variable selection in binary kernel classification is addressed in this thesis. Kernel methods are fairly recent additions to the statistical toolbox, having originated approximately two decades ago in machine learning and artificial intelligence. These methods are growing in popularity and are already frequently applied in regression and classification problems. Variable selection is an important step in many statistical applications. Thereby a better understanding of the problem being investigated is achieved, and subsequent analyses of the data frequently yield more accurate results if irrelevant variables have been eliminated. It is therefore obviously important to investigate aspects of variable selection for kernel methods. Chapter 2 of the thesis is an introduction to the main part presented in Chapters 3 to 6. In Chapter 2 some general background material on kernel methods is firstly provided, along with an introduction to variable selection. Empirical evidence is presented substantiating the claim that variable selection is a worthwhile enterprise in kernel classification problems. Several aspects which complicate variable selection in kernel methods are discussed. An important property of kernel methods is that the original data are effectively transformed before a classification algorithm is applied to them. The space in which the original data reside is called input space, while the transformed data occupy part of a feature space. In Chapter 3 we investigate whether variable selection should be performed in input space or rather in feature space. A new approach to selection, so-called feature-to-input space selection, is also proposed. This approach has the attractive property of combining information generated in feature space with easy interpretation in input space. An empirical study reveals that effective variable selection requires utilisation of at least some information from feature space.
Having confirmed in Chapter 3 that variable selection should preferably be done in feature space, the focus in Chapter 4 is on two classes of selection criteria operating in feature space: criteria which are independent of the specific kernel classification algorithm and criteria which depend on this algorithm. In this regard we concentrate on two kernel classifiers, viz. support vector machines and kernel Fisher discriminant analysis, both of which are described in some detail in Chapter 4. The chapter closes with a simulation study showing that two of the algorithm-independent criteria are very competitive with the more sophisticated algorithm-dependent ones. In Chapter 5 we incorporate a specific strategy for searching through the space of variable subsets into our investigation. Evidence in the literature strongly suggests that backward elimination is preferable to forward selection in this regard, and we therefore focus on recursive feature elimination. Zero- and first-order forms of the new selection criteria proposed earlier in the thesis are presented for use in recursive feature elimination and their properties are investigated in a numerical study. It is found that some of the simpler zero-order criteria perform better than the more complicated first-order ones. Up to the end of Chapter 5 it is assumed that the number of variables to select is known. We do away with this restriction in Chapter 6 and propose a simple criterion which uses the data to identify this number when a support vector machine is used. The proposed criterion is investigated in a simulation study and compared to cross-validation, which can also be used for this purpose. We find that the proposed criterion performs well. The thesis concludes in Chapter 7 with a summary and several discussions for further research.
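Recursive feature elimination with a linear support vector machine, the backward-elimination strategy this abstract focuses on, can be sketched as follows. The data are synthetic, with only the first two of ten variables carrying class information (an illustrative assumption):

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 10))
# Only the first two variables carry class information (illustrative).
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Repeatedly refit a linear SVM and drop the variable with the
# smallest coefficient magnitude until two variables remain.
selector = RFE(SVC(kernel="linear"), n_features_to_select=2).fit(X, y)
selected = np.flatnonzero(selector.support_)
```

On this toy problem the elimination should retain the two informative variables; with real data the number of variables to keep is itself a tuning question, which is what Chapter 6 of the thesis addresses.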
Gli stili APA, Harvard, Vancouver, ISO e altri
43

Wei, Xuelian. "Statistical methods in classification problems using gene expression / proteomic signatures". Diss., Restricted to subscribing institutions, 2008. http://proquest.umi.com/pqdweb?did=1680042151&sid=2&Fmt=2&clientId=1564&RQT=309&VName=PQD.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
44

PERES, RODRIGO TOSTA. "NEW TECHNIQUES OF PATTERN CLASSIFICATION BASED ON LOCAL-GLOBAL METHODS". PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2008. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=12959@1.

Testo completo
Abstract (sommario):
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
O foco desta tese está direcionado a problemas de Classificação de Padrões. A proposta central é desenvolver e testar alguns novos algoritmos para ambientes supervisionados, utilizando um enfoque local- global. As principais contribuições são: (i) Desenvolvimento de método baseado em quantização vetorial com posterior classificação supervisionada local. O objetivo é resolver o problema de classificação estimando as probabilidades posteriores em regiões próximas à fronteira de decisão; (ii) Proposta do que denominamos Zona de Risco Generalizada, um método independente de modelo, para encontrar as observações vizinhas à fronteira de decisão; (iii) Proposta de método que denominamos Quantizador Vetorial das Fronteiras de Decisão, um método de classificação que utiliza protótipos, cujo objetivo é construir uma aproximação quantizada das regiões vizinhas à fronteira de decisão. Todos os métodos propostos foram testados em bancos de dados, alguns sintéticos e outros publicamente disponíveis.
This thesis is focused on Pattern Classification problems. The objective is to develop and test new supervised algorithms with a local-global approach. The main contributions are: (i) A method based on vector quantization with posterior supervised local classification, where the classification problem is solved by estimating the posterior probabilities near the decision boundary; (ii) A proposal of what we call the Zona de Risco Generalizada (Generalized Risk Zone), a model-independent method to find observations near the decision boundary; (iii) A proposal of what we call the Quantizador Vetorial das Fronteiras de Decisão (Decision Boundary Vector Quantizer), a classification method based on prototypes that builds a quantized approximation of the decision boundary. All methods were tested on synthetic or real datasets.
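A toy sketch of the prototype idea underlying these contributions: quantize each class separately with k-means and classify by the nearest prototype. This is a generic nearest-prototype rule, not the thesis's specific algorithms; the data and codebook size are invented:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
# Two illustrative classes with some overlap near the decision boundary.
X0 = rng.normal(0.0, 1.0, size=(150, 2))
X1 = rng.normal(2.5, 1.0, size=(150, 2))

# Quantize each class separately; the centroids act as its prototypes.
proto0 = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X0).cluster_centers_
proto1 = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X1).cluster_centers_

def classify(x):
    # Nearest-prototype rule over the two codebooks.
    d0 = np.linalg.norm(proto0 - x, axis=1).min()
    d1 = np.linalg.norm(proto1 - x, axis=1).min()
    return 0 if d0 < d1 else 1

acc = float(np.mean([classify(x) == 0 for x in X0] +
                    [classify(x) == 1 for x in X1]))
```

The thesis's methods refine this picture by concentrating modeling effort on the regions where the two codebooks meet, i.e. near the decision boundary.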
Gli stili APA, Harvard, Vancouver, ISO e altri
45

Makinde, Olusola Samuel. "On some classification methods for high dimensional and functional data". Thesis, University of Birmingham, 2015. http://etheses.bham.ac.uk//id/eprint/5568/.

Testo completo
Abstract (sommario):
In this study, we propose a classification method based on multivariate ranks. We show that this classifier is the Bayes rule under suitable conditions. Multivariate ranks are not invariant under affine transformations of the data, so the effect of deviation from the property of spherical symmetry is investigated. Based on this, we construct an affine invariant version of this classifier. When the distributions of the competing populations have different covariance matrices, the minimum rank classifier performs poorly irrespective of affine invariance. To overcome this limitation, we propose a classifier based on the multivariate rank region. The asymptotic properties of this method and its associated probability of misclassification are studied. We also propose classifiers based on the distribution of the spatial rank and establish some theoretical results for this classification method. For the affine invariant version of this method, two invariants are proposed. Many multivariate techniques fail to perform well when the data are curves or functions. We propose a classification method based on the L2 distance to the spatial median and later generalise it to the Lp distance to the Lp median. The optimal choice of p is determined by cross-validation of misclassification errors. The performance of our proposed methods is examined using simulations and real data sets, and the results are compared with those from existing methods.
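The distance-to-spatial-median rule mentioned at the end of this abstract can be sketched with Weiszfeld's iteration for the geometric (L2) median; the two Gaussian classes below are a synthetic illustration, not the thesis's data:

```python
import numpy as np

def spatial_median(X, iters=100):
    """L2 (geometric) median via Weiszfeld's iteration."""
    m = X.mean(axis=0)
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(X - m, axis=1), 1e-9)  # avoid /0
        w = 1.0 / d
        m = (w[:, None] * X).sum(axis=0) / w.sum()
    return m

rng = np.random.default_rng(6)
A = rng.normal(0.0, 1.0, size=(100, 3))
B = rng.normal(3.0, 1.0, size=(100, 3))
mA, mB = spatial_median(A), spatial_median(B)

def classify(x):
    # Assign to the class whose spatial median is nearest in L2 distance.
    return "A" if np.linalg.norm(x - mA) < np.linalg.norm(x - mB) else "B"

acc = float(np.mean([classify(x) == "A" for x in A] +
                    [classify(x) == "B" for x in B]))
```

The thesis's Lp generalisation replaces the L2 norm with an Lp norm and chooses p by cross-validation of misclassification errors.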
Gli stili APA, Harvard, Vancouver, ISO e altri
46

Zowid, Fauzi Mohammed. "Development and performance evaluation of multi-criteria inventory classification methods". Thesis, Bordeaux, 2020. http://www.theses.fr/2020BORD0331.

Testo completo
Abstract (sommario):
This thesis deals with the issue of inventory classification within supply chains. More specifically, it aims to provide new alternative classification methods to address the multi-criteria inventory classification (MCIC) problem. The ABC inventory classification technique is widely used to streamline inventory systems composed of thousands of stock-keeping units (SKUs). Single-criterion inventory classification (SCIC) methods are often used in practice, and recently MCIC techniques have also attracted researchers and practitioners. With regard to MCIC, a large number of methods have been developed in the literature, belonging to three main approaches: (1) machine learning (ML), (2) mathematical programming (MP), and (3) multi-criteria decision making (MCDM). Within the ML approach, many supervised methods have been proposed, as well as a number of hybrid methods; however, to the best of our knowledge, very few studies have considered unsupervised ML. Within the MP approach, a number of methods have been developed using linear and non-linear programming, such as the Ng and ZF methods, yet most of them still require further improvement to address their shortcomings. Within the MCDM approach, several methods have been proposed to provide ABC classifications, including TOPSIS (technique for order preference by similarity to ideal solution), well known for its wide attractiveness and use, as well as some hybrid TOPSIS methods. It is worth noting that most published studies have focused only on providing classification methods to rank the SKUs in an inventory system, with little interest in the original and most important goal of this exercise: achieving a combined service-cost inventory performance, i.e. maximizing service levels while minimizing inventory costs.
Moreover, most existing studies have not considered large, real-life datasets when recommending an MCIC method for real-life implementation. This thesis therefore first evaluates the inventory performance (cost and service) of existing MCIC methods and then proposes various alternative classification methods that lead to higher service and cost performance. More specifically, three unsupervised machine learning methods are proposed and analyzed: agglomerative hierarchical clustering, the Gaussian mixture model, and K-means. In addition, hybrid methods within the MP and MCDM approaches are also developed; these represent a hybridization of the TOPSIS and Ng methods with the triangular distribution, simple additive weighting (SAW), and the multi-objective optimization method by ratio analysis (MOORA). To conduct our research, the thesis empirically analyzes the performance of the proposed methods by means of two datasets containing more than nine thousand SKUs in total. The first is a benchmark dataset originating from a hospital respiratory therapy unit, often used in the literature dealing with MCIC methods, composed of 47 SKUs. The second consists of 9,086 SKUs and comes from a retailer in the Netherlands that sells do-it-yourself products. The performance of the proposed methods is compared to that of existing MCIC methods in the literature. The empirical results reveal that the proposed methods deliver promising performance, leading to higher combined service-cost efficiency, notably on the second, much larger dataset.
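As a point of reference for the classification task this thesis studies, the classic single-criterion ABC scheme it departs from can be sketched in a few lines: SKUs are ranked by annual usage value and assigned to classes by cumulative value share. This is an illustrative sketch only; the SKU values, the 80%/95% cut-offs, and the function name are hypothetical, not taken from the thesis.

```python
# Illustrative sketch: classic single-criterion ABC classification.
# Rank SKUs by annual usage value, then cut into classes by cumulative
# value share (A: top ~80% of value, B: next ~15%, C: remainder).
def abc_classify(skus, thresholds=(0.80, 0.95)):
    """skus: dict mapping SKU id -> annual usage value."""
    total = sum(skus.values())
    ranked = sorted(skus, key=skus.get, reverse=True)
    classes, cumulative = {}, 0.0
    for sku in ranked:
        cumulative += skus[sku] / total
        if cumulative <= thresholds[0]:
            classes[sku] = "A"
        elif cumulative <= thresholds[1]:
            classes[sku] = "B"
        else:
            classes[sku] = "C"
    return classes

# Hypothetical demo data: a handful of high-value SKUs absorb class A.
demo = {"S1": 5200, "S2": 2600, "S3": 900, "S4": 600, "S5": 400, "S6": 300}
classes = abc_classify(demo)
```

Under this scheme a single criterion (annual usage value) decides everything; the thesis's point is precisely that such a ranking ignores criticality, lead time, and the other criteria that MCIC methods take into account.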
APA, Harvard, Vancouver, ISO and other styles
47

Sampaio de Rezende, Rafael. "New methods for image classification, image retrieval and semantic correspondence". Thesis, Paris Sciences et Lettres (ComUE), 2017. http://www.theses.fr/2017PSLEE068/document.

Full text
Abstract (summary):
The problem of image representation is at the heart of computer vision. The choice of features extracted from an image changes according to the task we want to study: large image retrieval databases demand a compressed global vector representing each image, whereas a semantic segmentation problem requires a clustering map of its pixels. Machine learning techniques are the main tool used to construct these representations. In this manuscript, we address the learning of visual features for three distinct problems: image retrieval, semantic correspondence, and image classification. First, we study the dependency of a Fisher vector representation on the Gaussian mixture model used as its codewords. We introduce the use of multiple Gaussian mixture models for different backgrounds, e.g. different scene categories, and analyze the performance of these representations for object classification and the impact of the scene category as a latent variable. Our second approach proposes an extension to the exemplar SVM (ESVM) feature encoding pipeline. We first show that, by replacing the hinge loss with the square loss in the ESVM cost function, similar image retrieval results can be obtained at a fraction of the computational cost. We call this model the square-loss exemplar machine (SLEM). Secondly, we introduce a kernelized SLEM variant which enjoys the same computational advantages but displays improved performance. We present experiments that establish the performance and efficiency of our methods using a large array of base feature representations and standard image retrieval datasets. Finally, we propose a deep neural network for the problem of establishing semantic correspondence. We employ object proposal boxes as elements for matching and construct an architecture that simultaneously learns the appearance representation and geometric consistency. We propose new geometric consistency scores tailored to the neural network's architecture.
Our model is trained on image pairs obtained from the keypoints of a benchmark dataset and evaluated on several standard datasets, outperforming both recent deep learning architectures and previous methods based on hand-crafted features. We conclude the thesis by highlighting our contributions and suggesting possible future research directions.
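The computational argument behind SLEM can be made concrete: once the hinge loss is replaced by the square loss, training an exemplar classifier (one positive against a pool of negatives) reduces to regularized least squares, which has a closed-form solution instead of requiring an iterative SVM solver. The sketch below illustrates that idea under simplifying assumptions; the function name `slem_encode` and the regularization value are invented here, and this is not the thesis's implementation.

```python
import numpy as np

def slem_encode(exemplar, negatives, lam=0.1):
    """Closed-form square-loss exemplar weights: one positive (the
    exemplar) against a pool of negatives, with ridge regularization."""
    X = np.vstack([exemplar, negatives])      # first row is the positive
    y = np.array([1.0] + [-1.0] * len(negatives))
    d = X.shape[1]
    # Normal equations of regularized least squares: (X'X + lam*I) w = X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy check: the learned weights score the exemplar above the negatives.
w = slem_encode(np.array([1.0, 0.0]),
                np.array([[0.0, 1.0], [0.0, 2.0]]))
```

Because the solution is a single linear solve per exemplar (and the Gram matrix over negatives can be factored once and reused), encoding many exemplars is far cheaper than training one hinge-loss SVM each.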
APA, Harvard, Vancouver, ISO and other styles
48

Westin, Emil. "Authorship classification using the Vector Space Model and kernel methods". Thesis, Uppsala universitet, Statistiska institutionen, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-412897.

Full text
Abstract (summary):
Authorship identification is the field of classifying a given text by its author, based on the assumption that authors exhibit unique writing styles. This thesis investigates the semantic shortcomings of the vector space model by constructing a semantic kernel from WordNet, which is evaluated on the problem of authorship attribution. A multiclass SVM classifier is constructed using the one-versus-all strategy and evaluated in terms of precision, recall, accuracy, and F1 score. Results show that using the semantic scores from WordNet degrades performance compared to a linear kernel. Experiments are run to identify the best feature engineering configurations, showing that removing stopwords has a positive effect on the Reuters financial dataset, while the Kaggle dataset, consisting of short extracts of horror stories, benefits from keeping the stopwords.
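As background, the Vector Space Model underlying this thesis can be sketched in its simplest form: texts become term-count vectors and similarity is their cosine. The thesis itself trains one-versus-all SVMs on such vectors; the nearest-profile attribution below is a deliberately reduced stand-in, with invented author snippets, meant only to show the representation.

```python
from collections import Counter
import math

# Minimal Vector Space Model sketch (not the thesis's SVM classifier):
# each author's training text becomes a term-frequency vector, and a
# test text is attributed to the author with the highest cosine similarity.
def tf_vector(text):
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def attribute(test_text, corpus):
    """corpus: dict author -> training text. Returns the closest author."""
    v = tf_vector(test_text)
    return max(corpus, key=lambda a: cosine(v, tf_vector(corpus[a])))
```

A semantic kernel of the kind the thesis evaluates would replace the raw dot product with `u' S v`, where `S` encodes WordNet-derived term similarities.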
APA, Harvard, Vancouver, ISO and other styles
49

Garcia-Constantino, Matias. "On the use of text classification methods for text summarisation". Thesis, University of Liverpool, 2013. http://livrepository.liverpool.ac.uk/12957/.

Full text
Abstract (summary):
This thesis describes research undertaken in the fields of text and questionnaire mining. More specifically, the work is directed at using text classification techniques to summarise the free-text part of questionnaires. In this thesis, text summarisation is conceived of as a form of text classification, in that the classes assigned to text documents can be viewed as an indication (summarisation) of the main ideas of the original free text in a coherent and reduced form. This type of summary is considered because summarising unstructured free text, such as that found in questionnaires, is not deemed effective using conventional text summarisation techniques. Four approaches are described in the context of classification-based summarisation of free text from different sources, focused on the free-text part of questionnaires. The first considers the use of standard classification techniques for text summarisation and was motivated by the desire to establish a benchmark against which the more specialised summarisation classification techniques presented later in this thesis could be compared. The second, called Classifier Generation Using Secondary Data (CGUSD), addresses the case where the available data is not considered sufficient for training purposes (or where no data is available at all). The third, called the Semi-Automated Rule Summarisation Extraction Tool (SARSET), presents a semi-automated classification technique to support document summarisation classification in which the domain experts are more involved in the classifier generation process; the idea is that this might produce more effective summaries.
The fourth is a hierarchical summarisation classification approach which assumes that text summarisation can be achieved by a classification approach whereby several class labels can be associated with documents, which then constitute the summarisation. For evaluation purposes three types of text were considered: (i) questionnaire free text, (ii) text from medical abstracts, and (iii) text from news stories.
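To make the "summarisation as classification" framing concrete: if each class label is associated with indicative terms, then the set of labels assigned to a document acts as its reduced-form summary. The toy labeller below illustrates only that framing, with invented label names and keywords; it corresponds to none of the four approaches above.

```python
# Toy illustration of summarisation-as-classification: assign every
# label whose indicative keywords occur in the text; the resulting
# label set serves as a coarse summary of the document.
def label_summary(text, label_keywords):
    """label_keywords: dict label -> set of indicative terms."""
    words = set(text.lower().split())
    return sorted(label for label, kws in label_keywords.items()
                  if words & set(kws))
```

In the thesis's hierarchical variant, such labels would come from trained classifiers arranged in a hierarchy rather than from fixed keyword sets.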
APA, Harvard, Vancouver, ISO and other styles
50

Zeng, Cong. "Classification of RNA Pseudoknots and Comparison of Structure Prediction Methods". Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112127/document.

Full text
Abstract (summary):
Many studies convey the importance of RNA molecules, as they play vital roles in many molecular processes, and it is commonly believed that the structures of RNA molecules hold the key to discovering their functions. When investigating RNA structures, researchers increasingly depend on bioinformatics methods. Many in silico methods for predicting RNA secondary structures have emerged in this wave, including some capable of predicting pseudoknots, a particular type of RNA secondary structure. The purpose of this dissertation is to compare the state-of-the-art methods for predicting pseudoknots, and to offer colleagues some insight into how to choose a practical method for a given single sequence. Much effort has been devoted to the prediction of RNA secondary structures, including pseudoknots, over the last decades, yielding many programs in this field. Some challenging questions arise as a consequence. How does each method perform, especially on a particular class of RNA sequences? What are their advantages and disadvantages? What can we learn from the contemporary methods if we want to develop new ones? This dissertation sets out to investigate the answers. It carries out numerous comparisons of the performance of the available methods in predicting RNA pseudoknots. One main part focuses principally on the prediction of frameshifting signals by two methods. The second main part focuses on the prediction of pseudoknots that participate in much more general molecular activities. In detail, the second part of the work covers 414 pseudoknots, from both Pseudobase and the Protein Data Bank, and 15 methods, comprising 3 exact methods and 12 heuristic ones. Specifically, three main categories of complexity measurements are introduced, each of which further divides the 414 pseudoknots into a series of subclasses.
The comparisons are carried out by comparing the predictions of each method on the entire set of 414 pseudoknots, and on subsets classified both by the complexity measurements and by the length, RNA type, and organism of the pseudoknots. The results show that pseudoknots found in nature have relatively low complexity under all measurements, and that the performance of contemporary methods varies from subclass to subclass but decreases consistently as the complexity of the pseudoknots increases. More generally, the heuristic methods globally outperform the exact ones, and the assessment results are strongly sensitive to the quality of the reference structures and to the evaluation system. Last but not least, this part of the work is provided as an on-line benchmark for the bioinformatics community.
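The defining property behind these comparisons, crossing base pairs, is easy to state in code: a structure contains a pseudoknot when two base pairs (i, j) and (k, l) interleave as i < k < j < l. A minimal detector over extended dot-bracket strings (with "[]" marking the knotted pairs) might look like the sketch below; it is illustrative only, handles just two bracket types, and is unrelated to the 15 methods benchmarked in the dissertation.

```python
# Extract base pairs from an extended dot-bracket string, tracking
# "()" and "[]" with separate stacks.
def base_pairs(structure, bracket_types=("()", "[]")):
    pairs = []
    stacks = {bt: [] for bt in bracket_types}
    for i, ch in enumerate(structure):
        for bt in bracket_types:
            if ch == bt[0]:
                stacks[bt].append(i)
            elif ch == bt[1]:
                pairs.append((stacks[bt].pop(), i))
    return pairs

# A pseudoknot exists when two pairs cross: i < k < j < l.
def has_pseudoknot(structure):
    pairs = base_pairs(structure)
    return any(i < k < j < l
               for (i, j) in pairs for (k, l) in pairs)
```

For example, "((..[[..))..]]" crosses (the "[" pairs open inside the "(" helix but close outside it), whereas "(.[.].)" is fully nested.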
APA, Harvard, Vancouver, ISO and other styles