Theses on the topic "Feature selection"
Below are the top 50 dissertations (master's and doctoral theses) on the research topic "Feature selection".
Zheng, Ling. "Feature grouping-based feature selection". Thesis, Aberystwyth University, 2017. http://hdl.handle.net/2160/41e7b226-d8e1-481f-9c48-4983f64b0a92.
Dreyer, Sigve. "Evolutionary Feature Selection". Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2013. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-24225.
Doquet, Guillaume. "Agnostic Feature Selection". Electronic Thesis or Diss., Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS486.
With the advent of Big Data, databases whose size far exceeds the human scale are becoming increasingly common. The resulting overabundance of monitored variables (friends on a social network, movies watched, nucleotides coding the DNA, monetary transactions...) has motivated the development of Dimensionality Reduction (DR) techniques. A DR algorithm such as Principal Component Analysis (PCA) or an AutoEncoder typically combines the original variables into new features, fewer in number, such that most of the information in the dataset is conveyed by the extracted feature set. A particular subcategory of DR is formed by Feature Selection (FS) methods, which directly retain the most important initial variables. How to select the best candidates is a hot topic at the crossroads of statistics and Machine Learning. Feature importance is usually inferred in a supervised context, where variables are ranked according to their usefulness for predicting a specific target feature. The present thesis focuses on the unsupervised context in FS, i.e. the challenging situation where no prediction goal is available to help assess feature relevance. Instead, unsupervised FS algorithms usually build an artificial classification goal and rank features based on their helpfulness for predicting this new target, thus falling back on the supervised context. Additionally, the efficiency of unsupervised FS approaches is typically also assessed in a supervised setting. In this work, we propose an alternative model combining unsupervised FS with data compression. Our Agnostic Feature Selection (AgnoS) algorithm does not rely on creating an artificial target and aims to retain a feature subset sufficient to recover the whole original dataset, rather than a specific variable. As a result, AgnoS does not suffer from the selection bias inherent to clustering-based techniques. The second contribution of this work (Agnostic Feature Selection, G. Doquet & M. Sebag, ECML PKDD 2019) is to establish both the brittleness of the standard supervised evaluation of unsupervised FS and the stability of the newly proposed AgnoS.
Sima, Chao. "Small sample feature selection". Texas A&M University, 2003. http://hdl.handle.net/1969.1/5796.
Coelho, Frederico Gualberto Ferreira. "Semi-supervised feature selection". Universidade Federal de Minas Gerais, 2013. http://hdl.handle.net/1843/BUOS-97NJ9S.
Testo completoComo a aquisição de dados tem se tornado relativamente mais fácil e barata, o conjunto de dados tem adquirido dimensões extremamente grandes, tanto em relação ao número de variáveis, bem como em relação ao número de instâncias. Contudo, o mesmo não ocorre com os rótulos de cada instância. O custo para se obter estes rótulos é, via de regra, muito alto, e por causa disto, dados não rotulados são a grande maioria, principalmente quando comparados com a quanti-dade de dados rotulados. A utilização destes dados requer cuidados especiais uma vez que vários problemas surgem com o aumento da dimensionalidade e com a escassez de rótulos. Reduzir a dimensão dos dados é então uma necessidade primordial. Em meio às suas características mais relevantes, usualmente encontramos variáveis redundantes e mesmo irrelevantes, que podem e devem ser eliminadas. Na procura destas variáveis, ao desprezar os dados não rotulados, implementando-se apenas estratégias supervisionadas, abrimos mão de informações estruturais que podem ser úteis. Da mesma forma, desprezar os dados rotulados implementando-se apenas métodos não supervisionados é igualmente disperdício de informação. Neste contexto, a aplicação de uma abordagem semi-supervisionada é bastante apropriada, onde pode-se tentar aproveitar o que cada tipo de dado tem de melhor a oferecer. Estamos trabalhando no problema de seleção de características semi-supervisionada através de duas abordagens distintas, mas que podem, eventualmente se complementarem mais à frente. O problema pode ser abordado num contexto de agrupamento de características, agrupando variáveis semelhantes e desprezando as irrelevantes. Por outro lado, podemos abordar o problema através de uma metodologia multiobjetiva, uma vez que temos argumentos estabelecendo claramente esta sua natureza multiobjetiva. 
Na primeira abordagem, uma medida de semelhança capaz de levar em consideração tanto os dados rotulados como os não rotulados, baseado na informação mútua, está sendo desenvolvida, bem como, um critério, baseado nesta medida, para agrupamento e eliminação de variáveis. Também o princípio da homogeneidade entre os rótulos e os clusters de dados é explorado e dois métodos semissupervisionados de seleção de características são desenvolvidos. Finalmente um estimador de informaçã mútua para um conjunto misto de variáveis discretas e contínuas é desenvolvido e constitue uma contribuição secundária do trabalho. Na segunda abordagem, a proposta é tentar resolver o problema de seleção de características e de aproximação de funções ao mesmo tempo. O método proposto inclue a consideração de normas diferentes para cada camada de uma rede MLP, pelo treinamento independente de cada camada e pela definição de funções objetivo que sejam capazes de maximizar algum índice de relevância das variáveis.
Garnes, Øystein Løhre. "Feature Selection for Text Categorisation". Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2009. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9017.
Text categorization is the task of discovering the category or class text documents belong to, in other words spotting the correct topic for text documents. While many machine learning schemes for building automatic classifiers exist today, these are typically resource-demanding and do not always achieve the best results when given the whole contents of the documents. A popular solution to these problems is feature selection. The features (e.g. terms) in a document collection are given weights based on a simple scheme and then ranked by these weights. Next, each document is represented using only the top-ranked features, typically only a few percent of the features. The classifier is then built in considerably less time, and accuracy may even improve. In situations where documents can belong to one of a series of categories, one can either build a multi-class classifier and use one feature set for all categories, or split the problem into a series of binary categorization tasks (deciding whether documents belong to a category or not) and create one ranked feature subset for each category/classifier. Many feature selection metrics have been suggested over the last decades, including supervised methods that make use of a manually pre-categorized set of training documents, and unsupervised methods that need only training documents of the same type or collection as that to be categorized. While many of these look promising, there has been a lack of large-scale comparison experiments, and several methods have been proposed in the last two years. Moreover, most evaluations are conducted on a set of binary tasks instead of a multi-class task, as this often gives better results, although multi-class categorization with a joint feature set is often used in operational environments. In this report, we present results from the comparison of 16 feature selection methods (in addition to random selection) using various feature set sizes.
Of these, 5 were unsupervised and 11 were supervised. All methods were tested on both a Naive Bayes (NB) classifier and a Support Vector Machine (SVM) classifier. We conducted multi-class experiments using a collection with 20 non-overlapping categories, and each feature selection method produced feature sets common to all the categories. We also combined feature selection methods and evaluated their joint efforts. We found that the classical supervised methods had the best performance, including Chi Square, Information Gain and Mutual Information. The Chi Square variant GSS coefficient was also among the top performers. Odds Ratio showed excellent performance for NB, but not for SVM. The three unsupervised methods Collection Frequency, Collection Frequency Inverse Document Frequency and Term Frequency Document Frequency all showed performance close to the best group. The Bi-Normal Separation metric produced excellent results for the smallest feature subsets. The weirdness factor performed several times better than random selection, but was not among the top-performing group. Some combination experiments achieved better results than each method alone, but the majority did not. The top performers Chi Square and GSS coefficient classified more documents when used together than alone. Four of the five combinations that showed an increase in performance included the BNS metric.
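The weight-then-rank scheme this abstract describes, with chi-square as the weighting metric, can be sketched in a few lines. The toy corpus, the presence-based contingency counts and the cut-off k below are illustrative assumptions, not the thesis's actual experimental setup.

```python
# Hypothetical sketch: score each term by the chi-square statistic of its
# 2x2 term/category contingency table, rank terms, keep only the top k.

def chi_square(docs, labels, term, category):
    """Chi-square statistic of the term/category 2x2 contingency table."""
    n = len(docs)
    a = sum(1 for d, y in zip(docs, labels) if term in d and y == category)
    b = sum(1 for d, y in zip(docs, labels) if term in d and y != category)
    c = sum(1 for d, y in zip(docs, labels) if term not in d and y == category)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

# Each document reduced to its set of terms (bag-of-words presence).
docs = [
    {"goal", "match", "striker", "cup"},
    {"penalty", "match", "goalkeeper"},
    {"budget", "bill", "parliament"},
    {"tax", "bill", "senator"},
]
labels = ["sport", "sport", "politics", "politics"]
vocab = sorted(set().union(*docs))  # sorted for a deterministic ranking

k = 3  # represent documents using only the top-k ranked terms
ranked = sorted(vocab, key=lambda t: chi_square(docs, labels, t, "sport"),
                reverse=True)
print(ranked[:k])
```

Terms that appear in only one class ("match", "bill") receive the highest scores; a classifier would then be trained on documents restricted to the retained terms.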
Pradhananga, Nripendra. "Effective Linear-Time Feature Selection". The University of Waikato, 2007. http://hdl.handle.net/10289/2315.
Cheng, Iunniang. "Hybrid Methods for Feature Selection". TopSCHOLAR®, 2013. http://digitalcommons.wku.edu/theses/1244.
Athanasakis, D. "Feature selection in computational biology". Thesis, University College London (University of London), 2014. http://discovery.ucl.ac.uk/1432346/.
Sarkar, Saurabh. "Feature Selection with Missing Data". University of Cincinnati / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1378194989.
Pocock, Adam Craig. "Feature selection via joint likelihood". Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/feature-selection-via-joint-likelihood(3baba883-1fac-4658-bab0-164b54c3784a).html.
Baker, Antoin Lenard. "Computer aided invariant feature selection". [Gainesville, Fla.] : University of Florida, 2008. http://purl.fcla.edu/fcla/etd/UFE0022870.
Bäck Eneroth, Moa. "A Feature Selection Approach for Evaluating and Selecting Performance Metrics". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280817.
Precisely defining and measuring performance is a complex process for most companies, yet crucial for the correct allocation of resources and for reaching a shared understanding of common goals between business units. Despite the large amounts of data available to most modern companies today, performance metrics are often chosen based on expertise, tradition or even gut feeling. This thesis proposes a data-driven strategy in the form of a statistical framework for evaluating and selecting performance metrics. The structure of the framework is based on a dimensionality reduction method for time series, known as feature selection, which uses a time series prediction algorithm in the search for relevant performance metrics. To design a complete framework, experiments are performed that explore modern time series prediction algorithms in combination with two different feature selection methods. The results show that, for performance metrics based on the real-world data used in this thesis, the best framework consists of the filter-based feature selection method combined with a univariate time series prediction algorithm.
Nogueira, Sarah. "Quantifying the stability of feature selection". Thesis, University of Manchester, 2018. https://www.research.manchester.ac.uk/portal/en/theses/quantifying-the-stability-of-feature-selection(6b69098a-58ee-4182-9a30-693d714f0c9f).html.
Li, Jiexun. "Feature Construction, Selection And Consolidation For Knowledge Discovery". Diss., The University of Arizona, 2007. http://hdl.handle.net/10150/193819.
Testo completoHare, Brian K. Dinakarpandian Deendayal. "Feature selection in DNA microarray analysis". Diss., UMK access, 2004.
"A thesis in computer science." Typescript. Advisor: D. Dinakarpandian. Vita. Title from "catalog record" of the print edition. Description based on contents viewed Feb. 24, 2006. Includes bibliographical references (leaves 81-86). Online version of the print edition.
Youn, Eun Seog. "Feature selection in support vector machines". [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE1000171.
Title from title page of source document. Document formatted into pages; contains x, 50 p.; also contains graphics. Includes vita. Includes bibliographical references.
Fusting, Christopher Winter. "Temporal Feature Selection with Symbolic Regression". ScholarWorks @ UVM, 2017. http://scholarworks.uvm.edu/graddis/806.
Bancarz, Iain. "Conditional-entropy metrics for feature selection". Thesis, University of Edinburgh, 2005. http://hdl.handle.net/1842/799.
Longstaff, James Robert. "Feature selection for affective product development". Thesis, University of Leeds, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.511133.
Choakjarernwanit, Naruetep. "Feature selection in statistical pattern recognition". Thesis, University of Surrey, 1992. http://epubs.surrey.ac.uk/843569/.
Song, Jingping. "Feature selection for intrusion detection system". Thesis, Aberystwyth University, 2016. http://hdl.handle.net/2160/3143de58-208f-405e-ab18-abcecfc8f33b.
Ditzler, Gregory, J. Calvin Morrison, Yemin Lan, and Gail L. Rosen. "Fizzy: feature subset selection for metagenomics". BioMed Central, 2015. http://hdl.handle.net/10150/610268.
Garg, Vikas, Ph. D. (Massachusetts Institute of Technology). "CRAFT: ClusteR-specific Assorted Feature selecTion". Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/105697.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 45-46).
In this thesis, we present a hierarchical Bayesian framework for clustering with cluster-specific feature selection. We derive a simplified model, CRAFT, by analyzing the asymptotic behavior of the log posterior formulations in a nonparametric MAP-based clustering setting in this framework. The model handles assorted data, i.e. both numeric and categorical data, and the underlying objective functions are intuitively appealing. The resulting algorithm is simple to implement, scales nicely, requires minimal parameter tuning, obviates the need to specify the number of clusters a priori, and compares favorably with other state-of-the-art methods on several datasets. We provide empirical evidence on carefully designed synthetic data sets to highlight the robustness of the algorithm in recovering the underlying feature subspaces, even when the average dimensionality of the features across clusters is misspecified. Moreover, the framework seamlessly allows for multiple views of clustering by interpolating between the two extremes of cluster-specific feature selection and global selection, and recovers the DP-means objective [14] under the degenerate setting of clustering without feature selection.
by Vikas Garg.
S.M.
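The degenerate setting the abstract mentions, clustering without feature selection, recovers the DP-means objective. A minimal sketch of DP-means (not of CRAFT itself) is shown below; the penalty value lam and the toy points are assumptions for the demo: a point whose squared distance to every centre exceeds lam opens a new cluster.

```python
# Illustrative DP-means sketch: k-means with a per-cluster penalty, so the
# number of clusters is inferred rather than specified a priori.

def dp_means(points, lam, n_iter=10):
    """Return (centres, assignments) optimising the DP-means objective."""
    centres = [points[0]]
    assign = [0] * len(points)
    for _ in range(n_iter):
        for idx, p in enumerate(points):
            # squared distance to the closest existing centre
            d, j = min((sum((a - b) ** 2 for a, b in zip(p, c)), i)
                       for i, c in enumerate(centres))
            if d > lam:                 # too far from every centre:
                centres.append(p)       # open a new cluster at this point
                j = len(centres) - 1
            assign[idx] = j
        for i in range(len(centres)):   # move centres to cluster means
            members = [p for p, a in zip(points, assign) if a == i]
            if members:
                centres[i] = tuple(sum(x) / len(members) for x in zip(*members))
    return centres, assign

points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5),
          (10.0, 10.0), (10.5, 10.0), (10.0, 10.5)]
centres, assign = dp_means(points, lam=4.0)
print(len(centres), assign)  # two well-separated blobs yield two clusters
```

CRAFT extends this kind of asymptotic MAP objective with cluster-specific feature selection terms; the sketch covers only the clustering part.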
Zhang, Zhihong. "Feature selection from higher order correlations". Thesis, University of York, 2012. http://etheses.whiterose.ac.uk/3340/.
Bonev, Boyan. "Feature selection based on information theory". Doctoral thesis, Universidad de Alicante, 2010. http://hdl.handle.net/10045/18362.
In this thesis we propose a feature selection method for supervised classification. The main contribution is the efficient use of information theory, which provides a solid theoretical framework for measuring the relation between the classes and the features. Mutual information is considered to be the best measure for such a purpose. Traditionally it has been measured for ranking single features without taking into account the entire set of selected features. This is due to the computational complexity involved in estimating the mutual information. However, in most data sets the features are not independent, and their combination provides much more information about the class than the sum of their individual prediction powers.
Methods based on density estimation can only be used for data sets with a very high number of samples and low number of features. Due to the curse of dimensionality, in a multi-dimensional feature space the amount of samples required for a reliable density estimation is very high. For this reason we analyse the use of different estimation methods which bypass the density estimation and estimate entropy directly from the set of samples. These methods allow us to efficiently evaluate sets of thousands of features.
For high-dimensional feature sets, another problem is the search order of the feature space. All algorithms with non-prohibitive computational cost search for a sub-optimal feature set. Greedy algorithms are the fastest and incur the least overfitting. We show that, from the information-theoretic perspective, a greedy backward selection algorithm conserves the amount of mutual information, even though the feature set is not the minimal one.
We also validate our method in several real-world applications. We apply feature selection to omnidirectional image classification through a novel approach. It is appearance-based, and we select features from a bank of filters applied to different parts of the image. The context of the task is place recognition for mobile robotics. Another set of experiments is performed on microarrays from gene expression databases. The classification problem aims to predict the disease of a new patient. We present a comparison of classification performance, and the algorithms we present outperform the existing ones. Finally, we successfully apply feature selection to spectral graph classification. All the features we use are for unattributed graphs, which constitutes a contribution to the field. We also draw interesting conclusions about which spectral features matter most under different experimental conditions. In the context of graph classification, we also show how important the precise estimation of mutual information is, and we analyse its impact on the final classification results.
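The greedy backward selection guided by mutual information that this abstract describes can be illustrated on discrete toy data. The plug-in estimator and the tiny dataset below are assumptions for the demo (the thesis itself relies on sample-based estimators that bypass density estimation); the point is only that removing the feature whose absence costs the least I(selected features; class) conserves the information.

```python
# Toy sketch of greedy backward selection driven by mutual information.
from collections import Counter
from math import log2

def mutual_information(samples, labels):
    """Plug-in estimate of I(X; Y) for discrete X (tuples) and labels Y."""
    n = len(samples)
    pxy = Counter(zip(samples, labels))
    px, py = Counter(samples), Counter(labels)
    return sum(c / n * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def backward_selection(data, labels, target_size):
    selected = list(range(len(data[0])))
    while len(selected) > target_size:
        def mi_without(f):
            keep = [i for i in selected if i != f]
            return mutual_information(
                [tuple(row[i] for i in keep) for row in data], labels)
        # drop the feature whose removal leaves the most information
        selected.remove(max(selected, key=mi_without))
    return selected

# Feature 0 determines the class, feature 1 is noise, feature 2 is a
# duplicate of feature 0; one informative copy must survive the search.
data = [(0, 1, 0), (0, 0, 0), (1, 1, 1), (1, 0, 1)]
labels = [0, 0, 1, 1]
print(backward_selection(data, labels, target_size=1))
```

The surviving single feature still carries the full one bit of class information, even though it is not the unique minimal choice, which is exactly the conservation property the thesis proves.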
Nguyen, Minh Phu <1988>. "Feature Selection using Dominant-Set Clustering". Master's Degree Thesis, Università Ca' Foscari Venezia, 2016. http://hdl.handle.net/10579/8058.
Scalco, Alberto <1993>. "Feature Selection Using Neural Network Pruning". Master's Degree Thesis, Università Ca' Foscari Venezia, 2019. http://hdl.handle.net/10579/14382.
Testo completoNg, Andrew Y. 1976. "On feature selection : learning with exponentially many irreverent features as training examples". Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/9658.
Includes bibliographical references (p. 55-57).
We consider feature selection for supervised machine learning in the "wrapper" model of feature selection. This typically involves an NP-hard optimization problem that is approximated by heuristic search for a "good" feature subset. First considering the idealization where this optimization is performed exactly, we give a rigorous bound for generalization error under feature selection. The search heuristics typically used are then immediately seen as trying to achieve the error given in our bounds, and succeeding to the extent that they succeed in solving the optimization. The bound suggests that, in the presence of many "irrelevant" features, the main source of error in wrapper model feature selection is from "overfitting" hold-out or cross-validation data. This motivates a new algorithm that, again under the idealization of performing search exactly, has sample complexity (and error) that grows logarithmically in the number of "irrelevant" features - which means it can tolerate having a number of "irrelevant" features exponential in the number of training examples - and search heuristics are again seen to be directly trying to reach this bound. Experimental results on a problem using simulated data show the new algorithm having much higher tolerance to irrelevant features than the standard wrapper model. Lastly, we also discuss ramifications that sample complexity logarithmic in the number of irrelevant features might have for feature design in actual applications of learning.
by Andrew Y. Ng.
S.M.
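The "wrapper" model analysed in the abstract above, under its idealization of exact search, can be sketched as an exhaustive subset search scored by hold-out error of the very learner being trained. The 1-nearest-neighbour learner, the size cap and the toy data below are illustrative assumptions, not the thesis's algorithm.

```python
# Sketch of wrapper-model feature selection with exact (exhaustive) search.
from itertools import combinations

def holdout_error(train, train_y, test, test_y, feats):
    """Hold-out error of 1-nearest-neighbour restricted to `feats`."""
    def dist(a, b):
        return sum((a[i] - b[i]) ** 2 for i in feats)
    wrong = 0
    for x, y in zip(test, test_y):
        pred = min(zip(train, train_y), key=lambda t: dist(t[0], x))[1]
        wrong += pred != y
    return wrong / len(test)

def wrapper_select(train, train_y, test, test_y, n_features, max_size):
    """Exact wrapper search: the subset with the lowest hold-out error."""
    subsets = (s for k in range(1, max_size + 1)
               for s in combinations(range(n_features), k))
    return list(min(subsets,
                    key=lambda s: holdout_error(train, train_y,
                                                test, test_y, s)))

# Feature 0 carries the class; features 1 and 2 are irrelevant noise.
train = [(0, 5, 1), (1, 3, 9), (0, 8, 2), (1, 1, 7)]
train_y = [0, 1, 0, 1]
holdout = [(0, 2, 8), (1, 9, 0)]
holdout_y = [0, 1]
print(wrapper_select(train, train_y, holdout, holdout_y, 3, 2))
```

Because every candidate subset is scored on the same small hold-out set, many irrelevant features give many chances to fit that set by luck, which is exactly the "overfitting hold-out data" error source the bound in the abstract identifies.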
Tan, Feng. "Improving Feature Selection Techniques for Machine Learning". Digital Archive @ GSU, 2007. http://digitalarchive.gsu.edu/cs_diss/27.
Testo completoButko, Taras. "Feature selection for multimodal: acoustic event detection". Doctoral thesis, Universitat Politècnica de Catalunya, 2011. http://hdl.handle.net/10803/32176.
The detection of acoustic events (AEs) that occur naturally in a meeting room can help describe human and social activity. The automatic description of interactions between humans and the environment can be useful for providing implicit assistance to the people in the room, context- and content-aware information requiring little human attention or interruptions, support for high-level analysis of the acoustic scene, etc. Detection and description of activity is a key functionality of perceptual interfaces working in human communication environments such as meeting rooms. Moreover, the recent rapid growth of available audiovisual content requires tools for the analysis, indexing, search and retrieval of the existing documents. Given an audio document, the first processing step is usually Audio Segmentation (AS), i.e. the partitioning of the input audio sequence into acoustically homogeneous regions, which are labelled according to a predefined set of classes such as speech, music, noise, etc. In fact, AS can be seen as a particular case of Acoustic Event Detection, and it is treated as such in this thesis. Acoustic Event Detection (AED) is one of the objectives of this thesis. A variety of features is proposed, coming not only from the audio but also from the video modality, to tackle the detection problem in the meeting-room and broadcast-news domains. Two basic detection approaches are investigated in this work: 1) joint segmentation and classification using Hidden Markov Models (HMMs) with Gaussian Mixture Models (GMMs), and 2) detection-by-classification using discriminative Support Vector Machines (SVMs).
For the first case, a fast one-pass feature selection algorithm is developed in this thesis in order to select, for each AE, the subset of multimodal features that achieves the best detection rate. AED in meeting-room environments aims at processing the signals collected by distant microphones and video cameras in order to obtain the temporal sequence of the (possibly overlapped) acoustic events that have occurred in the room. When applied to interactive seminars with a certain degree of spontaneity, acoustic event detection from the audio modality alone shows a large number of errors, mostly due to the temporal overlap of sounds. This thesis includes several contributions to the multimodal AED task. First, the use of video features: since in the video modality the acoustic sources do not overlap (apart from occlusions), the proposed features improve detection in recordings of spontaneous scenarios. Second, the inclusion of acoustic localization features, which, in combination with the usual spectro-temporal audio features, yield a further improvement in recognition rate. Third, the comparison of feature-level and decision-level fusion strategies for the combined use of the audio and video modalities. In the latter case, the output scores of the system are combined using two statistical methods: the weighted arithmetic mean and the fuzzy integral. Furthermore, owing to the scarcity of annotated multimodal data, and in particular of data with temporally overlapped sounds, a new multimodal database with a rich variety of meeting-room AEs has been recorded and manually annotated, and made publicly available for research purposes.
For audio segmentation in the broadcast-news domain, a hierarchical system architecture is proposed, which appropriately groups together a set of detectors, each corresponding to one of the acoustic classes of interest. Two different AS systems have been developed for two broadcast-news databases: the first corresponds to audio recordings of the debate programme Àgora on the Catalan TV channel TV3, and the second includes several audio segments from the Catalan 3/24 news TV channel. The output of the first system was used as the first stage of the machine translation and subtitling systems of the Tecnoparla project, a project funded by the Generalitat government in which several speech technologies were developed to extract as much information as possible from the audio signal. The second AS system, a hierarchical HMM-GMM-based detection system with feature selection, obtained competitive results in the Albayzín-2010 audio segmentation evaluation. Finally, some side results of this thesis are worth mentioning. The author was responsible for organizing the audio segmentation evaluation within the aforementioned Albayzín-2010 campaign: the event classes, databases, metric and evaluation protocols were specified, and a subsequent analysis of the systems and results submitted by the eight participating research groups, from Spanish and Portuguese universities, was carried out. In addition, a real-time HMM-GMM-based acoustic event detection system for two simultaneous sources was implemented in the UPC multimodal room for testing and demonstration purposes.
Lin, Pengpeng. "A Framework for Consistency Based Feature Selection". TopSCHOLAR®, 2009. http://digitalcommons.wku.edu/theses/62.
Testo completoTeixeira, de Souza Jerffeson. "Feature selection with a general hybrid algorithm". Thesis, University of Ottawa (Canada), 2004. http://hdl.handle.net/10393/29177.
Testo completoVanhoy, Garrett, e Noel Teku. "FEATURE SELECTION FOR CYCLOSTATIONARY-BASED SIGNAL CLASSIFICATION". International Foundation for Telemetering, 2017. http://hdl.handle.net/10150/626974.
Mashhadi-Farahani, Bahman. "Feature extraction and selection for speech recognition". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/nq38255.pdf.
Kumar, Rajeev. "Feature selection, representation and classification in vision". Thesis, University of Sheffield, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.245688.
Huang, Yu'e. "An optimization of feature selection for classification". Thesis, University of Ulster, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.428284.
Testo completoMOTTA, EDUARDO NEVES. "SUPERVISED LEARNING INCREMENTAL FEATURE INDUCTION AND SELECTION". PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2014. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=28688@1.
Testo completoCOORDENAÇÃO DE APERFEIÇOAMENTO DO PESSOAL DE ENSINO SUPERIOR
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
PROGRAMA DE EXCELENCIA ACADEMICA
Inducing non-linear features from basic features is a way of obtaining more accurate predictive models for classification problems. However, induction can make the number of features grow rapidly, usually resulting in overfitting and in models with low generalization power. To avoid this undesired consequence, regularization techniques are applied to create a compromise between a reduced feature set representative of the domain and generalization capacity. In this work, we describe a supervised machine learning approach with incremental feature induction and selection. This approach integrates decision trees, support vector machines and feature selection using sparse perceptrons in a learning framework we call IFIS - Incremental Feature Induction and Selection. Using IFIS, we are able to create high-performance regularized non-linear models using an algorithm with a linear model. We evaluate our system on two natural language processing tasks in two languages. In the first task, part-of-speech tagging, we use two corpora, the WSJ corpus in English and Mac-Morpho in Portuguese. On both we achieve results competitive with the state of the art reported in the literature, reaching accuracies of 97.14 per cent and 97.13 per cent, respectively. In the second task, dependency parsing, we use the Portuguese corpus of the CoNLL 2006 Shared Task, surpassing the results reported during that competition and achieving results competitive with the state of the art for this task, with a UAS score of 92.01 per cent. With regularization using a sparse perceptron, we generate SVM models up to 10 times smaller while preserving their accuracy. Model reduction is obtained through the regularization of feature domains, which reaches percentages of up to 99 per cent.
With model regularization, we achieve a reduction of up to 82 per cent in the physical size of the models, and the prediction time of the compact model is cut by up to 84 per cent. Domain and model reduction also enables better feature engineering, through the analysis of compact domains and the incremental introduction of new features.
Non-linear feature induction from basic features is a method of generating predictive models with higher precision for classification problems. However, feature induction may rapidly lead to a huge number of features, causing overfitting and models with low predictive power. To prevent this side effect, regularization techniques are employed to obtain a trade-off between a reduced feature set representative of the domain and generalization power. In this work, we describe a supervised machine learning approach that incrementally induces and selects feature conjunctions derived from base features. This approach integrates decision trees, support vector machines and feature selection using sparse perceptrons in a machine learning framework named IFIS – Incremental Feature Induction and Selection. Using IFIS, we generate regularized non-linear models with high performance using a linear algorithm. We evaluate our system on two natural language processing tasks in two different languages. For the first task, POS tagging, we use two corpora: the WSJ corpus for English and Mac-Morpho for Portuguese. Our results are competitive with the state of the art on both, achieving accuracies of 97.14 per cent and 97.13 per cent, respectively. For the second task, dependency parsing, we use the CoNLL 2006 Shared Task Portuguese corpus, achieving better results than those reported during that competition and competitive with the state of the art for this task, with a UAS score of 92.01 per cent. Applying model regularization using a sparse perceptron, we obtain SVM models 10 times smaller while maintaining their accuracies. We achieve model reduction by regularization of feature domains, which can reach 99 per cent. Using the regularized model, we shrink the model's physical size by up to 82 per cent and cut prediction time by up to 84 per cent.
Downsizing domains and models also improves feature engineering, through analysis of the compact domains and incremental inclusion of new features.
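The two steps the abstract describes can be illustrated in miniature: expand base features into conjunction features, then keep only those a sparse linear model assigns non-negligible weight. This is a hedged sketch, not the IFIS implementation; the function names and the toy POS-tagging features below are invented for illustration.

```python
from itertools import combinations

def induce_conjunctions(example, max_order=2):
    """Expand a list of active base features into conjunction features
    up to the given order (pairs when max_order=2)."""
    feats = set(example)
    for order in range(2, max_order + 1):
        for combo in combinations(sorted(example), order):
            feats.add("&".join(combo))
    return feats

def select_sparse(weights, threshold=1e-3):
    """Keep only features whose learned weight magnitude exceeds the
    threshold -- a stand-in for the sparse-perceptron selection step."""
    return {f for f, w in weights.items() if abs(w) > threshold}

# Toy usage: POS-tagging-like base features for one token.
base = ["word=bank", "prev=the", "suffix=nk"]
expanded = induce_conjunctions(base)
# conjunctions such as "prev=the&word=bank" are now candidate features
```

In the thesis the weights would come from training a linear model over the induced feature space; here `select_sparse` only shows the pruning criterion.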
Thapa, Mandira. "Optimal Feature Selection for Spatial Histogram Classifiers". Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1513710294627304.
Full text
Zhao, Helen. "Interactive Causal Feature Selection with Prior Knowledge". Case Western Reserve University School of Graduate Studies / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=case1553785900876815.
Full text
Loscalzo, Steven. "Group based techniques for stable feature selection". Diss., Online access via UMI, 2009.
Find full text
Söderberg, Max Joel, and Axel Meurling. "Feature selection in short-term load forecasting". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259692.
Full text
This report investigates the correlation and importance of different features for predicting energy consumption 24 hours ahead. The features come from three categories: weather, time, and previous energy consumption. The correlations are obtained by computing Pearson Correlation and Mutual Information. The most highly correlated features turned out to be those representing previous energy consumption, followed by temperature and month. Two identical feature sets were obtained by ranking the features by correlation. Three feature sets were created manually. The first set contained seven features representing previous energy consumption, one for each of the seven days preceding the forecast date. The second set consisted of weather and time features. The third set consisted of all features from the first and second sets. These sets were then compared using different machine learning models. The results showed that the set containing all features and the set with previous energy consumption gave the best results for all models.
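The correlation-based ranking described in this abstract can be sketched with a plain Pearson coefficient; this is an illustrative toy, not the thesis code, and the feature names (`load_lag_24h`, `temperature`) and values are invented for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_features(features, target):
    """Rank candidate features by absolute Pearson correlation
    with the forecasting target (higher = more relevant)."""
    scores = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: yesterday's load tracks tomorrow's load closely.
features = {
    "load_lag_24h": [10, 12, 14, 13, 15],
    "temperature": [5, 4, 6, 5, 7],
}
target = [11, 13, 15, 14, 16]
ranking = rank_features(features, target)  # lagged load ranks first
```

The thesis also ranks by mutual information, which captures non-linear dependence that Pearson correlation misses; the same `rank_features` skeleton applies with a different scoring function.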
Pighin, Daniele. "Greedy Feature Selection in Tree Kernel Spaces". Doctoral thesis, Università degli studi di Trento, 2010. https://hdl.handle.net/11572/368779.
Full text
Pighin, Daniele. "Greedy Feature Selection in Tree Kernel Spaces". Doctoral thesis, University of Trento, 2010. http://eprints-phd.biblio.unitn.it/359/1/thesis.pdf.
Testo completoRezaei, Boroujeni Forough. "Feature Selection for Hybrid Data Sets and Feature Extraction for Non-Hybrid Data Sets". Thesis, Griffith University, 2021. http://hdl.handle.net/10072/404170.
Full text
Thesis (Masters)
Master of Philosophy (MPhil)
School of Info & Comm Tech
Science, Environment, Engineering and Technology
Full Text
Gupta, Chelsi. "Feature Selection and Analysis for Standard Machine Learning Classification of Audio Beehive Samples". DigitalCommons@USU, 2019. https://digitalcommons.usu.edu/etd/7564.
Full text
May, Michael. "Data analytics and methods for improved feature selection and matching". Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/data-analytics-and-methods-for-improved-feature-selection-and-matching(965ded10-e3a0-4ed5-8145-2af7a8b5e35d).html.
Testo completoLorentzon, Matilda. "Feature Extraction for Image Selection Using Machine Learning". Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-142095.
Full text
Jensen, Richard. "Combining rough and fuzzy sets for feature selection". Thesis, University of Edinburgh, 2004. http://hdl.handle.net/1842/24740.
Testo completoNilsson, Roland. "Statistical Feature Selection : With Applications in Life Science". Doctoral thesis, Linköping : Department of Physcis, Chemistry and Biology, Linköping University, 2007. http://www.bibl.liu.se/liupubl/disp/disp2007/tek1090s.pdf.
Full text