Dissertations / Theses on the topic 'Boosting and bagging'
Consult the top 23 dissertations / theses for your research on the topic 'Boosting and bagging.'
Nascimento, Diego Silveira Costa. "Configuração heterogênea de ensembles de classificadores : investigação em bagging, boosting e multiboosting." Universidade de Fortaleza, 2009. http://dspace.unifor.br/handle/tede/83562.
This work presents a study on the characterization and evaluation of six new heterogeneous committee machine algorithms aimed at solving pattern classification problems. These algorithms extend models already found in the literature that have been applied successfully in different research fields. Following two approaches, one evolutionary and one constructive, different machine learning algorithms (inductors) can be used to induce the ensemble components, which are trained by standard Bagging, Boosting or MultiBoosting on the resampled data, with the aim of increasing the diversity of the resulting composite model. To configure the different types of components automatically, we adopt a customized genetic algorithm for the first approach and a greedy search for the second. To validate the proposal, an empirical study was conducted involving 10 different types of inductors and 18 classification problems taken from the UCI repository. The accuracy values obtained by the evolutionary and constructive heterogeneous ensembles are analyzed against those produced by homogeneous ensembles composed of the 10 types of inductors used, and in most cases the results show a performance gain from both approaches. Keywords: Machine learning, Committee machines, Bagging, Wagging, Boosting, MultiBoosting, Genetic algorithm.
Rubesam, Alexandre. "Estimação não parametrica aplicada a problemas de classificação via Bagging e Boosting." [s.n.], 2004. http://repositorio.unicamp.br/jspui/handle/REPOSIP/306510.
Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
Abstract: Some of the most modern and successful classification methods are bagging, boosting and SVM (Support Vector Machines). Bagging combines classifiers fitted to bootstrap samples of the training data. Boosting sequentially applies a classification algorithm to reweighted versions of the training data, increasing at each step the weights of the observations that were misclassified in the previous step. SVM transforms the data nonlinearly into a space of higher dimension than that of the original data and searches for a separating hyperplane in the transformed space. In this work we study these methods and propose two classification methods: one based on boosting and a nonparametric regression method via H-splines (also proposed here), and the other a modification of a boosting algorithm, based on the MARS algorithm. The methods were applied to both simulated and real data.
Master's degree
Master in Statistics
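The contrast between bagging and boosting described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the method of the thesis: it assumes one-dimensional inputs, labels in {-1, +1}, decision stumps as the base classifier, and the standard AdaBoost weight update. All function names are hypothetical.

```python
import math
import random


def fit_stump(xs, ys, weights):
    """Fit a decision stump (threshold, sign) minimizing the weighted error.

    The stump predicts `sign` when x > threshold and `-sign` otherwise.
    """
    best = None
    for t in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if (sign if x > t else -sign) != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best[1], best[2]


def stump_predict(stump, x):
    t, sign = stump
    return sign if x > t else -sign


def bagging(xs, ys, n_models=25, seed=0):
    """Bagging: fit one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        models.append(fit_stump(bx, by, [1.0 / n] * n))
    return models


def bagging_predict(models, x):
    """Combine the bagged stumps by unweighted majority vote."""
    vote = sum(stump_predict(m, x) for m in models)
    return 1 if vote >= 0 else -1


def adaboost(xs, ys, n_rounds=10):
    """Boosting: refit on reweighted data, upweighting misclassified points."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(xs, ys, w)
        preds = [stump_predict(stump, x) for x in xs]
        err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified points (y * p == -1) gain weight, correct ones lose it.
        w = [wi * math.exp(-alpha * y * p) for wi, p, y in zip(w, preds, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, x):
    """Combine the boosted stumps by a vote weighted with each alpha."""
    score = sum(a * stump_predict(s, x) for a, s in ensemble)
    return 1 if score >= 0 else -1
```

The structural difference is visible in the two training loops: the bagged stumps are fitted independently on resampled data and vote equally, whereas each boosting round depends on the weights left by the previous round and votes with weight alpha.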
Boshoff, Lusilda. "Boosting, bagging and bragging applied to nonparametric regression : an empirical approach / Lusilda Boshoff." Thesis, North-West University, 2009. http://hdl.handle.net/10394/4337.
Llerena, Nils Ever Murrugarra. "Ensembles na classificação relacional." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-18102011-095113/.
In many fields, besides information about the objects or entities that compose them, there is also information about the relationships between objects. Examples of such fields are co-authorship networks and Web pages. It is therefore natural to look for classification techniques that take this information into account. Among these techniques are the so-called graph-based classifiers, which seek to classify examples by taking into account the relationships between them. This work presents the development of methods that improve the performance of graph-based classifiers by using ensemble strategies. An ensemble classifier considers a set of classifiers whose individual predictions are combined in some way. The combined classifier usually performs better than its individual classifiers. Three techniques were developed: the first applies to originally propositional data transformed into a graph-based relational format, and the second and third apply to data originally in graph format. The first technique, inspired by the boosting algorithm, gave rise to the Adaptive Graph-Based K-Nearest Neighbor (A-KNN). The second technique, inspired by the bagging algorithm, led to three approaches to Graph-Based Bagging (BG). Finally, the third technique, inspired by the Cross-Validated Committees algorithm, led to the Graph-Based Cross-Validated Committees (CVCG). The experiments were performed on 38 data sets, 22 in propositional format and 16 in relational format. Evaluation used 10-fold stratified cross-validation, and statistical differences between the classifiers were determined with the method proposed by Demšar (2006). The three techniques improved or at least maintained the performance of the base classifiers. In conclusion, ensembles applied to graph-based classifiers yield good performance results.
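The 10-fold stratified cross-validation scheme mentioned in the abstract above can be sketched as follows. This is a minimal illustration under the assumption that stratification means dealing the shuffled indices of each class round-robin across the folds; the function names are hypothetical and the thesis may have used a different implementation.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split example indices into k folds that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class across the folds in turn.
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices), using each fold once as test set."""
    folds = stratified_folds(labels, k, seed)
    for j, test in enumerate(folds):
        train = [i for f, fold in enumerate(folds) if f != j for i in fold]
        yield train, test
```

Each example appears in exactly one test fold, so averaging the k test scores uses every example once for evaluation.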
Barrow, Devon K. "Active model combination : an evaluation and extension of bagging and boosting for time series forecasting." Thesis, Lancaster University, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.659174.
Dang, Yue. "A Comparative Study of Bagging and Boosting of Supervised and Unsupervised Classifiers For Outliers Detection." Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1502475855457354.
Bourel, Mathias. "Agrégation de modèles en apprentissage statistique pour l'estimation de la densité et la classification multiclasse." Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4076/document.
Ensemble methods in statistical learning combine several base learners built from the same data set in order to obtain a more stable predictor with better performance. Such methods have been studied extensively in the supervised context for regression and classification. In this work we consider the extension of these approaches to density estimation. We suggest several new algorithms in the same spirit as bagging and boosting, and we show the efficiency of combined density estimators through extensive simulations. We also give theoretical results for one of our algorithms (Random Averaged Shifted Histogram) in the form of asymptotic convergence under mild conditions. A second part is devoted to extensions of the Boosting algorithms to the multiclass case. We propose a new algorithm (Adaboost.BG) that accounts for the margin of the base classifiers, and we show its efficiency by simulations and by comparing it with the most widely used methods in this context on several machine learning benchmark datasets. Partial theoretical results are given for our algorithm, such as the exponential decrease to zero of the misclassification error on the learning set.
Siqueira, Vânia Rosatti de. "Um modelo de credit scoring para microcrédito: uma inovação no mercado brasileiro." Universidade Presbiteriana Mackenzie, 2011. http://tede.mackenzie.br/jspui/handle/tede/546.
The Grameen Bank's experiences with microcredit operations have been reproduced in various countries, mainly those related to the two great innovations in this market: the role of the credit agent and the solidarity group mechanism. Scaling up the operations and reducing their costs become vital for achieving economies of scale and for giving MFIs a greater appetite to expand their activity in the microcredit market. In this context, the next great innovation in the microcredit market will be the introduction of credit scoring models, which will speed up the process, reduce risks and, consequently, reduce costs. A credit model was built from historical information about microcredit operations. Key variables that help to distinguish good borrowers from bad ones were identified. The results show that adding the machine learning techniques bagging and boosting to the traditional methods of credit analysis (discriminant analysis and logistic regression) improves the performance of credit scoring models for microcredit.
Lopes, Neilson Soares. "Modelos de classificação de risco de crédito para financiamentos imobiliários: regressão logística, análise discriminante, árvores de decisão, bagging e boosting." Universidade Presbiteriana Mackenzie, 2011. http://tede.mackenzie.br/jspui/handle/tede/527.
Fundo Mackenzie de Pesquisa
This study applied the traditional parametric techniques of discriminant analysis and logistic regression to the credit analysis of real estate financing transactions in which borrowers may or may not have a payroll loan. The hit rate of these methods was compared with that of nonparametric techniques based on classification trees and with that of the meta-learning methods bagging and boosting, which combine classifiers to improve accuracy. In a context of high housing deficit, especially in Brazil, real estate financing can still be greatly encouraged. Sustainable growth in mortgage lending brings social as well as economic benefits. For most individuals, housing is the largest source of expenditure and the most valuable asset they will own during their lifetime. The study concluded that decision trees are the most effective technique for predicting bad payers (94.2% correct), followed by bagging (80.7%) and boosting (or arcing, 75.2%). For the prediction of bad payers, logistic regression and discriminant analysis showed the worst results (74.6% and 70.7%, respectively). For good payers, the decision tree also showed the best predictive power (75.8%), followed by discriminant analysis (75.3%) and boosting (72.9%), while bagging and logistic regression showed the worst results (72.1% and 71.7%, respectively). Logistic regression shows that, for a borrower with a payroll loan, the odds of being a bad payer are 2.19 times those of a borrower without this type of loan. The presence of payroll loans among the operations of mortgage borrowers is also relevant in the discriminant analysis.
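The 2.19 figure reported above for payroll loans can be read as an odds ratio, which in logistic regression is the exponential of the fitted coefficient. The sketch below shows that relationship and how an odds ratio shifts a probability. The 10% baseline default probability is a made-up value chosen for illustration and does not come from the study; the function name is also hypothetical.

```python
import math


def apply_odds_ratio(p_baseline, odds_ratio):
    """Return the probability obtained by multiplying the baseline odds."""
    odds = p_baseline / (1.0 - p_baseline)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# An odds ratio of 2.19 corresponds to a logistic coefficient of ln(2.19).
beta = math.log(2.19)

# Hypothetical baseline: 10% default probability without a payroll loan.
p_with_loan = apply_odds_ratio(0.10, 2.19)
print(f"coefficient = {beta:.3f}, probability with loan = {p_with_loan:.3f}")
```

Note that an odds ratio of 2.19 does not multiply the probability itself by 2.19: under the assumed 10% baseline, the probability rises to roughly 19.6%, not 21.9%.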
Shire, Norah J. "Boosting, Bagging, and Classification Analysis to Improve Noninvasive Liver Fibrosis Prediction in HCV/HIV Coinfected Subjects: An Analysis of the AIDS Clinical Trials Group (ACTG) 5178." Cincinnati, Ohio : University of Cincinnati, 2007. http://rave.ohiolink.edu/etdc/view.cgi?acc_num=ucin1172860066.
Advisor: Charles Ralph Buncher. Title from electronic thesis title page (viewed April 23, 2009). Keywords: Coinfection; Boosting and bagging; Classification analysis; HIV; Viral hepatitis. Includes abstract. Includes bibliographical references.
Kirkby, Richard Brendon. "Improving Hoeffding Trees." The University of Waikato, 2008. http://hdl.handle.net/10289/2568.
Seck, Djamal. "Arbres de décisions symboliques, outils de validations et d'aide à l'interprétation." Thesis, Paris 9, 2012. http://www.theses.fr/2012PA090067.
In this thesis, we propose the STREE methodology for the construction of decision trees with symbolic data. This data type allows us to characterize higher-level individuals, which may be classes or categories of individuals, or concepts in the sense of the Galois lattice. The values of the variables, called symbolic variables, may be sets, intervals or histograms. The recursive partitioning criterion is a combination of a criterion related to the explanatory variables and a criterion related to the dependent variable. The first criterion is the variation of the variance of the explanatory variables. When it is applied alone, STREE acts as a top-down clustering methodology. The second criterion enables us to build a decision tree. This criterion is expressed as the variation of the Gini index if the dependent variable is nominal, and as the variation of the variance if the dependent variable is continuous or is a symbolic variable. Conventional data are a special case of symbolic data, on which STREE can also obtain good results. It performed well on multiple UCI data sets compared with conventional data mining methodologies such as CART, C4.5, Naive Bayes, KNN, MLP and SVM. The STREE methodology also allows for the construction of ensembles of symbolic decision trees, either by bagging or by boosting. Such ensembles are designed to overcome shortcomings of the decision trees themselves and to obtain a final decision that is in principle more reliable than that obtained from a single tree.
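The "variation of the Gini index" used for a nominal dependent variable can be illustrated with the standard Gini impurity of a node and its decrease after a binary split. This sketch covers only that conventional criterion on ordinary (non-symbolic) labels; it does not reproduce STREE's combined criterion or its handling of symbolic variables, and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


parent = ["a", "a", "b", "b"]
print(gini(parent))                                    # impurity of a 50/50 node
print(gini_decrease(parent, ["a", "a"], ["b", "b"]))   # a split that separates the classes
```

A tree builder would evaluate this decrease for every candidate split and keep the split with the largest value.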
Dias, Alexandra Aparecida Delpósito. "Previsão do incumprimento no crédito a empresas com classificadores múltiplos." Master's thesis, Instituto Superior de Economia e Gestão, 2012. http://hdl.handle.net/10400.5/11023.
This study develops models for predicting credit default in the corporate segment using multiple classifiers. The performance of these models was compared with that of individual classifiers. The predictive ability of the competing models was evaluated using ROC curves and classification error rates. The results suggest that models based on multiple classifiers classify credit defaults more accurately than individual classifiers.
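The ROC-based evaluation mentioned above can be sketched as follows: sweep a decision threshold over the classifier scores, record the false and true positive rates, and integrate the curve with the trapezoidal rule to obtain the AUC. The scores and labels below are made-up values for illustration, not data from the study, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points of the ROC curve.

    `labels` uses 1 for the positive class (e.g. default) and 0 otherwise.
    Both classes must be present, otherwise the rates are undefined.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores from two classifiers on the same four cases.
labels = [1, 0, 1, 0]
print(auc(roc_points([0.9, 0.8, 0.7, 0.6], labels)))
print(auc(roc_points([0.9, 0.3, 0.8, 0.2], labels)))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of defaults above non-defaults, which makes it convenient for comparing a combined classifier with its individual members.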
Jiang, Fuhua. "SVM-Based Negative Data Mining to Binary Classification." Digital Archive @ GSU, 2006. http://digitalarchive.gsu.edu/cs_diss/8.
Hovorka, Martin. "Meta-learning." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217654.
Zoghi, Zeinab. "Ensemble Classifier Design and Performance Evaluation for Intrusion Detection Using UNSW-NB15 Dataset." University of Toledo / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1596756673292254.
Ulriksson, Marcus, and Shahin Armaki. "Analys av prestations- och prediktionsvariabler inom fotboll." Thesis, Uppsala universitet, Statistiska institutionen, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-324983.
Nascimento, Diego Silveira Costa. "Novas abordagens para configurações automáticas dos parâmetros de controle em comitês de classificadores." Universidade Federal do Rio Grande do Norte, 2014. http://repositorio.ufrn.br/handle/123456789/19754.
Significant advances have emerged in research on Classifier Committees. The models that receive the most attention in the literature are those of a static nature, also known as ensembles. Among the algorithms in this class, we highlight the methods that resample the training data: Bagging, Boosting and Multiboosting. Choosing the architecture and the base components to be recruited is not a trivial task and has motivated new proposals that attempt to build such models automatically, many of them based on optimization methods. Many of these contributions have not shown satisfactory results when applied to more complex problems or to problems of a different nature. In contrast, this thesis proposes three new hybrid approaches for the automatic construction of classifier ensembles: Increment of Diversity, Adaptive Fitness Function, and Meta-learning for developing systems that automatically configure the control parameters of ensemble models. The first approach combines different diversity techniques in a single conceptual framework in an attempt to achieve higher levels of diversity in ensembles and thereby improve the performance of such systems. The second approach uses a genetic algorithm for the automatic design of ensembles; its contribution is to combine filter and wrapper techniques adaptively to evolve a better distribution of the feature space presented to the ensemble components. The last approach proposes a new technique for recommending the architecture and base components of an ensemble, using traditional and multi-label meta-learning techniques. In general, the results are encouraging and support the thesis that hybrid tools are a powerful solution for building effective ensembles for pattern classification problems.
Thorén, Daniel. "Radar based tank level measurement using machine learning : Agricultural machines." Thesis, Linköpings universitet, Programvara och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176259.
"Estimação não parametrica aplicada a problemas de classificação via Bagging e Boosting." Tese, Biblioteca Digital da Unicamp, 2004. http://libdigi.unicamp.br/document/?code=vtls000316781.
Nožička, Michal. "Ensemble learning metody pro vývoj skóringových modelů." Master's thesis, 2018. http://www.nusl.cz/ntk/nusl-382813.
Krugell, Marike. "Bias reduction studies in nonparametric regression with applications : an empirical approach / Marike Krugell." Thesis, 2014. http://hdl.handle.net/10394/15345.
MSc (Statistics), North-West University, Potchefstroom Campus, 2015
Rodríguez, Hernán Cortés. "Ensemble classifiers in remote sensing: a comparative analysis." Master's thesis, 2014. http://hdl.handle.net/10362/11671.
Land Cover and Land Use (LCLU) maps are very important tools for understanding the relationships between human activities and the natural environment. Accurately defining all the features on the Earth's surface is essential for managing them properly. The basic data used to derive those maps are remote sensing imagery (RSI), specifically satellite images. Hence, there is demand for new techniques and methods that can handle these data accurately. In this work, our goal was to briefly review some of the approaches currently used in the scientific community to meet this challenge and obtain higher accuracy in LCLU maps. We focus on classifier ensembles and on the different ensemble strategies presented in the literature. We proposed different ensemble strategies based on our data and on previous work, in order to increase the accuracy of earlier LCLU maps made from the same data with single classifiers. In the end, only one of the proposed ensembles achieved significantly higher accuracy in LCLU map classification than the best single classifier on the same data. It was also shown that diversity did not play an important role in the success of this ensemble.