Academic literature on the topic 'ENSEMBLE LEARNING MODELS'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'ENSEMBLE LEARNING MODELS.'

Dissertations / Theses on the topic "ENSEMBLE LEARNING MODELS"

1

He, Wenbin. "Exploration and Analysis of Ensemble Datasets with Statistical and Deep Learning Models." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1574695259847734.

2

Kim, Jinhan. "J-model : an open and social ensemble learning architecture for classification." Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/7672.

Abstract:
Ensemble learning is a promising direction of research in machine learning, in which an ensemble classifier gives better predictive and more robust performance for classification problems by combining other learners. Meanwhile agent-based systems provide frameworks to share knowledge from multiple agents in an open context. This thesis combines multi-agent knowledge sharing with ensemble methods to produce a new style of learning system for open environments. We now are surrounded by many smart objects such as wireless sensors, ambient communication devices, mobile medical devices and even information supplied via other humans. When we coordinate smart objects properly, we can produce a form of collective intelligence from their collaboration. Traditional ensemble methods and agent-based systems have complementary advantages and disadvantages in this context. Traditional ensemble methods show better classification performance, while agent-based systems might not guarantee their performance for classification. Traditional ensemble methods work as closed and centralised systems (so they cannot handle classifiers in an open context), while agent-based systems are natural vehicles for classifiers in an open context. We designed an open and social ensemble learning architecture, named J-model, to merge the conflicting benefits of the two research domains. The J-model architecture is based on a service choreography approach for coordinating classifiers. Coordination protocols are defined by interaction models that describe how classifiers will interact with one another in a peer-to-peer manner. The peer ranking algorithm recommends more appropriate classifiers to participate in an interaction model to boost the success rate of results of their interactions. Coordinated participant classifiers who are recommended by the peer ranking algorithm become an ensemble classifier within J-model. 
We evaluated J-model's classification performance on 13 UCI machine learning benchmark data sets and on a virtual screening problem as a realistic classification problem. J-model achieved higher accuracy than 8 other representative traditional ensemble methods on 9 of the 13 benchmark data sets, and higher specificity on 7 of them. In the virtual screening problem, J-model gave better results than previously published results for 12 out of 16 bioassays. We defined a different interaction model for each specific classification task, and the peer ranking algorithm was used across all the interaction models. Our research contributions to knowledge are as follows. First, we showed that service choreography can be an effective ensemble coordination method for classifiers in an open context. Second, we used interaction models that implement task-specific coordination of classifiers to solve a variety of representative classification problems. Third, we designed the peer ranking algorithm, which is generally and independently applicable to the task of recommending appropriate member classifiers from a classifier pool based on an open pool of interaction models and classifiers.
3

Gharroudi, Ouadie. "Ensemble multi-label learning in supervised and semi-supervised settings." Thesis, Lyon, 2017. http://www.theses.fr/2017LYSE1333/document.

Abstract:
Multi-label learning is a specific supervised learning problem where each instance can be associated with multiple target labels simultaneously. Multi-label learning is ubiquitous in machine learning and arises naturally in many real-world applications such as document classification, automatic music tagging and image annotation. In this thesis, we formulate multi-label learning as an ensemble learning problem in order to provide satisfactory solutions for both the multi-label classification and the feature selection tasks, while being consistent with respect to any type of objective loss function. We first discuss why the state-of-the-art single multi-label algorithms using an effective committee of multi-label models suffer from certain practical drawbacks. We then propose a novel strategy to build and aggregate a k-labelsets based committee in the context of ensemble multi-label classification. We then analyze the effect of the aggregation step within ensemble multi-label approaches in depth and investigate how this aggregation impacts prediction performance with respect to the objective multi-label loss metric. We then address the specific problem of identifying relevant subsets of features - among potentially irrelevant and redundant features - in the multi-label context based on the ensemble paradigm. Three wrapper multi-label feature selection methods based on the Random Forest paradigm are proposed. These methods differ in the way they consider label dependence within the feature selection process. Finally, we extend the multi-label classification and feature selection problems to the semi-supervised setting and consider the situation where only a few labelled instances are available.
We propose a new semi-supervised multi-label feature selection approach based on the ensemble paradigm. The proposed model combines ideas from co-training and multi-label k-labelsets committee construction in tandem with an inner out-of-bag label feature importance evaluation. Satisfactorily tested on several benchmark data sets, the approaches developed in this thesis show promise for a variety of applications in supervised and semi-supervised multi-label learning.
4

Henriksson, Aron. "Ensembles of Semantic Spaces : On Combining Models of Distributional Semantics with Applications in Healthcare." Doctoral thesis, Stockholms universitet, Institutionen för data- och systemvetenskap, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-122465.

Abstract:
Distributional semantics allows models of linguistic meaning to be derived from observations of language use in large amounts of text. By modeling the meaning of words in semantic (vector) space on the basis of co-occurrence information, distributional semantics permits a quantitative interpretation of (relative) word meaning in an unsupervised setting, i.e., human annotations are not required. The ability to obtain inexpensive word representations in this manner helps to alleviate the bottleneck of fully supervised approaches to natural language processing, especially since models of distributional semantics are data-driven and hence agnostic to both language and domain. All that is required to obtain distributed word representations is a sizeable corpus; however, the composition of the semantic space is not only affected by the underlying data but also by certain model hyperparameters. While these can be optimized for a specific downstream task, there are currently limitations to the extent the many aspects of semantics can be captured in a single model. This dissertation investigates the possibility of capturing multiple aspects of lexical semantics by adopting the ensemble methodology within a distributional semantic framework to create ensembles of semantic spaces. To that end, various strategies for creating the constituent semantic spaces, as well as for combining them, are explored in a number of studies. The notion of semantic space ensembles is generalizable across languages and domains; however, the use of unsupervised methods is particularly valuable in low-resource settings, in particular when annotated corpora are scarce, as in the domain of Swedish healthcare. The semantic space ensembles are here empirically evaluated for tasks that have promising applications in healthcare. 
It is shown that semantic space ensembles – created by exploiting various corpora and data types, as well as by adjusting model hyperparameters such as the size of the context window and the strategy for handling word order within the context window – are able to outperform the use of any single constituent model on a range of tasks. The semantic space ensembles are used both directly for k-nearest neighbors retrieval and for semi-supervised machine learning. Applying semantic space ensembles to important medical problems facilitates the secondary use of healthcare data, which, despite its abundance and transformative potential, is grossly underutilized.
At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 4 and 5: Unpublished conference papers.
High-Performance Data Mining for Drug Effect Detection
5

Chakraborty, Debaditya. "Detection of Faults in HVAC Systems using Tree-based Ensemble Models and Dynamic Thresholds." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1543582336141076.

6

Li, Qiongzhu. "Study of Single and Ensemble Machine Learning Models on Credit Data to Detect Underlying Non-performing Loans." Thesis, Uppsala universitet, Statistiska institutionen, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-297080.

Abstract:
In this paper, we compare the performance of two feature dimension reduction methods, the LASSO and PCA. Both a simulation study and an empirical study show that the LASSO is superior to PCA when selecting significant variables. We apply Logistic Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), Decision Tree (DT) and their corresponding ensemble machines constructed by bagging and adaptive boosting (AdaBoost) in our study. Three experiments are conducted to explore the impact of class-imbalanced data sets on all models. The empirical study indicates that when the percentage of performing loans exceeds 83.3%, the trained models should be applied with care. With a class-balanced data set, ensemble machines do perform better than single machines. The weaker the single machine, the more pronounced the improvement.
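As a rough illustration of the bagging idea mentioned in this abstract, the sketch below trains several decision stumps on bootstrap resamples of a toy one-dimensional data set and combines them by majority vote. It is not the thesis's method or data: the function names (`fit_stump`, `bagged_stumps`, `predict_majority`) and the toy values are assumptions made only for this example.

```python
import random

def fit_stump(xs, ys):
    """Fit a 1-D decision stump: predict 1 when x > threshold.

    Returns the candidate threshold with the fewest training errors.
    """
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bagged_stumps(xs, ys, n_models, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        # Sample n indices with replacement (the bootstrap step of bagging).
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds

def predict_majority(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = sum(x > t for t in thresholds)
    return int(2 * votes > len(thresholds))

# Hypothetical toy data: label 1 for the larger feature values.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

ensemble = bagged_stumps(xs, ys, n_models=25)
print(predict_majority(ensemble, 1), predict_majority(ensemble, 10))
```

Each stump sees a slightly different resample, so individual thresholds vary; the vote averages out that variation, which is the general mechanism bagging relies on.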
7

Franch, Gabriele. "Deep Learning for Spatiotemporal Nowcasting." Doctoral thesis, Università degli studi di Trento, 2021. http://hdl.handle.net/11572/295096.

Abstract:
Nowcasting – short-term forecasting using current observations – is a key challenge that human activities have to face on a daily basis. We heavily rely on short-term meteorological predictions in domains such as aviation, agriculture, mobility, and energy production. One of the most important and challenging tasks for meteorology is the nowcasting of extreme events, whose anticipation is highly needed to mitigate risk in terms of social or economic costs and human safety. The goal of this thesis is to contribute new machine learning methods to improve the spatio-temporal precision of nowcasting of extreme precipitation events. This work relies on recent advances in deep learning for nowcasting, adding methods targeted at improving nowcasting using ensembles and trained on novel original data resources. A new curated multi-year radar scan dataset (TAASRAD19) is introduced, containing more than 350,000 labelled precipitation records over 10 years, to provide a baseline benchmark and foster reproducibility of machine learning modeling. A TrajGRU model is applied to TAASRAD19 and implemented in an operational prototype. The thesis also introduces a novel method for fast analog search based on manifold learning: the tool leverages the entire dataset history in less than 5 seconds and demonstrates the feasibility of predictive ensembles. In the final part of the thesis, the new deep learning architecture ConvSG, based on stacked generalization, is presented, introducing novel concepts for deep learning in precipitation nowcasting: ConvSG is specifically designed to improve predictions of extreme precipitation regimes over published methods, and shows a 117% skill improvement on extreme rain regimes over a single member. Moreover, ConvSG shows superior or equal skill compared to Lagrangian Extrapolation models for all rain rates, achieving a 49% average improvement in predictive skill over extrapolation on the higher precipitation regimes.
9

Ekström, Linus, and Andreas Augustsson. "A comperative study of text classification models on invoices : The feasibility of different machine learning algorithms and their accuracy." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-15647.

Abstract:
Text classification for companies is becoming more important in a world where an increasing amount of digital data is made available. The aim is to research whether five different machine learning algorithms can be used to automate the process of classification of invoice data and to see which one achieves the highest accuracy. The algorithms are combined at a later stage in an attempt to achieve better results. N-grams are used, and results are compared in the form of total classification accuracy for each algorithm. A Python library called scikit-learn, which implements the chosen algorithms, was used. Data is collected and generated to represent the data present on a real invoice from which data has been extracted. Results from this thesis show that it is possible to use machine learning for this type of problem. The highest scoring algorithm (LinearSVC from scikit-learn) classifies 86% of all samples correctly. This is 16 percentage points above the acceptable level of 70%.
10

Lundberg, Jacob. "Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems." Thesis, Linköpings universitet, Statistik och maskininlärning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013.

Abstract:
Combining embedded systems and machine learning models is an exciting prospect. However, to fully target any embedded system, with the most stringent resource requirements, the models have to be designed with care not to overwhelm it. Decision tree ensembles are targeted in this thesis. A benchmark model is created with LightGBM, a popular framework for gradient boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework. Then it is further optimized with quantization and weight sharing, techniques used when compressing neural networks. The entire process is combined into a novel framework, called ESRule. The data used comes from the domain of frequency measurements in cellular networks. There is a clear use-case where embedded systems can use the produced resource optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average, simultaneously increasing predictive performance. The models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard disk space than LightGBM. ESRule is also clearly faster at predicting a single sample.
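The general idea behind quantization with weight sharing, as named in this abstract, can be sketched as follows: many floating-point values are replaced by small integer indices into a short shared codebook. This is only a generic illustration with uniformly spaced levels, not the ESRule procedure; the function names and the `leaf_values` list are hypothetical.

```python
def quantize_with_shared_weights(values, n_levels):
    """Map each float to the nearest of n_levels evenly spaced shared values.

    Returns (codebook, indices): the shared values and, for every input,
    the index of the codebook entry that replaces it.
    """
    lo, hi = min(values), max(values)
    if n_levels < 2 or hi == lo:
        # Degenerate case: a single shared value for everything.
        return [lo], [0] * len(values)
    step = (hi - lo) / (n_levels - 1)
    codebook = [lo + i * step for i in range(n_levels)]
    indices = [round((v - lo) / step) for v in values]
    return codebook, indices

def dequantize(codebook, indices):
    """Recover approximate values by looking up each index in the codebook."""
    return [codebook[i] for i in indices]

# Hypothetical weights, e.g. the leaf values of a small tree ensemble.
leaf_values = [-0.82, -0.79, 0.01, 0.03, 0.40, 0.41, 0.95]

codebook, indices = quantize_with_shared_weights(leaf_values, n_levels=4)
restored = dequantize(codebook, indices)
max_error = max(abs(a - b) for a, b in zip(leaf_values, restored))
print(indices, max_error)
```

With four levels, each value needs only a 2-bit index plus one shared table of four floats, at the cost of an approximation error of at most half the spacing between levels.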