Scientific literature on the topic "Cost-sensitive classification"

Create an accurate reference in APA, MLA, Chicago, Harvard, and several other styles

Choose a source:

Consult the thematic lists of journal articles, books, theses, conference reports, and other academic sources on the topic "Cost-sensitive classification".

Next to each source in the list of references there is an "Add to bibliography" button. Click this button, and we will automatically generate the bibliographic reference for the chosen source in your preferred citation style: APA, MLA, Harvard, Vancouver, Chicago, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online when this information is included in the metadata.

Journal articles on the topic "Cost-sensitive classification"

1

Wang, Jialei, Peilin Zhao, and Steven C. H. Hoi. "Cost-Sensitive Online Classification". IEEE Transactions on Knowledge and Data Engineering 26, no. 10 (October 2014): 2425–38. http://dx.doi.org/10.1109/tkde.2013.157.

2

Zhang, Shichao. "Cost-sensitive KNN classification". Neurocomputing 391 (May 2020): 234–42. http://dx.doi.org/10.1016/j.neucom.2018.11.101.

3

Zhao, Peilin, Yifan Zhang, Min Wu, Steven C. H. Hoi, Mingkui Tan, and Junzhou Huang. "Adaptive Cost-Sensitive Online Classification". IEEE Transactions on Knowledge and Data Engineering 31, no. 2 (February 1, 2019): 214–28. http://dx.doi.org/10.1109/tkde.2018.2826011.

4

Cebe, Mumin, and Cigdem Gunduz-Demir. "Qualitative test-cost sensitive classification". Pattern Recognition Letters 31, no. 13 (October 2010): 2043–51. http://dx.doi.org/10.1016/j.patrec.2010.05.028.

5

Zhang, Shichao. "Cost-sensitive classification with respect to waiting cost". Knowledge-Based Systems 23, no. 5 (July 2010): 369–78. http://dx.doi.org/10.1016/j.knosys.2010.01.008.

6

Pendharkar, Parag C. "Linear models for cost-sensitive classification". Expert Systems 32, no. 5 (June 5, 2015): 622–36. http://dx.doi.org/10.1111/exsy.12114.

7

Ji, Shihao, and Lawrence Carin. "Cost-sensitive feature acquisition and classification". Pattern Recognition 40, no. 5 (May 2007): 1474–85. http://dx.doi.org/10.1016/j.patcog.2006.11.008.

8

Yang, Yi, Yuxuan Guo, and Xiangyu Chang. "Angle-based cost-sensitive multicategory classification". Computational Statistics & Data Analysis 156 (April 2021): 107107. http://dx.doi.org/10.1016/j.csda.2020.107107.

9

Tapkan, Pınar, Lale Özbakır, Sinem Kulluk, and Adil Baykasoğlu. "A cost-sensitive classification algorithm: BEE-Miner". Knowledge-Based Systems 95 (March 2016): 99–113. http://dx.doi.org/10.1016/j.knosys.2015.12.010.

10

Wang, Tao, Zhenxing Qin, Shichao Zhang, and Chengqi Zhang. "Cost-sensitive classification with inadequate labeled data". Information Systems 37, no. 5 (July 2012): 508–16. http://dx.doi.org/10.1016/j.is.2011.10.009.

More sources

Theses on the topic "Cost-sensitive classification"

1

Dachraoui, Asma. "Cost-Sensitive Early Classification of Time Series". Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLA002/document.

Abstract:
Early classification of time series is becoming an increasingly valuable task for assisting decision making in many application domains. In this setting, information can be gained by waiting for more evidence to arrive, helping to make better decisions that incur lower misclassification costs; meanwhile, the cost associated with delaying the decision generally increases, making the decision less attractive. Making early yet accurate predictions therefore requires solving an optimization problem that combines two competing costs. This thesis introduces a new general framework for early classification of time series. Unlike classical approaches, which implicitly assume that all misclassification errors cost the same and that the cost of delaying the decision is constant over time, we cast the problem as a cost-sensitive online decision-making problem in which delaying the decision is costly. We then propose a new formal criterion, along with two approaches that estimate the optimal decision time for a new, incoming, and still incomplete time series. In particular, they capture the evolution of typical complete time series in the training set through a segmentation technique that forms meaningful groups, and leverage this information to estimate the costs for all future time steps at which data points are still missing. These approaches are interesting in two ways: (i) they estimate, online, the earliest time in the future at which a minimization of the criterion can be expected, thus going beyond classical approaches that myopically decide at each time step whether to make a decision or to postpone it by one more step, and (ii) they are adaptive, in that the properties of the incoming time series are taken into account when deciding the optimal time to output a prediction. Results of extensive experiments on synthetic and real data sets show that both approaches successfully meet the behaviors expected of early classification systems.
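To make the quality/earliness trade-off concrete, here is a minimal Python sketch of a criterion of the kind described above: the estimated cost of deciding at a future time t adds a time-dependent waiting cost to an estimated misclassification risk, and the decision is emitted at the time that minimizes this sum. The function names, the linear waiting cost and the decaying risk model are illustrative assumptions, not the algorithms proposed in the thesis.

import math

# Illustrative sketch only: choose when to classify a partially observed series
# by scanning future time steps and minimizing estimated total cost.
def expected_total_cost(t, misclassification_risk, delay_cost_per_step):
    # Estimated cost of deciding at time t: error risk at t plus the cost of having waited.
    return misclassification_risk(t) + delay_cost_per_step * t

def best_decision_time(horizon, misclassification_risk, delay_cost_per_step=0.01):
    # Non-myopic rule: evaluate all future steps, not just "decide now vs. wait one step".
    return min(range(1, horizon + 1),
               key=lambda t: expected_total_cost(t, misclassification_risk, delay_cost_per_step))

# Toy risk model: accuracy improves as more of the series is observed.
risk = lambda t: 0.5 * math.exp(-t / 20)
print(best_decision_time(horizon=100, misclassification_risk=risk))   # interior optimum around t = 18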
2

Marques, Daniel dos Santos. "A Decision Tree Learner for Cost-Sensitive Binary Classification". Pontifícia Universidade Católica do Rio de Janeiro, 2016. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=28239@1.

Abstract:
PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
Classification problems have been widely studied in the machine learning literature, generating applications in several areas. However, in a number of scenarios, misclassification costs can vary substantially, which motivates the study of cost-sensitive learning techniques. In the present work, we discuss the use of decision trees on the more general Example-Dependent Cost-Sensitive Problem (EDCSP), where misclassification costs vary with each example. One of the main advantages of decision trees is that they are easy to interpret, which is a highly desirable property in a number of applications. We propose a new attribute selection method for constructing decision trees for the EDCSP and discuss how it can be implemented efficiently. Finally, we compare our new method with two other decision tree algorithms recently proposed in the literature, on 3 publicly available datasets.
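As a rough illustration of the example-dependent setting (a sketch under assumed details, not the attribute-selection method proposed in the thesis): each training example carries its own misclassification cost, and a candidate split can be scored by the total cost that remains when each child node predicts its cheapest class.

# Illustrative sketch of example-dependent cost-sensitive split scoring.
def leaf_cost(examples):
    # Cost of a node that predicts the class minimizing total example-dependent cost.
    # Each example is a (label, cost_if_misclassified) pair.
    labels = set(label for label, _ in examples)
    totals = {candidate: sum(cost for label, cost in examples if label != candidate)
              for candidate in labels}
    return min(totals.values())

def split_score(left, right):
    # Lower is better: total cost remaining after the split.
    return leaf_cost(left) + leaf_cost(right)

examples = [("fraud", 50.0), ("legit", 1.0), ("legit", 1.0), ("fraud", 200.0)]
print(split_score(examples[:2], examples[2:]))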
3

Bakshi, Arjun. "Methodology For Generating High-Confidence Cost-Sensitive Rules For Classification". University of Cincinnati / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1377868085.

4

Kamath, Vidya P. "Enhancing Gene Expression Signatures in Cancer Prediction Models: Understanding and Managing Classification Complexity". Scholar Commons, 2010. http://scholarcommons.usf.edu/etd/3653.

Abstract:
Cancer can develop through a series of genetic events in combination with external influential factors that alter the progression of the disease. Gene expression studies are designed to provide an enhanced understanding of the progression of cancer and to develop clinically relevant biomarkers of disease, prognosis and response to treatment. One of the main aims of microarray gene expression analyses is to develop signatures that are highly predictive of specific biological states, such as the molecular stage of cancer. This dissertation analyzes the classification complexity inherent in gene expression studies, proposing both techniques for measuring complexity and algorithms for reducing this complexity. Classifier algorithms that generate predictive signatures of cancer models must generalize to independent datasets for successful translation to clinical practice. The predictive performance of classifier models is shown to be dependent on the inherent complexity of the gene expression data. Three specific quantitative measures of classification complexity are proposed, and one measure (f) is shown to correlate highly (R² = 0.82) with classifier accuracy in experimental data. Three quantization methods are proposed to enhance contrast in gene expression data and reduce classification complexity. The accuracy of cancer prognosis prediction is shown to improve using quantization in the two datasets studied: from 67% to 90% in lung cancer and from 56% to 68% in colorectal cancer. A corresponding reduction in classification complexity is also observed. A random subspace based multivariable feature selection approach using cost-sensitive analysis is proposed to model the underlying heterogeneous cancer biology and to address complexity due to multiple molecular pathways and the unbalanced distribution of samples into classes. The technique is shown to be more accurate than the univariate t-test method, improving classifier accuracy from 56% to 68% for colorectal cancer prognosis prediction. A published gene expression signature to predict radiosensitivity of tumor cells is augmented with clinical indicators to enhance modeling of the data and represent the underlying biology more closely. Statistical tests and experiments indicate that the improvement in model fit is a result of modeling the underlying biology rather than statistical over-fitting of the data, thereby accommodating classification complexity through the use of additional variables.
5

Julock, Gregory Alan. "The Effectiveness of a Random Forests Model in Detecting Network-Based Buffer Overflow Attacks". NSUWorks, 2013. http://nsuworks.nova.edu/gscis_etd/190.

Abstract:
Buffer overflows are a common type of network intrusion attack that continues to plague the networked community. Unfortunately, this type of attack is not well detected with current data mining algorithms. This research investigated the use of Random Forests, an ensemble technique that creates multiple decision trees and then votes for the best tree. It examined Random Forests' effectiveness in detecting buffer overflows compared to other data mining methods such as CART and Naïve Bayes. Random Forests was used for variable reduction, cost-sensitive classification was applied, and each method's detection performance was compared and reported along with the receiver operating characteristics. The experiment showed that Random Forests outperformed CART and Naïve Bayes in classification performance. Using a technique to identify the most important variables for buffer overflow detection, Random Forests was also able to further improve its classification performance.
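A minimal sketch of how cost-sensitive classification is commonly layered onto a random forest (an illustration of the general technique with assumed costs and synthetic data, not the study's actual experimental setup): class weights proportional to misclassification costs bias the ensemble toward the rare, expensive attack class.

# Illustrative sketch: cost-sensitive random forest via class weights (scikit-learn).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for network traffic (class 1 = attack).
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Assumed cost ratio: missing an attack is 20 times worse than a false alarm.
clf = RandomForestClassifier(n_estimators=200, class_weight={0: 1, 1: 20}, random_state=0)
clf.fit(X_tr, y_tr)
print(confusion_matrix(y_te, clf.predict(X_te)))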
6

Makki, Sara. "An Efficient Classification Model for Analyzing Skewed Data to Detect Frauds in the Financial Sector". Thesis, Lyon, 2019. http://www.theses.fr/2019LYSE1339/document.

Abstract:
Different types of risks exist in the financial domain, such as terrorist financing, money laundering, credit card fraud and insurance fraud, which may result in catastrophic consequences for entities such as banks or insurance companies. These financial risks are usually detected using classification algorithms. In classification problems, the skewed distribution of classes, also known as class imbalance, is a very common challenge in financial fraud detection, where special data mining approaches are used alongside traditional classification algorithms to tackle the issue. The class imbalance problem occurs when one of the classes has many more instances than the other, and it is even more acute in a big data context: the datasets used to build and train the models contain an extremely small portion of the minority group, known as positives, in comparison to the majority class, known as negatives. In most cases it is more delicate and crucial to correctly classify the minority group, as in fraud detection or disease diagnosis; in these examples the fraud and the disease are the minority groups, and missing a fraud record has more dangerous consequences than misclassifying a normal one. These class proportions make it very difficult for a machine learning classifier to learn the characteristics and patterns of the minority group: classifiers are biased towards the majority group because of its many examples in the dataset and learn to classify it much faster than the other group. After conducting a thorough study of the challenges faced in class imbalance cases, we found that an acceptable sensitivity (i.e. good classification of the minority group) still cannot be reached without a significant decrease in accuracy. This leads to another challenge, the choice of performance measures used to evaluate models: accuracy or sensitivity alone are misleading, so measures such as the precision-recall curve or the F1-score are used to evaluate the trade-off between accuracy and sensitivity. Our objective is to build an imbalanced classification model that accounts for extreme class imbalance and false alarms in a big data framework. We developed two approaches: a Cost-Sensitive Cosine Similarity K-Nearest Neighbor (CoSKNN) as a single classifier, and a K-modes Imbalance Classification Hybrid Approach (K-MICHA) as an ensemble learning methodology. In CoSKNN, the aim is to tackle the imbalance problem by using cosine similarity as a distance metric and by introducing a cost-sensitive score into the KNN classification. We conducted a comparative validation experiment in which we demonstrate the effectiveness of CoSKNN in terms of accuracy and fraud detection. K-MICHA, on the other hand, clusters similar data points in terms of the classifiers' outputs, then computes the fraud probability in each obtained cluster and uses these probabilities to detect frauds in new transactions. This approach can be used to detect any type of financial fraud where labelled data are available. K-MICHA was applied to credit card, mobile payment and auto insurance fraud data sets. In all three case studies, we compare K-MICHA with stacking using voting, weighted voting, logistic regression and CART, as well as with AdaBoost and random forest, and we demonstrate the efficiency of K-MICHA on the basis of these experiments. We also applied K-MICHA in a big data framework using H2O and R, which allowed much larger data sets to be processed and analyzed in very little time.
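A rough sketch of a cost-weighted cosine-similarity nearest-neighbour rule in the spirit of CoSKNN (an illustration under assumed details; the exact scoring function is defined in the thesis): neighbours vote with their cosine similarity, and votes for the minority fraud class are re-weighted by its misclassification cost.

import numpy as np

# Illustrative sketch only; not the thesis's exact CoSKNN score.
def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def cost_weighted_knn_predict(x, X_train, y_train, k=5, fraud_cost=10.0):
    sims = np.array([cosine_similarity(x, xi) for xi in X_train])
    nn = np.argsort(-sims)[:k]                                   # k most similar training points
    fraud_vote = sum(sims[i] * fraud_cost for i in nn if y_train[i] == 1)
    legit_vote = sum(sims[i] for i in nn if y_train[i] == 0)
    return 1 if fraud_vote > legit_vote else 0

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 8))
y_train = (rng.random(200) < 0.05).astype(int)                   # roughly 5% minority class
print(cost_weighted_knn_predict(rng.normal(size=8), X_train, y_train))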
7

Charnay, Clément. "Enhancing supervised learning with complex aggregate features and context sensitivity". Thesis, Strasbourg, 2016. http://www.theses.fr/2016STRAD025/document.

Abstract:
In this thesis, we study model adaptation in supervised learning. Firstly, we adapt existing learning algorithms to the relational representation of data. Secondly, we adapt learned prediction models to context change. In the relational setting, data is modeled by multiple entities linked by relationships. We handle these relationships using complex aggregate features, and propose stochastic optimization heuristics to include complex aggregates in relational decision trees and random forests, assessing their predictive performance on real-world datasets. We then adapt prediction models to two kinds of context change. First, we propose an algorithm to tune thresholds on pairwise scoring models to adapt to a change of misclassification costs. Second, we reframe numerical attributes with affine transformations to adapt to a change of attribute distribution between a learning and a deployment context. Finally, we extend these transformations to complex aggregates.
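For the cost-change adaptation mentioned above, a minimal sketch of the standard decision-theoretic threshold adjustment (stated as the general technique, not as the tuning algorithm proposed in the thesis): when a model's score is read as P(y = 1 | x), the cost-optimal threshold moves as the relative costs of false positives and false negatives change.

# Illustrative sketch: re-thresholding a probabilistic scorer when misclassification costs change.
def cost_optimal_threshold(cost_false_positive, cost_false_negative):
    # Bayes-optimal threshold on P(y=1|x) when correct predictions cost nothing.
    return cost_false_positive / (cost_false_positive + cost_false_negative)

def predict(prob_positive, c_fp=1.0, c_fn=5.0):
    return int(prob_positive >= cost_optimal_threshold(c_fp, c_fn))

print(cost_optimal_threshold(1.0, 5.0))   # about 0.17: a costlier false negative lowers the threshold
print(predict(0.3))                       # 1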
8

Lo, Hung-Yi (駱宏毅). "Cost-Sensitive Multi-Label Classification with Applications". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/61015886145358618517.

Abstract:
Doctoral dissertation, National Taiwan University, Graduate Institute of Computer Science and Information Engineering, academic year 101 (ROC calendar)
We study a generalization of traditional multi-label classification, which we refer to as cost-sensitive multi-label classification (CSML). In this problem, the misclassification cost can be different for each instance-label pair. To solve the problem, we propose two novel and general strategies based on the problem transformation technique. The proposed strategies transform the CSML problem into several cost-sensitive single-label classification problems. In addition, we propose a basis expansion model for CSML, which we call the Generalized k-Labelsets Ensemble (GLE). In the basis expansion model, a basis function is a label powerset classifier trained on a random k-labelset. The expansion coefficients are learned by minimizing the cost-weighted global error between the prediction and the ground truth. GLE can also be used for traditional multi-label classification. Experimental results on both multi-label classification and cost-sensitive multi-label classification demonstrate that our method performs better than other methods. Cost-sensitive classification is based on the assumption that the cost is given according to the application, so "where does cost come from?" is an important practical issue. We study two real-world prediction tasks and link their data distribution to the cost information. The two tasks are medical image classification and social tag prediction. In medical image classification, we observe a patient-imbalanced phenomenon that seriously hurts the generalization ability of the image classifier. We design several patient-balanced learning algorithms based on cost-sensitive binary classification; their success was demonstrated by winning KDD Cup 2008. For social tag prediction, we propose to treat the tag counts as the misclassification costs and model the social tagging problem as a cost-sensitive multi-label classification problem. The experimental results in audio tag annotation and retrieval demonstrate that the CSML approaches outperform our winning method in the Music Information Retrieval Evaluation eXchange (MIREX) 2009 in terms of both cost-sensitive and cost-less evaluation metrics. The results on social bookmark prediction also demonstrate that our proposed method performs better than other methods.
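As a pointer to what "cost per instance-label pair" means in practice, here is a small sketch of one simple problem transformation (a binary-relevance-style decomposition given as an illustration; it is not the two strategies or the GLE model developed in the thesis): each label yields a binary problem in which the pair's cost becomes an example weight.

import numpy as np

def to_cost_sensitive_binary_problems(Y, C):
    # Y: (n, L) 0/1 label matrix; C: (n, L) misclassification cost for each instance-label pair.
    # Returns one weighted binary problem per label: (targets, example_weights).
    return [(Y[:, l], C[:, l]) for l in range(Y.shape[1])]

Y = np.array([[1, 0, 1], [0, 1, 0]])
C = np.array([[5.0, 1.0, 2.0], [1.0, 3.0, 1.0]])
for targets, weights in to_cost_sensitive_binary_problems(Y, C):
    print(targets, weights)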
9

Sun, Yanmin. "Cost-Sensitive Boosting for Classification of Imbalanced Data". Thesis, 2007. http://hdl.handle.net/10012/3000.

Abstract:
The classification of data with imbalanced class distributions has posed a significant drawback in the performance attainable by most well-developed classification systems, which assume relatively balanced class distributions. This problem is especially crucial in many application domains, such as medical diagnosis, fraud detection and network intrusion, which are of great importance in machine learning and data mining. This thesis explores meta-techniques that are applicable to most classifier learning algorithms, with the aim of advancing the classification of imbalanced data. Boosting is a powerful meta-technique for learning an ensemble of weak models with a promise of improving classification accuracy, and AdaBoost is regarded as the most successful boosting algorithm. The thesis starts by applying AdaBoost to an associative classifier for both learning time reduction and accuracy improvement. However, the promise of accuracy improvement is trivial in the context of the class imbalance problem, where accuracy is less meaningful. The insight gained from a comprehensive analysis of AdaBoost's boosting strategy leads to the investigation of cost-sensitive boosting algorithms, developed by introducing cost items into the learning framework of AdaBoost. The cost items denote the uneven identification importance among classes, so that the boosting strategies can intentionally bias the learning towards classes associated with higher identification importance and eventually improve the identification performance on them. In a given application domain, cost values for different types of samples are usually unavailable; to set effective cost values, empirical methods are used for bi-class applications and heuristic search with a genetic algorithm is employed for multi-class applications. The thesis also covers the implementation of the proposed cost-sensitive boosting algorithms and ends with a discussion of the experimental results on the classification of real-world imbalanced data. Compared with existing algorithms, the new algorithms presented in this thesis achieve better measurements with respect to the learning objectives.
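A minimal sketch of the general idea of introducing cost items into AdaBoost's weight update (an illustration with one possible placement of the cost term; the thesis analyzes several specific variants, which this does not reproduce): misclassified examples from the costlier class have their weights inflated more strongly, so later weak learners focus on them.

import numpy as np

def cost_boost_round(weights, y_true, y_pred, costs):
    # One boosting round: weighted error, the weak learner's vote, and cost-scaled weight updates.
    err = np.clip(np.sum(weights * (y_pred != y_true)) / np.sum(weights), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)                    # weak learner's vote, as in AdaBoost
    mistakes = (y_pred != y_true).astype(float)
    weights = weights * np.exp(alpha * mistakes * costs)     # cost item placed inside the exponent
    return weights / weights.sum(), alpha

w = np.full(4, 0.25)
w, alpha = cost_boost_round(w, np.array([1, 1, 0, 0]), np.array([1, 0, 0, 0]), np.array([5.0, 5.0, 1.0, 1.0]))
print(w, alpha)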
10

Tu, Han-Hsing (涂漢興). "Regression approaches for multi-class cost-sensitive classification". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/79841686006299558588.

Abstract:
Master's thesis, National Taiwan University, Graduate Institute of Computer Science and Information Engineering, academic year 97 (ROC calendar)
Cost-sensitive classification is an important research problem in recent years. It allows machine learning algorithms to use additional cost information to make more strategic decisions. Studies on binary cost-sensitive classification have led to promising results in theories, algorithms, and applications. The multi-class counterpart is also needed in many real-world applications, but is more difficult to analyze. This thesis focuses on multi-class cost-sensitive classification. Existing methods for multi-class cost-sensitive classification usually transform the cost information into example importance (weight). This thesis offers a different viewpoint on the problem and proposes a novel method: we directly estimate the cost of each possible prediction using regression, and output the label with the smallest estimated cost. We improve the method by analyzing the errors made during the decision, and propose a different regression loss function that is tightly connected to those errors. The new loss function leads to a solid theoretical guarantee of error transformation. We design a concrete algorithm for the loss function based on support vector machines; it can be viewed as a theoretically justified extension of the popular one-versus-all support vector machine. Experiments using real-world data sets with arbitrary cost values demonstrate the usefulness of our proposed methods, and validate that the cost information should be appropriately used instead of dropped.
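A minimal sketch of the regression viewpoint described above (illustrative, with assumed ridge regressors and random costs; the thesis's actual method uses a specially designed loss and a one-versus-all SVM formulation): one regressor per class learns to predict the cost of predicting that class, and the label with the smallest estimated cost is output.

import numpy as np
from sklearn.linear_model import Ridge

def fit_cost_regressors(X, costs):
    # costs: (n, K) array; entry [i, k] is the cost of predicting class k for example i.
    return [Ridge(alpha=1.0).fit(X, costs[:, k]) for k in range(costs.shape[1])]

def predict_min_cost(regressors, X):
    estimated = np.column_stack([r.predict(X) for r in regressors])
    return estimated.argmin(axis=1)                  # output the label with the smallest estimated cost

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
costs = rng.uniform(0.0, 1.0, size=(100, 3))         # toy instance-dependent costs for 3 classes
models = fit_cost_regressors(X, costs)
print(predict_min_cost(models, X[:5]))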
More sources

Book chapters on the topic "Cost-sensitive classification"

1

Shultz, Thomas R., Scott E. Fahlman, Susan Craw, Periklis Andritsos, Panayiotis Tsaparas, Ricardo Silva, Chris Drummond, et al. "Cost-Sensitive Classification". In Encyclopedia of Machine Learning, 231. Boston, MA: Springer US, 2011. http://dx.doi.org/10.1007/978-0-387-30164-8_180.

2

Roychoudhury, Shoumik, Mohamed Ghalwash, and Zoran Obradovic. "Cost Sensitive Time-Series Classification". In Machine Learning and Knowledge Discovery in Databases, 495–511. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-71246-8_30.

3

Qin, Zhenxing, Chengqi Zhang, Tao Wang, and Shichao Zhang. "Cost Sensitive Classification in Data Mining". In Advanced Data Mining and Applications, 1–11. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-17316-5_1.

4

Mitrokotsa, Aikaterini, Christos Dimitrakakis, and Christos Douligeris. "Intrusion Detection Using Cost-Sensitive Classification". In Proceedings of the 3rd European Conference on Computer Network Defense, 35–47. Boston, MA: Springer US, 2009. http://dx.doi.org/10.1007/978-0-387-85555-4_3.

5

Qin, Zhenxing, Alan Tao Wang, Chengqi Zhang, and Shichao Zhang. "Cost-Sensitive Classification with k-Nearest Neighbors". In Knowledge Science, Engineering and Management, 112–31. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-39787-5_10.

6

Iša, Jiří, Zuzana Reitermanová, and Ondřej Sýkora. "Cost-Sensitive Classification with Unconstrained Influence Diagrams". In SOFSEM 2012: Theory and Practice of Computer Science, 625–36. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-27660-6_51.

7

Wang, Yu, and Nan Wang. "Study on an Extreme Classification of Cost-Sensitive Classification Algorithm". In Advances in Intelligent Systems and Computing, 1772–82. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-2568-1_250.

8

Davis, Jason V., Jungwoo Ha, Christopher J. Rossbach, Hany E. Ramadan, and Emmett Witchel. "Cost-Sensitive Decision Tree Learning for Forensic Classification". In Lecture Notes in Computer Science, 622–29. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11871842_60.

9

Kotsiantis, Sotiris B., and Panagiotis E. Pintelas. "A Cost Sensitive Technique for Ordinal Classification Problems". In Methods and Applications of Artificial Intelligence, 220–29. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-540-24674-9_24.

10

Margineantu, Dragos D. "Class Probability Estimation and Cost-Sensitive Classification Decisions". In Lecture Notes in Computer Science, 270–81. Berlin, Heidelberg: Springer Berlin Heidelberg, 2002. http://dx.doi.org/10.1007/3-540-36755-1_23.


Conference papers on the topic "Cost-sensitive classification"

1

Wang, Jialei, Peilin Zhao, and Steven C. H. Hoi. "Cost-Sensitive Online Classification". In 2012 IEEE 12th International Conference on Data Mining (ICDM). IEEE, 2012. http://dx.doi.org/10.1109/icdm.2012.116.

2

Schaefer, Gerald, Bartosz Krawczyk, Niraj P. Doshi, and Tomoharu Nakashima. "Cost-sensitive texture classification". In 2014 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2014. http://dx.doi.org/10.1109/cec.2014.6900500.

3

Tur, Gokhan. "Cost-sensitive call classification". In Interspeech 2004. ISCA: ISCA, 2004. http://dx.doi.org/10.21437/interspeech.2004-41.

4

Liu, Zhenbing, Chunyang Gao, Huihua Yang, and Qijia He. "Cost-sensitive sparse representation based classification". In 2016 4th International Conference on Cloud Computing and Intelligence Systems (CCIS). IEEE, 2016. http://dx.doi.org/10.1109/ccis.2016.7790248.

5

Ali, Alnur, and Kevyn Collins-Thompson. "Robust Cost-Sensitive Confidence-Weighted Classification". In 2013 IEEE 13th International Conference on Data Mining Workshops (ICDMW). IEEE, 2013. http://dx.doi.org/10.1109/icdmw.2013.108.

6

Nan, Feng, Joseph Wang, Kirill Trapeznikov, and Venkatesh Saligrama. "Fast margin-based cost-sensitive classification". In ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014. http://dx.doi.org/10.1109/icassp.2014.6854141.

7

Schaefer, G., T. Nakashima, Y. Yokota, and H. Ishibuchi. "Cost-Sensitive Fuzzy Classification for Medical Diagnosis". In 2007 4th Symposium on Computational Intelligence in Bioinformatics and Computational Biology. IEEE, 2007. http://dx.doi.org/10.1109/cibcb.2007.4221238.

8

Azab, Ahmad, Robert Layton, Mamoun Alazab, and Paul Watters. "Skype Traffic Classification Using Cost Sensitive Algorithms". In 2013 Fourth Cybercrime and Trustworthy Computing Workshop (CTC). IEEE, 2013. http://dx.doi.org/10.1109/ctc.2013.11.

9

O'Brien, Deirdre B., Maya R. Gupta, and Robert M. Gray. "Cost-sensitive multi-class classification from probability estimates". In the 25th International Conference. New York, New York, USA: ACM Press, 2008. http://dx.doi.org/10.1145/1390156.1390246.

10

Bakshi, Arjun, and Raj Bhatnagar. "Learning Cost-Sensitive Rules for Non-forced Classification". In 2012 IEEE 12th International Conference on Data Mining Workshops. IEEE, 2012. http://dx.doi.org/10.1109/icdmw.2012.62.


Organization reports on the topic "Cost-sensitive classification"

1

Bonfil, David J., Daniel S. Long, and Yafit Cohen. Remote Sensing of Crop Physiological Parameters for Improved Nitrogen Management in Semi-Arid Wheat Production Systems. United States Department of Agriculture, January 2008. http://dx.doi.org/10.32747/2008.7696531.bard.

Abstract:
To reduce financial risk and N losses to the environment, fertilization methods are needed that improve NUE and increase the quality of wheat. In the literature, ample attention is given to grid-based and zone-based soil testing to determine the soil N available early in the growing season. Plus, information is available on in-season N topdressing applications as a means of improving GPC. However, the vast majority of research has focused on wheat that is grown under N limiting conditions in sub-humid regions and irrigated fields. Less attention has been given to wheat in dryland that is water limited. The objectives of this study were to: (1) determine accuracy in determining GPC of HRSW in Israel and SWWW in Oregon using on-combine optical sensors under field conditions; (2) develop a quantitative relationship between image spectral reflectance and effective crop physiological parameters; (3) develop an operational precision N management procedure that combines variable-rate N recommendations at planting as derived from maps of grain yield, GPC, and test weight; and at mid-season as derived from quantitative relationships, remote sensing, and the DSS; and (4) address the economic and technology-transfer aspects of producers’ needs. Results from the research suggest that optical sensing and the DSS can be used for estimating the N status of dryland wheat and deciding whether additional N is needed to improve GPC. Significant findings include: 1. In-line NIR reflectance spectroscopy can be used to rapidly and accurately (SEP <5.0 mg g⁻¹) measure GPC of a grain stream conveyed by an auger. 2. On-combine NIR spectroscopy can be used to accurately estimate (R² < 0.88) grain test weight across fields. 3. Precision N management based on N removal increases GPC, grain yield, and profitability in rainfed wheat. 4. Hyperspectral SI and partial least squares (PLS) models have excellent potential for estimation of biomass, and water and N contents of wheat. 5. A novel heading index can be used to monitor spike emergence of wheat with classification accuracy between 53 and 83%. 6. Index MCARI/MTVI2 promises to improve remote sensing of wheat N status where water- not soil N fertility, is the main driver of plant growth. Important features include: (a) computable from commercial aerospace imagery that include the red edge waveband, (b) sensitive to Chl and resistant to variation in crop biomass, and (c) accommodates variation in soil reflectance. Findings #1 and #2 above enable growers to further implement an efficient, low cost PNM approach using commercially available on-combine optical sensors. Finding #3 suggests that profit opportunities may exist from PNM based on information from on-combine sensing and aerospace remote sensing. Finding #4, with its emphasis on data retrieval and accuracy, enhances the potential usefulness of a DSS as a tool for field crop management. Finding #5 enables land managers to use a DSS to ascertain at mid-season whether a wheat crop should be harvested for grain or forage. Finding #6a expands potential commercial opportunities of MS imagery and thus has special importance to a majority of aerospace imaging firms specializing in the acquisition and utilization of these data. Finding #6b on index MCARI/MVTI2 has great potential to expand use of ground-based sensing and in-season N management to millions of hectares of land in semiarid environments where water- not N, is the main determinant of grain yield. 
Finding #6c demonstrates that MCARI/MTVI2 may alleviate the requirement of multiple N-rich reference strips to account for soil differences within farm fields. This simplicity will be less demanding of grower resources, promising substantially greater acceptance of sensing technologies for in-season N management.