Academic literature on the topic 'Classification binaire supervisée'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Classification binaire supervisée.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Journal articles on the topic "Classification binaire supervisée"

1

Chehata, Nesrine, Karim Ghariani, Arnaud Le Bris, and Philippe Lagacherie. "Apport des images pléiades pour la délimitation des parcelles agricoles à grande échelle." Revue Française de Photogrammétrie et de Télédétection, no. 209 (January 29, 2015): 165–71. http://dx.doi.org/10.52638/rfpt.2015.220.

Abstract:
Farming practices and the spatial arrangement of agricultural parcels have a strong impact on water flows in cultivated landscapes. To monitor landscapes at a large scale, there is a strong need for automatic or semi-automatic delineation of agricultural parcels. This article shows the contribution of very-high-spatial-resolution satellite images, such as Pléiades, to delineating agricultural parcels automatically. We propose an original approach using a supervised binary classification of parcel boundaries. An active-learning approach is proposed to adapt the classifier model to the local context, thus enabling large-scale parcel delineation. A Random Forest classifier is used for classification and feature selection. The concept of unsupervised margin is used as an uncertainty measure in the active-learning algorithm. In addition, an automatic labelling of uncertain pixels is proposed, using a hybrid approach that combines a region-based approach with the margin concept. Satisfactory results are obtained on a Pléiades image. Different learning strategies are compared and discussed. For an operational case study, either a global model or an enriched simple model can be used, depending on the available ground data.
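The margin-based active-learning loop described in this abstract can be sketched roughly as follows; this is a hypothetical illustration using scikit-learn and synthetic data, not the authors' implementation (which works on Pléiades image pixels):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic two-class data standing in for boundary / non-boundary pixels.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)

# Small stratified initial labeled pool: 10 samples per class.
labeled = list(np.where(y == 0)[0][:10]) + list(np.where(y == 1)[0][:10])

for _ in range(5):                              # five active-learning rounds
    rf = RandomForestClassifier(n_estimators=100, random_state=0)
    rf.fit(X[labeled], y[labeled])
    proba = rf.predict_proba(X)[:, 1]
    margin = np.abs(proba - 0.5)                # small margin = uncertain
    ranked = np.argsort(margin)                 # most uncertain first
    queried = [int(i) for i in ranked if i not in labeled][:10]
    labeled.extend(queried)                     # "query" labels for them

acc = rf.score(X, y)
```

Each round retrains the forest and queries labels for the samples whose predicted probability is closest to 0.5, i.e. the smallest margin.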
2

Stausberg, J., and D. Nasseh. "Evaluation of a Binary Semi-supervised Classification Technique for Probabilistic Record Linkage." Methods of Information in Medicine 55, no. 02 (2016): 136–43. http://dx.doi.org/10.3414/me14-01-0087.

Abstract:
Background: The process of merging data from different data sources is referred to as record linkage. A medical environment with increased preconditions on privacy protection demands the transformation of clear-text attributes like first name or date of birth into one-way encrypted pseudonyms. When performing an automated or privacy-preserving record linkage, there might be the need for a binary classification deciding whether two records should be classified as the same entity. The classification is the final of the four main phases of the record linkage process: preprocessing, indexing, matching, and classification. The choice of binary classification techniques in dependence of project specifications, in particular data quality, has not been extensively studied yet.
Objectives: The aim of this work is the introduction and evaluation of an automatable semi-supervised binary classification system, applied within the field of record linkage, capable of competing with or even surpassing advanced automated techniques from the domain of unsupervised classification.
Methods: This work describes the rationale leading to the model and the final implementation of an automatable semi-supervised binary classification system, and the comparison of its classification performance to an advanced active-learning approach from the domain of unsupervised learning. The performance of both systems was measured on a broad variety of artificial test sets (n = 400), based on real patient data, with distinct and unique characteristics.
Results: While the classification performance of both methods, measured as F-measure, was relatively close on test sets with maximum defined data quality (0.996 for semi-supervised classification, 0.993 for unsupervised classification), it incrementally diverged for test sets of worse data quality, dropping to 0.964 for semi-supervised classification and 0.803 for unsupervised classification.
Conclusions: Aside from supplying a viable model for semi-supervised classification for automated probabilistic record linkage, the tests conducted on a large number of test sets suggest that semi-supervised techniques might generally be capable of outperforming unsupervised techniques, especially on data with lower levels of data quality.
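The semi-supervised binary classification evaluated by F-measure that this abstract describes can be approximated in outline as below; this is a generic self-training sketch on synthetic data (scikit-learn assumed), not the authors' record-linkage system:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Semi-supervised setting: hide ~80% of the training labels (-1 = unlabeled).
rng = np.random.RandomState(0)
y_semi = y_tr.copy()
y_semi[rng.rand(len(y_semi)) < 0.8] = -1

# Self-training wraps a probabilistic base classifier and iteratively
# pseudo-labels the unlabeled portion of the training set.
clf = SelfTrainingClassifier(SVC(probability=True, random_state=0))
clf.fit(X_tr, y_semi)
f1 = f1_score(y_te, clf.predict(X_te))   # F-measure, as in the paper's evaluation
```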
3

Arnason, R. M., P. Barmby, and N. Vulic. "Identifying new X-ray binary candidates in M31 using random forest classification." Monthly Notices of the Royal Astronomical Society 492, no. 4 (February 3, 2020): 5075–88. http://dx.doi.org/10.1093/mnras/staa207.

Abstract:
Identifying X-ray binary (XRB) candidates in nearby galaxies requires distinguishing them from possible contaminants including foreground stars and background active galactic nuclei. This work investigates the use of supervised machine learning algorithms to identify high-probability XRB candidates. Using a catalogue of 943 Chandra X-ray sources in the Andromeda galaxy, we trained and tested several classification algorithms using the X-ray properties of 163 sources with previously known types. Amongst the algorithms tested, we find that random forest classifiers give the best performance and work better in a binary classification (XRB/non-XRB) context compared to the use of multiple classes. Evaluating our method by comparing with classifications from visible-light and hard X-ray observations as part of the Panchromatic Hubble Andromeda Treasury, we find compatibility at the 90 per cent level, although we caution that the number of sources in common is rather small. The estimated probability that an object is an XRB agrees well between the random forest binary and multiclass approaches and we find that the classifications with the highest confidence are in the XRB class. The most discriminating X-ray bands for classification are the 1.7–2.8, 0.5–1.0, 2.0–4.0, and 2.0–7.0 keV photon flux ratios. Of the 780 unclassified sources in the Andromeda catalogue, we identify 16 new high-probability XRB candidates and tabulate their properties for follow-up.
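A minimal stand-in for the binary random-forest setup described above, assuming scikit-learn and synthetic features in place of the Chandra flux ratios:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# 163 labeled "sources" with 4 stand-in flux-ratio features.
X, y = make_classification(n_samples=163, n_features=4, n_informative=3,
                           n_redundant=1, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

rf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_tr, y_tr)
proba = rf.predict_proba(X_te)[:, 1]   # estimated probability of the "XRB" class
acc = rf.score(X_te, y_te)
```

The class-probability output (`predict_proba`) is what allows ranking unclassified sources and keeping only high-probability candidates, as the paper does.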
4

Hung, Cheng-An, and Sheng-Fuu Lin. "Supervised Adaptive Hamming Net for Classification of Multiple-Valued Patterns." International Journal of Neural Systems 08, no. 02 (April 1997): 181–200. http://dx.doi.org/10.1142/s0129065797000203.

Abstract:
A Supervised Adaptive Hamming Net (SAHN) is introduced for incremental learning of recognition categories in response to arbitrary sequence of multiple-valued or binary-valued input patterns. The binary-valued SAHN derived from the Adaptive Hamming Net (AHN) is functionally equivalent to a simplified ARTMAP, which is specifically designed to establish many-to-one mappings. The generalization to learning multiple-valued input patterns is achieved by incorporating multiple-valued logic into the AHN. In this paper, we examine some useful properties of learning in a P-valued SAHN. In particular, an upper bound is derived on the number of epochs required by the P-valued SAHN to learn a list of input-output pairs that is repeatedly presented to the architecture. Furthermore, we connect the P-valued SAHN with the binary-valued SAHN via the thermometer code.
5

Couellan, Nicolas. "A note on supervised classification and Nash-equilibrium problems." RAIRO - Operations Research 51, no. 2 (February 27, 2017): 329–41. http://dx.doi.org/10.1051/ro/2016024.

Abstract:
In this note, we investigate connections between supervised classification and (Generalized) Nash equilibrium problems (NEP & GNEP). For the specific case of support vector machines (SVM), we exploit the geometric properties of class separation in the dual space to formulate a non-cooperative game. NEP and Generalized NEP formulations are proposed for both binary and multi-class SVM problems.
6

Binol, Hamidullah, Huseyin Cukur, and Abdullah Bal. "A supervised discriminant subspaces-based ensemble learning for binary classification." International Journal of Advanced Computer Research 6, no. 27 (October 3, 2016): 209–14. http://dx.doi.org/10.19101/ijacr.2016.627008.

7

Kalakech, Mariam, Alice Porebski, Nicolas Vandenbroucke, and Denis Hamad. "Unsupervised Local Binary Pattern Histogram Selection Scores for Color Texture Classification." Journal of Imaging 4, no. 10 (September 28, 2018): 112. http://dx.doi.org/10.3390/jimaging4100112.

Abstract:
These last few years, several supervised scores have been proposed in the literature to select histograms. Applied to color texture classification problems, these scores have improved the accuracy by selecting the most discriminant histograms among a set of available ones computed from a color image. In this paper, two new scores are proposed to select histograms: The adapted Variance score and the adapted Laplacian score. These new scores are computed without considering the class label of the images, contrary to what is done until now. Experiments, achieved on OuTex, USPTex, and BarkTex sets, show that these unsupervised scores give as good results as the supervised ones for LBP histogram selection.
8

LU, Jia. "Semi-supervised binary classification algorithm based on global and local regularization." Journal of Computer Applications 32, no. 3 (April 1, 2013): 643–45. http://dx.doi.org/10.3724/sp.j.1087.2012.00643.

9

Süveges, M., F. Barblan, I. Lecoeur-Taïbi, A. Prša, B. Holl, L. Eyer, A. Kochoska, N. Mowlavi, and L. Rimoldini. "Gaia eclipsing binary and multiple systems. Supervised classification and self-organizing maps." Astronomy & Astrophysics 603 (July 2017): A117. http://dx.doi.org/10.1051/0004-6361/201629710.

10

Huang, Liang, Rui Xuan Li, Kun Mei Wen, and Xi Wu Gu. "A Self Training Semi-Supervised Truncated Kernel Projection Machine for Link Prediction." Advanced Materials Research 580 (October 2012): 369–73. http://dx.doi.org/10.4028/www.scientific.net/amr.580.369.

Abstract:
With the large amount of complex network data becoming available in the web, link prediction has become a popular research field of data mining. We focus on the link prediction task which can be formulated as a binary classification problem in social network. To treat this problem, a sparse semi-supervised classification algorithm called Self Training Semi-supervised Truncated Kernel Projection Machine (STKPM), based on empirical feature selection, is proposed for link prediction. Experimental results show that the proposed algorithm outperformed several outstanding learning algorithms with smaller test errors and more stability.

Dissertations / Theses on the topic "Classification binaire supervisée"

1

Monnier, Jean-Baptiste. "Quelques contributions en classification, régression et étude d'un problème inverse en finance." PhD thesis, Université Paris-Diderot - Paris VII, 2011. http://tel.archives-ouvertes.fr/tel-00650930.

Abstract:
We are interested in regression and classification problems and in an inverse problem in finance. We first address the regression problem with a random design taking values in a Euclidean space, whose law admits an unknown density. We show that an optimal estimation strategy can be built through localized projections on a multi-resolution analysis. This original method offers a computational advantage over the kernel estimation methods traditionally used in such a context. We also show that the plug-in classifier built on this new procedure is optimal. Moreover, it inherits the computational advantages mentioned above, which turns out to be a crucial asset in many applications. We then turn to the regression problem with a random design uniformly distributed on the hyper-sphere, and show how the tight frame of needlets allows traditional wavelet regression methods to be generalized to this new context. We finally consider the problem of estimating the risk-neutral density from option prices quoted on the markets. We exhibit an explicit singular-value decomposition of restricted pricing operators and show that it allows the construction of a risk-neutral density estimation method based on solving a simple quadratic program.
2

Wu, Nicholas (Nicholas T.). "Inductive logic programming with gradient descent for supervised binary classification." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/129926.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2020
Cataloged from student-submitted PDF of thesis.
Includes bibliographical references (pages 75-76).
As machine learning techniques have become more advanced, interpretability has become a major concern for models making important decisions. In contrast to Local Interpretable Model-Agnostic Explanations (LIME), this thesis seeks to develop an interpretable model using logical rules, rather than explaining existing blackbox models. We extend recent inductive logic programming methods developed by Evans and Grefenstette [3] to develop a gradient descent-based inductive logic programming technique for supervised binary classification. We start by developing our methodology for binary input data, and then extend the approach to numerical data using a threshold-gate based binarization technique. We test our implementations on datasets with varying pattern structures and noise levels, and select our best performing implementation. We then present an example where our method generates an accurate and interpretable rule set, whereas the LIME technique fails to generate a reasonable model. Further, we test our original methodology on the FICO Home Equity Line of Credit dataset. We run a hyperparameter search over differing numbers of rules and rule sizes. Our best performing model achieves a 71.7% accuracy, which is comparable to multilayer perceptron and randomized forest models. We conclude by suggesting directions for future applications and potential improvements.
3

Michel, Pierre. "Sélection d'items en classification non supervisée et questionnaires informatisés adaptatifs : applications à des données de qualité de vie liée à la santé." Thesis, Aix-Marseille, 2016. http://www.theses.fr/2016AIXM4097/document.

Abstract:
An adaptive test provides a valid measure of the quality of life of patients and reduces the number of items to be filled in. This approach depends on the models used, which are sometimes based on unverifiable assumptions. We propose an alternative approach based on decision trees. This approach is not based on any assumptions and requires less computation time for item administration. We present different simulations that demonstrate the relevance of our approach. We present an unsupervised classification method called CUBT. CUBT includes three steps to obtain an optimal partition of a data set. The first step grows a tree by recursively dividing the data set. The second step groups together pairs of terminal nodes of the tree. The third step aggregates terminal nodes that do not come from the same split. Different simulations are presented to compare CUBT with other approaches. We also define heuristics for the choice of CUBT parameters. CUBT identifies the variables that are active in the construction of the tree. However, although some variables may be irrelevant, they may be competitive with the active variables. It is essential to rank the variables according to an importance score to determine their relevance in a given model. We present a method to measure the importance of variables based on CUBT and competitive binary splits to define a variable importance score. We analyze the efficiency and stability of this new index, comparing it with other methods.
4

Gustafsson, Andreas. "Winner Prediction of Blood Bowl 2 Matches with Binary Classification." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20368.

Abstract:
Being able to predict the outcome of a game is useful in many aspects. To aid designers in the process of understanding how the game is played by the players, as well as how to balance the elements within the game, are two of those aspects. If one could predict the outcome of games with certainty, the design process could possibly be evolved into more of an experiment-based approach where one can observe cause and effect to some degree. It has previously been shown that it is possible to predict outcomes of games to varying degrees of success. However, there is a lack of research which compares and evaluates several different models on the same domain with common aims. To narrow this identified gap, an experiment is conducted to compare and analyze seven different classifiers within the same domain. The classifiers are then ranked on accuracy against each other with the help of appropriate statistical methods. The classifiers compete on the task of predicting which team will win or lose in a match of the game Blood Bowl 2. For nuance, three different datasets are made for the models to be trained on. While the results vary between the models of the various datasets, the general consensus has an identifiable pattern of rejections. The results also indicate a strong accuracy for Support Vector Machine and Logistic Regression across all the datasets.
5

Quost, Benjamin. "Combinaison de classifieurs binaires dans le cadre des fonctions de croyance." Compiègne, 2006. http://www.theses.fr/2006COMP1647.

Abstract:
Supervised classification aims at building a system, or classifier, able to automatically predict the class of an observed phenomenon. Its architecture may be modular: the problem to be tackled is decomposed into simpler sub-problems, solved by classifiers, and the combination of their results gives the global solution. We address the case of binary sub-problems, in particular the decompositions where each class is opposed to each other class, where each class is opposed to all the others, and the general case where two disjoint groups of classes are opposed to each other. The combination of the classifiers is formalized within the framework of belief function theory. We interpret the outputs of the binary classifiers as belief functions defined on restricted domains, according to the decomposition scheme used. The classifiers are then combined by determining the belief function that is as consistent as possible with their outputs.
6

Arnroth, Lukas, and Dennis Jonni Fiddler. "Supervised Learning Techniques : A comparison of the Random Forest and the Support Vector Machine." Thesis, Uppsala universitet, Statistiska institutionen, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-274768.

Abstract:
This thesis examines the performance of the support vector machine and the random forest models in the context of binary classification. The two techniques are compared and the outstanding one is used to construct a final parsimonious model. The data set consists of 33 observations and 89 biomarkers as features with no known dependent variable. The dependent variable is generated through k-means clustering, with a predefined final solution of two clusters. The training of the algorithms is performed using five-fold cross-validation repeated twenty times. The outcome of the training process reveals that the best performing versions of the models are a linear support vector machine and a random forest with six randomly selected features at each split. The final results of the comparison on the test set of these optimally tuned algorithms show that the random forest outperforms the linear kernel support vector machine. The former classifies all observations in the test set correctly whilst the latter classifies all but one correctly. Hence, a parsimonious random forest model using the top five features is constructed, which, to conclude, performs equally well on the test set compared to the original random forest model using all features.
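The training protocol described above (linear SVM versus a random forest with six randomly selected features per split, five-fold cross-validation repeated twenty times) can be sketched as follows; scikit-learn is assumed, and synthetic data stands in for the 33-observation biomarker set:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=20, random_state=1)

# Five-fold cross-validation repeated twenty times, as in the thesis.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=20, random_state=1)

svm_acc = cross_val_score(SVC(kernel="linear"), X, y, cv=cv).mean()
rf_acc = cross_val_score(
    RandomForestClassifier(n_estimators=50, max_features=6, random_state=1),
    X, y, cv=cv,
).mean()
```

Averaging accuracy over the repeated folds gives a more stable estimate than a single split, which is why the thesis repeats the five-fold procedure twenty times.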
7

Tandan, Isabelle, and Erika Goteman. "Bank Customer Churn Prediction : A comparison between classification and evaluation methods." Thesis, Uppsala universitet, Statistiska institutionen, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-411918.

Abstract:
This study aims to assess which supervised statistical learning method (random forest, logistic regression or K-nearest neighbor) is the best at predicting bank customer churn. Additionally, the study evaluates which cross-validation approach, k-fold cross-validation or leave-one-out cross-validation, yields the most reliable results. Predicting customer churn has increased in popularity since new technology, regulation and changed demand have led to increased competition among banks. Thus, with greater reason, banks acknowledge the importance of maintaining their customer base. The findings of this study are that the unrestricted random forest model estimated using k-fold cross-validation is to be preferred in terms of performance measurements, computational efficiency and from a theoretical point of view. Although k-fold cross-validation and leave-one-out cross-validation yield similar results, k-fold cross-validation is to be preferred due to its computational advantages. For future research, methods that generate models with both good interpretability and high predictability would be beneficial, in order to combine knowledge of which customers end their engagement with an understanding of why. Moreover, interesting future research would be to analyze at which dataset size leave-one-out cross-validation and k-fold cross-validation yield the same results.
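The k-fold versus leave-one-out comparison at the heart of this thesis can be illustrated as below; scikit-learn is assumed, and a logistic regression on synthetic data stands in for the churn models:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

X, y = make_classification(n_samples=60, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000)

# k-fold: k model fits, each fold held out once.
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
kfold_acc = cross_val_score(model, X, y, cv=kfold).mean()

# Leave-one-out: one fit per observation, far more expensive on large data.
loo_acc = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()
```

On small data the two estimates are typically close, which matches the thesis's finding that k-fold is preferable mainly for its computational advantage.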
8

Gardner, Angelica. "Stronger Together? An Ensemble of CNNs for Deepfakes Detection." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-97643.

Abstract:
Deepfakes technology is a face swap technique that enables anyone to replace faces in a video, with highly realistic results. Despite its usefulness, if used maliciously this technique can have a significant impact on society, for instance through the spreading of fake news or cyberbullying. This makes deepfakes detection a problem of utmost importance. In this paper, I tackle the problem of deepfakes detection by identifying deepfakes forgeries in video sequences. Inspired by the state of the art, I study the ensembling of different machine learning solutions built on convolutional neural networks (CNNs) and use these models as objects for comparison between ensemble and single-model performance. Existing work in the research field of deepfakes detection suggests that the escalated challenges posed by modern deepfake videos make detection increasingly difficult. I evaluate that claim by testing the detection performance of four single CNN models as well as six stacked ensembles on three modern deepfakes datasets. I compare various ensemble approaches to combining single models and the ways their predictions should be incorporated into the ensemble output. The results I found were that the best approach for deepfakes detection is to create an ensemble, though the ensemble approach plays a crucial role in the detection performance. The final proposed solution is an ensemble of all available single models which uses the concept of soft (weighted) voting to combine its base learners' predictions. Results show that this proposed solution significantly improved deepfakes detection performance and substantially outperformed all single models.
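The soft (weighted) voting scheme the thesis converges on can be sketched as follows; the base learners here are simple scikit-learn classifiers standing in for the CNNs, purely to illustrate the mechanism:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=7)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=7)),
        ("svc", SVC(probability=True, random_state=7)),
    ],
    voting="soft",        # average the base learners' predicted probabilities
    weights=[1, 2, 1],    # weighted soft voting: trust some learners more
)
ens_acc = ensemble.fit(X_tr, y_tr).score(X_te, y_te)
```

Soft voting averages the base learners' class probabilities (optionally weighted) instead of counting hard votes, which is the combination rule the thesis found best.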
9

Saneem, Ahmed C. G. "Bayes Optimal Feature Selection for Supervised Learning." Thesis, 2014. http://hdl.handle.net/2005/3138.

Abstract:
The problem of feature selection is critical in several areas of machine learning and data analysis such as, for example, cancer classification using gene expression data, text categorization, etc. In this work, we consider feature selection for supervised learning problems, where one wishes to select a small set of features that facilitate learning a good prediction model in the reduced feature space. Our interest is primarily in filter methods that select features independently of the learning algorithm to be used and are generally faster to implement compared to other types of feature selection algorithms. Many common filter methods for feature selection make use of information-theoretic criteria such as those based on mutual information to guide their search process. However, even in simple binary classification problems, mutual information based methods do not always select the best set of features in terms of the Bayes error. In this thesis, we develop a general approach for selecting a set of features that directly aims to minimize the Bayes error in the reduced feature space with respect to the loss or performance measure of interest. We show that the mutual information based criterion is a special case of our setting when the loss function of interest is the logarithmic loss for class probability estimation. We give a greedy forward algorithm for approximately optimizing this criterion and demonstrate its application to several supervised learning problems including binary classification (with 0-1 error, cost-sensitive error, and F-measure), binary class probability estimation (with logarithmic loss), bipartite ranking (with pairwise disagreement loss), and multiclass classification (with multiclass 0-1 error). Our experiments suggest that the proposed approach is competitive with several state-of-the art methods.
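The greedy forward selection procedure described above can be sketched as below; this uses a generic cross-validated accuracy criterion on synthetic data (scikit-learn assumed), not the thesis's Bayes-error-based criterion:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=200, n_features=8, n_informative=3,
                           random_state=3)

selected = []                             # indices of chosen features
for _ in range(3):                        # greedily pick 3 features
    best_j, best_score = None, -np.inf
    for j in range(X.shape[1]):
        if j in selected:
            continue
        # Score the candidate feature set: already-selected features plus j.
        score = cross_val_score(GaussianNB(), X[:, selected + [j]], y, cv=5).mean()
        if score > best_score:
            best_j, best_score = j, score
    selected.append(best_j)
```

Each pass adds the single feature that most improves the criterion given the features already chosen; the thesis replaces this accuracy score with a criterion that directly targets the Bayes error for the loss of interest.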
10

Manita, Vitor Manuel Cruz. "The importance of Quality Assurance as a Data Scientist: Common pitfalls, examples and solutions found while validating and developing supervised binary classification models." Master's thesis, 2021. http://hdl.handle.net/10362/113991.

Abstract:
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
In today’s information era, where Data galvanizes change, companies are aiming towards competitive advantage by mining this important resource to achieve actionable insights, knowledge, and wisdom. However, to minimize bias and obtain robust long-term solutions, the methodologies that are devised from Data Science and Machine Learning approaches benefit from being carefully validated by a Quality Assurance Data Scientist, who understands not only both business rules and analytics tasks, but also understands and recommends Quality Assurance guidelines and validations. Through my experience as a Data Scientist at EDP Distribuição, I identify and systematically report on seven key Quality Assurance guidelines that helped achieve more reliable products and provided three practical examples where validation was key in discerning improvements.

Book chapters on the topic "Classification binaire supervisée"

1

Sanodiya, Rakesh Kumar, Sriparna Saha, Jimson Mathew, and Arpita Raj. "Supervised and Semi-supervised Multi-task Binary Classification." In Neural Information Processing, 380–91. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-04212-7_33.

2

Sellami, Hedia Mhiri, and Ali Jaoua. "Non-supervised Rectangular Classification of Binary Data." In Multiple Approaches to Intelligent Systems, 642–48. Berlin, Heidelberg: Springer Berlin Heidelberg, 1999. http://dx.doi.org/10.1007/978-3-540-48765-4_68.

3

Švec, Jan. "Semi-supervised Learning Algorithm for Binary Relevance Multi-label Classification." In Lecture Notes in Computer Science, 1–13. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-20370-6_1.

4

Barman, Anwesha Ujjwal, Kritika Shah, Kanchan Lata Kashyap, Avanish Sandilya, and Nishq Poorav Desai. "Binary Classification of Celestial Bodies Using Supervised Machine Learning Algorithms." In Algorithms for Intelligent Systems, 495–505. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-33-4087-9_42.

5

Kowsari, Kamran, Nima Bari, Roman Vichr, and Farhad A. Goodarzi. "FSL-BM: Fuzzy Supervised Learning with Binary Meta-Feature for Classification." In Advances in Intelligent Systems and Computing, 655–70. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-03405-4_46.

6

Moraes, Ronei M., Liliane S. Machado, Henri Prade, and Gilles Richard. "Supervised Classification Using Homogeneous Logical Proportions for Binary and Nominal Features." In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 165–73. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-41822-8_21.

7

Peng, Alex Yuxuan, Yun Sing Koh, Patricia Riddle, and Bernhard Pfahringer. "Using Supervised Pretraining to Improve Generalization of Neural Networks on Binary Classification Problems." In Machine Learning and Knowledge Discovery in Databases, 410–25. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-10925-7_25.

8

Abuassba, Adnan Omer, Dezheng O. Zhang, and Xiong Luo. "Ensemble Learning via Extreme Learning Machines for Imbalanced Data." In Advances in Computational Intelligence and Robotics, 59–88. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-3038-2.ch004.

Abstract:
Ensembles are known to reduce the risk of selecting the wrong model by aggregating all candidate models, and to be more accurate than single models. Accuracy has been identified as an important factor in explaining the success of ensembles, and several techniques have been proposed to improve it, though none is perfect. The focus of this research is on how to create an accurate ensemble of extreme learning machines (ELMs) for classification that copes with supervised, noisy, imbalanced, and semi-supervised data. To deal with these issues, the authors propose a heterogeneous ensemble of ELMs (AELME) that combines different ELM algorithms, including regularized ELM (RELM) and kernel ELM (KELM), in a diverse AdaBoost-based ensemble for binary and multiclass classification, addressing the imbalanced-data issue in particular.
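The AdaBoost mechanism underlying AELME can be illustrated with a self-contained sketch. This is not the authors' ELM ensemble: it boosts one-dimensional decision stumps instead of ELM base learners, purely to show how the weighted-error and reweighting loop produces a binary ensemble:

```python
import math

def stump_predict(x, thr, pol):
    # a decision stump: predict pol below the threshold, -pol above it
    return pol if x < thr else -pol

def train_adaboost(xs, ys, rounds=5):
    n = len(xs)
    w = [1.0 / n] * n
    learners = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        # pick the stump with the smallest weighted error
        best = None
        for thr in sorted(set(xs)):
            for pol in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if stump_predict(x, thr, pol) != y)
                if best is None or err < best[0]:
                    best = (err, thr, pol)
        err, thr, pol = best
        err = max(err, 1e-10)  # avoid division by zero on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        learners.append((alpha, thr, pol))
        # upweight misclassified points, then renormalize
        w = [wi * math.exp(-alpha * y * stump_predict(x, thr, pol))
             for wi, x, y in zip(w, xs, ys)]
        s = sum(w)
        w = [wi / s for wi in w]
    return learners

def predict(learners, x):
    score = sum(a * stump_predict(x, thr, pol) for a, thr, pol in learners)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, 1, -1, -1, -1]
model = train_adaboost(xs, ys)
```

AELME applies the same boosting loop, but draws its base learners from heterogeneous ELM variants (RELM, KELM) rather than stumps.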
9

Jivani, Anjali, Hetal Bhavsar, Sneh Shah, and Riya Shah. "Exploration of Supervised Machine Learning Algorithms on Binary Classification." In ICT for Competitive Strategies, 601–16. CRC Press, 2020. http://dx.doi.org/10.1201/9781003052098-63.

10

Yamanishi, Yoshihiro, and Hisashi Kashima. "Prediction of Compound-protein Interactions with Machine Learning Methods." In Machine Learning, 616–30. IGI Global, 2012. http://dx.doi.org/10.4018/978-1-60960-818-7.ch315.

Abstract:
In silico prediction of compound-protein interactions from heterogeneous biological data is critical in the process of drug development. In this chapter the authors review several supervised machine learning methods to predict unknown compound-protein interactions from chemical structure and genomic sequence information simultaneously. The authors review several kernel-based algorithms from two different viewpoints: binary classification and dimension reduction. In the results, they demonstrate the usefulness of the methods on the prediction of drug-target interactions and ligand-protein interactions from chemical structure data and genomic sequence data.
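One standard construction behind the chapter's binary-classification viewpoint is a pairwise kernel over (compound, protein) pairs. A minimal sketch follows; the toy `overlap` kernels are hypothetical stand-ins for real chemical-fingerprint and sequence kernels, and the tensor-product form is one common choice, not necessarily the authors' exact formulation:

```python
def tensor_product_kernel(k_comp, k_prot):
    """Pairwise kernel over (compound, protein) pairs: the product of a
    compound kernel and a protein kernel, a standard way to cast
    interaction prediction as binary classification over pairs."""
    def k(pair_a, pair_b):
        (c1, p1), (c2, p2) = pair_a, pair_b
        return k_comp(c1, c2) * k_prot(p1, p2)
    return k

# toy kernels: overlap of binary feature vectors (stand-ins for
# chemical fingerprints and protein sequence features)
def overlap(a, b):
    return sum(x * y for x, y in zip(a, b))

k = tensor_product_kernel(overlap, overlap)

# similarity between two (compound, protein) pairs
sim = k(((1, 0, 1), (1, 1, 0)), ((1, 1, 1), (0, 1, 0)))
```

Any kernel classifier (e.g. an SVM) can then be trained on known interacting and non-interacting pairs using `k` as its similarity function.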

Conference papers on the topic "Classification binaire supervisée"

1

Li, Pengyong, Jun Wang, Ziliang Li, Yixuan Qiao, Xianggen Liu, Fei Ma, Peng Gao, Sen Song, and Guotong Xie. "Pairwise Half-graph Discrimination: A Simple Graph-level Self-supervised Strategy for Pre-training Graph Neural Networks." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/371.

Abstract:
Self-supervised learning has gradually emerged as a powerful technique for graph representation learning. However, transferable, generalizable, and robust representation learning on graph data remains a challenge for pre-training graph neural networks. In this paper, we propose a simple and effective self-supervised pre-training strategy, named Pairwise Half-graph Discrimination (PHD), that explicitly pre-trains a graph neural network at the graph level. PHD is designed as a simple binary classification task: discriminate whether two half-graphs come from the same source. Experiments demonstrate that PHD is an effective pre-training strategy that offers comparable or superior performance on 13 graph classification tasks compared with state-of-the-art strategies, and achieves notable improvements when combined with node-level strategies. Moreover, visualization of the learned representations revealed that the PHD strategy indeed empowers the model to learn graph-level knowledge such as the molecular scaffold. These results establish PHD as a powerful and effective self-supervised strategy for graph-level representation learning.
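The PHD pretext task itself is easy to sketch as data construction. The code below is a simplified illustration, not the authors' implementation: graphs are plain node lists and the halving ignores graph structure, but the pairing and binary labelling follow the idea of discriminating whether two half-graphs share a source:

```python
import random

def split_halves(graph_nodes):
    # split a graph's node list into two halves (a stand-in for the
    # paper's structural graph-partition step)
    half = len(graph_nodes) // 2
    return graph_nodes[:half], graph_nodes[half:]

def make_phd_pairs(graphs, seed=0):
    """Build (half_a, half_b, label) training pairs: label 1 if the two
    halves come from the same source graph, 0 otherwise."""
    rng = random.Random(seed)
    pairs = []
    for g in graphs:
        a, b = split_halves(g)
        pairs.append((a, b, 1))  # positive: halves of the same graph
        other = graphs[rng.randrange(len(graphs))]
        while other is g and len(graphs) > 1:
            other = graphs[rng.randrange(len(graphs))]
        _, b_neg = split_halves(other)
        pairs.append((a, b_neg, 0))  # negative: halves of different graphs
    return pairs

graphs = [list(range(0, 6)), list(range(10, 16)), list(range(20, 26))]
pairs = make_phd_pairs(graphs)
```

A GNN pre-trained on these binary pairs learns graph-level structure without any human labels.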
2

Sun, Jianjun, and Qinghua Huang. "Binary Classification with Supervised-like Biclustering and Adaboost." In 2020 7th International Conference on Information Science and Control Engineering (ICISCE). IEEE, 2020. http://dx.doi.org/10.1109/icisce50968.2020.00083.

3

Shinoda, Kazuhiko, Hirotaka Kaji, and Masashi Sugiyama. "Binary Classification from Positive Data with Skewed Confidence." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/460.

Abstract:
Positive-confidence (Pconf) classification [Ishida et al., 2018] is a promising weakly-supervised learning method which trains a binary classifier only from positive data equipped with confidence. In practice, however, the confidence may be skewed by bias arising in the annotation process. The Pconf classifier cannot be properly learned with skewed confidence, and consequently, the classification performance might deteriorate. In this paper, we introduce a parameterized model of the skewed confidence, and propose a method for selecting the hyperparameter which cancels out the negative impact of the skewed confidence, under the assumption that the misclassification rate of positive samples is available as prior knowledge. We demonstrate the effectiveness of the proposed method through a synthetic experiment with simple linear models and benchmark problems with neural network models. We also apply our method to drivers’ drowsiness prediction to show that it works well with a real-world problem where confidence is obtained based on manual annotation.
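The core of Pconf classification is an empirical risk computed from positive samples alone, with the negative class entering only through a confidence-based reweighting. A rough sketch in the spirit of Ishida et al. [2018] follows; the toy data, the two linear classifiers, and the class prior are made up for illustration:

```python
import math

def logistic_loss(z):
    return math.log(1 + math.exp(-z))

def pconf_risk(xs, confs, f, prior=0.5):
    """Empirical positive-confidence risk: only positive samples xs with
    annotated confidences r_i = p(y=+1 | x_i) are used; the negative
    class enters through the (1 - r) / r reweighted loss term."""
    total = 0.0
    for x, r in zip(xs, confs):
        total += logistic_loss(f(x)) + (1 - r) / r * logistic_loss(-f(x))
    return prior * total / len(xs)

xs = [2.0, 1.5, 0.5]       # positive samples only
confs = [0.95, 0.9, 0.6]   # annotated confidence p(y=+1 | x)

def good(x):
    return 2 * x - 1       # classifier aligned with the confidences

def bad(x):
    return -(2 * x - 1)    # the sign-flipped classifier

risk_good = pconf_risk(xs, confs, good)
risk_bad = pconf_risk(xs, confs, bad)
```

The paper's contribution addresses what happens when the annotated `confs` are systematically skewed, which biases this risk estimate.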
4

Wang, Xi, Iadh Ounis, and Craig Macdonald. "Negative Confidence-Aware Weakly Supervised Binary Classification for Effective Review Helpfulness Classification." In CIKM '20: The 29th ACM International Conference on Information and Knowledge Management. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3340531.3411978.

5

Rusli, Andre, Julio Christian Young, and Ni Made Satvika Iswari. "Identifying Fake News in Indonesian via Supervised Binary Text Classification." In 2020 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT). IEEE, 2020. http://dx.doi.org/10.1109/iaict50021.2020.9172020.

6

Rizk, Yara, Nicholas Mitri, and Mariette Awad. "A local mixture based SVM for an efficient supervised binary classification." In 2013 International Joint Conference on Neural Networks (IJCNN 2013 - Dallas). IEEE, 2013. http://dx.doi.org/10.1109/ijcnn.2013.6707032.

7

Hou, Ming, Brahim Chaib-draa, Chao Li, and Qibin Zhao. "Generative Adversarial Positive-Unlabelled Learning." In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/312.

Abstract:
In this work, we consider the task of classifying binary positive-unlabeled (PU) data. Existing discriminative PU models attempt to seek an optimal reweighting strategy for U data so that a decent decision boundary can be found. However, given limited P data, conventional PU models tend to suffer from overfitting when adapted to very flexible deep neural networks. In contrast, we are the first to attack the binary PU task from the perspective of generative learning, by leveraging the power of generative adversarial networks (GAN). Our generative positive-unlabeled (GenPU) framework incorporates an array of discriminators and generators that are endowed with different roles in simultaneously producing positive and negative realistic samples. We provide theoretical analysis to justify that, at equilibrium, GenPU is capable of recovering both the positive and the negative data distribution. Moreover, we show GenPU is generalizable and closely related to semi-supervised classification. Given rather limited P data, experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed framework. With infinite realistic and diverse sample streams generated from GenPU, a very flexible classifier can then be trained using deep neural networks.
8

Nwala, Alexander C., and Michael L. Nelson. "A Supervised Learning Algorithm for Binary Domain Classification of Web Queries using SERPs." In JCDL '16: The 16th ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2910896.2925449.

9

Maximov, Yury, Massih-Reza Amini, and Zaid Harchaoui. "Rademacher Complexity Bounds for a Penalized Multi-class Semi-supervised Algorithm (Extended Abstract)." In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/800.

Abstract:
We propose Rademacher complexity bounds for multi-class classifiers trained with a two-step semi-supervised model. In the first step, the algorithm partitions the partially labeled data and then, using the labeled training examples, identifies dense clusters containing k predominant classes, such that the proportion of non-predominant classes in each cluster is below a fixed threshold that stands for clustering consistency. In the second step, a classifier is trained by minimizing a margin empirical loss over the labeled training set plus a penalization term measuring the inability of the learner to predict the k predominant classes of the identified clusters. The resulting data-dependent generalization error bound involves the margin distribution of the classifier, the stability of the clustering technique used in the first step, and Rademacher complexity terms corresponding to the partially labeled training data. Our theoretical results exhibit convergence rates extending those proposed in the literature for the binary case, and experimental results on different multi-class classification problems show empirical evidence that supports the theory.
10

Xu, Yixing, Chang Xu, Chao Xu, and Dacheng Tao. "Multi-Positive and Unlabeled Learning." In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/444.

Abstract:
The positive and unlabeled (PU) learning problem focuses on learning a classifier from positive and unlabeled data. Some methods have been developed to solve the PU learning problem, but they are often limited in practical applications, since only binary classes are involved and they cannot easily be adapted to multi-class data. Here we propose a one-step method that directly trains a multi-class model on the given multi-class input data and predicts the label based on the model decision. Specifically, we construct different convex loss functions for labeled and unlabeled data to learn a discriminant function F. Theoretical analysis of the generalization error bound shows that it is no worse than k√k times that of fully supervised multi-class classification methods when the size of the data in the k classes is of the same order. Finally, our experimental results demonstrate the significance and effectiveness of the proposed algorithm on synthetic and real-world datasets.

Reports on the topic "Classification binaire supervisée"

1

Farhi, Edward, and Hartmut Neven. Classification with Quantum Neural Networks on Near Term Processors. Web of Open Science, December 2020. http://dx.doi.org/10.37686/qrl.v1i2.80.

Abstract:
We introduce a quantum neural network, QNN, that can represent labeled data, classical or quantum, and be trained by supervised learning. The quantum circuit consists of a sequence of parameter dependent unitary transformations which acts on an input quantum state. For binary classification a single Pauli operator is measured on a designated readout qubit. The measured output is the quantum neural network’s predictor of the binary label of the input state. We show through classical simulation that parameters can be found that allow the QNN to learn to correctly distinguish the two data sets. We then discuss presenting the data as quantum superpositions of computational basis states corresponding to different label values. Here we show through simulation that learning is possible. We consider using our QNN to learn the label of a general quantum state. By example we show that this can be done. Our work is exploratory and relies on the classical simulation of small quantum systems. The QNN proposed here was designed with near-term quantum processors in mind. Therefore it will be possible to run this QNN on a near term gate model quantum computer where its power can be explored beyond what can be explored with simulation.
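The readout scheme described above can be mimicked classically at toy scale. The sketch below is not the paper's QNN: it simulates a single qubit, uses one RY rotation for both the data encoding and the trainable parameter, and replaces gradient training with a grid search, but it shows the sign-of-Pauli-Z binary predictor:

```python
import math

def ry_expectation_z(angle):
    # RY(a)|0> = [cos(a/2), sin(a/2)], so the Pauli-Z expectation
    # of the rotated state is cos(angle)
    return math.cos(angle)

def qnn_predict(x, theta):
    """Toy one-qubit 'QNN': encode the input as a rotation angle, add a
    trainable offset theta, and read out the sign of the Pauli-Z
    expectation as the predicted binary label (+1 / -1)."""
    z = ry_expectation_z(x + theta)
    return 1 if z >= 0 else -1

# tiny labeled set: angles near 0 map to +1, angles near pi to -1
data = [(0.1, 1), (0.3, 1), (2.9, -1), (3.0, -1)]

def accuracy(theta):
    return sum(qnn_predict(x, theta) == y for x, y in data) / len(data)

# crude grid search over the single parameter, standing in for the
# gradient-based training of the circuit parameters in the paper
best_theta = max((t / 100 * 2 * math.pi for t in range(100)), key=accuracy)
```

A real QNN replaces `ry_expectation_z` with the expectation of a Pauli operator on the readout qubit of a multi-qubit parameterized circuit, estimated from repeated measurements.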