Dissertations / Theses on the topic 'Nearest Neighbor Classification'
Consult the top 50 dissertations / theses for your research on the topic 'Nearest Neighbor Classification.'
Karo, Ciril. "Two new nearest neighbor classification rules." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1998. http://handle.dtic.mil/100.2/ADA354997.
Full text"September 1998." Thesis advisor(s): Samuel E. Buttrey. Includes bibliographical references (p. 69-71). Also available online.
Moraski, Ashley M. "Classification via distance profile nearest neighbors." Digital WPI, 2006. https://digitalcommons.wpi.edu/etd-theses/703.
Gupta, Nidhi. "Mutual k Nearest Neighbor based Classifier." University of Cincinnati / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1289937369.
Burkholder, Joshua Jeremy. "Nearest neighbor classification using a density sensitive distance measurement [electronic resource]." Thesis, Monterey, California : Naval Postgraduate School, 2009. http://edocs.nps.edu/npspubs/scholarly/theses/2009/Sep/09Sep%5FBurkholder.pdf.
Thesis Advisor(s): Squire, Kevin. "September 2009." Description based on title screen as viewed on November 03, 2009. Author(s) subject terms: Classification, Supervised Learning, k-Nearest Neighbor Classification, Euclidean Distance, Mahalanobis Distance, Density Sensitive Distance, Parzen Windows, Manifold Parzen Windows, Kernel Density Estimation. Includes bibliographical references (p. 99-100). Also available in print.
Ozsakabasi, Feray. "Classification Of Forest Areas By K Nearest Neighbor Method: Case Study, Antalya." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609548/index.pdf.
Porfirio, David Jonathan. "Single-Sequence Protein Secondary Structure Prediction by Nearest-Neighbor Classification of Protein Words." Thesis, The University of Arizona, 2016. http://hdl.handle.net/10150/613449.
Ali, Khan Syed Irteza. "Classification using residual vector quantization." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50300.
Liu, Dongqing. "Genetic Algorithms for Sample Classification of Microarray Data." University of Akron / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=akron1125253420.
Rudin, Pierre. "Football result prediction using simple classification algorithms, a comparison between k-Nearest Neighbor and Linear Regression." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-187659.
Ever since humans began competing against one another, people have tried to predict the winners. Football is no exception, and it is of particular interest for this study because the amount of available data from football matches is constantly growing. Results used to be predicted from personal knowledge and small amounts of data. This report takes advantage of the growing amount of data to find out whether football results can be predicted using the k-Nearest Neighbor algorithm and linear regression. The algorithms are compared on how accurately they predict the winner of a match, how many goals the two teams score, and how precisely they predict the goal difference. The results are presented in graphs and tables. A discussion analyses the results and concludes that both algorithms can be useful if the model is well constructed, and that linear regression is better suited than k-NN.
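To make the setup described above concrete, the sketch below pairs a k-NN classifier (match outcome) with a linear regression (goal difference) in scikit-learn; the feature vectors and data are invented placeholders, not the report's actual dataset.

    # Illustrative sketch only: hypothetical match features (e.g. form, ranking gap),
    # a k-NN model for the outcome and a linear regression for the goal difference.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))              # placeholder features per match
    goal_diff = np.round(X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=200))
    outcome = np.sign(goal_diff).astype(int)   # 1 = home win, 0 = draw, -1 = away win

    knn = KNeighborsClassifier(n_neighbors=5).fit(X[:150], outcome[:150])
    reg = LinearRegression().fit(X[:150], goal_diff[:150])

    print("predicted outcomes:", knn.predict(X[150:155]))
    print("predicted goal differences:", reg.predict(X[150:155]).round(1))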
Blinn, Christine Elizabeth. "Increasing the Precision of Forest Area Estimates through Improved Sampling for Nearest Neighbor Satellite Image Classification." Diss., Virginia Tech, 2005. http://hdl.handle.net/10919/28694.
Ph. D.
Naram, Hari Prasad. "Classification of Dense Masses in Mammograms." OpenSIUC, 2018. https://opensiuc.lib.siu.edu/dissertations/1528.
Full textJiao, Lianmeng. "Classification of uncertain data in the framework of belief functions : nearest-neighbor-based and rule-based approaches." Thesis, Compiègne, 2015. http://www.theses.fr/2015COMP2222/document.
In many classification problems, data are inherently uncertain. The available training data may be imprecise, incomplete, or even unreliable. In addition, partial expert knowledge characterizing the classification problem may be available. These different types of uncertainty pose great challenges for classifier design. The theory of belief functions provides a well-founded and elegant framework for representing and combining a large variety of uncertain information. In this thesis, we use this theory to address uncertain data classification problems based on two popular approaches: the k-nearest neighbor rule (kNN) and rule-based classification systems. For the kNN rule, one concern is that imprecise training data in class-overlapping regions may greatly affect its performance. An evidential editing version of the kNN rule was developed based on the theory of belief functions in order to model the imprecise information carried by samples in overlapping regions. Another consideration is that sometimes only an incomplete training data set is available, in which case the behavior of the kNN rule degrades dramatically. Motivated by this problem, we designed an evidential fusion scheme for combining a group of pairwise kNN classifiers built on locally learned pairwise distance metrics. For rule-based classification systems, in order to improve their performance in complex applications, we extended the traditional fuzzy rule-based classification system in the framework of belief functions and developed a belief rule-based classification system to handle uncertain information in complex classification problems. Further, considering that in some applications partial expert knowledge may be available in addition to the training data collected by sensors, a hybrid belief rule-based classification system was developed to make joint use of these two types of information for classification.
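The evidential kNN idea referred to above builds on mass functions in the style of Denoeux: each neighbour supports its own class with a mass that decays with distance, and the masses are pooled with Dempster's rule. The following is only a simplified sketch with assumed discounting parameters, not the thesis's editing or fusion schemes.

    # Simplified evidential kNN: each of the k nearest neighbours contributes a
    # simple mass function m({class}) = alpha*exp(-gamma*d^2), m(frame) = 1 - that,
    # combined with Dempster's rule. alpha and gamma are illustrative defaults.
    import numpy as np

    def evidential_knn_predict(X_train, y_train, x, k=5, alpha=0.95, gamma=1.0):
        d = np.linalg.norm(X_train - x, axis=1)
        idx = np.argsort(d)[:k]
        classes = np.unique(y_train)
        support = {c: [] for c in classes}          # per-class neighbour supports
        for i in idx:
            support[y_train[i]].append(alpha * np.exp(-gamma * d[i] ** 2))
        prod_all = np.prod([1.0 - s for ss in support.values() for s in ss])
        mass = {}
        for c in classes:
            prod_c = np.prod([1.0 - s for s in support[c]]) if support[c] else 1.0
            # unnormalised combined mass on the singleton {c}
            mass[c] = (1.0 - prod_c) * (prod_all / prod_c)
        # normalisation does not change the argmax, so pick the largest mass
        return max(mass, key=mass.get)

    X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.9]])
    y_train = np.array([0, 0, 1, 1])
    print(evidential_knn_predict(X_train, y_train, np.array([0.2, 0.1]), k=3))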
Sammon, Ryan. "Data Collection, Analysis, and Classification for the Development of a Sailing Performance Evaluation System." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/25481.
Dastile, Xolani Collen. "Improved tree species discrimination at leaf level with hyperspectral data combining binary classifiers." Thesis, Rhodes University, 2011. http://hdl.handle.net/10962/d1002807.
Full textZhang, Xianjie, and Sebastian Bogic. "Datautvinning av klickdata : Kombination av klustring och klassifikation." Thesis, KTH, Hälsoinformatik och logistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230630.
Owners of websites and applications usually profit from users who click on their links, which may be advertisements or items for sale, among other things. There are many studies on predicting whether a link will be clicked, but only a few that focus on what should be adjusted to get the link clicked. The problem Flygresor.se faces is that they lack a tool for their customers, travel agencies, that analyses their tickets and then adjusts the attributes of those trips. The requested solution was an application that suggests how to change the tickets so that they attract more clicks and thereby generate more sales. A prototype was constructed which makes use of two data mining methods: clustering with the DBSCAN algorithm and classification with the k-nearest neighbor algorithm. These algorithms were used together with an evaluation process, called DNNA, which analyses the results from the algorithms and suggests changes that could be made to the attributes of the links. The combination of the algorithms and DNNA was tested and evaluated as the solution to the problem. The program was able to predict which attributes of the tickets needed to be adjusted to get the tickets more clicks. The recommendations were reasonable, but this result could not be compared to similar tools since none have been published.
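As a rough illustration of the clustering-plus-classification combination described above (DBSCAN followed by k-nearest neighbor), the scikit-learn sketch below uses made-up data; the DNNA evaluation step from the thesis is not reproduced here.

    # Minimal DBSCAN + k-NN sketch on synthetic data: cluster labels from DBSCAN
    # become classes for a k-NN model that can then assign new points
    # (e.g. new tickets) to the discovered groups.
    import numpy as np
    from sklearn.cluster import DBSCAN
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(loc=0, scale=0.3, size=(50, 2)),
                   rng.normal(loc=3, scale=0.3, size=(50, 2))])

    labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
    core = labels != -1                      # drop DBSCAN noise points
    knn = KNeighborsClassifier(n_neighbors=5).fit(X[core], labels[core])

    new_points = np.array([[0.1, 0.2], [2.9, 3.1]])
    print(knn.predict(new_points))           # cluster membership of unseen points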
Amlathe, Prakhar. "Standard Machine Learning Techniques in Audio Beehive Monitoring: Classification of Audio Samples with Logistic Regression, K-Nearest Neighbor, Random Forest and Support Vector Machine." DigitalCommons@USU, 2018. https://digitalcommons.usu.edu/etd/7050.
Mestre, Ricardo Jorge Palheira. "Improvements on the KNN classifier." Master's thesis, Faculdade de Ciências e Tecnologia, 2013. http://hdl.handle.net/10362/10923.
Object classification is an important area within artificial intelligence, and its applications extend to many fields, inside and outside science. Among classifiers, the K-nearest neighbor (KNN) is one of the simplest and most accurate, especially in settings where the data distribution is unknown or not readily parameterizable. This algorithm assigns to the element being classified the majority class among its K nearest neighbors. In the original algorithm, this classification requires computing the distances between the instance being classified and every training object. If, on the one hand, an extensive training set is important for obtaining high accuracy, on the other hand it makes the classification of each object slower because of the algorithm's lazy-learning nature. Indeed, the algorithm does not store any information about previously computed classifications, so classifying two identical instances requires repeating the calculation. In a sense, this classifier does not learn. This dissertation focuses on this lazy-learning fragility and proposes a solution that turns KNN into an eager-learning classifier. In other words, the algorithm should learn effectively from the training set, thus avoiding redundant calculations. In the context of the proposed change, it is important to identify the attributes that best characterize the objects according to their discriminative power. Within this framework, the implementation of these transformations is studied on data of different types: continuous and/or categorical.
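A toy way to picture the lazy-versus-eager point made above is to cache predictions so that repeated identical queries are not recomputed; this only illustrates the motivation, not the dissertation's actual transformation of KNN.

    # Plain k-NN recomputes all distances for every query, so even a trivial
    # cache avoids repeating work for identical inputs. Sketch of the motivation only.
    from functools import lru_cache
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y_train = np.array([0, 0, 1, 1])
    knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

    @lru_cache(maxsize=None)
    def classify(point):
        # point is a hashable tuple, so repeated identical queries hit the cache
        return int(knn.predict(np.array([point]))[0])

    print(classify((0.9, 0.8)))   # computed
    print(classify((0.9, 0.8)))   # served from the cache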
Bhadoria, Divya. "Learning from spatially disjoint data." [Tampa, Fla.] : University of South Florida, 2004. http://purl.fcla.edu/fcla/etd/SFE0000344.
Full textgundam, madhuri, and Madhuri Gundam. "Automatic Classification of Fish in Underwater Video; Pattern Matching - Affine Invariance and Beyond." ScholarWorks@UNO, 2015. http://scholarworks.uno.edu/td/1976.
Full textPiro, Paolo. "Learning prototype-based classification rules in a boosting framework: application to real-world and medical image categorization." Phd thesis, Université de Nice Sophia-Antipolis, 2010. http://tel.archives-ouvertes.fr/tel-00590403.
Full textBjörk, Gabriella. "Evaluation of system design strategies and supervised classification methods for fruit recognition in harvesting robots." Thesis, KTH, Skolan för industriell teknik och management (ITM), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217859.
This master's thesis was carried out by a student from KTH Royal Institute of Technology in collaboration with Cybercom Group. The goal was to evaluate and compare design strategies for fruit recognition in a harvesting robot and the performance of supervised classification algorithms applied to this specific problem. The work covers the fundamentals of such systems; parameters, constraints, requirements and design decisions were investigated. This framework was then used as the basis for implementing the sensor system and the processing and classification algorithms. A plastic tomato plant with fruits of varying ripeness was used to train and validate the system, and a Kinect for Windows v2, equipped with sensors for high-resolution colour, depth and infrared data, was used to acquire images. The data were processed in MATLAB using the Kinect software development kit for Windows in order to extract features from the objects in the images. Multiple views were obtained by rotating the tomato plant on a platform driven by a stepper motor and an Arduino Uno. The binary classification algorithms tested were Support Vector Machine, Decision Tree and k-Nearest Neighbor. The models were trained and validated using five-fold cross-validation in MATLAB's Classification Learner app. Performance indicators such as precision, recall and F1 score were computed for the different models. The results showed, among other things, that static models such as k-NN and SVM performed better for the given problem, and that the latter is the most promising for future applications.
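The evaluation protocol described above (three classifiers compared with five-fold cross-validation on precision, recall and F1) can be approximated in a few lines; the thesis used MATLAB's Classification Learner, so this scikit-learn version is only an analogous sketch on placeholder data.

    # Analogous sketch of the evaluation protocol: 5-fold CV with precision,
    # recall and F1 for SVM, Decision Tree and k-NN on placeholder data that
    # stands in for the Kinect-derived features used in the thesis.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_validate
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=300, n_features=6, random_state=0)
    models = {"SVM": SVC(), "DT": DecisionTreeClassifier(), "kNN": KNeighborsClassifier()}

    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=5,
                                scoring=("precision", "recall", "f1"))
        print(name,
              round(scores["test_precision"].mean(), 3),
              round(scores["test_recall"].mean(), 3),
              round(scores["test_f1"].mean(), 3))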
Aygar, Alper. "Doppler Radar Data Processing And Classification." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609890/index.pdf.
Günther, Michael. "FREDDY." Association for Computing Machinery, 2018. https://tud.qucosa.de/id/qucosa%3A38451.
Tagami, Yukihiro. "Practical Web-scale Recommender Systems." Kyoto University, 2018. http://hdl.handle.net/2433/235110.
Zapletal, Petr. "Klasifikační metody analýzy vrstvy nervových vláken na sítnici." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2010. http://www.nusl.cz/ntk/nusl-218575.
Axillus, Viktor. "Comparing Julia and Python : An investigation of the performance on image processing with deep neural networks and classification." Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-19160.
Fu, Ruijun. "Empirical RF Propagation Modeling of Human Body Motions for Activity Classification." Digital WPI, 2012. https://digitalcommons.wpi.edu/etd-theses/1130.
Mody, Ravi. "Optimizing the distance function for nearest neighbors classification." Diss., [La Jolla] : University of California, San Diego, 2009. http://wwwlib.umi.com/cr/ucsd/fullcit?p1470299.
Title from first page of PDF file (viewed December 2, 2009). Available via ProQuest Digital Dissertations. Includes bibliographical references (p. 48-49).
Tandan, Isabelle, and Erika Goteman. "Bank Customer Churn Prediction : A comparison between classification and evaluation methods." Thesis, Uppsala universitet, Statistiska institutionen, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-411918.
Zhong, Xiao. "A study of several statistical methods for classification with application to microbial source tracking." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0430104-155106/.
Keywords: classification; k-nearest-neighbor (k-NN); neural networks; linear discriminant analysis (LDA); support vector machines; microbial source tracking (MST); quadratic discriminant analysis (QDA); logistic regression. Includes bibliographical references (p. 59-61).
Villa, Medina Joe Luis. "Reliability of classification and prediction in k-nearest neighbours." Doctoral thesis, Universitat Rovira i Virgili, 2013. http://hdl.handle.net/10803/127108.
In this doctoral thesis, the calculation of classification reliability and prediction reliability has been developed using the k-nearest neighbours (kNN) method and bootstrap-based resampling strategies. In addition, two new classification methods have been developed, Probabilistic Bootstrap k-Nearest Neighbours (PBkNN) and Bagged k-Nearest Neighbours (Bagged kNN), together with a new prediction method, Direct Orthogonalization kNN (DOkNN). In all cases, the results obtained with the new methods were comparable to or better than those obtained with classical classification and multivariate calibration methods.
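One of the methods named above, Bagged kNN, can be approximated directly with off-the-shelf bagging over kNN base learners; the sketch below is a generic bagged kNN on a standard dataset, not the thesis's PBkNN or DOkNN formulations.

    # Generic bagged kNN sketch: bootstrap resamples of the training set each fit
    # their own kNN, and the ensemble votes. Mirrors the idea of Bagged kNN only.
    from sklearn.datasets import load_iris
    from sklearn.ensemble import BaggingClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)
    bagged_knn = BaggingClassifier(KNeighborsClassifier(n_neighbors=5),
                                   n_estimators=25, bootstrap=True, random_state=0)
    print(cross_val_score(bagged_knn, X, y, cv=5).mean())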
Hatko, Stan. "k-Nearest Neighbour Classification of Datasets with a Family of Distances." Thesis, Université d'Ottawa / University of Ottawa, 2015. http://hdl.handle.net/10393/33361.
Luk, Andrew. "Some new results in nearest neighbour classification and lung sound analysis." Thesis, University of Glasgow, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.280756.
He, Jingwen. "Large Margin Nearest Neighbors Classification With Privileged Information for Biometric Applications." Thesis, The University of Sydney, 2018. http://hdl.handle.net/2123/19663.
Berrett, Thomas Benjamin. "Modern k-nearest neighbour methods in entropy estimation, independence testing and classification." Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/267832.
Full textVeras, Ricardo da Costa. "Utilização de métodos de machine learning para identificação de instrumentos musicais de sopro pelo timbre." reponame:Repositório Institucional da UFABC, 2018.
Dissertation (master's), Universidade Federal do ABC, Graduate Program in Information Engineering (Programa de Pós-Graduação em Engenharia da Informação), Santo André, 2018.
In general, pattern classification for signal processing has been studied and used to interpret many kinds of information, which appear in forms such as images, audio, geophysical data and electrical impulses. This work studies machine learning techniques applied to the problem of musical instrument identification, aiming at an automatic timbre-recognition system. The techniques were applied to five instruments of the woodwind family (clarinet, bassoon, flute, oboe and sax). The classifiers used were kNN (with k = 3) and SVM (in a non-linear configuration), and several audio features were studied, such as MFCC (Mel-Frequency Cepstral Coefficients), ZCR (Zero Crossing Rate) and entropy, among others, serving as the data source for the training and testing processes. Instruments with similar timbres were deliberately chosen in order to examine how a classifier behaves under these specific conditions. The behavior of these techniques was also observed on audio unseen during training, as well as on excerpts containing a mixture of elements (producing interference for each classifier model) that could skew the results, or mixtures of elements belonging to the observed classes added together in the same audio. The results indicate that the selected features carry relevant information about the timbre of each of the evaluated instruments (as observed for the solos), although the accuracy obtained for some of the instruments was lower than expected (as observed for the duets).
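A compact illustration of the feature pipeline described above (MFCC and zero-crossing-rate features feeding a kNN with k = 3 and a non-linear SVM): librosa is assumed for feature extraction, and the file paths and labels are placeholders rather than the dissertation's dataset.

    # Sketch of the timbre-classification pipeline: per-file MFCC + ZCR features,
    # then kNN (k = 3) and an RBF SVM. File names are placeholders.
    import numpy as np
    import librosa
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC

    def extract_features(path):
        y, sr = librosa.load(path, sr=None)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
        zcr = librosa.feature.zero_crossing_rate(y).mean()
        return np.append(mfcc, zcr)

    files = ["clarinet_01.wav", "bassoon_01.wav", "flute_01.wav"]   # placeholders
    labels = ["clarinet", "bassoon", "flute"]
    X = np.array([extract_features(f) for f in files])

    knn = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
    svm = SVC(kernel="rbf").fit(X, labels)
    print(knn.predict(X[:1]), svm.predict(X[:1]))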
Fisher, Julia Marie. "Classification Analytics in Functional Neuroimaging: Calibrating Signal Detection Parameters." Thesis, The University of Arizona, 2015. http://hdl.handle.net/10150/594646.
Stiernborg, Sebastian, and Sara Ervik. "Evaluation of Machine Learning Classification Methods : Support Vector Machines, Nearest Neighbour and Decision Tree." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209119.
With growing data and availability, the interest in and use of machine learning is increasing, along with the need for classification. Classification is an important machine learning method for simplifying data and making predictions. This report evaluates three supervised classification methods: Support Vector Machines (SVM) with different kernels, Nearest Neighbour (k-NN) and Decision Tree (DT). The methods were evaluated on accuracy, precision, recall and time. The experiments were performed on artificial data created to represent a variety of distributions, limited to only 2 features and 3 classes. The results show that the accuracy and time measurements vary considerably across the different datasets. SVM with an RBF kernel generally gave higher accuracy than the other classification methods. k-NN showed slightly lower accuracy than SVM with the RBF kernel in general, but performed better on the most challenging dataset. DT is the least time-consuming algorithm and was significantly faster than the other classification methods. The only methods that could compete with DT on time were SVM and k-NN, which were faster than DT on the dataset with small spread and coinciding classes. Although a clear trend can be seen in the results, the area needs to be studied further to draw a comprehensive conclusion, owing to the limited datasets in this study.
Borén, Mirjam. "Classification of discrete stress levels in users using eye tracker and K- Nearest Neighbour algorithm." Thesis, Umeå universitet, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-176258.
Sakouvogui, Kekoura. "Comparative Classification of Prostate Cancer Data using the Support Vector Machine, Random Forest, Dualks and k-Nearest Neighbours." Thesis, North Dakota State University, 2015. https://hdl.handle.net/10365/27698.
Prokopová, Ivona. "Detekce fibrilace síní v EKG." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-413170.
Jia, Wei. "Image analysis and representation for textile design classification." Thesis, University of Dundee, 2011. https://discovery.dundee.ac.uk/en/studentTheses/c667f279-d7a6-4670-b23e-c9dbe2784266.
Full textvan, Woerden Irene. "A statistical investigation of the risk factors for tuberculosis." Thesis, University of Canterbury. Mathematics and Statistics, 2013. http://hdl.handle.net/10092/8662.
Åkerblom, Thea, and Tobias Thor. "Fraud or Not?" Thesis, Uppsala universitet, Statistiska institutionen, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-388695.
Gan, Changquan. "Une approche de classification non supervisée basée sur la notion des K plus proches voisins." Compiègne, 1994. http://www.theses.fr/1994COMP765S.
Alsouda, Yasser. "An IoT Solution for Urban Noise Identification in Smart Cities : Noise Measurement and Classification." Thesis, Linnéuniversitetet, Institutionen för fysik och elektroteknik (IFE), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-80858.
Makki, Sara. "An Efficient Classification Model for Analyzing Skewed Data to Detect Frauds in the Financial Sector." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSE1339/document.
There are different types of risk in the financial domain, such as terrorist financing, money laundering, credit card fraud and insurance fraud, that may have catastrophic consequences for entities such as banks or insurance companies. These financial risks are usually detected using classification algorithms. In classification problems, the skewed distribution of classes, also known as class imbalance, is a very common challenge in financial fraud detection, where special data mining approaches are used alongside the traditional classification algorithms to tackle this issue. The class imbalance problem occurs when one of the classes has many more instances than another, and it becomes even more acute in a big data context. The datasets used to build and train the models contain an extremely small proportion of the minority group, known as positives, in comparison to the majority class, known as negatives. In most cases it is more delicate and crucial to correctly classify the minority group than the majority, as in fraud detection or disease diagnosis. In these examples, the fraud and the disease are the minority groups, and it is more important to detect a fraudulent record, because of its dangerous consequences, than a normal one. These class proportions make it very difficult for a machine learning classifier to learn the characteristics and patterns of the minority group. Classifiers become biased towards the majority group because of its many examples in the dataset and learn to classify it much faster than the other group. After conducting a thorough study of the challenges faced in class imbalance cases, we found that an acceptable sensitivity (i.e. good classification of the minority group) still cannot be reached without a significant decrease in accuracy. This leads to another challenge, namely the choice of performance measures used to evaluate models. In these cases the choice is not straightforward; accuracy or sensitivity alone is misleading. We use other measures, such as the precision-recall curve or the F1-score, to evaluate this trade-off between accuracy and sensitivity. Our objective is to build an imbalanced classification model that accounts for extreme class imbalance and false alarms, in a big data framework. We developed two approaches: a Cost-Sensitive Cosine Similarity K-Nearest Neighbor (CoSKNN) as a single classifier, and a K-modes Imbalance Classification Hybrid Approach (K-MICHA) as an ensemble learning methodology. In CoSKNN, our aim was to tackle the imbalance problem by using cosine similarity as a distance metric and by introducing a cost-sensitive score for classification with the KNN algorithm. We conducted a comparative validation experiment in which we demonstrate the effectiveness of CoSKNN in terms of accuracy and fraud detection. The aim of K-MICHA, on the other hand, is to cluster data points that are similar in terms of the classifiers' outputs and then to calculate fraud probabilities within the obtained clusters, in order to use them for detecting fraud in new transactions. This approach can be applied to the detection of any type of financial fraud where labelled data are available. Finally, we applied K-MICHA to credit card, mobile payment and auto insurance fraud data sets. In all three case studies, we compare K-MICHA with stacking using voting, weighted voting, logistic regression and CART, and also with AdaBoost and random forest. We prove the efficiency of K-MICHA based on these experiments.
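A stripped-down illustration of the CoSKNN idea described above (cosine similarity as the neighbourhood measure plus a per-class cost that favours the minority class): the weighting scheme and cost values here are generic assumptions, not the exact score defined in the thesis.

    # Minimal cost-sensitive cosine-similarity kNN sketch: neighbours are ranked
    # by cosine similarity and their votes are multiplied by a per-class cost,
    # so the rare (fraud) class is harder to outvote. Costs are illustrative.
    import numpy as np

    def cosknn_predict(X_train, y_train, x, k=5, class_cost={0: 1.0, 1: 5.0}):
        sims = X_train @ x / (np.linalg.norm(X_train, axis=1) * np.linalg.norm(x))
        idx = np.argsort(sims)[-k:]                  # k most similar neighbours
        scores = {}
        for i in idx:
            c = int(y_train[i])
            scores[c] = scores.get(c, 0.0) + sims[i] * class_cost[c]
        return max(scores, key=scores.get)

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(100, 3))
    y_train = (rng.random(100) < 0.1).astype(int)    # ~10% minority "fraud" class
    print(cosknn_predict(X_train, y_train, rng.normal(size=3)))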
Alzubaidi, Laith. "Deep learning for medical imaging applications." Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/227812/1/Laith_Alzubaidi_Thesis.pdf.
Bílý, Ondřej. "Moderní řečové příznaky používané při diagnóze chorob." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-218971.
Dyremark, Johanna, and Caroline Mayer. "Bedömning av elevuppsatser genom maskininlärning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-262041.
Today, a large part of a teacher's workload consists of essay scoring, and there is large variability between teachers' grades. This report examines what accuracy can be achieved with an automated essay scoring system for Swedish. Three machine learning models for classification, Linear Discriminant Analysis, K-Nearest Neighbour and Random Forest, are trained and tested with 5-fold cross-validation on essays from Swedish national tests. Essays are classified based on 31 attributes related to language structure, such as token-based length measures, similarity to texts of different levels of formality, and use of grammar. The results show a maximal quadratic weighted kappa value of 0.4829 and a grade identical to the experts' assessment in 57.53% of all tests. These results were achieved by a model based on Linear Discriminant Analysis, which showed higher inter-rater reliability with expert grading than a local teacher. Despite an ongoing digitalization within the Swedish educational system, a number of obstacles prevent a complete automation of essay scoring, such as users' attitudes, ethical issues and the difficulty current techniques have in understanding semantics. Nevertheless, a partial integration of automatic essay scoring has the potential to effectively identify essays suitable for double grading, which can increase the consistency of large-scale tests at a low cost.
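The headline metric in this abstract, quadratic weighted kappa, measures agreement between the model's grades and the expert's grades; a quick way to compute it is scikit-learn's cohen_kappa_score with quadratic weights, shown here on made-up grade vectors.

    # Quadratic weighted kappa between two graders on hypothetical essay grades
    # (grades mapped to integers). Demonstrates the metric only, not the thesis models.
    from sklearn.metrics import cohen_kappa_score

    expert_grades = [5, 4, 3, 5, 2, 4, 3, 1]   # hypothetical expert assessments
    model_grades  = [5, 3, 3, 4, 2, 4, 2, 1]   # hypothetical model predictions
    print(cohen_kappa_score(expert_grades, model_grades, weights="quadratic"))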