Academic literature on the topic 'Data classification and machine learning'

Author: Grafiati

Published: 4 June 2021

Last updated: 3 February 2022

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Data classification and machine learning.'

Journal articles on the topic "Data classification and machine learning"

Sarker, Ananya, Md ShahidUz Zaman, and Md Azmain Yakin Srizon. "Twitter Data Classification by Applying and Comparing Multiple Machine Learning Techniques." International Journal of Innovative Research in Computer Science & Technology 7, no. 6 (November 2019): 147–52. http://dx.doi.org/10.21276/ijircst.2019.7.6.2.

DROTÁR, Peter, and Zdeněk SMÉKAL. "COMPARATIVE STUDY OF MACHINE LEARNING TECHNIQUES FOR SUPERVISED CLASSIFICATION OF BIOMEDICAL DATA." Acta Electrotechnica et Informatica 14, no. 3 (September 1, 2014): 5–10. http://dx.doi.org/10.15546/aeei-2014-0021.

Sabeti, Behnam, Hossein Abedi Firouzjaee, Reza Fahmi, Saeid Safavi, Wenwu Wang, and Mark D. Plumbley. "Credit Risk Rating Using State Machines and Machine Learning." International Journal of Trade, Economics and Finance 11, no. 6 (December 2020): 163–68. http://dx.doi.org/10.18178/ijtef.2020.11.6.683.

Abstract:

Credit risk is the possibility of a loss resulting from a borrower’s failure to repay a loan or meet contractual obligations. With the growing number of customers and the expansion of businesses, it is not possible, or at least not feasible, for banks to assess each customer individually in order to minimize this risk. Machine learning can leverage available user data to model customer behavior and automatically estimate a credit score for each customer. In this research, we propose a novel approach based on state machines that casts this problem as a classical supervised machine learning task. The proposed state machine converts historical user data into a credit score, which generates a data set for training supervised models. We have explored several classification models in our experiments and illustrated the effectiveness of our modeling approach.
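
As a rough illustration of the idea in this abstract, the sketch below walks a small state machine over a customer's repayment history and uses the final state as a training label. The abstract does not describe the paper's actual state machine, so the states, events, scores, and feature names here are all assumptions made for illustration.

```python
# Hypothetical state machine: the states, events, and scores below are
# illustrative assumptions, not taken from the paper.
TRANSITIONS = {
    ("good", "paid"): "good",
    ("good", "late"): "watch",
    ("watch", "paid"): "good",
    ("watch", "late"): "delinquent",
    ("delinquent", "paid"): "watch",
    ("delinquent", "late"): "delinquent",
}
SCORES = {"good": 2, "watch": 1, "delinquent": 0}


def credit_label(history, start="good"):
    """Run the state machine over a payment history and return a score."""
    state = start
    for event in history:
        state = TRANSITIONS[(state, event)]
    return SCORES[state]


def build_dataset(histories):
    """Pair simple (assumed) features with state-machine labels."""
    rows = []
    for history in histories:
        late_ratio = history.count("late") / len(history) if history else 0.0
        features = {"n_months": len(history), "late_ratio": late_ratio}
        rows.append((features, credit_label(history)))
    return rows


dataset = build_dataset([["paid", "late"], ["paid", "late", "late"]])
```

Each row of `dataset` is a (features, label) pair that any supervised classifier could then be trained on.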

Ziarko, Wojciech, and Ning Shan. "Machine Learning Through Data Classification and Reduction." Fundamenta Informaticae 30, no. 3,4 (1997): 373–82. http://dx.doi.org/10.3233/fi-1997-303411.

Rose, Lina, and X. Anitha Mary. "Sensor data classification using machine learning algorithm." Journal of Statistics and Management Systems 23, no. 2 (February 17, 2020): 363–71. http://dx.doi.org/10.1080/09720510.2020.1736319.

Bond, Koby, and Alaa Sheta. "Medical Data Classification using Machine Learning Techniques." International Journal of Computer Applications 183, no. 6 (June 21, 2021): 1–8. http://dx.doi.org/10.5120/ijca2021921339.

Li, Xiaodong, Weijie Mao, and Wei Jiang. "Extreme learning machine based transfer learning for data classification." Neurocomputing 174 (January 2016): 203–10. http://dx.doi.org/10.1016/j.neucom.2015.01.096.

MAHAJAN, SHWETA. "News Classification Using Machine Learning." International Journal on Recent and Innovation Trends in Computing and Communication 9, no. 5 (May 31, 2021): 23–27. http://dx.doi.org/10.17762/ijritcc.v9i5.5464.

Abstract:

Many social media webpages and platforms produce textual data. These different kinds of data need to be analysed and processed to extract meaningful information from the raw data. Text classification plays a vital role in extracting useful information, along with summarization and text retrieval. In this work we consider the problem of news classification using a machine learning approach. We use a news dataset covering various categories such as entertainment, education, sports, and politics. On this data we apply a classification algorithm together with several word vectorizing techniques in order to get the best result. The results are compared on different measures, namely precision, recall, F1 score, and accuracy, with a view to improving performance.
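
The four measures named in this abstract can be computed directly from true and predicted labels. The sketch below does so for one news category treated as the positive class. The function name and the toy labels are illustrative assumptions, not the authors' code or data.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Accuracy plus precision, recall, and F1 for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels, made up for illustration.
y_true = ["sports", "sports", "politics", "politics"]
y_pred = ["sports", "politics", "politics", "sports"]
metrics = one_vs_rest_metrics(y_true, y_pred, positive="sports")
```

Repeating the call for each category and averaging the results gives macro-averaged scores.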

Hall, Brendon. "Facies classification using machine learning." Leading Edge 35, no. 10 (October 2016): 906–9. http://dx.doi.org/10.1190/tle35100906.1.

Abstract:

There has been much excitement recently about big data and the dire need for data scientists who possess the ability to extract meaning from it. Geoscientists, meanwhile, have been doing science with voluminous data for years, without needing to brag about how big it is. But now that large, complex data sets are widely available, there has been a proliferation of tools and techniques for analyzing them. Many free and open-source packages now exist that provide powerful additions to the geoscientist's toolbox, much of which used to be only available in proprietary (and expensive) software platforms.

Punia, Sanjeev Kumar, Manoj Kumar, Thompson Stephan, Ganesh Gopal Deverajan, and Rizwan Patan. "Performance Analysis of Machine Learning Algorithms for Big Data Classification." International Journal of E-Health and Medical Communications 12, no. 4 (July 2021): 60–75. http://dx.doi.org/10.4018/ijehmc.20210701.oa4.

Abstract:

Broadly, three machine learning classification algorithms are used to discover correlations, hidden patterns, and other useful information from different data sets known as big data. Today, Twitter, Facebook, Instagram, and many other social media networks are used to collect unstructured data. The conversion of unstructured data into structured data or meaningful information is a very tedious task. Different machine learning classification algorithms are used to convert unstructured data into structured data. In this paper, the authors first collect unstructured research data from a frequently used social media network (i.e., Twitter) by using a Twitter application program interface (API) stream. Secondly, they implement different machine classification algorithms (supervised, unsupervised, and reinforcement) like decision trees (DT), neural networks (NN), support vector machines (SVM), naive Bayes (NB), linear regression (LR), and k-nearest neighbor (K-NN) on the collected research data set. The paper concludes with a comparison of the different machine learning classification algorithms.

Dissertations / Theses on the topic "Data classification and machine learning"

Stenekap, Daniel. "Classification of Gear-shift data using machine learning." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-53445.

Abstract:

Today, automatic transmissions are the industrial standard in heavy-duty vehicles. However, tolerances and component wear can cause factory-calibrated gearshifts to have deviations that have a negative impact on clutch durability and driver comfort. An adaptive shift process could solve this problem by recognizing when pre-calibrated values are outdated. The purpose of this thesis is to examine the classification of shift types using machine learning for the future goal of an adaptive gearshift process. Recent papers concerning machine learning on time series are reviewed. A data set is collected and validated using hand-engineered features and unsupervised learning. Four deep neural network (DNN) models are trained on raw and normalized shift data. Three of the models show good generalization and perform with accuracies above 90%. An adaptation of the fully convolutional network (FCN) used in [1] shows promise due to its relative size and ability to learn the raw data sets. An adaptation of the multivariate long short-term memory fully convolutional network (MLSTM-FCN) used in [2] is superior on normalized data sets. This thesis shows that DNN structures can be used to distinguish between time series of shift data. However, much effort remains, since a database of shift types is necessary for this work to continue.

Fujino, Akinori. "Machine Learning with Heterogeneous Data for Classification Problems." 京都大学 (Kyoto University), 2009. http://hdl.handle.net/2433/123832.

Teatini, Alex. "Movement trajectory classification using supervised machine learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-265009.

Abstract:

Anything that moves can be tracked, and hence its trajectory analysed. The trajectory of a moving object can carry a lot of useful information depending on what is sought. In this work, the aim is to exploit machine learning to classify finite trajectories based on their shape. In a clinical environment, a set of trajectory classes has been defined based on relevance to particular pathologies. Furthermore, several trajectories have been collected using a depth sensor from a number of subjects. The problem addressed is to evaluate whether it is possible to classify these trajectories into the predefined classes. A trajectory consists of a sequentially ordered list of coordinates, which would imply temporal processing. However, following the success of machine learning in classifying images, the idea of a visual approach surfaced. On this basis, the plots of the trajectories are transformed into images, making the problem similar to a written character recognition problem. The methods implemented for this classification task are the well-known Support Vector Machine (SVM), implemented in several configurations, and the Convolutional Neural Network (CNN), the most appreciated deep learning approach to image recognition. We find that the best way of achieving substantial performance on this classification task is to use a mixture of the two aforementioned methods, namely a two-step classification made of a binary SVM, responsible for a first distinction, followed by a CNN for the final decision. We illustrate that this tree-based approach is capable of granting the best classification accuracy score under the imposed restrictions. In conclusion, a look into possible future developments based on the exploration of novel deep learning methods is given. This project was developed during an internship at the company ‘Qinematic’.
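
The two-step scheme in this abstract (a binary first stage, then a second classifier for the final decision) can be sketched as plain control flow. The abstract does not say how the two stages are wired together, so the sketch assumes the binary stage filters out one class and everything else goes to the second stage. The stage functions are simple stand-ins for a trained SVM and CNN, and all names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Trajectory = Sequence[Tuple[float, float]]  # ordered (x, y) coordinates


def make_cascade(
    first_stage: Callable[[Trajectory], bool],
    second_stage: Callable[[Trajectory], str],
    early_label: str,
) -> Callable[[Trajectory], str]:
    """Build a two-step classifier from a binary stage and a final stage."""

    def classify(trajectory: Trajectory) -> str:
        # Assumed wiring: a positive first-stage decision is final.
        if first_stage(trajectory):
            return early_label
        # Everything else is passed on for the final decision.
        return second_stage(trajectory)

    return classify


# Stand-in stages (assumptions): a real system would call trained models here.
def is_static(trajectory: Trajectory) -> bool:
    return len(trajectory) < 2


def shape_class(trajectory: Trajectory) -> str:
    return "line" if trajectory[0][1] == trajectory[-1][1] else "curve"


classifier = make_cascade(is_static, shape_class, early_label="static")
```

Keeping the stages as interchangeable callables lets either one be swapped out without changing the cascade logic.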

Milne, Linda. "Machine learning for automatic classification of remotely sensed data." University of New South Wales, Computer Science & Engineering, Faculty of Engineering, 2008. http://handle.unsw.edu.au/1959.4/41322.

Abstract:

As more and more remotely sensed data becomes available, it is becoming increasingly hard to analyse it with the more traditional labour-intensive, manual methods. The commonly used techniques, which involve expert evaluation, are widely acknowledged as providing inconsistent results at best. We need more general techniques that can adapt to a given situation and that incorporate the strengths of the traditional methods, human operators and new technologies. The difficulty in interpreting remotely sensed data is that often only a small amount of data is available for classification. It can be noisy, incomplete or contain irrelevant information. Given that the training data may be limited, we demonstrate a variety of techniques for highlighting information in the available data and for selecting the most relevant information for a given classification task. We show that more consistent results between the training data and an entire image can be obtained, and how misclassification errors can be reduced. Specifically, a new technique for attribute selection in neural networks is demonstrated. Machine learning techniques, in particular, provide us with a means of automating classification using training data from a variety of data sources, including remotely sensed data and expert knowledge. A classification framework is presented in this thesis that can be used with any classifier and any available data. While this was developed in the context of vegetation mapping from remotely sensed data using machine learning classifiers, it is a general technique that can be applied to any domain. The framework is aimed in particular at domains where the available training data is inadequate.

Li, Ling; Abu-Mostafa, Yaser S. "Data complexity in machine learning and novel classification algorithms." Diss., Pasadena, Calif.: Caltech, 2006. http://resolver.caltech.edu/CaltechETD:etd-04122006-114210.

Montiel López, Jacob. "Fast and slow machine learning." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLT014/document.

Abstract:

The Big Data era has revolutionized the way in which data is created and processed. In this context, multiple challenges arise given the massive amount of data that needs to be efficiently handled and processed in order to extract knowledge. This thesis explores the symbiosis of batch and stream learning, which are traditionally considered in the literature as antagonists. We focus on the problem of classification from evolving data streams. Batch learning is a well-established approach in machine learning based on a finite sequence: first data is collected, then predictive models are created, then the model is applied. On the other hand, stream learning considers data as infinite, rendering the learning problem as a continuous (never-ending) task. Furthermore, data streams can evolve over time, meaning that the relationship between features and the corresponding response (class in classification) can change. We propose a systematic framework to predict over-indebtedness, a real-world problem with significant implications in modern society. The two versions of the early warning mechanism (batch and stream) outperform the baseline performance of the solution implemented by the Groupe BPCE, the second largest banking institution in France. Additionally, we introduce a scalable model-based imputation method for missing data in classification. This method casts the imputation problem as a set of classification/regression tasks which are solved incrementally. We present a unified framework that serves as a common learning platform where batch and stream methods can positively interact. We show that batch methods can be efficiently trained on the stream setting under specific conditions. The proposed hybrid solution works under the positive interactions between batch and stream methods. We also propose an adaptation of the Extreme Gradient Boosting (XGBoost) algorithm for evolving data streams. The proposed adaptive method generates and updates the ensemble incrementally using mini-batches of data. Finally, we introduce scikit-multiflow, an open source framework in Python that fills the gap in Python for a development/research platform for learning from evolving data streams.

He, Jin. "Robust Mote-Scale Classification of Noisy Data via Machine Learning." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1440413201.

Rosquist, Christine. "Text Classification of Human Resources-related Data with Machine Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302375.

Abstract:

Text classification has been an important application and research subject since the origin of digital documents. Today, as more and more data are stored in the form of electronic documents, the text classification approach is even more vital. There exist various studies that apply machine learning methods such as Naive Bayes and Convolutional Neural Networks (CNN) to text classification and sentiment analysis. However, most of these studies do not focus on cross-domain classification, i.e., the case where machine learning models that have been trained on a dataset from one context are tested on a dataset from another context. This is useful when there is not enough training data for the specific domain where text data is to be classified. This thesis investigates how the machine learning methods Naive Bayes and CNN perform when they are trained in one context and then tested in another, slightly different context. The study uses data from employee reviews to train the models, and the models are then tested both on the employee-review data and on human resources-related data. Thus, the aim of the thesis is to gain insights on how to develop a system with the capability to perform accurate cross-domain classification, and to provide more insights to the text classification research area in general. A comparative analysis of the models Naive Bayes and CNN was done, and the results showed that both models performed quite similarly when classifying sentences using only the employee-review data to train and test the models. However, CNN performed slightly better when it comes to multiclass classification for the employee data, which indicates that CNN might be a better model in that context. From a cross-domain perspective, Naive Bayes turned out to be the better model, since it performed better in all of the metrics evaluated. However, both models can be used as guidance tools to classify human resources-related data quickly, even if Naive Bayes is the model that performs best in the cross-domain context. The results can possibly be improved with more research and need to be verified with more data. Suggestions for improving the results include enhancing the hyperparameter optimization, using another approach to handle the data imbalance, and adjusting the preprocessing methods used. It is also worth noting that statistical significance could not be confirmed in all of the different test cases, meaning that no absolute conclusions can be drawn, but the results from this thesis work still provide an indication of how well the models perform.
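
The cross-domain setup in this abstract (train in one context, test in another) can be sketched with a small multinomial Naive Bayes written from scratch. The toy sentences, the labels, and the choice to skip out-of-vocabulary words are assumptions for illustration. The thesis's data, preprocessing, and model settings are not given in the abstract.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"classes": class_counts, "words": word_counts, "vocab": vocab}


def predict_nb(model, text):
    """Return the most probable label using Laplace-smoothed log scores."""
    total_docs = sum(model["classes"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["classes"].items():
        counts = model["words"][label]
        denominator = sum(counts.values()) + vocab_size
        score = math.log(n_docs / total_docs)
        for token in text.lower().split():
            if token in model["vocab"]:  # assumed: ignore unseen words
                score += math.log((counts[token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs):
    return sum(predict_nb(model, text) == label for text, label in docs) / len(docs)


# Toy data, made up for illustration: reviews as source, HR text as target.
source_domain = [
    ("great team and great pay", "pos"),
    ("bad management and long hours", "neg"),
    ("great culture", "pos"),
    ("bad pay", "neg"),
]
target_domain = [
    ("great onboarding process", "pos"),
    ("bad payroll support", "neg"),
]

nb_model = train_nb(source_domain)
in_domain_accuracy = accuracy(nb_model, source_domain)
cross_domain_accuracy = accuracy(nb_model, target_domain)
```

On real data the two accuracies usually differ, and that gap is what a cross-domain evaluation measures. The toy sets here are too small to show it.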

Pehrson, Jakob, and Sara Lindstrand. "Support Unit Classification through Supervised Machine Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281537.

Abstract:

The purpose of this article is to evaluate the impact a supervised machine learning classification model can have on the process of internal customer support within a large digitized company. Chatbots are becoming a frequently used utility among digital services, though the true general impact is not always clear. The research is separated into the following two questions: (1) Which supervised machine learning algorithm among naïve Bayes, logistic regression, and neural networks can best predict the correct support a user needs, and with what accuracy? And (2) What is the effect on productivity and customer satisfaction of using machine learning to sort customer needs? The data was collected from the internal server database of a large digital company and was then used to train and test the three classification algorithms. Furthermore, a survey was conducted with questions focused on understanding how the current system affects the involved employees. A first finding indicates that neural networks are the best-suited model for the classification task. However, when the scope and complexity were limited, naïve Bayes and logistic regression performed sufficiently well. A second finding of the study is that the classification model potentially improves productivity given that the baseline is met. However, it is difficult to draw conclusions on the exact effects on customer satisfaction, since there are many aspects to take into account. Nevertheless, there is good potential to achieve a positive net effect.

Amil, Marletti Pablo. "Machine learning methods for the characterization and classification of complex data." Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/668842.

Abstract:

This thesis work presents novel methods for the analysis and classification of medical images and, more generally, complex data. First, an unsupervised machine learning method is proposed to order anterior chamber OCT (Optical Coherence Tomography) images according to a patient's risk of developing angle-closure glaucoma. In a second study, two outlier-finding techniques are proposed to improve the results of the above-mentioned machine learning algorithm; we also show that they are applicable to a wide variety of data, including fraud detection in credit card transactions. In a third study, the topology of the vascular network of the retina is analyzed by treating it as a complex tree-like network, and we show that structural differences reveal the presence of glaucoma and diabetic retinopathy. In a fourth study, we use a model of a laser with optical injection that presents extreme events in its intensity time series to evaluate machine learning methods for forecasting such extreme events.

Books on the topic "Data classification and machine learning"

Suthaharan, Shan. Machine Learning Models and Algorithms for Big Data Classification. Boston, MA: Springer US, 2016. http://dx.doi.org/10.1007/978-1-4899-7641-3.

Pham, Thuy T. Applying Machine Learning for Automated Classification of Biomedical Data in Subject-Independent Settings. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-319-98675-3.

Jahrestagung, Gesellschaft für Klassifikation. Data analysis, machine learning and applications: Proceedings of the 31st Annual Conference of the Gesellschaft für Klassifikation e.V., Albert-Ludwigs-Universität Freiburg, March 7–9, 2007. Edited by Christine Preisach. Berlin: Springer, 2008.

Schuurmans, Dale Eric. Effective classification learning. Toronto: University of Toronto, 1996.

Dean, Jared. Big Data, Data Mining, and Machine Learning. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2014. http://dx.doi.org/10.1002/9781118691786.

Friedman, Craig. Utility-based learning from data. Boca Raton: Chapman & Hall/CRC, 2010.

Nicosia, Giuseppe, Panos Pardalos, Giovanni Giuffrida, Renato Umeton, and Vincenzo Sciacca, eds. Machine Learning, Optimization, and Data Science. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-13709-0.

Preisach, Christine, Hans Burkhardt, Lars Schmidt-Thieme, and Reinhold Decker, eds. Data Analysis, Machine Learning and Applications. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-78246-9.

Nicosia, Giuseppe, Panos Pardalos, Giovanni Giuffrida, and Renato Umeton, eds. Machine Learning, Optimization, and Big Data. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-72926-8.

Pardalos, Panos M., Piero Conca, Giovanni Giuffrida, and Giuseppe Nicosia, eds. Machine Learning, Optimization, and Big Data. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-51469-7.

Book chapters on the topic "Data classification and machine learning"

Paluszek, Michael, and Stephanie Thomas. "Data Classification." In MATLAB Machine Learning, 113–41. Berkeley, CA: Apress, 2016. http://dx.doi.org/10.1007/978-1-4842-2250-8_8.

Paluszek, Michael, and Stephanie Thomas. "Data Classification with Decision Trees." In MATLAB Machine Learning Recipes, 147–69. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-3916-2_7.

Drummond, Chris. "Classification." In Encyclopedia of Machine Learning and Data Mining, 205–8. Boston, MA: Springer US, 2017. http://dx.doi.org/10.1007/978-1-4899-7687-1_111.

Vucetic, Slobodan, and Zoran Obradovic. "Classification on Data with Biased Class Distribution." In Machine Learning: ECML 2001, 527–38. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44795-4_45.

da Costa, Joaquim Pinto, and Jaime S. Cardoso. "Classification of Ordinal Data Using Neural Networks." In Machine Learning: ECML 2005, 690–97. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11564096_70.

Stefanowski, Jerzy, and Dariusz Brzezinski. "Stream Classification." In Encyclopedia of Machine Learning and Data Mining, 1191–99. Boston, MA: Springer US, 2017. http://dx.doi.org/10.1007/978-1-4899-7687-1_908.

Stefanowski, Jerzy, and Dariusz Brzezinski. "Stream Classification." In Encyclopedia of Machine Learning and Data Mining, 1–9. Boston, MA: Springer US, 2016. http://dx.doi.org/10.1007/978-1-4899-7502-7_908-1.

Fürnkranz, Johannes. "Classification Rule." In Encyclopedia of Machine Learning and Data Mining, 1. Boston, MA: Springer US, 2016. http://dx.doi.org/10.1007/978-1-4899-7502-7_914-1.

Namata, Galileo, Prithviraj Sen, Mustafa Bilgic, and Lise Getoor. "Collective Classification." In Encyclopedia of Machine Learning and Data Mining, 1–7. Boston, MA: Springer US, 2014. http://dx.doi.org/10.1007/978-1-4899-7502-7_44-1.

Mladenić, Dunja, Janez Brank, and Marko Grobelnik. "Document Classification." In Encyclopedia of Machine Learning and Data Mining, 1–5. Boston, MA: Springer US, 2016. http://dx.doi.org/10.1007/978-1-4899-7502-7_75-1.

Conference papers on the topic "Data classification and machine learning"

Li, Wenrui, Nishita Narvekar, Nakshatra Nakshatra, Nitisha Raut, Birsen Sirkeci, and Jerry Gao. "Seismic Data Classification Using Machine Learning." In 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService). IEEE, 2018. http://dx.doi.org/10.1109/bigdataservice.2018.00017.

Bost, Raphael, Raluca Ada Popa, Stephen Tu, and Shafi Goldwasser. "Machine Learning Classification over Encrypted Data." In Network and Distributed System Security Symposium. Reston, VA: Internet Society, 2015. http://dx.doi.org/10.14722/ndss.2015.23241.

Lin, Yi-meng, Xuan Wang, Wing Y. Ng, Qun Chang, Daniel Yeung, and Xiao-long Wang. "Sphere Classification for Ambiguous Data." In 2006 International Conference on Machine Learning and Cybernetics. IEEE, 2006. http://dx.doi.org/10.1109/icmlc.2006.258851.

Litvinov, S. I., P. S. Bekeshko, and O. O. Adamovich. "Machine learning for classification of seismic data." In Data Science in Oil and Gas 2021. European Association of Geoscientists & Engineers, 2021. http://dx.doi.org/10.3997/2214-4609.202156018.

Aydogan, Murat, and Ali Karci. "Turkish Text Classification with Machine Learning and Transfer Learning." In 2019 International Artificial Intelligence and Data Processing Symposium (IDAP). IEEE, 2019. http://dx.doi.org/10.1109/idap.2019.8875919.

El-Mandouh, Amira M., Laila A. Abd-Elmegid, Hamdi A. Mahmoud, and Mohamed H. Haggag. "Machine Learning Approach for Big Data Classification." In 2017 27th International Conference on Computer Theory and Applications (ICCTA). IEEE, 2017. http://dx.doi.org/10.1109/iccta43079.2017.9497138.

Sharma, Aman, and Rinkle Rani. "Classification of Cancerous Profiles Using Machine Learning." In 2017 International Conference on Machine Learning and Data Science (MLDS). IEEE, 2017. http://dx.doi.org/10.1109/mlds.2017.6.

Niu, Shuteng, Jian Wang, Yongxin Liu, and Houbing Song. "Transfer Learning based Data-Efficient Machine Learning Enabled Classification." In 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). IEEE, 2020. http://dx.doi.org/10.1109/dasc-picom-cbdcom-cyberscitech49142.2020.00108.

Bugge, A. J., J. E. Lie, and S. Clark. "Automatic Facies Classification And Horizon Tracking In 3D Seismic Data." In First EAGE/PESGB Workshop Machine Learning. Netherlands: EAGE Publications BV, 2018. http://dx.doi.org/10.3997/2214-4609.201803010.

Garg, Amit, Nachiket Trivedi, Junlan Lu, Magdalini Eirinaki, Bin Yu, and Femi Olumofin. "An evaluation of machine learning methods for domain name classification." In 2020 IEEE International Conference on Big Data (Big Data). IEEE, 2020. http://dx.doi.org/10.1109/bigdata50022.2020.9377787.

Reports on the topic "Data classification and machine learning"

Davis, Benjamin. Applying Machine Learning to the Classification of DC-DC Converters: Real-world data collection processing & Validation. Office of Scientific and Technical Information (OSTI), September 2020. http://dx.doi.org/10.2172/1670255.

Hodgdon, Taylor, Anthony Fuentes, Jason Olivier, Brian Quinn, and Sally Shoop. Automated terrain classification for vehicle mobility in off-road conditions. Engineer Research and Development Center (U.S.), April 2021. http://dx.doi.org/10.21079/11681/40219.

Abstract:

The U.S. Army is increasingly interested in autonomous vehicle operations, including off-road autonomous ground maneuver. Unlike on-road, off-road terrain can vary drastically, especially with the effects of seasonality. As such, vehicles operating in off-road environments need to be informed about the changing terrain prior to departure or en route for successful maneuver to the mission end point. The purpose of this report is to assess machine learning algorithms used on various remotely sensed datasets to see which combinations are useful for identifying different terrain. The study collected data from several types of winter conditions by using both active and passive, satellite and vehicle-based sensor platforms and both supervised and unsupervised machine learning algorithms. To classify specific terrain types, supervised algorithms must be used in tandem with large training datasets, which are time-consuming to create. However, unsupervised segmentation algorithms can be used to help label the training data. More work is required gathering training data to include a wider variety of terrain types. While classification is a good first step, more detailed information about the terrain properties will be needed for off-road autonomy.

Shabalina, A., A. Carpenter, M. Rahman, C. Tennant, and L. Vidyaratne. Machine Learning Based Cavity Fault Classification and Prediction. Office of Scientific and Technical Information (OSTI), December 2020. http://dx.doi.org/10.2172/1735851.

Pilania, Ghanshyam, James E. Gubernatis, Turab Lookman, and Rampi Ramprasad. Materials Classification & Accelerated Property Predictions using Machine Learning. Office of Scientific and Technical Information (OSTI), June 2015. http://dx.doi.org/10.2172/1184607.

Waldrop, Lauren, Carl Hart, Nancy Parker, Chris Pettit, and Scotland McIntosh. Utility of machine learning algorithms for natural background photo classification. Cold Regions Research and Engineering Laboratory (U.S.), June 2018. http://dx.doi.org/10.21079/11681/27344.

Porter, Reid B., James P. Theiler, and Donald R. Hush. Interactive Machine Learning in Data Exploitation. Office of Scientific and Technical Information (OSTI), January 2013. http://dx.doi.org/10.2172/1060903.

Rao, Vishwas, Sandeep Madireddy, Carlo Graziani, Pengfei Xue, and Romit Maulik. Probabilistic Machine Learning and Data Assimilation. Office of Scientific and Technical Information (OSTI), April 2021. http://dx.doi.org/10.2172/1769766.

Hedyehzadeh, Mohammadreza, Shadi Yoosefian Dezfuli Nezhad, and Naser Safdarian. Evaluation of Conventional Machine Learning Methods for Brain Tumour Type Classification. "Prof. Marin Drinov" Publishing House of Bulgarian Academy of Sciences, June 2020. http://dx.doi.org/10.7546/crabs.2020.06.14.

Byrd, Lexie, Curtis Smith, Ross Kunz, Nancy Lybeck, Ronald Boring, Humberto Garcia, Victor Walker, et al. Big Data, Machine Learning, Artificial Intelligence [PowerPoint]. Office of Scientific and Technical Information (OSTI), May 2020. http://dx.doi.org/10.2172/1617329.

Downard, Alicia, Stephen Semmens, and Bryant Robbins. Automated characterization of ridge-swale patterns along the Mississippi River. Engineer Research and Development Center (U.S.), April 2021. http://dx.doi.org/10.21079/11681/40439.

Abstract:

The orientation of constructed levee embankments relative to alluvial swales is a useful measure for identifying regions susceptible to backward erosion piping (BEP). This research was conducted to create an automated, efficient process to classify patterns and orientations of swales within the Lower Mississippi Valley (LMV) to support levee risk assessments. Two machine learning algorithms are used to train the classification models: a convolutional neural network and a U-net. The resulting workflow can identify linear topographic features but is unable to reliably differentiate swales from other features, such as the levee structure and riverbanks. Further tuning of training data or manual identification of regions of interest could yield significantly better results. The workflow also provides an orientation to each linear feature to support subsequent analyses of position relative to levee alignments. While the individual models fall short of immediate applicability, the procedure provides a feasible, automated scheme to assist in swale classification and characterization within mature alluvial valley systems similar to the LMV.
