
Dissertations / Theses on the topic 'K-Support vector nearest neighbor'



Consult the top 50 dissertations / theses for your research on the topic 'K-Support vector nearest neighbor.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Amlathe, Prakhar. "Standard Machine Learning Techniques in Audio Beehive Monitoring: Classification of Audio Samples with Logistic Regression, K-Nearest Neighbor, Random Forest and Support Vector Machine." DigitalCommons@USU, 2018. https://digitalcommons.usu.edu/etd/7050.

Full text
Abstract:
Honeybees are one of the most important pollinating species in agriculture; three out of every four crops have the honeybee as their sole pollinator. Since 2006 there has been a drastic decrease in the bee population, which is attributed to Colony Collapse Disorder (CCD). Bee colonies fail or die without showing the traditional health symptoms that could otherwise alert beekeepers to their situation in advance. An electronic beehive monitoring system has various embedded sensors that extract video, audio and temperature data, which can provide critical information on colony behavior and health without invasive beehive inspections. Previously, significant patterns and information have been extracted by processing the video/image data, but no work had been done using audio data. This research takes the first step towards the use of audio data in the electronic beehive monitoring system (BeePi) by enabling automatic classification of audio samples into its different classes and categories. The experimental results give initial support to the claim that monitoring bee buzzing signals from the hive is feasible, can be a good indicator for estimating hive health, and can help to differentiate normal behavior from deviations.
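The abstract names four standard classifiers; below is a minimal sketch of how such a comparison could be set up with scikit-learn. The feature matrix is synthetic stand-in data, not the BeePi audio features, so this only illustrates the general workflow.

```python
# Hypothetical comparison of the four classifiers named in the abstract.
# X is an (n_samples, n_features) array of audio features, y the class labels;
# neither comes from the BeePi dataset here.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=40, n_classes=3,
                           n_informative=10, random_state=0)  # stand-in data

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbor": KNeighborsClassifier(n_neighbors=5),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "support vector machine": SVC(kernel="rbf", gamma="scale"),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```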
APA, Harvard, Vancouver, ISO, and other styles
2

Sakouvogui, Kekoura. "Comparative Classification of Prostate Cancer Data using the Support Vector Machine, Random Forest, Dualks and k-Nearest Neighbours." Thesis, North Dakota State University, 2015. https://hdl.handle.net/10365/27698.

Full text
Abstract:
This paper compares four classification tools, Support Vector Machine (SVM), Random Forest (RF), DualKS and k-Nearest Neighbors (kNN), that are based on different statistical learning theories. The dataset used is a microarray gene expression dataset of 596 male patients with prostate cancer. After treatment, the patients were classified into one phenotype group with three levels: PSA (Prostate-Specific Antigen), Systematic and NED (No Evidence of Disease). The purpose of this research is to determine the performance of each classifier by selecting the optimal kernels and parameters that give the best prediction rate of the phenotype. The paper begins with a discussion of previous implementations of the tools and their mathematical theories. The results showed that three classifiers achieved comparable, above-average performance while DualKS did not. We also observed that SVM outperformed the kNN, RF and DualKS classifiers.
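Selecting "the optimal kernels and parameters" as described above is commonly done with a cross-validated grid search; the sketch below is a generic illustration with scikit-learn on stand-in data, not the prostate-cancer microarray set.

```python
# Hypothetical kernel/parameter selection via cross-validated grid search.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=50, n_classes=3,
                           n_informative=12, random_state=1)  # stand-in data

svm_grid = GridSearchCV(
    SVC(),
    {"kernel": ["linear", "rbf", "poly"], "C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
    cv=5,
)
knn_grid = GridSearchCV(
    KNeighborsClassifier(),
    {"n_neighbors": [3, 5, 7, 11], "weights": ["uniform", "distance"]},
    cv=5,
)

for name, grid in [("SVM", svm_grid), ("kNN", knn_grid)]:
    grid.fit(X, y)
    print(name, grid.best_params_, f"CV accuracy {grid.best_score_:.3f}")
```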
APA, Harvard, Vancouver, ISO, and other styles
3

Naram, Hari Prasad. "Classification of Dense Masses in Mammograms." OpenSIUC, 2018. https://opensiuc.lib.siu.edu/dissertations/1528.

Full text
Abstract:
This dissertation details techniques developed to aid in the classification of tumors, non-tumors, and dense masses in mammograms. Characteristics such as texture in a mammographic image are used to identify regions of interest as part of the classification, and pattern recognition techniques such as the nearest mean classifier and the support vector machine classifier are used to classify the extracted features. The initial stages process the mammographic image to extract the features needed for classification, and in the final stage the features are classified using the pattern recognition techniques mentioned above. The goal of this research is to provide medical experts and researchers with an effective method that aids in identifying tumors, non-tumors, and dense masses in a mammogram. First, the breast region is extracted from the entire mammogram by creating masks and using them to extract the region of interest pertaining to the tumor. A chain code is employed to extract the various regions, which could potentially be classified as tumors, non-tumors, or dense regions. Adaptive histogram equalization is employed to enhance the contrast of the image; applying it several times produces a saturated image containing only the bright spots of the mammogram, which appear as dense regions, and these dense masses could be potential tumors requiring treatment. Texture characteristics of the mammographic image are used for feature extraction, and a total of thirteen Haralick features are used to classify the three classes with the nearest mean and support vector machine classifiers. The support vector machine classifier is used for the two-class problems, and a radial basis function (RBF) kernel is used to find the best possible C and gamma values. The results suggest that the best classification accuracy was achieved by the support vector machine for both tumor vs. non-tumor and tumor vs. dense masses: the maximum accuracy is above 90% for tumor vs. non-tumor and 70.8% for the dense masses, using 11 features. The support vector machine performed better than the nearest mean classifier in classifying the classes. Case studies were performed using two distinct datasets, each consisting of 24 patients' data in two views, the cranio-caudal and medio-lateral oblique views, from which the regions of interest that could possibly be tumors, non-tumors, or dense regions (masses) were extracted.
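As a rough illustration of the texture-plus-SVM pipeline described above, the following sketch computes a few gray-level co-occurrence statistics (a subset of Haralick-style features) with scikit-image and searches for the C and gamma of an RBF support vector machine. The image patches and labels are random stand-ins, not mammogram data, and scikit-image 0.19+ naming is assumed.

```python
# Hypothetical texture-feature + RBF-SVM sketch (scikit-image >= 0.19 spelling).
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

def texture_features(patch):
    """Gray-level co-occurrence statistics for one 8-bit image patch."""
    glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation", "dissimilarity"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(60, 64, 64), dtype=np.uint8)  # stand-in ROIs
labels = rng.integers(0, 2, size=60)                               # tumor vs non-tumor

X = np.array([texture_features(p) for p in patches])
grid = GridSearchCV(SVC(kernel="rbf"),
                    {"C": [1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}, cv=5)
grid.fit(X, labels)
print("best (C, gamma):", grid.best_params_, "CV accuracy:", round(grid.best_score_, 3))
```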
APA, Harvard, Vancouver, ISO, and other styles
4

Janson, Lisa, and Minna Mathisson. "Data mining inom tillverkningsindustrin : En fallstudie om möjligheten att förutspå kvalitetsutfall i produktionslinjer." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301246.

Full text
Abstract:
As the adaptation towards Industry 4.0 proceeds, the possibility of using machine learning as a tool for further development of industrial production becomes increasingly relevant. In this paper, a case study was conducted at Volvo Group in Köping to investigate the possibility of predicting quality outcomes in the pressing of hub and main shaft. Three different machine learning models were implemented and compared against each other, trained and evaluated on a dataset from Volvo's production site in Köping. The low evaluation scores obtained indicate that the quality outcome of the pressing could not be predicted from the variables included in that dataset. Therefore, a fabricated dataset was also used, containing three additional variables with fabricated values and a known causal relationship between two of the variables and the quality outcome. The purpose was to investigate whether the poor evaluation metrics resulted from there being no pattern between the included variables and the quality outcome, or from the models being unable to find such a pattern.
The performance of the models when trained and evaluated on the fabricated dataset indicates that they were in fact able to find the pattern that was known to exist, so the poor result on the real assembly data stems from the data rather than from the models. The support vector machine was the model that performed best, given the evaluation metrics chosen in this study. Consequently, if the traceability of the components were enhanced in the future and more machines in the production line transmitted production data to a connected system, the study could be repeated with additional variables and a larger dataset. The fact that the models included in this study succeeded in finding patterns in the dataset when such patterns were known to exist motivates the use of the same models in future work. With enhanced traceability and more machines transmitting production data to a connected system, machine learning models could be used as components in larger monitoring systems in order to achieve efficiency gains.
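The sanity check described above, injecting fabricated variables with a known relationship to the label and checking whether the models recover it, can be reproduced in a generic form; the sketch below uses scikit-learn and invented data, not the Volvo assembly dataset.

```python
# Hypothetical sanity check: add synthetic variables with a known causal link
# to the label and verify that a classifier can recover the relationship.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n = 1000
X_real = rng.normal(size=(n, 6))          # stand-in for real process variables
y = rng.integers(0, 2, size=n)            # quality outcome, unrelated to X_real

# Fabricated variables: two of them are built directly from the label.
x_fab = np.column_stack([
    y + 0.1 * rng.normal(size=n),         # strongly informative
    2 * y + 0.2 * rng.normal(size=n),     # strongly informative
    rng.normal(size=n),                   # pure noise
])

for name, X in [("real variables only", X_real),
                ("with fabricated variables", np.hstack([X_real, x_fab]))]:
    acc = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()
    print(f"{name}: CV accuracy {acc:.2f}")
# Near-chance accuracy in the first case and near-perfect in the second indicates
# that the models, not the data, are able to find patterns when they exist.
```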
APA, Harvard, Vancouver, ISO, and other styles
5

Pai, Chih-Yun. "Automatic Pain Assessment from Infants’ Crying Sounds." Scholar Commons, 2016. http://scholarcommons.usf.edu/etd/6560.

Full text
Abstract:
Crying is the means infants use to express their emotional state, and it gives parents and nurses a criterion for understanding an infant's physiological state. Many researchers have analyzed infants' crying sounds to diagnose specific diseases or to determine the reasons for crying. This thesis presents an automatic crying level assessment system to classify infants' crying sounds, recorded under realistic conditions in the Neonatal Intensive Care Unit (NICU), as whimpering or vigorous crying. To analyze the crying signal, Welch's method and Linear Predictive Coding (LPC) are used to extract spectral features; the average and standard deviation of the frequency signal and the maximum power spectral density are additional spectral features used in classification. Three state-of-the-art classifiers, namely k-Nearest Neighbors, Random Forests, and Least Squares Support Vector Machine, are tested in this work. The highest accuracy in classifying whimpering and vigorous crying, 90%, is achieved on the clean dataset, sampled 10 seconds before and 5 seconds after scoring, using the k-Nearest Neighbors classifier.
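Welch's method, mentioned above for spectral feature extraction, is available in SciPy; the sketch below shows how spectral features of the kind listed (mean and standard deviation of the frequency content, maximum power spectral density) could be computed. The signal is a synthetic stand-in, not an NICU recording.

```python
# Hypothetical spectral features from a synthetic audio signal via Welch's method.
import numpy as np
from scipy.signal import welch

fs = 16000                                     # assumed sampling rate
t = np.arange(0, 5.0, 1 / fs)
signal = np.sin(2 * np.pi * 450 * t) + 0.3 * np.random.randn(t.size)  # stand-in cry

freqs, psd = welch(signal, fs=fs, nperseg=1024)    # Welch power spectral density

mean_freq = np.sum(freqs * psd) / np.sum(psd)      # PSD-weighted mean frequency
std_freq = np.sqrt(np.sum(((freqs - mean_freq) ** 2) * psd) / np.sum(psd))
features = {
    "mean_freq": mean_freq,
    "std_freq": std_freq,
    "max_psd": psd.max(),                          # maximum power spectral density
    "peak_freq": freqs[np.argmax(psd)],
}
print(features)
```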
APA, Harvard, Vancouver, ISO, and other styles
6

Björk, Gabriella. "Evaluation of system design strategies and supervised classification methods for fruit recognition in harvesting robots." Thesis, KTH, Skolan för industriell teknik och management (ITM), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217859.

Full text
Abstract:
This master thesis project was carried out by one student at the Royal Institute of Technology in collaboration with Cybercom Group. The aim was to evaluate and compare system design strategies for fruit recognition in harvesting robots and the performance of supervised machine learning classification methods applied to this specific task. The thesis covers the basics of these systems, for which parameters, constraints, requirements, and design decisions have been investigated; this framework is used as a foundation for the implementation of both the sensing system and the processing and classification algorithms. A plastic tomato plant with fruit of varying maturity was used as a basis for training and testing, and a Kinect v2 for Windows, including sensors for high-resolution color, depth, and IR data, was used for image acquisition. The obtained data were processed, and features of objects of interest extracted, using MATLAB and a Kinect SDK provided by Microsoft. Multiple views of the plant were acquired by rotating the plant on a platform controlled by a stepper motor and an Arduino Uno. The algorithms tested were binary classifiers, including Support Vector Machine, Decision Tree, and k-Nearest Neighbor. The models were trained and validated using five-fold cross-validation in MATLAB's Classification Learner application, and performance metrics such as precision, recall, and the F1-score, used for accuracy comparison, were calculated. The statistical models k-NN and SVM achieved the best scores, and the method considered most promising for fruit recognition purposes was the SVM.
APA, Harvard, Vancouver, ISO, and other styles
7

VANCE, DANNY W. "AN ALL-ATTRIBUTES APPROACH TO SUPERVISED LEARNING." University of Cincinnati / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1162335608.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Carls, Fredrik. "Evaluation of machine learning methods for anomaly detection in combined heat and power plant." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-255006.

Full text
Abstract:
In the hope of increasing the detection rate of faults in combined heat and power plant boilers, and thus lowering unplanned maintenance, three machine learning models are constructed and evaluated. The algorithms k-Nearest Neighbor, One-Class Support Vector Machine, and Auto-encoder have a proven track record in anomaly detection research, but are relatively unexplored for industrial applications such as this one due to the difficulty of collecting non-artificial labeled data in the field. The baseline versions of the k-Nearest Neighbor and Auto-encoder performed very similarly; nevertheless, the Auto-encoder was slightly better and reached an area under the precision-recall curve (AUPRC) of 0.966 and 0.615 on the training and test periods, respectively. No sufficiently good results were reached with the One-Class Support Vector Machine. The Auto-encoder was then made more sophisticated to see how much its performance could be increased; the AUPRC rose to 0.987 and 0.801 on the training and test periods, respectively, and the model was able to detect and generate one alarm for each incident period that occurred during the test period. The conclusion is that machine learning can successfully be utilized to detect faults at an earlier stage and potentially circumvent otherwise costly unplanned maintenance, although there is still much room for improvement in both the model and the data collection.
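The area under the precision-recall curve (AUPRC) used as the evaluation metric above can be computed with scikit-learn as follows; the labels and anomaly scores here are synthetic placeholders rather than the plant data.

```python
# Hypothetical AUPRC evaluation of an anomaly score against known labels.
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(0)
y_true = (rng.random(500) < 0.05).astype(int)      # rare anomalies (~5 %)
# Stand-in anomaly scores, e.g. an autoencoder's reconstruction error:
scores = rng.normal(size=500) + 3.0 * y_true

auprc = average_precision_score(y_true, scores)    # area under the PR curve
precision, recall, thresholds = precision_recall_curve(y_true, scores)
print(f"AUPRC = {auprc:.3f}, {len(thresholds)} candidate alarm thresholds")
```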
APA, Harvard, Vancouver, ISO, and other styles
9

Veras, Ricardo da Costa. "Utilização de métodos de machine learning para identificação de instrumentos musicais de sopro pelo timbre." Repositório Institucional da UFABC, 2018.

Find full text
Abstract:
Advisor: Prof. Dr. Ricardo Suyama. Master's dissertation, Universidade Federal do ABC, Graduate Program in Information Engineering, Santo André, 2018. In general, pattern classification for signal processing has been studied and used for the interpretation of many kinds of information, such as images, audio, geophysical data and electrical impulses. In this work, machine learning techniques applied to the problem of musical instrument identification are studied, aiming at an automatic timbre recognition system. These techniques were used with five instruments of the woodwind category (clarinet, bassoon, flute, oboe and sax). The techniques used were kNN (with k = 3) and SVM (in a non-linear configuration), and several audio features, such as MFCC (Mel-Frequency Cepstral Coefficients), ZCR (Zero Crossing Rate) and entropy, among others, served as the data source for the training and testing processes. Instruments with similar timbres were deliberately chosen in order to examine how a classifier system behaves under these specific conditions. The behavior of these techniques was also observed on audio unknown to the training, on excerpts containing a mixture of elements (generating interference for each classifier model) that could skew the results, and on mixtures of elements belonging to the observed classes added together in the same audio. The results indicate that the selected features carry relevant information about the timbre of each of the evaluated instruments (as observed for the solo recordings), although the accuracy obtained for some of the instruments was lower than expected (as observed for the duets).
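MFCC and ZCR features of the kind described can be extracted, for instance, with the librosa library (an assumption; the dissertation does not name its tooling). The sketch below builds per-clip feature vectors from synthetic tones, not real instrument recordings, and fits a k = 3 nearest-neighbour classifier.

```python
# Hypothetical MFCC/ZCR feature extraction and k = 3 nearest-neighbour sketch.
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

def timbre_features(y, sr):
    """Mean/std summaries of MFCC and zero-crossing-rate frames."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    zcr = librosa.feature.zero_crossing_rate(y)
    return np.hstack([mfcc.mean(axis=1), mfcc.std(axis=1), zcr.mean(), zcr.std()])

sr = 22050
t = np.linspace(0, 1.0, sr, endpoint=False)
# Stand-in "instrument" tones with different harmonic content (not real recordings).
tones = {
    "clarinet-like": np.sin(2 * np.pi * 294 * t) + 0.5 * np.sin(2 * np.pi * 882 * t),
    "flute-like": np.sin(2 * np.pi * 294 * t),
    "sax-like": sum(np.sin(2 * np.pi * 294 * k * t) / k for k in range(1, 6)),
}

X = np.array([timbre_features(y.astype(np.float32), sr) for y in tones.values()])
clf = KNeighborsClassifier(n_neighbors=3).fit(X, list(tones.keys()))  # kNN, k = 3
print(clf.predict(X))
```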
APA, Harvard, Vancouver, ISO, and other styles
10

Wahab, Nor-Ul. "Evaluation of Supervised Machine LearningAlgorithms for Detecting Anomalies in Vehicle’s Off-Board Sensor Data." Thesis, Högskolan Dalarna, Mikrodataanalys, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:du-28962.

Full text
Abstract:
A diesel particulate filter (DPF) is designed to physically remove diesel particulate matter, or soot, from the exhaust gas of a diesel engine. Frequently replacing the DPF is a waste of resources, while waiting for full utilization is risky and very costly, so what is the optimal time/mileage at which to change the DPF? Answering this question is very difficult without knowing when the DPF was changed in a vehicle. We seek the answer with supervised machine learning algorithms for detecting anomalies in vehicles' off-board sensor data (operational data of vehicles); a filter change is considered an anomaly because it is rare compared to normal data. Non-sequential machine learning algorithms for anomaly detection, namely one-class support vector machine (OC-SVM), k-nearest neighbor (k-NN), and random forest (RF), are applied for the first time to the DPF dataset. The dataset is unbalanced, and accuracy was found to be a misleading performance measure for the algorithms; precision, recall, and F1-score were found to be better measures when the data is unbalanced. RF gave the highest F1-score of 0.55, compared with k-NN (0.52) and OC-SVM (0.51). This means that RF performs better than k-NN and OC-SVM, but after further investigation it was concluded that the results are not satisfactory; a sequential approach should be tried, which could yield better results.
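The point that accuracy is misleading on unbalanced data can be made concrete with a small, hypothetical example: a degenerate model that never predicts the rare "filter changed" class still scores high accuracy but zero recall and F1.

```python
# Hypothetical illustration: accuracy vs. F1-score on an unbalanced dataset.
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

rng = np.random.default_rng(1)
y_true = (rng.random(1000) < 0.02).astype(int)   # ~2 % anomalies (filter changes)
y_naive = np.zeros_like(y_true)                  # degenerate model: "never an anomaly"

print("accuracy :", accuracy_score(y_true, y_naive))                  # high, yet useless
print("precision:", precision_score(y_true, y_naive, zero_division=0))
print("recall   :", recall_score(y_true, y_naive))
print("F1-score :", f1_score(y_true, y_naive, zero_division=0))       # zero for the naive model
```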
APA, Harvard, Vancouver, ISO, and other styles
11

Linton, Thomas. "Forecasting hourly electricity consumption for sets of households using machine learning algorithms." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186592.

Full text
Abstract:
To address inefficiency, waste, and the negative consequences of electricity generation, companies and government entities are looking to behavioural change among residential consumers. To drive behavioural change, consumers need better feedback about their electricity consumption; a monthly or quarterly bill provides almost no useful information about the relationship between their behaviour and their consumption. Smart meters are now widely deployed in developed countries and are capable of providing electricity consumption readings at an hourly resolution, but this data is mostly used as a basis for billing and not as a tool to assist consumers in reducing their consumption. One component required to deliver innovative feedback mechanisms is the capability to forecast hourly electricity consumption at the household scale. The work presented in this thesis is an evaluation of the effectiveness of a selection of kernel-based machine learning methods at forecasting the hourly aggregate electricity consumption of different-sized sets of households. The work demonstrates that k-Nearest Neighbour regression and Gaussian process regression are the most accurate methods within the constraints of the problem considered. In addition to accuracy, the advantages and disadvantages of each machine learning method are evaluated, and a simple comparison of each algorithm's computational performance is made.
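The two best-performing methods named above are both available in scikit-learn; the sketch below compares them on synthetic hourly data with simple calendar features, purely as an illustration and not on the thesis's household dataset.

```python
# Hypothetical comparison of k-NN regression and Gaussian process regression.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
hours = np.arange(24 * 21)                             # three weeks of hourly samples
X = np.column_stack([hours % 24, (hours // 24) % 7])   # hour-of-day, day-of-week
y = 2.0 + np.sin(2 * np.pi * X[:, 0] / 24) + 0.1 * rng.normal(size=hours.size)

knn = KNeighborsRegressor(n_neighbors=10)
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)

for name, model in [("k-NN regression", knn), ("Gaussian process regression", gpr)]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean cross-validated R^2 = {r2:.3f}")
```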
APA, Harvard, Vancouver, ISO, and other styles
12

Alzubaidi, Laith. "Deep learning for medical imaging applications." Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/227812/1/Laith_Alzubaidi_Thesis.pdf.

Full text
Abstract:
This thesis investigated novel deep learning techniques for advanced medical imaging applications. It addressed three major research issues of employing deep learning for medical imaging applications including network architecture, lack of training data, and generalisation. It proposed three new frameworks for CNN network architecture and three novel transfer learning methods. The proposed solutions have been tested on four different medical imaging applications demonstrating their effectiveness and generalisation. These solutions have already been employed by the scientific community showing excellent performance in medical imaging applications and other domains.
APA, Harvard, Vancouver, ISO, and other styles
13

LOPES, Marcus Vinicius de Sousa. "Aplicação de classificadores para determinação de conformidade de biodiesel." Universidade Federal do Maranhão, 2017. http://tedebc.ufma.br:8080/jspui/handle/tede/1896.

Full text
Abstract:
The growing demand for energy and the limitations of oil reserves have led to the search for renewable and sustainable energy sources to replace, at least partially, fossil fuels, and biodiesel has become in recent decades the main alternative to petroleum diesel. Its quality is evaluated by parameters and specifications that vary by country or region, for example in Europe (EN 14214), the US (ASTM D6751) and Brazil (RANP 45/2014). Some of these parameters, such as viscosity, density, oxidative stability and iodine value, are intrinsically related to the fatty acid methyl ester (FAME) composition of the biodiesel, which makes it possible to relate the behavior of these properties to the carbon chain length and the presence of unsaturation in the molecules. In the present work, four direct classification methods (support vector machine, k-nearest neighbors, decision tree and artificial neural network) were optimized and compared for classifying biodiesel samples according to their compliance with viscosity, density, oxidative stability and iodine value limits, taking the fatty acid methyl ester composition as input. The classifications were carried out under the specifications of standards EN 14214, ASTM D6751 and RANP 45/2014. A comparison between these direct classification methods and empirical equations (indirect classification) favored the direct classification methods, especially when the biodiesel samples have property values very close to the limits of the considered specifications. For viscosity, density, iodine value and oxidative stability (RANP 45/2014, EN 14214:2014 and ASTM D6751-15), the kNN and decision tree classifiers proved to be the best options, showing that the classifiers can be applied in practice to save time as well as financial and human resources.
APA, Harvard, Vancouver, ISO, and other styles
14

Ambrožová, Monika. "Detekce fibrilace síní v krátkodobých EKG záznamech." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-400984.

Full text
Abstract:
Atrial fibrillation is diagnosed in 1-2% of the population, and in the coming decades a significant increase in the number of patients with this arrhythmia is expected, in connection with the aging of the population and the higher incidence of some diseases that are considered risk factors for atrial fibrillation. The aim of this work is to describe the problem of atrial fibrillation and the methods that allow its detection in the ECG record. The first part of the work covers the theory of cardiac physiology and atrial fibrillation, together with a basic description of atrial fibrillation detection. The practical part describes software for the detection of atrial fibrillation provided by the BTL company, and an atrial fibrillation detector is designed. Several parameters were selected to capture the variability of the RR intervals: the standard deviation, skewness and kurtosis, coefficient of variation, root mean square of the successive differences, normalized absolute deviation, normalized absolute difference, median absolute deviation and entropy. Three different classification models were used: support vector machine (SVM), k-nearest neighbor (KNN) and discriminant analysis. The SVM classification model achieves the best results (sensitivity: 67.1%; specificity: 97.0%; F-measure: 66.8%; accuracy: 92.9%).
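The RR-interval variability parameters listed above are straightforward to compute once the R peaks are known; below is a hedged sketch of a few of them, assuming a NumPy array of RR intervals in milliseconds. The interval series here are synthetic, not ECG-derived.

```python
# Hypothetical RR-interval variability features for atrial-fibrillation detection.
import numpy as np
from scipy.stats import skew, kurtosis, entropy

def rr_features(rr_ms):
    """A few of the irregularity measures named in the abstract (RR in milliseconds)."""
    diffs = np.diff(rr_ms)
    hist, _ = np.histogram(rr_ms, bins=16, density=True)
    return {
        "std": np.std(rr_ms),
        "coeff_variation": np.std(rr_ms) / np.mean(rr_ms),
        "skewness": skew(rr_ms),
        "kurtosis": kurtosis(rr_ms),
        "rmssd": np.sqrt(np.mean(diffs ** 2)),      # root mean square of successive diffs
        "median_abs_dev": np.median(np.abs(rr_ms - np.median(rr_ms))),
        "entropy": entropy(hist + 1e-12),           # Shannon entropy of the RR histogram
    }

rr_regular = 800 + 10 * np.random.randn(120)        # stand-in sinus-rhythm intervals
rr_irregular = 800 + 150 * np.random.randn(120)     # stand-in fibrillation-like intervals
print(rr_features(rr_regular)["rmssd"], rr_features(rr_irregular)["rmssd"])
```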
APA, Harvard, Vancouver, ISO, and other styles
15

Andersson, Fanny, and Anna Furugård. "Detektion och klassificering av äppelmognad i hyperspektrala bilder." Thesis, Linköpings universitet, Medie- och Informationsteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-178848.

Full text
Abstract:
This work presents a non-destructive method for detecting and classifying the ripeness of apples using hyperspectral images. Determining the ripeness of apples is of interest to, among others, apple growers and juice producers for storage and processing, and apple ripeness is also of interest in plant breeding. Determining ripeness today requires cutting into the fruit, a so-called destructive method. Hyperspectral images are already used in areas such as agriculture, environmental monitoring and military reconnaissance. The thesis work was carried out at the Department of Science and Technology (ITN), Faculty of Science and Engineering, Linköping University.
APA, Harvard, Vancouver, ISO, and other styles
16

Zhong, Xiao. "A study of several statistical methods for classification with application to microbial source tracking." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0430104-155106/.

Full text
Abstract:
Thesis (M.S.)--Worcester Polytechnic Institute. Keywords: classification; k-nearest-neighbor (k-n-n); neural networks; linear discriminant analysis (LDA); support vector machines; microbial source tracking (MST); quadratic discriminant analysis (QDA); logistic regression. Includes bibliographical references (p. 59-61).
APA, Harvard, Vancouver, ISO, and other styles
17

Fu, Ruijun. "Empirical RF Propagation Modeling of Human Body Motions for Activity Classification." Digital WPI, 2012. https://digitalcommons.wpi.edu/etd-theses/1130.

Full text
Abstract:
"Many current and future medical devices are wearable, using the human body as a conduit for wireless communication, which implies that human body serves as a crucial part of the transmission medium in body area networks (BANs). Implantable medical devices such as Pacemaker and Cardiac Defibrillators are designed to provide patients with timely monitoring and treatment. Endoscopy capsules, pH Monitors and blood pressure sensors are used as clinical diagnostic tools to detect physiological abnormalities and replace traditional wired medical devices. Body-mounted sensors need to be investigated for use in providing a ubiquitous monitoring environment. In order to better design these medical devices, it is important to understand the propagation characteristics of channels for in-body and on- body wireless communication in BANs. The IEEE 802.15.6 Task Group 6 is officially working on the standardization of Body Area Network, including the channel modeling and communication protocol design. This thesis is focused on the propagation characteristics of human body movements. Specifically, standing, walking and jogging motions are measured, evaluated and analyzed using an empirical approach. Using a network analyzer, probabilistic models are derived for the communication links in the medical implant communication service band (MICS), the industrial scientific medical band (ISM) and the ultra- wideband (UWB) band. Statistical distributions of the received signal strength and second order statistics are presented to evaluate the link quality and outage performance for on-body to on- body communications at different antenna separations. The Normal distribution, Gamma distribution, Rayleigh distribution, Weibull distribution, Nakagami-m distribution, and Lognormal distribution are considered as potential models to describe the observed variation of received signal strength. Doppler spread in the frequency domain and coherence time in the time domain from temporal variations is analyzed to characterize the stability of the channels induced by human body movements. The shape of the Doppler spread spectrum is also investigated to describe the relationship of the power and frequency in the frequency domain. All these channel characteristics could be used in the design of communication protocols in BANs, as well as providing features to classify different human body activities. Realistic data extracted from built-in sensors in smart devices were used to assist in modeling and classification of human body movements along with the RF sensors. Variance, energy and frequency domain entropy of the data collected from accelerometer and orientation sensors are pre- processed as features to be used in machine learning algorithms. Activity classifiers with Backpropagation Network, Probabilistic Neural Network, k-Nearest Neighbor algorithm and Support Vector Machine are discussed and evaluated as means to discriminate human body motions. The detection accuracy can be improved with both RF and inertial sensors."
APA, Harvard, Vancouver, ISO, and other styles
18

Lindblom, Ellen, and Isabelle Almquist. "Data-Driven Predictions of Heating Energy Savings in Residential Buildings." Thesis, Uppsala universitet, Byggteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-387395.

Full text
Abstract:
Along with the increasing use of intermittent electricity sources, such as wind and sun, comes a growing demand for user flexibility. This has paved the way for a new market of services that provide electricity customers with energy saving solutions. These include a variety of techniques ranging from sophisticated control of the customers’ home equipment to information on how to adjust their consumption behavior in order to save energy. This master thesis work contributes further to this field by investigating an additional incentive; predictions of future energy savings related to indoor temperature. Five different machine learning models have been tuned and used to predict monthly heating energy consumption for a given set of homes. The model tuning process and performance evaluation were performed using 10-fold cross validation. The best performing model was then used to predict how much heating energy each individual household could save by decreasing their indoor temperature by 1°C during the heating season. The highest prediction accuracy (of about 78%) is achieved with support vector regression (SVR), closely followed by neural networks (NN). The simpler regression models that have been implemented are, however, not far behind. According to the SVR model, the average household is expected to lower their heating energy consumption by approximately 3% if the indoor temperature is decreased by 1°C.
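Support vector regression evaluated with 10-fold cross-validation, as described above, can be sketched as follows; the household features and consumption values are invented placeholders, not the thesis's data.

```python
# Hypothetical SVR model evaluated with 10-fold cross-validation.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Stand-in features: indoor temperature, outdoor temperature, floor area.
X = np.column_stack([rng.normal(21, 1, 500),
                     rng.normal(2, 6, 500),
                     rng.normal(120, 30, 500)])
y = 80 * (X[:, 0] - X[:, 1]) * X[:, 2] / 1000 + rng.normal(0, 50, 500)  # monthly kWh

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=5.0))
scores = cross_val_score(model, X, y, cv=10, scoring="r2")   # 10-fold cross-validation
print(f"mean R^2 over 10 folds: {scores.mean():.3f}")
```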
APA, Harvard, Vancouver, ISO, and other styles
19

Prokopová, Ivona. "Detekce fibrilace síní v EKG." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-413170.

Full text
Abstract:
Atrial fibrillation is one of the most common cardiac rhythm disorders, characterized by ever-increasing prevalence and incidence in the Czech Republic and abroad. The prevalence of atrial fibrillation is reported at 2-4 % of the population, but due to its often asymptomatic course, the real prevalence is even higher. The aim of this work is to design an algorithm for automatic detection of atrial fibrillation in the ECG record. In the practical part of this work, an algorithm for the detection of atrial fibrillation is proposed. For the detection itself, the k-nearest neighbor method, the support vector machine method and a multilayer neural network were used to classify ECG signals, using features indicating the variability of RR intervals and the presence of the P wave in the ECG recordings. The best detection was achieved by a model using a multilayer neural network with two hidden layers, with sensitivity 91.23 %, specificity 99.20 %, PPV 91.23 %, F-measure 91.23 % and accuracy 98.53 %.
APA, Harvard, Vancouver, ISO, and other styles
20

Stiernborg, Sebastian, and Sara Ervik. "Evaluation of Machine Learning Classification Methods : Support Vector Machines, Nearest Neighbour and Decision Tree." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209119.

Full text
Abstract:
With more and more data available, the interest in and use of machine learning is growing, and so does the need for classification. Classification is an important method within machine learning for data simplification and prediction. This report evaluates three classification methods for supervised learning: Support Vector Machines (SVM) with several kernels, Nearest Neighbour (k-NN) and Decision Tree (DT). The methods were evaluated on accuracy, precision, recall and time. The experiments were conducted on artificial data created to represent a variety of distributions, limited to only 2 features and 3 classes; different distributions of data were chosen to challenge each classification method. The results show that the measurements for accuracy and time vary considerably across the differently distributed datasets. SVM with an RBF kernel performed better in accuracy than the other classification methods. k-NN scored slightly lower accuracy values than SVM with an RBF kernel in general, but performed better on the most challenging dataset. DT is the least time-consuming algorithm and was significantly faster than the other classification methods; the only method that could compete with DT on time was k-NN, which was faster than DT on the dataset with small spread and coinciding classes. Although a clear trend can be seen in the results, the area needs to be studied further to draw a comprehensive conclusion, due to the limitations of the artificially generated datasets in this study.
APA, Harvard, Vancouver, ISO, and other styles
21

Jelínková, Jana. "Rozpoznání hudebního slohu z orchestrální nahrávky za pomoci technik Music Information Retrieval." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-413256.

Full text
Abstract:
Just like popular music, classical music consists of many different subgenres. The aim of this work is to recognize those subgenres from orchestral recordings. It is focused on the time period from the very end of the 16th century to the beginning of the 20th century, covering the Baroque, Classical and Romantic eras. The Music Information Retrieval (MIR) method was used to classify the chosen subgenres. In the first phase of the MIR method, parameters were extracted from the musical recordings and evaluated, and only the best parameters were used as input data for the machine learning classifiers, specifically kNN (K-Nearest Neighbor), LDA (Linear Discriminant Analysis), GMM (Gaussian Mixture Models) and SVM (Support Vector Machines). The final chapter summarizes the best results. According to the results, there is a significant difference between the Baroque era and the other researched eras, which led to better identification of Baroque-era recordings. The Classical era, on the contrary, turned out to be relatively similar to the Romantic era, and therefore all classifiers were less successful in identifying recordings from this era. The results are in line with music theory and the characteristics of the chosen musical eras.
APA, Harvard, Vancouver, ISO, and other styles
22

Creemers, Warren. "On the Recognition of Emotion from Physiological Data." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2013. https://ro.ecu.edu.au/theses/680.

Full text
Abstract:
This work encompasses several objectives, but is primarily concerned with an experiment where 33 participants were shown 32 slides in order to create 'weakly induced emotions'. Recordings of the participants' physiological state were taken, as well as a self-report of their emotional state. We then used an assortment of classifiers to predict emotional state from the recorded physiological signals, a process known as Physiological Pattern Recognition (PPR). We investigated techniques for recording, processing and extracting features from six different physiological signals: electrocardiogram (ECG), blood volume pulse (BVP), galvanic skin response (GSR), electromyography (EMG) for the corrugator muscle, skin temperature of the finger, and respiratory rate. Improvements to the state of PPR emotion detection were made by allowing 9 different weakly induced emotional states to be detected at nearly 65% accuracy, an improvement in the number of states readily detectable. The work presents many investigations into numerical feature extraction from physiological signals and has a chapter dedicated to collating and trialing facial electromyography techniques. There is also a hardware device we created to collect participants' self-reported emotional states, which brought several improvements to the experimental procedure.
APA, Harvard, Vancouver, ISO, and other styles
23

Alsouda, Yasser. "An IoT Solution for Urban Noise Identification in Smart Cities : Noise Measurement and Classification." Thesis, Linnéuniversitetet, Institutionen för fysik och elektroteknik (IFE), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-80858.

Full text
Abstract:
Noise is defined as any undesired sound. Urban noise and its effect on citizens are a significant environmental problem, and the increasing level of noise has become a critical problem in some cities. Fortunately, noise pollution can be mitigated by better planning of urban areas or controlled by administrative regulations; however, the execution of such actions requires well-established systems for noise monitoring. In this thesis, we present a solution for noise measurement and classification using a low-power and inexpensive IoT unit. To measure the noise level, we implement an algorithm for calculating the sound pressure level in dB, achieving a measurement error of less than 1 dB. Our machine learning-based method for noise classification uses Mel-frequency cepstral coefficients for audio feature extraction and four supervised classification algorithms (support vector machine, k-nearest neighbors, bootstrap aggregating, and random forest). We evaluate our approach experimentally with a dataset of about 3000 sound samples grouped into eight sound classes (such as car horn, jackhammer, or street music). We explore the parameter space of the four algorithms to estimate the optimal parameter values for the classification of sound samples in the dataset under study, and achieve noise classification accuracy in the range of 88% to 94%.
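The sound-pressure-level calculation mentioned above reduces to 20·log10 of the RMS sound pressure relative to the 20 µPa reference; a minimal sketch, assuming a known microphone calibration constant, is given below.

```python
# Hypothetical sound pressure level (dB) computation from raw audio samples.
import numpy as np

def spl_db(samples, calibration=1.0, p_ref=20e-6):
    """SPL in dB re 20 µPa; `calibration` maps raw sample units to pascals (assumed known)."""
    pressure = samples * calibration
    rms = np.sqrt(np.mean(pressure ** 2))
    return 20.0 * np.log10(rms / p_ref)

fs = 44100
t = np.arange(0, 1.0, 1 / fs)
tone = 0.2 * np.sin(2 * np.pi * 1000 * t)       # stand-in 1 kHz signal, 0.2 Pa amplitude
print(f"SPL = {spl_db(tone):.1f} dB")           # about 77 dB for this amplitude
```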
APA, Harvard, Vancouver, ISO, and other styles
24

Silva, Rodrigo Dalvit Carvalho da. "Um estudo sobre a extração de características e a classificação de imagens invariantes à rotação extraídas de um sensor industrial 3D." Universidade Federal do Ceará, 2014. http://www.teses.ufc.br/tde_busca/arquivo.php?codArquivo=12154.

Full text
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). In this work, the problem of object recognition using images extracted from a 3D industrial sensor is discussed. We focus on 9 feature extractors (seven based on invariant moments: Hu, Zernike, Legendre, Fourier-Mellin, Tchebichef, Bessel-Fourier and Gaussian-Hermite; one based on the Hough transform; and one on independent component analysis) and 4 classifiers (Naive Bayes, k-Nearest Neighbor, Support Vector Machine and Artificial Neural Network / Multi-Layer Perceptron). To choose the best feature extractor, their performance was compared in terms of classification accuracy and extraction time using the k-nearest neighbors classifier with Euclidean distance. The feature extractor based on Zernike moments achieved the best hit rate, 98.00%, with a relatively low feature extraction time of 0.3910 seconds. The data generated from it were presented to the different classification heuristics. Among the tested classifiers, the k-nearest neighbors classifier achieved the highest average hit rate, 98.00%, and a relatively low average classification time of 0.0040 seconds, making it the most suitable classifier for the application in this study.
APA, Harvard, Vancouver, ISO, and other styles
25

Maršánová, Lucie. "Analýza experimentálních EKG záznamů." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2015. http://www.nusl.cz/ntk/nusl-221365.

Full text
Abstract:
This diploma thesis deals with the analysis of experimental electrograms (EG) recorded from isolated rabbit hearts. The theoretical part is focused on the basic principles of electrocardiography, pathological events in ECGs, automatic classification of ECG and experimental cardiological research. The practical part deals with the manual classification of individual pathological events; these results will be presented in the database of EG records that is currently under development at the Department of Biomedical Engineering at BUT. The manual scoring of the data was discussed with experts. After that, the presence of pathological events within particular experimental periods was described and the influence of ischemia on the heart's electrical activity was reviewed. In the last part, morphological parameters calculated from EG beats were statistically analysed with Kruskal-Wallis and Tukey-Kramer tests as well as principal component analysis (PCA), and used as classification features to automatically classify four types of beats. Classification was realized with four approaches: discriminant function analysis, k-Nearest Neighbours, support vector machines, and the naive Bayes classifier.
APA, Harvard, Vancouver, ISO, and other styles
26

Lantz, Robin. "Time series monitoring and prediction of data deviations in a manufacturing industry." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-100181.

Full text
Abstract:
An automated manufacturing industry makes use of many interacting moving parts and sensors, and data from these sensors generate complex multidimensional data in the production environment that is difficult to interpret and to find patterns in. This project provides tools to gain a deeper understanding of the production data of Swedsafe, a company involved in automated manufacturing, and shows the potential of this multidimensional production data. The project mainly consists of predicting deviations from predefined threshold values in Swedsafe's production data. Machine learning is a good method for finding relationships in complex datasets, and supervised machine learning classification is used to predict deviations from threshold values in the data. An investigation is conducted to identify the classifier that performs best on Swedsafe's production data. The sliding window technique is used for managing the time series data used in this project. Apart from predicting deviations, this project also includes an implementation of live graphs to easily get an overview of the production data. A steady production with stable process values is important, so being able to monitor and predict events in the production environment can provide the same benefit for other manufacturing companies and is therefore suitable not only for Swedsafe. The best performing machine learning classifier tested in this project was the Random Forest classifier. The Multilayer Perceptron did not perform well on Swedsafe's data, but further investigation into recurrent neural networks using LSTM neurons is recommended. During the project, a web-based application displaying the sensor data in live graphs was also developed.
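The sliding-window technique mentioned above turns a sensor time series into fixed-length samples whose label indicates whether a later value exceeds a threshold; a small, generic sketch (not Swedsafe's actual data) is shown below.

```python
# Hypothetical sliding-window construction for supervised deviation prediction.
import numpy as np

def sliding_windows(series, window, horizon, threshold):
    """Each sample = `window` past values; label = does the value `horizon` steps
    ahead exceed `threshold`? (Generic illustration only.)"""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i:i + window])
        y.append(int(series[i + window + horizon - 1] > threshold))
    return np.array(X), np.array(y)

rng = np.random.default_rng(0)
sensor = np.cumsum(rng.normal(0, 0.3, 2000))      # stand-in drifting process value
X, y = sliding_windows(sensor, window=50, horizon=5, threshold=5.0)
print(X.shape, y.mean())                          # feature matrix and deviation rate
```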
APA, Harvard, Vancouver, ISO, and other styles
27

Novosad, Andrej. "Využití metod dolování dat pro analýzu sociálních sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236424.

Full text
Abstract:
This thesis discusses data mining of social media. It gives an introduction to the topic of data mining and possible mining methods. The thesis also explores social media and social networks, what they are able to offer and what problems they bring. The APIs of three social networking sites are examined, along with the opportunities they provide for data mining. Techniques of text mining and document classification are explored. The implementation of a web application that mines data from the social site Twitter using the SVM algorithm is described. The implemented application classifies tweets based on their text, where the classes represent the tweets' continents of origin. Several experiments executed both in the RapidMiner software and in the implemented web application are then presented and their results examined.
APA, Harvard, Vancouver, ISO, and other styles
28

Pešek, Milan. "Detekce logopedických vad v řeči." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-218106.

Full text
Abstract:
The thesis deals with the design and implementation of software for the detection of logopaedic speech defects. Due to the need for early detection of such defects, the software is aimed at child speakers. The introductory part describes the theory of speech production, the modelling of speech production for numerical processing, phonetics, logopaedics and basic logopaedic speech defects. The methods used for feature extraction, for the segmentation of words into speech sounds and for classifying features into either the correct or the incorrect pronunciation class are also described. The next part of the thesis presents the results of testing the selected methods. For the recognition of logopaedic speech defects, algorithms are used to extract MFCC and PLP features. The segmentation of words into speech sounds is performed using the Differential Function method. The extracted features of a sound are classified into either a correct or an incorrect pronunciation class with one of the tested pattern recognition methods. To classify the features, the k-NN, SVM, ANN, and GMM methods are tested.
APA, Harvard, Vancouver, ISO, and other styles
29

Klimeš, Filip. "Zpracování obrazových sekvencí sítnice z fundus kamery." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2015. http://www.nusl.cz/ntk/nusl-220975.

Full text
Abstract:
The aim of my diploma thesis was to design a method for the analysis of retinal sequences that evaluates the quality of individual frames. The theoretical part also deals with the properties of retinal sequences and with the approach used to register images from a fundus camera. In the practical part, a method for assessing image quality is implemented, tested on real retinal sequences, and its success rate is evaluated. The thesis also evaluates the influence of this method on the registration of retinal images.
APA, Harvard, Vancouver, ISO, and other styles
30

Bílý, Ondřej. "Moderní řečové příznaky používané při diagnóze chorob." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-218971.

Full text
Abstract:
This work deals with the diagnosis of Parkinson's disease by analysing the speech signal. The beginning of the work describes the production of the speech signal. This is followed by a description of speech signal analysis, its preparation and subsequent feature extraction. Parkinson's disease and the changes it causes in the speech signal are then described, followed by the features used for the diagnosis of Parkinson's disease (FCR, VSA, VOT, etc.). Another part of the work deals with the selection and reduction of features using learning algorithms (SVM, ANN, k-NN) and their subsequent evaluation. The last part of the thesis describes a program for computing the features, the feature selection procedure, and the final evaluation of all the results.
APA, Harvard, Vancouver, ISO, and other styles
31

Konečný, Antonín. "Využití umělé inteligence v technické diagnostice." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2021. http://www.nusl.cz/ntk/nusl-443221.

Full text
Abstract:
The diploma thesis is focused on the use of artificial intelligence methods for evaluating the fault condition of machinery. The evaluated data come from a vibrodiagnostic model for the simulation of static and dynamic unbalance. Machine learning methods are applied, specifically supervised learning. The thesis describes the Spyder software environment, its alternatives, and the Python programming language, in which the scripts are written. It contains an overview with a description of the libraries (Scikit-learn, SciPy, Pandas, ...) and methods: K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Decision Trees (DT) and Random Forest Classifiers (RF). The classification results are visualized in a confusion matrix for each method. The appendix includes the scripts written for feature engineering, hyperparameter tuning, evaluation of learning success and classification with visualization of the results.
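As a rough illustration of the workflow this abstract describes, the sketch below trains the four scikit-learn classifiers and prints a confusion matrix for each; the wine dataset is only a stand-in for the vibrodiagnostic features, and all hyperparameters are assumptions rather than the thesis settings.

# Train KNN, SVM, DT and RF and report a confusion matrix per method.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, confusion_matrix(y_te, model.predict(X_te)), sep="\n")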
APA, Harvard, Vancouver, ISO, and other styles
32

Dočekal, Martin. "Porovnání klasifikačních metod." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403211.

Full text
Abstract:
This thesis deals with a comparison of classification methods. First, these classification methods, based on machine learning, are described; then a classifier comparison system is designed and implemented. The thesis also describes some classification tasks and datasets on which the designed system is tested. The evaluation of the classification tasks is done according to standard metrics. The thesis also presents the design and implementation of a classifier based on the principle of evolutionary algorithms.
APA, Harvard, Vancouver, ISO, and other styles
33

Dušil, Lubomír. "Automatické rozpoznávání logopedických vad v řečovém projevu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-218161.

Full text
Abstract:
The thesis is aimed at the analysis and automatic detection of logopaedic defects in speech utterances. Its objective is to facilitate and accelerate the work of logopaedists and to increase the percentage of logopaedic defects detected in children at the youngest possible age, when treatment is most successful. It presents methods of working with speech, a classification of the defects within individual stages of child development, and words appropriate for identifying the speech defects and their subsequent remedy. It then analyses methods for calculating coefficients that best reflect human speech, as well as the classifiers used to determine whether a speech defect is present; these classifiers operate on the calculated coefficients. The coefficients and classifiers are tested and their best combination is sought in order to achieve the highest possible success rate of automatic speech defect detection. All programming and testing was conducted in Matlab.
APA, Harvard, Vancouver, ISO, and other styles
34

Jia, Wei. "Image analysis and representation for textile design classification." Thesis, University of Dundee, 2011. https://discovery.dundee.ac.uk/en/studentTheses/c667f279-d7a6-4670-b23e-c9dbe2784266.

Full text
Abstract:
A good image representation is vital for image comparison and classification; it may affect the classification accuracy and efficiency. The purpose of this thesis was to explore novel and appropriate image representations. Another aim was to investigate these representations for image classification. Finally, novel features were examined for improving image classification accuracy. The images of interest to this thesis were textile design images. The motivation for analysing textile design images is to help designers browse images, fuel their creativity, and improve their design efficiency. In recent years, the bag-of-words model has been shown to be a good basis for image representation, and there have been many attempts to go beyond this representation. Bag-of-words models have been used frequently in the classification of image data, due to good performance and simplicity. "Words" in images can have different definitions and are obtained through steps of feature detection, feature description, and codeword calculation. The model represents an image as an orderless collection of local features. However, discarding the spatial relationships of local features limits the power of this model. This thesis exploited novel image representations, the bag of shapes and region label graphs models, which were based on the bag-of-words model. In both models, an image was represented by a collection of segmented regions, and each region was described by shape descriptors. In the latter model, graphs were constructed to capture the spatial information between groups of segmented regions, and graph features were calculated based on graph theory. Novel elements include the use of MRFs to extract printed designs and woven patterns from textile images, utilisation of the extractions to form bag of shapes models, and construction of region label graphs to capture the spatial information. The extraction of textile designs was formulated as a pixel labelling problem. Algorithms for MRF optimisation and re-estimation were described and evaluated. A method for quantitative evaluation was presented and used to compare the performance of MRFs optimised using alpha-expansion and iterated conditional modes (ICM), both with and without parameter re-estimation. The results were used in the formation of the bag of shapes and region label graphs models. The bag of shapes model was a collection of MRF-segmented regions, and the shape of each region was described with generic Fourier descriptors. Each image was represented as a bag of shapes. A simple yet competitive classification scheme based on nearest neighbour class-based matching was used. Classification performance was compared to that obtained when using bags of SIFT features. To capture the spatial information, region label graphs were constructed to obtain graph features. Regions with the same label were treated as a group and each group was associated uniquely with a vertex in an undirected, weighted graph. Each region group was represented as a bag of shape descriptors. Edges in the graph denoted either the extent to which the groups' regions were spatially adjacent or the dissimilarity of their respective bags of shapes. A series of unweighted graphs was obtained by removing edges in order of weight. Finally, an image was represented using its shape descriptors along with features derived from the chromatic numbers or domination numbers of the unweighted graphs and their complements. Linear SVM classifiers were used for classification.
Experiments were conducted on data from Liberty Art Fabrics, which consisted of more than 10,000 complicated images, mainly of printed textile designs and woven patterns. The experimental data were classified into seven classes manually by assigning each image a text descriptor based on content or design type. The seven classes were floral, paisley, stripe, leaf, geometric, spot, and check. The results showed that reasonable and interesting regions were obtained from MRF segmentation, in which alpha-expansion with parameter re-estimation performed better than alpha-expansion without parameter re-estimation or ICM. This result was promising not only for textile CAD (Computer-Aided Design), for redesigning textile images, but also for image representation. It was also found that the bag of shapes model based on MRF segmentation can obtain classification accuracy comparable to bags of SIFT features in the framework of nearest neighbour class-based matching. Finally, the results indicated that incorporating the graph features extracted by constructing region label graphs can improve classification accuracy compared to both the bag of shapes and bag of SIFT models.
APA, Harvard, Vancouver, ISO, and other styles
35

De, Gregorio Ludovica. "Development of new data fusion techniques for improving snow parameters estimation." Doctoral thesis, Università degli studi di Trento, 2019. http://hdl.handle.net/11572/245392.

Full text
Abstract:
Water stored in snow is a critical contribution to the world’s available freshwater supply and is fundamental to the sustenance of natural ecosystems, agriculture and human societies. The importance of snow for the natural environment and for many socio-economic sectors in several mid- to high-latitude mountain regions around the world leads scientists to continuously develop new approaches to monitor and study snow and its properties. The need to develop new monitoring methods arises from the limitations of in situ measurements, which are pointwise, only possible in accessible and safe locations and do not allow for a continuous monitoring of the evolution of the snowpack and its characteristics. These limitations have been overcome by the increasingly used methods of remote monitoring with space-borne sensors that allow monitoring the wide spatial and temporal variability of the snowpack. Snow models, based on modeling the physical processes that occur in the snowpack, are an alternative to remote sensing for studying snow characteristics. However, from the literature it is evident that both remote sensing and snow models suffer from limitations as well as have significant strengths that it would be worth jointly exploiting to achieve improved snow products. Accordingly, the main objective of this thesis is the development of novel methods for the estimation of snow parameters by exploiting the different properties of remote sensing and snow model data. In particular, the following specific novel contributions are presented in this thesis: i. A novel data fusion technique for improving snow cover mapping. The proposed method is based on the exploitation of the snow cover maps derived from the AMUNDSEN snow model and the MODIS product, together with their quality layer, in a decision-level fusion approach by means of a machine learning technique, namely the Support Vector Machine (SVM). ii. A new approach has been developed for improving the snow water equivalent (SWE) product obtained from AMUNDSEN model simulations. The proposed method exploits some auxiliary information from optical remote sensing and from topographic characteristics of the study area in a new approach that differs from the classical data assimilation approaches and is based on the estimation of the AMUNDSEN error with respect to the ground data through a k-NN algorithm. The new product has been validated with ground measurement data and by a comparison with MODIS snow cover maps. In a second step, the contribution of information derived from X-band SAR imagery acquired by the COSMO-SkyMed constellation has been evaluated, by exploiting simulations from a theoretical model to enlarge the dataset.
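The residual-correction idea in point (ii) can be illustrated with a short Python sketch: a k-NN regressor learns the model-minus-ground SWE error at station locations from auxiliary predictors, and the predicted error is then removed from the model output elsewhere. The synthetic data, the choice of predictors and the value of k are assumptions for illustration only.

# k-NN correction of a modelled SWE field against ground stations (sketch).
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
# Auxiliary predictors at stations: e.g. elevation, aspect, optical snow index.
aux_stations = rng.normal(size=(60, 3))
swe_model_at_stations = rng.gamma(2.0, 50.0, size=60)
swe_ground = swe_model_at_stations + 10 * aux_stations[:, 0] + rng.normal(0, 5, 60)

error = swe_model_at_stations - swe_ground          # model error at stations
knn = KNeighborsRegressor(n_neighbors=5, weights="distance").fit(aux_stations, error)

# Correct the model output wherever the auxiliary predictors are available.
aux_grid = rng.normal(size=(1000, 3))
swe_model_grid = rng.gamma(2.0, 50.0, size=1000)
swe_corrected = swe_model_grid - knn.predict(aux_grid)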
APA, Harvard, Vancouver, ISO, and other styles
37

Ur-Rahman, Nadeem. "Textual data mining applications for industrial knowledge management solutions." Thesis, Loughborough University, 2010. https://dspace.lboro.ac.uk/2134/6373.

Full text
Abstract:
In recent years, knowledge has become an important resource for enhancing business, and many activities are required to manage these knowledge resources well and help companies remain competitive within industrial environments. The data available in most industrial setups is complex in nature, and multiple different data formats may be generated to track the progress of different projects related either to developing new products or to providing better services to customers. Knowledge discovery from databases requires considerable effort, and data mining techniques serve this purpose by handling structured data formats. If, however, the data is semi-structured or unstructured, the combined efforts of data and text mining technologies may be needed to produce fruitful results. This thesis focuses on issues related to the discovery of knowledge from semi-structured or unstructured data formats through the application of textual data mining techniques to automate the classification of textual information into two categories or classes, which can then be used to help manage the knowledge available in multiple data formats. Applications of different data mining techniques to discover valuable information and knowledge from the manufacturing or construction industries have been explored as part of a literature review. The application of text mining techniques to handle semi-structured or unstructured data has been discussed in detail. A novel integration of different data and text mining tools has been proposed in the form of a framework in which knowledge discovery and its refinement processes are performed through the application of clustering and the Apriori association rule mining algorithm. Finally, the hypothesis that better classification accuracies can be acquired is examined by applying the methodology to case study data available in the form of Post Project Review (PPR) reports. The process of discovering useful knowledge, its interpretation and utilisation has been automated to classify the textual data into two classes.
APA, Harvard, Vancouver, ISO, and other styles
38

Trahan, Patrick. "Classification of Carpiodes Using Fourier Descriptors: A Content Based Image Retrieval Approach." ScholarWorks@UNO, 2009. http://scholarworks.uno.edu/td/1085.

Full text
Abstract:
Taxonomic classification has always been important to the study of any biological system. At the current rate of classification, many biological species will go unclassified and be lost forever. The current state of computer technology makes image storage and retrieval possible on a global level. As a result, computer-aided taxonomy is now possible. Content-based image retrieval techniques utilize visual features of the image for classification. By utilizing image content and computer technology, the gap between taxonomic classification and species destruction is shrinking. This content-based study utilizes the Fourier descriptors of fifteen known landmark features on three Carpiodes species: C. carpio, C. velifer, and C. cyprinus. Classification analysis involves both unsupervised and supervised machine learning algorithms. The Fourier descriptors of the fifteen known landmarks provide strong classification power on the image data. Feature reduction analysis indicates that feature reduction is possible, which proves useful for increasing the generalization power of classification.
APA, Harvard, Vancouver, ISO, and other styles
39

Nyberg, Selma. "Video Recommendation Based on Object Detection." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-351122.

Full text
Abstract:
In this thesis, various machine learning domains have been combined in order to build a video recommender system that is based on object detection. The work combines two extensively studied research fields, recommender systems and computer vision, which are also rapidly growing and popular techniques on commercial markets. To investigate the performance of the approach, three different content-based recommender systems have been implemented at Spotify, based on the following video features: object detections, titles and descriptions, and user preferences. These systems have then been evaluated and compared against each other together with their hybridized result. Two algorithms have been implemented, the prediction and the top-N algorithm, where the former is the more reliable source for evaluating the system's performance. The evaluation of the system shows that the overall performance scores for predicting values of the users' liked and disliked videos are in the range from about 40 % to 70 % for the prediction algorithm and from about 15 % to 70 % for the top-N algorithm. The approach based on object detection performs worse in comparison to the other approaches. Hence, there seems to be a low correlation between the user preferences and the video contents in terms of object detection data. Therefore, this data is not very suitable for describing the content of videos and using it in the recommender system. However, the results of this study cannot be generalized to apply to other systems before the approach has been evaluated in other environments and for various data sets. Moreover, there is plenty of room for refinements and improvements to the system, and many interesting research areas remain for future work.
APA, Harvard, Vancouver, ISO, and other styles
40

Yiu, Tzu-Hsuen, and 游子璇. "Application of Support Vector Machine, K-Nearest Neighbor and Logistic Regression on Medical Diagnosis." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/25297161291698273286.

Full text
Abstract:
Master's thesis, National Chin-Yi University of Technology, Department of Industrial Engineering and Management, 99. Models built on large medical databases can be used for prediction and to provide a reference for medical diagnosis. From a preventive medicine perspective, this can provide patients with appropriate advice, health education content and even preventive treatment, in the hope of reducing the incidence of disease. In this study, the databases were sourced from the UCI Machine Learning Repository and comprised the Parkinson and Dermatology databases. This research uses feature selection to extract important features and reports a comparative study of three methods on the Parkinson and Dermatology databases. We established three different classifiers: Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and Logistic Regression. The reported accuracies of the three classifiers were 98.60%, 97.49% and 97.49% on the Parkinson database and 98.60%, 97.49% and 97.49% on the Dermatology database, respectively. We compared the results with related research, and the proposed model is very effective. The result helps to decrease medical examination time and cost.
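A minimal sketch of the comparison this abstract reports might look as follows in Python with scikit-learn: filter-based feature selection followed by the three classifiers, evaluated by cross-validation. The breast-cancer dataset, the SelectKBest scoring function and the number of retained features are placeholders, not the thesis choices.

# Feature selection plus SVM, KNN and logistic regression (sketch).
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
for name, clf in [("SVM", SVC()), ("KNN", KNeighborsClassifier()),
                  ("LogReg", LogisticRegression(max_iter=1000))]:
    pipe = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), clf)
    acc = cross_val_score(pipe, X, y, cv=10).mean()
    print(f"{name}: {acc:.4f}")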
APA, Harvard, Vancouver, ISO, and other styles
41

Su, Wanhua. "Efficient Kernel Methods for Statistical Detection." Thesis, 2008. http://hdl.handle.net/10012/3598.

Full text
Abstract:
This research is motivated by a drug discovery problem -- the AIDS anti-viral database from the National Cancer Institute. The objective of the study is to develop effective statistical methods to model the relationship between the chemical structure of a compound and its activity against the HIV-1 virus. As a result, the structure-activity model can be used to predict the activity of new compounds and thus helps identify those active chemical compounds that can be used as drug candidates. Since active compounds are generally rare in a compound library, we recognize the drug discovery problem as an application of the so-called statistical detection problem. In a typical statistical detection problem, we have data {Xi,Yi}, where Xi is the predictor vector of the ith observation and Yi in {0,1} is its class label. The objective of a statistical detection problem is to identify class-1 observations, which are extremely rare. Besides the drug discovery problem, other applications of statistical detection include direct marketing and fraud detection. We propose a computationally efficient detection method called LAGO, which stands for "locally adjusted GO estimator". The original idea is inspired by an ancient game known today as "GO". The construction of LAGO consists of two steps. In the first step, we estimate the density of class 1 with an adaptive bandwidth kernel density estimator. The kernel functions are located at and only at the class-1 observations. The bandwidth of the kernel function centered at a certain class-1 observation is calculated as the average distance between this class-1 observation and its K-nearest class-0 neighbors. In the second step, we adjust the density estimated in the first step locally according to the density of class 0. It can be shown that the amount of adjustment in the second step is approximately inversely proportional to the bandwidth calculated in the first step. Application to the NCI data demonstrates that LAGO is superior to methods such as K nearest neighbors and support vector machines. One drawback of the existing LAGO is that it only provides a point estimate of a test point's probability of being class 1, ignoring the uncertainty of the model. In the second part of this thesis, we present a Bayesian framework for LAGO, referred to as BLAGO. This Bayesian approach enables quantification of uncertainty. Non-informative priors are adopted. The posterior distribution is calculated over a grid of (K, alpha) pairs by integrating out beta0 and beta1 using the Laplace approximation, where K and alpha are the two parameters used to construct the LAGO score. The parameters beta0, beta1 are the coefficients of the logistic transformation that converts the LAGO score to the probability scale. BLAGO provides proper probabilistic predictions that have support on (0,1) and also captures the uncertainty of the predictions. By avoiding Markov chain Monte Carlo algorithms and using the Laplace approximation, BLAGO is computationally very efficient. Without the need for cross-validation, BLAGO is even more computationally efficient than LAGO.
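To make the two-step construction described above concrete, here is a hedged Python sketch of a LAGO-style score. The Gaussian kernel, the exact way the class-0 adjustment is folded in (rescaling each kernel by its own bandwidth), and the toy data are assumptions; the thesis defines the precise scaling through its parameters K and alpha.

# Sketch of a LAGO-style detection score, not the thesis implementation.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lago_score(X_train, y_train, X_test, K=5, alpha=1.0):
    X1 = X_train[y_train == 1]          # rare class-1 observations
    X0 = X_train[y_train == 0]          # abundant class-0 observations

    # Step 1: per-kernel bandwidth r_i = average distance from each
    # class-1 point to its K nearest class-0 neighbours.
    nn0 = NearestNeighbors(n_neighbors=K).fit(X0)
    dist, _ = nn0.kneighbors(X1)        # shape (n1, K)
    r = dist.mean(axis=1)               # shape (n1,)

    # Gaussian kernels centred at class-1 points with bandwidth alpha * r_i.
    d2 = ((X_test[:, None, :] - X1[None, :, :]) ** 2).sum(axis=2)
    kernels = np.exp(-d2 / (2.0 * (alpha * r) ** 2))

    # Step 2: local adjustment for the class-0 density; since that density
    # is roughly proportional to 1/r_i, rescale each kernel by r_i.
    return (kernels * r).sum(axis=1)

# Example: rank test points so that the highest scores go to likely actives.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)); y = (rng.random(200) < 0.1).astype(int)
scores = lago_score(X, y, rng.normal(size=(10, 3)))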
APA, Harvard, Vancouver, ISO, and other styles
42

Palinggi, Denny Asarias. "Predicting soccer outcome with machine learning based on weather condition." Master's thesis, 2019. http://hdl.handle.net/10362/64182.

Full text
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies. A massive amount of research has been done on predicting soccer matches using machine learning algorithms; however, no prior research has used weather conditions as features. In this thesis, three different classification algorithms were investigated for predicting the outcomes of soccer matches using temperature difference, rain precipitation, and several other historical match statistics as features. The dataset consists of statistical information on soccer matches in La Liga and the Segunda division from the 2013-2014 to 2016-2017 seasons, together with weather information for every host city. The results show that the SVM model has a better accuracy score for predicting the full-time result compared to KNN and RF, with 45.32% for temperature differences below 5° and 49.51% for temperature differences above 5°. For over/under 2.5 goals, SVM also has better accuracy, with 53.07% for rain precipitation below 5 mm and 56% for rain precipitation above 5 mm.
APA, Harvard, Vancouver, ISO, and other styles
43

Cruz, Pablo Henrique Alves. "Mapping urban tree species in a tropical environment using airborne multispectral and LiDAR data." Master's thesis, 2021. http://hdl.handle.net/10362/113904.

Full text
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies. An accurate and up-to-date urban tree inventory is an essential resource for the development of strategies towards sustainable urban planning, as well as for the effective management and preservation of biodiversity. Trees contribute to thermal comfort within urban centers by lessening the heat island effect and have a direct impact on the reduction of air pollution. However, mapping individual tree species normally involves time-consuming field work over large areas or image interpretation performed by specialists. The integration of airborne LiDAR data with high-spatial-resolution multispectral aerial imagery is an alternative and effective approach to differentiating tree species at the individual crown level. This thesis aims to investigate the potential of such remotely sensed data to discriminate 5 common urban tree species using traditional machine learning classifiers (Random Forest, Support Vector Machine, and k-Nearest Neighbors) in the tropical environment of Salvador, Brazil. Vegetation indices and texture information were extracted from the multispectral imagery and, together with LiDAR-derived variables for the tree crowns, were tested separately and combined to perform tree species classification with the three classifiers. Random Forest outperformed the other two classifiers, reaching an overall accuracy of 82.5% when using the combined multispectral and LiDAR data. The results indicate that (1) given the similarity in spectral signature, multispectral data alone are not sufficient to distinguish tropical tree species (only the k-NN classifier could detect all species); (2) height values and the intensity of crown return points were the most relevant LiDAR features, and the combination of both datasets improved accuracy by up to 20%; (3) generating a canopy height model from the LiDAR point cloud is an effective method for delineating individual tree crowns in a semi-automatic approach.
APA, Harvard, Vancouver, ISO, and other styles
44

Hernández, Maicol Fernando Camargo. "Land suitability analysis to assess the potential of public open spaces for urban agriculture activities." Master's thesis, 2020. http://hdl.handle.net/10362/94399.

Full text
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the degree of Master of Science in Geospatial Technologies. In a world increasingly dominated by cities and accelerated urban sprawl, urban agriculture emerges as an alternative for the continuous stock and food supply that the urban population demands. This thesis aimed to identify and evaluate potentially available areas in public locations for implementing urban agriculture practices within the urban perimeter of the city of Bogota, Colombia. The methodology was conducted using variables reflecting the physical, environmental and socioeconomic components of the area. Two approaches were implemented to evaluate land suitability for urban agriculture as a means to alleviate urban poverty by increasing food security and nutrition in the study area. The first approach was based on expert knowledge, combining GIS with multicriteria decision making (MCDM) using the analytical hierarchy process (AHP) method, and estimated that 21% of the study area presents highly suitable conditions for implementing urban agriculture activities. The second approach was developed using supervised machine learning classification models based on historical data from the sites where urban agriculture activities were already being implemented in the city, showing that 18% of the study area is highly suitable for the implementation of urban agriculture activities. Both approaches indicated that the most suitable areas are located in the southern and southwestern parts of the study area, emphasizing their congruence with the areas with the lowest socioeconomic levels in the city. It was found that approximately 2% of the study area has available spaces in public locations with significant potential for urban agriculture practices. Three scenarios were simulated in which 10%, 30% and, in the most utopian case, 50% of these spaces would be used for urban agriculture activities, and the productivity in tons of five of the most popular vegetable crops grown was estimated.
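The AHP step mentioned in the first approach can be sketched in a few lines of Python: criterion weights come from the principal eigenvector of a pairwise-comparison matrix, a consistency ratio is checked, and the weights drive a weighted overlay of criterion layers. The example matrix and the three criteria named in the comments are illustrative assumptions, not the expert judgements used in the thesis.

# AHP weights and weighted overlay (sketch).
import numpy as np

A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])          # pairwise judgements: physical, environmental, socioeconomic

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
weights = np.abs(eigvecs[:, k].real)
weights /= weights.sum()                  # normalised criterion weights

n = A.shape[0]
ci = (eigvals.real[k] - n) / (n - 1)      # consistency index
cr = ci / 0.58                            # Saaty's random index for n = 3 is 0.58
print("weights:", weights.round(3), "CR:", round(cr, 3))

# Weighted overlay: suitability = sum_i w_i * standardised criterion layer i.
layers = np.random.rand(3, 100, 100)      # stand-ins for raster criteria
suitability = np.tensordot(weights, layers, axes=1)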
APA, Harvard, Vancouver, ISO, and other styles
45

(6642491), Jingzhao Dai. "SPARSE DISCRETE WAVELET DECOMPOSITION AND FILTER BANK TECHNIQUES FOR SPEECH RECOGNITION." Thesis, 2019.

Find full text
Abstract:
Speech recognition is widely applied to translation from speech to the corresponding text, voice-driven commands, human-machine interfaces and so on [1]-[8]. It has increasingly proliferated into people's lives in the modern age. To improve the accuracy of speech recognition, various algorithms such as artificial neural networks and hidden Markov models have been developed [1], [2]. In this thesis work, speech recognition with various classifiers is investigated. The classifiers employed include the support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF) and convolutional neural network (CNN). Two novel feature extraction methods, sparse discrete wavelet decomposition (SDWD) and bandpass filtering (BPF) based on the Mel filter banks [9], are developed and proposed. To meet the diversity of the classification algorithms, both one-dimensional (1D) and two-dimensional (2D) features are obtained. The 1D features are arrays of power coefficients in frequency bands, which are dedicated to training the SVM, KNN and RF classifiers, while the 2D features capture both the frequency content and its temporal variation: they consist of the power values in the decomposed bands versus consecutive speech frames. Most importantly, the 2D features, with a geometric transformation, are adopted to train the CNN. The speech recordings, including male and female speakers, come from a recorded data set as well as a standard data set. Firstly, the proposed feature extraction methods are applied to recordings with little noise and clear pronunciation. After many trials and experiments using this dataset, high recognition accuracy is achieved. These feature extraction methods are then further applied to the standard recordings, which have more varied characteristics, with ambient noise and unclear pronunciation. Many experimental results validate the effectiveness of the proposed feature extraction techniques.
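A hedged sketch of the band-energy idea behind the BPF features: each recording is passed through a small bank of bandpass filters, the per-band, per-frame energies form the 2D feature (bands by frames), and averaging over frames gives the 1D feature. The band edges, frame length, hop size and filter order below are assumptions, not the thesis settings.

# Band-energy features from a small bandpass filter bank (sketch).
import numpy as np
from scipy.signal import butter, sosfilt

def band_energy_features(signal, fs, bands, frame_len=400, hop=160):
    feats = []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        filtered = sosfilt(sos, signal)
        energies = [np.sum(filtered[s:s + frame_len] ** 2)
                    for s in range(0, len(filtered) - frame_len, hop)]
        feats.append(energies)
    feat_2d = np.array(feats)            # bands x frames, e.g. input to a CNN
    feat_1d = feat_2d.mean(axis=1)       # per-band power, input to SVM/KNN/RF
    return feat_1d, feat_2d

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 300 * t) + 0.5 * np.random.randn(t.size)
bands = [(100, 400), (400, 1000), (1000, 2500), (2500, 6000)]
f1d, f2d = band_energy_features(x, fs, bands)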
APA, Harvard, Vancouver, ISO, and other styles
46

Elmasry, Mohamed Hani Abdelhamid Mohamed Tawfik. "Machine learning approach for credit score analysis : a case study of predicting mortgage loan defaults." Master's thesis, 2019. http://hdl.handle.net/10362/62427.

Full text
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the degree of Statistics and Information Management, specialized in Risk Analysis and Management. To effectively manage credit score analysis, financial institutions have instituted techniques and models designed mainly to improve the assessment of creditworthiness during the credit evaluation process. The foremost objective is to discriminate their clients (borrowers) into either the non-defaulter group, which is more likely to pay its financial obligations, or the defaulter group, which has a higher probability of failing to pay its debts. In this paper, we use machine learning models for the prediction of mortgage defaults. This study employs various single-classification machine learning methodologies, including Logistic Regression, Classification and Regression Trees, Random Forest, K-Nearest Neighbors, and Support Vector Machine. To further improve the predictive power, a meta-algorithm ensemble approach, stacking, is introduced to combine the outputs (probabilities) of the aforementioned methods. The sample for this study is based solely on the dataset publicly provided by Freddie Mac. By modelling this approach, we achieve an improvement in predictive performance. We then compare the performance of each model, and of the meta-learner, by plotting the ROC curve and computing the AUC. This study is an extension of various preceding studies that used different techniques to further enhance model predictivity. Finally, our results are compared with work from different authors.
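The stacking step described here can be sketched with scikit-learn's StackingClassifier, which combines the base learners' predicted probabilities through a logistic-regression meta-learner and compares models by AUC; the synthetic dataset and all hyperparameters stand in for the Freddie Mac data and the thesis settings.

# Stacking five base classifiers and comparing AUC (sketch).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

base = [("lr", LogisticRegression(max_iter=1000)),
        ("cart", DecisionTreeClassifier(max_depth=5)),
        ("rf", RandomForestClassifier(n_estimators=200)),
        ("knn", KNeighborsClassifier()),
        ("svm", SVC(probability=True))]
stack = StackingClassifier(estimators=base, final_estimator=LogisticRegression(),
                           stack_method="predict_proba", cv=5)
for name, model in base + [("stacking", stack)]:
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")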
APA, Harvard, Vancouver, ISO, and other styles
47

Erickson, Joshua N. "Evaluation of computational methods for data prediction." Thesis, 2014. http://hdl.handle.net/1828/5662.

Full text
Abstract:
Given the overall increase in the availability of computational resources, and the importance of forecasting the future, it should come as no surprise that prediction is considered to be one of the most compelling and challenging problems for both academia and industry in the world of data analytics. But how is prediction done, what factors make it easier or harder, how accurate can we expect the results to be, and can we harness the available computational resources in meaningful ways? With efforts ranging from those designed to save lives in the moments before a near-field tsunami to others attempting to predict the performance of Major League Baseball players, future generations need to have realistic expectations about prediction methods and analytics. This thesis takes a broad look at the problem, including motivation, methodology, accuracy, and infrastructure. In particular, a careful study is provided involving experiments in regression (the prediction of continuous numerical values) and classification (the assignment of a class to each sample). The results and conclusions of these experiments cover only the included data sets and the applied algorithms as implemented by the Python library. The evaluation includes the accuracy and running time of different algorithms across several data sets, to establish trade-offs between the approaches and determine the impact of variations in the size of the data sets involved. As scalability is a key characteristic required to meet the needs of future prediction problems, a discussion of some of the challenges associated with parallelization is included.
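The kind of accuracy-versus-running-time comparison described in this abstract can be sketched as a simple loop over classifiers; assuming the unnamed Python library is scikit-learn (an assumption, since the abstract does not say), it might look like this.

# Compare accuracy and training plus scoring time across classifiers (sketch).
import time
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("KNN", KNeighborsClassifier()),
                  ("SVM", SVC()),
                  ("RandomForest", RandomForestClassifier(n_estimators=200))]:
    start = time.perf_counter()
    clf.fit(X_tr, y_tr)
    acc = clf.score(X_te, y_te)
    print(f"{name}: accuracy={acc:.3f}, time={time.perf_counter() - start:.2f}s")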
APA, Harvard, Vancouver, ISO, and other styles
48

Sharma, Govind. "Sentiment-Driven Topic Analysis Of Song Lyrics." Thesis, 2012. https://etd.iisc.ac.in/handle/2005/2472.

Full text
Abstract:
Sentiment Analysis is an area of Computer Science that deals with the impact a document makes on a user. The field is further sub-divided into Opinion Mining and Emotion Analysis, the latter of which is the basis for the present work. Work on songs is aimed at building affective interactive applications such as music recommendation engines. Using song lyrics, we are interested in both supervised and unsupervised analyses, each of which has its own pros and cons. For the unsupervised analysis (clustering), we use a standard probabilistic topic model called Latent Dirichlet Allocation (LDA). It mines topics from songs, which are simply probability distributions over the vocabulary of words. Some of the topics seem sentiment-based, motivating us to continue with this approach. We evaluate our clusters using a gold dataset collected from an apt website and get positive results. This approach would be useful in the absence of a labelled dataset. In another part of our work, we argue that supervision is inescapable, in the sense that the returned topics have to be analysed manually. Further, we have also used explicit supervision in the form of a training dataset for a classifier to learn sentiment-specific classes. This analysis helps reduce dimensionality and improve classification accuracy. We get excellent dimensionality reduction using Support Vector Machines (SVM) for feature selection. For re-classification, we use the Naive Bayes Classifier (NBC) and SVM, both of which perform well. We also use Non-negative Matrix Factorization (NMF) for classification, but observe that the results coincide with those of NBC, with no exceptions. This drives us towards establishing a theoretical equivalence between the two.
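The unsupervised step described here, topics mined from lyrics as probability distributions over the vocabulary, can be sketched with scikit-learn's LatentDirichletAllocation; the toy corpus, the number of topics and the top-word printout are illustrative assumptions, not the thesis setup.

# LDA topics over a bag-of-words representation of lyrics (sketch).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

lyrics = ["love you forever my heart is yours",
          "dancing all night under neon lights",
          "tears fall like rain on a lonely road",
          "party people jump and shout tonight"]

vec = CountVectorizer(stop_words="english")
dtm = vec.fit_transform(lyrics)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(dtm)

terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[-5:][::-1]       # indices of the five heaviest words
    print(f"topic {k}:", ", ".join(terms[i] for i in top))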
APA, Harvard, Vancouver, ISO, and other styles
49

Sharma, Govind. "Sentiment-Driven Topic Analysis Of Song Lyrics." Thesis, 2012. http://etd.iisc.ernet.in/handle/2005/2472.

Full text
Abstract:
Sentiment Analysis is an area of Computer Science that deals with the impact a document makes on a user. The field is further sub-divided into Opinion Mining and Emotion Analysis, the latter of which is the basis for the present work. Work on songs is aimed at building affective interactive applications such as music recommendation engines. Using song lyrics, we are interested in both supervised and unsupervised analyses, each of which has its own pros and cons. For the unsupervised analysis (clustering), we use a standard probabilistic topic model called Latent Dirichlet Allocation (LDA). It mines topics from songs, which are simply probability distributions over the vocabulary of words. Some of the topics seem sentiment-based, motivating us to continue with this approach. We evaluate our clusters using a gold dataset collected from an apt website and get positive results. This approach would be useful in the absence of a labelled dataset. In another part of our work, we argue that supervision is inescapable, in the sense that the returned topics have to be analysed manually. Further, we have also used explicit supervision in the form of a training dataset for a classifier to learn sentiment-specific classes. This analysis helps reduce dimensionality and improve classification accuracy. We get excellent dimensionality reduction using Support Vector Machines (SVM) for feature selection. For re-classification, we use the Naive Bayes Classifier (NBC) and SVM, both of which perform well. We also use Non-negative Matrix Factorization (NMF) for classification, but observe that the results coincide with those of NBC, with no exceptions. This drives us towards establishing a theoretical equivalence between the two.
APA, Harvard, Vancouver, ISO, and other styles
50

Σαψάνης, Χρήστος. "Αναγνώριση βασικών κινήσεων του χεριού με χρήση ηλεκτρομυογραφήματος". Thesis, 2013. http://hdl.handle.net/10889/6420.

Full text
Abstract:
The aim of this work was to recognise six basic hand movements using two systems. Being an interdisciplinary topic, it required studying the anatomy of the forearm muscles, biosignals, the method of electromyography (EMG) and pattern recognition methods. Moreover, the signal contained considerable noise and had to be analysed using EMD; features were extracted and their dimensionality reduced using RELIEF and PCA to improve the classification success rate. The first part uses a Delsys EMG system, initially on one subject and then on six, with the average classification success rate for the six movements exceeding 80%. The second part involves the construction of an autonomous EMG system using an Arduino microcontroller, EMG sensors and electrodes, which are arranged in an elastic glove. Classification results in this case reach 75%.
APA, Harvard, Vancouver, ISO, and other styles