Dissertations / Theses on the topic 'Naive bayes classifier'
Consult the top 50 dissertations / theses for your research on the topic 'Naive bayes classifier.'
Wester, Philip. "Anomaly-based intrusion detection using Tree Augmented Naive Bayes Classifier." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295754.
With the development of information technology and the growing dependence on these systems, keeping them secure is increasingly important. Intrusion detection systems (IDS) are one of many fundamental technologies that can increase the security of a system. One of the major challenges in IDS is detecting types of intrusion that have not been encountered before, so-called unknown intrusions. These intrusions are most often detected using methods collectively called anomaly detection methods. In this thesis I evaluate the performance of the Tree Augmented Naive Bayes Classifier (TAN) algorithm as an intrusion detection classifier. I implemented a TAN program in Python and tested it on two datasets containing network traffic. The thesis aims to create a better understanding of how TAN works and to evaluate whether it is a suitable algorithm for intrusion detection. The results show that TAN can perform at an acceptable level, with reasonably high accuracy. The results also highlight the importance of the "smoothing operator" included in the standard version of TAN.
Eldud, Omer Ahmed Abdelkarim. "Prediction of protein secondary structure using binary classification trees, naive Bayes classifiers and the Logistic Regression Classifier." Thesis, Rhodes University, 2016. http://hdl.handle.net/10962/d1019985.
Koc, Levent. "Application of a Hidden Bayes Naive Multiclass Classifier in Network Intrusion Detection." The George Washington University, 2013.
Vallin, Simon. "Likelihood-based classification of single trees in hemi-boreal forests." Thesis, Umeå universitet, Institutionen för matematik och matematisk statistik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-99691.
Being able to determine the species of individual trees is important in forestry. In this thesis we investigate whether it is possible to distinguish between spruce, pine and deciduous trees using data from an airborne laser scanner by estimating a unique density function for each tree species. The density functions are estimated in three ways: by fitting a beta distribution, by histogram estimation, and by kernel density estimation. All of these methods classify each individual laser return (rather than segments of laser returns). The results of our classification are then compared with a reference method based on features from laser scanner data. We measure how well the methods perform by comparing the overall accuracy, that is, the proportion of correctly classified trees. The highest overall accuracy among the methods developed in this thesis was obtained with the method based on histogram density estimation, at 83.4 percent correctly classified trees. This compares with 84.1 percent, the best result for the reference methods. That we obtain such a high rate of correctly classified trees indicates that the methods we use are useful for tree species classification.
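The histogram-based, likelihood-driven classification described in this abstract can be illustrated with a minimal Python sketch. It assumes one scalar feature per laser return (a hypothetical normalized height in [0, 1]), equal class priors, and that per-return log-likelihoods are summed to decide a tree's species; the abstract does not state how returns are aggregated, and the function names, bin count, smoothing constant and toy data are all illustrative rather than taken from the thesis.

```python
import math
from collections import Counter

def fit_histogram(values, bins=10, alpha=1.0):
    """Estimate a density on [0, 1] from samples using equal-width bins.

    alpha is an additive smoothing constant so that empty bins keep
    nonzero mass (an assumption of this sketch).
    """
    counts = Counter(min(int(v * bins), bins - 1) for v in values)
    total = len(values) + alpha * bins
    # Density = probability mass of the bin divided by the bin width (1 / bins).
    return [(counts.get(b, 0) + alpha) / total * bins for b in range(bins)]

def log_likelihood(density, values):
    """Sum of log densities over all returns, treating them as independent."""
    bins = len(density)
    return sum(math.log(density[min(int(v * bins), bins - 1)]) for v in values)

def classify(densities, values):
    """Pick the species whose estimated density gives the highest likelihood."""
    return max(densities, key=lambda species: log_likelihood(densities[species], values))

# Hypothetical training returns per species (normalized heights).
training = {
    "spruce": [0.10, 0.20, 0.25, 0.30, 0.35, 0.40],
    "pine": [0.60, 0.70, 0.75, 0.80, 0.85, 0.90],
}
densities = {species: fit_histogram(vals, bins=5) for species, vals in training.items()}
predicted = classify(densities, [0.72, 0.81, 0.65])
```

Swapping `fit_histogram` for a fitted beta distribution or a kernel density estimate would give the other two variants the abstract mentions, with the decision rule unchanged.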
Warsitha, Tedy, and Robin Kammerlander. "Analyzing the ability of Naive-Bayes and Label Spreading to predict labels with varying quantities of training data : Classifier Evaluation." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-188132.
A study was carried out on the classification methods Naive-Bayes and Label Spreading applied in a spam filter. The predictive ability of the methods was observed and the results were compared in a McNemar test, which revealed the strengths and weaknesses of the chosen methods in an environment with varying training data. Although the results were incomplete due to insufficient resources, the underlying theory is discussed from several angles. This discussion aims to give a better understanding of the underlying conditions that may lead to potentially different results for the chosen methods. It also opens up possibilities for improvements and future studies. The conclusion drawn from this study is that significant differences exist between the two chosen classifiers in their ability to predict classes. The final recommendation is to choose a classifier based on the amount of available training data and the availability of computing power.
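For readers unfamiliar with the McNemar test mentioned in this abstract, the following is a minimal Python sketch of the continuity-corrected version for two classifiers evaluated on the same test set. The variable names and toy predictions are hypothetical; the thesis's own data and implementation are not reproduced here.

```python
import math

def mcnemar(y_true, pred_a, pred_b):
    """McNemar's test with continuity correction for two classifiers.

    Returns (statistic, p_value); the statistic is compared against a
    chi-squared distribution with 1 degree of freedom.
    """
    # b: A correct and B wrong; c: A wrong and B correct. Agreements are ignored.
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree on correctness
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-squared with 1 d.o.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical predictions from two spam filters on 20 messages (1 = spam).
y_true = [1] * 20
pred_nb = [1] * 15 + [0] * 5            # stands in for Naive Bayes
pred_ls = [1] * 5 + [0] * 10 + [1] * 5  # stands in for Label Spreading
statistic, p_value = mcnemar(y_true, pred_nb, pred_ls)
```

Only the messages on which exactly one classifier is correct contribute to the statistic, which is why the test suits paired comparisons on a shared test set.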
SILVA, Antonio Carlos de Castro da. "Reconhecimento automático de defeitos de fabricação em painéis TFT-LCD através de inspeção de imagem." Universidade Federal de Pernambuco, 2016. https://repositorio.ufpe.br/handle/123456789/17823.
Previous issue date: 2016-01-15
The early detection of defects in the parts used in manufacturing assembly lines is crucial for ensuring good quality in the final product. On that basis, this work presents a platform developed for automatically detecting manufacturing defects in TFT-LCD (Thin Film Transistor-Liquid Crystal Display) panels by image inspection. The platform is camera-based, and the panel under inspection is positioned in a closed chamber to avoid interference from ambient light. The inspection steps comprise image acquisition by the cameras, definition of the region of interest (frame detection), feature extraction, image analysis, defect classification, and the decision to approve or reject the panel. Feature extraction is performed on both RGB and grayscale images. For each RGB component the pixel intensity is analyzed and the variance is calculated; a panel is rejected if it shows a variation of 5% relative to the reference values. Classification is performed using the Naive Bayes algorithm. The results show an accuracy of 94.23% in defect detection. Incorporation of the platform described here into Samsung's mass production line in Manaus is under study.
Pyon, Yoon Soo. "Variant Detection Using Next Generation Sequencing Data." Case Western Reserve University School of Graduate Studies / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=case1347053645.
Anderson, Michael P. "Bayesian classification of DNA barcodes." Diss., Manhattan, Kan. : Kansas State University, 2009. http://hdl.handle.net/2097/2247.
Lee, Jun won. "Relationships Among Learning Algorithms and Tasks." BYU ScholarsArchive, 2011. https://scholarsarchive.byu.edu/etd/2478.
Giunchi, Massimiliano. "Tecnologie per la gestione di big data: analisi della piattaforma Hadoop." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2017.
Kraus, Michal. "Zjednoznačňování slovních významů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2008. http://www.nusl.cz/ntk/nusl-235964.
Денисова, П. А., and P. A. Denisova. "COVID-19: Анализ эмоциональной окраски сообщений в социальных сетях (на материале сети «Twitter») : магистерская диссертация." Master's thesis, б. и, 2021. http://hdl.handle.net/10995/97958.
The work is devoted to sentiment analysis of messages on the Twitter social network. The research material consisted of 818,224 messages and 17 keywords; 89,025 of the tweets contained the words "COVID-19" and "Coronavirus". The first part considers theoretical and methodological issues: the concept of sentiment analysis is introduced and various approaches to text classification are analyzed. Among text classification methods, particular attention is given to the Naive Bayes classifier, which shows high accuracy. The features of sentiment analysis in social networks during epidemics and disease outbreaks are studied. The procedure and algorithm for analyzing the sentiment of a text are described. Much attention is paid to sentiment analysis of texts in Python using the TextBlob library. A SaaS (software as a service) tool is also chosen; it allows real-time sentiment analysis of texts and, unlike the Python approach, does not require extensive experience in machine learning and natural language processing. The second part of the study begins with sampling, that is, defining the keywords used to search for and export the relevant tweets. For this purpose the Coronavirus Corpus is used, which is designed to reflect the social, cultural and economic consequences of the coronavirus (COVID-19) in 2020 and beyond. The dynamics of topic word usage during 2020 are analyzed and a parallel is drawn between the frequency of their usage and the events taking place. Next, the selected keywords are used to search for tweets and, based on the data obtained, sentiment analysis of the messages is carried out using TextBlob, a Python library for processing textual data, and the Brand24 online service. The two tools give similar results.
The study helps to understand public sentiment about the COVID-19 outbreak quickly and in real time, thereby contributing to the understanding of developing events. This work can also be used as a model for determining the emotional state of Internet users in various situations.
Petřík, Patrik. "Predikce vývoje akciového trhu prostřednictvím technické a psychologické analýzy." Master's thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2010. http://www.nusl.cz/ntk/nusl-222507.
Khan, Syeduzzaman. "A PROBABILISTIC MACHINE LEARNING FRAMEWORK FOR CLOUD RESOURCE SELECTION ON THE CLOUD." Scholarly Commons, 2020. https://scholarlycommons.pacific.edu/uop_etds/3720.
Drábek, Matěj. "Využití vybraných metod strojového učení pro modelování kreditního rizika." Master's thesis, Vysoká škola ekonomická v Praze, 2017. http://www.nusl.cz/ntk/nusl-360509.
Helmersson, Benjamin. "Definition Extraction From Swedish Technical Documentation : Bridging the gap between industry and academy approaches." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-131057.
Hrach, Vlastimil. "Využití prostředků umělé inteligence na kapitálových trzích." Master's thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2011. http://www.nusl.cz/ntk/nusl-222912.
Mackových, Marek. "Analýza experimentálních EKG." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2016. http://www.nusl.cz/ntk/nusl-241981.
Guňka, Jiří. "Adaptivní klient pro sociální síť Twitter." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2011. http://www.nusl.cz/ntk/nusl-237052.
Maršánová, Lucie. "Analýza experimentálních EKG záznamů." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2015. http://www.nusl.cz/ntk/nusl-221365.
Ekdahl, Magnus. "Approximations of Bayes Classifiers for Statistical Learning of Clusters." Licentiate thesis, Linköping : Linköpings universitet, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-5856.
Nálevka, Petr. "Improving Efficiency of Prevention in Telemedicine." Doctoral thesis, Vysoká škola ekonomická v Praze, 2010. http://www.nusl.cz/ntk/nusl-113299.
Sjöqvist, Hugo. "Classifying Forest Cover type with cartographic variables via the Support Vector Machine, Naive Bayes and Random Forest classifiers." Thesis, Örebro universitet, Handelshögskolan vid Örebro Universitet, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-58384.
Sheppard, Sarah E. "Application of a Naïve Bayes Classifier to Assign Polyadenylation Sites from 3' End Deep Sequencing Data: A Dissertation." eScholarship@UMMS, 2013. http://escholarship.umassmed.edu/gsbs_diss/653.
Trevino, Alberto. "Improving Filtering of Email Phishing Attacks by Using Three-Way Text Classifiers." BYU ScholarsArchive, 2012. https://scholarsarchive.byu.edu/etd/3103.
Sandberg, Sebastian. "Identifying Hateful Text on Social Media with Machine Learning Classifiers and Normalization Methods - Using Support Vector Machines and Naive Bayes Algorithm." Thesis, Umeå universitet, Institutionen för datavetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-155353.
Polák, Michael Adam. "Identifikace zařízení na základě jejich chování v síti." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-417243.
Gaspareto, Marinaldo José. "SELEÇÃO DE ATRIBUTOS EM IMAGENS COLETADAS SOB CONDIÇÕES DE ILUMINAÇÃO NÃO CONTROLADA E SUA INFLUÊNCIA NO DESEMPENHO DE CLASSIFICADORES NAIVE BAYES PARA IDENTIFICAÇÃO DE OBJETOS EM ESTUFAS AGRÍCOLAS." UNIVERSIDADE ESTADUAL DE PONTA GROSSA, 2013. http://tede2.uepg.br/jspui/handle/prefix/172.
One problem in implementing navigation systems for autonomous mobile robots is detecting the objects of interest and the obstacles in the environment. This study considers the detection of walls and low walls of agricultural greenhouses in digital images acquired without illumination control. The proposed approach employs digital image processing and digital classification techniques to detect the object of interest. The classifier developed was of the Naive Bayes type. Two important issues when employing classification methods in computer vision are the accuracy of the classifier and its computational time complexity. The selection of the descriptor attributes that make up a classifier has a great impact on both factors; in general, the fewer attributes required, the lower the computational cost. With this in mind, this study compared the performance of two attribute selection methods based on principal component analysis, called B2 and B4, in two situations. In the first, attribute selection was performed on the data extracted from all images. In the second, selection was performed for images grouped by similarity. After selection, the attributes selected in each approach were used to build Naive Bayes classifiers with 12, 17, 22 and 27 input variables. The results indicate that grouping images is useful when: (a) the distance from the center of the group to the center of the original database exceeds a threshold and (b) the correlation between the descriptor variables and the target variable is greater in the group than in the complete data set. Keywords: Greenhouses, Autonomous navigation, Attribute selection, Naive Bayes classifiers.
Marin, Rodenas Alfonso. "Comparison of Automatic Classifiers’ Performances using Word-based Feature Extraction Techniques in an E-government setting." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-32363.
Full textMargold, Tomáš. "Klasifikace příspěvků ve webových diskusích." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2008. http://www.nusl.cz/ntk/nusl-235908.
Full textMichel, David. "All Negative on the Western Front: Analyzing the Sentiment of the Russian News Coverage of Sweden with Generic and Domain-Specific Multinomial Naive Bayes and Support Vector Machines Classifiers." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-447398.
Full textDočekal, Martin. "Porovnání klasifikačních metod." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403211.
Full textHátle, Lukáš. "Využití Bayesovských sítí pro predikci korporátních bankrotů." Master's thesis, Vysoká škola ekonomická v Praze, 2014. http://www.nusl.cz/ntk/nusl-192331.
Full textOchuko, Rita E. "E-banking operational risk assessment. A soft computing approach in the context of the Nigerian banking industry." Thesis, University of Bradford, 2012. http://hdl.handle.net/10454/5733.
Full textTully, Philip. "Spike-Based Bayesian-Hebbian Learning in Cortical and Subcortical Microcircuits." Doctoral thesis, KTH, Beräkningsvetenskap och beräkningsteknik (CST), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-205568.
Full textQC 20170421
Ochuko, Rita Erhovwo. "E-banking operational risk assessment : a soft computing approach in the context of the Nigerian banking industry." Thesis, University of Bradford, 2012. http://hdl.handle.net/10454/5733.
Блінков, Євген Миколайович. "Інформаційна технологія визначення тональності текстів." Bachelor's thesis, КПІ ім. Ігоря Сікорського, 2020. https://ela.kpi.ua/handle/123456789/39700.
Structure and scope of work. The explanatory note of the diploma project consists of five sections and contains 26 figures, 7 tables, 1 appendix and 17 sources. The diploma project is devoted to automating sentiment analysis of texts using various algorithms and to comparing the efficiency of these algorithms. It describes methods of sentiment analysis of texts and the principles of applying them in developing an information technology for text sentiment analysis. The information support section presents the sets of input and output data and describes their format, structure and purpose in the software product. The mathematical support section is primarily devoted to the informal and mathematical formulations of the problem and the key methods for solving it, and it also justifies the choice of these methods for implementation in the software product. The software section lists the development tools used in the software product and shows how the web application works through various diagrams. The technological section provides user guidance for the program and describes the test results.
Heidfors, Filip, and Elias Moltedo. "Maskininlärning: avvikelseklassificering på sekventiell sensordata. En jämförelse och utvärdering av algoritmer för att klassificera avvikelser i en miljövänlig IoT produkt med sekventiell sensordata." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20742.
A company has developed an environmentally friendly IoT device with sequential sensor data and wants to use machine learning to classify anomalies in its data. Over the years, several well-performing classification algorithms have been developed. However, no algorithm is optimal for every problem. The purpose of this work was therefore to investigate, compare and evaluate different supervised machine learning classifiers to find out which gives the best accuracy for classifying anomalies in the kind of IoT device the company has developed. Through a literature review we first identified which classifiers are commonly used and have worked well in related work for similar purposes and applications. We decided to further compare and evaluate Random Forest, Naïve Bayes and Support Vector Machines. We created a dataset of 513 examples that we used for training and evaluating each classifier. The results showed that Random Forest had superior accuracy, 95.7%, compared with Naïve Bayes (81.5%) and Support Vector Machines (78.6%). The conclusion is that Random Forest, at 95.7%, gives high enough accuracy for the company to make good use of the machine learning model. The results also indicate that Random Forest is the best supervised machine learning classifier for this thesis's specific classification problem, but that even higher accuracy might be achieved with other techniques such as unsupervised or semi-supervised machine learning.
Moraes, Rodrigo de. "Uma investigação empírica e comparativa da aplicação de RNAs ao problema de mineração de opiniões e análise de sentimentos." Universidade do Vale do Rio dos Sinos, 2013. http://www.repositorio.jesuita.org.br/handle/UNISINOS/3411.
Previous issue date: 2013
The area of Opinion Mining and Sentiment Analysis emerged from the need for automated processing of textual information about opinions posted on the web. Its main motivation is the constant growth in the volume of such information, enabled by the technologies of Web 2.0, which makes it infeasible to monitor and analyze these opinions, useful both to users intending to purchase new products and to companies identifying market demand. Currently, most studies in Opinion Mining and Sentiment Analysis that use data mining focus on developing techniques for better knowledge representation and end up using commonly applied classification techniques, without exploring others that perform well on other problems. This work therefore presents a comparative empirical investigation of applying the classical Artificial Neural Network (ANN) model, the multilayer perceptron, to the Opinion Mining and Sentiment Analysis problem. To this end, opinion datasets are defined and textual knowledge representation techniques are applied to them, giving all classifiers the same unigram-based representation of the texts. On this representation, the Support Vector Machines (SVM), Naïve Bayes (NB) and ANN classifiers are applied in three dataset contexts: (i) balanced datasets, (ii) datasets with different levels of imbalance and (iii) datasets to which random undersampling is applied to handle the imbalance. Investigating the unbalanced context and those derived from it is relevant because opinion datasets available on the web usually contain more positive opinions than negative ones.
To evaluate the classifiers, metrics are used for both classification performance and run time. The results in the balanced context indicate that the ANNs significantly outperform the other classifiers and, although they have a high computational cost for training, provide classification times significantly lower than those of the classifier whose classification results came closest to the ANNs'. In the unbalanced context, the ANNs proved sensitive to increased noise in the data representation and to increased imbalance, and the NB classifier stood out in these experiments. With undersampling applied, the ANNs are equivalent to the other classifiers, with competitive results. However, they may not be the most appropriate classifier to adopt in this context once training and classification times, and the small difference in classification accuracy, are taken into account.
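The random undersampling step named in this abstract can be sketched in a few lines of Python. This is a minimal illustration under the assumption that every class is reduced to the size of the smallest one; the function name, seed handling and toy reviews are hypothetical, and the dissertation's actual procedure may differ.

```python
import random
from collections import defaultdict

def random_undersample(samples, labels, seed=0):
    """Randomly drop examples from larger classes until all match the smallest."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    # Every class is cut down to the size of the least frequent class.
    target = min(len(items) for items in by_class.values())
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [l for _, l in balanced]

# Hypothetical unbalanced opinion dataset: 8 positive and 2 negative reviews.
reviews = [f"review {i}" for i in range(10)]
sentiments = ["pos"] * 8 + ["neg"] * 2
balanced_reviews, balanced_sentiments = random_undersample(reviews, sentiments)
```

The trade-off is that discarded majority-class examples carry information the classifier never sees, which is one reason to compare results against the untouched unbalanced datasets as the abstract describes.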
Solis, Montero Andres. "Efficient Feature Extraction for Shape Analysis, Object Detection and Tracking." Thesis, Université d'Ottawa / University of Ottawa, 2016. http://hdl.handle.net/10393/34830.
Hung, Chi-Chang, and 洪啟彰. "Improving Naive Bayes Classifier with Association Rules." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/72415233167975672024.
National Chung Cheng University, Graduate Institute of Computer Science and Information Engineering, academic year 92
The Naive Bayes classifier in machine learning is a probabilistic classifier based on Bayesian theory. It uses a statistical method to classify a new instance by assigning the class with the maximum conditional probability. Naive Bayes assumes that the terms are conditionally independent given the class. In a separate domain, association rule mining learns rules by exhaustive search, aiming to find all rules that satisfy user-specified minimum support and minimum confidence. A classifier trained only on such a set of rules may not classify accurately, but association rule mining can find strong evidence in the training set. Our approach incorporates this strong evidence into the Naive Bayes classifier. The accuracy of the combination is better than that of the Naive Bayes classifier alone.
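The baseline decision rule in this abstract, assigning the class with the maximum conditional probability under the independence assumption, can be sketched as follows. This is a minimal multinomial Naive Bayes over term lists with additive (Laplace) smoothing; the smoothing, the function names and the toy documents are assumptions of this sketch, and it does not include the association-rule extension the thesis proposes.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(documents, labels):
    """Count class frequencies and per-class term frequencies."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    vocabulary = set()
    for terms, label in zip(documents, labels):
        term_counts[label].update(terms)
        vocabulary.update(terms)
    return class_counts, term_counts, vocabulary

def predict(model, terms, alpha=1.0):
    """Return the class maximizing log P(class) + sum of log P(term | class)."""
    class_counts, term_counts, vocabulary = model
    n_docs = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        total_terms = sum(term_counts[label].values())
        score = math.log(count / n_docs)  # log prior
        for term in terms:
            if term not in vocabulary:
                continue  # terms never seen in training are skipped
            # Independence assumption: per-term log-likelihoods simply add up.
            smoothed = (term_counts[label][term] + alpha) / (total_terms + alpha * len(vocabulary))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy corpus.
documents = [
    ["cheap", "pills", "offer"],
    ["meeting", "agenda"],
    ["cheap", "offer", "now"],
    ["project", "meeting", "notes"],
]
labels = ["spam", "ham", "spam", "ham"]
model = train_naive_bayes(documents, labels)
prediction = predict(model, ["cheap", "meeting", "offer"])
```

Working in log space avoids numerical underflow when many term probabilities are multiplied together.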
Tsai, Zong-Ching, and 蔡宗欽. "Improving Naive Bayes Classifier with Multiple Attributes Association Rules." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/54197840022046525752.
Southern Taiwan University of Technology, Graduate Institute of Industrial Management, academic year 93
Many previous studies indicate that the NBC (Naïve Bayesian Classifier) is a simple and effective classification method. However, the attribute independence assumption on which it is based leaves it unable to cope with dependence among attributes and affects its classification performance. Many improvements have been proposed in the past, but while they raise accuracy they usually need considerable computation time. This research presents a new classification method, MANBC (Multiple Attributes Naïve Bayes Classifier). MANBC adopts a greedy algorithm to generate important class association rules from the examples misclassified by NBC and uses the best rule in prediction. In the experiments, we compared MANBC with 13 other methods on 31 data sets from the UCI Repository. The results showed that MANBC outperforms NBC on average: its classification accuracy is about 1% to 2% higher, and on data sets where the attribute independence assumption is violated the margin is about 20%. Meanwhile, the number of rules generated by MANBC is usually only 7% of CPAR's. In terms of time, the classifiers that are as accurate as MANBC take slightly longer than MANBC, except CPAR.
Liu, Yu-Hsuan, and 劉宇軒. "Naive Bayes classifier with Principal Components Analysis and Fisher Information." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/ugz36c.
國立臺灣科技大學
資訊管理系
105
The naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions between the features. We propose a method that combines the naive Bayes classifier with Principal Components Analysis (PCA) and Fisher information. We use PCA to make the features uncorrelated. The transformed features are then ranked by their Fisher information score, which measures the amount of information they carry, and the posterior probability is calculated with the likelihood replaced by a p-value. We evaluate the method by its classification accuracy on several examples and present our vision for future research.
Chen, I.-Chieh, and 陳羿捷. "Image Classification Using Naive Bayes Classifier With Pairwise Local Observations." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/99031641900696139918.
國立清華大學
電機工程學系
103
We present an image classification method, the naive Bayes classifier using pairwise local observations (NBPLO), based on salient region (SR) selection and local feature detection. Unlike previous image classification algorithms, our method is invariant to scale, translation, and rotation. By transforming the pairwise local observations into training vectors, we aim to simulate the human visual system, building the classification model on the neighboring relationships of the selected SRs. We verify our assumptions on the Scene-15 and Caltech-101 databases and compare mainstream feature point detection methods. We also compare the experimental results with those of the bag-of-features (BoF) and SPM algorithms.
Lin, Cheng-Lung, and 林政龍. "Internet Traffic Classification based on Hybrid Naive Bayes HMMs Classifier." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/59142288452653192129.
國立臺灣科技大學
資訊工程系
96
Managing a large network infrastructure requires an automatic network management system. Traditionally, most firewalls simply use the port numbers of packets to identify abnormal network traffic, and some also inspect application-layer characteristics such as the packet payload. However, these traditional security mechanisms run into difficulties as encrypted protocols become more popular. Recent related research can identify the application protocol of encrypted traffic from a restricted set of characteristics and behaviors at the transport layer of the TCP/IP model. We therefore combine two models, naive Bayes and hidden Markov models (HMMs), into an automatic system that uses the limited information available in encrypted packets to infer and classify application protocol behavior. Generally speaking, HMMs are well suited to estimating latent relationships in temporal data, while naive Bayes is simple, fast, and effective and is often used for multidimensional data sets. In this thesis, we propose a hybrid naive Bayes HMM classifier as a fundamental framework for inferring application protocol behavior in encrypted network traffic. The hybrid model uses the temporal property of HMMs to inspect the relations between packets and employs naive Bayes to characterize the statistical signature. Our approach not only identifies network behavior in encrypted traffic but also exploits the temporal property to raise accuracy. It can be applied to infer the application protocol and to detect abnormal behavior. Compared with related research, our method uses only a few features to classify multi-flow protocols and achieves respectable performance.
Wu, Jo-Ping, and 吳若平. "Naive Bayes classifier with Principal Components Analysis for continuous attributes." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/62937176964151133310.
國立中央大學
工業管理研究所
103
With advances in science and technology, the volume of data is growing rapidly, and classifier speed has become an important concern in data mining. The naïve Bayes classifier is a simple and practical classification method based on applying Bayes' theorem with strong independence assumptions between the features. This assumption, however, is unrealistic in many real situations. We propose a classification method, PC-Naïve, based on the naïve Bayes classifier. It keeps the simplicity and speed of the naïve Bayes classifier while relaxing the model's key independence assumption. We use principal components analysis to transform the original data so that the attributes are linearly independent of one another. We then discretize the transformed data and calculate the prior and conditional probabilities. Finally, we obtain the posterior probability and classify the data. We use examples to present the classification procedure and compare the accuracy of four models: PC-Naïve, traditional naïve Bayes, decision tree, and stepwise logistic regression. Finally, we discuss how accuracy varies with the number of dimensions and the discretization method.
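The pipeline in this abstract (PCA transform, discretization, then prior and conditional probabilities) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it handles only two-dimensional data so that PCA has a closed form, it uses equal-width binning and add-alpha smoothing (neither is specified in the abstract), and every name in it (`fit_pca_2d`, `apply_pca_2d`, `to_bin`, `PCNaiveSketch`) is hypothetical.

```python
import math
from collections import Counter

def fit_pca_2d(points):
    """Return (mean_x, mean_y, cos, sin) of the rotation onto principal axes."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return mx, my, math.cos(theta), math.sin(theta)

def apply_pca_2d(model, point):
    """Center a point and rotate it into principal-component coordinates."""
    mx, my, c, s = model
    x, y = point[0] - mx, point[1] - my
    return (c * x + s * y, -s * x + c * y)

def to_bin(value, lo, hi, n_bins):
    """Equal-width discretization (an assumption); clamps values outside [lo, hi]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)

class PCNaiveSketch:
    def __init__(self, n_bins=3, alpha=1.0):
        # alpha is an add-alpha smoothing constant, assumed here to avoid log(0).
        self.n_bins, self.alpha = n_bins, alpha

    def fit(self, points, labels):
        self.pca = fit_pca_2d(points)
        comps = [apply_pca_2d(self.pca, p) for p in points]
        self.ranges = [(min(c[j] for c in comps), max(c[j] for c in comps))
                       for j in range(2)]
        self.class_counts = Counter(labels)
        self.cell_counts = Counter()  # (label, component index, bin) -> count
        for comp, label in zip(comps, labels):
            for j in range(2):
                self.cell_counts[(label, j, self._bin(comp, j))] += 1
        return self

    def _bin(self, comp, j):
        lo, hi = self.ranges[j]
        return to_bin(comp[j], lo, hi, self.n_bins)

    def predict(self, point):
        comp = apply_pca_2d(self.pca, point)
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n_c in self.class_counts.items():
            score = math.log(n_c / total)  # log prior
            for j in range(2):  # add log conditional of each discretized component
                n = self.cell_counts[(label, j, self._bin(comp, j))]
                score += math.log((n + self.alpha) / (n_c + self.alpha * self.n_bins))
            if score > best_score:
                best, best_score = label, score
        return best
```

The rotation is fitted on the training data and reused at prediction time, so new instances are discretized in the same component space as the training instances.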
Chang, Liang-Hao, and 張良豪. "Improving the performance of Naive Bayes Classifier by using Selective Naive Bayesian Algorithm and Prior Distributions." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/92613736217287175606.
國立成功大學
工業與資訊管理學系碩博士班
97
Naive Bayes classifiers are widely used for data classification because of their computational efficiency and competitive accuracy. When all attributes are used for classification, the accuracy of the naive Bayes classifier is generally degraded by noisy attributes, so an attribute selection mechanism should be considered to improve prediction accuracy. The selective naive Bayesian method is a very successful approach for removing noisy and/or redundant attributes. In addition, attributes are generally assumed to have prior distributions, such as Dirichlet or generalized Dirichlet distributions, to achieve higher prediction accuracy. Many studies have proposed methods for finding the best priors for attributes, but none of them takes attribute selection into account. This thesis therefore proposes two models that combine prior distributions with feature selection to increase the accuracy of the naive Bayes classifier. Model I finds the best prior for each attribute after all attributes have been selected by the selective naive Bayesian algorithm. Model II finds the best prior for the newest attribute chosen by the selective naive Bayesian algorithm once all of its predecessors have their best priors. Experimental results on 17 data sets from the UCI data repository show that Model I with the generalized Dirichlet prior generally and consistently achieves higher classification accuracy.
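To illustrate how a Dirichlet prior enters the probability estimates, the sketch below computes the posterior mean of an attribute's value probabilities within one class. It covers only the ordinary Dirichlet case; the generalized Dirichlet priors and the attribute selection step studied in the thesis are not shown. The function name and the example counts are hypothetical.

```python
def dirichlet_posterior_mean(counts, alphas):
    """Posterior mean of multinomial probabilities under a Dirichlet prior.

    counts[k] is how often attribute value k was observed in a class;
    alphas[k] is the matching Dirichlet parameter (all ones gives Laplace smoothing).
    """
    total = sum(counts) + sum(alphas)
    return [(n + a) / total for n, a in zip(counts, alphas)]

# Hypothetical counts for a three-valued attribute within one class.
probs = dirichlet_posterior_mean([3, 0, 1], [1.0, 1.0, 1.0])
```

With these counts the unseen second value receives probability 1/7 instead of zero, which is the practical effect of the prior on the classifier's conditional probabilities.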
Chen, Zhi-Jun, and 陳志濬. "Investigating the Effect of Attribute Value Ranking Methods on Naive Bayes Classifier with Generalized Dirichlet Priors." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/88464897701033696577.
國立成功大學
工業管理科學系碩博士班
98
Naïve Bayesian classifiers are widely used for data classification because of their computational efficiency and competitive accuracy. In a naïve Bayesian classifier, the prior distributions of an attribute are generally assumed to be Dirichlet or generalized Dirichlet distributions. The generalized Dirichlet distribution relaxes the restrictions of the Dirichlet distribution and usually results in higher classification accuracy. However, the order of the variables in a generalized Dirichlet random vector is generally not arbitrary. In this study, three methods for determining the order of attribute values are proposed in order to study their impact on the performance of naïve Bayesian classifiers with noninformative generalized Dirichlet priors. Experimental results on 20 data sets from the UCI data repository demonstrate that when attribute values are properly ordered, classification accuracy improves slightly compared with unordered attribute values. When computational efficiency is a major concern, ordering attribute values before applying noninformative generalized Dirichlet priors is not necessary.
Liu, Chong-Hsien, and 劉忠賢. "Real-time and Low-memory Multi-face Detection System Design based on Naive Bayes Classifier using FPGA." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/f92z7n.
國立交通大學
電控工程研究所
104
In recent years, face detection has been widely used in fields such as face recognition, image focusing, and surveillance systems. This thesis proposes a real-time face detection system based on a naive Bayesian classifier implemented on an FPGA. The system is divided into three main parts: feature extraction, candidate face detection, and false positive elimination. First, the image is downscaled into an image pyramid and local binary image features are extracted from each downscaled image; the features then pass through the naive Bayesian classifier to identify candidate faces. Finally, a skin color filter and overlapping-face elimination remove false positives. Detection results are output to a monitor over VGA. The face detection system is implemented on an FPGA. Thanks to the FPGA's parallel processing, face detection on a 640×480 image completes within 16.7 milliseconds. The improved local binary features require around 140 times less memory than Haar features. The experimental results show a face detection accuracy above 95%, which indicates that the proposed real-time detection system is effective and efficient.
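The local binary features mentioned in this abstract build on the idea of comparing each pixel with its neighbors. The sketch below shows the basic 8-neighbor local binary pattern (LBP) in plain Python as an illustration only; the thesis uses an improved variant that the abstract does not describe, and the bit ordering, the `>=` comparison, and the function names here are assumptions.

```python
# Neighbor offsets (row, column), clockwise from the top-left pixel.
# The starting point and direction are a convention chosen for this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the interior pixel img[r][c]."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # Set the bit when the neighbor is at least as bright as the center.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_image(img):
    """Compute LBP codes for every interior pixel of a grayscale image."""
    rows, cols = len(img), len(img[0])
    return [[lbp_code(img, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]

# A 3x3 grayscale patch with hypothetical intensity values.
patch = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
codes = lbp_image(patch)
```

Each code fits in a single byte per pixel, which suggests why binary features of this kind are attractive for a low-memory hardware design.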
Mawila, Ntombhimuni. "Natural language processing for researchh philosophies and paradigms dissertation (DFIT91)." Diss., 2021. http://hdl.handle.net/10500/27471.
Science and Technology Education
MTech. (Information Technology)