Journal articles on the topic 'Multinomial naïve bayesian'

Consult the top 32 journal articles for your research on the topic 'Multinomial naïve bayesian.'

1

Xu, Shuo. "Bayesian Naïve Bayes classifiers to text classification." Journal of Information Science 44, no. 1 (2016): 48–59. http://dx.doi.org/10.1177/0165551516677946.

Abstract:
Text classification is the task of assigning predefined categories to natural language documents, and it can provide conceptual views of document collections. The Naïve Bayes (NB) classifier is a family of simple probabilistic classifiers based on a common assumption that all features are independent of each other, given the category variable, and it is often used as the baseline in text classification. However, classical NB classifiers with multinomial, Bernoulli and Gaussian event models are not fully Bayesian. This study proposes three Bayesian counterparts, where it turns out that classical NB classifier with Bernoulli event model is equivalent to Bayesian counterpart. Finally, experimental results on 20 newsgroups and WebKB data sets show that the performance of Bayesian NB classifier with multinomial event model is similar to that of classical counterpart, but Bayesian NB classifier with Gaussian event model is obviously better than classical counterpart.
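For readers unfamiliar with the multinomial event model mentioned in this abstract, the following is a minimal sketch of a classical multinomial Naïve Bayes text classifier with add-alpha (Laplace) smoothing. It is not the Bayesian variant the paper proposes; the function names `train_mnb` and `predict_mnb`, the toy documents, and the default `alpha=1.0` are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

def train_mnb(docs, labels, alpha=1.0):
    """Fit a classical multinomial Naive Bayes model with add-alpha smoothing.

    docs is a list of token lists, labels the matching class labels.
    All names here are illustrative, not taken from the cited paper.
    """
    vocab = sorted({w for d in docs for w in d})
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    # Log prior: fraction of training documents in each class.
    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    # Smoothed log likelihood of each vocabulary word given the class.
    log_lik = {}
    for c in class_docs:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_lik[c] = {w: math.log((word_counts[c][w] + alpha) / total) for w in vocab}
    return log_prior, log_lik

def predict_mnb(model, doc):
    """Return the class with the highest log posterior; unseen words are skipped."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(log_lik[c][w] for w in doc if w in log_lik[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Toy data, invented for this sketch.
docs = [
    ["ball", "goal", "team"],
    ["vote", "party", "election"],
    ["team", "match", "goal"],
    ["party", "vote", "law"],
]
labels = ["sport", "politics", "sport", "politics"]
model = train_mnb(docs, labels)
print(predict_mnb(model, ["goal", "team"]))
```

The add-alpha estimate used here is a smoothed point estimate of the word probabilities, which is one way to see why the abstract calls the classical model "not fully Bayesian".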
2

Olanrewaju, Rasaki Olawale, Sodiq Adejare Olanrewaju, and Lukman Abiodun Nafiu. "Multinomial Naïve Bayes Classifier: Bayesian versus Nonparametric Classifier Approach." European Journal of Statistics 2 (February 22, 2022): 8. http://dx.doi.org/10.28924/ada/stat.2.8.

Abstract:
This paper proposes a Naïve Bayes Classifier for Bayesian and nonparametric methods of analyzing multinomial regression. The Naïve Bayes classifier adopted Bayes’ rule for solving the posterior of the multinomial regression via its link function known as Logit link. The nonparametric adopted Gaussian, bi-weight kernels, Silverman’s rule of thumb bandwidth selector, and adjusted bandwidth as kernel density estimation. Three categorical responses of information on 78 people using one of three diets (Diet A, B, and C) that consist of scaled variables: age (in years), height (in cm), weight (in kg) before the diet (that is, pre-weight), weight (in kg) gained after 6 weeks of diet were subjected to the classifier multinomial regression of Naïve Bayes and nonparametric. The Gaussian and bi-weight kernel density estimation produced the minimum bandwidths across the three categorical responses for the four influencers. The Naïve Bayes classifier and nonparametric kernel density estimation for the multinomial regression produced the same prior probabilities of 0.3077, 0.3462, and 0.3462; and A prior probabilities of 0.3077, 0.3462, and 0.3462 for Diet A, Diet B, and Diet C at different smoothing bandwidths.
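The prior probabilities quoted in this abstract are what one would obtain as simple class proportions of the 78 participants. The short sketch below shows that arithmetic; the group sizes 24, 27 and 27 are an assumption chosen because they reproduce the reported values, and are not stated in the abstract.

```python
# Hypothetical group sizes: an assumption that happens to reproduce the
# reported priors; the abstract only gives the total of 78 participants.
counts = {"Diet A": 24, "Diet B": 27, "Diet C": 27}

total = sum(counts.values())
# Class prior = number of people on the diet / total number of people.
priors = {diet: round(n / total, 4) for diet, n in counts.items()}
print(total, priors)
```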
3

Yashvi, Vaghasiya, Vora Diya, Nehayadav, and Rana Manish. "Language detection using multinomial naïve bayes algorithm." i-manager's Journal on Computer Science 10, no. 2 (2022): 34. http://dx.doi.org/10.26634/jcom.10.2.19014.

Abstract:
In this multilingual world, automatic detection of written or spoken language using Language Identification (LID) technology is a boon for global communication among people using different languages in different countries. For simplicity and for the purpose of this research, the process of automatically identifying the language(s) of a document is referred to as LID. Many ongoing research projects in Natural Language Processing (NLP) use LID as a component. This field exploits several algorithms from computer science, individually or in combination, to identify a language accurately. Among the different approaches adopted in LID, Naïve Bayes classification with n-gram text processing seems promising. This paper proposes a concept for categorising multilingual texts using Naïve Bayesian algorithms and machine learning approaches. Using techniques from existing research, this paper proposes a way to recognize multilingual documents and calculate the relative proportions of these languages.
4

Liu, Kuan-Liang, and Tzu-Tsung Wong. "Naïve Bayesian Classifiers with Multinomial Models for rRNA Taxonomic Assignment." IEEE/ACM Transactions on Computational Biology and Bioinformatics 10, no. 5 (2013): 1. http://dx.doi.org/10.1109/tcbb.2013.114.

5

Saxena, Neeraj, Ruiyang Wang, Vinayak V. Dixit, and S. Travis Waller. "Frequentist and Bayesian Approaches for Understanding Route Choice of Drivers under Stop-and-Go Traffic." Transportation Research Record: Journal of the Transportation Research Board 2674, no. 9 (2020): 371–82. http://dx.doi.org/10.1177/0361198120929332.

Abstract:
Driving in congested traffic is a nuisance that not only results in longer travel times, but also triggers frustration and impatience among drivers. A few studies have modeled the effects of congested traffic in the resulting route choice behavior of car drivers. The studies used frequentist models such as discrete choice models to analyze large samples. However, these studies did not compare the inferences obtained from the frequentist and Bayesian approaches, particularly for datasets which are not sufficiently large. It has been shown by researchers that Bayesian models perform well, especially when the sample size is small. Thus, this paper develops and compares a multinomial logit (frequentist) and a Naïve Bayes (Bayesian) model on a mid-sized dataset of size around 100 participants which was obtained from a driving simulator experiment to understand driver’s route choice under stop-and-go traffic. The results show that the prediction power of the Naïve Bayes model is much higher than the multinomial logit model (MNL). The Naïve Bayes model is also found to perform better than machine learning algorithms like the decision tree model. The findings from this study will be useful to researchers and practitioners as they should test both the approaches and select the appropriate model, particularly in the case of seemingly large datasets.
6

Chen, Yilin. "Comparative Analysis of Bayesian Approaches and Variant Methods in the Financial Field." Journal of Education, Humanities and Social Sciences 49 (April 15, 2025): 59–66. https://doi.org/10.54097/38mmdr81.

Abstract:
It is important for researchers to select the most appropriate model for specific financial tasks. This comparative analysis describes the strengths, limitations, and trade-offs of Bayesian approaches and variant methods. Moreover, this study aims to compare Bayesian approaches and their variant methods in the financial field and assess their effectiveness and relevance for stock price prediction. This study starts with a brief overview of Bayes’ Theorem, Naïve Bayes (NB), multinomial Naïve Bayes (NBM), and Gaussian Naïve Bayes (GNB). In the prediction of the Brazilian stock market, NBM demonstrates superior performance compared to other models in terms of accuracy, recall, and F1-score. Meanwhile, in the prediction of seven different stocks, the combined model based on GNB and linear discriminant analysis (GNB_LDA) exhibits superior accuracy and F1-score, while the model that incorporates GNB, standardization, and factor analysis (GNB_Z-Score_FA) excels in mean specificity. This study also suggests further exploration of hybrid models that combine the strengths of Bayesian models with other techniques to enhance financial analysis and decision-making.
7

Wong, Tzu-Tsung, and Hsing-Chen Tsai. "Multinomial naïve Bayesian classifier with generalized Dirichlet priors for high-dimensional imbalanced data." Knowledge-Based Systems 228 (September 2021): 107288. http://dx.doi.org/10.1016/j.knosys.2021.107288.

8

Wong, Tzu-Tsung. "Generalized Dirichlet priors for Naïve Bayesian classifiers with multinomial models in document classification." Data Mining and Knowledge Discovery 28, no. 1 (2012): 123–44. http://dx.doi.org/10.1007/s10618-012-0296-4.

9

Atoyebi, Temitope Olufunmi, Rashidah Funke Olanrewaju, N. V. Blamah, and Morufu Olalere. "Malaria Disease Prediction and Grading System: A Performance Model of Multinomial Naïve Bayes (MNB) Machine Learning in Nigerian Hospitals." International Journal for Research in Applied Science and Engineering Technology 11, no. 10 (2023): 2113–24. http://dx.doi.org/10.22214/ijraset.2023.56378.

Abstract:
Malaria disease is the number one cause of death all over the Sub-Sahara world. Data mining can help extract valuable knowledge from available data in the healthcare sector. This allows training a patient health prediction model faster than in a clinical trial. Various implementations of machine learning algorithms such as Bayesian Theorem, Logistic Regression, K-Nearest Neighbor, Support Vector Machine and Multinomial Naïve Bayes (MNB) have been applied to Public Hospital Malaria Disease datasets, but there has been a limit to modeling using the Multinomial Naïve Bayes algorithm. This research applied MNB modeling to discover the relationship between 15 relevant attributes of the Public Hospitals data collected from Bwari General Hospital in Bwari Area Council and Maitama Hospital in Abuja Municipal Area Council, Abuja, FCT, Nigeria. The goal is to examine how dependencies between attributes affect the performance of the classifier. The MNB produces a reliable and transparent graphical representation between the attributes with the ability to predict new scenarios. The model has an accuracy of 97%. It was concluded that the model outperformed the GNB classifier which has an accuracy of 100% and RF which also has an accuracy of 100%.
10

Wong, Tzu-Tsung, and Chao-Rui Liu. "An efficient parameter estimation method for generalized Dirichlet priors in naïve Bayesian classifiers with multinomial models." Pattern Recognition 60 (December 2016): 62–71. http://dx.doi.org/10.1016/j.patcog.2016.04.019.

11

El Hindi, Khalil, Bayan Abu Shawar, Reem Aljulaidan, and Hussien Alsalamn. "Improved Distance Functions for Instance-Based Text Classification." Computational Intelligence and Neuroscience 2020 (November 22, 2020): 1–10. http://dx.doi.org/10.1155/2020/4717984.

Abstract:
Text classification has many applications in text processing and information retrieval. Instance-based learning (IBL) is among the top-performing text classification methods. However, its effectiveness depends on the distance function it uses to determine similar documents. In this study, we evaluate some popular distance measures’ performance and propose new ones that exploit word frequencies and the ordinal relationship between them. In particular, we propose new distance measures that are based on the value distance metric (VDM) and the inverted specific-class distance measure (ISCDM). The proposed measures are suitable for documents represented as vectors of word frequencies. We compare these measures’ performance with their original counterparts and with powerful Naïve Bayesian-based text classification algorithms. We evaluate the proposed distance measures using the kNN algorithm on 18 benchmark text classification datasets. Our empirical results reveal that the distance metrics for nominal values render better classification results for text classification than the Euclidean distance measure for numeric values. Furthermore, our results indicate that ISCDM substantially outperforms VDM, but it is also more susceptible to make use of the ordinal nature of term-frequencies than VDM. Thus, we were able to propose more ISCDM-based distance measures for text classification than VDM-based measures. We also compare the proposed distance measures with Naïve Bayesian-based text classification, namely, multinomial Naïve Bayes (MNB), complement Naïve Bayes (CNB), and the one-versus-all-but-one (OVA) model. It turned out that when kNN uses some of the proposed measures, it outperforms NB-based text classifiers for most datasets.
12

Rintyarna, Bagus Setya. "Joint Distribution pada Weighted Majority Vote (WMV) untuk Peningkatan Kinerja Sentiment Analysis Tersupervisi pada Dataset Twitter." Jurnal Teknologi Informasi dan Ilmu Komputer 9, no. 5 (2022): 1083. http://dx.doi.org/10.25126/jtiik.2022956185.

Abstract:
Sentiment analysis is a computational text mining technique based on natural language processing (NLP) for extracting opinions expressed on online platforms, including the Twitter microblogging platform, one of the most popular microblogging platforms in Indonesia. Two approaches are commonly used in sentiment analysis: the machine learning (ML) based approach and the sentiment lexicon (SL) based approach. The focus of this research is the development of machine learning-based sentiment analysis techniques, also called supervised techniques, on a Twitter dataset. Most sentiment analysis on Indonesian-language Twitter datasets relies on a single machine learning algorithm. This study combines the performance of various algorithms/experts while reducing the level of misclassification by updating the weights dynamically using a weighted majority vote (WMV) based on the joint distribution of a Bayesian Network. In the first stage, data was collected from Twitter using 3 hashtags related to Covid-19 as experimental data. The performance of the weighted majority vote was then extensively compared with 4 baseline methods: Naïve Bayes, Gaussian Naïve Bayes, Multinomial Naïve Bayes and a Majority Vote of the three single classifiers. The performance metrics used are precision, recall, F-measure, accuracy and the Matthews correlation coefficient. The experiments show that WMV improves sentiment analysis performance on the three dataset topics across the various performance metrics.
13

Alam, Firoj, Evgeny A. Stepanov, and Giuseppe Riccardi. "Personality Traits Recognition on Social Network - Facebook." Proceedings of the International AAAI Conference on Web and Social Media 7, no. 2 (2021): 6–9. http://dx.doi.org/10.1609/icwsm.v7i2.14464.

Abstract:
For the natural and social interaction it is necessary to understand human behavior. Personality is one of the fundamental aspects, by which we can understand behavioral dispositions. It is evident that there is a strong correlation between users’ personality and the way they behave on online social network (e.g., Facebook). This paper presents automatic recognition of Big-5 personality traits on social network (Facebook) using users’ status text. For the automatic recognition we studied different classification methods such as SMO (Sequential Minimal Optimization for Support Vector Machine), Bayesian Logistic Regression (BLR) and Multinomial Naïve Bayes (MNB) sparse modeling. Performance of the systems had been measured using macro-averaged precision, recall and F1; weighted average accuracy (WA) and un-weighted average accuracy (UA). Our comparative study shows that MNB performs better than BLR and SMO for personality traits recognition on the social network data.
14

Mohammed, Ali Sura I., Marwah Nihad, Sharaf Hussien Mohamed, and Haitham Farouk. "Machine learning for text document classification-efficient classification approach." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 703–10. https://doi.org/10.11591/ijai.v13.i1.pp703-710.

Abstract:
Numerous alternative methods for text classification have been created because of the increase in the amount of online text information available. The cosine similarity classifier is the most extensively utilized simple and efficient approach. It improves text classification performance. It is combined with estimated values provided by conventional classifiers such as Multinomial Naive Bayesian (MNB). Consequently, combining the similarity between a test document and a category with the estimated value for the category enhances the performance of the classifier. This approach provides a text document categorization method that is both efficient and effective. In addition, methods for determining the proper relationship between a set of words in a document and its document categorization are also obtained.
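The abstract does not say how the cosine similarity and the classifier's estimated value are combined. As one possible reading, the sketch below multiplies the cosine similarity between a test document and each category's term-count vector by a per-category estimate such as an MNB posterior. The multiplication rule, the function names `cosine` and `combined_scores`, and all numbers are assumptions for illustration, not the authors' method.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def combined_scores(doc_tokens, class_term_counts, class_estimates):
    """Score each category as similarity(doc, category) * classifier estimate.

    The product is an assumed combination rule; the abstract does not specify one.
    """
    doc_vec = Counter(doc_tokens)
    return {
        c: cosine(doc_vec, class_term_counts[c]) * class_estimates[c]
        for c in class_estimates
    }

# Hypothetical category term counts and hypothetical classifier estimates.
class_term_counts = {
    "sport": Counter({"goal": 2, "team": 2, "ball": 1}),
    "politics": Counter({"vote": 2, "party": 2, "law": 1}),
}
class_estimates = {"sport": 0.6, "politics": 0.4}

scores = combined_scores(["goal", "team", "vote"], class_term_counts, class_estimates)
best = max(scores, key=scores.get)
print(best, scores)
```

A weighted sum of the two quantities would be an equally plausible design; the product is used here only because it needs no extra weighting parameter.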
15

Li, Chengbo, Honghong Liu, Bochen Yin, and Jiaying Yang. "Weibo Depression Posts Detection by Natural Language Processing." Highlights in Science, Engineering and Technology 16 (November 10, 2022): 430–37. http://dx.doi.org/10.54097/hset.v16i.2605.

Abstract:
The goal of this paper is to detect depression based on posts on social media. The dataset combines a tweet dataset and a Weibo dataset scraped from social media. To classify emotions, researchers have been using traditional models such as Bagging, Support Vector Machines, Decision tree, Multinomial Naïve Bayesian and K-nearest neighbor. In this paper, K-nearest neighbor is chosen for its better precision. The main challenges are translating the Chinese context and the complexity of the context of each post. Finally, a UI page is designed that takes a Weibo ID as input and outputs the depression classification. Our approach achieves relatively higher quality results compared to previous models in the literature, while combining depression detection with real-time social media posts. With this system, we demonstrate the practicability of our project by predicting the depression status of Internet users through the model, to bring help to depressed people or people with potential depression tendencies in society.
16

Dhar, Ankita, Himadri Mukherjee, Kaushik Roy, KC Santosh, and Niladri Sekhar Dash. "Hybrid approach for text categorization: A case study with Bangla news article." Journal of Information Science 49, no. 3 (2023): 762–77. http://dx.doi.org/10.1177/01655515211027770.

Abstract:
The incredible expansion of online texts due to the Internet has intensified and revived the interest of sorting, managing and categorising the documents into their respective domains. This shows the pressing need for automatic text categorization system to assign a document into its appropriate domain. In this article, the focus is on showcasing the effectiveness of a hybrid approach that works elegantly by combining text-based and graph-based features. The hybrid approach was applied on 14,373 Bangla articles with 57,22,569 tokens collected from various online news corpora covering nine categories. This article also presents the individual application of both the features to explicate how they generally work. For classification purposes, the feature sets were passed through the Bayesian classification methods which yield satisfactory results with 98.73% accuracy for Naïve Bayes Multinomial (NBM). Also, to test the robustness and language independency of the system, the experiments were performed on two popular English datasets as well.
17

Jairath, V., T. P. Leahy, R. Potluri, et al. "P849 Bayesian network meta-analysis of the efficacy of advanced therapies for patients with moderately to severely active ulcerative colitis naïve to advanced therapy." Journal of Crohn's and Colitis 18, Supplement_1 (2024): i1576–i1577. http://dx.doi.org/10.1093/ecco-jcc/jjad212.0979.

Abstract:
Background: Etrasimod is an oral, once-daily, selective sphingosine 1-phosphate (S1P1,4,5) receptor modulator for the treatment of moderately to severely active ulcerative colitis (UC). In the absence of head-to-head randomised controlled trials (RCTs), network meta-analyses (NMA) offer insight into the comparative effectiveness of treatment options. NMA were conducted to examine the relative efficacy of etrasimod vs other advanced therapies (AT) with licensed dosing (European Medicines Agency) for the treatment of UC in patients naïve to biologic agents and/or Janus kinase inhibitors. Methods: A systematic literature review (SLR) was performed on 15 November 2022, and covered all available records without time limit, using NICE DSU and PRISMA guidelines. NMA were conducted under a Bayesian framework, and a multinomial fixed-effect approach was used to model outcomes, clinical response and clinical remission, in the induction phase and among induction phase responders in the maintenance phase, in patients naïve to AT. Reported outcomes from trials with a treat-through design, such as ELEVATE UC 52, were recalculated to mimic those of a responder re-randomisation design; only responders in the induction phase were analysed in the maintenance phase. Data are presented as median relative risk (RR) of the treatment vs its comparator, along with corresponding 95% credible intervals (CrI). Prespecified sensitivity analyses were performed. Results: Of 81 studies identified from the SLR, 21 and 11 RCTs were included in the induction and maintenance networks, respectively. For induction and maintenance phases, all therapies demonstrated benefit over placebo, consistent with phase 3 clinical trial data. In the NMA for clinical remission in the induction phase, etrasimod 2 mg had a statistically significant benefit over adalimumab 80/40 mg and 160/80 mg (RR [95% CrI] for treatment vs etrasimod 0.49 [0.29–0.78] and 0.67 [0.50–0.92], respectively), filgotinib 100 mg (0.55 [0.37–0.84]) and placebo; conversely, upadacitinib 45 mg had statistically significant benefit vs etrasimod (1.47 [1.07–2.03]; Table). There were no statistically significant differences for etrasimod vs other comparators. Similar results were observed for clinical response. In the maintenance phase, there were no statistically significant differences between etrasimod 2 mg and other treatments for clinical remission and clinical response (Table). Conclusion: With respect to clinical remission and clinical response during induction and maintenance phases, etrasimod efficacy was similar to most comparators as a first-line AT. Differences in trial design and risk-benefit profiles of AT should be considered when interpreting NMA results.
18

Budi, Agus Setia, and Arda Gusema Susilowati. "SISTEM PENDUKUNG KEPUTUSAN PENERIMAAN BEASISWA MENGGUNAKAN METODE MULTINOMIAL NAIVE BAYES." Jurnal Advanced Research Informatika 1, no. 01 (2023): 13–19. http://dx.doi.org/10.24929/jars.v1i01.2447.

Abstract:
Pursuing an education matters for a better future, especially for high-achieving students, yet some have to drop out partway because they cannot afford the cost. Universitas Islam Lamongan awards scholarships to high-achieving students from low-income backgrounds. Scholarships are awarded according to predefined criteria, and applicants are then ranked using Excel. To address this, a decision support system for scholarship acceptance is built using the Multinomial Naive Bayes method, and the accuracy of the resulting system is measured. The aim of this research is to build a decision support system for scholarship acceptance using the Multinomial Naive Bayes method. A decision support system provides a solution to an existing problem. This study uses the Multinomial Naive Bayes method, which the authors describe as the best of the classification methods. The data used in the calculation are student criteria data from the previous year, obtained from the scholarship office of Universitas Islam Lamongan. The Multinomial Naive Bayes calculation finds the probability of each category and then selects the highest value. The highest value is the result of the calculation. Based on the calculations performed, the decision support system for scholarship acceptance using the Multinomial Naive Bayes method has an accuracy of 92.73%. This accuracy indicates that the system is highly suitable for use.
19

Jin, Xin, Wengang Zhou, and Rongfang Bie. "Multinomial event naive Bayesian modeling for SAGE data classification." Computational Statistics 22, no. 1 (2007): 133–43. http://dx.doi.org/10.1007/s00180-007-0029-0.

20

Sanrilla, Sanrilla, Natalis Ransi, La Surimi La Surimi, Andi Tenriawaru Andi Tenriawaru, and La Ode Saidi La Ode Saidi. "ANALISIS SENTIMEN MASYARAKAT TERHADAP TOKO ONLINE APLIKASI SHOPEE MENGGUNAKAN METODE MULTINOMIAL NAÏVE BAYES." Jurnal Matematika Komputasi dan Statistika 2, no. 2 (2022): 68–75. http://dx.doi.org/10.33772/jmks.v2i2.9.

Abstract:
This study analyzes public sentiment toward an online store on the Shopee application using the Multinomial Naïve Bayes method. The aim is to determine the level of public sentiment from product reviews of the 3second online store on Shopee using the Multinomial Naïve Bayes method. Reviews can easily be analyzed by looking at the number of stars given by buyers, but the star count cannot represent the full content of a review. The whole review comment must be read to understand its overall meaning. It is quite possible to analyze reviews manually, one by one, but when there are many reviews it is faster to use a sentiment analysis system based on classification. The Multinomial Naïve Bayes algorithm is an extension of Naïve Bayes used for sentiment analysis, because it is intended to classify text into positive and negative categories. Based on the tests performed, the Multinomial Naïve Bayes method worked well in this study. This is shown by the confusion matrix comparing the system's classification with that of a linguist, which gave an accuracy of 91%, with recall and precision of 65.93% and 60% for the positive class and 34.06% and 31% for the negative class.
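As a reminder of how accuracy, precision and recall are normally derived from a binary confusion matrix, here is a minimal sketch. The counts are invented for illustration and are not the study's data; the function name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) for one class of a 2x2 confusion matrix."""
    precision = tp / (tp + fp) if tp + fp else 0.0   # correct positives / predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0      # correct positives / actual positives
    accuracy = (tp + tn) / (tp + fp + fn + tn)       # all correct / all examples
    return precision, recall, accuracy

# Hypothetical counts, not taken from the cited study.
precision, recall, accuracy = binary_metrics(tp=60, fp=10, fn=20, tn=110)
print(precision, recall, accuracy)
```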
21

Chebil, Wiem, Mohammad Wedyan, Moutaz Alazab, Ryan Alturki, and Omar Elshaweesh. "Improving Semantic Information Retrieval Using Multinomial Naive Bayes Classifier and Bayesian Networks." Information 14, no. 5 (2023): 272. http://dx.doi.org/10.3390/info14050272.

Abstract:
This research proposes a new approach to improve information retrieval systems based on a multinomial naive Bayes classifier (MNBC), Bayesian networks (BNs), and a multi-terminology which includes MeSH thesaurus (Medical Subject Headings) and SNOMED CT (Systematized Nomenclature of Medicine of Clinical Terms). Our approach, which is entitled improving semantic information retrieval (IMSIR), extracts and disambiguates concepts and retrieves documents. Relevant concepts of ambiguous terms were selected using probability measures and biomedical terminologies. Concepts are also extracted using an MNBC. The UMLS (Unified Medical Language System) thesaurus was then used to filter and rank concepts. Finally, we exploited a Bayesian network to match documents and queries using a conceptual representation. Our main contribution in this paper is to combine a supervised method (MNBC) and an unsupervised method (BN) to extract concepts from documents and queries. We also propose filtering the extracted concepts in order to keep relevant ones. Experiments of IMSIR using the two corpora, the OHSUMED corpus and the Clinical Trial (CT) corpus, were interesting because their results outperformed those of the baseline: the P@50 improvement rate was +36.5% over the baseline when the CT corpus was used.
22

Kaushik, Keshav, Akashdeep Bhardwaj, Susheela Dahiya, et al. "Multinomial Naive Bayesian Classifier Framework for Systematic Analysis of Smart IoT Devices." Sensors 22, no. 19 (2022): 7318. http://dx.doi.org/10.3390/s22197318.

Abstract:
Businesses need to use sentiment analysis, powered by artificial intelligence and machine learning to forecast accurately whether or not consumers are satisfied with their offerings. This paper uses a deep learning model to analyze thousands of reviews of Amazon Alexa to predict customer sentiment. The proposed model can be directly applied to any company with an online presence to detect customer sentiment from their reviews automatically. This research aims to present a suitable method for analyzing the users’ reviews of Amazon Echo and categorizing them into positive or negative thoughts. A dataset containing reviews of 3150 users has been used in this research work. Initially, a word cloud of positive and negative reviews was plotted, which gave a lot of insight from the text data. After that, a deep learning model using a multinomial naive Bayesian classifier was built and trained using 80% of the dataset. Then the remaining 20% of the dataset was used to test the model. The proposed model gives 93% accuracy. The proposed model has also been compared with four models used in the same domain, outperforming three.
23

Mohammed Ali, Sura I., Marwah Nihad, Hussien Mohamed Sharaf, and Haitham Farouk. "Machine learning for text document classification-efficient classification approach." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 703. http://dx.doi.org/10.11591/ijai.v13.i1.pp703-710.

Abstract:
Numerous alternative methods for text classification have been created because of the increase in the amount of online text information available. The cosine similarity classifier is the most extensively utilized simple and efficient approach. It improves text classification performance. It is combined with estimated values provided by conventional classifiers such as Multinomial Naive Bayesian (MNB). Consequently, combining the similarity between a test document and a category with the estimated value for the category enhances the performance of the classifier. This approach provides a text document categorization method that is both efficient and effective. In addition, methods for determining the proper relationship between a set of words in a document and its document categorization are also obtained.
24

Rodrigo, Enrique G., Juan C. Alfaro, Juan A. Aledo, and José A. Gámez. "Mixture-Based Probabilistic Graphical Models for the Label Ranking Problem." Entropy 23, no. 4 (2021): 420. http://dx.doi.org/10.3390/e23040420.

Abstract:
The goal of the Label Ranking (LR) problem is to learn preference models that predict the preferred ranking of class labels for a given unlabeled instance. Different well-known machine learning algorithms have been adapted to deal with the LR problem. In particular, fine-tuned instance-based algorithms (e.g., k-nearest neighbors) and model-based algorithms (e.g., decision trees) have performed remarkably well in tackling the LR problem. Probabilistic Graphical Models (PGMs, e.g., Bayesian networks) have not been considered to deal with this problem because of the difficulty of modeling permutations in that framework. In this paper, we propose a Hidden Naive Bayes classifier (HNB) to cope with the LR problem. By introducing a hidden variable, we can design a hybrid Bayesian network in which several types of distributions can be combined: multinomial for discrete variables, Gaussian for numerical variables, and Mallows for permutations. We consider two kinds of probabilistic models: one based on a Naive Bayes graphical structure (where only univariate probability distributions are estimated for each state of the hidden variable) and another where we allow interactions among the predictive attributes (using a multivariate Gaussian distribution for the parameter estimation). The experimental evaluation shows that our proposals are competitive with the state-of-the-art algorithms in both accuracy and CPU time requirements.
25

Mandasari, Sartika, Roslina Roslina, and B. Herawan Hayadi. "Text Mining In Online Transportation User Sentiment Analysis On Social Media Twitter Using The Multinomial Naive Bayesian Classifier Method And K-nearest Neighbor Method." International Conference on Information Science and Technology Innovation (ICoSTEC) 2, no. 1 (2023): 192–97. http://dx.doi.org/10.35842/icostec.v2i1.61.

Abstract:
Text mining is the process of detecting new information and examining large volumes of data. Text mining can also perform an analysis of unstructured text. Social media users in Indonesia, who currently number almost 200 million, have produced a flood of data. This condition makes text mining a solution to extract knowledge from the flood of data [1]. In exploring knowledge, there are various techniques or methods that can be adopted, including the Multinomial Naive Bayesian Classifier and K-Nearest Neighbor methods. Both of these methods have several phases that are able to explore the potential knowledge of a flood of supervised and unsupervised learning data. It is hoped that the combination of these two methods will help analyze public sentiment or perception towards online motorcycle taxi users in Indonesia [2].
26

Saikia, Hemanta, and Dibyojyoti Bhattacharjee. "On Classification of All-rounders of the Indian Premier League (IPL): A Bayesian Approach." Vikalpa: The Journal for Decision Makers 36, no. 4 (2011): 51–66. http://dx.doi.org/10.1177/0256090920110404.

Abstract:
An all-rounder can play an important role in any version of the game of cricket, whether it is a test match or any other limited-over format of the game. The study classifies the performance of all-rounders who participated in IPL based on their strike rate and economy rate. Based on the factors mentioned, the all-rounders can be divided into four non-overlapping classes, viz., Performer, Batting All-rounder, Bowling All-rounder, and Under-performer. Several predictor variables that are supposed to influence the performance of all-rounders are considered. Step-wise multinomial logistic regression (SMLR) is used to identify the significant predictors. Samples of six incumbent all-rounders who had not participated in the first three seasons of IPL are considered. The significant predictors were then used to predict the expected class of an incumbent all-rounder using a naive Bayesian classification model. The relevant data were collected from the websites www.cricinfo.org and www.cricketnirvana.com. The key points of this study are as follows: The training sample is populated with 35 all-rounders who had performed in the first three seasons of IPL. Two variables, viz., strike rate (number of runs scored per 100 balls faced) and economy rate (average number of runs scored per over against the bowler) are used to classify the all-rounders as follows: Performer: An all-rounder with strike rate above median and economy rate below median. Batting All-rounder: An all-rounder with strike rate above median and economy rate above median. Bowling All-rounder: An all-rounder with strike rate below median and economy rate below median. Under-performer: An all-rounder with strike rate below median and economy rate above median. The step-wise multinomial logistic regression (SMLR) was used to identify the significant variables that are actually responsible for classification of the all-rounders. 
The strike rate in ODI, strike rate in Twenty-20, economy rate in ODI, economy rate in Twenty-20 and bowling type (Spin or Fast) of the all-rounders are found to be significant in determining the class of an all-rounder. The naive Bayesian classification model is used for forecasting the expected class of all-rounders based on the significant predictors for six incumbent all-rounders who had played only in the fourth season of IPL. The predictions made before IPL IV were then compared with the actual situation at the end of the tournament. It is found that four of the six predictions were correct. This model would be useful for the participating teams' management while deciding the bid of an all-rounder in the upcoming season of IPL as per their requirement.
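The four-class labelling rule described in this abstract can be sketched as follows. This is only an illustration of the median-split rule, not the authors' code: the function name `classify_all_rounder` and the four sample players are hypothetical, and treating a value exactly equal to the median as "below" is an assumption, since the abstract does not say how ties are handled.

```python
from statistics import median


def classify_all_rounder(strike_rate, economy_rate, median_sr, median_er):
    """Label an all-rounder using the median-split rule from the abstract.

    Assumption: a value equal to the median counts as "below" the median.
    """
    high_sr = strike_rate > median_sr    # scores quickly when batting
    high_er = economy_rate > median_er   # concedes many runs when bowling
    if high_sr and not high_er:
        return "Performer"
    if high_sr and high_er:
        return "Batting All-rounder"
    if not high_sr and not high_er:
        return "Bowling All-rounder"
    return "Under-performer"


# Hypothetical (strike rate, economy rate) pairs, invented for illustration.
players = {
    "A": (140.0, 7.0),
    "B": (150.0, 9.0),
    "C": (110.0, 6.5),
    "D": (100.0, 9.5),
}
median_sr = median(sr for sr, _ in players.values())
median_er = median(er for _, er in players.values())
classes = {
    name: classify_all_rounder(sr, er, median_sr, median_er)
    for name, (sr, er) in players.items()
}
print(classes)
```

These class labels are the targets that the study then predicts for new players from the significant predictors.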
27

Barchuk, Anton, A. Atroshchenko, V. Gaydukov, et al. "AUTOMATED DIAGNOSIS IN A POPULATION-BASED SCREENING FOR LUNG CANCER." Problems in oncology 63, no. 2 (2017): 215–20. http://dx.doi.org/10.37469/0507-3758-2017-63-2-215-220.

Abstract:
Oncologists nowadays are faced with a large amount of heterogeneous medical data from diagnostic studies. Possible errors in determining the nature and extent of spread of the tumor process will inevitably reduce the effectiveness of treatment and increase unnecessary costs. To reduce the burden on clinicians, various computer-aided solutions based on machine learning algorithms are being developed. We attempted to evaluate the effectiveness of thirteen machine learning algorithms in classifying pathologic tissue samples from the cancerous thorax based on gene expression levels. For a preliminary study we used an open data set on the molecular genetic composition of lung adenocarcinoma and pleural mesothelioma. The effectiveness of the machine learning algorithms was evaluated by the Matthews correlation coefficient and the Area Under the ROC Curve. The best results were shown by two methods: Bayesian logistic regression and the Discriminative Multinomial Naive Bayes classifier. Nevertheless, all of the methods were effective at automatic discrimination of the two types of cancer. This shows that machine learning algorithms are applicable in lung cancer classification. Future studies will carry out a similar analysis of the diagnostic value of these methods for other malignancies with more complex differential morphological diagnosis. Similar methods can be applied to other diagnostic studies, including computerized tomography image analysis in the differential diagnosis of lung nodules.
28

Wang, Peipei, Kun Wang, Yunhan Huang, and Peter Fenn. "A contingency approach for time-cost trade-off in construction projects based on machine learning techniques." Engineering, Construction and Architectural Management, May 29, 2023. http://dx.doi.org/10.1108/ecam-11-2022-1104.

Abstract:
Purpose: Time-cost trade-off is normal conduct in construction projects when projects are expectedly late for delivery. Existing research on time-cost trade-off strategic management mostly focused on the technical calculation towards the optimal combination of activities to be accelerated, while the managerial aspects are mostly neglected. This paper aims to understand the managerial efforts necessary to prepare construction projects ready for an upcoming trade-off implementation. Design/methodology/approach: A preliminary list of critical factors was first identified from the literature and verified by a Delphi survey. Quantitative data was then collected by a questionnaire survey to first shortlist the preliminary factors and quantify the predictive model with different machine learning algorithms, i.e. k-nearest neighbours (kNN), radial basis function (RBF), multilayer perceptron (MLP), multinomial logistic regression (MLR), naïve Bayes classifier (NBC) and Bayesian belief networks (BBNs). Findings: The model's independent variable importance ranking revealed that the top challenges faced were the realism of contractual obligation, contractor planning and control and client management and monitoring. Among the tested machine learning algorithms, multilayer perceptron was demonstrated to be the most suitable in this case. This model accuracy reached 96.5% with the training dataset and 95.6% with an independent test dataset and could be used as the contingency approach for time-cost trade-offs. Originality/value: The identified factor list contributed to the theoretical explanation of the failed implementation in general and practical managerial improvement to better avoid such failure. In addition, the established predictive model provided an ad-hoc early warning and diagnostic tool to better ensure time-cost implementation success.
29

"Aspect Based Opinion Mining on Mobile Product." International Journal of Innovative Technology and Exploring Engineering 9, no. 5 (2020): 2026–31. http://dx.doi.org/10.35940/ijitee.e3059.039520.

Abstract:
In this era, web technology is growing quickly. Many people express their feedback related to products, social issues and services. As e-commerce sites become more popular, the number of customer reviews related to products grows quickly. Due to this growth it is very difficult for the customer to read a huge amount of data and decide whether to buy a product or not. It is also very difficult for the manufacturer of the product to manage and focus on customer opinions. In this research we focus on mobile product reviews extracted from the Kaggle site. In this experiment we have focused on reviews of one particular mobile product brand, Samsung. After data collection we do pre-processing, and further we extract each aspect and its corresponding opinion using natural language processing and then categorize whether the extracted opinion is positive or negative by finding the polarity of each extracted opinion word. Finally, performance evaluation is done using two machine learning algorithms, i.e. the Multinomial Naive Bayesian (MNB) and K-Nearest Neighbour (K-NN) algorithms. This performance evaluation is calculated based on bag of words. Of the two algorithms, K-NN gave better accuracy than Multinomial Naive Bayesian.
30

Leguey, Ignacio, Concha Bielza, and Pedro Larrañaga. "Circular Bayesian classifiers using wrapped Cauchy distributions." July 1, 2019. https://doi.org/10.5281/zenodo.3743300.

Abstract:
Capturing the dependences among circular variables within supervised classification models is a challenging task. In this paper, we propose four different supervised Bayesian classification algorithms where the predictor variables follow all circular wrapped Cauchy distributions. For this purpose, we introduce four wrapped Cauchy classifiers. The bivariate wrapped Cauchy distribution is the only bivariate circular distribution whose marginals and conditionals are also wrapped Cauchy distributions, a property that makes it possible to define these models easily. Furthermore, the wrapped Cauchy tree-augmented naive Bayes classifier requires the definition of a conditional circular mutual information measure between variables that follow wrapped Cauchy distributions. Synthetic data is used to illustrate, compare and evaluate the classification algorithms (including a comparison with the Gaussian TAN classifier, decision tree, random forest, multinomial logistic regression, support vector machine and simple neural network), leading to satisfactory predictive results. We also use a real neuromorphological dataset obtained from juvenile rat somatosensory cortex cells, where we measure the bifurcation angles of the dendritic basal arbors.
31

Chung, Hyesun, and X. Jessie Yang. "Predicting Trust Dynamics With Personal Characteristics." Proceedings of the Human Factors and Ergonomics Society Annual Meeting, September 9, 2024. http://dx.doi.org/10.1177/10711813241260383.

Abstract:
Previous research into trust dynamics in human-autonomy interaction has demonstrated that individuals exhibit specific patterns of trust when interacting repeatedly with automated systems. Moreover, people with different types of trust dynamics have been shown to differ across seven personal characteristic dimensions: masculinity, positive affect, extraversion, neuroticism, intellect, performance expectancy, and high expectations. In this study, we develop classification models aimed at predicting an individual’s trust dynamics type–categorized as Bayesian decision-maker, disbeliever, or oscillator–based on these key dimensions. We employed multiple classification algorithms including the random forest classifier, multinomial logistic regression, Support Vector Machine, XGBoost, and Naive Bayes, and conducted a comparative evaluation of their performance. The results indicate that personal characteristics can effectively predict the type of trust dynamics, achieving an accuracy rate of 73.1%, and a weighted average F1 score of 0.64. This study underscores the predictive power of personal traits in the context of human-autonomy interaction.
32

Zou, Yutong, and Fang Liu. "#1756 Development and internal validation of machine learning algorithms for mortality prediction model of people with DM and CKD." Nephrology Dialysis Transplantation 39, Supplement_1 (2024). http://dx.doi.org/10.1093/ndt/gfae069.501.

Abstract:
Background and Aims: Diabetes Mellitus (DM) and Chronic Kidney Disease (CKD) are leading causes of morbidity and mortality, presenting challenges in patient management. The aim of this study is to develop a machine learning-based predictive model for mortality in DM and CKD patients, improving early intervention and treatment personalization. Method: 3,637 participants with DM and CKD from the National Health and Nutrition Examination Survey (NHANES) from 1999 to 2018 were included. All-cause mortality was ascertained by linkage to National Death Index records through 31 December 2019 in NHANES. The dataset included clinical profiles, demographic details, and laboratory results of patients with DM and CKD. We performed 100 random groupings, dividing the data into a 75% training set and a 25% validation set each time. We then visualized the results of 100 AUC measurements using a boxplot. The objective was to forecast patient survival over 1, 3, 5, and 10-year horizons using a variety of machine learning algorithms. The algorithms tested included Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (KNN), Gaussian Naive Bayes (GNB), Multinomial Naive Bayes (MNB), Bayesian Network Classifier (BNC), AdaBoost (Ada), Gradient Boosting (Gradient), and Extreme Gradient Boosting (XG). The performance of each algorithm was evaluated based on the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC), with the results visualized in boxplot format. Results: From the boxplots, it is evident that the Random Forest (RF) algorithm consistently exhibits one of the highest AUCs on average across all prediction intervals, suggesting a strong and stable predictive capacity. In the 1-year survival prediction, the RF algorithm demonstrates the highest AUC, implying consistent performance. 
The 3 and 5-year predictions reveal a slight dip in AUC for most models, including RF, but it still remains one of the top-performing algorithms with fewer outliers than XG and Gradient Boosting. The mean AUC of RF is particularly noteworthy at the 10-year mark, where most algorithms struggle with prediction accuracy, as indicated by lower median AUCs and greater variability. Notably, the RF algorithm identified six major factors for 1-year survival prediction: systolic blood pressure, uric acid, blood urea nitrogen, lactate dehydrogenase, lymphocyte count and total cholesterol, as well as three major factors for 10-year survival prediction: age, lymphocyte count and gamma-glutamyl transferase. Conclusion: Overall, the study indicates that while machine learning can be a powerful tool for survival prediction in patients with DM and CKD, the choice of algorithm is crucial, with RF standing out for its consistent and reliable performance.
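The evaluation protocol in this abstract (repeated random 75%/25% splits, one AUC per split) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the names `auc` and `repeated_split_auc`, the `fit` callback interface, and the synthetic data are all hypothetical, and skipping splits whose validation set contains only one class is a design choice made here because AUC is undefined in that case.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_split_auc(rows, labels, fit, n_repeats=100, train_frac=0.75, seed=0):
    """Return one validation AUC per random train/validation split.

    `fit(train_rows, train_labels)` is a hypothetical callback that must
    return a function mapping a row to a risk score.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        cut = int(train_frac * len(indices))
        train, valid = indices[:cut], indices[cut:]
        valid_labels = [labels[i] for i in valid]
        if len(set(valid_labels)) < 2:
            continue  # AUC is undefined when only one class is present
        score = fit([rows[i] for i in train], [labels[i] for i in train])
        aucs.append(auc([score(rows[i]) for i in valid], valid_labels))
    return aucs


# Synthetic example: the "model" simply returns the single feature as the score.
rows = list(range(40))
labels = [1 if x >= 20 else 0 for x in rows]
aucs = repeated_split_auc(rows, labels, fit=lambda train_rows, train_labels: (lambda x: x))
print(len(aucs), min(aucs), max(aucs))
```

The resulting list of AUC values is what a boxplot per algorithm and per prediction horizon would summarize.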