Journal articles on the topic 'UCI repository'

Author: Grafiati

Published: 4 June 2025

Last updated: 1 August 2025

Consult the top 50 journal articles for your research on the topic 'UCI repository.'

1

Macià, Núria, and Ester Bernadó-Mansilla. "Towards UCI+: A mindful repository design." Information Sciences 261 (March 2014): 237–62. http://dx.doi.org/10.1016/j.ins.2013.08.059.

2

Chu, Xianghua, Shuxiang Li, Da Gao, Wei Zhao, Jianshuang Cui, and Linya Huang. "A Binary Superior Tracking Artificial Bee Colony with Dynamic Cauchy Mutation for Feature Selection." Complexity 2020 (November 9, 2020): 1–13. http://dx.doi.org/10.1155/2020/8864315.

Abstract:

This paper aims to propose an improved learning algorithm for feature selection, termed as binary superior tracking artificial bee colony with dynamic Cauchy mutation (BSTABC-DCM). To enhance exploitation capacity, a binary learning strategy is proposed to enable each bee to learn from the superior individuals in each dimension. A dynamic Cauchy mutation is introduced to diversify the population distribution. Ten datasets from UCI repository are adopted as test problems, and the average results of cross-validation of BSTABC-DCM are compared with other seven popular swarm intelligence metaheuri

3

Дюк, В. А., И. Г. Малыгин, and В. И. Прицкер. "Vehicle recognition by silhouettes – a three-stage machine learning method in computer vision systems." MORSKIE INTELLEKTUAL`NYE TEHNOLOGII, no. 2(56) (June 9, 2022): 162–67. http://dx.doi.org/10.37220/mit.2022.56.2.022.

Abstract:

In seaport areas and on sea and land routes, accounting for and monitoring various vehicles is a pressing task. Technical recognition systems that rely on video cameras are increasingly used to solve it. However, for a number of reasons, video images are not always of high quality. The problem of recognizing vehicles from heavily coarsened images of them, namely silhouettes, is therefore of theoretical and practical interest. Our study uses experimental material from the UCI data repository (UCI Machine Learning Repository

4

Lee, Chun-Yao, and Guang-Lin Zhuo. "A Hybrid Whale Optimization Algorithm for Global Optimization." Mathematics 9, no. 13 (2021): 1477. http://dx.doi.org/10.3390/math9131477.

Abstract:

This paper proposes a hybrid whale optimization algorithm (WOA) that is derived from the genetic and thermal exchange optimization-based whale optimization algorithm (GWOA-TEO) to enhance global optimization capability. First, the high-quality initial population is generated to improve the performance of GWOA-TEO. Then, thermal exchange optimization (TEO) is applied to improve exploitation performance. Next, a memory is considered that can store historical best-so-far solutions, achieving higher performance without adding additional computational costs. Finally, a crossover operator based on t

5

P., Ashok, and G. M. Kadhar Nawaz. "Outlier Detection Method on UCI Repository Dataset by Entropy Based Rough K-means." Defence Science Journal 66, no. 2 (2016): 113. http://dx.doi.org/10.14429/dsj.66.9463.

Abstract:

Rough set theory is used to handle uncertainty and incomplete information by applying two sets, lower and upper approximation. In this paper, the clustering process is improved by adapting the preliminary centroid selection method on rough K-means (RKM) algorithm. The entropy based rough K-means (ERKM) method is developed by adapting entropy based preliminary centroids selection on RKM and executed and also validated by cluster validity indexes. An example shows that the ERKM performs effectively by selection of entropy based preliminary centroid. In addition, Outlier detection is an

6

Naz, Mehreen, Kashif Zafar, and Ayesha Khan. "Ensemble Based Classification of Sentiments Using Forest Optimization Algorithm." Data 4, no. 2 (2019): 76. http://dx.doi.org/10.3390/data4020076.

Abstract:

Feature subset selection is a process to choose a set of relevant features from a high dimensionality dataset to improve the performance of classifiers. The meaningful words extracted from data forms a set of features for sentiment analysis. Many evolutionary algorithms, like the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), have been applied to feature subset selection problem and computational performance can still be improved. This research presents a solution to feature subset selection problem for classification of sentiments using ensemble-based classifiers. It consists o

7

Kurniawan, Ilham. "Prediksi Gejala Autism Spectrum Disorders pada Remaja Menggunakan Optimasi Particle Swarm Optimization dan Algoritma Support Vector Machine." INFORMATICS FOR EDUCATORS AND PROFESSIONAL : Journal of Informatics 4, no. 2 (2020): 113. http://dx.doi.org/10.51211/itbi.v4i2.1306.

Abstract:

The prevalence of Autism Spectrum Disorder (ASD) diagnoses has increased globally over the past decade. Updated, overall estimates of ASD prevalence in Asia will help health professionals develop relevant public health strategies. This study proposes a method for predicting ASD symptoms that integrates PSO feature selection with the Support Vector Machine algorithm. The study uses a dataset from the UCI repository. The proposed model includes applying feature selection using particle swarm optimization (PSO),

8

Yamaguchi, Naoto, Mao Wu, Michinori Nakata, and Hiroshi Sakai. "Application of Rough Set-Based Information Analysis to Questionnaire Data." Journal of Advanced Computational Intelligence and Intelligent Informatics 18, no. 6 (2014): 953–61. http://dx.doi.org/10.20965/jaciii.2014.p0953.

Abstract:

This article reports an application of Rough Nondeterministic Information Analysis (RNIA) to two data sets. One is the Mushroom data set in the UCI machine learning repository, and the other is a student questionnaire data set. Even though these data sets include many missing values, we obtained some interesting rules by using our getRNIA software tool. This software is powered by the NIS-Apriori algorithm, and we apply rule generation and question-answering functionalities to data sets with nondeterministic values.

9

Zouggar, Souad Taleb, and Abdelkader Adla. "Proposal for Measuring Quality of Decision Trees Partition." International Journal of Decision Support System Technology 9, no. 4 (2017): 16–36. http://dx.doi.org/10.4018/ijdsst.2017100102.

Abstract:

To compute a partition quality for a decision tree, we propose a new measure called NIM “New Information Measure”. The measure is simpler, provides similar performance, and sometimes outperforms the existing measures used with tree-based methods. The experimental results using the MONITDIAB application (Taleb & Atmani, 2013) and datasets from the UCI repository (Asuncion & Newman, 2007) confirm the classification capabilities of our proposal in comparison to the Shannon measure used with ID3 and C4.5 decision tree methods.

10

Mbevi, Rose Mueni, John Kamau, and Faith Mueni Musyoka. "Content Based Approach for Detecting Smishing Messages in Mobile Phones Using an Improved Convolutional Neural Networks Model." African Journal of Empirical Research 6, no. 2 (2025): 188–204. https://doi.org/10.51867/ajernet.6.2.17.

Abstract:

SMS stands for Short Message Service (SMS). Short messaging service is a text messaging service where a user can send short messages via a mobile device. Short message service has evolved and become very popular as a communication medium in the last decade. It has become a more effective mode of communication compared to email. Unfortunately, smishing (SMS phishing) has emerged as the most common type of spam because traditional detection methods have difficulty understanding the informal nature of these messages. An improved class of CNN-based models targeted at accurate detection of smishing

11

Setiawati, Intan, Adityo Permana, and Arief Hermawan. "IMPLEMENTASI DECISION TREE UNTUK MENDIAGNOSIS PENYAKIT LIVER." Journal of Information System Management (JOISM) 1, no. 1 (2019): 13–17. http://dx.doi.org/10.24076/joism.2019v1i1.17.

Abstract:

The liver is one of the most important human organs. The UCI Machine Learning Repository holds many datasets, one of which is ILPD (Indian Liver Patient Dataset). This study discusses the classification of liver disease on the ILPD dataset using the C4.5 decision tree algorithm. The results show that the C4.5 decision tree algorithm achieves an accuracy of 72.67% and also demonstrate that, of the 11 liver disease variables in the ILPD dataset, only 2 variables (Almine Alminotransferase) are central in determ

12

Prahartiwi, Lusa Indah, and Wulan Dari. "Komparasi Algoritma Naive Bayes, Decision Tree dan Support Vector Machine untuk Prediksi Penyakit Kanker Payudara." Jurnal Teknik Komputer 7, no. 1 (2021): 51–54. http://dx.doi.org/10.31294/jtk.v7i1.9191.

Abstract:

Breast cancer is the most common cancer among women worldwide, accounting for 25.4% of all new cases diagnosed in 2018. Cancer is a large group of diseases that can start in almost any organ or tissue of the body when abnormal cells grow uncontrollably, go beyond their usual boundaries to invade adjoining parts of the body, and/or spread to other organs. Breast cancer can be predicted using data mining knowledge. Data mining can uncover meaningful new correlations, patterns, and trends by sifting through large amounts of data

13

Gunawan, Gunawan, Abd Charis Fauzan, and Harliana Harliana. "Implementasi Algoritma Decision Tree Iterative Dichotomiser 3 (ID3) untuk Prediksi Keberhasilan Pengobatan Penyakit Kutil Menggunakan Cryotherapy." Jurnal Bumigora Information Technology (BITe) 4, no. 1 (2022): 73–82. http://dx.doi.org/10.30812/bite.v4i1.1949.

Abstract:

Warts are one cause of skin health problems and are marked by small bumps on the skin. The condition is caused by the human papillomavirus (HPV). Cryotherapy is one of the wart treatments recommended by some health professionals. The procedure freezes the wart with liquid nitrogen. This study aims to predict the success rate of wart treatment so that preventive measures can be developed. This study uses data

14

Ibrahim, Ashraf Osman, Walaa Akif Hussien, Ayat Mohammoud Yagoop, and Mohd Arfian Ismail. "Feature Selection and Radial Basis Function Network for Parkinson Disease Classification." Kurdistan Journal of Applied Research 2, no. 3 (2017): 167–71. http://dx.doi.org/10.24017/science.2017.3.121.

Abstract:

Recently, several works have focused on detection of a different disease using computational intelligence techniques. In this paper, we applied feature selection method and radial basis function neural network (RBFN) to classify the diagnosis of Parkinson’s disease. The feature selection (FS) method used to reduce the number of attributes in Parkinson disease data. The Parkinson disease dataset is acquired from UCI repository of large well-known data sets. The experimental results have revealed significant improvement to detect Parkinson’s disease using feature selection method and RBF network

15

Yılmaz, Ersen. "An Expert System Based on Fisher Score and LS-SVM for Cardiac Arrhythmia Diagnosis." Computational and Mathematical Methods in Medicine 2013 (2013): 1–6. http://dx.doi.org/10.1155/2013/849674.

Abstract:

An expert system having two stages is proposed for cardiac arrhythmia diagnosis. In the first stage, Fisher score is used for feature selection to reduce the feature space dimension of a data set. The second stage is classification stage in which least squares support vector machines classifier is performed by using the feature subset selected in the first stage to diagnose cardiac arrhythmia. Performance of the proposed expert system is evaluated by using an arrhythmia data set which is taken from UCI machine learning repository.

16

Mattiev, Jamolbek, and Branko Kavsek. "Coverage-Based Classification Using Association Rule Mining." Applied Sciences 10, no. 20 (2020): 7013. http://dx.doi.org/10.3390/app10207013.

Abstract:

Building accurate and compact classifiers in real-world applications is one of the crucial tasks in data mining nowadays. In this paper, we propose a new method that can reduce the number of class association rules produced by classical class association rule classifiers, while maintaining an accurate classification model that is comparable to the ones generated by state-of-the-art classification algorithms. More precisely, we propose a new associative classifier that selects “strong” class association rules based on overall coverage of the learning set. The advantage of the proposed classifie

17

Geldiev, Ertan Mustafa, Nayden Valkov Nenkov, and Mariana Mateeva Petrova. "EXERCISE OF MACHINE LEARNING USING SOME PYTHON TOOLS AND TECHNIQUES." CBU International Conference Proceedings 6 (September 25, 2018): 1062–70. http://dx.doi.org/10.12955/cbup.v6.1295.

Abstract:

One of the goals of predictive analytics training using Python tools is to create a "Model" from classified examples that classifies new examples from a Dataset. The purpose of different strategies and experiments is to create a more accurate prediction model. The goals we set out in the study are to achieve successive steps to find an accurate model for a dataset and preserving it for its subsequent use using the python instruments. Once we have found the right model, we save it and load it later, to classify if we have "phishing" in our case. In the case that the path we reach to the discove

18

Hamed, Samer, Abdelwadood Mesleh, and Abdullah Arabiyyat. "Breast Cancer Detection Using Machine Learning Algorithms." International Journal of Computer Science and Mobile Computing 10, no. 11 (2021): 4–11. http://dx.doi.org/10.47760/ijcsmc.2021.v10i11.002.

Abstract:

This paper presents a computer-aided design (CAD) system that detects breast cancers (BCs). BC detection uses random forest, AdaBoost, logistic regression, decision trees, naïve Bayes and conventional neural networks (CNNs) classifiers, these machine learning (ML) based algorithms are trained to predicting BCs (malignant or benign) on BC Wisconsin data-set from the UCI repository, in which attribute clump thickness is used as evaluation class. The effectiveness of these ML algorithms are evaluated in terms of accuracy and F-measure; random forest outperformed the other classifiers and achieved

19

Singh, Abhishek, Zohaib Hasan, and Saurabh Sharma. "Improving Banknote Authentication Accuracy through Logistic Regression and Data Preprocessing." International Journal of Innovative Research in Computer and Communication Engineering 10, no. 10 (2023): 11779–85. http://dx.doi.org/10.15680/ijircce.2022.1001044.

Abstract:

This study employs a logistic regression model to classify the authenticity of banknotes based on their variance, skewness, kurtosis, and entropy features. Using a dataset from the UCI Machine Learning Repository, the model demonstrates high accuracy in distinguishing between authentic and forged banknotes. The dataset is preprocessed, split into training and test sets, and scaled for optimal performance. The trained logistic regression model achieves an accuracy of 98.36%, illustrating its effectiveness. Additionally, the model's prediction capabilities are validated with new banknote data. T

20

Takenouchi, Takashi, and Shin Ishii. "A Multiclass Classification Method Based on Decoding of Binary Classifiers." Neural Computation 21, no. 7 (2009): 2049–81. http://dx.doi.org/10.1162/neco.2009.03-08-740.

Abstract:

In this letter, we present new methods of multiclass classification that combine multiple binary classifiers. Misclassification of each binary classifier is formulated as a bit inversion error with probabilistic models by making an analogy to the context of information transmission theory. Dependence between binary classifiers is incorporated into our model, which makes a decoder a type of Boltzmann machine. We performed experimental studies using a synthetic data set, data sets from the UCI repository, and bioinformatics data sets, and the results show that the proposed methods are superior t

21

Naseem, Rashid, Bilal Khan, Muhammad Arif Shah, et al. "Performance Assessment of Classification Algorithms on Early Detection of Liver Syndrome." Journal of Healthcare Engineering 2020 (December 12, 2020): 1–13. http://dx.doi.org/10.1155/2020/6680002.

Abstract:

In the recent era, a liver syndrome that causes any damage in life capacity is exceptionally normal everywhere throughout the world. It has been found that liver disease is exposed more in young people as a comparison with other aged people. At the point when liver capacity ends up, life endures just up to 1 or 2 days scarcely, and it is very hard to predict such illness in the early stage. Researchers are trying to project a model for early prediction of liver disease utilizing various machine learning approaches. However, this study compares ten classifiers including A1DE, NB, MLP, SVM, KNN,

22

Kurniawan, Muchamad, Maftahatul Hakimah, and Siti Agustini. "Perbandingan SVM dan Perceptron dengan Optimasi Heuristik." Jurnal Telematika 15, no. 2 (2021): 85–92. http://dx.doi.org/10.61769/telematika.v15i2.356.

Abstract:

Support Vector Machine (SVM) and Perceptron are methods used in machine learning to determine classification. Both methods have the same motivation, namely to get the dividing line (hyperplane). Hyperplane can be obtained by using the optimization method Gradient Descent (GD), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO). This study compares machine learning methods (Support Vector Machine and Perceptron) to optimization methods (Gradient Descent, Genetic Algorithm, and Particle Swarm Optimization) to find hyperplane. The dataset used is Iris Flower obtained from the UCI Machi

23

Muliawan, Agung, Achmad Rizal, and Sugondo Hadiyoso. "Heart Disease Prediction based on Physiological Parameters Using Ensemble Classifier and Parameter Optimization." Journal of Applied Engineering and Technological Science (JAETS) 5, no. 1 (2023): 258–67. http://dx.doi.org/10.37385/jaets.v5i1.2169.

Abstract:

This study describes the prediction of heart disease using ensemble classifiers with parameter optimization. As input, a public dataset was taken from UCI machine learning repository, which refers to the dataset at UCI Machine learning. The dataset consists of 13 variables that are considered to influence heart disease. Particle swarm optimization (PSO) was used for feature selection and principal component analysis (PCA) for feature extraction to reduce the features' dimensions. The application of parameter optimization on several machine learning methods such as SVM (Radial Basis Function),

24

Nanda, Bijaya Kumar, and Satchidananda Dehuri. "Ant Miner." International Journal of Applied Evolutionary Computation 11, no. 2 (2020): 47–64. http://dx.doi.org/10.4018/ijaec.2020040104.

Abstract:

Discovering classification rules from large data is an important task of data mining and is gaining considerable attention. This article presents a novel ant miner for classification rule mining. Our ant miner is inspired by research on the behavior of real ant colonies, simulated annealing, and some data mining concepts as well as principles. Here we present a Michigan style approach for single objective classification rule mining. The algorithm is tested on a few benchmark datasets drawn from UCI repository. Our experimental outcomes confirm that ant miner-HMC (Hybrid Michigan Style Classifi

25

Rattan, Vikas, Ruchi Mittal, and Varun Malik. "Comparative Study of Data Mining Classifiers for Students’ Academic Performance." Journal of Computational and Theoretical Nanoscience 17, no. 9 (2020): 4548–52. http://dx.doi.org/10.1166/jctn.2020.9277.

Abstract:

Tremendous growth of educational institutions forced educational institutes to adopt data mining techniques to bring out important and yet unknown facts from educational data to have a competitive edge over their counterparts. In this paper, student performance dataset comprises of 131 records is taken from UCI repository and data mining tool Orange is used to study the comparative analyses of accuracy for classifying the performance of student in graduation using four classifiers namely random forest, k nearest neighbor (KNN), decision tree and naïve bayes. The result shows that decision tree

26

Jin-Mao, Wei, Wang Shu-Qin, and Wang Ming-Yang. "Novel Approach to Decision-Tree Construction." Journal of Advanced Computational Intelligence and Intelligent Informatics 8, no. 3 (2004): 332–35. http://dx.doi.org/10.20965/jaciii.2004.p0332.

Abstract:

A new approach is presented, in which rough set theory is applied to select attributes as nodes of a decision tree. Initially, dataset is partitioned into subsets based on different condition attributes, then an attribute is chosen as a node for branching when the size of its corresponding implicit region is smaller than that of all other attributes. This approach is compared to the entropy-based method on some datasets from the UCI Machine Learning Database Repository, which instantiates the performance of the rough set approach. Statistical experiments showed that the proposed approach is fe

27

Maulidah, Nurlaelatul. "Prediksi Peningkatan Jumlah Nasabah Deposito Berjangka Menggunakan Algoritma KNN, Decision Tree, Random Forest Dan Xgboost." InComTech : Jurnal Telekomunikasi dan Komputer 13, no. 2 (2023): 90. http://dx.doi.org/10.22441/incomtech.v13i2.16921.

Abstract:

A bank is a financial institution generally established to collect funds from the public in the form of deposits and to channel them back to the public as credit or in other forms, with the aim of raising the general standard of living. This study tests four machine learning algorithms, namely K-Nearest Neighbor (K-NN), Decision Tree, Random Forest, and XGBoost, to determine and compare the accuracy of each in predicting growth in the number of bank time-deposit customers. In this study, the datase

28

Prince, Kumar. "Evaluating Machine Learning Algorithms for Enhanced Prediction of Student Academic Performance." International Journal of Innovative Science and Research Technology (IJISRT) 9, no. 12 (2025): 2581–84. https://doi.org/10.5281/zenodo.14613845.

Abstract:

This study aims to evaluate and compare the predictive performance of decision trees, random forests, support vector machines, and neural networks in forecasting student academic outcomes based on academic and demographic factors. The research utilizes a dataset from the UCI Machine Learning Repository, encompassing student performance data from Portuguese secondary schools. The results indicate that neural networks and random forests achieved the highest accuracy rates of 87.4% and 85.6%, respectively, suggesting their potential for effective educational analytics and early intervention strat

29

Lichode, Prof Rupatai. "Analysis of Machine Learning Algorithms for Heart Disease Prediction." International Journal for Research in Applied Science and Engineering Technology 12, no. 5 (2024): 2496–504. http://dx.doi.org/10.22214/ijraset.2024.62085.

Abstract:

In recent times, heart disease prediction is one of the most complicated tasks in medical field. In the modern era, approximately one person dies per minute due to heart disease. Data science plays a crucial role in processing huge amount of data in the field of healthcare. As heart disease prediction is a complex task, there is a need to automate the prediction process to avoid risks associated with it and alert the patient well in advance. This project makes use of heart disease clinical dataset available in UCI machine learning repository. The proposed Work predicts the chances of

30

Chakraborty, Subhadeep. "Multi-Disease Detection using Hybrid Machine Learning." Scholars Journal of Engineering and Technology 10, no. 10 (2022): 271–78. http://dx.doi.org/10.36347/sjet.2022.v10i10.002.

Abstract:

Machine Learning has a significant application in the detection of disease because of the automated process. Using machine learning models, the detection of disease can be done with higher effectiveness and with less error which may be seen in the context of computations made by humans. In this research, the detection of multiple diseases has been done with the application of machine learning. In this research context, three data have been selected namely Heart Disease Data (from UCI Repository), Liver Disease Data (from Kaggle Repository) and Diabetes Data (from Kaggle Repository). To detect

31

Handaga, Bana, Tutut Herawan, and Mustafa Mat Deris. "FSSC." International Journal of Fuzzy System Applications 2, no. 4 (2012): 29–46. http://dx.doi.org/10.4018/ijfsa.2012100102.

Abstract:

Introduced is a new algorithm for the classification of numerical data using the theory of fuzzy soft set, named Fuzzy Soft Set Classifier (FSSC). The algorithm uses the fuzzy approach in the pre-processing stage to obtain features, and similarity concept in the process of classification. It can be applied not only to binary-valued datasets, but also be able to classify the data that consists of real numbers. Comparison tests on seven datasets from UCI Machine Learning Repository have been carried out. It is shown that the proposed algorithm provides better accuracy and higher accuracy as comp

32

Derisma, D. "Perbandingan Kinerja Algoritma untuk Prediksi Penyakit Jantung dengan Teknik Data Mining." Journal of Applied Informatics and Computing 4, no. 1 (2020): 84–88. http://dx.doi.org/10.30871/jaic.v4i1.2152.

Abstract:

Heart disease is a disease that contributes to a relatively high mortality rate. The rate of human death caused by disease in the heart is a widespread problem in the world. The main objective of this study is to predict people with heart disease using the publicly available dataset in the UCI Repository with the Heart Disease dataset. The best classification algorithm is identified by comparing three algorithms, Naive Bayes, Random Forest, and Neural Network, which are frequently used to predict people with heart disease. Comparison results show that Naive Bayes' algorithm is a precise a

33

Rukmana, Indra, Arvin Rasheda, Faiz Fathulhuda, Muh Rizky Cahyadi, and Fitriyani Fitriyani. "Analisis Perbandingan Kinerja Algoritma Naïve Bayes, Decision Tree-J48 dan Lazy-IBK." JURNAL MEDIA INFORMATIKA BUDIDARMA 5, no. 3 (2021): 1038. http://dx.doi.org/10.30865/mib.v5i3.3055.

Abstract:

This research is focused on knowing the performance of the classification algorithms, namely Naïve Bayes, Decision Tree-J48 and K-Nearest Neighbor. The speed and the percentage of accuracy in this study are the benchmarks for the performance of the algorithm. This study uses the Breast Cancer and Thoracic Surgery dataset, which is downloaded on the UCI Machine Learning Repository website. Using the help of Weka software Version 3.8.5 to find out the classification algorithm testing. The results show that the J-48 Decision Tree algorithm has the best accuracy, namely 75.6% in the cross-validati

34

Nanda, Bijaya Kumar, and Satchidananda Dehuri. "Ant Miner." International Journal of Artificial Intelligence and Machine Learning 10, no. 1 (2020): 45–59. http://dx.doi.org/10.4018/ijaiml.2020010104.

Abstract:

In data mining the task of extracting classification rules from large data is an important task and is gaining considerable attention. This article presents a novel ant miner for classification rule mining. The ant miner is inspired by researches on the behaviour of real ant colonies, simulated annealing, and some data mining concepts as well as principles. This paper presents a Pittsburgh style approach for single objective classification rule mining. The algorithm is tested on a few benchmark datasets drawn from UCI repository. The experimental outcomes confirm that ant miner-HPB (Hybrid Pit

35

Zhi, Wei Mei, Hua Ping Guo, and Ming Fan. "Sample Size on the Impact of Imbalance Learning." Advanced Materials Research 756-759 (September 2013): 2547–51. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.2547.

Abstract:

Classification of imbalanced data sets is widely used in many real life applications. Most state-of-the-art classification methods which assume the data sets are relatively balanced lose their efficiency. The paper discusses the factors which influence the modeling of a capable classifier in identifying rare events, especially for the factor of sample size. Carefully designed experiments using Rotation Forest as base classifier, carried on 3 datasets from UCI Machine Learning Repository based on weak show that, in particular imbalance ratio, increases the size of training set by unsupervised r

36

Hossain, Nur, Nafis Anjum, Murshida Alam, et al. "PERFORMANCE OF MACHINE LEARNING ALGORITHMS FOR LUNG CANCER PREDICTION: A COMPARATIVE STUDY." International Journal of Medical Science and Public Health Research 05, no. 11 (2024): 41–55. http://dx.doi.org/10.37547/ijmsphr/volume05issue11-05.

Abstract:

This study compares the performance of five machine learning algorithms—logistic regression, support vector machines, random forests, gradient boosting, and neural networks—for lung cancer prediction using demographic, lifestyle, and medical data from the UCI Machine Learning Repository. Gradient boosting and random forests achieved the highest accuracy (89% and 87%, respectively) and AUC-ROC scores (0.93 and 0.92), while neural networks reached 90% accuracy but presented interpretability limitations. Key predictors included smoking history, chronic disease, and respiratory symptoms, aligning

37

Kurniawan, Wildan, and Uce Indahyanti. "Prediksi Angka Harapan Hidup Penduduk Menggunakan Metode XGBoost." Indonesian Journal of Applied Technology 1, no. 2 (2024): 18. http://dx.doi.org/10.47134/ijat.v1i2.3045.

Abstract:

This study aims to predict life expectancy in several countries in the Asian region using the XGBoost Regressor algorithm. The data come from the UCI Machine Learning Repository. The researchers build a prediction model using a machine learning approach and evaluate it on accuracy and Mean Absolute Error (MAE). The results show that the XGBoost Regressor model achieves an accuracy of 96.8% in predicting life expectancy, with an MAE of 0.97. These findings show the potential of the algori

38

Sakai, Hiroshi, Kazuhiro Koba, and Michinori Nakata. "Rough Sets Based Rule Generation from Data with Categorical and Numerical Values." Journal of Advanced Computational Intelligence and Intelligent Informatics 12, no. 5 (2008): 426–34. http://dx.doi.org/10.20965/jaciii.2008.p0426.

Abstract:

Rough set theory has been mainly applied to data with categorical values. In order to handle data with numerical values in this theory, a familiar concept of ‘wildcards’ was employed, and a new framework of rough sets based rule generation has been proposed. Two characters @ and # were introduced into this framework, and numerical patterns were also defined for numerical values. The concepts of ‘coarse’ and ‘fine’ for rules were explicitly defined according to numerical patterns. This paper enhances the previous framework, and describes the implementation of a utility program. This utility pr

39

Fadilah, Zahra Rizky, and Arie Wahyu Wijayanto. "Perbandingan Metode Klasterisasi Data Bertipe Campuran: One-Hot-Encoding, Gower Distance, dan K-Prototype Berdasarkan Akurasi (Studi Kasus: Chronic Kidney Disease Dataset)." Journal of Applied Informatics and Computing 7, no. 1 (2023): 57–67. http://dx.doi.org/10.30871/jaic.v7i1.5857.

Abstract:

This study aims to compare one-hot encoding, Gower distance combined with the k-means, DBSCAN, and OPTICS algorithms, and k-prototype for clustering mixed-type data. The dataset used is the chronic kidney disease (CKD) dataset from the UCI Machine Learning Repository. Based on evaluation with the silhouette index, k-prototype with k=2 clusters is the most optimal clustering method because it yields the highest silhouette index value compared with the other four methods,

40

Senthil Kumar T. "Ensemble Fusion of Classifiers with Kernel PCA for Breast Cancer Classification." Journal of Information Systems Engineering and Management 10, no. 29s (2025): 298–309. https://doi.org/10.52783/jisem.v10i29s.4479.

Abstract:

Breast cancer, predominantly affecting females, ranks as the most prevalent cancer among women globally, with potentially fatal consequences. Its invasive nature poses a significant health threat. Delayed diagnosis due to asymptomatic early stages hinders effective medical intervention. Early screenings prove pivotal in reducing breast cancer mortality. Beyond conventional diagnostic approaches, machine learning employs health data to predict breast cancer risk. This study employs Wisconsin breast cancer diagnosis data from the UCI machine learning repository. Class Imbalance is handled using

41

Zubairi, Ach, Hermanto Hermanto, Hari Santoso, Abdus Samad, and Ahmad Homaidi. "IMPLEMENTASI ALGORITMA K-NEAREST NEIGHBOR UNTUK PENENTUAN STATUS KANKER." JUSTIFY : Jurnal Sistem Informasi Ibrahimy 3, no. 2 (2024): 117–21. https://doi.org/10.35316/justify.v3i2.6467.

Abstract:

Cancer is a major global health challenge with a significant mortality rate. Accurate determination of cancer status is important for appropriate diagnosis and treatment strategies. This study explores the K-Nearest Neighbor (KNN) algorithm for classifying cancer types, focusing on the breast cancer dataset from the UCI Machine Learning Repository. The methodology covers data collection, attribute selection, splitting the data into training and testing sets, and KNN implementation. The results show that KNN can achieve 87.61% accuracy in classification, with evaluation using

42

Setiawan, Noor Akhmad. "Fuzzy Decision Support System for Coronary Artery Disease Diagnosis Based on Rough Set Theory." International Journal of Rough Sets and Data Analysis 1, no. 1 (2014): 65–80. http://dx.doi.org/10.4018/ijrsda.2014010105.

Abstract:

The objective of this research is to develop an evidence-based fuzzy decision support system for the diagnosis of coronary artery disease. The decision support system is developed in three processing stages: rule generation, rule selection and rule fuzzification. Rough Set Theory (RST) is used to generate the classification rules from the training data set. The training data are obtained from the University of California, Irvine (UCI) data repository. Rule selection is conducted by transforming the rules into a decision table based on an unseen data set. Furthermore, RST attributes red

43

Cardone, Barbara, and Ferdinando Di Martino. "A Novel Fuzzy Entropy-Based Method to Improve the Performance of the Fuzzy C-Means Algorithm." Electronics 9, no. 4 (2020): 554. http://dx.doi.org/10.3390/electronics9040554.

Abstract:

One of the main drawbacks of the well-known Fuzzy C-means clustering algorithm (FCM) is the random initialization of the centers of the clusters as it can significantly affect the performance of the algorithm, thus not guaranteeing an optimal solution and increasing execution times. In this paper we propose a variation of FCM in which the initial optimal cluster centers are obtained by implementing a weighted FCM algorithm in which the weights are assigned by calculating a Shannon Fuzzy Entropy function. The results of the comparison tests applied on various classification datasets of the UCI

44

Aneja, Veena, and Rajeev Kumar Singh. "AN AIR QUALITY FORECASTING MODEL BASED ON BIDIRECTIONAL GRU AND LSTM." International Research Journal of Computer Science 8, no. 9 (2021): 221–25. http://dx.doi.org/10.26562/irjcs.2021.v0809.001.

Abstract:

With urban and industrial growth, many developing countries suffer from excessive air pollution. Governments and the public are increasingly concerned about air pollution because it affects individuals' health and sustainable development globally. Several factors influence air quality, and all of them must be used to interpose and forecast air pollution for the whole city. The article suggests a novel deep learning CBGLSTM model for air quality forecasting based on 1D CNN, Bi-GRU, and Bi-LSTM neural networks. The model was tested on the PM2.5 dataset taken from UCI Machine Lea

45

Alaoui, Abdiya, and Zakaria Elberrichi. "Neuronal Communication Genetic Algorithm-Based Inductive Learning." Journal of Information Technology Research 13, no. 2 (2020): 141–54. http://dx.doi.org/10.4018/jitr.2020040109.

Abstract:

The development of powerful learning strategies in the medical domain constitutes a real challenge. Machine learning algorithms are used to extract high-level knowledge from medical datasets. Rule-based machine learning algorithms are easily interpreted by humans. To build a robust rule-based algorithm, a new hybrid metaheuristic was proposed for the classification of medical datasets. The hybrid approach uses neural communication and genetic algorithm-based inductive learning to build a robust model for disease prediction. The resulting classification models are characterized by good predicti

46

Liu, Hongbing, Chunhua Liu, and Chang-an Wu. "Granular Computing Classification Algorithms Based on Distance Measures between Granules from the View of Set." Computational Intelligence and Neuroscience 2014 (2014): 1–9. http://dx.doi.org/10.1155/2014/656790.

Abstract:

Granular computing classification algorithms are proposed based on distance measures between two granules from the view of set. Firstly, granules are represented as the forms of hyperdiamond, hypersphere, hypercube, and hyperbox. Secondly, the distance measure between two granules is defined from the view of set, and the union operator between two granules is formed to obtain the granule set including the granules with different granularity. Thirdly the threshold of granularity determines the union between two granules and is used to form the granular computing classification algorithms based

47

Zhang, Yong, Jiaxin Yu, Wenzhe Liu, and Kaoru Ota. "Ensemble Classification for Skewed Data Streams Based on Neural Network." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 26, no. 05 (2018): 839–53. http://dx.doi.org/10.1142/s021848851850037x.

Abstract:

Data stream learning in non-stationary environments and skewed class distributions has been receiving more attention in machine learning communities. This paper proposes a novel ensemble classification method (ECSDS) for classifying data streams with skewed class distributions. In the proposed ensemble method, back-propagation neural network is selected as the base classifier. In order to demonstrate the effectiveness of our proposed method, we choose three baseline methods based on ECSDS and evaluate their overall performance on ten datasets from UCI machine learning repository. Moreover, the

48

Sakurai, Shigeaki. "Discovery of Characteristic Patterns from Transactions with Their Classes." Applied Computational Intelligence and Soft Computing 2012 (2012): 1–12. http://dx.doi.org/10.1155/2012/786387.

Abstract:

This paper deals with transactions with their classes. The classes represent the difference of conditions in the data collection. This paper redefines two kinds of supports: characteristic support and possible support. The former one is based on specific classes assigned to specific patterns. The latter one is based on the minimum class in the classes. This paper proposes a new method that efficiently discovers patterns whose characteristic supports are larger than or equal to the predefined minimum support by using their possible supports. Also, this paper verifies the effect of the method th

49

Nurcahyo, Rudi, Ahmad Zainul Fanani, Affandy Affandy, and Mochammad Ilham Aziz. "Peningkatan Algoritma C4.5 Berbasis PSO Pada Penyakit Kanker Payudara." JURNAL MEDIA INFORMATIKA BUDIDARMA 7, no. 4 (2023): 1758. http://dx.doi.org/10.30865/mib.v7i4.6841.

Abstract:

One of the diseases in the world that causes death in women is cancer. Cancer is a disease caused by uncontrolled enlargement of abnormal organs in the body. Cancer diagnosis is made using anthropometric data from routine blood analysis. The data used is the Breast Cancer Coimbra Data Set obtained from the UCI Machine Learning Repository. The C4.5 method is a decision tree algorithm that is often used in the classification process. The selection of the right features, as well as the selection of the right method to overcome the class imbalance in the classification process can improve the perf

50

Sujeet Kumar Sahani and Dr. Sonam Singh. "Clustering Social Networking Data With K-Means Algorithm Using R Language." International Journal of Scientific Research in Computer Science, Engineering and Information Technology 10, no. 4 (2024): 23–30. http://dx.doi.org/10.32628/cseit24104105.

Abstract:

The main objectives of this research work are to report detailed empirical studies on sequential and parallel algorithms for diverse clustering tasks executed on very large social network datasets using memory-efficient out-of-core approaches. We evaluate the Spark implementation for R on Cloudera, applying algorithms such as k-means and hierarchical clustering to social media review datasets in order to rank these algorithms. This implementation leverages the YouTube dataset from the UCI Machine Learning Repository. Our goal is to compare a few algorithms, so we can know exactly how accurately these models are p
