


Journal articles on the topic 'K-Nearest Neighbours (KNN)'

Author: Grafiati

Published: 5 June 2025

Last updated: 24 June 2025


Consult the top 50 journal articles for your research on the topic 'K-Nearest Neighbours (KNN).'


1

Mahfouz, Mohamed A. "INCORPORATING DENSITY IN K-NEAREST NEIGHBORS REGRESSION." International Journal of Advanced Research in Computer Science 14, no. 03 (2023): 144–49. http://dx.doi.org/10.26483/ijarcs.v14i3.6989.

Abstract:

The application of traditional k-nearest neighbours to regression analysis suffers from several difficulties when only a limited number of samples is available. In this paper, two decision models based on density are proposed. To reduce testing time, a k-nearest neighbours table (kNN-Table) is maintained that keeps the neighbours of each object x along with their weighted Manhattan distance to x and a binary vector representing the increase or decrease in each dimension compared to x’s values. In the first decision model, if the unseen sample’s distance to one of its neighbours x is less than that of x’s farthest neighbour, its label is estimated using linear interpolation; otherwise linear extrapolation is used. In the second decision model, for each neighbour x of the unseen sample, the distance of the unseen sample to x and the binary vector are computed. The set S of nearest neighbours of x is also identified from the kNN-Table. For each sample in S, a normalized distance to the unseen sample is computed using the information stored in the kNN-Table, and it is used to compute the weight of each neighbour of the neighbours of the unseen object. In both models, a weighted average of the labels computed for each neighbour is assigned to the unseen object. The diversity between the two proposed decision models and the traditional kNN regressor motivates an ensemble of the two proposed models along with the traditional kNN regressor. The ensemble is evaluated, and the results show that it achieves a significant increase in performance compared to its base regressors and several related algorithms.

2

He, Hongxing, Simon Hawkins, Warwick Graco, and Xin Yao. "Application of Genetic Algorithm and K-Nearest Neighbour Method in Real World Medical Fraud Detection Problem." Journal of Advanced Computational Intelligence and Intelligent Informatics 4, no. 2 (2000): 130–37. http://dx.doi.org/10.20965/jaciii.2000.p0130.

Abstract:

In the k-Nearest Neighbour (kNN) algorithm, the classification of a new sample is determined by the class of its k nearest neighbours. The performance of the kNN algorithm is influenced by three main factors: (1) the distance metric used to locate the nearest neighbours; (2) the decision rule used to derive a classification from the k nearest neighbours; and (3) the number of neighbours used to classify the new sample. Using k = 1, 3, or 5 nearest neighbours, this study uses a Genetic Algorithm (GA) to find the optimal non-Euclidean distance metric in the kNN algorithm and examines two alternative methods (Majority Rule and Bayes Rule) to derive a classification from the k nearest neighbours. This modified algorithm was evaluated on two real-world medical fraud problems. The General Practitioner (GP) database is a 2-class problem in which GPs are classified as either practising appropriately or inappropriately. The ‘Doctor-Shoppers’ database is a 5-class problem in which patients are classified according to the likelihood that they are ‘doctor-shoppers’. Doctor-shoppers are patients who consult many physicians in order to obtain multiple prescriptions of drugs of addiction in excess of their own therapeutic need. In both applications, classification accuracy was improved by optimising the distance metric in the kNN algorithm. The agreement rate on the GP dataset improved from around 70% (using Euclidean distance) to 78% (using an optimised distance metric), and from about 55% to 82% on the Doctor-Shoppers dataset. Differences in either the decision rule or the number of nearest neighbours had little or no impact on the classification performance of the kNN algorithm. The excellent performance of the kNN algorithm when the distance metric is optimised using a genetic algorithm paves the way for its application in the real-world fraud detection problems faced by the Health Insurance Commission (HIC).
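
As a rough illustration of how the distance metric can matter more than the decision rule, the sketch below classifies one query by majority rule under two different per-feature weightings. The weighted Euclidean form, the hand-picked weights, and the toy data are all assumptions made for this example; the genetic-algorithm search described in the abstract is not reproduced here.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Weighted Euclidean-style distance; `weights` are hypothetical metric parameters."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def knn_classify(query, samples, labels, weights, k=3):
    """Classify `query` by majority rule over its k nearest neighbours."""
    order = sorted(range(len(samples)),
                   key=lambda i: weighted_distance(query, samples[i], weights))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data (made up): the first feature separates the classes, the second is noise.
samples = [(0.0, 0.0), (0.1, 5.0), (0.2, 9.0), (1.0, 0.5), (1.1, 4.0), (0.9, 8.0)]
labels = ["a", "a", "a", "b", "b", "b"]
query = (0.15, 0.4)

print(knn_classify(query, samples, labels, weights=(1.0, 1.0)))  # plain Euclidean
print(knn_classify(query, samples, labels, weights=(1.0, 0.0)))  # noisy feature ignored
```

On this toy set the two weightings give different answers for the same query and the same k, which is the kind of sensitivity an optimised metric is meant to exploit.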

3

Sukshitha, R., and Satyanarayana Satyanarayana. "Empirical Likelihood Ratio Based K-Nearest Neighbours Regression." INTERNATIONAL JOURNAL OF AGRICULTURAL AND STATISTICAL SCIENCES 20, no. 02 (2024): 421. https://doi.org/10.59467/ijass.2024.20.421.

Abstract:

Regression models play a pivotal role in real-life applications by enabling the analysis and prediction of continuous outcomes. Among these, the k-Nearest Neighbours (KNN) model stands out as a significant advancement in machine learning. KNN's ability to make predictions based on the proximity of data points has found wide-ranging applications in various fields. However, the traditional KNN regression model has its limitations, including sensitivity to noise and an uneven distribution of neighbours. In response, this paper introduces a novel approach: an empirical likelihood ratio (ELR) based regression algorithm. The ELR technique offers distinct advantages over distance-based nearest neighbour computations, particularly in handling skewed data distributions and minimizing the impact of outliers. The proposed ELR-based KNN regression model is rigorously assessed through both simulation studies and real-life scenarios. The results explicitly demonstrate the enhanced performance of the ELR-based approach over the conventional KNN model. This research contributes to a deeper understanding of regression techniques and underscores the practical significance of leveraging empirical likelihood ratios in refining predictive models for real-world applications. Keywords: Regression model, k-Nearest neighbours, Empirical likelihood ratio, Distance measures, Data distributions.

4

Farooq, Muhammad, Sehrish Sarfraz, Christophe Chesneau, et al. "Computing Expectiles Using k-Nearest Neighbours Approach." Symmetry 13, no. 4 (2021): 645. http://dx.doi.org/10.3390/sym13040645.

Abstract:

Expectiles have gained considerable attention in recent years due to their wide applications in many areas. In this study, the k-nearest neighbours approach, combined with the asymmetric least squares loss function and called ex-kNN, is proposed for computing expectiles. Firstly, the effect of various distance measures on ex-kNN in terms of test error and computational time is evaluated. It is found that the Canberra, Lorentzian, and Soergel distance measures lead to minimum test error, whereas Euclidean, Canberra, and Average of (L1,L∞) lead to a low computational cost. Secondly, the performance of ex-kNN is compared with the existing packages er-boost and ex-svm for computing expectiles, based on nine real-life examples. Depending on the nature of the data, ex-kNN showed two to 10 times better performance than er-boost and comparable performance with ex-svm regarding test error. Computationally, ex-kNN is found to be two to five times faster than ex-svm and much faster than er-boost, particularly in the case of high-dimensional data.
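
To make the combination of nearest neighbours and asymmetric least squares concrete, here is a minimal sketch: it finds the k nearest training points and returns the tau-expectile of their responses by iteratively reweighted averaging. The function names, the one-dimensional feature, and the absolute-difference distance are assumptions for illustration only; the published ex-kNN procedure may differ in its details.

```python
def expectile(values, tau=0.5, iterations=100):
    """tau-expectile of `values`, i.e. the minimiser of the asymmetric squared loss.

    Points above the current estimate get weight tau, the others 1 - tau;
    the estimate is the weighted mean, recomputed until it settles.
    Assumes 0 < tau < 1.
    """
    estimate = sum(values) / len(values)
    for _ in range(iterations):
        weights = [tau if v > estimate else 1.0 - tau for v in values]
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate

def ex_knn_predict(query, xs, ys, k, tau):
    """Expectile of the responses of the k training points nearest to `query`."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))[:k]
    return expectile([ys[i] for i in nearest], tau)

xs = [0.0, 1.0, 2.0, 3.0, 10.0]   # toy feature values (made up)
ys = [1.0, 2.0, 3.0, 4.0, 100.0]  # toy responses (made up)
print(ex_knn_predict(1.5, xs, ys, k=4, tau=0.5))  # tau = 0.5 reduces to the local mean
print(ex_knn_predict(1.5, xs, ys, k=4, tau=0.9))  # higher tau leans towards large responses
```

With tau = 0.5 the weights are symmetric and the prediction is the ordinary kNN regression mean, which is a convenient sanity check.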

5

Fatah, Haerul, and Agus Subekti. "PREDIKSI HARGA CRYPTOCURRENCY DENGAN METODE K-NEAREST NEIGHBOURS." Jurnal Pilar Nusa Mandiri 14, no. 2 (2018): 137. http://dx.doi.org/10.33480/pilar.v14i2.894.

Abstract:

Electronic money is increasingly being adopted by many people, especially entrepreneurs, business people and investors, who believe that electronic money will replace physical money in the future. Cryptocurrency emerged as an answer to a limitation of electronic money, namely its heavy dependence on third parties. One type of cryptocurrency is Bitcoin. Financially, Bitcoin is analogous to the stock market in that its price fluctuates unpredictably from second to second. The aim of this research is to predict cryptocurrency prices using the KNN (K-Nearest Neighbours) method. The results show that the best KNN model for predicting cryptocurrency prices is KNN with parameter K=3 and the nearest neighbour search algorithm Linear NN Search, with a Mean Absolute Error (MAE) of 0.0018 and a Root Mean Squared Error (RMSE) of 0.0089.
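
For readers unfamiliar with the two error measures reported above, the following sketch shows how MAE and RMSE are computed from paired actual and predicted values. The numbers are made up for illustration and have no connection to the study's data.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: penalises large errors more heavily than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [1.0, 2.0, 3.0]      # hypothetical observed values
predicted = [1.0, 2.0, 5.0]   # hypothetical model outputs
print(mae(actual, predicted), rmse(actual, predicted))
```

RMSE is never smaller than MAE on the same data, so a large gap between the two hints at a few large errors rather than many small ones.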

6

Lu, Zhigang, and Hong Shen. "An accuracy-assured privacy-preserving recommender system for internet commerce." Computer Science and Information Systems 12, no. 4 (2015): 1307–26. http://dx.doi.org/10.2298/csis140725056l.

Abstract:

Recommender systems, tools for predicting users’ potential preferences by computing history data and users’ interests, show an increasing importance in various Internet applications such as online shopping. As a well-known recommendation method, neighbourhood-based collaborative filtering has attracted considerable attention recently. The risk of revealing users’ private information during the process of filtering has attracted noticeable research interest. Among the current solutions, the probabilistic techniques have shown a powerful privacy-preserving effect. The existing methods deploying probabilistic methods fall into three categories: one [19] adds differential privacy noise to the covariance matrix; one [1] introduces randomisation in the neighbour selection process; the other [29] applies differential privacy in both the neighbour selection process and the covariance matrix. When facing the k Nearest Neighbour (kNN) attack, all the existing methods provide no data utility guarantee, because of the introduction of global randomness. In this paper, to overcome the problem of recommendation accuracy loss, we propose a novel approach, Partitioned Probabilistic Neighbour Selection, to ensure a required prediction accuracy while maintaining high security against the kNN attack. We define the sum of k neighbours’ similarity as the accuracy metric α, and the number of user partitions, across which we select the k neighbours, as the security metric β. We generalise the k Nearest Neighbour attack to the βk Nearest Neighbours attack. Differing from the existing approach that selects neighbours across the entire candidate list randomly, our method selects neighbours from each exclusive partition of size k with a decreasing probability. Theoretical and experimental analysis show that to provide an accuracy-assured recommendation, our Partitioned Probabilistic Neighbour Selection method yields a better trade-off between the recommendation accuracy and system security.

7

Jagath Prasad, Himayavardhini, and Roji Marjorie S. "Optimized k-nearest neighbours classifier based prediction of epileptic seizures." Bulletin of Electrical Engineering and Informatics 13, no. 4 (2024): 2442–55. http://dx.doi.org/10.11591/eei.v13i4.6598.

Abstract:

Epileptic seizure is an unstable condition of the brain that causes severe mental disorder and can be fatal if not properly diagnosed at an early stage. Electroencephalogram (EEG) plays a major role in early diagnosis of epileptic seizures. The volume of medical databases is enormous. Classification may become less accurate if the dataset contains redundant and irrelevant attributes. To reduce the mortality rate due to epilepsy, a decision support system is required that can assist medical professionals in taking immediate precautionary measures before the critical condition is reached. In this work, the k-nearest neighbours (KNN) classifier algorithm is optimised using a genetic algorithm for effective classification and faster prediction to meet this requirement. Genetic algorithms search for optimal solutions in complex and large environments. Results are compared with other machine learning models such as support vector machine (SVM), KNN, decision tree classifier, and random forest. With optimization using the genetic algorithm, KNN was able to achieve an enhancement in accuracy at lower training and testing times. It was observed that the accuracy offered by optimized KNN was 92%. Random forest classifiers showed minimum complexity, and the KNN algorithm provided faster performance with better accuracy.

8

Magnussen, S., R. E. McRoberts, and E. O. Tomppo. "A resampling variance estimator for the k nearest neighbours technique." Canadian Journal of Forest Research 40, no. 4 (2010): 648–58. http://dx.doi.org/10.1139/x10-020.

Abstract:

Current estimators of variance for the k nearest neighbours (kNN) technique are designed for estimates of population totals. Their efficiency in small-area estimation problems can be poor. In this study, we propose a modified balanced repeated replication estimator of variance (BRR) of a kNN total that performs well in small-area estimation problems and under both simple random and cluster sampling. The BRR estimate of variance is the sum of variances and covariances of unit-level kNN estimates in the area of interest. In Monte Carlo simulations of simple random and cluster sampling from seven artificial populations with real and simulated forest inventory data, the agreement between averages of BRR estimates of variance and Monte Carlo sampling variances was good both for population and for small-area totals. The modified BRR estimator is currently limited to sample sizes no larger than 1984. An accurate approximation to the proposed BRR estimator allows significant savings in computing time.

9

Pandey, Shubham, Vivek Sharma, and Garima Agrawal. "Modification of KNN Algorithm." International Journal of Engineering and Computer Science 8, no. 11 (2019): 24869–77. http://dx.doi.org/10.18535/ijecs/v8i11.4383.

Abstract:

K-Nearest Neighbor (KNN) classification is one of the most fundamental and simple classification methods. It is among the most frequently used classification algorithms when there is little or no prior knowledge about the distribution of the data. In this paper, a modification is proposed to improve the performance of KNN. The main idea of KNN is to use a set of robust neighbors in the training data. The modified KNN proposed in this paper is better than traditional KNN in terms of both robustness and performance. Inspired by the traditional KNN algorithm, the main idea is to classify an input query according to the most frequent tag in the set of neighbor tags, with the tag closest to the new tuple having the greatest say. The proposed Modified KNN can be considered a kind of weighted KNN in which the query label is approximated by weighting the neighbors of the query. The procedure computes the ratio of same-labeled neighbors to the total number of neighbors, with the value associated with each label multiplied by a factor that is inversely proportional to the distance between the new tuple and its neighbors. The proposed method is evaluated on several standard UCI data sets. Experiments show a significant improvement in the performance of the KNN method.
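
The idea of giving closer neighbours a greater say can be sketched with a generic inverse-distance-weighted vote. This is not the paper's exact weighting scheme, which the abstract does not fully specify; the 1/(distance + eps) factor, the function name, and the toy data are assumptions.

```python
import math
from collections import defaultdict

def weighted_knn(query, samples, labels, k=3, eps=1e-9):
    """Label with the largest sum of inverse distances among the k nearest neighbours."""
    ranked = sorted((math.dist(query, s), lab) for s, lab in zip(samples, labels))
    scores = defaultdict(float)
    for distance, lab in ranked[:k]:
        # eps avoids division by zero when the query coincides with a sample.
        scores[lab] += 1.0 / (distance + eps)
    return max(scores, key=scores.get)

samples = [(0.0, 0.0), (3.0, 0.0), (3.2, 0.0)]  # toy points (made up)
labels = ["x", "y", "y"]
print(weighted_knn((1.0, 0.0), samples, labels, k=3))
```

In this toy case an unweighted majority vote over the same three neighbours would pick "y" (two votes to one), while the distance weighting favours the single much closer "x" neighbour.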

10

Safar, Maytham. "Spatial Queries in Road Networks Based on PINE." JUCS - Journal of Universal Computer Science 14, no. 4 (2008): 590–611. https://doi.org/10.3217/jucs-014-04-0590.

Abstract:

Over the last decade, due to the rapid developments in information technology (IT), a new breed of information systems has appeared such as geographic information systems that introduced new challenges for researchers, developers and users. One of its applications is the car navigation system, which allows drivers to receive navigation instructions without taking their eyes off the road. Using a Global Positioning System (GPS) in the car navigation system enables the driver to perform a wide range of queries, from locating the car position, to finding a route from a source to a destination, or dynamically selecting the best route in real time. Several types of spatial queries (e.g., nearest neighbour - NN, K nearest neighbours - KNN, continuous k nearest neighbours - CKNN, reverse nearest neighbour - RNN) have been proposed and studied in the context of spatial databases. With spatial network databases (SNDB), objects are restricted to move on pre-defined paths (e.g., roads) that are specified by an underlying network. In our previous work, we proposed a novel approach, termed Progressive Incremental Network Expansion (PINE), to efficiently support NN and KNN queries. In this work, we utilize our developed PINE system to efficiently support other spatial queries such as CKNN. The continuous K nearest neighbour (CKNN) query is an important type of query that finds continuously the K nearest objects to a query point on a given path. We focus on moving queries issued on stationary objects in Spatial Network Database (SNDB) (e.g., continuously report the five nearest gas stations while I am driving.) The result of this type of query is a set of intervals (defined by split points) and their corresponding KNNs. This means that the KNN of an object travelling on one interval of the path remains the same all through that interval, until it reaches a split point where its KNNs change. Existing methods for CKNN are based on Euclidean distances. 
In this paper we propose a new algorithm for answering CKNN in SNDB where the important measure for the shortest path is network distances rather than Euclidean distances. Our solution addresses a new type of query that is plausible to many applications where the answer to the query not only depends on the distances of the nearest neighbours, but also on the user or application need. By distinguishing between two types of split points, we reduce the number of computations to retrieve the continuous KNN of a moving object. We compared our algorithm with CKNN based on VN3 using IE (Intersection Examination). Our experiments show that our approach has better response time than approaches that are based on IE, and requires fewer shortest distance computations and KNN queries.

11

Rahal, Imad, Hassan Najadat, and William Perrizo. "Efficiency Considerations for Vertical kNN Text Categorisation." Journal of Information & Knowledge Management 05, no. 03 (2006): 211–22. http://dx.doi.org/10.1142/s021964920600144x.

Abstract:

The importance of text mining stems from the availability of huge volumes of text databases holding a wealth of valuable information that needs to be mined. Text mining is a coarse area encompassing many finer branches one of which is text categorisation or text classification. Text categorisation is the process of assigning class labels to documents based entirely on their textual contents where we are given a document d, and asked to find its subject matter or class label, Ci. In this paper, an optimised k-Nearest Neighbours classifier that uses discretisation, the P-tree technology, and dimensionality reduction to achieve a high degree of accuracy, space utilisation and time efficiency is proposed. One of the fundamental contributions of this work is that as new samples arrive, the proposed classifier can find the k nearest neighbours to the new sample from the training space without a single database scan.

12

Barkalov, Konstantin, Anton Shtanyuk, and Alexander Sysoyev. "A Fast kNN Algorithm Using Multiple Space-Filling Curves." Entropy 24, no. 6 (2022): 767. http://dx.doi.org/10.3390/e24060767.

Abstract:

The paper considers a time-efficient implementation of the k nearest neighbours (kNN) algorithm. A well-known approach for accelerating the kNN algorithm is to utilise dimensionality reduction methods based on the use of space-filling curves. In this paper, we take this approach further and propose an algorithm that employs multiple space-filling curves and is faster (with comparable quality) compared with the kNN algorithm, which uses kd-trees to determine the nearest neighbours. A specific method for constructing multiple Peano curves is outlined, and statements are given about the preservation of object proximity information in the course of dimensionality reduction. An experimental comparison with known kNN implementations using kd-trees was performed using test and real-life data.

13

Deshpande, Dr Bhagwant K. "Classifying And Predicting Adolescent Cardiac Health Using KNN." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 04 (2025): 1–9. https://doi.org/10.55041/ijsrem46191.

Abstract:

This study uses a K-Nearest Neighbours (KNN) classifier to detect adolescent heart disease. Age, sex, chest pain type, resting blood pressure, cholesterol level, maximum heart rate, and exercise-induced angina are the clinically important and interpretable features of the cardiovascular disease dataset used to create the model. These features were chosen to improve model interpretability for doctors and laypeople. Data preparation encoded categorical variables and standardised features for model optimisation. Histograms, bar charts, and scatter plots were used to explore feature distributions and heart disease status. The K-nearest neighbour (KNN) model was trained with k=11 neighbours and tested using precision, a confusion matrix, a classification report, a ROC curve, and a precision-recall curve, demonstrating its predictive power. To select the most influential predictors, permutation-based feature importance was used. A Flask web application lets users input health parameters to forecast heart disease risk. The software generates personalised, patient-friendly health information using OpenAI's API and produces PDF reports. The system keeps prognosis records in an information system for future reference. This approach uses data-driven modelling, interpretability, and user-friendliness to help adolescents recognise and understand heart disease risk. Keywords: adolescent heart disease identification, risk factors for heart disease, KNN classifier, artificial intelligence, data preprocessing, feature relevance, model assessment, Flask web app, OpenAI API integration, health insights, PDF reports, early diagnosis, healthcare information visualisation.

14

Aldo januansyah. H, Muhammad Fikry, and Yesy Afrillia. "Machine Learning Algorithms Comparison for Gender Identification." Proceedings of Malikussaleh International Conference on Multidisciplinary Studies (MICoMS) 4 (December 18, 2024): 00007. https://doi.org/10.29103/micoms.v4i.885.

Abstract:

In this study, we present a comprehensive analysis of gender identification methods utilising eight distinct classification models: K-Nearest Neighbors (KNN), Naive Bayes, Decision Tree, Random Forest, Logistic Regression, XGBoost, Support Vector Machine (SVM), and Neural Network. Gender identification is a critical task with significant applications in marketing, social analysis, and security systems, necessitating the exploration of various methodologies to achieve optimal performance. The dataset employed in this research underwent normalisation using the Min-Max scaling technique, which enhances the performance of classification models by ensuring that all features contribute equally, particularly when the data exhibits varying ranges of values. The results reveal that the K-Nearest Neighbors (KNN) model significantly outperformed the other models, achieving an accuracy of 0.9758 with a support of 951, underscoring the effectiveness of the KNN algorithm in gender identification tasks and establishing it as a reliable choice for applications requiring high accuracy. Furthermore, the study emphasises the critical importance of selecting appropriate models in machine learning tasks and the substantial impact of data normalisation on model performance. Overall, this research provides valuable insights into the KNN algorithm, demonstrating its ease of implementation and effectiveness in achieving high precision in gender identification tasks, with implications for future research and practical applications across various fields. Keywords: classification models; data normalisation; gender identification; K-Nearest Neighbours; machine learning.
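
Min-Max scaling matters for distance-based models such as KNN because a feature with a large numeric range would otherwise dominate the distance. A minimal sketch of the transformation for a single feature column follows; the function name, the handling of a constant column, and the sample values are assumptions for illustration.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0.0 (an arbitrary choice).
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

heights_cm = [150.0, 165.0, 180.0]   # hypothetical feature with a wide range
print(min_max_scale(heights_cm))
```

In practice the minimum and maximum would be taken from the training data only and then reused to scale the test data.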

15

Ati, Indri, and Ari Kusyanti. "Metode Ensemble Classifier untuk Mendeteksi Jenis Attention Deficit Hyperactivity Disorder (SDHD) pada Anak Usia Dini." Jurnal Teknologi Informasi dan Ilmu Komputer 6, no. 3 (2019): 301. http://dx.doi.org/10.25126/jtiik.2019631313.

Abstract:

In the early stages of development, some children experience difficulties such as being unable to sit still, to concentrate, or to control their behaviour. When a child has an attention disorder and struggles to behave appropriately, the condition is known as ADHD (Attention Deficit Hyperactive Disorder). This is a serious problem because children with ADHD experience social and emotional behavioural problems and learning difficulties at school, which affect their development into adulthood. ADHD symptoms therefore need to be identified early so that they can be handled quickly and appropriately. This research produces an application that detects the type of ADHD from symptoms entered by the user and automatically displays the resulting classification. The application uses the Ensemble Classifier method, which combines several classifiers in order to improve accuracy. At the classification stage, each data item is evaluated using K-Nearest Neighbour (KNN), Fuzzy K-Nearest Neighbour (FKNN) and Neighbour Weighted K-Nearest Neighbour (NWKNN). The outputs of the three classifiers are then combined by the Ensemble Classifier method, using majority voting to determine the class. The highest accuracy of the ensemble classifier is 95%, at the optimal value k = 10. However, for larger values of k, above k = 20, the accuracy of each algorithm decreases. This is because all of the algorithms base their classification on the number of neighbours, so the more neighbours are taken into account, the greater the chance of misclassification.
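
The majority-voting step that combines the three classifiers can be sketched as below. The class labels and the individual classifier outputs are hypothetical placeholders; the KNN, FKNN and NWKNN models themselves are not implemented here.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most classifiers.

    On a tie, Counter.most_common keeps first-seen order, so the
    earliest classifier in the list wins (an arbitrary tie-break).
    """
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outputs of KNN, FKNN and NWKNN for one child.
votes = ["inattentive", "combined", "inattentive"]
print(majority_vote(votes))
```

With three voters and more than two classes a three-way tie is possible, so a real system would need an explicit tie-breaking policy.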

16

Rihastuti, Siti, Afnan Rosyidi, and Handoko Handoko. "Implementasi K-Nearest Neighbours Dengan Google Collab Untuk Klasifikasi Penyakit Kanker Payudara." Jurnal Teknologi Informasi 10, no. 2 (2025): 79–85. https://doi.org/10.52643/jti.v10i2.5223.

Abstract:

This study aims to classify breast cancer from a dataset of breast cancer patients using the K-Nearest Neighbors model, with testing carried out in Google Colab. Breast cancer is one of the most common types of cancer among women, and early detection is very important for improving patients' chances of recovery. A total of 569 records from a breast cancer dataset taken from Kaggle were used. The dataset contains 32 variables to be tested. The KNN model was applied with values of K from 1 to 20. Testing with KNN and Google Colab yielded a highest accuracy of 96.49% at K=9. Accuracy tends to be stable, with only small differences from K=3 to K=20. The highest precision is 97.18%, obtained at K=4, 6, 8, 9 and 10. The highest recall and F1-score are 97.18% at K=9. The high and fairly stable accuracy, precision, recall and F1-score values show that the model can detect breast cancer cases with good accuracy. The high F1-score indicates that the KNN model strikes a good balance between precision (how accurately positive predictions are made) and recall (the sensitivity of the model in predicting actual malignant breast cancer cases). The test results show that the K-Nearest Neighbors model performs fairly well in predicting breast cancer.

17

Suriya, S., and J. Joanish Muthu. "Type 2 Diabetes Prediction using K-Nearest Neighbor Algorithm." Journal of Trends in Computer Science and Smart Technology 5, no. 2 (2023): 190–205. http://dx.doi.org/10.36548/jtcsst.2023.2.007.

Abstract:

Type 2 diabetes is a persistent disorder that affects millions of individuals globally. It is characterised by excessive levels of glucose in the blood due to insulin resistance or the inability to produce insulin. Early detection and prediction of type 2 diabetes can improve patient outcomes. K-Nearest Neighbor (KNN) is used in the present model to predict type 2 diabetes. The KNN algorithm is a simple but powerful machine learning algorithm used for classification and regression. It is a non-parametric approach that makes predictions based on the nearest k neighbours in a dataset. KNN is widely used in healthcare and scientific studies to predict and classify diseases based on patient data. The aim of this work is to predict the risk of developing type 2 diabetes using the KNN algorithm. Data has been collected from electronic medical records of patients diagnosed with type 2 diabetes and healthy individuals. The dataset consists of various patient attributes, such as age, gender, body mass index, blood pressure, cholesterol levels, and glucose levels. Information has also been collected about lifestyle habits, such as physical activity, smoking status, and alcohol consumption. The data have been pre-processed by removing missing values and outliers, and normalized to ensure that all features have the same scale. The dataset is split into training and test sets, with 80% of the data used for training and 20% for testing. The KNN algorithm has been used to classify the patients into two groups: those at high risk of developing type 2 diabetes and those at low risk. The model's performance has been assessed using a variety of metrics, including accuracy, precision, recall, and F1-score.
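
The four evaluation metrics named at the end of this abstract can all be derived from the counts of true and false positives and negatives. The sketch below computes them for a binary high-risk/low-risk labelling; the function name, the 1/0 encoding, and the example labels are assumptions and do not come from the study.

```python
def binary_metrics(actual, predicted, positive=1):
    """Accuracy, precision, recall and F1 for a two-class problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

actual = [1, 1, 1, 0, 0]      # hypothetical ground truth (1 = high risk)
predicted = [1, 1, 0, 1, 0]   # hypothetical model output
print(binary_metrics(actual, predicted))
```

Reporting precision and recall alongside accuracy is useful in medical screening, where the two kinds of error usually carry different costs.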

18

Kutyłowska, Małgorzata. "Application of K-nearest neighbours method for water pipes failure frequency assessment." E3S Web of Conferences 59 (2018): 00021. http://dx.doi.org/10.1051/e3sconf/20185900021.

Abstract:

The paper describes the results of failure rate modelling using the K-nearest neighbours (KNN) method. This algorithm is one of the regression methods known as machine learning methods. The aim of the paper was to assess the applicability of this kind of modelling and to compare the current results with investigations of failure rate prediction in another Polish city. Operational data from 12 years of operation, received from a water utility, were used to predict the dependent variable (failure rate). Data (249 and 294 for distribution pipes and house connections, respectively) from the time span 2001–2012 were used to create the KNN models. The optimal model, based on the Euclidean distance metric with K = 2 nearest neighbours, was validated on other data (one case for each year). The modelling was performed in Statistica 12.0.

19

Abid, Sarwar. "K Nearest Neighbours based diagnosis of hyperglycemia." International Journal of Trend in Scientific Research and Development 2, no. 1 (2017): 611–14. https://doi.org/10.31142/ijtsrd7046.

Abstract:

Artificial intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems. These processes include learning (the acquisition of information and rules for using the information), reasoning (using the rules to reach approximate or definite conclusions), and self-correction. As a result, artificial intelligence is gaining importance in science and engineering. The use of artificial intelligence in medical diagnosis is also becoming increasingly common, and it has been used widely in the diagnosis of cancers, tumors, hepatitis, lung diseases, and other conditions. The main aim of this paper is to build an artificially intelligent system that, after analysing certain parameters, can predict whether a person is diabetic. Diabetes is the name used to describe a metabolic condition of having higher than normal blood sugar levels. Diabetes is becoming increasingly common throughout the world because of increased obesity, which can lead to metabolic syndrome or pre-diabetes and, in turn, to a higher incidence of type 2 diabetes. The authors identified 10 parameters that play an important role in diabetes and prepared a rich database of training data, which served as the backbone of the prediction algorithm. Using this training data, the authors developed a system that uses the artificial neural networks algorithm to serve the purpose. Such networks are capable of predicting new observations on specific variables from previous observations on the same or other variables after a process of so-called learning from existing training data (Haykin 1998). The results indicate that the performance of the KNN method, when compared with the medical diagnosis system, was found to be 91%. This system can be used to assist medical programs, especially in geographically remote areas where expert human diagnosis is not possible, with the advantages of minimal expense and faster results.
Abid Sarwar, "K-Nearest Neighbours based diagnosis of hyperglycemia," published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume 2, Issue 1, December 2017. URL: https://www.ijtsrd.com/papers/ijtsrd7046.pdf

20

Bayraktar, Rabia, Batur Alp Akgul, and Kadir Sercan Bayram. "Colour recognition using colour histogram feature extraction and K-nearest neighbour classifier." New Trends and Issues Proceedings on Advances in Pure and Applied Sciences, no. 12 (April 30, 2020): 08–14. http://dx.doi.org/10.18844/gjpaas.v0i12.4981.

Abstract:

K-nearest neighbours (KNN) is a widely used neural network and machine learning classification algorithm. Recently, it has been used in the neural network and digital image processing fields. In this study, the KNN classifier is used to distinguish 12 different colours: black, blue, brown, forest green, green, navy, orange, pink, red, violet, white and yellow. The features that distinguish these colours are determined using colour histogram feature extraction, an image processing technique. These features increase the effectiveness of the KNN classifier. The training data consist of saved frames, and the test data are obtained from the video camera in real time. The video consists of consecutive frames, each 100 × 70 in size. Each frame is tested with K = 3, 5, 7 and 9, and the results are recorded. In general, the best results are obtained with K = 5. Keywords: KNN algorithm, classifier, application, neural network, image processing, developed, colour, dataset, colour recognition.

21

Roy, Mithu. "Enhancing Breast Cancer Prediction: A Comparative Study of Machine Learning Algorithms." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 04 (2025): 1–9. https://doi.org/10.55041/ijsrem45893.

Abstract:

In this study, a comparative analysis was performed of three widely used machine learning algorithms, namely Logistic Regression, Random Forest, and K Nearest Neighbors (KNN), to improve breast cancer prediction. Using a complete data set comprising 569 patients, we assessed the predictive accuracy of each algorithm. The results showed that Logistic Regression achieved the highest accuracy at 97.37%, followed closely by Random Forest at 96.49% and KNN at 94.74%. These outcomes show the effectiveness of machine learning in increasing the practical accuracy of breast cancer prediction and highlight the importance of algorithm selection based on performance metrics. This study aims to contribute to the ongoing efforts to enhance early diagnosis and personalized treatment strategies for breast cancer patients, thereby improving overall patient outcomes. Index Terms: Breast Cancer, Machine Learning Algorithms, Prediction, Healthcare, Random Forest, Logistic Regression, K-Nearest Neighbours.

22

Siagian, Yessica, Jeperson Hutahaean, Arridha Zikra Syah, Jhonson Efendi Hutagalung, and Abdul Karim. "Implementasi Metode K-Nearest Neigbours (KNN) Untuk Klasifikasi Penyakit Diabetes." Jurnal Informatika dan Teknologi Informasi 2, no. 3 (2024): 253–62. http://dx.doi.org/10.56854/jt.v2i3.331.

Abstract:

This study explores the role of information technology in diabetes prevention through risk prediction using machine learning methods. Diabetes, a chronic disease with serious health consequences, can be predicted from variables such as Glucose, Blood Pressure, and BMI. The K-Nearest Neighbours (KNN) method is compared with Decision Tree and Naïve Bayes models for predicting diabetes risk. The study comprises the stages of Data Preprocessing, Data Visualization, Data Splitting, and Machine Learning Modelling, with evaluation using the Accuracy metric. The test results show that the Decision Tree model gives the best performance, particularly at data split ratios of 80:20 and 70:30. The study underlines the importance of applying information technology and machine learning to diabetes prevention, with a focus on risk prediction.

23

Benjamin Zulu, Godfrey, Dr C. Kara Mostefa Khelil, Godfrey Murairidzi Gotora, and Taane Zahreddine. "Identification of PV Fault Classes Using Intelligent Method KNN (K-Nearest Neighbours)." International Journal of Research and Scientific Innovation XI, no. VIII (2024): 1202–29. http://dx.doi.org/10.51244/ijrsi.2024.1108093.

Abstract:

Throughout many developing nations, renewable energy is a hot topic. Every country is currently trying to move away from fossil fuels such as petrol towards fully renewable energy sources, especially photovoltaic (PV) systems. The reliability and efficiency of renewable energy systems is now a frequent topic of discussion. Like all production systems, renewable energy systems are subject to failures and defects in normal operation that affect their power output. These systems break down and deteriorate over their operating life. A diagnostic system is therefore required, one of whose objectives is to provide indicators, from given variables such as temperature, solar irradiation, voltage and current output, that detect faults and thus keep energy production at its optimum. The work in progress concerns the diagnosis of faults in PV systems using artificial intelligence methods, particularly the K-nearest Neighbour algorithm.

24

Dange, Varsha, and Tejaswini Bhosale. "Automatic Detection of Baby Cry using Machine Learning with Self Learning Music Player System for Soothing." International Journal for Research in Applied Science and Engineering Technology 10, no. 4 (2022): 1541–46. http://dx.doi.org/10.22214/ijraset.2022.41433.

Abstract:

Automatic detection of a baby's cry plays an important role in applications for smart monitoring of a baby's condition. In the proposed model, a baby's cry is detected and a music player is then started to create a soothing environment for the baby. The system employs a machine learning approach to recognize newborn cry sounds in a variety of residential settings under difficult conditions. Automatic baby cry detection can be used in many situations involving various types of environmental sounds. The proposed system also provides alert-based notifications to parents. It can also be used in commercial products such as remote baby monitors and baby facial recognition, as well as in medical applications. The K-Nearest Neighbours (KNN) algorithm is evaluated for baby cry detection, and priority queue functions are used for playing music. In this system, the input consists of the cry sound together with the k nearest training samples in the database. The output depends on an analysis of the cry-detection performance of KNN, which is used for classification. Keywords: Baby cry detection, Machine learning, Audio detection, K-Nearest Neighbours, smart monitoring

25

Ajay N. Upadhyaya. "Enhancing Credit Card Fraud Detection with K-Nearest Neighbours (KNN): A Machine Learning Approach." Journal of Information Systems Engineering and Management 10, no. 2 (2025): 498–506. https://doi.org/10.52783/jisem.v10i2.2388.

Abstract:

The rise in digital transactions has made credit card fraud an even more serious worldwide issue. The study examines the use of the K-Nearest Neighbours (KNN) method to identify credit card fraud in a highly imbalanced dataset of transactions from European cardholders. The dataset consisted of 284,807 transactions with only 492 fraudulent cases, and Principal Component Analysis (PCA) was used to improve feature relevance. After training and evaluation, the KNN model showed a good accuracy of 94.04%. However, the model's low precision (0.0136) and moderate recall (0.5074) indicate a considerable number of false positives and undetected frauds. The study highlights the importance of effective data preprocessing, feature selection, and parameter optimization in improving model performance. This study improves fraud detection rates in real time using KNN, providing information for future research on advanced machine learning methods and ensemble techniques.
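The class counts quoted in this abstract make it easy to check why accuracy alone says little on such a dataset. The sketch below is an illustration built only on the two counts given above (284,807 transactions, 492 frauds); the function name is hypothetical and nothing here reproduces the paper's model.

```python
def majority_baseline_accuracy(total, positives):
    """Accuracy of a trivial classifier that labels every case as the majority (negative) class."""
    return (total - positives) / total


if __name__ == "__main__":
    total_transactions = 284_807  # from the abstract
    fraud_cases = 492             # from the abstract

    baseline = majority_baseline_accuracy(total_transactions, fraud_cases)
    # A model that never flags fraud is right on every legitimate transaction,
    # yet it detects no fraud at all (recall = 0).
    print(f"fraud share:       {fraud_cases / total_transactions:.4%}")
    print(f"baseline accuracy: {baseline:.4%}")
```

Because frauds are well under 1% of the data, a classifier that never predicts fraud already exceeds the reported 94.04% accuracy, which is why recall and the false-positive burden are the more informative figures here.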

26

Gyunka, B. A., and S. I. Barda. "Anomaly detection of android malware using One-Class K-Nearest Neighbours (OC-KNN)." Nigerian Journal of Technology 39, no. 2 (2020): 542–52. http://dx.doi.org/10.4314/njt.v39i2.25.

Abstract:

The advent of the Android Operating System has opened remarkable, ground-breaking opportunities in the technological world. However, this breakthrough also has a very dark side: an uncontrollable, rapid and continuous release of malware in the wild, targeted at the platform and all its information and human assets. The misuse-based approaches adopted by many detection systems no longer have the robustness to keep up with the rapid, high-volume succession of malware releases and so cannot maintain active defenses against unknown and novel attacks. Systems capable of offering anomaly protection are thus badly needed. This study developed a normality model based on the One-Class K-Nearest Neighbour (OC-kNN) machine learning approach for anomaly detection of Android malware. The OC-kNN was trained, using the WEKA 3.8.2 Machine Learning Suite, through a semi-supervised procedure on Android application samples that were mostly benign with very few outliers. The OC-kNN had 88.57% true performance accuracy for normal instances, while 71.9% was recorded as true performance accuracy for outlier (unknown) instances. The false alarm rates for normal and outlier instances were recorded as 28.1% and 11.5%. The study concluded that a One-Class Classification model is an effective approach for detecting unknown Android malware. Keywords: Android; Machine Learning, Malware, One-Class Classification, Anomaly Detection, Outlier Detection, Novelty Detection, Concept Learning, k-NN

27

Zubi, Zakaria Suliman, Ali A. Elrowayati, and Ibrahim Saad Abu Fanas. "A Movie Recommendation System Design Using Association Rules Mining and Classification Techniques." WSEAS TRANSACTIONS ON COMPUTERS 21 (June 7, 2022): 189–99. http://dx.doi.org/10.37394/23205.2022.21.24.

Abstract:

The importance of recommendation systems is increasing day by day because of the massive volume of data and the information overload arising from the internet. This data can be collected in predictive datasets, which can be processed and analysed via data mining methods. In this paper, an efficient hybrid movie recommender system has been designed using the association rules mining technique and the K-nearest neighbours (KNN) algorithm as a classification method. The KNN subsystem was used to create the first candidate list from a practical MovieLens dataset, which was retrieved from the source of the NetFlix network. In addition, the Apriori algorithm subsystem is used to analyse the same dataset and create the second list. Finally, the proposed system creates a final recommended list by matching the two lists. The results of the proposed system provide better performance than the existing systems in terms of the important degree. The important degree gives a better accuracy rate than the existing techniques used.

28

Permana, I. Gede Teguh, Ida Bagus Gede Dwidasmara, Made Agung Raharja, and I. Wayan Santiyasa. "Ekstraksi Fitur Dengan Convolutional Neural Network Dan Rekomendasi Fashion Menggunakan Algoritma K-Nearest Neighbours." JELIKU (Jurnal Elektronik Ilmu Komputer Udayana) 12, no. 4 (2024): 845. http://dx.doi.org/10.24843/jlk.2024.v12.i04.p10.

Abstract:

The rapid growth of the fashion industry on e-commerce platforms has made fashion products easily accessible to many consumer segments. Consumer segmentation can be reflected in each search for the desired type of fashion item, but fashion search in e-commerce relies on string keywords, which makes it difficult to segment consumers by fashion characteristics. Fashion items are objects that are easily recognised visually, so image-based search is much needed on e-commerce platforms for selecting fashion according to consumer segmentation. Image-based search can be implemented as content-based fashion recommendation with k-nearest neighbour (KNN), which matches fashion features against the consumer's input fashion image; features are extracted from each data item using the convolution layers of a convolutional neural network (CNN) model and the histogram of oriented gradients (HOG). The approach is evaluated by top-n accuracy across ResNet, GoogLeNet, VGG, and HOG, and comparing the models' performance yields an accuracy of 93% for GoogLeNet with KNN, the best model for fashion recommendation. Matching between fashion features is also performed on the basis of labels produced by classification through the convolution and fully connected layers of the CNN; this is evaluated with evaluation metrics across ResNet, GoogLeNet, and VGG, and comparing the models yields an accuracy of 99%, precision of 100%, recall of 99%, and F1-score of 99% for VGG, the best model for identifying fashion types. Keywords: Fashion, Feature Extraction, Recommendation System, CNN Architecture, HOG, KNN, Evaluation Metrics, Top-n accuracy

29

Nufus, Annisa Hidayatul, and Helma Helma. "Klasifikasi Masyarakat Penerima BPNT Program Sembako 2021 di Kelurahan Tiakar dengan Mengunakan Metode KNN Classifier." Journal of Mathematics UNP 8, no. 2 (2023): 51. http://dx.doi.org/10.24036/unpjomath.v8i2.14233.

Abstract:

At the time of the implementation of the food program in the Tiakar Sub-District, there was no information available on the criteria that would make the head of a family eligible to be a recipient of assistance, even though this was very important so that the government's goal of fulfilling the need for nutritious food could be felt by those who really needed it. The purpose of this study was to classify the heads of families in Tiakar Subdistrict as eligible or not eligible to receive staple foods using the K-Nearest Neighbor method. This study uses interview data conducted with the heads of families in the Tiakar Village. The data analysis step is to divide the data into training data and test data by 80%:20%, determine the number of nearest neighbours, calculate the dissimilarity distance and choose a class, calculate the level of accuracy using the confusion matrix and then choose the optimal K. Based on the results of the study, it was found that the K value that was good to use in the classification of family heads in Tiakar Subdistrict was K=3 because it had an accuracy percentage of 95%.

30

Ogunsuyi, Opeyemi J., and adebola Ojo. "K-Nearest Neighbors Bayesian Approach to False News Detection from Text on Social Media." International Journal of Education and Management Engineering 12, no. 4 (2022): 22–32. https://doi.org/10.5815/ijeme.2022.04.03.

Abstract:

Social media usage has increased with the rate at which technologies are emerging, and false news, which aims to capture the human mind, is unlikely to be detected manually. The spread of false news can cause havoc; therefore, detection of false news becomes paramount where almost everyone has access to social media. Our proposed system optimizes the false news detection process. The system combines the advantages of two textual feature extraction methods and two machine learning algorithms for text classification. Basic pre-processing methods were employed. Feature extraction was carried out using Term Frequency-Inverse Document Frequency with Word2Vector. K-Nearest Neighbour (KNN) and Naïve Bayes (NB) algorithms are combined to give KNN Bayesian. Most available systems use a single feature extraction method, whereas our system combines two. The evaluation metrics used were accuracy, precision, recall, and F1-score, and KNN Bayesian performed better than KNN. To further evaluate our model, the Area under the Curve-Receiver Operator Characteristics (AUC-ROC) revealed that the AUC of the KNN Bayesian ROC curve is higher than that of KNN.

31

Kurniati, Florentina Tatrin, Hindriyanto Dwi Purnomo, Irwan Sembiring, and Ade Iriani. "Digital Image Object Detection with GLCM Multi-Degrees and Ensemble Learning." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 8, no. 2 (2024): 321–27. http://dx.doi.org/10.29207/resti.v8i2.5597.

Abstract:

Object detection in digital images has been implemented in various fields. Object detection faces challenges, one of which is rotation, which can cause objects to go unrecognised. A method is needed that extracts features unaffected by rotation, together with reliable ensemble-based classification. The proposal uses the GLCM-MD (Gray-Level Co-occurrence Matrix Multi-Degrees) extraction method with classification using K-Nearest Neighbours (KNN) and Random Forest (RF) learning, as well as a Voting Ensemble (VE) of the two single classifiers. The main goal is to overcome the difficulty of detecting objects when the object is rotated, which results in significant visual variation. In this research, the GLCM method is used to produce features that are stable against rotation. Furthermore, classification methods such as KNN, RF, and KNN-RF fusion using the voting ensemble method are evaluated to improve detection accuracy. The experimental results show that using multi-degrees together with ensemble voting at all degrees can increase accuracy, and the highest accuracy for extraction using multi-degrees is 95.95%. The test results show that using features from various degrees and the ensemble voting method can increase accuracy in detecting rotated objects.

32

MAYANK, GOEL, KUMAR TIWARI ABHISHEK, and SURESH PATIL HARSHAL. "RECOMMENDATION ENGINE FOR B2B CUSTOMERS IN TELECOM BY CUSTOMIZING KNN ALGORITHM." JournalNX - a Multidisciplinary Peer Reviewed Journal Volume 4, Issue 4 (2018): 5–7. https://doi.org/10.5281/zenodo.1413861.

Abstract:

In this paper, we propose a KNN-type method for classification that is aimed at overcoming the above shortcomings. Our method constructs a cross-sell penetration model using Revenue, Usage, and Firmographics data for targeting telecom Enterprise Customers. The value of K is varied for different data and is chosen optimally based on classification accuracy. After the Propensity of an account is determined with the traditional algorithm, weights are assigned to the nearest neighbours and Revenue is determined. https://journalnx.com/journal-article/20150476

33

Praveen, M. R., and M. Saimurugan. "Health Monitoring of a Gear Box Using Vibration Signal Analysis." Applied Mechanics and Materials 813-814 (November 2015): 1012–17. http://dx.doi.org/10.4028/www.scientific.net/amm.813-814.1012.

Abstract:

A gear plays a crucial role in the performance of a gear box. Faults in a gear reduce the gear life, and if a problem arises in the shaft it affects the bearing. The gear box is ultimately affected by these faults. Vibration signals, captured using a piezoelectric accelerometer, carry information about the condition of a gear box. In this paper, features are extracted and classified using the K nearest neighbours (KNN) algorithm in both the time and frequency domains. The effectiveness of KNN in classifying gear faults in the two domains is discussed and compared.

34

Joshuva, A., and V. Sugumaran. "A Comparative Study for Condition Monitoring on Wind Turbine Blade using Vibration Signals through Statistical Features: a Lazy Learning Approach." International Journal of Engineering & Technology 7, no. 4.10 (2018): 190. http://dx.doi.org/10.14419/ijet.v7i4.10.20833.

Abstract:

This study aims to identify whether wind turbine blades are in good or faulty condition and, if faulty, to find which fault condition the blades are subject to. The problem identification is carried out with a machine learning approach using statistical features of vibration signals. In this study, a three-bladed wind turbine was chosen, and faults such as blade cracks, hub-blade loose connection, blade bend, pitch angle twist and blade erosion were considered. The study is carried out in three phases, namely feature extraction, feature selection and feature classification. In phase 1, the required statistical features are extracted from the vibration signals obtained from the wind turbine through an accelerometer. In phase 2, the most dominant or relevant features are selected from the extracted features using the J48 decision tree algorithm. In phase 3, the selected features are classified using machine learning classifiers, namely K-star (KS), locally weighted learning (LWL), nearest neighbour (NN), k-nearest neighbours (kNN), instance based K-nearest using log and Gaussian weight kernels (IBKLG) and lazy Bayesian rules classifier (LBRC). The results were compared with respect to the classification accuracy and the computational time of the classifier.

35

Sugianto, Castaka Agus, and Shandy Tresnawati. "A Covid-19 Sentiment Analysis on Twitter Using K-Nearest Neighbours." Journal of Applied Intelligent System 7, no. 1 (2022): 58–69. http://dx.doi.org/10.33633/jais.v7i1.5984.

Abstract:

In December 2019, an outbreak of a coronavirus (SARS-CoV-2) occurred in the city of Wuhan, China; the disease was later named COVID-19. News of the development of the virus spread through various media, including the well-known platform Twitter, which is widely used to communicate about Covid-19. Information related to Covid-19 circulating in the community can take the form of news or opinions. In this study, the circulating information is classified into three classes: positive, negative or neutral. The method used to predict the text classification on Twitter is K-nearest neighbors (KNN). The dataset was gathered from Twitter using the account name Covid19. First, the dataset was built by crawling data from Twitter. Second, in the text mining stage, the Euclidean distance from each test item to all the training data was calculated. After training, the model was evaluated by assigning classes according to the closest Euclidean distances, and its accuracy was calculated. The highest accuracy obtained was 78%, using a 50:50 sample split with k = 5 and k = 9.

36

T. Nagamani. "Automatic Diagnosis of Parkinson’s Disease using Handwriting Patterns." Journal of Electrical Systems 20, no. 7s (2024): 1395–405. http://dx.doi.org/10.52783/jes.3712.

Abstract:

Parkinson's Disease (PD) is a neuro-degenerative syndrome characterized by motor and non-motor signs, and early detection is crucial for effective intervention. This paper presents a novel approach for PD detection using computer vision and machine learning techniques applied to Spiral-Wave handwriting analysis. The dataset comprises frontal handwritten images obtained through the Spiral-Wave test, capturing subtle motor control differences. Our methodology involves resizing images to a standardized 200x200 pixels, converting them to grayscale, and applying thresholding for improved feature abstraction. Histogram of Oriented Gradients (HOG) is employed to capture shape and texture information. The development of a strong approach for deriving significant features from Spiral-Wave handwriting patterns and the usage of machine learning classifiers for precise PD analysis are the two main goals of this work. The emphasis is on using Random Forest and K-Nearest Neighbours (KNN) classifiers for Spiral and Wave pictures, respectively, in conjunction with the Histogram of Oriented Gradients (HOG) approach for feature extraction. For Spiral images, a Random Forest Classifier is utilized, achieving an accuracy of 86.67%. The classifier's interpretability is enhanced through an analysis of feature importance, revealing critical HOG features for distinguishing between healthy and PD-afflicted patterns. The Wave images are classified using a K-Nearest Neighbours (KNN) model, attaining an accuracy of 76.67%. Performance metrics, including precision, recall, and F1-score, offer a nuanced assessment of the KNN model's capabilities.

37

Swetharani, K., and Prasad Vara. "Design and Implementation of an Efficient Rose Leaf Disease Detection using K-Nearest Neighbours." International Journal of Recent Technology and Engineering (IJRTE) 9, no. 3 (2020): 21–27. https://doi.org/10.35940/ijrte.C4213.099320.

Abstract:

Plants are prone to different diseases caused by multiple reasons like environmental conditions, light, bacteria, and fungus. These diseases always have some physical characteristics on the leaves, stems, and fruit, such as changes in natural appearance, spot, size, etc. Due to similar patterns, distinguishing and identifying category of plant disease is the most challenging task. Therefore, efficient and flawless mechanisms should be discovered earlier so that accurate identification and prevention can be performed to avoid several losses of the entire plant. Therefore, an automated identification system can be a key factor in preventing loss in the cultivation and maintaining high quality of agriculture products. This paper introduces modeling of rose plant leaf disease classification technique using feature extraction process and supervised learning mechanism. The outcome of the proposed study justifies the scope of the proposed system in terms of accuracy towards the classification of different kind of rose plant disease.

38

Hemamalini, D., C. Nagamani, K. Deepesh, and P. Kamal. "Predicting Stock Market Trends using Machine Learning and Deep Learning Algorithms Via Continuous and Binary Data." Shanlax International Journal of Arts, Science and Humanities 11, S3-July (2024): 26–33. http://dx.doi.org/10.34293/sijash.v11is3-july.7915.

Abstract:

This investigation aimed to utilize machine learning algorithms for predicting stock market movements in Iran. The study centered on three specific sectors within the Tehran Stock Exchange: diversified finance, information technology (IT), and metals. Ten years of historical data were analyzed, incorporating ten technical indicators. To achieve this goal, six machine learning models were deployed: Support Vector Regression (Linear), Support Vector Regression (RBF), Linear Regression, Random Forests, K-Nearest Neighbours (KNN), and Decision Trees.

39

Coşkun, İ. B., S. Sertok, and B. Anbaroğlu. "K-NEAREST NEIGHBOUR QUERY PERFORMANCE ANALYSES ON A LARGE SCALE TAXI DATASET: POSTGRESQL VS. MONGODB." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W13 (June 5, 2019): 1531–38. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w13-1531-2019.

Abstract:

The increasing volume of transport network data necessitates the use of a DataBase Management System (DBMS) to store, query and analyse data. There are two main types of DBMS: relational and non-relational. Many different DBMS are available on the market, but only some of them can handle spatial data. Therefore, determining which DBMS to use for operational purposes is of interest to researchers and analysts working in spatial information science. One of the commonly used spatial queries in GIS is the k-Nearest Neighbour (kNN) query of a given point. This paper analyses the performance of the kNN query in PostgreSQL and MongoDB, representing relational and NoSQL DBMS respectively. Two metrics have been investigated to determine the performance: i) spatial accuracy and ii) run time. The Haversine and Vincenty formulas are used to calculate the distance between the point and the determined neighbours, which is then used to determine the spatial accuracy of the DBMS. Sensitivity analyses have been carried out by varying the k value and recording the execution times. The experiments are carried out on New York City’s openly available taxi dataset consisting of millions of taxi pickup and dropoff points. The results indicate that MongoDB outperforms Postgres both in terms of execution time and spatial accuracy regardless of the value of k. In order to facilitate reproducibility of the results, the developed software is shared on GitHub.
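
The kNN query and the Haversine distance mentioned in this abstract can be illustrated with a minimal in-memory sketch. It is a brute-force scan, not the PostgreSQL or MongoDB queries the paper benchmarks; the function names, the coordinate tuples, and the Earth-radius constant are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius, not taken from the paper


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def knn_query(query, points, k):
    """Return the k points closest to `query` as (distance_km, point) pairs, nearest first."""
    scored = [(haversine_km(query[0], query[1], p[0], p[1]), p) for p in points]
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


# Hypothetical (lat, lon) points; the nearest two to the origin are returned.
points = [(0.0, 2.0), (10.0, 10.0), (0.0, 1.0)]
print(knn_query((0.0, 0.0), points, k=2))
```

A full scan costs time linear in the number of points per query, which is why a spatial index inside a DBMS matters at the scale of millions of records.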

40

Fathian, Ramin, Steven Phan, Chester Ho, and Hossein Rouhani. "Face touch monitoring using an instrumented wristband using dynamic time warping and k-nearest neighbours." PLOS ONE 18, no. 2 (2023): e0281778. http://dx.doi.org/10.1371/journal.pone.0281778.

Abstract:

One of the main factors in controlling infectious diseases such as COVID-19 is to prevent touching preoral and prenasal regions. Face touching is a habitual behaviour that occurs frequently. Studies showed that people touch their faces 23 times per hour on average. A contaminated hand could transmit the infection to the body by a facial touch. Since controlling this spontaneous habit is not easy, this study aimed to develop and validate a technology to detect and monitor face touch using dynamic time warping (DTW) and KNN (k-nearest neighbours) based on a wrist-mounted inertial measurement unit (IMU) in a controlled environment and natural environment trials. For this purpose, eleven volunteers were recruited and their hand motions were recorded in controlled and natural environment trials using a wrist-mounted IMU. Then the sensitivity, precision, and accuracy of our developed technology in detecting the face touch were evaluated. It was observed that the sensitivity, precision, and accuracy of the DTW-KNN classifier were 91%, 97%, and 85% in controlled environment trials and 79%, 92%, and 79% in natural environment trials (daily life). In conclusion, a wrist-mounted IMU, widely available in smartwatches, could detect the face touch with high sensitivity, precision, and accuracy and can be used as an ambulatory system to detect and monitor face touching as a high-risk habit in daily life.
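
As a rough illustration of how dynamic time warping can serve as the distance inside a k-nearest-neighbours classifier, the sketch below works on one-dimensional sequences. It is not the authors' implementation: real IMU recordings are multi-axis, and the template sequences, labels, and function names here are hypothetical.

```python
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences (O(len(a) * len(b)))."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dtw_knn_predict(query, templates, k=1):
    """Majority vote over the k labelled templates closest to `query` under DTW.

    `templates` is a list of (sequence, label) pairs.
    """
    ranked = sorted(templates, key=lambda t: dtw_distance(query, t[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical one-axis motion templates
templates = [([0, 1, 2, 3, 2, 1, 0], "touch"), ([0, 0, 0, 0, 0], "no_touch")]
print(dtw_knn_predict([0, 1, 3, 1, 0], templates, k=1))
```

DTW lets a fast gesture match a slow one of the same shape, which a point-by-point Euclidean distance would penalise.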

41

Badr, Hssina, Grota Abdelakder, and Erritali Mohammed. "Recommendation system using the k-nearest neighbors and singular value decomposition algorithms." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 6 (2021): 5541–48. https://doi.org/10.11591/ijece.v11i6.pp5541-5548.

Abstract:

Nowadays, recommendation systems are used successfully to provide items (for example, movies, music, books, news, images) tailored to user preferences. Amongst the existing approaches to recommending adequate content, we use the collaborative filtering approach of finding the information that satisfies the user by using the reviews of other users. These reviews are stored in matrices whose sizes increase exponentially to predict whether an item is relevant or not. The evaluation shows that these systems provide unsatisfactory recommendations because of what we call the cold start factor. Our objective is to apply a hybrid approach to improve the quality of our recommendation system. The benefit of this approach is that it does not require a new algorithm for calculating the predictions. We apply two algorithms: k-nearest neighbours (KNN) and the matrix factorization algorithm of collaborative filtering, which is based on singular value decomposition (SVD). Our combined model has a very high precision and the experiments show that our method can achieve better results.

42

Juwaied, Abdulla, Lidia Jackowska-Strumillo, and Artur Sierszeń. "Enhancing Clustering Efficiency in Heterogeneous Wireless Sensor Network Protocols Using the K-Nearest Neighbours Algorithm." Sensors 25, no. 4 (2025): 1029. https://doi.org/10.3390/s25041029.

Abstract:

Wireless Sensor Networks are formed by tiny, self-contained, battery-powered computers with radio links that can sense their surroundings for events of interest and store and process the sensed data. Sensor nodes wirelessly communicate with each other to relay information to a central base station. Energy consumption is the most critical parameter in Wireless Sensor Networks (WSNs). Network lifespan is directly influenced by the energy consumption of the sensor nodes. All sensors in the network send and receive data from the base station (BS) using different routing protocols and algorithms. These routing protocols use two main types of clustering: hierarchical clustering and flat clustering. Consequently, effective clustering within Wireless Sensor Network (WSN) protocols is essential for establishing secure connections among nodes, ensuring a stable network lifetime. This paper introduces a novel approach to improve energy efficiency, reduce the length of network connections, and increase network lifetime in heterogeneous Wireless Sensor Networks by employing the K-Nearest Neighbours (KNN) algorithm to optimise node selection and clustering mechanisms for four protocols: Low-Energy Adaptive Clustering Hierarchy (LEACH), Stable Election Protocol (SEP), Threshold-sensitive Energy Efficient sensor Network (TEEN), and Distributed Energy-efficient Clustering (DEC). Simulation results obtained using MATLAB (R2024b) demonstrate the efficacy of the proposed K-Nearest Neighbours algorithm, revealing that the modified protocols achieve shorter distances between cluster heads and nodes, reduced energy consumption, and improved network lifetime compared to the original protocols. The proposed KNN-based approach enhances the network’s operational efficiency and security, offering a robust solution for energy management in WSNs.

43

Perihanoglu, G. M., and H. Karaman. "SPATIAL PREDICTION OF RECEIVED SIGNAL STRENGTH FOR CELLULAR COMMUNICATION USING SUPPORT VECTOR MACHINE AND K-NEAREST NEIGHBOURS REGRESSION." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLVIII-4/W9-2024 (March 8, 2024): 291–97. http://dx.doi.org/10.5194/isprs-archives-xlviii-4-w9-2024-291-2024.

Abstract:

Signal strength maps are of great importance for cellular system providers in network planning and operation. Accurate prediction of signal strength is important for solving problems such as link quality. In this study, a Received Signal Strength (RSS) prediction model is proposed for the 900 MHz band in the Van Yüzüncü Yıl University campus environment by using machine learning regression methods such as K-Nearest Neighbours (KNN) and Support Vector Regression (SVR) together with Geographic Information Systems. For the training of this model, signal strength values taken from the RF Spectrum Analyser at different locations and distances were used. In addition, spatial data sets such as the digital elevation model, location of base stations and measurement stations, building heights and location, and land use/cover were used in the model. The effect of these data sets on RSS power is included in the model. The model aims to predict RSS accurately, visualize the estimated signal strength, and analyze the signal field strength coverage. Different kernels from the SVR model such as Polynomial, , and Sigmoid were tested. To increase the success of the model, appropriate parameter values were selected and configured according to SVR and KNN methods. For 900 MHz, the performances of SVR and KNN models were compared and the results of the models were verified using root mean square error (RMSE). Among the measured data, the lowest prediction is found in KNN Manhattan. According to the simulation results, it was observed that the SVR model created with spatial data performs better for signal strength. Finally, the lowest RMSE value (1.71 dB) was obtained from the Sigmoid kernel in the best signal strength estimation SVR model. The SVR model is recommended for campus area signal strength estimation.
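
For readers unfamiliar with the terms, the sketch below shows what KNN regression with a Manhattan distance and an RMSE check look like in their simplest form. It is a minimal illustration under assumed inputs: the feature tuples, the signal values, and the function names are hypothetical and do not come from the study.

```python
import math


def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(u, v))


def knn_regress(query, X, y, k=3):
    """Predict by averaging the targets of the k training points nearest to `query`."""
    order = sorted(range(len(X)), key=lambda i: manhattan(query, X[i]))
    nearest = order[:k]
    return sum(y[i] for i in nearest) / len(nearest)


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical 2-D locations and signal strengths in dBm
X = [(0, 0), (1, 0), (0, 1), (5, 5)]
y = [-60.0, -62.0, -64.0, -90.0]
prediction = knn_regress((0, 0), X, y, k=3)
print(prediction, rmse([-60.0], [prediction]))
```

In practice the features would be scaled first, since an unscaled feature with a large range (for example elevation in metres) would dominate the distance.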

44

Sharma, Saurabh, Vishal Paranjape, and Abhishek Singh. "Improving Real-Time Pollution Prediction Using Dimensionality Reduction and KNN Algorithms." International Journal of Innovative Research in Computer and Communication Engineering 11, no. 11 (2023): 11779–85. http://dx.doi.org/10.15680/ijircce.2023.1111059.

Abstract:

This study introduces an innovative method that uses machine learning to forecast air quality. It uses a large dataset that includes a range of environmental parameters. The methodology incorporates standard scaling, Principal Component Analysis (PCA) for reducing dimensionality, and the K-Nearest Neighbours (KNN) algorithm to improve predictive accuracy. The model demonstrates an overall accuracy of 95%, surpassing conventional approaches. This method provides a strong and effective solution for predicting air quality in real time, making a substantial contribution to proactive environmental management and safeguarding public health. The findings highlight the capacity of sophisticated machine learning methods to tackle the issues related to urban air pollution.

45

Chong, Jun Chen, Nah Yi Sim, Chia Wei Khoh, and Law Teng Yi. "Phishing Attack Detection on URLs Using KNN, RF, DT with GA and K-fold Cross Validation Approach." International Journal of Research and Innovation in Social Science IX, no. I (2025): 1623–41. https://doi.org/10.47772/ijriss.2025.9010134.

Abstract:

This research paper presents a comprehensive study on phishing attack detection using machine learning algorithms, covering K-Nearest Neighbours (KNN), Random Forest and Decision Tree methods. Due to the ongoing rise of phishing attacks, methods for detecting them are necessary. This study used a dataset downloaded from Kaggle and extracted various features from each URL in the dataset to generate a derived dataset for more successful detection. It then employs a K-Fold Cross-Validation methodology and Genetic Algorithms to optimise hyperparameters. The results show that both the Random Forest and Decision Tree models achieved an accuracy of 100%, while the KNN model achieved an accuracy of 99.87%. The results underscore the effectiveness of machine learning techniques in enhancing phishing detection capabilities, contributing to improved cybersecurity measures.
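
The K-fold cross-validation step named in this abstract can be sketched as follows for a KNN classifier. This is a simplified illustration, not the study's pipeline: the Genetic Algorithm search is not reproduced, the folds are contiguous (so the data are assumed to be shuffled beforehand), and the toy feature values and function names are hypothetical.

```python
from collections import Counter


def knn_classify(query, X, y, k):
    """Majority vote over the k training points nearest to `query` (squared Euclidean)."""
    order = sorted(range(len(X)), key=lambda i: sum((a - b) ** 2 for a, b in zip(query, X[i])))
    return Counter(y[i] for i in order[:k]).most_common(1)[0][0]


def k_fold_indices(n, folds):
    """Split range(n) into `folds` contiguous test folds of nearly equal size."""
    sizes = [n // folds + (1 if f < n % folds else 0) for f in range(folds)]
    start, out = 0, []
    for size in sizes:
        out.append(list(range(start, start + size)))
        start += size
    return out


def cross_val_accuracy(X, y, k, folds=5):
    """Fraction of samples classified correctly when each fold is held out in turn."""
    correct = 0
    for test_idx in k_fold_indices(len(X), folds):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct += sum(knn_classify(X[i], train_X, train_y, k) == y[i] for i in test_idx)
    return correct / len(X)


# Hypothetical one-feature dataset with two well-separated classes
X = [(0,), (10,), (1,), (11,), (2,), (12,), (3,), (13,), (4,), (14,)]
y = ["legit", "phish"] * 5
print(cross_val_accuracy(X, y, k=3, folds=5))
```

Because every sample is scored by a model that never saw it during training, the averaged accuracy is a less optimistic estimate than accuracy measured on the training data.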

46

Bhayunagiri, IBP, and M. Saifulloh. "Urban footprint extraction derived from worldview-2 satellite imagery by random forest and k-nearest neighbours algorithm." IOP Conference Series: Earth and Environmental Science 1200, no. 1 (2023): 012043. http://dx.doi.org/10.1088/1755-1315/1200/1/012043.

Abstract:

High-resolution spatial data regarding the distribution of urban areas is fundamental to regional spatial planning and monitoring the development of built-up areas. Many researchers have extracted urban footprints using low to medium-resolution satellite imagery. For applications on a global and regional scale, low to medium image resolutions are suitable. Nevertheless, higher image resolution is required on a local scale, down to a small urban area level. The objective of this study is to map built-up land and examine the accuracy of two machine learning algorithms. This investigation employs a novel approach that combines remote sensing technology with machine learning algorithms. We use the Random Forest (RF) and K-Nearest Neighbours (KNN) machine learning algorithms. This study used a high-resolution (0.5 meter) satellite image derived from WorldView-2. We only used three visible channels (Red-Green-Blue) with a 450 – 690 nm wavelength. Integrating remote sensing and machine learning can adequately investigate the urban footprint area. Based on this research, RF performs better than the KNN algorithm. This is shown by the confidence iteration value and the overall accuracy of the RF and KNN algorithms, i.e., 73.32%, 71.99%, 82.08%, and 77.89% respectively. Based on WorldView-2 imagery acquired in 2015, the proportion of urban footprint is still lower than the green area, at 41.75% to 58.24%, especially in the centre of the capital city of Bali Province. Such conditions are undoubtedly different in other urban areas in Bali. In one city area, West Denpasar, almost the entire area is dominated by the urban footprint. Such conditions are a particular concern for the local government in managing future spatial planning regulations. It is recommended that the proportion of green open space remains a priority so that there are no environmental problems in urban areas (e.g., air pollution, flooding due to runoff problems, etc.).

47

ALYAMANI, Amina, and Oleh YASNIY. "CLASSIFICATION OF EEG SIGNAL BY METHODS OF MACHINE LEARNING." Applied Computer Science 16, no. 4 (2020): 56–63. http://dx.doi.org/10.35784/acs-2020-29.

Abstract:

The electroencephalogram (EEG) signals of two healthy subjects, available from the literature, were studied using machine learning methods, namely decision trees (DT), multilayer perceptron (MLP), K-nearest neighbours (kNN), and support vector machines (SVM). Since the data were imbalanced, balancing was performed with the K-means clustering algorithm. The original and balanced data were classified by the four methods mentioned above. It was found that SVM showed the best result for both datasets in terms of accuracy. MLP and kNN produce comparable results that are almost the same. DT accuracies are the lowest for the given dataset, with 83.82% for the original data and 61.48% for the balanced data.

48

Krzywicki, Tomasz. "Weather and a part of day recognition in the photos using a KNN methodology." Technical Sciences 4, no. 21 (2018): 291–302. http://dx.doi.org/10.31648/ts.4174.

Abstract:

This article presents a proposal for recognizing the weather and part of the day in digital photos encoded in the bitmap format. It is based on the author's own horizon edge-detection algorithm, which demarcates the sky, and on the k-nearest neighbours algorithm, which classifies the time of day in the picture as “day” or “night” and the weather as “sunny” or “cloudy”. To verify the effectiveness of the classification, the Internal Bagging-5 model was applied. The picture data for the study were prepared by the author. To test the method in a different location, data from the Internet was used.

49

Hssina, Badr, Abdelkader Grota, and Mohammed Erritali. "Recommendation system using the k-nearest neighbors and singular value decomposition algorithms." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 6 (2021): 5541. http://dx.doi.org/10.11591/ijece.v11i6.pp5541-5548.

Abstract:

Nowadays, recommendation systems are used successfully to provide items (for example, movies, music, books, news, images) tailored to user preferences. Amongst the existing approaches to recommending adequate content, we use the collaborative filtering approach of finding the information that satisfies the user by using the reviews of other users. These reviews are stored in matrices whose sizes increase exponentially to predict whether an item is relevant or not. The evaluation shows that these systems provide unsatisfactory recommendations because of what we call the cold start factor. Our objective is to apply a hybrid approach to improve the quality of our recommendation system. The benefit of this approach is that it does not require a new algorithm for calculating the predictions. We apply two algorithms: k-nearest neighbours (KNN) and the matrix factorization algorithm of collaborative filtering, which is based on singular value decomposition (SVD). Our combined model has a very high precision and the experiments show that our method can achieve better results.

50

Lumbantobing, Hariman, Irma Ratna Avianti, Kukuh Harisapto, and Suharjito Suharjito. "Flood Prediction based on Weather Parameters in Jakarta using K-Nearest Neighbours Algorithm." Eduvest - Journal of Universal Studies 4, no. 6 (2024): 5055–65. http://dx.doi.org/10.59188/eduvest.v4i6.1339.

Abstract:

Flooding is a difficult and common hazard in Indonesia, particularly in Jakarta during the rainy season. Floods have been the subject of several endeavours, ranging from discovering the causes to reducing their impacts. Floods cause significant damage to infrastructure, the social economy, and human lives. The government continues to create reliable flood risk maps and plans for long-term flood risk management. According to data from Jakarta Flood Monitoring, 12 sub-districts and 26 urban villages were hit by floods each year between 2016 and 2020, with an average flood length of nearly 2 days. The flood tendency in Jakarta decreased from 2018 to 2019, but increased in 2020. Floods have a variety of causes, including weather, geography, and human actions such as deforestation. Strong flood prediction is required for disaster management; however, this can be difficult owing to changing weather conditions. This study focuses on flood prediction in Jakarta based on weather parameters, utilising machine learning techniques to provide accurate and real-time predictions. K-Nearest Neighbours (KNN) is the algorithm employed to forecast the areas that will be affected by floods. Across values of k from 2 to 9, the best performance was obtained at k=7, with an accuracy of 92.25%, precision of 88.89%, recall of 92.25%, and an F1-measure of 89.52%. The integration of machine learning algorithms encompassing multiple weather variables provides significant utility for comprehensive flood prediction and early warning systems in flood disaster mitigation.
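
The four scores quoted in this abstract have standard definitions, sketched below for the binary case. The abstract does not state which averaging scheme the study used, so this is only an illustration; the label vectors and the function name are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = flood, 0 = no flood
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(y_true, y_pred))
```

Comparing these scores for each candidate k, ideally on held-out data, is one common way to arrive at a choice such as k=7.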
