Journal articles on the topic 'K-Nearest Neighbors algorithm'

Consult the top 50 journal articles for your research on the topic 'K-Nearest Neighbors algorithm.'


1

Zhai, Junhai, Jiaxing Qi, and Sufang Zhang. "An instance selection algorithm for fuzzy K-nearest neighbor." Journal of Intelligent & Fuzzy Systems 40, no. 1 (January 4, 2021): 521–33. http://dx.doi.org/10.3233/jifs-200124.

Abstract:
The condensed nearest neighbor (CNN) is a pioneering instance selection algorithm for 1-nearest neighbor. Many variants of CNN for K-nearest neighbor have been proposed by different researchers. However, few studies were conducted on condensed fuzzy K-nearest neighbor. In this paper, we present a condensed fuzzy K-nearest neighbor (CFKNN) algorithm that starts from an initial instance set S and iteratively selects informative instances from the training set T, moving them from T to S. Specifically, CFKNN consists of three steps. First, for each instance x ∈ T, it finds the K nearest neighbors in S and calculates the fuzzy membership degrees of the K nearest neighbors using S rather than T. Second, it computes the fuzzy membership degrees of x using the fuzzy K-nearest neighbor algorithm. Finally, it calculates the information entropy of x and selects an instance according to the calculated value. Extensive experiments on 11 datasets are conducted to compare CFKNN with four state-of-the-art algorithms (CNN, edited nearest neighbor (ENN), Tomek links, and one-sided selection) regarding the number of selected instances, the testing accuracy, and the compression ratio. The experimental results show that CFKNN provides excellent performance and outperforms the other four algorithms.
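
The selection loop described above is easy to prototype. Below is a minimal sketch, assuming Euclidean distance, a Keller-style inverse-distance fuzzy membership rule, integer-coded labels, and a free entropy threshold; all names are illustrative, not the authors' code.

```python
import numpy as np

def fuzzy_memberships(dists, labels, n_classes, m=2.0, eps=1e-12):
    """Fuzzy class memberships of a point from its K nearest neighbors,
    weighted by inverse distance (Keller-style fuzzy KNN)."""
    w = 1.0 / (dists ** (2.0 / (m - 1.0)) + eps)
    u = np.zeros(n_classes)
    for wi, yi in zip(w, labels):
        u[yi] += wi
    return u / u.sum()

def cfknn(X, y, n_classes, K=3, entropy_threshold=0.5, n_init=10, seed=0):
    """Condensed fuzzy KNN instance selection (sketch): move high-entropy,
    i.e. informative, instances from the pool T into the selected set S."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    S = list(idx[:n_init])                  # initial instance set
    T = list(idx[n_init:])                  # remaining training pool
    for i in T:
        Sx, Sy = X[S], y[S]
        d = np.linalg.norm(Sx - X[i], axis=1)
        nn = np.argsort(d)[:K]              # K nearest neighbors in S, not T
        u = fuzzy_memberships(d[nn], Sy[nn], n_classes)
        entropy = -np.sum(u * np.log(u + 1e-12))
        if entropy > entropy_threshold:     # uncertain -> informative -> keep
            S.append(i)
    return np.array(S)
```
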
2

Houben, I., L. Wehenkel, and M. Pavella. "Genetic Algorithm Based k Nearest Neighbors." IFAC Proceedings Volumes 30, no. 6 (May 1997): 1075–80. http://dx.doi.org/10.1016/s1474-6670(17)43506-3.

3

Onyezewe, Anozie, Armand F. Kana, Fatimah B. Abdullahi, and Aminu O. Abdulsalami. "An Enhanced Adaptive k-Nearest Neighbor Classifier Using Simulated Annealing." International Journal of Intelligent Systems and Applications 13, no. 1 (February 8, 2021): 34–44. http://dx.doi.org/10.5815/ijisa.2021.01.03.

Abstract:
The k-Nearest Neighbor classifier is a simple and widely applied data classification algorithm that does well in real-world applications. The overall classification accuracy of the k-Nearest Neighbor algorithm largely depends on the choice of the number of nearest neighbors (k). Using a constant k value does not always yield the best solutions, especially for real-world datasets with an irregular class and density distribution of data points, as it totally ignores the class and density distribution of a test point's k-environment or neighborhood. A resolution to this problem is to dynamically choose k for each test instance to be classified. However, given a large dataset, it becomes very taxing to maximize the k-Nearest Neighbor performance by tuning k. This work proposes the use of Simulated Annealing, a metaheuristic search algorithm, to select an optimal k, thus eliminating the prospect of an exhaustive search for the optimal k. The results obtained in four different classification tasks demonstrate a significant improvement in computational efficiency over k-Nearest Neighbor methods that perform an exhaustive search for k, as accurate nearest neighbors are returned faster for k-Nearest Neighbor classification, thus reducing the computation time.
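
A toy illustration of the idea: simulated annealing can walk over candidate odd values of k, scoring each by cross-validation, instead of scanning every k exhaustively. This sketch assumes scikit-learn and generic annealing parameters (initial temperature, geometric cooling); it is not the paper's implementation.

```python
import math
import random

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def anneal_k(X, y, k_max=49, T0=1.0, alpha=0.9, steps=60, seed=0):
    """Select k for KNN by simulated annealing over odd k values (sketch)."""
    random.seed(seed)

    def score(k):
        # 5-fold cross-validated accuracy as the objective to maximize
        return cross_val_score(KNeighborsClassifier(n_neighbors=k),
                               X, y, cv=5).mean()

    k = random.randrange(1, k_max + 1, 2)   # random odd starting point
    cur = score(k)
    best_k, best = k, cur
    T = T0
    for _ in range(steps):
        cand = min(max(k + random.choice([-2, 2]), 1), k_max)  # local move
        s = score(cand)
        # accept improvements always, worse moves with Boltzmann probability
        if s >= cur or random.random() < math.exp((s - cur) / T):
            k, cur = cand, s
            if s > best:
                best_k, best = cand, s
        T *= alpha                          # geometric cooling
    return best_k, best
```
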
4

Piegl, Les A., and Wayne Tiller. "Algorithm for finding all k nearest neighbors." Computer-Aided Design 34, no. 2 (February 2002): 167–72. http://dx.doi.org/10.1016/s0010-4485(00)00141-x.

5

Tu, Ching Ting, Hsiau Wen Lin, Hwei-Jen Lin, and Yue Shen Li. "Super-Resolution Based on Clustered Examples." International Journal of Pattern Recognition and Artificial Intelligence 30, no. 06 (May 9, 2016): 1655015. http://dx.doi.org/10.1142/s0218001416550156.

Abstract:
In this paper, we propose an improved version of the neighbor embedding super-resolution (SR) algorithm proposed by Chang et al. [Super-resolution through neighbor embedding, in Proc. 2004 IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR), Vol. 1 (2004), pp. 275–282]. The neighbor embedding SR algorithm requires intensive computational time when finding the K nearest neighbors for the input patch in a huge set of training samples. We tackle this problem by clustering the training samples into a number of clusters: we first find for the input patch the nearest cluster center, and then find the K nearest neighbors in the corresponding cluster. In contrast to Chang's method, which uses Euclidean distance to find the K nearest neighbors of a low-resolution patch, we define a similarity function and use it to find the K most similar neighbors of a low-resolution patch. We then use locally linear embedding (LLE) [S. T. Roweis and L. K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290(5500) (2000) 2323–2326] to find optimal coefficients, with which the linear combination of the K most similar neighbors best approximates the input patch. These coefficients are then used to form a linear combination of the K high-frequency patches corresponding to the K respective low-resolution patches (the K most similar neighbors). The resulting high-frequency patch is then added to the enlarged (up-sampled) version of the input patch. Experimental results show that the proposed clustering scheme efficiently reduces computational time without significantly affecting performance.
6

Song, Yunsheng, Xiaohan Kong, and Chao Zhang. "A Large-Scale k -Nearest Neighbor Classification Algorithm Based on Neighbor Relationship Preservation." Wireless Communications and Mobile Computing 2022 (January 7, 2022): 1–11. http://dx.doi.org/10.1155/2022/7409171.

Abstract:
Owing to the absence of hypotheses about the underlying distribution of the data and its strong generalization ability, the k-nearest neighbor (kNN) classification algorithm is widely applied in face recognition, text classification, emotional analysis, and other fields. However, kNN needs to compute the similarity between the unlabeled instance and all the training instances during the prediction process, so it is difficult to apply to large-scale data. To overcome this difficulty, an increasing number of acceleration algorithms based on data partition have been proposed; however, they lack theoretical analysis of the effect of data partition on classification performance. This paper makes a theoretical analysis of this effect using empirical risk minimization and proposes a large-scale k-nearest neighbor classification algorithm based on neighbor relationship preservation. The process of searching the nearest neighbors is converted into a constrained optimization problem, which yields an estimate of the difference in the objective function value under the optimal solution with and without data partition. According to this estimate, minimizing the similarity of instances in different divided subsets can largely reduce the effect of data partition. The minibatch k-means clustering algorithm is chosen to perform the data partition for its effectiveness and efficiency. Finally, the nearest neighbors of the test instance are continuously searched in a set generated by successively merging candidate subsets until they no longer change, where the candidate subsets are selected based on the similarity between the test instance and the cluster centers. Experimental results on public datasets show that the proposed algorithm largely keeps the same nearest neighbors as the original kNN classification algorithm, with no significant difference in classification accuracy, and achieves better results than two state-of-the-art algorithms.
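
A rough sketch of the partition-and-merge search described above, using scikit-learn's MiniBatchKMeans for the partition step. The stopping rule (merge the clusters nearest to the query until the k-neighbor set stabilizes) follows the abstract; the cluster count and distance choices are assumptions.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def partitioned_knn_query(X, y, q, k=5, n_clusters=20, seed=0):
    """Partition-based kNN search (sketch): merge the clusters closest to the
    query one by one and stop once the k-neighbor set no longer changes."""
    km = MiniBatchKMeans(n_clusters=n_clusters, random_state=seed,
                         n_init=3).fit(X)
    order = np.argsort(np.linalg.norm(km.cluster_centers_ - q, axis=1))
    pool = np.empty(0, dtype=int)
    prev = None
    for c in order:
        pool = np.concatenate([pool, np.flatnonzero(km.labels_ == c)])
        d = np.linalg.norm(X[pool] - q, axis=1)
        nn = pool[np.argsort(d)[:k]]
        if prev is not None and set(nn) == set(prev):
            break                       # neighbor set stable: stop merging
        prev = nn
    return prev, y[prev]
```
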
7

Prasetio, Rizki Tri, Ali Akbar Rismayadi, and Iedam Fardian Anshori. "Implementasi Algoritma Genetika pada k-nearest neighbours untuk Klasifikasi Kerusakan Tulang Belakang." Jurnal Informatika 5, no. 2 (September 29, 2018): 186–94. http://dx.doi.org/10.31311/ji.v5i2.4123.

Abstract:
Spinal disorders are experienced by about two-thirds of adults and are the second most common complaint after headaches. Classifying spinal disorders is difficult because it requires a radiologist to analyze Magnetic Resonance Imaging (MRI) images. A Computer Aided Diagnosis (CAD) system can help radiologists detect abnormalities in the spine more effectively. The vertebral column dataset has three classes of spinal disorder (herniated disk, spondylolisthesis, and normal) extracted from MRI images. The dataset is processed in five experiments using split validation with varying proportions of training and testing data. This study proposes implementing a genetic algorithm within the k-nearest neighbours algorithm to improve the accuracy of spinal disorder classification; the genetic algorithm performs feature selection and parameter optimization for k-nearest neighbours. The results show that the proposed method yields a significant improvement, with an average accuracy of 93% over the five experiments, compared with only 82.54% for the plain k-nearest neighbours algorithm. Keywords: genetic algorithm, k-nearest neighbours, spinal disorder, vertebral column.
8

Prasetio, Rizki Tri, Ali Akbar Rismayadi, and Iedam Fardian Anshori. "Implementasi Algoritma Genetika pada k-nearest neighbours untuk Klasifikasi Kerusakan Tulang Belakang." Jurnal Informatika 5, no. 2 (September 29, 2018): 186–94. http://dx.doi.org/10.31294/ji.v5i2.4123.

9

Li, Xiaoguang. "Research and Implementation of Digital Media Recommendation System Based on Semantic Classification." Advances in Multimedia 2022 (March 27, 2022): 1–6. http://dx.doi.org/10.1155/2022/4070827.

Abstract:
In order to study a digital media recommendation system based on semantic classification, the CF-LFMC algorithm based on semantic classification is proposed. First, the traditional algorithm is analyzed; to address some of its problems, a clustering algorithm model based on term meaning and collaborative filtering is designed by combining the collaborative filtering algorithm with a project-based clustering algorithm, improving the cold start and timeliness of the traditional algorithm before sparse data are analyzed. Second, experiments compare the performance of three cosine similarity calculation methods for the IBCF algorithm, the performance of CF-LFMC against IBCF, and the performance of CF-LFMC with and without the time function. With the clustering value N = 10 in the CF-LFMC algorithm, the MAE values of both algorithms decrease as the number of nearest neighbors k increases. When the number of nearest neighbors is small, the MAE values of the two algorithms are close to each other; as the number of nearest neighbors grows, the accuracy of the algorithm does not improve significantly while its computational cost increases, so a nearest-neighbor count between 20 and 30 is more appropriate. CF-LFMC shows better accuracy, and the CF-LFMC algorithm improved by the time function further improves accuracy, outperforming the traditional algorithm.
10

Prasad, Devendra, Sandip Kumar Goyal, Avinash Sharma, Amit Bindal, and Virendra Singh Kushwah. "System Model for Prediction Analytics Using K-Nearest Neighbors Algorithm." Journal of Computational and Theoretical Nanoscience 16, no. 10 (October 1, 2019): 4425–30. http://dx.doi.org/10.1166/jctn.2019.8536.

Abstract:
Machine Learning is a growing area in computer science in today's era. This article focuses on prediction analysis using the K-Nearest Neighbors (KNN) machine learning algorithm. Data in the dataset are processed, analyzed, and predicted using the specified algorithm. Various machine learning algorithms are introduced, and their pros and cons are discussed. A detailed study of the KNN algorithm is given, and it is implemented on the specified data with certain parameters. The research work elucidates prediction analysis and explicates the prediction of restaurant quality.
11

Sugesti, Annisa, Moch Abdul Mukid, and Tarno Tarno. "PERBANDINGAN KINERJA MUTUAL K-NEAREST NEIGHBOR (MKNN) DAN K-NEAREST NEIGHBOR (KNN) DALAM ANALISIS KLASIFIKASI KELAYAKAN KREDIT." Jurnal Gaussian 8, no. 3 (August 30, 2019): 366–76. http://dx.doi.org/10.14710/j.gauss.v8i3.26681.

Abstract:
Credit feasibility analysis is important for lenders to avoid risk amid the increase in credit applications. This analysis can be carried out with classification techniques; the technique used in this research is instance-based classification. These techniques tend to be simple but depend strongly on the choice of K, the number of nearest neighbors considered when classifying new data. A small value of K is very sensitive to outliers. This weakness can be overcome using an algorithm able to handle outliers, one of which is Mutual K-Nearest Neighbor (MKNN). MKNN removes outliers first, then predicts new observation classes based on the majority class among their mutual nearest neighbors. The algorithm is compared with KNN without outliers. The model is evaluated by 10-fold cross-validation, and classification performance is measured by the Geometric Mean (G-Mean) of sensitivity and specificity. Based on the analysis, the optimal value of K is 9 for MKNN and 3 for KNN; the highest G-Mean produced by KNN is 0.718, while MKNN produces 0.702. The best alternative for classifying credit feasibility in this study is the K-Nearest Neighbor (KNN) algorithm with K = 3. Keywords: Classification, Credit, MKNN, KNN, G-Mean.
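
A minimal sketch of the mutual-neighbor idea: drop training instances that are not mutual neighbors of any of their own k nearest neighbors, then classify by majority vote. The brute-force distance matrix and the exact filtering rule are simplifications for illustration, not the paper's procedure; X and y are assumed to be NumPy arrays.

```python
import numpy as np

def mutual_knn_filter(X, y, k=9):
    """Keep only instances that appear in the k-NN lists of at least one of
    their own k nearest neighbors; the rest are treated as outliers."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    knn = np.argsort(D, axis=1)[:, :k]               # k-NN list per instance
    keep = [i for i in range(n)
            if any(i in knn[j] for j in knn[i])]     # has a mutual neighbor
    return X[keep], y[keep]

def classify(Xs, ys, q, k=9):
    """Majority vote among the query's k nearest (filtered) neighbors."""
    d = np.linalg.norm(Xs - q, axis=1)
    nn = np.argsort(d)[:k]
    vals, counts = np.unique(ys[nn], return_counts=True)
    return vals[np.argmax(counts)]
```
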
12

Salvador–Meneses, Jaime, Zoila Ruiz–Chavez, and Jose Garcia–Rodriguez. "Compressed kNN: K-Nearest Neighbors with Data Compression." Entropy 21, no. 3 (February 28, 2019): 234. http://dx.doi.org/10.3390/e21030234.

Abstract:
The kNN (k-nearest neighbors) classification algorithm is one of the most widely used non-parametric classification methods; however, it is limited by memory consumption tied to the size of the dataset, which makes it impractical for large volumes of data. Variations of this method have been proposed, such as condensed kNN, which divides the training dataset into clusters to be classified; other variations reduce the input dataset before applying the algorithm. This paper presents a variation of the kNN algorithm, of the structure-less NN type, for categorical data. Categorical data, by their nature, can be compressed to decrease the memory required at classification time. The method adds a compression phase before applying the algorithm to the compressed data, which allows the whole dataset to be kept in memory and leads to a considerable reduction in the amount of memory required. Experiments and tests carried out on known datasets show a reduction in the volume of information stored in memory while maintaining classification accuracy. They also show a slight decrease in processing time, because the information is decompressed in real time (on the fly) while the algorithm is running.
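
The compression idea can be illustrated by bit-packing small categorical codes into one integer per row and computing a mismatch-count distance field by field, decompressing on the fly. This is a sketch under those assumptions, not the paper's codec; the bit widths and the Hamming-style distance are free choices here.

```python
import numpy as np

def pack(rows, bits_per_attr):
    """Pack each row of small categorical codes into one uint64 per row."""
    packed = np.zeros(len(rows), dtype=np.uint64)
    shift = 0
    for j, b in enumerate(bits_per_attr):
        packed |= rows[:, j].astype(np.uint64) << np.uint64(shift)
        shift += b
    return packed

def knn_packed(packed, y, q_packed, bits_per_attr, k=5):
    """Mismatch-count kNN computed directly on the packed representation,
    extracting each attribute field on the fly."""
    dist = np.zeros(len(packed), dtype=np.int32)
    shift = 0
    for b in bits_per_attr:
        mask = np.uint64((1 << b) - 1)
        field = (packed >> np.uint64(shift)) & mask
        qf = (q_packed >> np.uint64(shift)) & mask
        dist += (field != qf).astype(np.int32)   # attribute mismatch count
        shift += b
    nn = np.argsort(dist)[:k]
    vals, counts = np.unique(y[nn], return_counts=True)
    return vals[np.argmax(counts)]

# Example: three categorical attributes needing 2, 3, and 2 bits respectively.
X = np.array([[1, 4, 0], [1, 5, 3], [0, 2, 3]])
y = np.array([0, 1, 1])
bits = [2, 3, 2]
packed = pack(X, bits)
q = pack(np.array([[1, 4, 3]]), bits)[0]
print(knn_packed(packed, y, q, bits, k=2))
```
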
13

Dong, Yuan, Shang, Ye, and Zhang. "Direction-Aware Continuous Moving K-Nearest-Neighbor Query in Road Networks." ISPRS International Journal of Geo-Information 8, no. 9 (August 29, 2019): 379. http://dx.doi.org/10.3390/ijgi8090379.

Abstract:
Continuous K-nearest neighbor (CKNN) queries on moving objects retrieve the K nearest neighbors of all points along a query trajectory. They mainly deal with the moving objects that are nearest to the moving user within a specified period of time. Existing methods for CKNN queries often recommend K objects to users based on distance, but they do not consider the moving directions of objects in a road network. Although a few CKNN query methods consider the movement directions of moving objects in Euclidean space, no efficient direction determination algorithm has been applied to CKNN queries over data streams in spatial road networks until now. In order to find the top K nearest objects that move towards the query object within a period of time, this paper presents a novel algorithm for direction-aware continuous moving K-nearest neighbor (DACKNN) queries in road networks. In this method, the objects' azimuth information is adopted to determine the moving direction, ensuring that the moving objects in the result set head towards the query object. In addition, we evaluate the DACKNN query algorithm via comprehensive tests on the Los Angeles network TIGER/LINE data and compare it with other existing algorithms. The comparative test results demonstrate that our algorithm performs direction-aware CKNN queries accurately and efficiently.
14

Wang, Bingming, Shi Ying, and Zhe Yang. "A Log-Based Anomaly Detection Method with Efficient Neighbor Searching and Automatic K Neighbor Selection." Scientific Programming 2020 (June 2, 2020): 1–17. http://dx.doi.org/10.1155/2020/4365356.

Abstract:
Using the k-nearest neighbor (kNN) algorithm within supervised learning to detect anomalies can give more accurate results. However, kNN is inefficient at finding k neighbors in large-scale log data; at the same time, log data are imbalanced in quantity, so selecting proper k neighbors for different data distributions is a challenge. In this paper, we propose a log-based anomaly detection method with efficient neighbor searching and automatic selection of k neighbors. First, we propose a neighbor search method based on minhash and an MVP-tree: the minhash algorithm groups similar logs into the same bucket, and an MVP-tree model is built for the samples in each bucket. In this way, we reduce the effort of distance calculation and the number of neighbor samples that need to be compared, improving the efficiency of finding neighbors. For selecting k neighbors, we propose an automatic method based on the Silhouette Coefficient, which selects proper k neighbors to improve the accuracy of anomaly detection. Our method is verified on six different types of log data to prove its universality and feasibility.
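
A small sketch of the bucketing step: a minhash-style key groups logs with similar token sets, so neighbor candidates are compared only within a bucket. The salted-MD5 hash family and bucket width are arbitrary choices here, and the paper's MVP-tree within each bucket is omitted.

```python
import hashlib

def minhash_bucket(tokens, num_hashes=4):
    """Bucket key built from the minimum of several salted hashes over the
    token set; similar token sets tend to collide on the same key."""
    key = []
    for i in range(num_hashes):
        h = min(int(hashlib.md5(f"{i}:{t}".encode()).hexdigest(), 16)
                for t in tokens)
        key.append(h % 1024)
    return tuple(key)

# Group log lines (as token sets) into buckets, then compare a query only
# against logs that fall into its own bucket.
logs = [("disk", "error", "io"), ("disk", "error", "timeout"), ("login", "ok")]
buckets = {}
for idx, toks in enumerate(logs):
    buckets.setdefault(minhash_bucket(toks), []).append(idx)
print(buckets)
```
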
15

Pandey, Shubham, Vivek Sharma, and Garima Agrawal. "Modification of KNN Algorithm." International Journal of Engineering and Computer Science 8, no. 11 (November 28, 2019): 24869–77. http://dx.doi.org/10.18535/ijecs/v8i11.4383.

Abstract:
K-Nearest Neighbor (KNN) classification is one of the most fundamental and simple classification methods, and among the most frequently used classification algorithms when there is little or no prior knowledge about the distribution of the data. In this paper, a modification is made to improve the performance of KNN. The main idea of KNN is to use a set of robust neighbors from the training data. The modified KNN proposed in this paper improves on traditional KNN in both robustness and performance. Inspired by the traditional KNN algorithm, the main idea is to classify an input query according to the most frequent tag among its neighbors, with the tag closest to the new tuple carrying the greatest say. The proposed modified KNN can be considered a kind of weighted KNN, in which the query label is approximated by weighting the query's neighbors: the procedure computes the frequency of each label among the neighbors, with each label's count multiplied by a factor inversely proportional to the distance between the new tuple and the neighbor. The proposed method is evaluated on several standard UCI data sets, and experiments show a significant improvement in the performance of the KNN method.
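
The voting rule described above amounts to inverse-distance-weighted label counts; a minimal sketch, assuming Euclidean distance and NumPy arrays (illustrative, not the authors' code):

```python
import numpy as np

def modified_knn(X, y, q, k=5, eps=1e-9):
    """Weighted KNN vote: each neighbor's label gets a weight inversely
    proportional to its distance from the query, so the closest neighbor
    has the largest say."""
    d = np.linalg.norm(X - q, axis=1)
    nn = np.argsort(d)[:k]
    scores = {}
    for i in nn:
        scores[y[i]] = scores.get(y[i], 0.0) + 1.0 / (d[i] + eps)
    return max(scores, key=scores.get)
```
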
16

Sahu, Santosh Kumar, Sanjay Kumar Jena, and Manish Verma. "K-NN Based Outlier Detection Technique on Intrusion Dataset." International Journal of Knowledge Discovery in Bioinformatics 7, no. 1 (January 2017): 58–70. http://dx.doi.org/10.4018/ijkdb.2017010105.

Abstract:
Outliers in a database are objects that deviate from the rest of the dataset by some measure. The Nearest Neighbor Outlier Factor is considered to measure the degree of outlier-ness of an object in the dataset. Unlike other methods such as the Local Outlier Factor, this approach considers a point's relationship to both its neighbors and its reverse neighbors before an object comes into consideration. We observe that the GBBK algorithm, which is based on K-NN, uses quicksort to find the k nearest neighbors, taking O(N log N) time. The proposed method instead searches K times, finding the k nearest neighbors in O(KN) time (k << log N); as a result, it improves the time complexity. The NSL-KDD and Fisher iris datasets are used, and experimental results are compared with the GBBK method. Both methods give the same result, but the proposed method takes less computation time.
17

Jędrzejewski, Krzysztof, and Maurycy Zamorski. "Performance of K-Nearest Neighbors Algorithm in Opinion Classification." Foundations of Computing and Decision Sciences 38, no. 2 (June 1, 2013): 97–110. http://dx.doi.org/10.2478/fcds-2013-0002.

Abstract:
This paper presents another approach to determining a document's semantic orientation. It includes a brief introduction describing the areas of application of opinion mining and some definitions useful in the field. The most commonly used methods are mentioned and some alternative ones are described. Experimental results are presented which show that the kNN algorithm gives similar results to a proportional algorithm.
18

Karabulut, Bergen, Güvenç Arslan, and Halil Murat ÜNVER. "A Weighted Similarity Measure for k-Nearest Neighbors Algorithm." Celal Bayar Üniversitesi Fen Bilimleri Dergisi 15, no. 4 (December 30, 2019): 393–400. http://dx.doi.org/10.18466/cbayarfbe.618964.

19

Martinasek, Z., V. Zeman, L. Malina, and J. Martinasek. "k-Nearest Neighbors Algorithm in Profiling Power Analysis Attacks." Radioengineering 25, no. 2 (April 14, 2016): 365–82. http://dx.doi.org/10.13164/re.2016.0365.

20

Shi, Zhan. "Improving k-Nearest Neighbors Algorithm for Imbalanced Data Classification." IOP Conference Series: Materials Science and Engineering 719 (January 8, 2020): 012072. http://dx.doi.org/10.1088/1757-899x/719/1/012072.

21

Yirng-An Chen, Youn-Long Lin, and Long-Wen Chang. "A systolic algorithm for the k-nearest neighbors problem." IEEE Transactions on Computers 41, no. 1 (1992): 103–8. http://dx.doi.org/10.1109/12.123385.

22

Mangalova, E., and E. Agafonov. "Wind power forecasting using the k-nearest neighbors algorithm." International Journal of Forecasting 30, no. 2 (April 2014): 402–6. http://dx.doi.org/10.1016/j.ijforecast.2013.07.008.

23

Mladenović, Dušan, Slađana Janković, Stefan Zdravković, Snežana Mladenović, and Ana Uzelac. "Night Traffic Flow Prediction Using K-Nearest Neighbors Algorithm." Operational Research in Engineering Sciences: Theory and Applications 5, no. 1 (April 20, 2022): 152–68. http://dx.doi.org/10.31181/oresta240322136m.

Abstract:
The aim of this research is to predict the total and average monthly night traffic on state roads in Serbia, using the technique of supervised machine learning. A set of data on total and average monthly night traffic has been used for training and testing of predictive models. The data set was obtained by counting the traffic on the roads in Serbia, in the period from 2011 to 2020. Various classification and regression prediction models have been tested using the Weka software tool on the available data set and the models based on the K-Nearest Neighbors algorithm, as well as models based on regression trees, have shown the best results. Furthermore, the best model has been chosen by comparing the performances of models. According to all the mentioned criteria, the model based on the K-Nearest Neighbors algorithm has shown the best results. Using this model, the prediction of the total and average nightly traffic per month for the following year at the selected traffic counting locations has been made.
24

Alsariera, Yazan Ahmad. "Detecting block ciphers generic attacks: An instance-based machine learning method." International Journal of ADVANCED AND APPLIED SCIENCES 9, no. 5 (May 2022): 60–68. http://dx.doi.org/10.21833/ijaas.2022.05.007.

Abstract:
Cryptography facilitates selective communication through encryption of messages and data. Block-cipher processing is one of the prominent methods for modern cryptographic symmetric encryption schemes. The rise in attacks on block ciphers led to the development of more difficult encryption schemes; however, attackers can decrypt block ciphers through generic attacks given sufficient time and computing power. Recent research has applied machine learning classification algorithms to develop intrusion detection systems that detect multiple types of attacks. These intrusion detection systems are limited by misclassifying generic attacks and suffer reduced effectiveness when evaluated for detecting generic attacks only. Hence, this study introduces k-nearest neighbors, an instance-based machine learning classification algorithm, for the detection of generic attacks on block ciphers. The value of k was varied (1, 3, 5, 7, and 9), and multiple nearest-neighbor classification models were developed and evaluated using two distance functions (Manhattan and Euclidean) for classifying between generic attacks and normal network packets. All nearest-neighbor models using the Manhattan distance function performed better than their Euclidean counterparts. The 1-nearest-neighbor (Manhattan) model had the highest overall accuracy of 99.6%, a generic-attack detection rate of 99.5% (matching the 5-, 7-, and 9-nearest-neighbor models), and a false alarm rate of 0.0003, which is the same for all Manhattan nearest-neighbor models. These instance-based methods performed better than some existing methods that even implemented an ensemble of deep-learning algorithms. Therefore, an instance-based method is recommended for detecting generic attacks on block ciphers.
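
The experimental grid (k in {1, 3, 5, 7, 9} crossed with Manhattan and Euclidean distances) is straightforward to reproduce with scikit-learn. A sketch on generic feature/label arrays, where the train/test split is an assumption and the labels are taken to mark generic-attack versus normal packets:

```python
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def compare_metrics(X, y, ks=(1, 3, 5, 7, 9)):
    """Fit KNN detectors with Manhattan (p=1) and Euclidean (p=2) distances
    for several k and report held-out accuracy."""
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
    for k in ks:
        for p, name in [(1, "manhattan"), (2, "euclidean")]:
            clf = KNeighborsClassifier(n_neighbors=k, p=p).fit(Xtr, ytr)
            acc = accuracy_score(yte, clf.predict(Xte))
            print(f"k={k} {name}: accuracy={acc:.4f}")
```
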
25

Mansor, Muhammad Naufal, Ahmad Kadri Junoh, Wan Suhana Wan Daud, Wan Zuki Azman Wan Muhamad, and Azrini Idris. "Nonlinear Fuzzy Robust PCA Algorithm for Pain Decision Support System." Advanced Materials Research 1016 (August 2014): 785–89. http://dx.doi.org/10.4028/www.scientific.net/amr.1016.785.

Abstract:
This paper describes locating particular pain events in infant face images with a feature extraction algorithm. Nonlinear Fuzzy Robust PCA (NFRPCA) feature extraction is implemented to test its effectiveness in recognizing pain in images. In this work, two classifiers, fuzzy k-nearest neighbors (Fuzzy k-NN) and k-nearest neighbors (k-NN), are employed. Results show that NFRPCA combined with either classifier (Fuzzy k-NN or k-NN) can be used to recognize infant pain images, with a best accuracy of 89.77%.
26

Reeve, Henry W. J., and Ata Kabán. "Robust randomized optimization with k nearest neighbors." Analysis and Applications 17, no. 05 (September 2019): 819–36. http://dx.doi.org/10.1142/s0219530519400086.

Abstract:
Modern applications of machine learning typically require the tuning of a multitude of hyperparameters. With this motivation in mind, we consider the problem of optimization given a set of noisy function evaluations. We focus on robust optimization in which the goal is to find a point in the input space such that the function remains high when perturbed by an adversary within a given radius. Here we identify the minimax optimal rate for this problem, which turns out to be of order [Formula: see text], where [Formula: see text] is the sample size and [Formula: see text] quantifies the smoothness of the function for a broad class of problems, including situations where the metric space is unbounded. The optimal rate is achieved (up to logarithmic factors) by a conceptually simple algorithm based on [Formula: see text]-nearest neighbor regression.
27

Jagruthi, Y., Dr Y. Ramadevi, and A. Sangeeta. "An Instance Selection Algorithm Based On Reverse k Nearest Neighbor." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 10, no. 7 (August 30, 2013): 1858–61. http://dx.doi.org/10.24297/ijct.v10i7.3217.

Abstract:
Classification is one of the most important data mining techniques and belongs to supervised learning; its objective is to assign class labels to unlabelled data. As data grow rapidly, handling them has become a major concern, so preprocessing should be done before classification, and data reduction is essential. Data reduction extracts a subset from a data set, decreasing storage requirements and increasing the efficiency of classification; a common measure of it is the reduction rate. The key step is choosing representative samples for the final data set. Many instance selection algorithms are based on the nearest neighbor decision rule (NN) and select samples with an incremental or decremental strategy; both kinds take much processing time because they iteratively scan the dataset. Another instance selection algorithm, reverse nearest neighbor reduction (RNNR), is based on the concept of the reverse nearest neighbor (RNN) and does not iteratively scan the data set. In this paper, we extend RNN to RkNN and apply the RNNR concept to RkNN. RkNN finds the objects that have the query point among their k nearest neighbors. Our approach exploits the advantage of RNN and proposes using the concept of RkNN. We take datasets of theatres, hospitals, and restaurants and extract sample sets from them; classification is then performed on the resulting sample sets. Two parameters are observed: classification accuracy and reduction rate.
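
One way to realize the RkNN idea in a few lines: count, for each instance, how many other instances include it among their k nearest neighbors, and keep instances with large reverse-neighbor sets as representatives. The threshold is a free choice; this is an illustration, not the paper's RNNR procedure.

```python
import numpy as np

def rknn_select(X, y, k=3):
    """Reverse-kNN instance selection (sketch): an instance's RkNN set size
    is the number of instances that list it among their k nearest neighbors;
    instances with large RkNN sets are kept as representatives."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    knn = np.argsort(D, axis=1)[:, :k]
    rknn_size = np.zeros(n, dtype=int)
    for i in range(n):
        for j in knn[i]:
            rknn_size[j] += 1        # i has j among its k nearest neighbors
    keep = np.where(rknn_size >= k)[0]   # threshold is an arbitrary choice
    return X[keep], y[keep]
```
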
28

Lu, Jiantao, Weiwei Qian, Shunming Li, and Rongqing Cui. "Enhanced K-Nearest Neighbor for Intelligent Fault Diagnosis of Rotating Machinery." Applied Sciences 11, no. 3 (January 20, 2021): 919. http://dx.doi.org/10.3390/app11030919.

Abstract:
Case-based intelligent fault diagnosis methods of rotating machinery can deal with new faults effectively by adding them into the case library. However, case-based methods scarcely refer to automatic feature extraction, and k-nearest neighbor (KNN) commonly required by case-based methods is unable to determine the nearest neighbors for different testing samples adaptively. To solve these problems, a new intelligent fault diagnosis method of rotating machinery is proposed based on enhanced KNN (EKNN), which can take advantage of both parameter-based and case-based methods. First, EKNN is embedded with a dimension-reduction stage, which extracts the discriminative features of samples via sparse filtering (SF). Second, to locate the nearest neighbors for various testing samples adaptively, a case-based reconstruction algorithm is designed to obtain the correlation vectors between training samples and testing samples. Finally, according to the optimized correlation vector of each testing sample, its nearest neighbors can be adaptively selected to obtain its corresponding health condition label. Extensive experiments on vibration signal datasets of bearings are also conducted to verify the effectiveness of the proposed method.
29

Widiarti, Anastasia Rita. "K-nearest neighbor performance for Nusantara scripts image transliteration." Jurnal Teknologi dan Sistem Komputer 8, no. 2 (March 13, 2020): 150–56. http://dx.doi.org/10.14710/jtsiskom.8.2.2020.150-156.

Abstract:
The concept of classification using the k-nearest neighbor (KNN) method is simple, easy to understand, and easy to implement in a system. The main challenges in classification with KNN are determining the proximity measure of an object and building a compact reference class. This paper studies the implementation of KNN for the automatic transliteration of Javanese, Sundanese, and Bataknese script images into Roman script. The study used the KNN algorithm with the number k set to 1, 3, 5, 7, and 9, tested on an image dataset of 2,520 items. With 3-fold and 10-fold cross-validation, the results exposed accuracy differences when the area of the extracted image, the number of neighbors in the classification, and the amount of training data differed.
30

Nugraheni, Annisa, Rima Dias Ramadhani, Amalia Beladinna Arifa, and Agi Prasetiadi. "Perbandingan Performa Antara Algoritma Naive Bayes Dan K-Nearest Neighbour Pada Klasifikasi Kanker Payudara." Journal of Dinda : Data Science, Information Technology, and Data Analytics 2, no. 1 (February 23, 2022): 11–20. http://dx.doi.org/10.20895/dinda.v2i1.391.

Abstract:
Breast cancer is the second most common cause of death from cancer, after lung cancer. Breast cancer occurs when cells in breast tissue begin to grow uncontrollably and can disrupt existing healthy tissue, so a classification that distinguishes breast cancer patients from healthy people is needed. Based on previous research, the Naïve Bayes and K-Nearest Neighbor algorithms are considered capable of classifying breast cancer. The research uses the Breast Cancer Coimbra dataset (2018, UCI Machine Learning Repository) with a total of 116 records, and the feasibility of each method is measured with the confusion matrix (accuracy, precision, and recall) and the ROC-AUC curve. The purpose of this study is to compare the performance of the Naïve Bayes and K-Nearest Neighbor algorithms. Testing covers several scenarios: data before and after normalization, models with different ratios of training to testing data, models with different K values for K-Nearest Neighbors, and models based on selecting the strongest attributes with the Pearson correlation test. The results show that the Naïve Bayes algorithm achieves at best an average accuracy of 69.12%, healthy precision of 64.90%, sick precision of 83%, healthy recall of 88%, sick recall of 61.11%, and an AUC of 0.82, which falls in the good classification category. Meanwhile, the K-Nearest Neighbor algorithm achieves at best an average accuracy of 76.83%, healthy precision of 76%, sick precision of 80.21%, healthy recall of 74.18%, sick recall of 80.81%, and an AUC of 0.91, which falls in the excellent classification category.
31

Doshi, Ishita, Dhritiman Das, Ashish Bhutani, Rajeev Kumar, Rushi Bhatt, and Niranjan Balasubramanian. "LANNS." Proceedings of the VLDB Endowment 15, no. 4 (December 2021): 850–58. http://dx.doi.org/10.14778/3503585.3503594.

Abstract:
Nearest neighbor search (NNS) has a wide range of applications in information retrieval, computer vision, machine learning, databases, and other areas. The existing state-of-the-art algorithm for nearest neighbor search, Hierarchical Navigable Small World Networks (HNSW), is unable to scale to large datasets of 100M records in high dimensions. In this paper, we propose LANNS, an end-to-end platform for approximate nearest neighbor search that scales to web-scale datasets. The Library for Large Scale Approximate Nearest Neighbor Search (LANNS) is deployed in multiple production systems for identifying the top-K (100 ≤ K ≤ 200) approximate nearest neighbors with a latency of a few milliseconds per query and a high throughput of ~2.5k queries per second (QPS) on a single node, on large (e.g., ~180M data points), high-dimensional (50-2048 dimensional) datasets.
32

P. Mateva, Tonya, and Ivan G. Ivanov. "k-NN improvement to data analysis." International Journal of Engineering & Technology 8, no. 4 (November 5, 2019): 523. http://dx.doi.org/10.14419/ijet.v8i4.29803.

Abstract:
Classifying big data is a topical problem. There are multiple ways to classify data, but k Nearest Neighbors (k-NN) has become a popular tool for data scientists. In this paper we examine several modifications of the k Nearest Neighbors algorithm that achieve better accuracy and CPU time when classifying test observations than the standard k Nearest Neighbors algorithm. To make the modifications faster than standard k-NN, we use a methodology that splits the input dataset into n folds and combines this with input data transformations. On each execution, one of the folds is held out as a test subset and the remaining folds are used for training; the process is executed n times. The proposed methodology looks for the pair of subsets that produces the highest accuracy.
33

NOCK, RICHARD, MARC SEBBAN, and DIDIER BERNARD. "A SIMPLE LOCALLY ADAPTIVE NEAREST NEIGHBOR RULE WITH APPLICATION TO POLLUTION FORECASTING." International Journal of Pattern Recognition and Artificial Intelligence 17, no. 08 (December 2003): 1369–82. http://dx.doi.org/10.1142/s0218001403002952.

Abstract:
In this paper, we propose a thorough investigation of a nearest neighbor rule which we call the "Symmetric Nearest Neighbor (sNN) rule". Basically, it symmetrises the classical nearest neighbor relationship, from which the points voting for an instance are computed. Experiments on 29 datasets, most of which are readily available, show that the method significantly outperforms traditional nearest neighbor methods. Experiments on a domain of interest related to tropical pollution normalization also show the greater potential of this method. We finally discuss the reasons for the rule's efficiency, provide methods for speeding up the classification time, and derive from the sNN rule a reliable and fast algorithm to fix the parameter k in the k-NN rule, a longstanding problem in this field.
34

Gweon, Hyukjun, Matthias Schonlau, and Stefan H. Steiner. "The k conditional nearest neighbor algorithm for classification and class probability estimation." PeerJ Computer Science 5 (May 13, 2019): e194. http://dx.doi.org/10.7717/peerj-cs.194.

Abstract:
The k nearest neighbor (kNN) approach is a simple and effective nonparametric algorithm for classification. One of the drawbacks of kNN is that the method can only give coarse estimates of class probabilities, particularly for low values of k. To avoid this drawback, we propose a new nonparametric classification method based on nearest neighbors conditional on each class: the proposed approach calculates the distance between a new instance and the kth nearest neighbor from each class, estimates posterior probabilities of class memberships using the distances, and assigns the instance to the class with the largest posterior. We prove that the proposed approach converges to the Bayes classifier as the size of the training data increases. Further, we extend the proposed approach to an ensemble method. Experiments on benchmark data sets show that both the proposed approach and the ensemble version of the proposed approach on average outperform kNN, weighted kNN, probabilistic kNN and two similar algorithms (LMkNN and MLM-kHNN) in terms of the error rate. A simulation shows that kCNN may be useful for estimating posterior probabilities when the class distributions overlap.
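
A minimal sketch of the conditional-neighbor step: measure the distance from the query to the kth nearest neighbor within each class and score the classes by those distances. Inverse distance is used below as a simple stand-in for the paper's posterior estimator, which may differ; each class is assumed non-empty.

```python
import numpy as np

def kcnn_predict(X, y, q, k=3):
    """k conditional NN (sketch): per-class kth-NN distances are turned into
    normalized class scores; predict the class with the largest score."""
    classes = np.unique(y)
    scores = np.empty(len(classes))
    for i, c in enumerate(classes):
        d = np.sort(np.linalg.norm(X[y == c] - q, axis=1))
        scores[i] = 1.0 / (d[min(k, len(d)) - 1] + 1e-12)  # kth-NN distance
    probs = scores / scores.sum()    # crude posterior estimates
    return classes[np.argmax(probs)], probs
```
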
35

ZHAO, GENG, KEFENG XUAN, DAVID TANIAR, and BALA SRINIVASAN. "INCREMENTAL K-NEAREST-NEIGHBOR SEARCH ON ROAD NETWORKS." Journal of Interconnection Networks 09, no. 04 (December 2008): 455–70. http://dx.doi.org/10.1142/s0219265908002382.

Abstract:
Most query searches on road networks either find objects within a certain range (range search) or find the K nearest neighbors (KNN) on the actual road network map. In this paper, we propose a novel query, the incremental k nearest neighbor (iKNN) query: given a set of candidate interest objects, a query point, and the number of objects k, find a path which starts at the query point, goes through k interest objects, and whose distance is the shortest among all possible paths. This is a new type of query, which can be used when we want to visit k interest objects one by one from the query point. The approach is based on expanding the network from the query point, keeping the results in a query set and updating the query set upon reaching a network intersection or an interest object. The basic theory behind this approach is Dijkstra's algorithm and the Incremental Network Expansion (INE) algorithm. Our experiments verified the applicability of the proposed approach to queries that involve finding incremental k nearest neighbors.
36

DICKERSON, MATTHEW T., R. L. SCOT DRYSDALE, and JÖRG-RÜDIGER SACK. "SIMPLE ALGORITHMS FOR ENUMERATING INTERPOINT DISTANCES AND FINDING k NEAREST NEIGHBORS." International Journal of Computational Geometry & Applications 02, no. 03 (September 1992): 221–39. http://dx.doi.org/10.1142/s0218195992000147.

Abstract:
We present an O(n log n+k log k) time and O(n+k) space algorithm which takes as input a set of n points in the plane and enumerates the k smallest distances between pairs of points in nondecreasing order. We also present an O(n log n+kn log k) solution to the problem of finding the k nearest neighbors for each of n points. Both algorithms are conceptually very simple, are easy to implement, and are based on a common data structure: the Delaunay triangulation. Variants of the algorithms work for any convex distance function metric.
37

Ye, Ming, Yi Ke Chen, and Dun Bing Tang. "An Improved Weighted K Nearest Neighbors Algorithm for WLAN Localization." Advanced Materials Research 753-755 (August 2013): 2191–95. http://dx.doi.org/10.4028/www.scientific.net/amr.753-755.2191.

Abstract:
For indoor location estimation based on Wi-Fi fingerprinting, reducing localization effort while maintaining high location accuracy is of critical concern. The paper introduces a new approach, the Reward and Penalty Localization (RP-Loc) algorithm, which extends the classic WKNN localization algorithm. The occurrence rate of each observable Access Point (AP) is added to the fingerprint database in the offline training phase to increase the anti-interference ability. The experimental results show that the RP-Loc algorithm exhibits superior performance in terms of location accuracy and robustness.
38

Phu, Vo Ngoc, and Vo Thi Ngoc Tran. "A Reformed K-Nearest Neighbors Algorithm for Big Data Sets." Journal of Computer Science 14, no. 9 (September 1, 2018): 1213–25. http://dx.doi.org/10.3844/jcssp.2018.1213.1225.

39

Hwang, Dae-Hyeon, Soon-Whan Kim, and Young-Chul Bae. "A prediction of bid price using k-Nearest Neighbors Algorithm." Journal of Korean Institute of Intelligent Systems 29, no. 6 (December 31, 2019): 482–87. http://dx.doi.org/10.5391/jkiis.2019.29.6.482.

40

Xia, Shuyin, Zhongyang Xiong, Yueguo Luo, Limei Dong, and Guanghua Zhang. "Location difference of multiple distances based k-nearest neighbors algorithm." Knowledge-Based Systems 90 (December 2015): 99–110. http://dx.doi.org/10.1016/j.knosys.2015.09.028.

41

Cai, Wei, Weike Pan, Jixiong Liu, Zixiang Chen, and Zhong Ming. "k-Reciprocal nearest neighbors algorithm for one-class collaborative filtering." Neurocomputing 381 (March 2020): 207–16. http://dx.doi.org/10.1016/j.neucom.2019.10.112.

42

QIAO, Y. L. "Fast K Nearest Neighbors Search Algorithm Based on Wavelet Transform." IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E89-A, no. 8 (August 1, 2006): 2239–43. http://dx.doi.org/10.1093/ietfec/e89-a.8.2239.

43

Ni, K. S., and T. Q. Nguyen. "An Adaptable $k$-Nearest Neighbors Algorithm for MMSE Image Interpolation." IEEE Transactions on Image Processing 18, no. 9 (September 2009): 1976–87. http://dx.doi.org/10.1109/tip.2009.2023706.

44

Kim, Jeong-Hun, Jong-Hyeok Choi, Young-Ho Park, Carson Kai-Sang Leung, and Aziz Nasridinov. "KNN-SC: Novel Spectral Clustering Algorithm Using k-Nearest Neighbors." IEEE Access 9 (2021): 152616–27. http://dx.doi.org/10.1109/access.2021.3126854.

45

Yao, Haiqing, Xiuwen Fu, Yongsheng Yang, and Octavian Postolache. "An Incremental Local Outlier Detection Method in the Data Stream." Applied Sciences 8, no. 8 (July 28, 2018): 1248. http://dx.doi.org/10.3390/app8081248.

Abstract:
Outlier detection has attracted a wide range of attention for its broad applications, such as fault diagnosis and intrusion detection, among which outlier analysis in data streams, with their high uncertainty and infinite length, is more challenging. Recent major work on outlier detection has focused on the principles of the local outlier factor, and there are few studies on incremental updating strategies, which are vital to outlier detection in data streams. In this paper, a novel incremental local outlier detection approach is introduced to dynamically evaluate the local outlier in the data stream. An extended local neighborhood consisting of k nearest neighbors, reverse nearest neighbors, and shared nearest neighbors is estimated for each data point. The theoretical analysis of algorithm complexity for the insertion of new data and deletion of old data in the composite neighborhood shows that the amount of affected data in the incremental calculation is finite. Finally, experiments performed on both synthetic and real datasets verify its scalability and outlier detection accuracy. All results show that the proposed approach has comparable performance with state-of-the-art k nearest neighbor-based methods.
46

AN, JIAN, XIAOLIN GUI, JIANWEI YANG, JINHUA JIANG, and LING QI. "SEMI-SUPERVISED LEARNING OF K-NEAREST NEIGHBORS USING A NEAREST-NEIGHBOR SELF-CONTAINED CRITERION IN FOR MOBILE-AWARE SERVICE." International Journal of Pattern Recognition and Artificial Intelligence 27, no. 05 (August 2013): 1351001. http://dx.doi.org/10.1142/s0218001413510014.

Abstract:
We propose a new K-nearest neighbor (KNN) algorithm based on a nearest-neighbor self-contained criterion (NNscKNN) by utilizing the unlabeled data information. Our algorithm incorporates other discriminant information to train KNN classifier. This new KNN scheme is also applied in a community detection algorithm for mobile-aware service: First, as the edges of networks, the social relation between mobile nodes is quantified with social network theory; second, we would construct the mobile nodes optimal path tree and calculate the similarity index of adjacent nodes; finally, the community dispersion is defined to evaluate the clustering results and measure the quality of community structure. Promising experiments on benchmarks demonstrate the effectiveness of our approach for recognition and detection tasks.
47

Syed, Zeeshan, and Ilan Rubinfeld. "Scaling Unsupervised Risk Stratification to Massive Clinical Datasets." International Journal of Knowledge Discovery in Bioinformatics 2, no. 1 (January 2011): 45–59. http://dx.doi.org/10.4018/jkdb.2011010103.

Abstract:
While rare clinical events, by definition, occur infrequently in a population, the consequences of these events can be drastic. Unfortunately, developing risk stratification algorithms for these conditions requires large volumes of data to capture enough positive and negative cases. This process is slow, expensive, and burdensome to both patients and caregivers. This paper proposes an unsupervised machine learning approach to address this challenge and risk stratify patients for adverse outcomes without use of a priori knowledge or labeled training data. The key idea of the approach is to identify high-risk patients as anomalies in a population. Cases are identified through a novel algorithm that finds an approximate solution to the k-nearest neighbor problem using locality sensitive hashing (LSH) based on p-stable distributions. The algorithm is optimized to use multiple LSH searches, each with a geometrically increasing radius, to find the k-nearest neighbors of patients in a dynamically changing dataset where patients are being added or removed over time. When evaluated on data from the National Surgical Quality Improvement Program (NSQIP), this approach successfully identifies patients at an elevated risk of mortality and rare morbidities. The LSH-based algorithm provided a substantial improvement over an exact k-nearest neighbor algorithm in runtime, while achieving a similar accuracy.
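
The core hashing step can be sketched with Gaussian (2-stable) projections, h(v) = floor((a·v + b) / w): points whose composite keys collide become candidate neighbors that are then ranked exactly. The table count, bits per key, and bucket width w below are arbitrary, and the paper's geometrically growing search radii and dynamic insert/delete handling are omitted.

```python
from collections import defaultdict

import numpy as np

def build_lsh(X, n_tables=4, n_bits=8, w=4.0, seed=0):
    """p-stable LSH (p=2): hash every point into one bucket per table."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n_tables, n_bits, X.shape[1]))  # stable projections
    B = rng.uniform(0.0, w, size=(n_tables, n_bits))     # random offsets
    tables = [defaultdict(list) for _ in range(n_tables)]
    H = np.floor((np.einsum('tbd,nd->tnb', A, X) + B[:, None, :]) / w)
    for t in range(n_tables):
        for i, key in enumerate(map(tuple, H[t].astype(int))):
            tables[t][key].append(i)
    return A, B, tables

def query(A, B, tables, X, q, k=5, w=4.0):
    """Collect bucket collisions across tables, then rank candidates exactly
    (falling back to brute force if no bucket collides)."""
    cand = set()
    for t, table in enumerate(tables):
        key = tuple(np.floor((A[t] @ q + B[t]) / w).astype(int))
        cand.update(table.get(key, []))
    cand = np.array(sorted(cand)) if cand else np.arange(len(X))
    d = np.linalg.norm(X[cand] - q, axis=1)
    return cand[np.argsort(d)[:k]]
```
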
48

Kumar Ojha, Rajesh, and Dr Bhagirathi Nayak. "Application of Machine Learning in Collaborative Filtering Recommender Systems." International Journal of Engineering & Technology 7, no. 4.38 (December 3, 2018): 213. http://dx.doi.org/10.14419/ijet.v7i4.38.24445.

Abstract:
Recommender systems are one of the important methodologies in machine learning and are widely used in current business scenarios. This article proposes a book recommender system using a deep learning technique and k-Nearest Neighbors (k-NN) classification. Deep learning is one of the most effective techniques in the field of recommender systems, which are intelligent machine learning systems that can make a difference compared with other algorithms. We use the k-Nearest Neighbors classification algorithm together with a deep learning technique to classify users in the book recommender system, and we analyze traditional collaborative filtering alongside our methodology in order to compare the two. Our results show that the proposed algorithm is more precise than the existing algorithm, consumes less time, and is more reliable than the existing methods.
49

Ati, Indri, and Ari Kusyanti. "Metode Ensemble Classifier untuk Mendeteksi Jenis Attention Deficit Hyperactivity Disorder (SDHD) pada Anak Usia Dini." Jurnal Teknologi Informasi dan Ilmu Komputer 6, no. 3 (May 9, 2019): 301. http://dx.doi.org/10.25126/jtiik.2019631313.

Abstract:
In early development, some children have difficulty staying calm, concentrating, and controlling their behavior; a child who has trouble focusing attention and controlling appropriate behavior may have ADHD (Attention Deficit Hyperactivity Disorder). This is a serious problem, because children with ADHD experience social and emotional behavioral problems and learning difficulties at school, which affect their development into adulthood. ADHD symptoms therefore need to be recognized early so that treatment can be given quickly and appropriately. This research produces an application that detects the type of ADHD from symptoms entered by the user and displays the classification automatically. The application uses an ensemble classifier, a method that combines several classifiers to improve accuracy: at the classification stage, each record is scored with K-Nearest Neighbour (KNN), Fuzzy K-Nearest Neighbour (FKNN), and Neighbour Weighted K-Nearest Neighbour (NWKNN), and the three outputs are combined by majority voting to determine the class. The highest accuracy of the ensemble classifier is 95%, at the optimal value k = 10. However, for larger k (above k = 20) the accuracy of each algorithm decreases: because all three algorithms classify based on neighborhood membership, the more neighbors are considered, the greater the chance of misclassification.
50

Wahono, Hermanto, and Dwiza Riana. "Prediksi Calon Pendonor Darah Potensial Dengan Algoritma Naïve Bayes, K-Nearest Neighbors dan Decision Tree C4.5." JURIKOM (Jurnal Riset Komputer) 7, no. 1 (February 15, 2020): 7. http://dx.doi.org/10.30865/jurikom.v7i1.1953.

Abstract:
Blood donation is a process of taking blood from donors who are declared eligible based on various factors, including age, weight, blood pressure, hemoglobin level, and donor status, which are considered during the feasibility test. This study was conducted to find the most appropriate method, judged by accuracy and Area Under Curve (AUC), using a dataset of 3,710 blood donor records from the Bekasi City PMI, processed with the Naïve Bayes, K-Nearest Neighbors, and Decision Tree C4.5 algorithms. The analysis shows that the Decision Tree C4.5 algorithm achieves a higher accuracy of 93.83%, compared with 85.15% for the Naïve Bayes algorithm and 84.10% for the K-Nearest Neighbors algorithm. Decision Tree C4.5 is also superior in interpretability, as its output tree shows the relationships between attributes, and it has an AUC of 0.978, against 0.927 for Naïve Bayes and 0.816 for K-Nearest Neighbors.