Journal articles on the topic 'Clustering Healthcare data Silhouette score value K-means DBSCAN'

1

Godwin, Ogbuabor, and F. N. Ugwoke. "Clustering Algorithm for a Healthcare Dataset Using Silhouette Score Value." International Journal of Computer Science & Information Technology (IJCSIT) 10, no. 2 (2018): 27–37. https://doi.org/10.5281/zenodo.1248795.

Abstract:
The huge amount of healthcare data, coupled with the need for data analysis tools has made data mining interesting research areas. Data mining tools and techniques help to discover and understand hidden patterns in a dataset which may not be possible by mainly visualization of the data. Selecting appropriate clustering method and optimal number of clusters in healthcare data can be confusing and difficult most times. Presently, a large number of clustering algorithms are available for clustering healthcare data, but it is very difficult for people with little knowledge of data mining to choose suitable clustering algorithms
2

Frangky, Frangky, Rudolf Sinaga, and M. Raihansyah. "Analisis Segmentasi Pasien Berdasarkan Persepsi Kualitas Pelayanan dengan Algoritma Clustering." Explorer 5, no. 1 (2025): 52–58. https://doi.org/10.47065/explorer.v5i1.1818.

Abstract:
Patient segmentation based on perceptions of service quality is a crucial step in improving patient experiences, optimizing resources, and enhancing healthcare service quality. However, understanding patients' needs and priorities in depth poses a challenge, particularly for hospitals serving populations with diverse demographic backgrounds. This study aims to cluster patients in a private hospital in Jambi City based on their perceptions of service quality using the K-Means algorithm. Data were collected from a 2022-2023 survey, covering patient demographics and perceptions of service quality. The data were processed through preprocessing steps, including missing value imputation, normalization, and encoding. The optimal number of clusters was determined using the Elbow and Silhouette Score methods. The results revealed three main clusters with distinct characteristics. The first cluster (34.29%) includes patients prioritizing service speed and procedural ease. The second cluster (46.12%) consists of patients who emphasize staff competence and cost fairness as their main priorities. The third cluster (19.59%) comprises patients with higher educational backgrounds who are more critical of facility quality and complaint handling. Evaluation using the Davies-Bouldin index demonstrated good cluster separation (score -0.645). This study concludes that patient segmentation based on perceptions of service quality can serve as a foundation for strategic decision-making to improve hospital service quality. Recommendations for future research include applying other algorithms such as DBSCAN, integrating sentiment analysis, and employing a hybrid approach to predict patient needs. These approaches are expected to provide a deeper understanding and more effective personalization of patient care.
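The preprocessing steps named in this abstract (missing value imputation, normalization, encoding) can be illustrated with a minimal sketch. The abstract does not say which variants were used, so mean imputation, min-max scaling, and one-hot encoding are assumptions chosen for illustration, and the function names are hypothetical.

```python
def impute_mean(values):
    """Replace None with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max(values):
    """Rescale a numeric column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categories as 0/1 vectors, columns in sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]
```

Scaling matters for K-Means in particular, because the algorithm relies on distances and an unscaled column with a large range would otherwise dominate them.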
3

Dharmawan, Tio, Chinta 'Aliyyah Candramaya, and Vandha Pradwiyasma Widharta. "Forming Dataset of The Undergraduate Thesis using Simple Clustering Methods." International Journal of Innovation in Enterprise System 7, no. 01 (2023): 31–40. http://dx.doi.org/10.25124/ijies.v7i01.187.

Abstract:
Each university collects many undergraduate theses data but has yet to process it to make it easier for students to find references as desired. This study aims to classify and compare the grouping of documents using expert and simple clustering methods. Experts have done ground truth using OR Boolean Retrieval and keyword generation. The best cluster was discovered by the experiments using the K-Means, K-Medoids, and DBSCAN clustering methods and using Euclidean, Manhattan, City Block, and Cosine Similarity metrics. The cluster with the best Silhouette Score compared to the accuracy of the categorization of each document. The K-Means clustering method and the Cosine Similarity metric gave the best results with a Silhouette Score value of 0.105534. The comparison between ground truth and the best cluster results shows an accuracy of 33.42%. The result shows that the simple clustering method cannot handle data with Negative Skewness and Leptokurtic Kurtosis.
4

Choque-Soto, Vanessa Maribel, Victor Dario Sosa-Jauregui, and Waldo Ibarra. "Characterization of the Dropout Student Profile Using Data Mining Techniques." Revista de Gestão Social e Ambiental 19, no. 2 (2025): e011306. https://doi.org/10.24857/rgsa.v19n2-067.

Abstract:
Objective: One of the primary concerns in Educational Data Mining is student dropout rates. This study aims to investigate student dropout rates in higher education by identifying and analyzing the demographic and academic characteristics of university students who discontinue their studies. Theoretical Framework: Based on Educational Data Mining with clustering techniques, this study utilizes pattern recognition and data segmentation models to analyze dropout behavior within Informatics programs. Method: Data mining techniques were applied to a dataset that contained demographic and academic records. Three clustering algorithms, K-means, DBSCAN, and Agglomerative Hierarchical Clustering, were employed, and their performance was evaluated. Results and Discussion: The K-means algorithm produced three distinct clusters with a silhouette score of 0.575, indicating well-defined groups. These clusters revealed significant patterns, such as a predominance of male, single students from Cusco enrolled under the 2013 curriculum. DBSCAN identified four clusters (score: 0.120), while Agglomerative Hierarchical Clustering produced three clusters (score: 0.564), offering a balance between granularity and clarity. The findings highlight the effectiveness of K-means in profiling dropout students and offer insights into their academic trajectories. Research Implications: The findings suggest that tailored interventions addressing the specific needs of identified student clusters may reduce dropout rates, relying on informed policy and practice in higher education. Originality/Value: This study contributes to the literature by introducing an innovative comparative analysis of clustering methods for dropout profiling, offering practical implications for educational data analysis and intervention strategies.
5

Mutawalli, Lalu, Sofiansyah Fadli, and Supardianto Supardianto. "Komparasi Metode Perhitungan Jarak K-Means Paling Baik Terhadap Pembentukan Pola Kunjungan Wisatawan Mancanegara." Journal of Information System Research (JOSH) 5, no. 1 (2023): 159–66. http://dx.doi.org/10.47065/josh.v5i1.4377.

Abstract:
Understanding patterns among foreign tourists is an urgent matter. These patterns can become knowledge that helps in making better decisions because they are data-driven. The pattern to be elaborated on is regarding the clustering of visits by foreign tourists to tourist destinations in Jakarta. Data mining is an approach that extracts knowledge patterns from a dataset. K-Means is one of the data mining algorithms used for clustering data, where data is grouped based on similarity in features and attributes. This study compares the Euclidean Distance, Manhattan Distance, and Haversine Distance methods to obtain more representative data clusters for the datasets. The datasets in this study are not normally distributed due to outlier data; hence, the DBSCAN algorithm is used for improvement without removing or cutting the data, as it can result in a significant amount of missing values that could affect information that does not align with empirical facts. In this study, 5 clusters were created based on elbow calculation results. The K-Means cluster testing in Euclidean distance yielded a Silhouette Score of 0.36, Inertia of 0.86, and Davies-Bouldin Index of 2.39. The Manhattan method resulted in a Silhouette Score of 0.65, Inertia of 1.46, and Davies-Bouldin Index of 0.47. Meanwhile, applying the Haversine method resulted in a Silhouette Score of 0.36, Inertia of 0.03, and a value of 2.39 for the Davies-Bouldin Index.
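The three distance measures compared in this abstract can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names and the mean Earth radius of 6371 km are assumptions, and the Haversine function expects (latitude, longitude) pairs in degrees.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))
```

One caveat when swapping metrics: the centroid update in standard K-Means (the arithmetic mean) minimizes squared Euclidean distance, so using another metric for assignment is a variation on the algorithm rather than plain K-Means.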
6

Bateja, Ritika, Sanjay Kumar Dubey, and Ashutosh Kumar Bhatt. "Diabetes Prediction and Recommendation Model Using Machine Learning Techniques and MapReduce." Indian Journal Of Science And Technology 17, no. 26 (2024): 2747–53. http://dx.doi.org/10.17485/ijst/v17i26.530.

Abstract:
Objectives: To deliver patient centric healthcare for diabetic patients using a fast and efficient diabetic prediction and recommendation model which will not only help in early diagnoses of disease but also recommend appropriate medicine for controlling it at stage 1. Methods: The Support Vector Machine Classifier is further enhanced with Particle Swarm Optimization (PSO) and used for the prediction of diabetes. Collaborative Filtering is used for drug recommendation, which produces a suitable list of medications that correspond to the diagnoses of diabetes patients. Improved Density-Based Spatial Clustering of Applications with Noise (I-DBSCAN) is proposed to cluster EHR data to get labels based on the symptoms of patients and map reduction is utilized to process the clustered data in parallel for quick recommendations. Findings: The accuracy of the SVM with the PSO model is 99.20%. The performance of I-DBSCAN is also compared with K-Means and regular DBSCAN using the Silhouette Score, Davies Bouldin Score, and the Calinski Harabasz Score. Also, I-DBSCAN was found to give a more accurate score. Novelty: The extensive volume of diabetes-related information stored in electronic health records (EHRs) through continuous monitoring devices poses a growing difficulty for healthcare professionals to effectively navigate and deliver patient-centered care. Machine Learning techniques like classification and recommendations can be utilized to facilitate early disease diagnosis and recommend appropriate medications. Keywords: Electronic health records (EHRs), Collaborative Filtering (CF), Recommendations, Improved Density Based Spatial Clustering of Applications with Noise (IDBSCAN), SVM classifier
7

Samsudin, Angga Radlisa, Dhomas Hatta Fudholi, and Lizda Iswari. "TEMPORAL SPATIAL PROPERTY PROFILING AND IDENTIFICATION OF EARTHQUAKE PRONE AREAS USING ST-DBSCAN AND K-MEANS CLUSTERING." Jurnal Teknik Informatika (Jutif) 5, no. 3 (2024): 917–29. https://doi.org/10.52436/1.jutif.2024.5.3.1293.

Abstract:
Indonesia is a country located at the confluence of three major tectonic plates, namely Indo-Australia, Eurasia, and the Pacific so that earthquakes often occur, one of which is in West Nusa Tenggara Province. One way to accelerate the disaster mitigation process is to analyze earthquake occurrence based on spatial temporal aspects. This study uses data from BMKG NTB Province during 2018 with a total of 3,699 earthquake events which are then analyzed using ST-DBSCAN and K-Means. ST-DBSCAN analysis was used to determine earthquake prone areas based on the date and location of the event, while k-means used the depth and magnitude of the earthquake. The results show that the distribution pattern of earthquakes in the NTB region has a stationary pattern and there are similar prone areas based on the location and time of occurrence as well as the strength and depth of the earthquake. The ST-DBSCAN method using latitude and longitude attributes produces one cluster that covers 96.33% of the total data. Meanwhile, K-Means using the depth and magnitude attributes produced four clusters. The four clusters were obtained from the cluster density using the silhouette score value between -1 and 1. The K-means analysis used a silhouette score result of 18.527 which was found in cluster 1. Earthquake prone areas in the distribution of earthquakes or types of earthquakes are located in Gangga and Bayan sub-districts of North Lombok and in Sambelia and Sembalun sub-districts of East Lombok. The sub-district with the most frequent earthquakes is Sambelia sub-district with 112 earthquakes. Then the strength of the largest earthquakes on average occurred in Gangga sub-district with magnitudes of 4 to 6.2 SR with shallow earthquake types. The prone area is located at the foot of the mountain and directly adjacent to the ocean.
8

Ramadhan, Hafid, Mohammad Rizal Abdan Kamaludin, Muhammad Alfan Nasrullah, and Dwi Rolliawati. "Comparison of Hierarchical, K-Means and DBSCAN Clustering Methods for Credit Card Customer Segmentation Analysis Based on Expenditure Level." Journal of Applied Informatics and Computing 7, no. 2 (2023): 246–51. http://dx.doi.org/10.30871/jaic.v7i2.5790.

Abstract:
The amount of data from credit card users is increasing from year to year. Credit cards are an important need for people to make payments. The number of credit card users is increasing because credit cards are considered more effective and efficient. The third method used today has a function to determine the effective outcome of credit card user scenarios. In this study, the Hierarchical Clustering, K-Means, and DBSCAN methods were compared to determine the results of a credit card customer segmentation analysis to be used as a market strategy. Based on the silhouette coefficient, the best result was two-cluster hierarchical clustering with a score of 0.82322. Based on the best mean value, customers are divided into two segments, and it is suggested to develop strategies for both segments.
9

Ritika, Bateja, Kumar Dubey Sanjay, and Kumar Bhatt Ashutosh. "Diabetes Prediction and Recommendation Model Using Machine Learning Techniques and MapReduce." Indian Journal of Science and Technology 17, no. 26 (2024): 2747–53. https://doi.org/10.17485/IJST/v17i26.530.

Abstract:
Objectives: To deliver patient centric healthcare for diabetic patients using a fast and efficient diabetic prediction and recommendation model which will not only help in early diagnoses of disease but also recommend appropriate medicine for controlling it at stage 1. Methods: The Support Vector Machine Classifier is further enhanced with Particle Swarm Optimization (PSO) and used for the prediction of diabetes. Collaborative Filtering is used for drug recommendation, which produces a suitable list of medications that correspond to the diagnoses of diabetes patients. Improved Density-Based Spatial Clustering of Applications with Noise (I-DBSCAN) is proposed to cluster EHR data to get labels based on the symptoms of patients and map reduction is utilized to process the clustered data in parallel for quick recommendations. Findings: The accuracy of the SVM with the PSO model is 99.20%. The performance of I-DBSCAN is also compared with K-Means and regular DBSCAN using the Silhouette Score, Davies Bouldin Score, and the Calinski Harabasz Score. Also, I-DBSCAN was found to give a more accurate score. Novelty: The extensive volume of diabetes-related information stored in electronic health records (EHRs) through continuous monitoring devices poses a growing difficulty for healthcare professionals to effectively navigate and deliver patient-centered care. Machine Learning techniques like classification and recommendations can be utilized to facilitate early disease diagnosis and recommend appropriate medications. Keywords: Electronic health records (EHRs), Collaborative Filtering (CF), Recommendations, Improved Density Based Spatial Clustering of Applications with Noise (IDBSCAN), SVM classifier
10

Husna, Farida Amila, Diana Purwitasari, Bayu Adjie Sidharta, Drigo Alexander Sihombing, Amiq Fahmi, and Mauridhi Hery Purnomo. "A Clustering Approach for Mapping Dengue Contingency Plan." Scientific Journal of Informatics 9, no. 2 (2022): 149–60. http://dx.doi.org/10.15294/sji.v9i2.36885.

Abstract:
Purpose: The dengue epidemic has an increasing number of sufferers and spreading areas along with increased mobility and population density. Therefore, it is necessary to control and prevent Dengue Hemorrhagic Fever (DHF) by mapping a DHF contingency plan. However, mapping a dengue contingency plan is not easy because clinical and managerial issues, vector control, preventive measures, and surveillance must be considered. This work introduces a cluster-based dengue contingency planning method by grouping patient cases according to their environment and demographics, then mapping out a plan and selecting the appropriate plan for each area. Methods: We used clustering with silhouette scoring to select features, the best cluster formation, the best clustering method, and cluster severity. Cluster severity is carried out by levelling the attributes of the average value to low, medium, high, and extreme, which are related to the plans each region sets for village type and season type. Result: In five years of data (2016-2020) ±15K cases from Semarang City, Indonesia, feature selection results show that environmental and demography group features have the biggest silhouette score. With these features, it is found that K-Means has a high silhouette score compared to DBSCAN and agglomerative with three optimum numbers of clusters. K-Means also successfully mapped the cluster severity and assigned the cluster to a suitable contingency policy. Novelty: Most of the research on DHF cases is about predicting DHF cases and measuring the risk of DHF occurrence. There are not many studies that discuss the policy recommendations for dengue control.
11

Abdikerimova, Gulzira, Dana Khamitova, Akmaral Kassymova, et al. "Development of a Model for Soil Salinity Segmentation Based on Remote Sensing Data and Climate Parameters." Algorithms 18, no. 5 (2025): 285. https://doi.org/10.3390/a18050285.

Abstract:
The paper presents a hybrid machine learning model for the spatial segmentation of soils by salinity using multispectral satellite data from Sentinel-2 and climate parameters of the ERA5-Land model. The proposed method aims to solve the problem of accurate soil cover segmentation under climate change and high spatial heterogeneity of data. The approach includes the sequential application of unsupervised learning algorithms (K-Means, hierarchical clustering, DBSCAN), the XGBoost model, and a multitasking neural network that performs simultaneous classification and regression. At the first stage, pseudo-labels are formed using K-Means, then a probabilistic assessment of object membership in classes and ensemble voting of clustering algorithms are carried out. The final model is trained on an extended feature space and demonstrates improved results compared to traditional approaches. Experiments on a sample of 33,624 observations (23,536—training sample, 10,088—test sample) showed an increase in the Silhouette Score value from 0.7840 to 0.8156 and a decrease in the Davies–Bouldin Score from 0.3567 to 0.3022. The classification accuracy was 99.99%, with only one error in more than 10,000 test objects. The results confirmed the proposed method’s high efficiency and applicability for remote monitoring, environmental analysis, and sustainable land management.
12

Zhukabayeva, Tamara, Zulfiqar Ahmad, Aigul Adamova, Nurdaulet Karabayev, and Assel Abdildayeva. "An Edge-Computing-Based Integrated Framework for Network Traffic Analysis and Intrusion Detection to Enhance Cyber–Physical System Security in Industrial IoT." Sensors 25, no. 8 (2025): 2395. https://doi.org/10.3390/s25082395.

Abstract:
Industrial Internet of things (IIoT) environments need to implement reliable security measures because of the growth in network traffic and overall connectivity. Accordingly, this work provides the architecture of network traffic analysis and the detection of intrusions in a network with the help of edge computing and using machine-learning methods. The study uses k-means and DBSCAN techniques to examine the flow of traffic in a network and to discover several groups of behavior and possible anomalies. An assessment of the two clustering methods shows that K-means achieves a silhouette score of 0.612, while DBSCAN achieves 0.473. For intrusion detection, k-nearest neighbors (KNN), random forest (RF), and logistic regression (LR) were used and evaluated. The analysis revealed that both KNN and RF yielded seamless results in terms of precision, recall, and F1 score, close to the maximum possible value of 1.00, as demonstrated by both ROC and precision–recall curves. Accuracy matrices show that RF had better precision and recall for both benign and attacks, while KNN and LR had good detection with slight fluctuations. With the integration of edge computing, the framework is improved by real-time data processing, which means a lower latency of the security system. This work enriches the knowledge of the IIOT by offering a detailed solution to the issue of cybersecurity in IoT systems, based on well-grounded performance assessments and the right implementation of current technologies. The results thus support the effectiveness of the proposed framework to improve security and provide tangible improvements over current approaches by identifying potential threats within a network.
13

Fitriyani, Rofi, Ayip Luthfi Firmansyah, Al Yaafi Nadiyal Fithri, and Larasati Angelica Nurfadillah. "Penerapan Algoritma Clustering untuk Segmentasi Pelanggan E-commerce berdasarkan Data Pembelian dan Aktivitas." SEMINAR TEKNOLOGI MAJALENGKA (STIMA) 8 (October 1, 2024): 372–79. http://dx.doi.org/10.31949/stima.v8i0.1129.

Abstract:
In the current digital era, e-commerce has become one of the main pillars of global trade. With the ever-increasing amount of transaction and user activity data, e-commerce companies are faced with the challenge of understanding and managing diverse customer segments more effectively. This paper discusses the application of clustering algorithms for e-commerce customer segmentation based on purchasing data and user activity. The aim of this research is to identify homogeneous customer groups to support more targeted marketing strategies and increase customer retention. The problem faced is how to process big data originating from user transactions and activities on e-commerce platforms, as well as how to identify patterns that are useful for customer segmentation. The data used in this research includes purchase history, frequency of visits, length of time spent on the site, and interactions with certain products. The solution method applied in this research is the clustering algorithm, especially K-Means and DBSCAN. K-Means is used to group data into a predetermined number of clusters based on the Euclidean distance between data points. Meanwhile, DBSCAN is used to identify clusters with high density and separate data that is considered noise or outliers. Data preprocessing is carried out to clean and normalize the data before being applied to the clustering algorithm. Validation of clustering results is carried out using metrics such as Silhouette Score and Davies-Bouldin Index. The research results show that by applying the clustering algorithm, customers can be grouped into several segments that have similar characteristics. For example, we found groups of customers with high purchase frequency but low transaction value, as well as other groups with high transaction value but low purchase frequency. 
This information is very useful for companies to design more effective marketing strategies, such as special offers for customers with high transaction values or loyalty programs for customers with high purchasing frequency. The conclusion of this research is that clustering algorithms can be a very effective tool in e-commerce customer segmentation, allowing companies to understand customer behavior patterns and develop more targeted and effective marketing strategies. Thus, implementing this method is expected to improve business performance and overall customer satisfaction.
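How DBSCAN separates dense clusters from noise, as this abstract describes, can be shown with a small pure-Python sketch. It is a teaching version under stated assumptions, not the authors' implementation: it uses Euclidean distance, counts a point as its own neighbour, does a brute-force neighbour search, and marks noise with the label -1 (a common convention, adopted here as an assumption).

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    def neighbours(i):
        # Brute-force search; includes the point itself.
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbours(j)
            if len(more) >= min_pts:   # j is itself a core point: keep expanding
                queue.extend(more)
    return labels


# Hypothetical data: two dense groups and one isolated point.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
sample_labels = dbscan(sample, eps=1.5, min_pts=3)
```

Unlike K-Means, the number of clusters is not supplied in advance; it follows from `eps` and `min_pts`, and the isolated point ends up labelled as noise instead of being forced into a cluster.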
14

Godwin, Ogbuabor. "CLUSTERING ALGORITHM FOR A HEALTHCARE DATASET USING SILHOUETTE SCORE VALUE." October 19, 2018. https://doi.org/10.5281/zenodo.1466257.

Abstract:
The huge amount of healthcare data, coupled with the need for data analysis tools has made data mining interesting research areas. Data mining tools and techniques help to discover and understand hidden patterns in a dataset which may not be possible by mainly visualization of the data. Selecting appropriate clustering method and optimal number of clusters in healthcare data can be confusing and difficult most times. Presently, a large number of clustering algorithms are available for clustering healthcare data, but it is very difficult for people with little knowledge of data mining to choose suitable clustering algorithms. This paper aims to analyze clustering techniques using healthcare dataset, in order to determine suitable algorithms which can bring the optimized group clusters. Performances of two clustering algorithms (Kmeans and DBSCAN) were compared using Silhouette score values. Firstly, we analyzed K-means algorithm using different number of clusters (K) and different distance metrics. Secondly, we analyzed DBSCAN algorithm using different minimum number of points required to form a cluster (minPts) and different distance metrics. The experimental result indicates that both K-means and DBSCAN algorithms have strong intra-cluster cohesion and inter-cluster separation. Based on the analysis, K-means algorithm performed better compare to DBSCAN algorithm in terms of clustering accuracy and execution time.
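The silhouette score that this abstract uses to compare K-means and DBSCAN can be computed as sketched below. For a point i, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and s(i) = (b - a) / max(a, b). The sketch assumes Euclidean distance and assigns 0 to points in singleton clusters; it does not treat DBSCAN noise labels specially, so noise points would need to be filtered out first if that is the intended evaluation.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster
            continue
        # a: cohesion, mean distance to the rest of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: separation, mean distance to the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: two tight groups, labelled well and labelled badly.
demo_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(demo_points, [0, 0, 1, 1])
bad = silhouette_score(demo_points, [0, 1, 0, 1])
```

Values lie between -1 and 1. A score near 1 indicates strong intra-cluster cohesion and inter-cluster separation, while a negative score suggests points sit closer to another cluster than to their own.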
15

"Comparative Analysis of Clustering Algorithms: Performance Evaluation Using the Weighted Product Method (WPM)." Computer Science, Engineering and Technology 2, no. 4 (2025): 34–42. https://doi.org/10.46632/cset/2/4/5.

Abstract:
Introduction: Clustering algorithms play a key role in grouping data objects based on their similarities. A popular method, K-means, works by repeatedly adjusting the center of each cluster until convergence is achieved. This method, especially in the PAM form, is widely used in clustering analysis for its effectiveness in separating data. Clustering, an unsupervised learning technique, is very effective in discovering hidden patterns within datasets. Clustering focuses on dividing data into meaningful groups, rather than using predefined labels, as supervised algorithms do. By finding underlying structures and connections, this technique helps gain a deeper understanding of complex data. Research significance: This research is of great importance for the study of clustering algorithms developed for sparse industrial datasets. It aims to provide useful insights and standards for improving clustering performance in industrial settings by examining and contrasting five main clustering approaches: partitioning, hierarchical, density, grid, and model-based methods. Methodology: Some alternative suppliers include K-Means, Hierarchical Clustering, DBSCAN, Gaussian Mixture, Sample, Iterated-Bisection, Agglomerative, and Clustering. Evaluation criteria include Silhouette Score, Davies-Bouldin Index, Calinski-Harabasz Index, Cluster Cohesion, Execution Time, Memory Usage, and Sensitivity to Noise. Result: According to the results, DBSCAN was ranked highest, while Clustering was ranked lowest. DBSCAN has the highest value for Clustering algorithms according to the WPM Method approach.
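The Weighted Product Method used for the ranking in this abstract can be sketched as follows. In its usual form, each alternative's score is the product of its criterion values raised to the criterion weights, with a negative exponent for cost criteria where smaller is better. The function name, the two algorithms "A" and "B", and all numbers below are hypothetical and are not taken from the cited paper.

```python
def wpm_scores(matrix, weights, benefit):
    """Weighted Product Method: score_i = prod_j x_ij ** (+w_j or -w_j).

    matrix  : rows = alternatives, columns = criteria (all values > 0)
    weights : one weight per criterion, expected to sum to 1
    benefit : True where larger is better, False where smaller is better
    """
    scores = []
    for row in matrix:
        score = 1.0
        for value, weight, is_benefit in zip(row, weights, benefit):
            score *= value ** (weight if is_benefit else -weight)
        scores.append(score)
    return scores


# Hypothetical decision matrix: silhouette score (benefit criterion) and
# execution time in seconds (cost criterion) for two made-up algorithms.
algorithms = ["A", "B"]
matrix = [[0.60, 2.0], [0.50, 1.0]]
scores = wpm_scores(matrix, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(zip(algorithms, scores), key=lambda pair: pair[1], reverse=True)
```

Because the method multiplies powers of the raw values, every entry must be strictly positive; a criterion such as the silhouette score, which can be zero or negative, would need to be shifted or rescaled first.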
16

Srilekha S., Priyadharshini P., and Adhilakshmi M. "Comparative Evaluation of K-Means, Hierarchical Clustering, and DBSCAN in Blood Donor Segmentation." International Journal For Multidisciplinary Research 6, no. 4 (2024). http://dx.doi.org/10.36948/ijfmr.2024.v06i04.26755.

Abstract:
Clustering techniques are pivotal in the fields of data analysis and pattern recognition, offering significant insights by grouping data points with similar characteristics. This study aims to perform a comprehensive comparison of three widely used clustering algorithms (K-Means, Hierarchical Clustering, and DBSCAN) on a dataset of blood donors. The objective is to determine which algorithm achieves the most precise and effective clustering of the data, taking into account factors such as donor location, blood type, and donation frequency. The study presents a novel approach by integrating a web-based platform that allows blood donors to register online. This platform not only facilitates the real-time updating of the dataset but also enhances the overall relevance and applicability of the clustering model by continuously incorporating new data entries. By leveraging such a dynamic dataset, the clustering algorithms can adapt to evolving patterns and trends, ensuring more accurate and meaningful insights over time. To rigorously evaluate the performance of each clustering method, several well-established metrics are employed, including the Silhouette Score, which assesses how similar each data point is to its own cluster compared to other clusters; the Davies-Bouldin Index, which evaluates the average similarity ratio of each cluster with its most similar cluster; and the Calinski-Harabasz Index, which measures the ratio of the sum of between-clusters dispersion and of within-cluster dispersion for all clusters. The results of this study indicate that the K-Means algorithm consistently outperforms both Hierarchical Clustering and DBSCAN in terms of accuracy and the clarity of cluster definitions. The findings underscore the robustness of K-Means for applications involving blood donor data, where capturing precise donor groupings can have substantial implications for healthcare logistics and resource allocation.
These insights pave the way for further research into the optimization of clustering techniques in dynamic datasets and their practical applications in medical and other domains.
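The two index definitions given in this abstract can be made concrete with a short sketch. It assumes Euclidean distance and centroid-based scatter, which is the usual formulation; the helper names are hypothetical. With these definitions the Davies-Bouldin Index is non-negative and lower is better, while for the Calinski-Harabasz Index higher is better.

```python
import math


def _groups(points, labels):
    """Map each cluster label to the list of its points."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return groups


def _centroid(rows):
    """Coordinate-wise mean of a list of points."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def davies_bouldin(points, labels):
    """Mean over clusters of the worst (S_i + S_j) / d(c_i, c_j)."""
    groups = _groups(points, labels)
    cents = {k: _centroid(v) for k, v in groups.items()}
    # S_k: mean distance of a cluster's points to its centroid
    scatter = {k: sum(math.dist(p, cents[k]) for p in v) / len(v)
               for k, v in groups.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in groups if j != i)
             for i in groups]
    return sum(worst) / len(worst)


def calinski_harabasz(points, labels):
    """(between dispersion / (k - 1)) / (within dispersion / (n - k))."""
    groups = _groups(points, labels)
    n, k = len(points), len(groups)
    overall = _centroid(points)
    between = within = 0.0
    for members in groups.values():
        c = _centroid(members)
        between += len(members) * math.dist(c, overall) ** 2
        within += sum(math.dist(p, c) ** 2 for p in members)
    return (between / (k - 1)) / (within / (n - k))
```

Both functions need at least two clusters, and `calinski_harabasz` additionally needs fewer clusters than points and some within-cluster spread, otherwise a division by zero occurs.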
17

Amol Bhopale, Sanskar Zanwar, Aarya Balpande, and Jaweria Kazi. "Optimised Cluster-based Approach for Healthcare Data Analytics." International Journal of Next-Generation Computing, February 15, 2023. http://dx.doi.org/10.47164/ijngc.v14i1.1011.

Abstract:
Data analytics is an intriguing study due to the fact that an enormous volume of healthcare data is being generated by different smart IOT-based health tracking devices, and the Artificial Intelligent-based applications. Data analytic tools and unsupervised techniques combinedly make it possible to find and comprehend hidden patterns in a dataset that may not be visible through simple data display. Grouping of voluminous data objects into homogenous clusters is a crucial operation in soft computing. Choosing the right clustering technique and the correct number of partitions to divide the healthcare data for effective analysis is complicated and challenging most of the time. This research work examines clustering approaches on the healthcare datasets with the optimum K-clusters, in order to perform the analysis of the data. In this work, the K-means clustering method is examined and the silhouette score is computed to estimate the optimal K-value and the quality of the cluster.
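Estimating the K-value from the silhouette score, as this abstract describes, amounts to running K-means for several candidate values of K and keeping the one with the highest mean silhouette. The sketch below is an illustration under assumptions, not the authors' procedure: it uses Lloyd's algorithm with a deterministic farthest-point initialisation (chosen so the example is reproducible), Euclidean distance, and a compact silhouette helper so the block runs on its own. The sample points are hypothetical.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(list(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids))))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mean_silhouette(points, labels):
    """Mean of s(i) = (b - a) / max(a, b); singleton clusters contribute 0."""
    total = 0.0
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if j != i:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in dists.values())
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, candidates):
    """Return the candidate K with the highest mean silhouette, plus all scores."""
    scored = {k: mean_silhouette(points, kmeans(points, k)) for k in candidates}
    return max(scored, key=scored.get), scored


# Hypothetical data: three well-separated groups of three points each.
blobs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
         (20, 0), (20, 1), (21, 0)]
chosen_k, all_scores = best_k(blobs, [2, 3, 4])
```

On data with less obvious structure the silhouette curve can be flat or have several near-equal peaks, so the selected K is better read as a starting estimate than as a definitive answer.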
18

Amit, Sajid, Abdulla Al Kafy, Mushfiqur Rahman, and Iftakhar Ahmed. "Youth Capability Ecosystems and Strategic Business Models: Leveraging Market Segmentation for Sustainable Development in Emerging Economies." Business Strategy & Development 8, no. 2 (2025). https://doi.org/10.1002/bsd2.70137.

Abstract:
As businesses increasingly seek to align commercial strategies with development goals in emerging markets, understanding the complex youth capability landscape becomes crucial for sustainable growth and impact. This research examines how businesses can align commercial strategies with sustainable development goals by understanding youth capability ecosystems in emerging markets, focusing on Bangladesh as a representative case. Through a mixed-methods approach with advanced data science techniques, we analyzed survey data from 400 youth respondents in Dhaka to identify distinct market segments based on capability profiles. Our methodological triangulation applying K-Means clustering (silhouette score: 0.670), Hierarchical clustering (0.778), and DBSCAN (0.782) revealed four robust youth segments with unique development potentials: high potential entrepreneurs, digital workforce, traditional employment seekers, and skill development candidates. Network analysis identified digital proficiency as the most central capability (betweenness centrality: 0.58), demonstrating strong correlations with career confidence (0.93) and entrepreneurial intent (0.92). Business model suitability analysis quantified segment-specific alignment patterns, with entrepreneurship incubators showing exceptional alignment with high potential entrepreneurs (score approaching 1.0) and digital workforce platforms strongly aligning with the digital workforce segment (0.85). SDG alignment analysis revealed that High Potential Entrepreneurs demonstrate strongest alignment with SDG 9: industry, innovation and infrastructure (0.77), while digital workforce shows substantial alignment with SDG 8: decent work (0.69). These findings provide a robust framework for strategic youth engagement in emerging markets, enabling businesses to create shared value while contributing to sustainable development.
The study advances both theoretical understanding of youth capability ecosystems and practical approaches to inclusive business models in emerging economies experiencing demographic transitions.