
Journal articles on the topic "ML classification algorithms"

Author: Grafiati

Published: 10 January 2023

Updated: 28 January 2023


Consult the top 50 journal articles for research on the topic "ML classification algorithms".


1

Shehab, Sara, and Arabi Keshk. "Breast Cancer Classification Using Ml Algorithms." Kafrelsheikh Journal of Information Sciences 3, no. 1 (June 7, 2022): 1–7. http://dx.doi.org/10.21608/kjis.2022.159008.1010.


2

Patidar, Muskan. "Cyber Bullying Detection for Twitter Using ML Classification Algorithms." International Journal for Research in Applied Science and Engineering Technology 9, no. 11 (November 30, 2021): 24–29. http://dx.doi.org/10.22214/ijraset.2021.38701.

Abstract:

Social networking platforms have given us more opportunities than ever before, and their benefits are undeniable. Despite these benefits, people may be humiliated, insulted, bullied, and harassed by anonymous users, strangers, or peers. Cyberbullying refers to the use of technology to humiliate and slander other people. It takes the form of hate messages sent through social media and emails. With the exponential increase in social media users, cyberbullying has emerged as a form of bullying through electronic messages. We propose a possible solution to this problem: our project aims to detect cyberbullying in tweets using ML classification algorithms such as Naïve Bayes, KNN, Decision Tree, Random Forest, Support Vector, etc. We also apply the NLTK (Natural Language Toolkit), with unigram, bigram, trigram, and n-gram features, to Naïve Bayes to check its accuracy. Finally, we compare the results of the proposed and baseline features with other machine learning algorithms. The comparison indicates the significance of the proposed features in cyberbullying detection.

Keywords: Cyber bullying, Machine Learning Algorithms, Twitter, Natural Language Toolkit

3

Dey, Sumagna. "DIABETES PREDICTION AND VALIDATION MODEL USING ML CLASSIFICATION ALGORITHMS." International Journal of Advanced Research in Computer Science 11, no. 5 (October 20, 2020): 59–63. http://dx.doi.org/10.26483/ijarcs.v11i5.6654.


4

Pathan, Munir S., S. M. Pradhan, and T. Palani Selvam. "MACHINE LEARNING ALGORITHMS FOR IDENTIFICATION OF ABNORMAL GLOW CURVES AND ASSOCIATED ABNORMALITY IN CaSO4:DY-BASED PERSONNEL MONITORING DOSIMETERS." Radiation Protection Dosimetry 190, no. 3 (July 2020): 342–51. http://dx.doi.org/10.1093/rpd/ncaa108.

Abstract:

In the present study, machine learning (ML) methods for the identification of abnormal glow curves (GC) of CaSO4:Dy-based thermoluminescence dosimeters in individual monitoring are presented. The classifier algorithms random forest (RF), artificial neural network (ANN), and support vector machine (SVM) are employed for identifying not only the abnormal glow curve but also the type of abnormality. For the first time, the simplest and computationally efficient algorithm based on RF is presented for GC classifications. About 4000 GCs are used for the training and validation of ML algorithms. The performance of all algorithms is compared by using various parameters. Results show a fairly good accuracy of 99.05% for the classification of GCs by the RF algorithm, whereas 96.7% and 96.1% accuracy is achieved using ANN and SVM, respectively. The RF-based classifier is recommended for GC classification as well as for assisting the fault determination of the TLD reader system.

5

Sipper, Moshe. "High Per Parameter: A Large-Scale Study of Hyperparameter Tuning for Machine Learning Algorithms." Algorithms 15, no. 9 (September 2, 2022): 315. http://dx.doi.org/10.3390/a15090315.

Abstract:

Hyperparameters in machine learning (ML) have received a fair amount of attention, and hyperparameter tuning has come to be regarded as an important step in the ML pipeline. However, just how useful is said tuning? While smaller-scale experiments have been previously conducted, herein we carry out a large-scale investigation, specifically one involving 26 ML algorithms, 250 datasets (regression and both binary and multinomial classification), 6 score metrics, and 28,857,600 algorithm runs. Analyzing the results we conclude that for many ML algorithms, we should not expect considerable gains from hyperparameter tuning on average; however, there may be some datasets for which default hyperparameters perform poorly, especially for some algorithms. By defining a single hp_score value, which combines an algorithm’s accumulated statistics, we are able to rank the 26 ML algorithms from those expected to gain the most from hyperparameter tuning to those expected to gain the least. We believe such a study shall serve ML practitioners at large.


6

Ambavkar, Om, Prathmesh Bharti, Amit Chaurasiya, Roshan Chauhan, and Mahalaxmi Palinje. "Review on IDS based on ML Algorithms." International Journal for Research in Applied Science and Engineering Technology 10, no. 11 (November 30, 2022): 169–74. http://dx.doi.org/10.22214/ijraset.2022.47284.

Abstract:

Intrusion detection is one of the challenging problems encountered by the modern network security industry. The growing pace of digital attacks on networks in recent years undermines the protection and security of computer infrastructure and computers. Intrusion detection and prevention systems are becoming a critical part of computer networks and network security. Various approaches have been proposed to determine the most effective features and hence enhance the efficiency of intrusion detection systems; the methods include machine learning (ML)-based approaches, Bayesian-based algorithms, Random Forest, SVM, and Decision Tree. This paper presents an intensive survey of research articles that used single, hybrid, and ensemble classification algorithms. The outcome metrics, weaknesses, and datasets used by the surveyed articles in the development of IDS were compared. A future direction for potential research is also given. The paper covers the latest research papers on the use of machine learning classifiers in intrusion detection systems.

7

Camele, Genaro, Waldo Hasperué, Franco Ronchetti, and Facundo Manuel Quiroga. "Statistical analysis of the performance of four Apache Spark ML algorithms." Journal of Computer Science and Technology 22, no. 2 (October 17, 2022): e14. http://dx.doi.org/10.24215/16666038.22.e14.

Abstract:

Feature selection (FS) techniques generally require repeatedly training and evaluating models to assess the importance of each feature for a particular task. However, due to the increasing size of currently available databases, distributed processing has become a necessity for many tasks. In this context, the Apache Spark ML library is one of the most widely used libraries for performing classification and other tasks with large datasets. Therefore, knowing both the predictive performance and efficiency of its main algorithms before applying a FS technique is crucial to planning computations and saving time. In this work, a comparative study of four Spark ML classification algorithms is carried out, statistically measuring execution times and predictive power based on the number of attributes from a colon cancer database. Results were statistically analyzed, showing that, although Random Forest and Naïve Bayes are the algorithms with the shortest execution times, Support Vector Machine obtains models with the best predictive power. The study of the performance of these algorithms is interesting as they are applied in many different problems, such as classification of pathologies from epigenomic data, image classification, prediction of computer attacks in network security problems, among others.

8

Özsoy, Salih, Gökhan Gümüş, and Savriddin Khalilov. "C4.5 Versus Other Decision Trees: A Review." Computer Engineering and Applications Journal 4, no. 3 (September 20, 2015): 173–82. http://dx.doi.org/10.18495/comengapp.v4i3.141.

Abstract:

In this study, Data Mining, one of the latest technologies in Information Systems, was introduced, and Classification, a Data Mining method, and the Classification algorithms were discussed. A classification was performed using the C4.5 decision tree algorithm on a dataset about Labor Relations from http://archive.ics.uci.edu/ml/datasets.html. Finally, the C4.5 algorithm was compared with some other decision tree algorithms. C4.5 was one of the successful classifiers.

9

Lee, Sangwoo, Eun Choe, and Boram Park. "Exploration of Machine Learning for Hyperuricemia Prediction Models Based on Basic Health Checkup Tests." Journal of Clinical Medicine 8, no. 2 (February 2, 2019): 172. http://dx.doi.org/10.3390/jcm8020172.

Abstract:

Background: Machine learning (ML) is a promising methodology for classification and prediction applications in healthcare. However, this method has not been practically established for clinical data. Hyperuricemia is a biomarker of various chronic diseases. We aimed to predict uric acid status from basic healthcare checkup test results using several ML algorithms and to evaluate the performance. Methods: We designed a prediction model for hyperuricemia using a comprehensive health checkup database designed by the classification of ML algorithms, such as discrimination analysis, K-nearest neighbor, naïve Bayes (NBC), support vector machine, decision tree, and random forest classification (RFC). The performance of each algorithm was evaluated and compared with the performance of a conventional logistic regression (CLR) algorithm by receiver operating characteristic curve analysis. Results: Of the 38,001 participants, 7705 were hyperuricemic. For the maximum sensitivity criterion, NBC showed the highest sensitivity (0.73), and RFC showed the second highest (0.66); for the maximum balanced classification rate (BCR) criterion, RFC showed the highest BCR (0.68), and NBC showed the second highest (0.66) among the various ML algorithms for predicting uric acid status. In a comparison to the performance of NBC (area under the curve (AUC) = 0.669, 95% confidence intervals (CI) = 0.669–0.675) and RFC (AUC = 0.775, 95% CI 0.770–0.780) with a CLR algorithm (AUC = 0.568, 95% CI = 0.563–0.571), NBC and RFC showed significantly better performance (p < 0.001). Conclusions: The ML model was superior to the CLR model for the prediction of hyperuricemia. Future studies are needed to determine the best-performing ML algorithms based on data set characteristics. We believe that this study will be informative for studies using ML tools in clinical research.
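
The sensitivity and balanced classification rate (BCR) figures quoted in this abstract can be illustrated with a minimal Python sketch. It assumes the common definition of BCR as the mean of sensitivity and specificity; the abstract does not state the formula, so that definition is an assumption here, and the counts are hypothetical illustration values, not data from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of actual negatives that were rejected."""
    return tn / (tn + fp)


def balanced_classification_rate(tp: int, fn: int, tn: int, fp: int) -> float:
    """Assumed definition: arithmetic mean of sensitivity and specificity."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


# Hypothetical confusion-matrix counts (not taken from the paper).
tp, fn, tn, fp = 70, 30, 60, 40
bcr = balanced_classification_rate(tp, fn, tn, fp)
print(f"sensitivity={sensitivity(tp, fn):.2f}, "
      f"specificity={specificity(tn, fp):.2f}, BCR={bcr:.2f}")
```

Unlike plain accuracy, a measure of this kind is not dominated by the majority class, which matters for imbalanced data such as the 7705 hyperuricemic participants out of 38,001 reported above.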


10

Jafarzadeh, Hamid, Masoud Mahdianpari, Eric Gill, Fariba Mohammadimanesh, and Saeid Homayouni. "Bagging and Boosting Ensemble Classifiers for Classification of Multispectral, Hyperspectral and PolSAR Data: A Comparative Evaluation." Remote Sensing 13, no. 21 (November 2, 2021): 4405. http://dx.doi.org/10.3390/rs13214405.

Abstract:

In recent years, several powerful machine learning (ML) algorithms have been developed for image classification, especially those based on ensemble learning (EL). In particular, Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) methods have attracted researchers’ attention in data science due to their superior results compared to other commonly used ML algorithms. Despite their popularity within the computer science community, they have not yet been well examined in detail in the field of Earth Observation (EO) for satellite image classification. As such, this study investigates the capability of different EL algorithms, generally known as bagging and boosting algorithms, including Adaptive Boosting (AdaBoost), Gradient Boosting Machine (GBM), XGBoost, LightGBM, and Random Forest (RF), for the classification of Remote Sensing (RS) data. In particular, different classification scenarios were designed to compare the performance of these algorithms on three different types of RS data, namely high-resolution multispectral, hyperspectral, and Polarimetric Synthetic Aperture Radar (PolSAR) data. Moreover, the Decision Tree (DT) single classifier, as a base classifier, is considered to evaluate the classification’s accuracy. The experimental results demonstrated that the RF and XGBoost methods for the multispectral image, the LightGBM and XGBoost methods for hyperspectral data, and the XGBoost and RF algorithms for PolSAR data produced higher classification accuracies compared to other ML techniques. This demonstrates the great capability of the XGBoost method for the classification of different types of RS data.


11

Ilievski, Gjorgji, and Pero Latkoski. "Network Traffic Classification in an NFV Environment using Supervised ML Algorithms." Journal of Telecommunications and Information Technology 3, no. 2021 (September 30, 2021): 23–31. http://dx.doi.org/10.26636/jtit.2021.153421.


12

Zeng, Zengri, Wei Peng, and Baokang Zhao. "Improving the Accuracy of Network Intrusion Detection with Causal Machine Learning." Security and Communication Networks 2021 (November 3, 2021): 1–18. http://dx.doi.org/10.1155/2021/8986243.

Abstract:

In recent years, machine learning (ML) algorithms have been proved effective in intrusion detection. However, as the ML algorithms are mainly applied to evaluate the anomaly of the network, the detection accuracy for cyberattacks of multiple types cannot be fully guaranteed. The existing algorithms for network intrusion detection based on ML or feature selection rely on spurious correlations between features and cyberattacks, causing several wrong classifications. In order to tackle the abovementioned problems, this research aimed to establish a novel network intrusion detection system (NIDS) based on causal ML. The proposed system started with the identification of noisy features by causal intervention, while only the features that had a causality with cyberattacks were preserved. Then, the ML algorithm was used to make a preliminary classification to select the most relevant types of cyberattacks. As a result, the unique labeled cyberattack could be detected by the counterfactual detection algorithm. In addition to a relatively stable accuracy, the complexity of cyberattack detection could also be effectively reduced, with a maximum reduction to 94% on the size of training features. Moreover, in case of the availability of several types of cyberattacks, the detection accuracy was significantly improved compared with the previous ML algorithms.

13

Ponukumati, B. K., P. Sinha, M. K. Maharana, A. V. P. Kumar, and A. Karthik. "An Intelligent Fault Detection and Classification Scheme for Distribution Lines Using Machine Learning." Engineering, Technology & Applied Science Research 12, no. 4 (August 1, 2022): 8972–77. http://dx.doi.org/10.48084/etasr.5107.

Abstract:

The current paper focuses on the development and deployment of Machine Learning (ML) based algorithms for the classification and detection of different faults in the electrical distribution system. The methodology adopted using ML has higher computational accuracy than traditional computational algorithms. The parameters involved in developing ML for fault detection/classification are fundamental frequency, fault voltage, and current components at fault situations. During faults, the current and voltage waveforms consist of high-frequency transient signals. The Wavelet Decomposition (WD) technique is used to break down transient signals to obtain the required information. To investigate the performance of the ML-based algorithms, an IEEE 33 bus system is utilized, and a fault is generated in the Matlab/Simulink environment. The methodologies used for fault detection and classification are K Nearest Neighbor (KNN), Decision Tree (DT), and Support Vector Machine (SVM). The performance of the designed algorithm is assessed by employing a confusion matrix, and the results demonstrated extraordinarily high accuracy.

14

Chen, Siji, Bin Shen, Xin Wang, and Sang-Jo Yoo. "A Strong Machine Learning Classifier and Decision Stumps Based Hybrid AdaBoost Classification Algorithm for Cognitive Radios." Sensors 19, no. 23 (November 20, 2019): 5077. http://dx.doi.org/10.3390/s19235077.

Abstract:

Machine learning (ML) based classification methods have been viewed as one kind of alternative solution for cooperative spectrum sensing (CSS) in recent years. In this paper, ML techniques based CSS algorithms are investigated for cognitive radio networks (CRN). Specifically, a strong machine learning classifier (MLC) and decision stumps (DS) based adaptive boosting (AdaBoost) classification mechanism is proposed for pattern classification of the primary user’s behavior in the network. The conventional AdaBoost algorithm only combines multiple sub-classifiers and produces a strong weight based on their weights in classification. Taking into account the fact that the strong MLC and the weak DS serve as different sub-classifiers in classification, we propose employing a strong MLC as the first-stage classifier and DS as the second-stage classifiers, to eventually determine the class that the spectrum energy vector belongs to. We verify in simulations that the proposed hybrid AdaBoost algorithms are capable of achieving a higher detection probability than the conventional ML based spectrum sensing algorithms and the conventional hard fusion based CSS schemes.


15

Trupthi, Dr M. "Room Occupancy Prediction Using ML." International Journal for Research in Applied Science and Engineering Technology 10, no. 6 (June 30, 2022): 4924–30. http://dx.doi.org/10.22214/ijraset.2022.45122.

Abstract:

Nowadays, buildings are growing rapidly in number and use large amounts of energy. Most of the energy is wasted on HVAC systems. Occupancy (presence and number of occupants) is one of the most important factors affecting the energy efficiency of HVAC systems. Hence, knowing occupancy information is vital for demand-driven HVAC controls, which directly affect energy-related building control systems. Occupancy information obtained without privacy issues (such as those raised by cameras) has therefore gained importance. Recent work has addressed this problem. Existing solutions use sensor data for yes-or-no classification, estimation for low occupant counts, and so on. With yes-or-no classification, energy consumption may not be reduced to a great extent. In this work, we have trained a model that classifies the state of a room based on the number of occupants (empty, low, fair, high). For this, we use data collected by different sensors (CO2, humidity, temperature). We use classification algorithms to train the model. This information could be used to solve energy-related problems in buildings.

16

Shafiq, Muhammad, Xiangzhan Yu, Asif Ali Laghari, and Dawei Wang. "Effective Feature Selection for 5G IM Applications Traffic Classification." Mobile Information Systems 2017 (2017): 1–12. http://dx.doi.org/10.1155/2017/6805056.

Abstract:

Recently, machine learning (ML) algorithms have widely been applied in Internet traffic classification. However, due to the inappropriate features selection, ML-based classifiers are prone to misclassify Internet flows as that traffic occupies majority of traffic flows. To address this problem, a novel feature selection metric named weighted mutual information (WMI) is proposed. We develop a hybrid feature selection algorithm named WMI_ACC, which filters most of the features with WMI metric. It further uses a wrapper method to select features for ML classifiers with accuracy (ACC) metric. We evaluate our approach using five ML classifiers on the two different network environment traces captured. Furthermore, we also apply Wilcoxon pairwise statistical test on the results of our proposed algorithm to find out the robust features from the selected set of features. Experimental results show that our algorithm gives promising results in terms of classification accuracy, recall, and precision. Our proposed algorithm can achieve 99% flow accuracy results, which is very promising.


17

Kishimoto, Akihiro, Djallel Bouneffouf, Radu Marinescu, Parikshit Ram, Ambrish Rawat, Martin Wistuba, Paulito Palmes, and Adi Botea. "Bandit Limited Discrepancy Search and Application to Machine Learning Pipeline Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (June 28, 2022): 10228–37. http://dx.doi.org/10.1609/aaai.v36i9.21263.

Abstract:

Optimizing a machine learning (ML) pipeline has been an important topic of AI and ML. Despite recent progress, pipeline optimization remains a challenging problem, due to potentially many combinations to consider as well as slow training and validation. We present the BLDS algorithm for optimized algorithm selection (ML operations) in a fixed ML pipeline structure. BLDS performs multi-fidelity optimization for selecting ML algorithms trained with smaller computational overhead, while controlling its pipeline search based on multi-armed bandit and limited discrepancy search. Our experiments on well-known classification benchmarks show that BLDS is superior to competing algorithms. We also combine BLDS with hyperparameter optimization, empirically showing the advantage of BLDS.


18

Wu, Jiande, and Chindo Hicks. "Breast Cancer Type Classification Using Machine Learning." Journal of Personalized Medicine 11, no. 2 (January 20, 2021): 61. http://dx.doi.org/10.3390/jpm11020061.

Abstract:

Background: Breast cancer is a heterogeneous disease defined by molecular types and subtypes. Advances in genomic research have enabled use of precision medicine in clinical management of breast cancer. A critical unmet medical need is distinguishing triple negative breast cancer, the most aggressive and lethal form of breast cancer, from non-triple negative breast cancer. Here we propose use of a machine learning (ML) approach for classification of triple negative breast cancer and non-triple negative breast cancer patients using gene expression data. Methods: We performed analysis of RNA-Sequence data from 110 triple negative and 992 non-triple negative breast cancer tumor samples from The Cancer Genome Atlas to select the features (genes) used in the development and validation of the classification models. We evaluated four different classification models including Support Vector Machines, K-nearest neighbor, Naïve Bayes and Decision tree using features selected at different threshold levels to train the models for classifying the two types of breast cancer. For performance evaluation and validation, the proposed methods were applied to independent gene expression datasets. Results: Among the four ML algorithms evaluated, the Support Vector Machine algorithm was able to classify breast cancer more accurately into triple negative and non-triple negative breast cancer and had less misclassification errors than the other three algorithms evaluated. Conclusions: The prediction results show that ML algorithms are efficient and can be used for classification of breast cancer into triple negative and non-triple negative breast cancer types.


19

Batista, João E., Ana I. R. Cabral, Maria J. P. Vasconcelos, Leonardo Vanneschi, and Sara Silva. "Improving Land Cover Classification Using Genetic Programming for Feature Construction." Remote Sensing 13, no. 9 (April 21, 2021): 1623. http://dx.doi.org/10.3390/rs13091623.

Abstract:

Genetic programming (GP) is a powerful machine learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in the field of remote sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs feature construction by evolving hyperfeatures from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyperfeatures from satellite bands to improve the classification of land cover types. We add the evolved hyperfeatures to the reference datasets and observe a significant improvement of the performance of three state-of-the-art ML algorithms (decision trees, random forests, and XGBoost) on multiclass classifications and no significant effect on the binary classifications. We show that adding the M3GP hyperfeatures to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI, and NBR. We also compare the performance of the M3GP hyperfeatures in the binary classification problems with those created by other feature construction methods such as FFX and EFS.


20

Pereira, Leonardo T., and Claudio F. M. Toledo. "Speeding up Search-Based Algorithms for Level Generation in Physics-Based Puzzle Games." International Journal on Artificial Intelligence Tools 26, no. 05 (October 2017): 1760019. http://dx.doi.org/10.1142/s0218213017600193.

Abstract:

This research uses Machine Learning (ML) techniques to help search-based (SB) algorithms improve level generation for physics-based puzzle games. The performance of these algorithms is improved by reducing simulation time when the ML techniques are applied. Classification algorithms prevent levels with undesired traits from being evaluated during the simulation phase of the SB procedures. An Angry Birds clone is used for conducting the experiments, and the results show that the combined approach improves on every SB algorithm used by itself.

21

Takkar, Sakshi, Aman Singh, and Babita Pandey. "Application of Machine Learning Algorithms to a Well Defined Clinical Problem: Liver Disease." International Journal of E-Health and Medical Communications 8, no. 4 (October 2017): 38–60. http://dx.doi.org/10.4018/ijehmc.2017100103.

Abstract:

Liver diseases represent a major health burden worldwide. Machine learning (ML) algorithms have been extensively used to diagnose liver disease. This study accordingly aims to employ various individual and integrated ML algorithms on distinct liver disease datasets to evaluate their diagnostic performance, to integrate a dimensionality reduction method with the ML algorithms to analyze the variation in results, to find the best classification model, and to analyze the merits and demerits of these algorithms. KNN and PCA-KNN emerged as the top individual and integrated models, respectively. The study also concluded that no single algorithm shows the best results for all types of datasets and that integrated models do not always perform better than individual ones. It is observed that no algorithm is perfect and that the performance of an algorithm depends entirely on the dataset type and structure, its number of observations, its dimensions, and the decision boundary.

22

Wang, Shuang, Jian Luo, and Liangfu Luo. "Large-scale Text Multiclass Classification Using Spark ML Packages." Journal of Physics: Conference Series 2171, no. 1 (January 1, 2022): 012022. http://dx.doi.org/10.1088/1742-6596/2171/1/012022.

Abstract:

Machine learning is widely used in the field of natural language processing. Text classification is an important branch of natural language processing, mainly used to determine the topic of news information and to judge the emotional tendency reflected by social data. However, as data scale expands, traditional machine learning algorithms encounter various performance bottlenecks due to the limited resources of a single node. The machine learning library ML provided by the distributed in-memory computing framework Spark can efficiently process large-scale datasets. Based on Spark ML, this paper uses logistic regression to classify preprocessed data from the standard dataset 20newsgroups. The classification accuracy and F1-score can reach 89.64% and 89.64% respectively. Meanwhile, 5-fold cross-validation experiments show that the result is reliable and the procedure design is feasible.

23

Oo, Tin Ko, Noppol Arunrat, Sukanya Sereenonchai, Achara Ussawarujikulchai, Uthai Chareonwong, and Winai Nutmagul. "Comparing Four Machine Learning Algorithms for Land Cover Classification in Gold Mining: A Case Study of Kyaukpahto Gold Mine, Northern Myanmar." Sustainability 14, no. 17 (August 29, 2022): 10754. http://dx.doi.org/10.3390/su141710754.

Повний текст джерела

Анотація:

Numerous studies have been undertaken to determine the optimal land use/cover classification algorithm. However, there have not been many studies that have compared and evaluated the performance of maximum likelihood (ML), random forest (RF), support vector machine (SVM), and classification and regression trees (CART) using ASTER imagery, especially in a mining district. Therefore, this study aims to investigate land use/cover (LULC) change over three decades (1990–2020), comparing the performance of the ML, RF, SVM, and CART machine learning algorithms. The Landsat and ASTER data were retrieved using Google Earth Engine (GEE). Traditional ML classification was performed on ArcGIS 10.2 software while RF, SVM, and CART classification were undertaken on GEE. Then, thematic accuracy assessments were conducted for the four algorithms and their performances were compared. The results showed that the largest changes in area occurred in forest cover that decreased from 37.8 to 27.3 km2 during the three decades. The remarkable expansion of gold mining occurred during 2005–2010 with the increases of 1.6%. The mining land rose by 2.9% during the study period whereas agricultural land increased significantly by 10.7% between 1990 and 2020. When comparing the four algorithms, the RF algorithm gives the highest accuracy with an overall accuracy of 95.85% while SVM follows RF with 91.69%. This study proved that RF is the best choice for optimal land use/cover classification, particularly in the mining district.

24

Tarasova, L. V., and L. N. Smirnova. "Satellite-based analysis of classification algorithms applied to the riparian zone of the Malaya Kokshaga river." IOP Conference Series: Earth and Environmental Science 932, no. 1 (December 1, 2021): 012012. http://dx.doi.org/10.1088/1755-1315/932/1/012012.

Abstract:

The paper comparatively analyses the accuracy of land cover classification in the riparian zone of the Malaya Kokshaga river in the Mari El Republic of Russia using Sentinel-2A satellite images and three supervised classification algorithms, Maximum Likelihood (ML), Decision Tree (DT), and Neural Net (NN), in the ENVI-5.2 software package. Six main classes of land cover were identified based on field studies: coniferous, mixed (deciduous), shrublands, herbaceous, and water. The assessment of the area and structure of land cover showed that forest covers 76% of the entire riparian area of the Malaya Kokshaga river. The analysis of the thematic mapping results shows that the overall classification accuracy is 96.09% for the ML algorithm, 94.51% for NN, and 86.54% for DT. The producer’s accuracy and user’s accuracy for most classes are highest when the ML algorithm is used. For the NN algorithm, the maximum producer’s accuracy is observed for the mixed (deciduous) class, while for the DT algorithm it is observed for the coniferous class. With all three algorithms, the water and bare land classes were confused with each other, which calls for more detailed work when assessing riparian forest ecosystems.
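
Overall, producer's, and user's accuracy are all derived from a confusion matrix. The following is a minimal Python sketch of those three measures, not the ENVI workflow used in the paper. The function name `accuracy_report`, the row/column convention (rows are reference classes, columns are classified classes), and the example counts are assumptions made purely for illustration.

```python
def accuracy_report(cm):
    """Return (overall, producer's, user's) accuracy from a square confusion matrix.

    Assumed convention: cm[i][j] counts samples whose reference class is i
    and whose classified (mapped) class is j. Every class is assumed to have
    at least one reference sample and one classified sample.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    # Overall accuracy: correctly classified samples over all samples.
    overall = sum(cm[i][i] for i in range(n)) / total
    # Producer's accuracy: diagonal cell over the reference (row) total.
    producers = [cm[i][i] / sum(cm[i]) for i in range(n)]
    # User's accuracy: diagonal cell over the classified (column) total.
    users = [cm[i][i] / sum(cm[r][i] for r in range(n)) for i in range(n)]
    return overall, producers, users


# Illustrative counts for three hypothetical classes (not data from the paper).
example_cm = [
    [45, 5, 0],
    [3, 40, 7],
    [0, 2, 48],
]
overall, producers, users = accuracy_report(example_cm)
print(f"overall accuracy: {overall:.4f}")
print("producer's accuracy:", [round(p, 4) for p in producers])
print("user's accuracy:", [round(u, 4) for u in users])
```

Producer's accuracy answers how much of a reference class was found by the classifier, while user's accuracy answers how much of a mapped class is actually correct, which is why the two can differ for the same class.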

25

Li, Yuping. "Similar Classification Algorithm for Educational and Teaching Knowledge Based on Machine Learning." Wireless Communications and Mobile Computing 2022 (May 23, 2022): 1–9. http://dx.doi.org/10.1155/2022/7222236.

Abstract:

From ancient times, machines did adhere to the commands that a human or a user prepared. According to the program, the machines are controlled by implementing machine learning (ML). It plays a significant part in the development of information technology (IT) companies and the rise of the education system. Using stored memories, people learn new things, making them feel better than before. Machines are pretty different from human knowledge. Instead of using memory power, they use statistical comparison to analyze the data. Here, the amount of data is stored in a database, and according to the reaction received from the user, it gets additional data to create new data. For example, once a person hears music using the application, they will hear repeated music before further entry. In this case, the application is working based on the machine learning algorithm. First, it collects the information from the user, and then, it uses the same information (data) to make the user’s work more efficient when they return. The existing system like Support Vector Machine (SVM) and learning management system approaches the necessity and development of the higher education system using machine learning algorithms. This proposed system focuses on classifying education and teaching knowledge by implementing the machine learning-based similar classification algorithm (ML-SCA). ML-SCA focuses on classifying similar teaching videos and the recommendations to improve the teaching and academic knowledge for the teachers and the students. ML-SCA is compared with the existing neural network and K-means algorithms. Based on the efficiency results, it is observed that the proposed ML-SCA has achieved 92% higher than the existing algorithms.

26

Li, Hui, and Zeming Li. "Text Classification Based on Machine Learning and Natural Language Processing Algorithms." Wireless Communications and Mobile Computing 2022 (July 19, 2022): 1–12. http://dx.doi.org/10.1155/2022/3915491.

Abstract:

Nowadays, with the development of media technology, people receive more and more information, but the current classification methods have the disadvantages of low classification efficiency and inability to identify multiple languages. In view of this, this paper is aimed at improving the text classification method by using machine learning and natural language processing technology. For text classification technology, this paper combines the technical requirements and application scenarios of text classification with ML to optimize the classification. For the application of natural language processing (NLP) technology in text classification, this paper puts forward the Trusted Platform Module (TPM) text classification algorithm. In the experiment of distinguishing spam from legitimate mail by text recognition, all performance indexes of the TPM algorithm are superior to other algorithms, and the accuracy of the TPM algorithm on different datasets is above 95%.

27

Xue, Joel, and Long Yu. "Applications of Machine Learning in Ambulatory ECG." Hearts 2, no. 4 (October 13, 2021): 472–94. http://dx.doi.org/10.3390/hearts2040037.

Abstract:

The ambulatory ECG (AECG) is an important diagnostic tool for many heart electrophysiology-related cases. AECG covers a wide spectrum of devices and applications. At the core of these devices and applications are the algorithms responsible for signal conditioning, ECG beat detection and classification, and event detections. Over the years, there has been huge progress for algorithm development and implementation thanks to great efforts by researchers, engineers, and physicians, alongside the rapid development of electronics and signal processing, especially machine learning (ML). The current efforts and progress in machine learning fields are unprecedented, and many of these ML algorithms have also been successfully applied to AECG applications. This review covers some key AECG applications of ML algorithms. However, instead of doing a general review of ML algorithms, we are focusing on the central tasks of AECG and discussing what ML can bring to solve the key challenges AECG is facing. The center tasks of AECG signal processing listed in the review include signal preprocessing, beat detection and classification, event detection, and event prediction. Each AECG device/system might have different portions and forms of those signal components depending on its application and the target, but these are the topics most relevant and of greatest concern to the people working in this area.

28

Boyko, Nataliya. "Research into machine learning algorithms for the construction of mathematical models of multimodal data classification problems." Computational Problems of Electrical Engineering 11, no. 2 (November 30, 2021): 1–11. http://dx.doi.org/10.23939/jcpee2021.02.001.

Abstract:

Currently, machine learning algorithms (ML) are increasingly integrated into everyday life. There are many areas of modern life where classification methods are already used. Methods taking into account previous predictions and errors that are calculated as a result of data integration to obtain forecasts for obtaining the classification result are investigated. A general overview of classification methods is conducted. Experiments on machine learning algorithms for multimodal data are performed. It is important to consider all the characteristics of metrics and features when using ML algorithms to predict multimodal data. The main advantages and disadvantages of Gradient Boosting, Random Forest, Logistic Regression and XGBoost algorithms are analyzed in the work.

29

Tan, Lijuan, Jinzhu Lu, and Huanyu Jiang. "Tomato Leaf Diseases Classification Based on Leaf Images: A Comparison between Classical Machine Learning and Deep Learning Methods." AgriEngineering 3, no. 3 (July 9, 2021): 542–58. http://dx.doi.org/10.3390/agriengineering3030035.

Abstract:

Tomato production can be greatly reduced due to various diseases, such as bacterial spot, early blight, and leaf mold. Rapid recognition and timely treatment of diseases can minimize tomato production loss. Nowadays, a large number of researchers (including different institutes, laboratories, and universities) have developed and examined various traditional machine learning (ML) and deep learning (DL) algorithms for plant disease classification. However, through past survey analysis, we found that there are no studies comparing the classification performance of ML and DL for the tomato disease classification problem. The performance and outcomes of different traditional ML and DL (a subset of ML) methods may vary depending on the datasets used and the tasks to be solved. This study generally aimed to identify the most suitable ML/DL models for the PlantVillage tomato dataset and the tomato disease classification problem. For machine learning algorithm implementation, we used different methods to extract disease features manually. In our study, we extracted a total of 52 texture features using local binary pattern (LBP) and gray level co-occurrence matrix (GLCM) methods and 105 color features using color moment and color histogram methods. Among all the feature extraction methods, the COLOR+GLCM method obtained the best result. By comparing the different methods, we found that the metrics (accuracy, precision, recall, F1 score) of the tested deep learning networks (AlexNet, VGG16, ResNet34, EfficientNet-b0, and MobileNetV2) were all better than those of the measured machine learning algorithms (support vector machine (SVM), k-nearest neighbor (kNN), and random forest (RF)). Furthermore, we found that, for our dataset and classification task, among the tested ML/DL algorithms, the ResNet34 network obtained the best results, with accuracy of 99.7%, precision of 99.6%, recall of 99.7%, and F1 score of 99.7%.

30

Salehnasab, Cirruse, Abbas Hajifathali, Farkhondeh Asadi, Elham Roshandel, Alireza Kazemi, and Arash Roshanpoor. "Machine Learning Classification Algorithms to Predict aGvHD following Allo-HSCT: A Systematic Review." Methods of Information in Medicine 58, no. 06 (December 2019): 205–12. http://dx.doi.org/10.1055/s-0040-1709150.

Abstract:

Background: Acute graft-versus-host disease (aGvHD) is the most important cause of mortality in patients receiving allogeneic hematopoietic stem cell transplantation. Because it manifests at the stage of severe tissue damage, it is diagnosed late. With the advancement of machine learning (ML), promising real-time models to predict aGvHD have emerged. Objective: This article aims to synthesize the literature on ML classification algorithms for predicting aGvHD, highlighting the algorithms and important predictor variables used. Methods: A systematic review of ML classification algorithms used to predict aGvHD was performed by searching the PubMed, Embase, Web of Science, Scopus, Springer, and IEEE Xplore databases up to April 2019, based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement. Studies focusing on the use of ML classification algorithms to predict aGvHD were considered. Results: After applying the inclusion and exclusion criteria, 14 studies were selected for evaluation. The analysis showed that the algorithms used were Artificial Neural Network (79%), Support Vector Machine (50%), Naive Bayes (43%), k-Nearest Neighbors (29%), Regression (29%), and Decision Trees (14%). Many predictor variables were used in these studies, so we divided them into more abstract categories: biomarkers, demographics, infections, clinical, genes, transplants, drugs, and other variables. Conclusion: Each of these ML algorithms has particular characteristics and different proposed predictors. Therefore, these ML algorithms appear to have high potential for predicting aGvHD if the modeling process is performed correctly.

31

Riya, Meetali, R. M. Samant, Parimal Bartakke, Jayesh Mohite, and Subhasini Priya. "Brain Tumor MRI Image Processing and Classification by Edge Detection using ML Algorithms." International Journal of Computer Applications 184, no. 13 (May 20, 2022): 55–59. http://dx.doi.org/10.5120/ijca2022922132.

32

Belkadi, Omayma, Alexandru Vulpe, Yassin Laaziz, and Simona Halunga. "ML-Based Traffic Classification in an SDN-Enabled Cloud Environment." Electronics 12, no. 2 (January 5, 2023): 269. http://dx.doi.org/10.3390/electronics12020269.

Abstract:

Traffic classification plays an essential role in network security and management; therefore, studying traffic in emerging technologies can be useful in many ways. It can lead to troubleshooting problems, prioritizing specific traffic to provide better performance, detecting anomalies at an early stage, etc. In this work, we aim to propose an efficient machine learning method for traffic classification in an SDN/cloud platform. Traffic classification in SDN allows the management of flows by taking the application’s requirements into consideration, which leads to improved QoS. After our tests were implemented in a cloud/SDN environment, the method that we proposed showed that the supervised algorithms used (Naive Bayes, SVM (SMO), Random Forest, C4.5 (J48)) gave promising results of up to 97% when using the studied features and over 95% when using the generated features.

33

Vergara Reyes, Juliana Alejandra, María Camila Martínez Ordoñez, and Oscar Mauricio Caicedo Rendón. "A benchmarking of the efficiency of supervised ML algorithms in the NFV traffic classification." Sistemas y Telemática 15, no. 42 (October 19, 2017): 47–67. http://dx.doi.org/10.18046/syt.v15i42.2539.

Abstract:

The implementation of NFV improves the flexibility, efficiency, and manageability of networks by leveraging virtualization and cloud computing technologies to deploy computer networks. The implementation of autonomic management and supervised algorithms from Machine Learning [ML] becomes a key strategy to manage this hidden traffic. In this work, we focus on analyzing the traffic features of NFV-based networks while benchmarking the behavior of the supervised ML algorithms J48, Naïve Bayes, and Bayes Net in IP traffic classification with regard to their efficiency, where efficiency is understood as the trade-off between response time and precision. We used two test scenarios (an NFV-based SDN and an NFV-based LTE EPC). The benchmarking results reveal that the Naïve Bayes and Bayes Net algorithms achieve the best performance in traffic classification. In particular, their performance shows a good trade-off between precision and response time, with precision values higher than 80% and 96%, respectively, in times of less than 1.5 s.

34

Ahmed Khaleel, Almas, and Joanna Hussein Al-Khalidy. "Multispectral Image Classification Based on the Bat Algorithm." International journal of electrical and computer engineering systems 13, no. 2 (February 28, 2022): 119–26. http://dx.doi.org/10.32985/ijeces.13.2.4.

Abstract:

There are many traditional classification algorithms used to classify multispectral images, especially those used in remote sensing. But the challenges of using these algorithms for multispectral image classification are that they are slow to implement and have poor classification accuracy. With the development of technologies that mimic nature, many researchers have resorted to using intelligent algorithms instead of traditional algorithms because of their great importance, especially when dealing with large amounts of data. The bat algorithm (BA) is one of the most important of these algorithms. This study aims to verify the possibility of using the BA to classify multispectral images of the study area captured by the Landsat-5 TM satellite. The study area is the Mosul area located in the Nineveh Governorate in northwestern Iraq. The purpose is not only to study the ability of the BA to classify multispectral images but also to obtain a land cover map of this region. The BA showed efficiency in the classification results compared to Maximum Likelihood (ML): the overall classification accuracy reached 82.136% with the BA, while ML reached 79.64%.

35

Fatima Iftikhar, Hafiz Muhammad Mueez Amin, and Ghulam Abbas. "A Feature Fusion Based Hybrid Approach for Breast Cancer Classification." Journal of Computing & Biomedical Informatics 3, no. 01 (March 15, 2022): 243–56. http://dx.doi.org/10.56979/301/2022/37.

Abstract:

Breast cancer is a commonly detected type of cancer in women. According to some estimates, one in nine women is diagnosed with breast cancer. Unfortunately, due to a lack of proper facilities, the diagnosis of breast cancer in patients is being delayed, which is leading to an increase in the possible death rate. Many different statistical methods and machine learning algorithms are employed to make breast cancer detection more accurate. Machine learning (ML) has allowed doctors to achieve remarkable results, and healthcare is using ML-based models to detect breast cancer in women. This allows healthcare data to be analyzed and uses traditional computer-aided detection (CAD) to assess breast cancer. Machine learning has become an accepted clinical practice and allows doctors to evaluate the ML model to detect breast cancer at an early stage. A major aim is to diagnose patients with breast cancer by analyzing patient data and classifying the cases into two categories, with the diagnosis result being Benign "B" or Malignant "M". In this study, different machine learning algorithms are used to classify cancer as either malignant or benign. The Kaggle data set was used for applying these algorithms to get the best accuracy. MLP is the more efficient and accurate algorithm for classifying the breast tumor. The fitted matthews_corrcoef for MLP is 0.89%, and the accuracy score for the random forest is 0.94%.
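
The abstract refers to `matthews_corrcoef`, which is the name of the scikit-learn function for the Matthews correlation coefficient (MCC). As a hedged illustration of what that score measures for a two-class benign/malignant problem, the sketch below computes MCC directly from confusion-matrix counts. The function name `mcc_binary` and the counts are hypothetical and are not taken from the paper.

```python
import math


def mcc_binary(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, treating malignant ("M") as the positive class.
score = mcc_binary(tp=50, tn=40, fp=5, fn=5)
print(round(score, 3))
```

MCC ranges from -1 to 1, where 1 means perfect agreement between predictions and labels and 0 means no better than chance. Unlike plain accuracy, it uses all four cells of the confusion matrix.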

36

Arsioli, B., and P. Dedin. "Machine learning applied to multifrequency data in astrophysics: blazar classification." Monthly Notices of the Royal Astronomical Society 498, no. 2 (August 17, 2020): 1750–64. http://dx.doi.org/10.1093/mnras/staa2449.

Abstract:

The study of machine learning (ML) techniques for the autonomous classification of astrophysical sources is of great interest, and we explore its applications in the context of a multifrequency data-frame. We test the use of supervised ML to classify blazars according to their synchrotron peak frequency, either lower or higher than $10^{15}$ Hz. We select a sample of 4178 blazars labelled as 1279 high synchrotron peak (HSP: $\nu$-peak > $10^{15}$ Hz) and 2899 low synchrotron peak (LSP: $\nu$-peak < $10^{15}$ Hz). A set of multifrequency features was defined to represent each source, including spectral slopes ($\alpha_{\nu_1, \nu_2}$) between the radio, infra-red, optical, and X-ray bands, and also considering IR colours. We describe the optimization of five ML classification algorithms that classify blazars into LSP or HSP: Random forests (RFs), support vector machine (SVM), K-nearest neighbours (KNN), Gaussian Naive Bayes (GNB), and the Ludwig auto-ML framework. In our particular case, the SVM algorithm had the best performance, reaching 93 per cent of balanced accuracy. A joint-feature permutation test revealed that the spectral slopes alpha-radio-infrared (IR) and alpha-radio-optical are the most relevant for the ML modelling, followed by the IR colours. This work shows that ML algorithms can distinguish multifrequency spectral characteristics and handle the classification of blazars into LSPs and HSPs. It is a hint for the potential use of ML for the autonomous determination of broadband spectral parameters (such as the synchrotron $\nu$-peak), or even to search for new blazars in all-sky data bases.
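
Because the sample is imbalanced (1279 HSP against 2899 LSP), the abstract reports balanced accuracy rather than plain accuracy. The sketch below shows why that matters, using the standard definition of balanced accuracy as the mean of per-class recall. It is not the authors' code: the function name `balanced_accuracy` and the trivial "always predict LSP" classifier are assumptions for illustration, and only the two class sizes come from the abstract.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of all samples whose true label is this class.
        indices = [i for i, label in enumerate(y_true) if label == cls]
        correct = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Class sizes from the abstract; the predictions are a deliberately naive baseline.
y_true = ["HSP"] * 1279 + ["LSP"] * 2899
y_pred = ["LSP"] * len(y_true)

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(f"plain accuracy:    {plain:.3f}")
print(f"balanced accuracy: {balanced:.3f}")
```

The naive baseline scores about 69 per cent on plain accuracy simply by following the majority class, but only 50 per cent on balanced accuracy, which is the level a 93 per cent result should be compared against.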

37

Omankwu Obinnaya Chinecherem, Ugwuja Nnenna Esther, and Kanu Chigbundu. "Comprehensive review of supervised machine learning algorithms to identify the best and error free." International Journal of Scholarly Research in Engineering and Technology 2, no. 1 (January 30, 2023): 013–19. http://dx.doi.org/10.56781/ijsret.2023.2.1.0028.

Abstract:

Supervised classification is one of the tasks most frequently carried out by intelligent systems. Supervised Machine Learning (SML) is the search for algorithms that reason from externally supplied instances to produce general hypotheses, which then make predictions about future instances. This paper compares various supervised learning algorithms. Seven different machine learning algorithms were considered: Decision Table, Random Forest (RF), Naïve Bayes (NB), Support Vector Machine (SVM), Neural Networks (Perceptron), JRip, and Decision Tree (J48). It also reviews various Supervised Machine Learning (ML) classification techniques with the aim of identifying the best and error-free algorithm.

38

Wang, Biao, Chengxi Wu, Yunan Zhu, Mingliang Zhang, Hanqiong Li, and Wei Zhang. "Ship Radiated Noise Recognition Technology Based on ML-DS Decision Fusion." Computational Intelligence and Neuroscience 2021 (October 7, 2021): 1–14. http://dx.doi.org/10.1155/2021/8901565.

Abstract:

Ship radiated noise is an important information source of underwater acoustic targets, and it is of great significance to the identification and classification of ship targets. However, there are a lot of interference noises in the water, which leads to the reduction of the model recognition rate. Therefore, the recognition results of radiated noise targets are severely affected. This paper proposes a machine learning Dempster–Shafer (ML-DS) decision fusion method. The algorithm combines the recognition results of machine learning and deep learning. It uses evidence-based decision-making theory to realize feature fusion under different neural network classifiers and improve the accuracy of judgment. First, deep learning algorithms are used to classify two-dimensional spectrogram features and one-dimensional amplitude features extracted from CNN and LSTM networks. The machine learning algorithm SVM is used to classify the chromaticity characteristics of radiated noise. Then, according to the classification results of different classifiers, a basic probability assignment model (BPA) was designed to fuse the recognition results of the classifiers. Finally, according to the classification characteristics of machine learning and deep learning, combined with the decision-making of D-S evidence theory of different times, the decision-making fusion of radiated noise is realized. The results of the experiment show that the two fusions of deep learning combined with one fusion of machine learning can significantly improve the recognition results of low signal-to-noise ratio (SNR) datasets. The lowest fusion recognition result can reach 76.01%, and the average fusion recognition rate can reach 94.92%. Compared with the traditional single feature recognition algorithm, the recognition accuracy is greatly improved. Compared with the traditional one-step fusion algorithm, it can effectively integrate the recognition results of heterogeneous data and heterogeneous networks. The identification method based on ML-DS proposed in this paper can be applied in the field of ship radiated noise identification.
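
The fusion step described above rests on Dempster's rule of combination from Dempster–Shafer evidence theory. The following is a minimal sketch of that generic rule only. It does not reproduce the paper's BPA model or its multi-stage fusion, and the function name `dempster_combine`, the class labels, and the mass values are all hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two basic probability assignments with Dempster's rule.

    Each assignment maps a frozenset of hypotheses to its mass, and the
    masses of one assignment are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence: mass goes to the common hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Contradictory evidence: accumulate the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalise by 1 - K so the fused masses sum to 1 again.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical outputs of two classifiers over two ship classes.
A = frozenset({"class_a"})
B = frozenset({"class_b"})
EITHER = A | B  # mass left on "either class" expresses uncertainty

first_bpa = {A: 0.6, B: 0.1, EITHER: 0.3}
second_bpa = {A: 0.5, B: 0.2, EITHER: 0.3}

fused = dempster_combine(first_bpa, second_bpa)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

In this example both sources lean towards `class_a`, so the fused mass on `class_a` is higher than either source assigned on its own, which is the effect decision fusion relies on when individual classifiers are weakened by noise.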

39

Yilmaz, Tuba. "Multiclass Classification of Hepatic Anomalies with Dielectric Properties: From Phantom Materials to Rat Hepatic Tissues." Sensors 20, no. 2 (January 18, 2020): 530. http://dx.doi.org/10.3390/s20020530.

Abstract:

Open-ended coaxial probes can be used as tissue characterization devices. However, the technique suffers from a high error rate. To improve this technology, the measurement error, which is reported to be more than 30% in an in vivo measurement setting, needs to be decreased. This work investigates the ability of machine learning (ML) algorithms to decrease the measurement error of open-ended coaxial probe techniques to enable tissue characterization devices. To explore the potential of this technique as a tissue characterization device, the performance of multiclass ML algorithms on collected in vivo rat hepatic tissue and phantom dielectric property data was evaluated. Phantoms were used to investigate the potential for enlarging the data set, given the difficulty of in vivo data collection from tissues. The dielectric property measurements were collected from 16 rats with hepatic anomalies, 8 rats with healthy hepatic tissues, and in-house phantoms. Three ML algorithms, k-nearest neighbors (kNN), logistic regression (LR), and random forests (RF), were used to classify the collected data. The best performance for the classification of hepatic tissues was obtained with 76% accuracy using the LR algorithm. The LR algorithm performed classification with over 98% accuracy within the phantom data, and the model generalized to in vivo dielectric property data with 48% accuracy. These findings indicate, first, that linear models, such as logistic regression, perform better on dielectric property data sets. Second, ML models fitted to data collected from phantom materials can only partly generalize to in vivo dielectric property data, due to the discrepancy in dielectric property variability.

40

Párizs, Richárd Dominik, Dániel Török, Tatyana Ageyeva, and József Gábor Kovács. "Machine Learning in Injection Molding: An Industry 4.0 Method of Quality Prediction." Sensors 22, no. 7 (April 1, 2022): 2704. http://dx.doi.org/10.3390/s22072704.

Abstract:

One of the essential requirements of injection molding is to ensure the stable quality of the parts produced. However, numerous processing conditions, which are often interrelated in quite a complex way, make this challenging. Machine learning (ML) algorithms can be the solution, as they work in multidimensional spaces by learning the structure of datasets. In this study, we used four ML algorithms (kNN, naïve Bayes, linear discriminant analysis, and decision tree) and compared their effectiveness in predicting the quality of multi-cavity injection molding. We used pressure-based quality indexes (features) as inputs for the classification algorithms. We proved that all the examined ML algorithms adequately predict quality in injection molding even with very little training data. We found that the decision tree algorithm was the most accurate one, with a computational time of only 8–10 s. The average performance of the decision tree algorithm exceeded 90%, even for very little training data. We also demonstrated that feature selection does not significantly affect the accuracy of the decision tree algorithm.

41

Inomov, B., and M. Tropmann-Frick. "Scientific Texts Classification by Speciality with Machine Learning Methods." Vestnik NSU. Series: Information Technologies 20, no. 2 (October 8, 2022): 27–36. http://dx.doi.org/10.25205/1818-7900-2022-20-2-27-36.

Abstract:

This article investigates the classification of scientific text materials using machine learning and deep learning methods. An experimental study was conducted on a text classification method that applies preprocessing and accounts for the specificity of scientific text materials, using ML algorithms to improve the accuracy and speed of text classification. Indexation and classification methods by speciality were analysed for a set of scientific text materials. The quality of the ML algorithms was evaluated and compared, and results were obtained for classifying dissertations with machine learning methods within the existing training set of scientific materials.

42

Dalsania, Naman, Devang Punatar, and Deep Kothari. "Credit Card Approval Prediction using Classification Algorithms." International Journal for Research in Applied Science and Engineering Technology 10, no. 11 (November 30, 2022): 507–14. http://dx.doi.org/10.22214/ijraset.2022.47369.

Abstract:

Credit risk management in banks basically revolves around determining the probability of default or the creditworthiness of a customer, and the cost if default happens. It is important to consider key factors and anticipate the likelihood of consumer default, given the circumstances. This is where machine learning models come into play. They allow banks and large financial institutions to predict whether their customers will default on their loans. This project uses Python to create machine learning models with the highest possible accuracy. First, we load the dataset and take a glimpse. The dataset is a combination of numerical and non-numerical features, with various ranges of values and some missing entries. We pre-process the dataset so that the selected ML model meets high expectations.

43

Fan, Ching-Lung. "Evaluation of Classification for Project Features with Machine Learning Algorithms." Symmetry 14, no. 2 (February 13, 2022): 372. http://dx.doi.org/10.3390/sym14020372.

Abstract:

Due to the asymmetry of project features, it is difficult for project managers to make a reliable prediction of the decision-making process. Big data research can establish more predictions through the results of accurate classification. Machine learning (ML) has been widely applied for big data analytics and processing, which includes model symmetry/asymmetry of various prediction problems. The purpose of this study is to achieve symmetry in the developed decision-making solution based on the optimal classification results. Defects are important metrics of construction management performance. Accordingly, using suitable algorithms to comprehend the characteristics of these defects and to train and test massive data on defects can enable effective classification of project features. This research used 499 defective classes and related features from the Public Works Bid Management System (PWBMS). In this article, ML algorithms, such as support vector machine (SVM), artificial neural network (ANN), decision tree (DT), and Bayesian network (BN), were employed to predict the relationship between three target variables (engineering level, project cost, and construction progress) and defects. To formulate and subsequently cross-validate an optimal classification model, 1015 projects were considered in this work. Assessment indicators showed that the accuracy of ANN for classifying the engineering level is 93.20%, and the accuracy values of SVM for classifying the project cost and construction progress are 85.32% and 79.01%, respectively. In general, the SVM yielded better classification results from these project features. This research was based on an ML algorithm evaluation system for buildings as a classification model for project features, with the goal of helping project managers comprehend defects.

44

Pawar, Sharad Laxman, and Tryambak Hiwarkar. "A Result Analysis of Supervised Machine Learning Approach to Detect Anomaly from Network Traffic." International Journal of Computer Science and Mobile Computing 11, no. 6 (June 30, 2022): 152–68. http://dx.doi.org/10.47760/ijcsmc.2022.v11i06.012.

Abstract:

Supervised Machine Learning (SML) is the search for algorithms that reason from externally supplied cases to develop general hypotheses, which subsequently make predictions about future instances. Supervised classification is one of the tasks most commonly carried out by intelligent systems. This article presents several supervised Machine Learning (ML) classification strategies, evaluates various supervised learning algorithms, and identifies the most effective classification algorithm depending on the data set, the number of instances, and the number of variables (features). Seven alternative machine learning methods were considered, among them Decision Table, Random Forest (RF), Naïve Bayes (NB), and Support Vector Machine (SVM), using the Waikato Environment for Knowledge Analysis (WEKA) machine learning program. The Diabetes data set was used for classification, with 786 cases, eight attributes as independent variables, and one as the dependent variable. The findings suggest that SVM was the method with the highest precision and accuracy. The Naïve Bayes and Random Forest classification algorithms were the next most accurate after SVM. The research shows that the time taken to build a model and precision (accuracy) are one factor, while the kappa statistic and Mean Absolute Error (MAE) are another. Therefore, ML techniques demand precision, accuracy, and minimal error for supervised predictive machine learning.
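
The kappa statistic named in this abstract can be illustrated with a small sketch. This is not the authors' code or WEKA's implementation; the function name `cohen_kappa` and the toy labels are assumptions made for illustration only. It computes Cohen's kappa, that is, observed agreement corrected for the agreement expected by chance.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Fraction of instances on which the two label lists agree.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Agreement expected if labels were assigned independently
    # with the same marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Degenerate case (a single class everywhere): kappa is undefined,
        # this sketch returns 0.0 by its own convention.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels, not taken from the Diabetes data set.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
kappa = cohen_kappa(y_true, y_pred)
```

A kappa near 0 means the classifier agrees with the reference labels no more often than chance would, which is why it is often reported alongside plain accuracy.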

45

Zahedi, Leila, Farid Ghareh Mohammadi, and Mohammad Hadi Amini. "A2BCF: An Automated ABC-Based Feature Selection Algorithm for Classification Models in an Education Application." Applied Sciences 12, no. 7 (March 31, 2022): 3553. http://dx.doi.org/10.3390/app12073553.

Abstract:

Feature selection is an essential preprocessing step in Machine Learning (ML) that can significantly impact the performance of ML models. It is considered one of the most crucial phases of automated ML (AutoML). Feature selection aims to find the optimal subset of features and remove the noninformative features from the dataset. Feature selection also reduces the computational time and makes the data more understandable to the learning model. There are various heuristic search strategies to address combinatorial optimization challenges. This paper develops an Automated Artificial Bee Colony-based algorithm for Feature Selection (A2BCF) to solve a classification problem. The application domain used to evaluate the proposed algorithm is education science, where it solves a binary classification problem, namely, undergraduate student success. The modifications made to the original Artificial Bee Colony algorithm make it a well-performing approach.

46

Chilyabanyama, Obvious Nchimunya, Roma Chilengi, Michelo Simuyandi, Caroline C. Chisenga, Masuzyo Chirwa, Kalongo Hamusonde, Rakesh Kumar Saroj, Najeeha Talat Iqbal, Innocent Ngaruye, and Samuel Bosomprah. "Performance of Machine Learning Classifiers in Classifying Stunting among Under-Five Children in Zambia." Children 9, no. 7 (July 20, 2022): 1082. http://dx.doi.org/10.3390/children9071082.

Abstract:

Stunting is a global public health issue. We sought to train and evaluate machine learning (ML) classification algorithms on the Zambia Demographic Health Survey (ZDHS) dataset to predict stunting among children under the age of five in Zambia. We applied Logistic regression (LR), Random Forest (RF), SV classification (SVC), XG Boost (XgB) and Naïve Bayes (NB) algorithms to predict the probability of stunting among children under five years of age, on the 2018 ZDHS dataset. We calibrated predicted probabilities and plotted the calibration curves to compare model performance. We computed accuracy, recall, precision and F1 for each machine learning algorithm. About 2327 (34.2%) children were stunted. Thirteen of fifty-eight features were selected for inclusion in the model using random forest. Calibrating the predicted probabilities improved the performance of machine learning algorithms when evaluated using calibration curves. RF was the most accurate algorithm, with an accuracy score of 79% in the testing and 61.6% in the training data, while Naïve Bayes was the worst performing algorithm for predicting stunting among children under five in Zambia using the 2018 ZDHS dataset. ML models aid quick diagnosis of stunting and the timely development of interventions aimed at preventing stunting.
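
The four evaluation metrics this abstract reports (accuracy, recall, precision, F1) can be sketched for a binary task as follows. This is a minimal illustration under stated assumptions, not the study's code: the function name `classification_metrics` and the toy labels are hypothetical and do not come from the ZDHS data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = stunted, 0 = not stunted.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With roughly a third of cases in the positive class, as in the abstract, recall and precision say more about how well the stunted children are identified than accuracy alone does.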

47

D’Addio, Giovanni, Leandro Donisi, Giuseppe Cesarelli, Federica Amitrano, Armando Coccia, Maria Teresa La Rovere, and Carlo Ricciardi. "Extracting Features from Poincaré Plots to Distinguish Congestive Heart Failure Patients According to NYHA Classes." Bioengineering 8, no. 10 (October 3, 2021): 138. http://dx.doi.org/10.3390/bioengineering8100138.

Abstract:

Heart-rate variability has proved a valid tool in prognosis definition of patients with congestive heart failure (CHF). Previous research has documented Poincaré plot analysis as a valuable approach to study heart-rate variability performance among different subjects. In this paper, we explored the possibility of feeding machine-learning (ML) algorithms with unconventional quantitative parameters extracted from Poincaré plots (generated from 24-h electrocardiogram recordings) to classify patients with CHF belonging to different New York Heart Association (NYHA) classes. We performed the following investigations in sequence. First, a statistical analysis was carried out on 9 morphological parameters, automatically measured from Poincaré plots. Subsequently, a feature selection through a wrapper with a 10-fold cross-validation method was performed to find the best subset of features which maximized the classification accuracy for each considered ML algorithm. Finally, patient classification was assessed through an ML analysis using AdaBoost of Decision Tree, k-Nearest Neighbors and Naive Bayes algorithms. A univariate statistical analysis proved that 5 out of 9 parameters presented statistically significant differences among patients of distinct NYHA classes; similarly, a multivariate logistic regression confirmed the importance of the parameter ρy in the separability between low-risk and high-risk classes. The ML analysis achieved promising results in terms of evaluation metrics (especially the Naive Bayes algorithm), with accuracies greater than 80% and Area Under the Receiver Operating Curve indices greater than 0.7 for all three algorithms. The study indicates the proposed features have a predictive power to discriminate the NYHA classes, to which the features seem evenly correlated. Despite the NYHA classification being subjective and easily recognized by cardiologists, the potential relevance in clinical cardiology of the proposed features and the promising ML results imply the methodology could be a valuable approach to automatically classify CHF. Future investigations on enriched datasets may further confirm the presented evidence.

48

Ustuner, M., F. B. Sanli, and S. Abdikan. "BALANCED VS IMBALANCED TRAINING DATA: CLASSIFYING RAPIDEYE DATA WITH SUPPORT VECTOR MACHINES." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B7 (June 21, 2016): 379–84. http://dx.doi.org/10.5194/isprs-archives-xli-b7-379-2016.

Abstract:

The accuracy of supervised image classification is highly dependent upon several factors such as the design of the training set (sample selection, composition, purity and size), the resolution of the input imagery and landscape heterogeneity. The design of the training set is still a challenging issue, since the sensitivity of each classifier algorithm at the learning stage is different for the same dataset. In this paper, the classification of RapidEye imagery with balanced and imbalanced training data for mapping crop types was addressed. Classification with imbalanced training data may result in low accuracy in some scenarios. Support Vector Machines (SVM), Maximum Likelihood (ML) and Artificial Neural Network (ANN) classifications were implemented here to classify the data. To evaluate the influence of balanced and imbalanced training data on image classification algorithms, three different training datasets were created. Two different balanced datasets, with 70 and 100 pixels for each class of interest, and one imbalanced dataset, in which each class has a different number of pixels, were used in the classification stage. Results demonstrate that ML and NN classifications are affected by imbalanced training data, resulting in a reduction in accuracy (from 90.94% to 85.94% for ML and from 91.56% to 88.44% for NN), while SVM is not affected significantly and slightly improved (from 94.38% to 94.69%). Our results highlight that SVM is a very robust, consistent and effective classifier, as it can perform very well with both balanced and imbalanced training data. Furthermore, the training stage should be precisely and carefully designed for the needs of the adopted classifier.
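
The balanced training sets described in this abstract (a fixed number of pixels per class) can be sketched as a per-class draw without replacement. This is an assumption-laden illustration, not the authors' procedure: the function name `balanced_sample`, the class names and the class sizes are hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(labels, per_class, seed=0):
    """Pick `per_class` indices from every class, drawn without replacement."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        if len(pool) < per_class:
            raise ValueError(f"class {label!r} has only {len(pool)} samples")
        chosen.extend(rng.sample(pool, per_class))
    return sorted(chosen)


# Hypothetical pixel labels with unequal class sizes (imbalanced pool).
labels = ["wheat"] * 300 + ["maize"] * 120 + ["fallow"] * 80
train_idx = balanced_sample(labels, per_class=70)
counts = {c: sum(1 for i in train_idx if labels[i] == c) for c in set(labels)}
```

Training once on `train_idx` and once on the full imbalanced pool, then comparing accuracy, mirrors the kind of comparison the abstract reports.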

49

Ustuner, M., F. B. Sanli, and S. Abdikan. "BALANCED VS IMBALANCED TRAINING DATA: CLASSIFYING RAPIDEYE DATA WITH SUPPORT VECTOR MACHINES." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B7 (June 21, 2016): 379–84. http://dx.doi.org/10.5194/isprsarchives-xli-b7-379-2016.

50

Tabares-Soto, Reinel, Simon Orozco-Arias, Victor Romero-Cano, Vanesa Segovia Bucheli, José Luis Rodríguez-Sotelo, and Cristian Felipe Jiménez-Varón. "A comparative study of machine learning and deep learning algorithms to classify cancer types based on microarray gene expression data." PeerJ Computer Science 6 (April 13, 2020): e270. http://dx.doi.org/10.7717/peerj-cs.270.

Abstract:

Cancer classification is a topic of major interest in medicine since it allows accurate and efficient diagnosis and facilitates a successful outcome in medical treatments. Previous studies have classified human tumors using large-scale RNA profiling and supervised Machine Learning (ML) algorithms to construct a molecular-based classification of carcinoma cells from breast, bladder, adenocarcinoma, colorectal, gastro esophagus, kidney, liver, lung, ovarian, pancreas, and prostate tumors. These datasets are collectively known as the 11_tumor database. Although this database has been used in several works in the ML field, no comparative studies of different algorithms can be found in the literature. On the other hand, advances in both hardware and software technologies have fostered considerable improvements in the precision of solutions that use ML, such as Deep Learning (DL). In this study, we compare the most widely used algorithms in classical ML and DL to classify the tumors described in the 11_tumor database. We obtained tumor identification accuracies between 90.6% (Logistic Regression) and 94.43% (Convolutional Neural Networks) using k-fold cross-validation. Also, we show how a tuning process may or may not significantly improve algorithms’ accuracies. Our results demonstrate an efficient and accurate classification method based on gene expression (microarray data) and ML/DL algorithms, which facilitates tumor type prediction in a multi-cancer-type scenario.
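
The k-fold cross-validation mentioned in this abstract can be sketched as an index splitter. This is a generic illustration, not the study's pipeline: the function name `k_fold_indices` and the sample count are assumptions, and the abstract does not state the value of k that was used.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical example: 10 samples split into 5 folds.
splits = list(k_fold_indices(n_samples=10, k=5))
```

Each sample is used for testing exactly once, so averaging the per-fold accuracies gives a single estimate that can be compared across algorithms.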
