Journal articles on the topic 'Training and Testing Dataset'

1

Lo, Jui-En, Eugene Yu-Chuan Kang, Yun-Nung Chen, et al. "Data Homogeneity Effect in Deep Learning-Based Prediction of Type 1 Diabetic Retinopathy." Journal of Diabetes Research 2021 (December 28, 2021): 1–9. http://dx.doi.org/10.1155/2021/2751695.

Abstract:
This study is aimed at evaluating a deep transfer learning-based model for identifying diabetic retinopathy (DR) that was trained using a dataset with high variability and predominant type 2 diabetes (T2D) and comparing model performance with that in patients with type 1 diabetes (T1D). The Kaggle dataset, which is a publicly available dataset, was divided into training and testing Kaggle datasets. In the comparison dataset, we collected retinal fundus images of T1D patients at Chang Gung Memorial Hospital in Taiwan from 2013 to 2020, and the images were divided into training and testing T1D d
2

Oyegoke, Temitayo O., Kehinde K. Akomolede, Adesola G. Aderounmu, and Emmanuel R. Adagunodo. "A Multi-Layer Perceptron Model for Classification of E-mail Fraud." European Journal of Information Technologies and Computer Science 1, no. 5 (2021): 16–22. http://dx.doi.org/10.24018/compute.2021.1.5.24.

Abstract:
This study developed an e-mail classification model to preempt fraudulent activities. E-mail is so widely used that it is well suited to adoption by cyber-fraudsters. This research combined two databases, the CLAIR fraudulent and Spambase datasets, to create the training and testing dataset. The CLAIR dataset consists of raw e-mails from users’ inboxes, which were pre-processed into structured form using Natural Language Processing (NLP) techniques. This dataset was then consolidated with the Spambase dataset as a single dataset. The study deployed the Multi-Layer
3

An, Chansik, Yae Won Park, Sung Soo Ahn, Kyunghwa Han, Hwiyoung Kim, and Seung-Koo Lee. "Radiomics machine learning study with a small sample size: Single random training-test set split may lead to unreliable results." PLOS ONE 16, no. 8 (2021): e0256152. http://dx.doi.org/10.1371/journal.pone.0256152.

Abstract:
This study aims to determine how randomly splitting a dataset into training and test sets affects the estimated performance of a machine learning model and its gap from the test performance under different conditions, using real-world brain tumor radiomics data. We conducted two classification tasks of different difficulty levels with magnetic resonance imaging (MRI) radiomics features: (1) “Simple” task, glioblastomas [n = 109] vs. brain metastasis [n = 58] and (2) “difficult” task, low- [n = 163] vs. high-grade [n = 95] meningiomas. Additionally, two undersampled datasets were created by ran
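Note: the effect described in this abstract can be illustrated with a minimal sketch. It is not the authors' code. It repeats a single random train/test split on a small synthetic two-class dataset and reports how far the test accuracy moves with the random seed. The data generator, the nearest-centroid rule, and the names `make_data` and `split_accuracy` are assumptions made for this example only.

```python
import random
import statistics


def make_data(n_per_class, rng):
    """Synthetic 1-D two-class data: class 0 around 0.0, class 1 around 1.0."""
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(1.0, 1.0), 1) for _ in range(n_per_class)]
    return data


def split_accuracy(data, test_fraction, seed):
    """Shuffle with `seed`, split once, and score a nearest-centroid rule."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    # Fit on the training part only: one centroid per class.
    centroids = {
        label: statistics.fmean(x for x, y in train if y == label)
        for label in (0, 1)
    }
    # Predict the class whose centroid is closest to each test point.
    correct = sum(
        1 for x, y in test
        if min(centroids, key=lambda c: abs(x - centroids[c])) == y
    )
    return correct / len(test)


data = make_data(n_per_class=30, rng=random.Random(42))
accuracies = [split_accuracy(data, test_fraction=0.3, seed=s) for s in range(20)]
print("min accuracy:", min(accuracies))
print("max accuracy:", max(accuracies))
print("std deviation:", statistics.pstdev(accuracies))
```

With only 60 samples, the gap between the minimum and maximum accuracy shows how much a reported figure can depend on one particular split, which is why repeated splits or cross-validation are commonly preferred for small datasets.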
4

Mabuni, D., and S. Aquter Babu. "High Accurate and a Variant of k-fold Cross Validation Technique for Predicting the Decision Tree Classifier Accuracy." International Journal of Innovative Technology and Exploring Engineering 10, no. 2 (2021): 105–10. http://dx.doi.org/10.35940/ijitee.c8403.0110321.

Abstract:
In machine learning, data usage is a more important criterion than the logic of the program. With very large and moderately sized datasets it is possible to obtain robust and high classification accuracies, but not with small and very small datasets. In particular, only large training datasets have the potential to produce robust decision tree classification results. The classification results obtained by using only one training and one testing dataset pair are not reliable. The cross-validation technique uses many random folds of the same dataset for training and validation. In order to o
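Note: for orientation, standard shuffled k-fold cross-validation can be sketched as follows. This shows the textbook procedure only, not the variant proposed in the paper, whose details are cut off above. The function name `k_fold_indices` and its parameters are assumptions made for this example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for standard shuffled k-fold CV."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

Every sample lands in exactly one test fold, so averaging the k per-fold accuracies uses all of the data for evaluation instead of relying on a single training/testing pair.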
5

D., Mabuni, and Aquter Babu S. "High Accurate and a Variant of k-fold Cross Validation Technique for Predicting the Decision Tree Classifier Accuracy." International Journal of Innovative Technology and Exploring Engineering (IJITEE) 10, no. 3 (2021): 105–10. https://doi.org/10.35940/ijitee.C8403.0110321.

Abstract:
In machine learning, data usage is a more important criterion than the logic of the program. With very large and moderately sized datasets it is possible to obtain robust and high classification accuracies, but not with small and very small datasets. In particular, only large training datasets have the potential to produce robust decision tree classification results. The classification results obtained by using only one training and one testing dataset pair are not reliable. The cross-validation technique uses many random folds of the same dataset for training and validation. In order to o
6

Lee, Yongju, Sungjun Jang, Han Byeol Bae, Taejae Jeon, and Sangyoun Lee. "Multitask Learning Strategy with Pseudo-Labeling: Face Recognition, Facial Landmark Detection, and Head Pose Estimation." Sensors 24, no. 10 (2024): 3212. http://dx.doi.org/10.3390/s24103212.

Abstract:
Most facial analysis methods perform well in standardized testing but not in real-world testing. The main reason is that training models cannot easily learn various human features and background noise, especially for facial landmark detection and head pose estimation tasks with limited and noisy training datasets. To alleviate the gap between standardized and real-world testing, we propose a pseudo-labeling technique using a face recognition dataset consisting of various people and background noise. The use of our pseudo-labeled training dataset can help to overcome the lack of diversity among
7

Apeināns, Ilmars. "OPTIMAL SIZE OF AGRICULTURAL DATASET FOR YOLOV8 TRAINING." ENVIRONMENT. TECHNOLOGIES. RESOURCES. Proceedings of the International Scientific and Practical Conference 2 (June 22, 2024): 38–42. http://dx.doi.org/10.17770/etr2024vol2.8041.

Abstract:
The smart farming solutions are mainly based on the application of convolutional neural networks for object detection tasks. The number of open datasets is restricted in the agricultural domain. Therefore, it is required to find the answer to the question: how big a dataset must be collected to train a convolutional neural network for object detection tasks? To solve this task, the YOLOv8 framework was selected for the experiment. Three datasets were prepared: MinneApples, PFruitlets640 and mosaic dataset using both previously named datasets. 100 images were selected for testing. Other images
8

Murugesan, S., R. S. Bhuvaneswaran, H. Khanna Nehemiah, S. Keerthana Sankari, and Y. Nancy Jane. "Feature Selection and Classification of Clinical Datasets Using Bioinspired Algorithms and Super Learner." Computational and Mathematical Methods in Medicine 2021 (May 17, 2021): 1–18. http://dx.doi.org/10.1155/2021/6662420.

Abstract:
A computer-aided diagnosis (CAD) system that employs a super learner to diagnose the presence or absence of a disease has been developed. Each clinical dataset is preprocessed and split into training set (60%) and testing set (40%). A wrapper approach that uses three bioinspired algorithms, namely, cat swarm optimization (CSO), krill herd (KH), and bacterial foraging optimization (BFO) with the classification accuracy of support vector machine (SVM) as the fitness function has been used for feature selection. The selected features of each bioinspired algorithm are stored in three separate data
9

Chua, Tuan-Hong, and Iftekhar Salam. "Evaluation of Machine Learning Algorithms in Network-Based Intrusion Detection Using Progressive Dataset." Symmetry 15, no. 6 (2023): 1251. http://dx.doi.org/10.3390/sym15061251.

Abstract:
Cybersecurity has become one of the focuses of organisations. The number of cyberattacks keeps increasing as Internet usage continues to grow. As new types of cyberattacks continue to emerge, researchers focus on developing machine learning (ML)-based intrusion detection systems (IDS) to detect zero-day attacks. They usually remove some or all attack samples from the training dataset and only include them in the testing dataset when evaluating the performance. This method may detect unknown attacks; however, it does not reflect the long-term performance of the IDS as it only shows the changes
10

Sheshkus, A., A. Chirvonaya, and V. L. Arlazarov. "Tiny CNN for feature point description for document analysis: approach and dataset." Computer Optics 46, no. 3 (2022): 429–35. http://dx.doi.org/10.18287/2412-6179-co-1016.

Abstract:
In this paper, we study the problem of feature points description in the context of document analysis and template matching. Our study shows that specific training data is required for the task especially if we are to train a lightweight neural network that will be usable on devices with limited computational resources. In this paper, we construct and provide a dataset of photo and synthetically generated images and a method of training patches generation from it. We prove the effectiveness of this data by training a lightweight neural network and show how it performs in both general and docum
11

Lin, Zhe, and Wenxuan Guo. "Cotton Stand Counting from Unmanned Aerial System Imagery Using MobileNet and CenterNet Deep Learning Models." Remote Sensing 13, no. 14 (2021): 2822. http://dx.doi.org/10.3390/rs13142822.

Abstract:
An accurate stand count is a prerequisite to determining the emergence rate, assessing seedling vigor, and facilitating site-specific management for optimal crop production. Traditional manual counting methods in stand assessment are labor intensive and time consuming for large-scale breeding programs or production field operations. This study aimed to apply two deep learning models, the MobileNet and CenterNet, to detect and count cotton plants at the seedling stage with unmanned aerial system (UAS) images. These models were trained with two datasets containing 400 and 900 images with variati
12

Han, Ce, Hao Zheng, Fang He, and Tianmin Zhang. "A method for detecting anomalies in die forging presses using the pearson correlation coefficient." Journal of Physics: Conference Series 3009, no. 1 (2025): 012072. https://doi.org/10.1088/1742-6596/3009/1/012072.

Abstract:
This article explores the correlations among variables during the operation of die forging presses and analyzes the importance of these correlations for anomaly detection in the presses. By dividing the collected data from the die forging presses into training and testing datasets, a correlation matrix analysis is performed using the training dataset to identify groups of variables with strong associations. Subsequently, these selected variable groups are applied to the testing dataset for validation. The results demonstrate that this method can effectively identify specific anomalies
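Note: the first step described here, finding strongly associated variable groups on the training data, can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation. The variable names (`pressure`, `force`, `temperature`), the sample values, and the 0.9 threshold are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation; treat as 0
    return cov / math.sqrt(var_x * var_y)


def strongly_correlated_pairs(columns, threshold):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical training-set columns; the values are made up for illustration.
train_columns = {
    "pressure": [1.0, 2.0, 3.0, 4.0, 5.0],
    "force": [2.1, 3.9, 6.2, 7.8, 10.1],
    "temperature": [5.0, 3.0, 6.0, 2.0, 4.0],
}
pairs = strongly_correlated_pairs(train_columns, threshold=0.9)
for a, b, r in pairs:
    print(a, b, round(r, 3))
```

Pairs selected this way on the training data could then be monitored on the testing data, where a pair whose correlation breaks down would be a candidate anomaly signal.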
13

Yu, Fanqianhui, Tao Lu, and Changhu Xue. "Deep Learning-Based Intelligent Apple Variety Classification System and Model Interpretability Analysis." Foods 12, no. 4 (2023): 885. http://dx.doi.org/10.3390/foods12040885.

Abstract:
In this study, series networks (AlexNet and VGG-19) and directed acyclic graph (DAG) networks (ResNet-18, ResNet-50, and ResNet-101) with transfer learning were employed to identify and classify 13 classes of apples from 7439 images. Two training datasets, model evaluation metrics, and three visualization methods were used to objectively assess, compare, and interpret five Convolutional Neural Network (CNN)-based models. The results show that the dataset configuration had a significant impact on the classification results, as all models achieved over 96.1% accuracy on dataset A (training-to-te
14

Arief, Muhammad, Made Gunawan, Agung Septiadi, et al. "A novel framework for analyzing internet of things datasets for machine learning and deep learning-based intrusion detection systems." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 2 (2024): 1574. http://dx.doi.org/10.11591/ijai.v13.i2.pp1574-1584.

Abstract:
To generate a machine learning (ML) and deep learning (DL) architecture with good performance, we need a decent dataset for the training and testing phases of the development process. Starting with the knowledge discovery and data mining (KDD) Cup 99 dataset, numerous datasets have been produced since 1998 to be utilized in the ML and DL-based intrusion detection systems (IDS) training and testing process. Because there are so many datasets accessible, it might be challenging for researchers to choose which dataset to employ. Therefore, a framework for evaluating dataset appropriateness with t
15

Muhammad, Arief, Gunawan Made, Septiadi Agung, et al. "A novel framework for analyzing internet of things datasets for machine learning and deep learning-based intrusion detection systems." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 2 (2024): 1574–84. https://doi.org/10.11591/ijai.v13.i2.pp1574-1584.

Abstract:
To generate a machine learning (ML) and deep learning (DL) architecture with good performance, we need a decent dataset for the training and testing phases of the development process. Starting with the knowledge discovery and data mining (KDD) Cup 99 dataset, numerous datasets have been produced since 1998 to be utilized in the ML and DL-based intrusion detection systems (IDS) training and testing process. Because there are so many datasets accessible, it might be challenging for researchers to choose which dataset to employ. Therefore, a framework for evaluating dataset appropriateness with t
16

Liu, Hua. "Realization of Text Categorization for Small-Scaled Dataset." Advanced Materials Research 532-533 (June 2012): 1239–42. http://dx.doi.org/10.4028/www.scientific.net/amr.532-533.1239.

Abstract:
Text categorization and comparison testing are carried out on a small-scale dataset. When no training set is available, the indexed text keywords are used, without training, to categorize against the experts’ subject terms, giving a large-category accuracy of 0.82. When only a small training set is available, the characteristic vectors acquired from training are added to the experts’ subject terms before categorization, giving a large-category accuracy of 0.94 and a level-3 accuracy of 0.73, so the results are satisfactory.
17

Zorman, Milan, Sandi Pohorec, Bojan Butolen, Bojan Žlahtič, and Peter Kokol. "Cross–testing Symbolic and Connectionist Machine Learning Approaches in Specialized Acute Appendicitis Databases." Acta Medico-Biotechnica 5, no. 2 (2021): 23–32. http://dx.doi.org/10.18690/actabiomed.72.

Abstract:
Purpose: In the world of learning with medical supervised machine approaches, we often face a lack of dataset objects suitable for training a classifier. Two of the most common reasons are the lack of funds to perform all of the required tests and dataset gathering, or simply the condition is too rare to collect a suitable number of cases. In this paper we present the results of a very rare opportunity to test and train classifiers on three acute appendicitis datasets with almost identical structures, but from different sources and of different sizes.
Methods: We performed a parallel var
18

Wu, Yike, Shiwan Zhao, Ying Zhang, Xiaojie Yuan, and Zhong Su. "When Pairs Meet Triplets: Improving Low-Resource Captioning via Multi-Objective Optimization." ACM Transactions on Multimedia Computing, Communications, and Applications 18, no. 3 (2022): 1–20. http://dx.doi.org/10.1145/3492325.

Abstract:
Image captioning for low-resource languages has attracted much attention recently. Researchers propose to augment the low-resource caption dataset into (image, rich-resource language, and low-resource language) triplets and develop the dual attention mechanism to exploit the existence of triplets in training to improve the performance. However, datasets in triplet form are usually small due to their high collecting cost. On the other hand, there are already many large-scale datasets, which contain one pair from the triplet, such as caption datasets in the rich-resource language and translation
19

Akinpelu, Adeola Akeem, Mazen K. Nazal, Md Shafiullah, et al. "A Multivariate Machine Learning Model of Adsorptive Lindane Removal from Contaminated Water." Applied Sciences 13, no. 12 (2023): 7086. http://dx.doi.org/10.3390/app13127086.

Abstract:
It is challenging to use conventional one-variable-at-time (OVAT) batch experiments to evaluate multivariate/inter-parametric interactions between physico-chemical variables that contribute to the adsorptive removal of contaminants. Thus, chemometric prediction approaches for multivariate calibration and analysis reveal the impact of multi-parametric variation on the process of concern. Hence, we aim to develop an artificial neural network (ANN), and stepwise regression (SR) models for multivariate calibration and analysis utilizing OVAT data prepared through experimentation. After comparing t
20

How, Chun Kit, Ismail Mohd Khairuddin, Mohd Azraai Mohd Razman, Anwar P. P. Abdul Majeed, and Wan Hasbullah Mohd Isa. "Development of Audio-Visual Speech Recognition using Deep-Learning Technique." MEKATRONIKA 4, no. 1 (2022): 88–95. http://dx.doi.org/10.15282/mekatronika.v4i1.8625.

Abstract:
Deep learning is an artificial intelligence (AI) technique that simulates humans’ learning behavior. Audio-visual speech recognition is important for the listener to truly understand the emotions behind the spoken words. In this thesis, two different deep learning models, Convolutional Neural Network (CNN) and Deep Neural Network (DNN), were developed to recognize the speech’s emotion from the dataset. The PyTorch framework with the torchaudio library was used. Both models were given the same training, validation, testing, and augmented datasets. The training will be stopped when the training loop reac
21

Qusay Alshebly, Omar, and Suhail Najm Abdullah. "The Fuzziness Models with The Proposed New Conjugate Gradient Method for The Classification of High-Dimensional Data in Bioinformatics." Journal of Economics and Administrative Sciences 30, no. 142 (2024): 425–48. http://dx.doi.org/10.33095/ahnw8r72.

Abstract:
The development of the subject of bioinformatics may be attributed to the exponential growth of biological data, namely the huge amount of high-dimensional gene expression data. The discipline of bioinformatics efficiently tackles issues in molecular biology through the use of optimization, computer science, and statistical methods. The present study introduces a new optimization strategy, namely the proposed conjugate gradient method (PNCG), for the purpose of learning a fuzzy neural network model using the Takagi-Sugeno approach. This study presented a novel algorithm that addresses the issu
22

Upadhyay, Jitendrakumar B. "BUILT A DATASET OF GUJARATI ISOLATED HANDWRITTEN CHARACTERS AND RECOGNITION THROUGH DEEP LEARNING." International Journal of Advanced Research in Computer Science 16, no. 1 (2025): 42–47. https://doi.org/10.26483/ijarcs.v16i1.7182.

Abstract:
In the current era with the rise of new machine learning algorithms, particularly deep learning, the demand for large, high-quality datasets has grown significantly, especially in handwritten character recognition (HCR). While several Indian languages have publicly available benchmark datasets, a few, including Gujarati, still lack such resources. This paper addresses an attempt to build a dataset for Gujarati isolated handwritten characters and to recognize the isolated Gujarati handwritten vowels and consonants. The dataset is collected from 692 writers of varying ages, genders, qualificatio
23

Talaat, Mohamed, Xiuhua Si, and Jinxiang Xi. "Multi-Level Training and Testing of CNN Models in Diagnosing Multi-Center COVID-19 and Pneumonia X-ray Images." Applied Sciences 13, no. 18 (2023): 10270. http://dx.doi.org/10.3390/app131810270.

Abstract:
This study aimed to address three questions in AI-assisted COVID-19 diagnostic systems: (1) How does a CNN model trained on one dataset perform on test datasets from disparate medical centers? (2) What accuracy gains can be achieved by enriching the training dataset with new images? (3) How can learned features elucidate classification results, and how do they vary among different models? To achieve these aims, four CNN models—AlexNet, ResNet-50, MobileNet, and VGG-19—were trained in five rounds by incrementally adding new images to a baseline training set comprising 11,538 chest X-ray images.
24

Guha, Ritam, Manosij Ghosh, Pawan Kumar Singh, Ram Sarkar, and Mita Nasipuri. "M-HMOGA: A New Multi-Objective Feature Selection Algorithm for Handwritten Numeral Classification." Journal of Intelligent Systems 29, no. 1 (2019): 1453–67. http://dx.doi.org/10.1515/jisys-2019-0064.

Abstract:
The feature selection process is very important in the field of pattern recognition, which selects the informative features so as to reduce the curse of dimensionality, thus improving the overall classification accuracy. In this paper, a new feature selection approach named Memory-Based Histogram-Oriented Multi-objective Genetic Algorithm (M-HMOGA) is introduced to identify the informative feature subset to be used for a pattern classification problem. The proposed M-HMOGA approach is applied to two recently used feature sets, namely Mojette transform and Regional Weighted Run Length
25

Hanif, Iqbal, and Regita Fachri Septiani. "Ensemble Learning For Television Program Rating Prediction." Indonesian Journal of Statistics and Its Applications 5, no. 2 (2021): 377–95. http://dx.doi.org/10.29244/ijsa.v5i2p377-395.

Abstract:
Rating is one of the most frequently used metrics in the television industry to evaluate television programs or channels. This research is an attempt to develop a prediction model of television program ratings using rating data gathered from UseeTV (internet-based television service from Telkom Indonesia). The machine learning methods (Random Forest and Extreme Gradient Boosting) were tried out utilizing a set of rating data from 20 television programs collected from January 2018 to August 2019 (train dataset) and evaluated using September 2019 rating data (test dataset). Research results show
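Note: the time-based split described in this abstract (train on January 2018 to August 2019, test on September 2019) can be sketched as follows. This is a minimal illustration, not the authors' pipeline. The record layout, the field names `day` and `rating`, and the rating values are hypothetical.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Records dated before `cutoff` go to training; the rest go to testing."""
    train = [r for r in records if r["day"] < cutoff]
    test = [r for r in records if r["day"] >= cutoff]
    return train, test


# Hypothetical rating records; the values are made up for illustration.
records = [
    {"day": date(2018, 1, 15), "rating": 1.2},
    {"day": date(2019, 8, 31), "rating": 0.9},
    {"day": date(2019, 9, 1), "rating": 1.1},
    {"day": date(2019, 9, 30), "rating": 1.4},
]
train, test = temporal_split(records, cutoff=date(2019, 9, 1))
print(len(train), len(test))  # prints: 2 2
```

Unlike a random split, splitting at a date keeps every test record later than every training record, so the evaluation resembles forecasting future ratings from past ones.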
26

Ali, Maria, Fatima Pervez, Muhammad Nouman Atta, Abdullah Khan, and Asfandyar Khan. "Sine Cosine Algorithm for Enhancing Convergence Rates of Artificial Neural Network: A Comparative Study." Journal of Engineering Technology and Applied Physics 6, no. 2 (2024): 32–37. http://dx.doi.org/10.33093/jetap.2024.6.2.5.

Abstract:
Artificial neural networks (ANNs) are widely adopted by researchers for classification tasks due to their simplicity and superior performance. This study examines the ANN and its variants, such as the Elman Neural Network (NN) model, and their strengths, although they face issues like local minima and slow convergence. This study presents a comprehensive evaluation of four distinct algorithms for classification tasks, focusing on their performance on both training and testing datasets. These algorithms such as Sine Cosine Algorithm is integrated with Artificial Neural Networks (SCA_ANN), Back
27

Chang, Hong-Chan, Yi-Che Wang, Yu-Yang Shih, and Cheng-Chien Kuo. "Fault Diagnosis of Induction Motors with Imbalanced Data Using Deep Convolutional Generative Adversarial Network." Applied Sciences 12, no. 8 (2022): 4080. http://dx.doi.org/10.3390/app12084080.

Abstract:
A homemade defective model of an induction motor was created by the laboratory team to acquire the vibration acceleration signals of five operating states of an induction motor under different loads. Two major learning models, namely a deep convolutional generative adversarial network (DCGAN) and a convolutional neural network, were applied for fault diagnosis of the induction motor to the problem of an imbalanced training dataset. Two datasets were studied and analyzed: a sufficient and balanced training dataset and insufficient and imbalanced training data. When the training datasets were ad
28

Ma, Zhengchi, Ruoyu Ouyang, and Hanzhang Wang. "The Study of Performance for Cross-Platform Spam Filtering Based on the Random Forest Algorithm." Highlights in Science, Engineering and Technology 57 (July 11, 2023): 32–36. http://dx.doi.org/10.54097/hset.v57i.9893.

Abstract:
The objective of this study was to investigate the performance of the Random Forest algorithm in spam detection when generalized from email spam to social media comment spam. The data came from two sources: an email dataset and a YouTube spam comment dataset. Text processing techniques and feature extraction methods were applied to preprocess the datasets using the scikit-learn package. Labels were mapped from "spam" and "ham" to "1" and "0" respectively for training and testing the model. The email spam dataset was split into training and testing datasets, and the first 3000 lin
29

Arnap, Adam, and Kusrini. "Enhancing SQL Injection Attack Detection Using Naïve Bayes and SMOTE Method on Imbalanced Datasets." Journal of Artificial Intelligence and Engineering Applications (JAIEA) 4, no. 1 (2024): 74–81. http://dx.doi.org/10.59934/jaiea.v4i1.559.

Abstract:
SQL injection attack detection is a crucial aspect of cybersecurity, considering the potential damage that such attacks can cause. This study aims to evaluate the effectiveness of the Naive Bayes model in detecting SQL injection attacks on an imbalanced dataset. To address the data imbalance issue, the SMOTE (Synthetic Minority Over-sampling Technique) method was applied. The study consists of two phases: first, training and testing the Naive Bayes model on the original dataset without SMOTE, and second, training and testing on the dataset with SMOTE applied. The results indicate that the Naiv
30

Romero, Carlo N., Matt Ervin G. Mital, Zagie D. Rostata, and Mark Angelo M. Martinez. "Investigating the Impact of Training and Testing Ratios on the Performance of an AI-Based Malware Detector using MATLAB." E3S Web of Conferences 500 (2024): 01015. http://dx.doi.org/10.1051/e3sconf/202450001015.

Abstract:
This research investigates the impact of the training and testing ratios on the performance of an AI-Based Malware Detector using MATLAB. The MATLAB experiments show that a higher training percentage means a larger portion of the dataset is used for training the model, while a lower training percentage means a larger portion of the dataset is reserved for testing the model’s performance. Exploring the influence of the training and testing ratios also made it possible to determine the performance of the AI-Based Malware Detector. The results give to determining the relatio
31

Musu, Wilem, Abdul Ibrahim, and Heriadi Heriadi. "Pengaruh Komposisi Data Training dan Testing terhadap Akurasi Algoritma C4.5." SISITI : Seminar Ilmiah Sistem Informasi dan Teknologi Informasi 10, no. 1 (2021): 186–95. https://doi.org/10.36774/sisiti.v10i1.802.

Abstract:
Accuracy is the benchmark used to determine how precisely a classification pattern predicts the class of future data. In data mining practice, the accuracy of a classification pattern is tested using testing data, while the pattern itself is discovered using training data. The percentage split of a dataset into training data and testing data is one of the factors that determine the accuracy value. A wrong choice of composition between the two types of data will therefore affect the accuracy obtained. This study tests
32

Kim, Jung Hwan, Alwin Poulose, and Dong Seog Han. "The Extensive Usage of the Facial Image Threshing Machine for Facial Emotion Recognition Performance." Sensors 21, no. 6 (2021): 2026. http://dx.doi.org/10.3390/s21062026.

Abstract:
Facial emotion recognition (FER) systems play a significant role in identifying driver emotions. Accurate facial emotion recognition of drivers in autonomous vehicles reduces road rage. However, training even an advanced FER model without proper datasets causes poor performance in real-time testing. FER system performance is affected more heavily by the quality of the datasets than by the quality of the algorithms. To improve FER system performance for autonomous vehicles, we propose a facial image threshing (FIT) machine that uses advanced features of pre-trained facial recognition and training from the
33

Zakria, Jianhua Deng, Jingye Cai, Muhammad Umar Aftab, Muhammad Saddam Khokhar, and Rajesh Kumar. "Visual Features with Spatio-Temporal-Based Fusion Model for Cross-Dataset Vehicle Re-Identification." Electronics 9, no. 7 (2020): 1083. http://dx.doi.org/10.3390/electronics9071083.

Abstract:
Vehicle re-identification (Re-Id) is the key module in an intelligent transportation system (ITS). Due to its versatile applicability in metropolitan cities, this task has received increasing attention these days. It aims to identify whether the specific vehicle has already appeared over the surveillance network or not. Mostly, vehicle Re-Id methods are evaluated on a single dataset, in which training and testing of the model are performed on the same dataset. However, in practice, this negatively affects model generalization ability due to biased datasets along with the significant differenc
34

Yen, Chih-Ta, Sheng-Nan Chang, and Cheng-Hong Liao. "Deep learning algorithm evaluation of hypertension classification in less photoplethysmography signals conditions." Measurement and Control 54, no. 3-4 (2021): 439–45. http://dx.doi.org/10.1177/00202940211001904.

Abstract:
This study used photoplethysmography signals to classify patients into no hypertension, prehypertension, stage I hypertension, and stage II hypertension. Four deep learning models are compared in the study. The difficulty in the study is how to find the optimal parameters, such as kernel, kernel size, and layers, when photoplethysmography (PPG) training data are limited. PPG signals were used to train deep residual network convolutional neural network (ResNetCNN) and bidirectional long short-term memory (BILSTM) to determine the optimal operating parameters when each dataset c
35

Ou, Yuduan, and Gerónimo Quiñónez-Barraza. "Modeling Height–Diameter Relationship Using Artificial Neural Networks for Durango Pine (Pinus durangensis Martínez) Species in Mexico." Forests 14, no. 8 (2023): 1544. http://dx.doi.org/10.3390/f14081544.

Abstract:
The total tree height (h) and diameter at breast height (dbh) relationship is an essential tool in forest management and planning. Nonlinear mixed effect modeling (NLMEM) has been extensively used, and lately the artificial neural network (ANN) and the resilient backpropagation artificial neural network (RBPANN) approaches have been trending topics for modeling this relationship. The objective of this study was to evaluate and contrast the NLMEM and RBPANN approaches for modeling the h-dbh relationship for the Durango pine species (Pinus durangensis Martínez) for both training and testing datase
36

Jin, Jiayi, and Chengyun Zhao. "Performance Analysis and Comparison of Heart Disease Prediction Models." Highlights in Science, Engineering and Technology 123 (December 24, 2024): 618–24. https://doi.org/10.54097/yb7t2031.

Abstract:
Since cardiovascular diseases (CVDs) are the leading cause of death, it is important for people to detect them early and take certain precautions. Accordingly, the paper studies heart disease prediction with statistical models. In this study, the researcher analyzed a dataset from four different regions: Cleveland, Hungarian, Switzerland, and Long Beach. Each region is used as one testing dataset and the remaining three regions are used as one training dataset, giving a total of four sets of training and testing datasets. The goal is to predict heart disease w
37

Raihani Mohamed, Nur Hidayah Azizan, Thinagaran Perumal, Syaifulnizam Abd Manaf, Erzam Marlisah, and Medria Kusuma Dewi Hardhienata. "Discovering and Recognizing of Imbalance Human Activity in Healthcare Monitoring using Data Resampling Technique and Decision Tree Model." Journal of Advanced Research in Applied Sciences and Engineering Technology 33, no. 2 (2023): 340–50. http://dx.doi.org/10.37934/araset.33.2.340350.

Abstract:
Human activity recognition models are vital and have been used in healthcare monitoring systems. Bespoke multi-modal sensors were used, such as accelerometer, gyroscope, GPS, temperature, pressure mat, etc. The activities involved may vary, resulting in a class imbalance issue; therefore, model accuracy also degrades and may not provide the desired results in all aspects. A resampling method, Synthetic Minority Oversampling Technique and Tomek Link (SMOTE-Tomek), is proposed to balance the target classes. Moreover, many classification algorithms such as Logistic Regression, SVM a
38

Karjadi, Daniel Avian, Bayu Yasa Wedha, and Handri Santoso. "Heavy-loaded Vehicles Detection Model Testing using Synthetic Dataset." SinkrOn 7, no. 2 (2022): 464–71. http://dx.doi.org/10.33395/sinkron.v7i2.11378.

Abstract:
Currently, many roads in Indonesia are damaged. This is due to the large vehicles and heavy loads that often pass over them. The longer the damage is neglected, the more severe it becomes. The central government and local governments often carry out road repairs, but the problem keeps recurring. Road damage has many causes, one of which is the road load. The road load is caused by the number of vehicles that carry more than the specified capacity. There are many methods used to monitor roads for road damage. The weighing post is a means used by the government in co
39

Riduan, Achmad, Febriyanti Panjaitan, Syahril Rizal, Nurul Huda, and Susan Dian Purnamasari. "Detection of Inorganic Waste Using Convolutional Neural Network Method." Journal of Information Systems and Informatics 6, no. 1 (2024): 290–300. http://dx.doi.org/10.51519/journalisi.v6i1.662.

Abstract:
Waste, encompassing both domestic and industrial materials, presents a significant environmental challenge. Effectively managing waste requires accurate identification and classification. Convolutional Neural Networks (CNNs), particularly the Residual Network (ResNet) architecture, have shown promise in image classification tasks. This research aims to utilize ResNet to identify types of waste, contributing to more efficient waste management practices. The ResNet101 architecture, comprising 101 layers, is employed in this study for waste classification. The dataset consists of 2527 images cate
40

Ray, Sujan, Khaldoon Alshouiliy, and Dharma P. Agrawal. "Dimensionality Reduction for Human Activity Recognition Using Google Colab." Information 12, no. 1 (2020): 6. http://dx.doi.org/10.3390/info12010006.

Abstract:
Human activity recognition (HAR) is a classification task that involves predicting the movement of a person based on sensor data. There has been huge growth and development of smartphones over the last 10–15 years, and they could be used as a medium of mobile sensing to recognize human activity. Nowadays, deep learning methods are in great demand and we could use those methods to recognize human activity. A great way is to build a convolutional neural network (CNN). HAR using Smartphone dataset has been widely used by researchers to develop machine learning models to recognize hu
41

Anand, Battu, and T. NagaTeja. "Traffic Sign Board Recognition Using Computational Techniques." Journal of Physics: Conference Series 2779, no. 1 (2024): 012021. http://dx.doi.org/10.1088/1742-6596/2779/1/012021.

Abstract:
Traffic signs are essential for traffic management, regulating driving conduct, and lowering accidents, injuries, and fatalities. Any Intelligent Transportation System must have automatic traffic sign detection and identification. This work describes a deep-learning-based autonomous technique for traffic sign recognition in India. End-to-end learning with a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM) provided inspiration for automatic traffic sign detection and recognition. The proposed concept was evaluated using a cutting-edge dataset consisting of 12933 im
42

Hasan, Mahmudul, Md Abdus Sahid, Md Palash Uddin, Md Abu Marjan, Seifedine Kadry, and Jungeun Kim. "Performance discrepancy mitigation in heart disease prediction for multisensory inter-datasets." PeerJ Computer Science 10 (March 18, 2024): e1917. http://dx.doi.org/10.7717/peerj-cs.1917.

Abstract:
Heart disease is one of the primary causes of morbidity and death worldwide. Millions of people have heart attacks every year, and only early-stage predictions can help to reduce the number. Researchers are working on designing and developing early-stage prediction systems using different advanced technologies, and machine learning (ML) is one of them. Almost all existing ML-based works consider the same dataset (intra-dataset) for the training and validation of their method. In particular, they do not consider inter-dataset performance checks, where different datasets are used in the trai
43

Kim, Jong-Ho, Byantara Darsan Purusatama, Alvin Muhammad Savero, et al. "Performance Influencing Factors of Convolutional Neural Network Models for Classifying Certain Softwood Species." Forests 14, no. 6 (2023): 1249. http://dx.doi.org/10.3390/f14061249.

Abstract:
This study aims to verify the wood classification performance of convolutional neural networks (CNNs), such as VGG16, ResNet50, GoogLeNet, and basic CNN architectures, and to investigate the factors affecting classification performance. A dataset from 10 softwood species consisted of 200 cross-sectional micrographs each from the total part, earlywood, and latewood of each species. We used 80% and 20% of each dataset for training and testing, respectively. To improve the performance of the architectures, the dataset was augmented, and the differences in classification performance before and aft
44

Xia, Jianglin. "Credit Card Fraud Detection Based on Support Vector Machine." Highlights in Science, Engineering and Technology 23 (December 3, 2022): 93–97. http://dx.doi.org/10.54097/hset.v23i.3202.

Abstract:
Due to the increasing popularity of cashless transactions, credit card fraud has become one of the most common frauds and has caused huge harm to financial institutions and individuals in real life. In this academic paper, the Support Vector Machine (SVM) algorithm is used to build models to deal with the credit card fraud detection problem with the performance metrics AUC and F1-score. The experiment dataset is named Credit Card Transactions Fraud Detection Dataset from the Kaggle website. After the step of preprocessing, the dataset is split into the training, testing and validation dataset wit
45

Moldovanu, Simona, Iulia-Nela Anghelache Nastase, Mihaela Miron, and Luminita Moraru. "Performance comparison of two non-parametric classifiers for classification using geometric features." Annals of the ”Dunarea de Jos” University of Galati Fascicle II Mathematics Physics Theoretical Mechanics 45, no. 2 (2022): 59–62. http://dx.doi.org/10.35219/ann-ugal-math-phys-mec.2022.2.04.

Abstract:
This study aims to examine and compare the performances of Random Forest (RF) and k-Nearest Neighbor (k-NN) algorithms used for classification based on certain geometric features. For the purpose of the analysis, the Breast Cancer Wisconsin (BCW) public dataset is used. The BCW dataset contains features like area, perimeter, radius, compactness, and symmetry computed from 357 benign and 212 malignant breast images, respectively. Three different experiments related to the size of training and testing datasets for classification are conducted and different accuracy values are obtained. The best acc
46

Mao, Gang, Zhongzheng Zhang, Sixiang Jia, Khandaker Noman, and Yongbo Li. "Partial Transfer Ensemble Learning Framework: A Method for Intelligent Diagnosis of Rotating Machinery Based on an Incomplete Source Domain." Sensors 22, no. 7 (2022): 2579. http://dx.doi.org/10.3390/s22072579.

Abstract:
Most cross-domain intelligent diagnosis approaches presume that the health states in training datasets are consistent with those in testing. However, it is usually difficult and expensive to collect samples under all failure states during the training stage in actual engineering; this causes the training dataset to be incomplete. These existing methods may not be favorably implemented with an incomplete training dataset. To address this problem, a novel deep-learning-based model called partial transfer ensemble learning framework (PT-ELF) is proposed in this paper. The major procedures of this
47

Sarwati Rahayu, Sulis Sandiwarno, Erwin Dwika Putra, Marissa Utami, and Hadiguna Setiawan. "Model Sequential Resnet50 Untuk Pengenalan Tulisan Tangan Aksara Arab." JSAI (Journal Scientific and Applied Informatics) 6, no. 2 (2023): 234–41. http://dx.doi.org/10.36085/jsai.v6i2.5379.

Abstract:
Research on Arabic handwriting recognition is still limited. The number of public datasets for Arabic script is also still limited. Therefore, each study usually uses its own dataset to conduct research. However, public datasets have recently become available, creating research opportunities to compare methods on the same dataset. This study aimed to determine the implementation of the transfer learning model with the best accuracy for handwriting recognition in Arabic script. The results of the experiment using ResNet50 are as follows: training accuracy is 91.
48

Aman, Fazal, Azhar Rauf, Rahman Ali, Jamil Hussain, and Ibrar Ahmed. "Balancing Complex Signals for Robust Predictive Modeling." Sensors 21, no. 24 (2021): 8465. http://dx.doi.org/10.3390/s21248465.

Abstract:
Robust predictive modeling is the process of creating, validating, and testing models to obtain better prediction outcomes. Datasets usually contain outliers whose trend deviates from most data points. Conventionally, outliers are removed from the training dataset during preprocessing before building predictive models. Such models, however, may have poor predictive performance on the unseen testing data involving outliers. In modern machine learning, outliers are regarded as complex signals because of their significant role and are not suggested for removal from the training dataset. Model
49

Zebari, Dilovan Asaad, Dheyaa Ahmed Ibrahim, Diyar Qader Zeebaree, et al. "Breast Cancer Detection Using Mammogram Images with Improved Multi-Fractal Dimension Approach and Feature Fusion." Applied Sciences 11, no. 24 (2021): 12122. http://dx.doi.org/10.3390/app112412122.

Abstract:
Breast cancer detection using mammogram images at an early stage is an important step in disease diagnostics. We propose a new method for the classification of benign or malignant breast cancer from mammogram images. Hybrid thresholding and the machine learning method are used to derive the region of interest (ROI). The derived ROI is then separated into five different blocks. The wavelet transform is applied to suppress noise from each produced block based on BayesShrink soft thresholding by capturing high and low frequencies within different sub-bands. An improved fractal dimension (FD) appr
50

Kanjanawattana, Sarunya, Worawit Teerawatthanaprapha, Panchalee Praneetpholkrang, Gun Bhakdisongkhram, and Suchada Weeragulpiriya. "Pineapple Sweetness Classification Using Deep Learning Based on Pineapple Images." Journal of Image and Graphics 11, no. 1 (2023): 47–52. http://dx.doi.org/10.18178/joig.11.1.47-52.

Abstract:
In Thailand, the pineapple is a valuable crop whose price is determined by its sweetness. An optical refractometer or another technique that requires expert judgment can be used to determine a fruit's sweetness. However, determining the sweetness of each fruit takes time and effort. This study employed the AlexNet deep learning model to categorize pineapple sweetness levels based on physical attributes shown in images. The dataset was classified into four classes, i.e., M1 to M4, and sorted in ascending order by sweetness level. The dataset was divided into two parts: training and testing