Academic literature on the topic 'Kaggle Dataset'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Kaggle Dataset.'

Journal articles on the topic "Kaggle Dataset"

1

Lo, Jui-En, Eugene Yu-Chuan Kang, Yun-Nung Chen, et al. "Data Homogeneity Effect in Deep Learning-Based Prediction of Type 1 Diabetic Retinopathy." Journal of Diabetes Research 2021 (December 28, 2021): 1–9. http://dx.doi.org/10.1155/2021/2751695.

Abstract:
This study is aimed at evaluating a deep transfer learning-based model for identifying diabetic retinopathy (DR) that was trained using a dataset with high variability and predominant type 2 diabetes (T2D) and comparing model performance with that in patients with type 1 diabetes (T1D). The Kaggle dataset, which is a publicly available dataset, was divided into training and testing Kaggle datasets. In the comparison dataset, we collected retinal fundus images of T1D patients at Chang Gung Memorial Hospital in Taiwan from 2013 to 2020, and the images were divided into training and testing T1D d
2

Jesie, R. Sherline, and M. S. Godwin Premi. "Improved Tunicate Swarm Optimization Based Hybrid Convolutional Neural Network for Classification of Leaf Diseases and Nutrient Deficiencies in Rice (Oryza)." Agronomy 14, no. 8 (2024): 1851. http://dx.doi.org/10.3390/agronomy14081851.

Abstract:
In Asia, rice is the most consumed grain by humans, serving as a staple food in India. The yield of rice paddies is easily affected by nutrient deficiencies and leaf diseases. To overcome this problem and improve the yield productivity of rice, nutrient deficiency and leaf disease identification are essential. The main nutrient elements in paddies are potassium, phosphorus, and nitrogen (PPN), the deficiency of any of which strongly affects the rice plants. When multiple nutrient elements are deficient, the leaf color of the rice plants is altered. To overcome this problem, optimal nutrient de
3

Tsai, Chi-Yi, Wei-Hsuan Shih, and Humaira Nisar. "Three-Stage Recursive Learning Technique for Face Mask Detection on Imbalanced Datasets." Mathematics 12, no. 19 (2024): 3104. http://dx.doi.org/10.3390/math12193104.

Abstract:
In response to the COVID-19 pandemic, governments worldwide have implemented mandatory face mask regulations in crowded public spaces, making the development of automatic face mask detection systems critical. To achieve robust face mask detection performance, a high-quality and comprehensive face mask dataset is required. However, due to the difficulty in obtaining face samples with masks in the real-world, public face mask datasets are often imbalanced, leading to the data imbalance problem in model training and negatively impacting detection performance. To address this problem, this paper p
4

Nafi'iyah, Nur, and Nur Fahmi Maulidi. "LINEAR REGRESSION FOR DISCOUNTING PRESENTATION RECOMMENDATIONS (Kaggle Dataset)." JURNAL TEKNOLOGI INFORMASI DAN KOMUNIKASI 13, no. 2 (2022): 67–73. http://dx.doi.org/10.51903/jtikp.v13i2.326.

Abstract:
In the business of selling goods, there must be goods that do not sell well and sell well. How to make unsold items sell well by giving customers a discount or discount strategy. The goal is to provide discounted prices to attract customers' attention and increase sales turnover. Prediction to give the right discount presentation is needed in the discount strategy. How to determine discount prediction using linear regression method, looking for line equations by training data taken from Kaggle.com. The data were trained to find the constants and coefficients of the independent variables. The r
5

Jinfeng, Gao, Sehrish Qummar, Zhang Junming, Yao Ruxian, and Fiaz Gul Khan. "Ensemble Framework of Deep CNNs for Diabetic Retinopathy Detection." Computational Intelligence and Neuroscience 2020 (December 15, 2020): 1–11. http://dx.doi.org/10.1155/2020/8864698.

Abstract:
Diabetic retinopathy (DR) is an eye disease that damages the blood vessels of the eye. DR causes blurred vision or it may lead to blindness if it is not detected in early stages. DR has five stages, i.e., 0 normal, 1 mild, 2 moderate, 3 severe, and 4 PDR. Conventionally, many hand-on projects of computer vision have been applied to detect DR but cannot code the intricate underlying features. Therefore, they result in poor classification of DR stages, particularly for early stages. In this research, two deep CNN models were proposed with an ensemble technique to detect all the stages of DR by u
6

Anaya-Sánchez, Héctor, Leopoldo Altamirano-Robles, Raquel Díaz-Hernández, and Saúl Zapotecas-Martínez. "WGAN-GP for Synthetic Retinal Image Generation: Enhancing Sensor-Based Medical Imaging for Classification Models." Sensors 25, no. 1 (2024): 167. https://doi.org/10.3390/s25010167.

Abstract:
Accurate synthetic image generation is crucial for addressing data scarcity challenges in medical image classification tasks, particularly in sensor-derived medical imaging. In this work, we propose a novel method using a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) and nearest-neighbor interpolation to generate high-quality synthetic images for diabetic retinopathy classification. Our approach enhances training datasets by generating realistic retinal images that retain critical pathological features. We evaluated the method across multiple retinal image datasets
7

Sudriyanto, Sudriyanto, Muhammad Ali Hafid, and Moch Ade Kurniawan. "Deteksi Akun Kaggle Bot Menggunakan Linear Regression." Journal of Electrical Engineering and Computer (JEECOM) 6, no. 2 (2024): 449–59. http://dx.doi.org/10.33650/jeecom.v6i2.9251.

Abstract:
This study examines the problem of fake accounts on the Kaggle platform, focusing on the development of a predictive model that uses Linear Regression to detect bot accounts. Kaggle, as a leading platform in data science, faces serious challenges to data integrity from bot-voting practices that affect the authenticity of competitions and uploaded datasets. The study uses the Kaggle Bot Account dataset, which comprises more than one million entries, with independent variables including follower count, interactions with content, and other user activity. The L
8

van Otterloo, Sieuwert, and Pavlo Burda. "The Utrecht Housing dataset: A housing appraisal dataset." Computers and Society Research Journal 1 (2025): 1–11. https://doi.org/10.54822/qvhm1662.

Abstract:
This paper introduces a real-world dataset for analysing and predicting house prices. The dataset consists of actual data on the Dutch housing market collected in 2024 for a total of 153 houses in one city (Utrecht in The Netherlands). The dataset incorporates diverse variables on individual houses, including property characteristics (e.g., house type, build year, geolocation, area, energy label) and market metrics (e.g., asking price, final price). The data was collected from two public sources. The dataset has been created to help researchers and educators to demonstrate machine learning p
9

Ahamad, Ghulab Nabi, Shafiullah, Hira Fatima, et al. "Influence of Optimal Hyperparameters on the Performance of Machine Learning Algorithms for Predicting Heart Disease." Processes 11, no. 3 (2023): 734. http://dx.doi.org/10.3390/pr11030734.

Abstract:
One of the most difficult challenges in medicine is predicting heart disease at an early stage. In this study, six machine learning (ML) algorithms, viz., logistic regression, K-nearest neighbor, support vector machine, decision tree, random forest classifier, and extreme gradient boosting, were used to analyze two heart disease datasets. One dataset was UCI Kaggle Cleveland and the other was the comprehensive UCI Kaggle Cleveland, Hungary, Switzerland, and Long Beach V. The performance results of the machine learning techniques were obtained. The support vector machine with tuned hyperparamet
10

Aziz, Faisal, and Nana Suryana. "Implementation of Machine Learning for Disease Detection in Tomato Plants Using Convolutional Neural Networks." JESII: Journal of Elektronik Sistem InformasI 2, no. 2 (2024): 275–87. https://doi.org/10.31848/jesii.v2i2.3580.

Abstract:
Diseases in tomato plants can be highly detrimental to tomato farmers, with common afflictions such as begomovirus, blight, and spider mites posing significant challenges. The implementation of machine learning offers a promising solution to address these issues and mitigate the financial losses caused by such diseases. This study aims to evaluate the effectiveness of machine learning in detecting plant diseases using Convolutional Neural Networks (CNN). The data used in this implementation was obtained from public datasets available on Kaggle and real-time data collected directly from tomato

Book chapters on the topic "Kaggle Dataset"

1

Akshay, B. R., Sini Raj Pulari, T. S. Murugesh, and Shriram K. Vasudevan. "Flower recognition with Kaggle dataset and Gradio interface." In Machine Learning. CRC Press, 2024. http://dx.doi.org/10.1201/9781032676685-6.

2

Tariq, Maria, Vasile Palade, and YingLiang Ma. "Transfer Learning Based Classification of Diabetic Retinopathy on the Kaggle EyePACS Dataset." In Medical Imaging and Computer-Aided Diagnosis. Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-16-6775-6_8.

3

Prasher, Shikha, Leema Nelson, and Avinash Sharma. "Evaluation of Machine Learning Techniques to Diagnose Polycystic Ovary Syndrome Using Kaggle Dataset." In Emerging Trends in Expert Applications and Security. Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-1946-8_25.

4

Jareena Begum, D., and S. P. Chokkalingam. "MRI-Based Brain Tumour Detection and Classification Using Random Forest Algorithm." In Smart Innovation, Systems and Technologies. Springer Nature Singapore, 2025. https://doi.org/10.1007/978-981-97-8355-7_7.

Abstract:
A brain tumour develops when cells in the brain multiply abnormally and out of control. The possibility of fatality makes this growth dangerous. The brain regulates everything from memory to vision to emotions. Recognizing these tumours is a challenging and complex task due to their location, size, and shape. There have been a number of successful attempts to enhance detection. However, the current level of precision is insufficient. This study details a method with a high likelihood of success in identifying brain cancers. The approach is implemented in Python LAB by use of an image segmentation technique and a classifier. One such classifier is the Random Forest Algorithm. The Principal Component Analysis (PCA) and the Discrete Wavelet Transform (DWT) are also employed. The created method is put to the test using a dataset available on the Kaggle platform. The acquired results were approximately 97.68% accurate. A confusion matrix and a comparison of the developed method to other research in the literature are included in the investigation to facilitate the implementation of the proposed strategy in image testing. This research shows that the offered system is superior to others in terms of detection rate, false positive rate, and recall. Results from this study showed that the developed method was effective at detecting brain tumours due to its high levels of accuracy, precision, and recall. According to the proposed strategy, it is essential to provide professionals who diagnose brain tumours with access to such technology.
5

Samte, Lianmuansang, Aditya Kumar Rabha, Bhargav Kalpa Hazarika, and Gypsy Nandi. "NEOTracker." In Critical Approaches to Data Engineering Systems and Analysis. IGI Global, 2024. http://dx.doi.org/10.4018/979-8-3693-2260-4.ch012.

Abstract:
Near-Earth objects (NEOs) are asteroids or comets that have their orbits in close proximity with Earth. Some objects amongst these are known to be potentially hazardous and pose a risk of collision. This chapter developed four supervised machine learning algorithms, namely, logistic regression, random forest, support vector machine, and XGBoost, for the detection and classification of hazardous near-earth objects. Two datasets were utilised, the first taken from the Kaggle website, and the second generated from NASA's JPL Small-Body database. Feature importance analysis of these datasets was done by analysing the Shapley values of the individual features in both datasets. This chapter concludes by finding all models to have performed sufficiently well, with XGBoost found to be the best and most consistent performing across both datasets. Additionally, both min and max diameter, and the absolute magnitude features for the Kaggle dataset, and the H and moid features for the JPL dataset were found to be the most impactful features for classifying hazardous near-earth objects.
6

Malik, Sonika, and Preeti Rathee. "Enhancing Wireless Communication in Education through a Novel Text-to-Image Generation System." In Wireless Communication Networks and Applications. Iterative International Publishers, Selfypage Developers Pvt Ltd, 2024. http://dx.doi.org/10.58532/nbennurch77.

Abstract:
This paper introduces a system designed for wireless text-to-image synthesis, empowering users to provide textual descriptions and obtain high-quality visual representations in real-time. The proposed model, utilizing a conditional generative adversarial network (cGAN), is trained on an extensive dataset of approximately 3,80,000 images sourced from 5 datasets, amalgamated through Kaggle. This diverse dataset encompasses various images of living and non-living entities, enhancing the model's robustness. The integration of wireless communication allows users to remotely input textual descriptions, extending the system's accessibility and usability. The model demonstrates an exceptionally high success rate, establishing itself as a valuable tool for dynamic image generation in wireless-enabled scenarios.
7

Singh, Khushwant, and Dheerdhwaj Barak. "Healthcare Performance in Predicting Type 2 Diabetes Using Machine Learning Algorithms." In Advances in Medical Diagnosis, Treatment, and Care. IGI Global, 2024. http://dx.doi.org/10.4018/979-8-3693-3679-3.ch008.

Abstract:
The body's imbalanced glucose consumption caused type 2 diabetes, which in turn caused problems with the immunological, neurological, and circulatory systems. Numerous studies have been conducted to predict this illness using a variety of clinical and pathological criteria. As technology has advanced, several machine learning approaches have also been used for improved prediction accuracy. This study examines the concept of data preparation and examines how it affects machine learning algorithms. Two datasets were built up for the experiment: LS, a locally developed and verified dataset, and PIMA, a dataset from Kaggle. In all, the research evaluates five machine learning algorithms and eight distinct scaling strategies. It has been noted that the accuracy of the PIMA data set ranges from 46.99 to 69.88% when no pre-processing is used, and it may reach 77.92% when scalers are used. Because the LS data set is tiny and regulated, accuracy for the dataset without scalers may be as low as 78.67%. With two labels, accuracy increases to 100%.
8

Jain, Sachin, and Preeti Jaidka. "Lung Cancer Classification Using Deep Learning Hybrid Model." In Future of AI in Medical Imaging. IGI Global, 2024. http://dx.doi.org/10.4018/979-8-3693-2359-5.ch013.

Abstract:
Abnormal growths in the lungs caused by disease. The classification of CT scans is accomplished by applying machine learning strategies. Classification methods based on deep learning, such as support vector machines, can categorize a wide variety of image datasets and produce segmentation results of the highest caliber. In this work, we suggested a method for deep feature extraction from images by altering SVM and CNN and then applying the hybrid model resulting from those modifications (NNSVLC). For this investigation, the Kaggle dataset will be utilized. The proposed method was found to be accurate 91.7% of the time, as determined by the results of the experiments.
9

J., Shanthalakshmi Revathy, Uma Maheswari N., and Sasikala S. "Enhanced BiLSTM Model for EEG Emotional Data Analysis." In Principles and Applications of Socio-Cognitive and Affective Computing. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-6684-3843-5.ch005.

Abstract:
Emotion recognition based on biological signals from the brain necessitates sophisticated signal processing and feature extraction techniques. The major purpose of this research is to use the enhanced BiLSTM (E-BiLSTM) approach to improve the effectiveness of emotion identification utilizing brain signals. The approach detects brain activity that has distinct characteristics that vary from person to person. This experiment uses an emotional EEG dataset that is publicly available on Kaggle. The data was collected using an EEG headband with four sensors (AF7, AF8, TP9, TP10), and three possible states were identified, including neutral, positive, and negative, based on cognitive behavioral studies. A big dataset is generated using statistical brainwave extraction of alpha, beta, theta, delta, and gamma, which is then scaled down to smaller datasets using the PCA feature selection technique. Overall accuracy was around 98.12%, which is higher than the present state of the art.
10

Ganesh R and Kalaiarasi S. "Strange Approach of Movie Rating Prediction Using Logistic Regression Comparing to Gaussian Naive Bayes Algorithm." In Advances in Parallel Computing Algorithms, Tools and Paradigms. IOS Press, 2022. http://dx.doi.org/10.3233/apc220041.

Abstract:
The aim is to find movie ratings using logistic regression and comparing the result with naive bayes based on Accuracy. A total of 6040 samples were collected from movie datasets available in kaggle. Two algorithms are used; one is Logistic Regression and another is naive bayes algorithm. The computation processes were executed and verified for exactness. Sample size N=5 is taken for both algorithms. SPSS was used for predicting significance value of the dataset considering G-Power value as 80%. Logistic Regression achieved mean accuracy of 80.83% when compared to Naive Bayes Algorithm with 82.53%. Results were obtained with a level of significance with 0.003 (p<0.05). Applied strange recommendation model confirms to have higher accuracy than Naive Bayes algorithm.

Conference papers on the topic "Kaggle Dataset"

1

Soni, Tanishq, Deepali Gupta, and Mudita Uppal. "Optimizing Heart Disease Prediction with Random Forest: Insights from the Kaggle Dataset." In 2024 4th International Conference on Advancement in Electronics & Communication Engineering (AECE). IEEE, 2024. https://doi.org/10.1109/aece62803.2024.10911595.

2

Ango, Rithika, Raj Kumar Masih, C. Kishor Kumar Reddy, Mohammed Shuaib, Monika Singh T, and Shadab Alam. "Fraud Detection in Banking using the Kaggle Credit Card Dataset and XGBoost Model." In 2024 International Conference on IoT Based Control Networks and Intelligent Systems (ICICNIS). IEEE, 2024. https://doi.org/10.1109/icicnis64247.2024.10823227.

3

Immanuel, Jonathan, Danny Matthew Saputra, and Novi Yusliani. "Bagging-Based Disease Prediction Using Naïve Bayes, Decision Tree, and Support Vector Machine on Kaggle Dataset." In 2024 International Conference on Electrical Engineering and Computer Science (ICECOS). IEEE, 2024. https://doi.org/10.1109/icecos63900.2024.10791239.

4

Mostafavi Ghahfarokhi, Mojtaba, Arash Asgari, Mohammad Abolnejadian, and Abbas Heydarnoori. "DistilKaggle: A Distilled Dataset of Kaggle Jupyter Notebooks." In MSR '24: 21st International Conference on Mining Software Repositories. ACM, 2024. http://dx.doi.org/10.1145/3643991.3644882.

5

Quaranta, Luigi, Fabio Calefato, and Filippo Lanubile. "KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle." In 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR). IEEE, 2021. http://dx.doi.org/10.1109/msr52588.2021.00072.

6

Sujatha, S., and T. Sreenivasulu Reddy. "3D Brain Tumor Segmentation with U-Net Network using Public Kaggle Dataset." In 2023 Third International Conference on Artificial Intelligence and Smart Energy (ICAIS). IEEE, 2023. http://dx.doi.org/10.1109/icais56108.2023.10073895.

7

Noever, David A., and Samantha E. Miller Noever. "Image Classifiers for Network Intrusions." In 9th International Conference of Security, Privacy and Trust Management (SPTM 2021). AIRCC Publishing Corporation, 2021. http://dx.doi.org/10.5121/csit.2021.110504.

Abstract:
This research recasts the network attack dataset from UNSW-NB15 as an intrusion detection problem in image space. Using one-hot-encodings, the resulting grayscale thumbnails provide a quarter-million examples for deep learning algorithms. Applying the MobileNetV2’s convolutional neural network architecture, the work demonstrates a 97% accuracy in distinguishing normal and attack traffic. Further class refinements to 9 individual attack families (exploits, worms, shellcodes) show an overall 56% accuracy. Using feature importance rank, a random forest solution on subsets show the most important
8

Baral, Gaurab, and Junxiu Zhou. "A hybrid Regression method for Predicting Housing Prices." In 2024 AHFE International Conference on Human Factors in Design, Engineering, and Computing (AHFE 2024 Hawaii Edition). AHFE International, 2024. http://dx.doi.org/10.54941/ahfe1005725.

Abstract:
Accurate house price prediction is crucial for accommodating the diverse needs of stakeholders in the home-buying process. House prices can be affected by various factors, such as location, construction date, exterior, etc. This work proposes a hybrid regression method that leverages the strengths of different regression techniques to improve prediction accuracy. Specifically, this work looks at conventional linear regression and other machine learning techniques such as support vector regression (SVR), and XGBoost regression. Then we compare these models with our proposed hybrid regression mo
9

Menezes, Richardson Santiago Teles, Angelo Marcelino Cordeiro, Rafael Magalhães, and Helton Maia. "Classification of Paintings Authorship Using Convolutional Neural Network." In Congresso Brasileiro de Inteligência Computacional. SBIC, 2021. http://dx.doi.org/10.21528/cbic2021-116.

Abstract:
In this paper, state-of-the-art architectures of Convolutional Neural Networks (CNNs) are explained and compared concerning authorship classification of famous paintings. The chosen CNNs architectures were VGG-16, VGG-19, Residual Neural Networks (ResNet), and Xception. The used dataset is available on the website Kaggle, under the title “Best Artworks of All Time”. Weighted classes for each artist with more than 200 paintings present in the dataset were created to represent and classify each artist’s style. The performed experiments resulted in an accuracy of up to 95% for the Xception archit
10

Ortiz, Pedro Arthur Pinto da Silva, and Daniel Lichtnow. "Usando um Dataset para Criação de um Aplicativo para Detecção de Problemas na Cultura do Morango a partir da Análise de Imagens." In Escola Regional de Banco de Dados. Sociedade Brasileira de Computação - SBC, 2024. http://dx.doi.org/10.5753/erbd.2024.238696.

Abstract:
This work describes an image classification model built for use in a prototype application for detecting problems in strawberry crops. The model was generated from a dataset, obtained from the Kaggle platform, containing images of seven different types of strawberry crop diseases. The model was implemented in the Ultralytics HUB analysis application, which offers real-time object detection and image recognition, optimizing the training of the machine learning model through the use of GPUs. The work also employs the YOLOv8 architecture. The proposal aims to