Journal articles on the topic 'Scikit-learn'

Author: Grafiati

Published: 4 June 2021

Last updated: 4 May 2022

Consult the top 50 journal articles for your research on the topic 'Scikit-learn.'

1

Varoquaux, G., L. Buitinck, G. Louppe, O. Grisel, F. Pedregosa, and A. Mueller. "Scikit-learn." GetMobile: Mobile Computing and Communications 19, no. 1 (June 2015): 29–33. http://dx.doi.org/10.1145/2786984.2786995.

2

Kashikar, Sudhnya, Sumedha Patil, Ameya Vedantwar, Shivani Katpatal, and Sofia Pillai. "Weather Prediction using Scikit-Learn." International Journal of Computer Sciences and Engineering 7, no. 4 (April 30, 2019): 36–40. http://dx.doi.org/10.26438/ijcse/v7i4.3640.

3

Bengfort, Benjamin, and Rebecca Bilbro. "Yellowbrick: Visualizing the Scikit-Learn Model Selection Process." Journal of Open Source Software 4, no. 35 (March 24, 2019): 1075. http://dx.doi.org/10.21105/joss.01075.

4

Hao, Jiangang, and Tin Kam Ho. "Machine Learning Made Easy: A Review of Scikit-learn Package in Python Programming Language." Journal of Educational and Behavioral Statistics 44, no. 3 (February 20, 2019): 348–61. http://dx.doi.org/10.3102/1076998619832248.

Abstract:

Machine learning is a popular topic in data analysis and modeling. Many different machine learning algorithms have been developed and implemented in a variety of programming languages over the past 20 years. In this article, we first provide an overview of machine learning and clarify its difference from statistical inference. Then, we review Scikit-learn, a machine learning package in the Python programming language that is widely used in data science. The Scikit-learn package includes implementations of a comprehensive list of machine learning methods under unified data and modeling procedure conventions, making it a convenient toolkit for educational and behavior statisticians.

5

Kalimuthu, Sathyavikasini, and Vijaya Vijayakumar. "Shallow learning model for diagnosing neuro muscular disorder from splicing variants." World Journal of Engineering 14, no. 4 (August 7, 2017): 329–36. http://dx.doi.org/10.1108/wje-09-2016-0075.

Abstract:

Purpose: Diagnosing a genetic neuromuscular disorder such as muscular dystrophy is complicated when the defect occurs during splicing. This paper aims to predict the type of muscular dystrophy from gene sequences by extracting well-defined descriptors related to splicing mutations. An automatic model is built to classify the disease through pattern recognition techniques coded in Python using the scikit-learn framework.

Design/methodology/approach: The cloned gene sequences are synthesized based on the mutation position and its location on the chromosome, using the positional cloning approach. For instance, in the human gene mutational database (HGMD), the mutational information for a splicing mutation is specified as IVS1-5 T > G, which indicates (IVS: intervening sequence, or intron) the first intron and five nucleotides before the consensus intron site AG, where the variant occurs in nucleotide G altered to T. IVS (+ve) denotes the forward strand 3′, with positive numbers counted from the G of the invariant donor site, and IVS (−ve) denotes the backward strand 5′, with negative numbers starting from the G of the acceptor site. The key idea of the paper is to identify discriminative descriptors in diseased gene sequences based on splicing variants and to provide an effective machine learning solution for predicting the type of muscular dystrophy from splicing mutations. Multi-class classification is carried out through data modeling of gene sequences. Synthetic mutational gene sequences are created because diseased gene sequences are not readily obtainable for this intricate disease. The positional cloning approach supports the generation of disease gene sequences from mutational information acquired from HGMD. SNP-, gene- and exon-based discriminative features are identified and used to train the model. A muscular dystrophy prediction model is built using supervised learning techniques in the scikit-learn environment. The data frame is built with the extracted features as a NumPy array. The data are normalized by transforming the feature values into the range between 0 and 1, which helps scale the input attributes for a model. Naïve Bayes, decision tree, K-nearest neighbor and SVM models are developed using the scikit-learn Python library.

Findings: To the best of the authors' knowledge, this is the first pattern recognition model to classify muscular dystrophy pertaining to splicing mutations. Certain essential SNP-, gene- and exon-based descriptors related to splicing mutations are proposed and extracted from the cloned gene sequences. A model is built using statistical learning techniques through scikit-learn in the Anaconda framework. The paper also discusses the results of statistical learning carried out on the same set of gene sequences with synonymous and non-synonymous mutational descriptors.

Research limitations/implications: The data frame is built with a NumPy array. Normalizing the data by transforming the feature values into the range between 0 and 1 helps scale the input attributes for a model. Naïve Bayes, decision tree, K-nearest neighbor and SVM models are developed using the scikit-learn Python library. While learning the SVM model, the cost, gamma and kernel parameters are tuned to attain good results. The scoring parameters of the classifiers are evaluated with tenfold cross-validation using the metric functions of the scikit-learn library. Results of the disease identification model based on non-synonymous, synonymous and splicing mutations were analyzed.

Practical implications: Certain essential SNP-, gene- and exon-based descriptors related to splicing mutations are proposed and extracted from the cloned gene sequences. A model is built using statistical learning techniques through scikit-learn in the Anaconda framework. The performance of the classifiers is increased by using different estimators from the scikit-learn library. Several types of mutations, such as missense, nonsense and silent mutations, are also considered to build models through statistical learning techniques, and their results are analyzed.

Originality/value: To the best of the authors' knowledge, this is the first pattern recognition model to classify muscular dystrophy pertaining to splicing mutations.
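The workflow this abstract describes (scaling features into the range 0 to 1, training four classifiers, scoring with tenfold cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the wine dataset bundled with scikit-learn stands in for the gene-sequence descriptors, and the function name `compare_classifiers` and all hyperparameter values are assumptions made for the example.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers():
    """Return the mean tenfold cross-validated accuracy of four classifiers."""
    # Stand-in data (assumption): the paper's gene-sequence features are not public here.
    X, y = load_wine(return_X_y=True)

    models = {
        "naive_bayes": GaussianNB(),
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "knn": KNeighborsClassifier(n_neighbors=5),
        # C (cost), gamma and kernel are the SVM parameters the abstract says were tuned;
        # the values below are illustrative defaults, not the tuned ones.
        "svm": SVC(C=1.0, gamma="scale", kernel="rbf"),
    }

    scores = {}
    for name, model in models.items():
        # Scaling inside the pipeline means the scaler is fitted on each training fold only.
        pipeline = make_pipeline(MinMaxScaler(), model)
        scores[name] = cross_val_score(pipeline, X, y, cv=10).mean()
    return scores


if __name__ == "__main__":
    for name, score in compare_classifiers().items():
        print(f"{name}: {score:.3f}")
```

Placing `MinMaxScaler` inside the pipeline, rather than scaling the whole dataset up front, keeps information from the held-out fold out of the scaling step.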

6

Bhattacharya, Sounak, and Ankit Lundia. "MOVIE RECOMMENDATION SYSTEM USING BAG OF WORDS AND SCIKIT-LEARN." International Journal of Engineering Applied Sciences and Technology 04, no. 05 (October 1, 2019): 526–28. http://dx.doi.org/10.33564/ijeast.2019.v04i05.076.

7

Kravchenko, S. N., E. O. Grishkun, and O. V. Vlasenko. "CLASSIFICATION METHODS FOR MACHINE LEARNING USING THE SCIKIT-LEARN LIBRARY." Scientific notes of Taurida National V.I. Vernadsky University. Series: Technical Sciences 1, no. 3 (2020): 121–25. http://dx.doi.org/10.32838/tnu-2663-5941/2020.3-1/19.

8

Beckner, Wesley, Coco M. Mao, and Jim Pfaendtner. "Statistical models are able to predict ionic liquid viscosity across a wide range of chemical functionalities and experimental conditions." Molecular Systems Design & Engineering 3, no. 1 (2018): 253–63. http://dx.doi.org/10.1039/c7me00094d.

Abstract:

Herein we present a method of developing predictive models of viscosity for ionic liquids (ILs) using publicly available data in the ILThermo database and the open-source software toolkits PyChem, RDKit, and SciKit-Learn.

9

Castejón-Limas, Manuel, Laura Fernández-Robles, Héctor Alaiz-Moretón, Jaime Cifuentes-Rodriguez, and Camino Fernández-Llamas. "A Framework for the Optimization of Complex Cyber-Physical Systems via Directed Acyclic Graph." Sensors 22, no. 4 (February 15, 2022): 1490. http://dx.doi.org/10.3390/s22041490.

Abstract:

Mathematical modeling and data-driven methodologies are frequently required to optimize industrial processes in the context of Cyber-Physical Systems (CPS). This paper introduces the PipeGraph software library, an open-source python toolbox for easing the creation of machine learning models by using Directed Acyclic Graph (DAG)-like implementations that can be used for CPS. scikit-learn’s Pipeline is a very useful tool to bind a sequence of transformers and a final estimator in a single unit capable of working itself as an estimator. It sequentially assembles several steps that can be cross-validated together while setting different parameters. Steps encapsulation secures the experiment from data leakage during the training phase. The scientific goal of PipeGraph is to extend the concept of Pipeline by using a graph structure that can handle scikit-learn’s objects in DAG layouts. It allows performing diverse operations, instead of only transformations, following the topological ordering of the steps in the graph; it provides access to all the data generated along the intermediate steps; and it is compatible with GridSearchCV function to tune the hyperparameters of the steps. It is also not limited to (X,y) entries. Moreover, it has been proposed as part of the scikit-learn-contrib supported project, and is fully compatible with scikit-learn. Documentation and unitary tests are publicly available together with the source code. Two case studies are analyzed in which PipeGraph proves to be essential in improving CPS modeling and optimization: the first is about the optimization of a heat exchange management system, and the second deals with the detection of anomalies in manufacturing processes.
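The scikit-learn `Pipeline` behaviour summarized in this abstract (transformers and a final estimator bound into one unit, cross-validated together and tuned with `GridSearchCV`) can be sketched as below. The example uses plain scikit-learn only and does not show PipeGraph's own API; the breast cancer dataset, the step names `scale` and `clf`, and the parameter grid are assumptions chosen for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def tune_pipeline():
    """Grid-search the SVM cost parameter of a scaler + classifier pipeline."""
    X, y = load_breast_cancer(return_X_y=True)

    # One transformer followed by a final estimator; the pipeline itself acts as an estimator.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("clf", SVC()),
    ])

    # Step parameters are addressed as <step name>__<parameter name>.
    param_grid = {"clf__C": [0.1, 1.0, 10.0]}

    # The whole pipeline is refitted on every training fold, so the scaler
    # never sees the validation fold (this is the data-leakage protection).
    search = GridSearchCV(pipeline, param_grid, cv=5)
    search.fit(X, y)
    return search


if __name__ == "__main__":
    result = tune_pipeline()
    print(result.best_params_, round(result.best_score_, 3))
```

A linear `Pipeline` like this one only chains transformations into a single final estimator, which is the restriction the abstract says PipeGraph is meant to lift.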

10

Bac, Jonathan, Evgeny M. Mirkes, Alexander N. Gorban, Ivan Tyukin, and Andrei Zinovyev. "Scikit-Dimension: A Python Package for Intrinsic Dimension Estimation." Entropy 23, no. 10 (October 19, 2021): 1368. http://dx.doi.org/10.3390/e23101368.

Abstract:

Dealing with uncertainty in applications of machine learning to real-life data critically depends on the knowledge of intrinsic dimensionality (ID). A number of methods have been suggested for the purpose of estimating ID, but no standard package to easily apply them one by one or all at once has been implemented in Python. This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension estimation. The scikit-dimension package provides a uniform implementation of most of the known ID estimators based on the scikit-learn application programming interface to evaluate the global and local intrinsic dimension, as well as generators of synthetic toy and benchmark datasets widespread in the literature. The package is developed with tools assessing the code quality, coverage, unit testing and continuous integration. We briefly describe the package and demonstrate its use in a large-scale (more than 500 datasets) benchmarking of methods for ID estimation for real-life and synthetic data.

11

Malpe, Prof Kalpana. "COVID-19 Face Mask Detection." International Journal for Research in Applied Science and Engineering Technology 10, no. 1 (January 31, 2022): 1312–15. http://dx.doi.org/10.22214/ijraset.2022.40005.

Abstract:

Face mask detection involves detecting the location of a face and then determining whether or not it has a mask on it. The problem is closely related to general object detection, which detects categories of objects. Face identification deals specifically with identifying one particular group of entities, i.e., faces. It has varied applications, such as autonomous driving, education, surveillance, and so on. This paper presents a simplified approach to serve the above purpose using basic Machine Learning (ML) packages such as TensorFlow, Keras, OpenCV and Scikit-Learn. The proposed technique detects the face in the image correctly and then identifies whether or not it has a mask on it. As a surveillance task performer, it should also detect a face along with a mask in motion. The technique attains accuracy of up to 95.77% and 94.58%, respectively, on two different datasets, and computes optimized parameter values using the Sequential Convolutional Neural Network model to detect the presence of masks correctly without causing over-fitting. Keywords: TensorFlow, Keras, OpenCV, Scikit-Learn

12

D McGinnis, William, Chapman Siu, Andre S, and Hanyu Huang. "Category Encoders: a scikit-learn-contrib package of transformers for encoding categorical data." Journal of Open Source Software 3, no. 21 (January 22, 2018): 501. http://dx.doi.org/10.21105/joss.00501.

13

Hoffmann, Moritz, Martin Scherer, Tim Hempel, Andreas Mardt, Brian de Silva, Brooke E. Husic, Stefan Klus, et al. "Deeptime: a Python library for machine learning dynamical models from time series data." Machine Learning: Science and Technology 3, no. 1 (December 10, 2021): 015009. http://dx.doi.org/10.1088/2632-2153/ac3de0.

Abstract:

Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use as well as for easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software. Deeptime can be found under https://deeptime-ml.github.io/.

14

Rajkumar, D. "IRIS Species Predictor." International Journal for Research in Applied Science and Engineering Technology 10, no. 1 (January 31, 2022): 1530–35. http://dx.doi.org/10.22214/ijraset.2022.40097.

Abstract:

In Machine Learning, we use semi-automated extraction of knowledge from data to identify IRIS flower species. Classification is a type of supervised learning in which the response is categorical, that is, its values are in a finite unordered set. To simplify the problem of classification, scikit-learn tools have been used. This paper focuses on IRIS flower classification using Machine Learning with scikit-learn tools. Here the problem concerns the identification of IRIS flower species on the basis of the flowers' attribute measurements. Classifying the IRIS data set means discovering patterns by examining the petal and sepal sizes of the IRIS flower and how a prediction of the flower's class is made from analyzing those patterns. In this paper we train the machine learning model with data, and when unseen data is encountered, the predictive model predicts the species using what it has learnt from the training data. Keywords: MATLAB, Machine learning, Neural Network.
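The train-then-predict-on-unseen-data procedure described here can be sketched with the iris dataset that ships with scikit-learn. The abstract does not name the classifier used, so the k-nearest-neighbor model, the split ratio, and the function name `train_iris_classifier` below are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def train_iris_classifier(random_state=0):
    """Train on part of the iris data and score on the held-out remainder."""
    iris = load_iris()
    # Four measurements per flower: sepal length/width and petal length/width.
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, test_size=0.25, random_state=random_state
    )

    # Assumed model choice; any scikit-learn classifier follows the same fit/predict calls.
    model = KNeighborsClassifier(n_neighbors=5)
    model.fit(X_train, y_train)

    # The test rows play the role of "unseen data".
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy, iris.target_names


if __name__ == "__main__":
    model, accuracy, names = train_iris_classifier()
    print(f"held-out accuracy: {accuracy:.3f}")
    # Predict the species of one new flower from its four measurements.
    print(names[model.predict([[5.1, 3.5, 1.4, 0.2]])[0]])
```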

15

Shaharum, N. S. N., H. Z. M. Shafri, W. A. W. A. K. Ghani, S. Samsatli, B. Yusuf, M. M. A. Al-Habshi, and H. M. Prince. "IMAGE CLASSIFICATION FOR MAPPING OIL PALM DISTRIBUTION VIA SUPPORT VECTOR MACHINE USING SCIKIT-LEARN MODULE." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-4/W9 (October 30, 2018): 133–37. http://dx.doi.org/10.5194/isprs-archives-xlii-4-w9-133-2018.

Abstract:

The world has been alarmed by the effects of global warming. Global warming has been a source of distress for the environment, thus shortening the Earth's lifespan. It is a challenging task to reduce the effects of global warming in a short period, knowing that the human population is increasing along with the demand for electricity and energy. In order to reduce the effects, renewable energy is presented as an alternative way to produce energy without harming the environment. Oil palm is one of the agricultural crops that produces a huge amount of biomass, which can be processed and used as a renewable energy source. In 2016, Malaysia reported that over 5 million hectares of land were covered by oil palm plantations. Being the second largest oil palm producer in the world gives Malaysia an advantage in producing renewable energy sources. However, there is a need to monitor the sustainability of oil palm plantations in Malaysia via effective mapping approaches. This study utilised two different platforms (open source and commercial) with a machine learning algorithm, namely Support Vector Machine (SVM), to perform oil palm mapping. An open source Python programming-based technique utilising the Scikit-learn module was used to map the oil palm distribution, and the result had an overall accuracy of 91.39%. To support and validate the efficiency of the Python programming-based image classification, a commercial remote sensing software package (ENVI) was used for comparison by implementing the same SVM algorithm, and the result showed an overall accuracy of 98.21%.

16

Dhindsa, Kiret, Oliver Cook, Thomas Mudway, Areeb Khawaja, Ron Harwood, and Ranil Sonnadara. "LFSpy: A Python Implementation of Local Feature Selection for Data Classification with scikit-learn Compatibility." Journal of Open Source Software 5, no. 49 (May 10, 2020): 1958. http://dx.doi.org/10.21105/joss.01958.

17

Dorosh, Nataliia, and Tatyana Fenenko. "ДОСЛІДЖЕННЯ ДЕСКРИПТОРІВ ЩОДО РОЗПІЗНАВАННЯ ЦИФР НАБОРУ MNIST." System technologies 2, no. 127 (February 24, 2020): 45–54. http://dx.doi.org/10.34185/1562-9945-2-127-2020-04.

Abstract:

The best digit recognition results are obtained with neural networks and have an error below 1%. Successful recognition algorithms, including deep learning ones, are hidden from the user and hard to describe, so descriptor-based algorithms have not lost their relevance. The aim of this work is to select and study descriptors for recognizing the MNIST set. Digit recognition was performed on the basis of 12 descriptors using models from the Scikit-Learn Python library. From the results of recognition with the k-means method, it was found advisable to choose 8 descriptors.

18

Abdelrahman, Mahmoud M., and Ahmed Mohamed Yousef Toutou. "[ANT]: A Machine Learning Approach for Building Performance Simulation: Methods and Development." Academic Research Community publication 3, no. 1 (February 7, 2019): 205. http://dx.doi.org/10.21625/archive.v3i1.442.

Abstract:

In this paper, we present an approach for combining machine learning (ML) techniques with building performance simulation by introducing four methods in which ML could be effectively involved in this field, i.e. Classification, Regression, Clustering and Model selection. The Rhino-3d-Grasshopper SDK was used to develop a new plugin for involving machine learning in the design process, using the Python programming language and making use of the scikit-learn module, that is, a Python module which provides a general purpose high level language to nonspecialist users by integrating a wide range of supervised and unsupervised learning algorithms with high performance, ease of use and well documented features. The ANT plugin provides a method to make use of these modules inside Rhino\Grasshopper so that they are handy to designers. This tool is open source and is released under the BSD simplified license. This approach shows promising results regarding the use of data in automating building performance development and could be widely applied. Future studies include providing a parallel computation facility using the PyOpenCL module as well as computer vision integration using scikit-image.

19

Polatgil, Mesut. "Investigation of the Effect of Normalization Methods on ANFIS Success: Forestfire and Diabets Datasets." International Journal of Information Technology and Computer Science 14, no. 1 (February 8, 2022): 1–8. http://dx.doi.org/10.5815/ijitcs.2022.01.01.

Abstract:

Machine learning and artificial intelligence techniques are increasingly present in our lives, and studies in this field are increasing day by day. Data is vital for these studies. In order to draw meaningful conclusions from the available data, new methods are proposed and successful results are obtained. Preparing the obtained data is very important for the studies to be carried out, and data preprocessing is a key part of that preparation. The most critical stage of data preprocessing is the scaling or normalization of the data. Machine learning libraries such as scikit-learn and programming languages such as R provide the necessary libraries to scale data. However, it is not known exactly which normalization method should be applied and which will yield more successful results. The success of these normalization methods has been investigated on many different methods, but such a study has not been done on the adaptive neural fuzzy inference system (ANFIS). The aim of this study is to examine the success of normalization methods on ANFIS in terms of both classification and regression problems. Thus, for studies using the ANFIS method, guidance is provided on which normalization process will give better results in the data preprocessing stage. Four different normalization methods in the scikit-learn library were applied to the Diabetes and Forestfire datasets in the UCI database. The results are presented separately for classification and regression. It has been determined that min-max normalization in classification problems and working with the original data in regression problems are more successful.
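To make the comparison of normalization methods concrete, the sketch below applies four scikit-learn scalers to the same small column of numbers. The abstract does not list which four methods were studied, so the selection here (`MinMaxScaler`, `StandardScaler`, `MaxAbsScaler`, `RobustScaler`) is an assumption, as is the helper name `scale_all`.

```python
import numpy as np
from sklearn.preprocessing import (
    MaxAbsScaler,
    MinMaxScaler,
    RobustScaler,
    StandardScaler,
)


def scale_all(X):
    """Apply four scalers to X and return the results keyed by method name."""
    scalers = {
        "min-max": MinMaxScaler(),    # (x - min) / (max - min), giving values in [0, 1]
        "standard": StandardScaler(), # (x - mean) / standard deviation
        "max-abs": MaxAbsScaler(),    # x / max(|x|)
        "robust": RobustScaler(),     # (x - median) / interquartile range
    }
    return {name: scaler.fit_transform(X) for name, scaler in scalers.items()}


if __name__ == "__main__":
    X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
    for name, scaled in scale_all(X).items():
        print(name, np.round(scaled.ravel(), 3))
```

Each scaler exposes the same `fit_transform` call, so swapping one normalization for another in an experiment is a one-line change.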

20

Martínez, Jorge L., Mariano Morán, Jesús Morales, Alfredo Robles, and Manuel Sánchez. "Supervised Learning of Natural-Terrain Traversability with Synthetic 3D Laser Scans." Applied Sciences 10, no. 3 (February 7, 2020): 1140. http://dx.doi.org/10.3390/app10031140.

Abstract:

Autonomous navigation of ground vehicles on natural environments requires looking for traversable terrain continuously. This paper develops traversability classifiers for the three-dimensional (3D) point clouds acquired by the mobile robot Andabata on non-slippery solid ground. To this end, different supervised learning techniques from the Python library Scikit-learn are employed. Training and validation are performed with synthetic 3D laser scans that were labelled point by point automatically with the robotic simulator Gazebo. Good prediction results are obtained for most of the developed classifiers, which have also been tested successfully on real 3D laser scans acquired by Andabata in motion.

21

Kamila, Vina Zahrotun, and Eko Subastian. "KNN vs Naive Bayes Untuk Deteksi Dini Putus Kuliah Pada Profil Akademik Mahasiswa." Jurnal Rekayasa Teknologi Informasi (JURTI) 3, no. 2 (December 14, 2019): 116. http://dx.doi.org/10.30872/jurti.v3i2.3097.

Abstract:

This study discusses how KNN and Naive Bayes compare in predicting the potential for student dropout. The data used as independent variables are academic data, namely grades from semesters 1 to 6. The results of this study are expected to serve as a guideline for applying the algorithms in an early dropout detection system. The algorithms were implemented with the Scikit-learn library in Python. The accuracy values obtained in this study show that Naive Bayes (92%) outperforms the KNN algorithm (85%) in predicting students' dropout status. However, further research is needed to test consistency and accuracy on larger and more diverse data.

22

Douglass, Michael J. J. "Book Review: Hands-on Machine Learning with Scikit-Learn, Keras, and Tensorflow, 2nd edition by Aurélien Géron." Physical and Engineering Sciences in Medicine 43, no. 3 (August 12, 2020): 1135–36. http://dx.doi.org/10.1007/s13246-020-00913-z.

23

Annala, L., M. A. Eskelinen, J. Hämäläinen, A. Riihinen, and I. Pölönen. "PRACTICAL APPROACH FOR HYPERSPECTRAL IMAGE PROCESSING IN PYTHON." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-3 (April 30, 2018): 45–52. http://dx.doi.org/10.5194/isprs-archives-xlii-3-45-2018.

Abstract:

Python is a very popular programming language among data scientists around the world. Python can also be used in hyperspectral data analysis. There are some toolboxes designed for spectral imaging, such as Spectral Python and HyperSpy, but there is a need for an analysis pipeline that is easy to use and agile for different solutions. We propose a Python pipeline which is built on the packages xarray, Holoviews and scikit-learn. We have developed some tools of our own: MaskAccessor, VisualisorAccessor and a spectral index library. They also fulfill our goal of easy and agile data processing. In this paper we present our processing pipeline and demonstrate it in practice.

24

Zhao, Yue, Xuejian Wang, Cheng Cheng, and Xueying Ding. "Combining Machine Learning Models Using combo Library." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 09 (April 3, 2020): 13648–49. http://dx.doi.org/10.1609/aaai.v34i09.7111.

Abstract:

Model combination, often regarded as a key sub-field of ensemble learning, has been widely used in both academic research and industry applications. To facilitate this process, we propose and implement an easy-to-use Python toolkit, combo, to aggregate models and scores under various scenarios, including classification, clustering, and anomaly detection. In a nutshell, combo provides a unified and consistent way to combine both raw and pretrained models from popular machine learning libraries, e.g., scikit-learn, XGBoost, and LightGBM. With accessibility and robustness in mind, combo is designed with detailed documentation, interactive examples, continuous integration, code coverage, and maintainability check; it can be installed easily through Python Package Index (PyPI) or https://github.com/yzhao062/combo.

25

Deshpande, P. J., A. Sure, O. Dikshit, and S. Tripathi. "A FRAMEWORK FOR ESTIMATING REPRESENTATIVE AREA OF A GROUND SAMPLE USING REMOTE SENSING." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W13 (June 4, 2019): 687–92. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w13-687-2019.

Abstract:

Modelling hydro-meteorological variables over land and atmosphere comprises ground sampling at selected locations and prediction at the other locations. Remote sensing data can be effectively used to improve predictions by prudently choosing sampling locations of variables co-dependent on the prediction variable. This paper presents a framework for estimating the representative area of a ground sample and thereby determining the number of samples required for prediction with a given level of uncertainty and spatial resolution. Application of the proposed framework with soil moisture as the prediction variable is presented using the Google Earth Engine and Scikit-learn libraries implemented in the Python 3 programming language.

26

Sapakova, S. Z., and N. Madinesh. "Obtaining predicted values of the demographic process using machine learning methods." Bulletin of Kazakh Leading Academy of Architecture and Construction 80, no. 2 (June 29, 2021): 352–58. http://dx.doi.org/10.51488/1680-080x/2021.2-08.

Abstract:

Research and analysis of demographic processes play an important role in many areas. For this, the population size and key factors from 1994 to 2019 were selected on the statistical website of the Republic of Kazakhstan. Demographics were population size, fertility, mortality, divorce, and migration. The factors of the standard of living were the number of unemployed and the average monthly salary, while the medical factors were the hospital organizations, the number of hospital beds and the number of doctors of all specialties. In the course of regression analysis, a correlation was obtained and multicollinear factors were identified. We used four different machine learning models from the Scikit-Learn library to generate population estimates. Regression models were evaluated using the quality score. As a result, linear regression and random forest models performed well.

27

Sudrazat, Sthevanie Dhita, Humbang Purba, Egie Wijaksono, Waskito Pranowo, and Muhammad Irsyad Hibatullah. "PREDIKSI KECEPATAN GELOMBANG S DENGAN MACHINE LEARNING PADA SUMUR “S-1”, CEKUNGAN SUMATERA TENGAH, INDONESIA." Lembaran publikasi minyak dan gas bumi 54, no. 1 (April 1, 2020): 29–35. http://dx.doi.org/10.29017/lpmgb.54.1.502.

Abstract:

S-wave (shear) velocity data are essential for reservoir characterization when determining reservoir zones. However, S-wave velocity data are very limited and available only at certain wells. This study was conducted to predict S-wave velocity values using supervised machine learning methods at well S-1 of an oil and gas field in the Central Sumatra basin. The machine learning algorithms were simulated in stages before and after tuning, using algorithms from the Scikit-learn library and an artificial neural network (ANN). In addition, the parameters and the amount of data used to predict the wave velocity determine the error and accuracy values. The analysis shows that the algorithm giving the best accuracy in predicting S-wave velocity is random forest, with a best n_estimator parameter value of 10, and the second best algorithm is k-nearest neighbor, with a best n_neighbor parameter value of 5.
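The two best-performing settings reported in this abstract correspond to scikit-learn's `n_estimators` and `n_neighbors` parameters (the abstract spells them without the final "s"). The sketch below fits both regressors with those values on synthetic data; the well-log data are not available here, so `make_regression`, the feature count, and the function name `fit_regressors` are assumptions for illustration only.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def fit_regressors(random_state=0):
    """Fit a random forest and a k-NN regressor and return their held-out R^2 scores."""
    # Synthetic stand-in for well-log features and S-wave velocity (assumption).
    X, y = make_regression(
        n_samples=300, n_features=4, noise=5.0, random_state=random_state
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )

    models = {
        # Parameter values taken from the abstract: 10 trees and 5 neighbors.
        "random_forest": RandomForestRegressor(n_estimators=10, random_state=random_state),
        "knn": KNeighborsRegressor(n_neighbors=5),
    }

    # score() returns the coefficient of determination R^2 on the held-out rows.
    return {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, r2 in fit_regressors().items():
        print(f"{name}: R^2 = {r2:.3f}")
```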

28

Chapman, James, and Hao-Ting Wang. "CCA-Zoo: A collection of Regularized, Deep Learning based, Kernel, and Probabilistic CCA methods in a scikit-learn style framework." Journal of Open Source Software 6, no. 68 (December 18, 2021): 3823. http://dx.doi.org/10.21105/joss.03823.

29

Karakatič, Sašo. "EvoPreprocess—Data Preprocessing Framework with Nature-Inspired Optimization Algorithms." Mathematics 8, no. 6 (June 2, 2020): 900. http://dx.doi.org/10.3390/math8060900.

Abstract:

The quality of machine learning models can suffer when inappropriate data is used, which is especially prevalent in high-dimensional and imbalanced data sets. Data preparation and preprocessing can mitigate some problems and can thus result in better models. The use of meta-heuristic and nature-inspired methods for data preprocessing has become common, but these approaches are still not readily available to practitioners with a simple and extendable application programming interface (API). In this paper the EvoPreprocess open-source Python framework, that preprocesses data with the use of evolutionary and nature-inspired optimization algorithms, is presented. The main problems addressed by the framework are data sampling (simultaneous over- and under-sampling data instances), feature selection and data weighting for supervised machine learning problems. EvoPreprocess framework provides a simple object-oriented and parallelized API of the preprocessing tasks and can be used with scikit-learn and imbalanced-learn Python machine learning libraries. The framework uses self-adaptive well-known nature-inspired meta-heuristic algorithms and can easily be extended with custom optimization and evaluation strategies. The paper presents the architecture of the framework, its use, experiment results and comparison to other common preprocessing approaches.

30

Weiand, Augusto, and Fernanda Rodrigues Ribeiro Weiand. "Análise de sentimentos do Twitter com Naïve Bayes e NLTK." ScientiaTec 4, no. 3 (April 24, 2018): 46–57. http://dx.doi.org/10.35819/scientiatec.v4i3.2188.

Abstract:

This article proposes a sentiment analysis algorithm for tweets from the microblog Twitter, using the Naïve Bayes probabilistic model to classify them as positive or negative. The pre-analyzed data of Sanders (2011) were used to build the corpus and then to apply the analysis and cross-validation. We then demonstrate the development of the algorithm following the methodology studied in the related articles, also using the NLTK and Scikit-Learn libraries to support the application of the algorithm in the Python programming language, the accuracy measures, and the cross-validation of the data. At this stage of the research, it was possible to obtain a relatively high accuracy of 91% on the dataset mentioned. We organize this article into sections covering the related work, the methodology used, the data collection system, the NLTK library, the Naïve Bayes probabilistic model and, finally, the results and future work, in that order.
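
A Naïve Bayes text classifier with cross-validation, as outlined in this abstract, might look roughly like the sketch below. It is only an illustration under stated assumptions: the corpus is a tiny made-up set of phrases rather than the Sanders data, the bag-of-words features and the multinomial variant of Naïve Bayes are assumptions, and the NLTK preprocessing the authors mention is omitted.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus (hypothetical; not the Sanders data).
texts = [
    "love this phone", "great service today", "what a wonderful update",
    "really happy with it", "best release so far", "awesome battery life",
    "hate this phone", "terrible service today", "what an awful update",
    "really unhappy with it", "worst release so far", "horrible battery life",
]
labels = [1] * 6 + [0] * 6  # 1 = positive, 0 = negative

# Bag-of-words counts feeding a multinomial Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())

# Stratified 3-fold cross-validation; accuracy is the default classifier score.
scores = cross_val_score(model, texts, labels, cv=3)
print("fold accuracies:", scores)

# Fit on the full toy corpus and classify an unseen phrase.
model.fit(texts, labels)
prediction = model.predict(["awful terrible service"])[0]
print("predicted label:", prediction)
```

With so few documents the fold accuracies are not meaningful; the point is only the shape of the pipeline.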

31

Trainor-Guitton, Whitney, Leo Turon, and Dominique Dubucq. "Python Earth Engine API as a new open-source ecosphere for characterizing offshore hydrocarbon seeps and spills." Leading Edge 40, no. 1 (January 2021): 35–44. http://dx.doi.org/10.1190/tle40010035.1.

Abstract:

The Python Earth Engine application programming interface (API) provides a new open-source ecosphere for testing hydrocarbon detection algorithms on large volumes of images curated with the Google Earth Engine. We specifically demonstrate the Python Earth Engine API by calculating three hydrocarbon indices: fluorescence, rotation absorption, and normalized fluorescence. The Python Earth Engine API provides an ideal environment for testing these indices with varied oil seeps and spills by (1) removing barriers of proprietary software formats and (2) providing an extensive library of data analysis tools (e.g., Pandas and Seaborn) and classification algorithms (e.g., Scikit-learn and TensorFlow). Our results demonstrate end-member cases in which fluorescence and normalized fluorescence indices of seawater and oil are statistically similar and different. As expected, predictive classification is more effective and the calculated probability of oil is more accurate for scenarios in which seawater and oil are well separated in the fluorescence space.

32

Шумейко, О. О., В. В. Шевченко, О. О. Жульковський, and І. І. Жульковська. "ПОРІВНЯЛЬНЕ ДОСЛІДЖЕННЯ МЕТОДІВ РОЗПІЗНАВАННЯ ОБЛИЧ." Математичне моделювання, no. 2(45) (December 13, 2021): 29–38. http://dx.doi.org/10.31319/2519-8106.2(45)2021.246871.

Abstract:

Face recognition has gained popularity owing to its uniqueness among other biometric methods, because it has all the characteristics of an effective security system. However, face recognition systems have certain limitations that need to be investigated and studied. For example, handling changes in lighting, object position, emotions, age, and so on requires special algorithms. Using these algorithms and their combinations will help, to some extent, to solve such problems. This work investigates and applies principal component analysis, linear discriminant analysis, independent component analysis, and classification with a support vector machine. The Python language and the Scikit-learn machine learning library were used to implement these algorithms. The performance of the systems was compared on the basis of accuracy. The results show that the SVM classifier using NMF performs worst in terms of prediction accuracy. The performance of the other models, trained using the ICA, PCA, and LDA methods, varies within acceptable limits. The model trained using the PCA algorithm achieves the highest prediction accuracy.
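
The best-performing combination reported here, PCA features followed by an SVM classifier, can be sketched as a scikit-learn pipeline. This is a hedged illustration only: it uses the small digits dataset bundled with scikit-learn instead of face images, and the number of components (30) and the RBF kernel are assumptions, not values from the paper.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Bundled digits images stand in for face images (an assumption for this sketch).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Dimensionality reduction first, then the SVM trained on the projected data.
clf = make_pipeline(PCA(n_components=30, random_state=0), SVC(kernel="rbf"))
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Swapping `PCA` for `FastICA` or `NMF` from `sklearn.decomposition`, or for `LinearDiscriminantAnalysis` from `sklearn.discriminant_analysis`, gives the other variants the abstract compares.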

33

Pilnenskiy, Nikita, and Ivan Smetannikov. "Feature Selection Algorithms as One of the Python Data Analytical Tools." Future Internet 12, no. 3 (March 16, 2020): 54. http://dx.doi.org/10.3390/fi12030054.

Abstract:

With the rapidly growing popularity of the Python programming language for machine learning applications, the gap between the needs of machine learning engineers and existing Python tools is widening. This is especially noticeable in more classical machine learning fields such as feature selection, as community attention over the last decade has mainly shifted to neural networks. This paper has two main purposes. First, we give an overview of existing open-source Python and Python-compatible feature selection libraries, show their problems, if any, and demonstrate the gap between these libraries and the modern state of the feature selection field. Then, we present the new open-source, scikit-learn compatible ITMO FS (Information Technologies, Mechanics and Optics University feature selection) library, which is currently under development, explain how its architecture covers modern views on feature selection, provide code examples of its use with Python, and compare its performance with other Python feature selection libraries.

34

Gevorkyan, Migran N., Anastasia V. Demidova, Tatiana S. Demidova, and Anton A. Sobolev. "Review and comparative analysis of machine learning libraries for machine learning." Discrete and Continuous Models and Applied Computational Science 27, no. 4 (December 15, 2019): 305–15. http://dx.doi.org/10.22363/2658-4670-2019-27-4-305-315.

Abstract:

The article is an overview. We compare current machine learning libraries that can be used for neural network development. The first part of the article gives a brief description of the TensorFlow, PyTorch, Theano, Keras, and SciKit Learn libraries and the SciPy library stack. An overview is given of the scope of these libraries and their main technical characteristics, such as performance, supported programming languages, and the current state of development. In the second part of the article, five libraries are compared using the example of a multilayer perceptron applied to the problem of handwritten digit recognition. This problem is well known and well suited for testing different types of neural networks. The training time is compared depending on the number of epochs and the accuracy of the classifier. The results of the comparison are presented as graphs of training time and accuracy depending on the number of epochs, and in tabular form.

35

Antoniadis-Karnavas, A., S. G. Sousa, E. Delgado-Mena, N. C. Santos, G. D. C. Teixeira, and V. Neves. "ODUSSEAS: a machine learning tool to derive effective temperature and metallicity for M dwarf stars." Astronomy & Astrophysics 636 (April 2020): A9. http://dx.doi.org/10.1051/0004-6361/201937194.

Abstract:

Aims. The derivation of spectroscopic parameters for M dwarf stars is very important in the fields of stellar and exoplanet characterization. The goal of this work is the creation of an automatic computational tool able to quickly and reliably derive the Teff and [Fe/H] of M dwarfs using optical spectra obtained by different spectrographs with different resolutions. Methods. ODUSSEAS (Observing Dwarfs Using Stellar Spectroscopic Energy-Absorption Shapes) is based on the measurement of the pseudo equivalent widths for more than 4000 stellar absorption lines and on the use of the machine learning Python package “scikit-learn” for predicting the stellar parameters. Results. We show that our tool is able to derive parameters accurately and with high precision, having precision errors of ~30 K for Teff and ~0.04 dex for [Fe/H]. The results are consistent for spectra with resolutions of between 48 000 and 115 000 and a signal-to-noise ratio above 20.

36

Smith, O., and H. Cho. "AN OPEN-SOURCE CANOPY CLASSIFICATION SYSTEM USING MACHINE-LEARNING TECHNIQUES WITHIN A PYTHON FRAMEWORK." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLVI-4/W2-2021 (August 19, 2021): 175–82. http://dx.doi.org/10.5194/isprs-archives-xlvi-4-w2-2021-175-2021.

Abstract:

Studying deforestation has been an important topic in forestry research. In particular, canopy classification using remotely sensed data plays an essential role in monitoring tree canopy on a large scale. As remote sensing technologies advance, the quality and resolution of satellite imagery have significantly improved. Oftentimes, leveraging high-resolution imagery such as the National Agriculture Imagery Program (NAIP) imagery requires proprietary software. However, the lack of insight into the inner workings of such software and the inability to modify its code lead many researchers towards open-source solutions. In this research, we introduce CanoClass, an open-source cross-platform canopy classification system written in Python. CanoClass utilizes the Random Forest and Extra Trees algorithms provided by scikit-learn to classify canopy using remote sensing imagery. Based on our benchmark tests, this new canopy classification system was 283 % to 464 % faster than the commercial Feature Analyst, while producing comparable results with a similarity of 87.56 % to 87.62 %.

37

Urbanski, Konrad, and Dariusz Janiszewski. "Position Estimation at Zero Speed for PMSMs Using Artificial Neural Networks." Energies 14, no. 23 (December 4, 2021): 8134. http://dx.doi.org/10.3390/en14238134.

Abstract:

This paper presents a method for shaft position estimation of a synchronous motor with permanent magnets. Zero speed and very low speed range are considered. The method uses the analysis of high-frequency currents induced by the introduction of additional voltage in the control path in the stationary coordinate system associated with the stator. An artificial neural network estimates the sine and cosine values necessary in the Park’s transformation units. This method can achieve satisfactory accuracy in the case of low asymmetry of inductance in the direct and quadrature axes of the coordinate system associated with the rotor. The TensorFlow/Keras package was used for artificial network calculations and the scikit-learn package for preprocessing. Aggregating the outputs of several artificial neural networks provides an opportunity to reduce the resultant estimation error. The use of as few as four networks has enabled the error to be reduced by approximately 20% compared to a single example network.
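
The idea of aggregating several networks' sine and cosine outputs can be illustrated with a few lines of plain Python. This is a sketch under loud assumptions: the abstract does not say how the outputs are aggregated, so simple averaging is assumed here, and the per-network errors are made-up numbers rather than measured ones.

```python
import math


def aggregate_angle(sin_estimates, cos_estimates):
    """Average per-network sine/cosine estimates and recover the angle in radians."""
    mean_sin = sum(sin_estimates) / len(sin_estimates)
    mean_cos = sum(cos_estimates) / len(cos_estimates)
    return math.atan2(mean_sin, mean_cos)


true_angle = math.radians(40.0)

# Hypothetical outputs of four networks: the true angle plus small, made-up errors.
errors = [0.05, -0.03, 0.02, -0.01]
sin_hat = [math.sin(true_angle + e) for e in errors]
cos_hat = [math.cos(true_angle + e) for e in errors]

combined = aggregate_angle(sin_hat, cos_hat)
single_error = abs(errors[0])
combined_error = abs(combined - true_angle)

print(f"single-network error: {single_error:.4f} rad")
print(f"aggregated error:     {combined_error:.4f} rad")
```

Averaging helps here because the made-up errors partly cancel; errors that all share the same sign would not be reduced this way.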

38

Saxena, Kavita. "Face Mask Detection Using Machine Learning." International Journal for Research in Applied Science and Engineering Technology 9, no. 12 (December 31, 2021): 817–22. http://dx.doi.org/10.22214/ijraset.2021.39262.

Abstract:

The COVID-19 epidemic has affected our daily life, disturbing world trade and transport. Wearing a face mask has become a new necessity for safety. In the near future, many institutions will ask customers to wear masks to use their services. Therefore, face mask detection has become a necessity to help society. This paper presents a simplified approach to achieve this purpose using packages such as TensorFlow, Keras, OpenCV and Scikit-Learn. The method detects the face in the image frame and then identifies whether it is wearing a mask. As in a surveillance task, it can also detect a face along with a mask in motion through image processing. The method attains accuracy of up to 93% and 91.2%, respectively, on two datasets. We explore optimized values of parameters using the Sequential CNN (Convolutional Neural Network) model to detect the presence of masks correctly. Keywords: Face Mask Detection, Convolutional Neural Network, TensorFlow, Keras, Image Processing

39

Shatrovskii, I., and A. Zolotukhin. "Computer technologies to determine offshore facilities suitable for the climatic conditions." IOP Conference Series: Materials Science and Engineering 1201, no. 1 (November 1, 2021): 012068. http://dx.doi.org/10.1088/1757-899x/1201/1/012068.

Abstract:

This article provides an overview of computer technologies and an analysis of how they can be applied to determine the quantitative and qualitative composition of facilities for the development of offshore oil and gas fields under given climatic conditions. The first question covered in the article is how to form a data set that includes information about offshore field development conditions. The author gives examples of using the Python programming language and the NumPy, Tesseract and OpenCV libraries for that purpose. For each designated variable, test program code was written to obtain the required value. The second question addressed in the article is how to analyze the obtained results and determine which offshore facilities are best suited for certain climatic conditions. For this purpose, a deep feedforward neural network was created with the help of the Keras, TensorFlow and Scikit-learn Python libraries. The network performance was tested on a small sample, the results were analyzed, and future research tasks were defined.

40

Silva, Hugo, and Jorge Bernardino. "Machine Learning Algorithms: An Experimental Evaluation for Decision Support Systems." Algorithms 15, no. 4 (April 15, 2022): 130. http://dx.doi.org/10.3390/a15040130.

Abstract:

Decision support systems with machine learning can help organizations improve operations and lower costs with more precision and efficiency. This work presents a review of state-of-the-art machine learning algorithms for binary classification and compares their related metrics when applied to public diabetes and human resources datasets. The two main categories that allow the learning process without requiring explicit programming are supervised and unsupervised learning. For that, we use Scikit-learn, the free software machine learning library for the Python language. The best-performing algorithm was Random Forest for supervised learning, while among unsupervised clustering techniques, the Balanced Iterative Reducing and Clustering Using Hierarchies and Spectral Clustering algorithms presented the best results. The experimental evaluation shows that the application of unsupervised clustering algorithms does not translate into better results than with supervised algorithms. However, applying unsupervised clustering algorithms as a preprocessing step for the supervised techniques can translate into a boost in performance.

41

Vinayakumar, R., K. P. Soman, and Prabaharan Poornachandran. "Evaluation of Recurrent Neural Network and its Variants for Intrusion Detection System (IDS)." International Journal of Information System Modeling and Design 8, no. 3 (July 2017): 43–63. http://dx.doi.org/10.4018/ijismd.2017070103.

Abstract:

This article describes how sequential data modeling is a relevant task in cybersecurity. Sequences are attributed temporal characteristics either explicitly or implicitly. Recurrent neural networks (RNNs) are a subset of artificial neural networks (ANNs) which have appeared as a powerful, principled approach to learning dynamic temporal behaviors in large-scale sequence data of arbitrary length. Furthermore, stacked recurrent neural networks (S-RNNs) have the potential to learn complex temporal behaviors quickly, including sparse representations. To leverage this, the authors model network traffic as a time series, particularly transmission control protocol / internet protocol (TCP/IP) packets in a predefined time range, with a supervised learning method, using millions of known good and bad network connections. To find the best architecture, the authors complete a comprehensive review of various RNN architectures with their network parameters and network structures. As a test bed, they use the existing benchmark Defense Advanced Research Projects Agency (DARPA) / Knowledge Discovery and Data Mining (KDD) Cup '99 intrusion detection (ID) contest data set to show the efficacy of these various RNN architectures. All the experiments with deep learning architectures are run for up to 1000 epochs with a learning rate in the range [0.01-0.5] on a GPU-enabled TensorFlow, and the experiments with traditional machine learning algorithms are done using Scikit-learn. The families of RNN architectures achieved a low false positive rate in comparison to the traditional machine learning classifiers. The primary reason is that RNN architectures are able to store information for long-term dependencies over time lags and to adjust to successive connection sequence information. In addition, the effectiveness of RNN architectures is shown for the UNSW-NB15 data set.

42

Kakkar, D., and B. Lewis. "BUILDING A BILLION SPATIO-TEMPORAL OBJECT SEARCH AND VISUALIZATION PLATFORM." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-4/W2 (October 19, 2017): 97–100. http://dx.doi.org/10.5194/isprs-annals-iv-4-w2-97-2017.

Abstract:

With funding from the Sloan Foundation and Harvard Dataverse, the Harvard Center for Geographic Analysis (CGA) has developed a prototype spatio-temporal visualization platform called the Billion Object Platform or BOP. The goal of the project is to lower barriers for scholars who wish to access large, streaming, spatio-temporal datasets. The BOP is now loaded with the latest billion geo-tweets, and is fed a real-time stream of about 1 million tweets per day. The geo-tweets are enriched with sentiment and census/admin boundary codes when they enter the system. The system is open source and is currently hosted on Massachusetts Open Cloud (MOC), an OpenStack environment with all components deployed in Docker orchestrated by Kontena. This paper will provide an overview of the BOP architecture, which is built on an open source stack consisting of Apache Lucene, Solr, Kafka, Zookeeper, Swagger, scikit-learn, OpenLayers, and AngularJS. The paper will further discuss the approach used for harvesting, enriching, streaming, storing, indexing, visualizing and querying a billion streaming geo-tweets.

43

Sujeeun, Lakshmi Y., Nowsheen Goonoo, Honita Ramphul, Itisha Chummun, Fanny Gimié, Shakuntala Baichoo, and Archana Bhaw-Luximon. "Correlating in vitro performance with physico-chemical characteristics of nanofibrous scaffolds for skin tissue engineering using supervised machine learning algorithms." Royal Society Open Science 7, no. 12 (December 2020): 201293. http://dx.doi.org/10.1098/rsos.201293.

Abstract:

The engineering of polymeric scaffolds for tissue regeneration has seen phenomenal growth during the past decades as materials scientists seek to understand cell biology and cell–material behaviour. Statistical methods are being applied to physico-chemical properties of polymeric scaffolds for tissue engineering (TE) to guide researchers through the complexity of experimental conditions. We have attempted to model scaffold performance with a machine learning (ML) approach, using experimental in vitro data and physico-chemical data of electrospun polymeric scaffolds tested for skin TE. Fibre diameter, pore diameter, water contact angle and Young's modulus were used to find a correlation with the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay of L929 fibroblast cells on the scaffolds after 7 days. Six supervised learning algorithms were trained on the data using the Seaborn/Scikit-learn Python libraries. After hyperparameter tuning, random forest regression yielded the highest accuracy of 62.74%. The predictive model was also correlated with in vivo data. This is a first preliminary study on ML methods for the prediction of cell–material interactions on nanofibrous scaffolds.

44

Sulaiman, S. M., P. Aruna Jeyanthy, and D. Devaraj. "Smart Meter Data Analysis Using Big Data Tools." Journal of Computational and Theoretical Nanoscience 16, no. 8 (August 1, 2019): 3629–36. http://dx.doi.org/10.1166/jctn.2019.8338.

Abstract:

In recent years, the problem of electrical load forecasting has gained attention due to the arrival of new measurement technologies that produce electrical energy consumption data at very short intervals. Such short-term measurements become voluminous in a very short time. The availability of big electrical consumption data allows machine learning techniques to be employed to analyze the consumption behavior of every consumer in greater detail. Predicting the consumption of a residential customer is crucial at this point in time because tailor-made, consumer-specific tariffs will play a vital role in the load balancing process of utilities. This paper analyzes the electrical consumption of a single residential customer measured using a smart meter that is capable of measuring electrical consumption at circuit level. The issues and challenges in collecting the data and the pre-processing required to make them suitable for data analytics are discussed in detail. A comparison of the performance of different machine learning algorithms implemented using Python’s Scikit-learn module gives insight into the consumption pattern.

45

Nofriani, Nofriani. "Machine Learning Application for Classification Prediction of Household’s Welfare Status." JITCE (Journal of Information Technology and Computer Engineering) 4, no. 02 (September 30, 2020): 72–82. http://dx.doi.org/10.25077/jitce.4.02.72-82.2020.

Abstract:

Various approaches have been attempted by the Government of Indonesia to eradicate poverty throughout the country, one of which is the equitable distribution of social assistance to target households according to their classification of social welfare status. This research aims to re-evaluate the prior evaluation of five well-known machine learning techniques (Naïve Bayes, Random Forest, Support Vector Machines, K-Nearest Neighbor, and the C4.5 algorithm) on how well they predict the classifications of social welfare statuses. Afterwards, the best-performing one is implemented in an executable machine learning application that can predict the user’s social welfare status. Other objectives are to analyze the reliability of the chosen algorithm in predicting a new data set, and to generate a simple classification-prediction application. This research uses the Python programming language, the Scikit-Learn library, Jupyter Notebook, and PyInstaller to perform all the methodology processes. The results show that the Random Forest algorithm is the best machine learning technique for predicting a household’s social welfare status, with a classification accuracy of 74.20%, and the resulting application based on it correctly predicted the social welfare status for 60.00% of 40 user entries.

46

Susilo, Bambang, and Riri Fitri Sari. "Intrusion Detection in IoT Networks Using Deep Learning Algorithm." Information 11, no. 5 (May 21, 2020): 279. http://dx.doi.org/10.3390/info11050279.

Abstract:

The internet has become an inseparable part of human life, and the number of devices connected to the internet is increasing sharply. In particular, Internet of Things (IoT) devices have become a part of everyday human life. However, some challenges are increasing, and their solutions are not well defined. More and more challenges related to technology security concerning the IoT are arising. Many methods have been developed to secure IoT networks, but many more can still be developed. One proposed way to improve IoT security is to use machine learning. This research discusses several machine-learning and deep-learning strategies, as well as standard datasets for improving the security performance of the IoT. We developed an algorithm for detecting denial-of-service (DoS) attacks using a deep-learning algorithm. This research used the Python programming language with packages such as scikit-learn, TensorFlow, and Seaborn. We found that a deep-learning model could increase accuracy so that the mitigation of attacks that occur on an IoT network is as effective as possible.

47

Alliluev, A. S., O. V. Filinyuk, E. E. Shnayder, and S. V. Aksenov. "Machine Learning for Prediction of Relapses in Multiple Drug Resistant Tuberculosis Patients." Tuberculosis and Lung Diseases 99, no. 11 (November 27, 2021): 27–34. http://dx.doi.org/10.21292/2075-1230-2021-99-11-27-34.

Abstract:

The objective of the study: to evaluate the possibility of using machine learning algorithms for prediction of relapses in multiple drug resistant tuberculosis (MDR TB) patients. Subjects and Methods. Clinical, epidemiological, gender, sex, social, biomedical parameters and chemotherapy parameters were analyzed in 346 cured MDR TB patients. The tools of the scikit-learn library, Version 0.24.2, in the Google Colaboratory interactive cloud environment were used to build forecasting models. Results. Analysis of the characteristics of relapse prediction models in cured MDR TB patients using machine learning algorithms, including decision tree, random forest, gradient boosting, and logistic regression, with stratified k-fold cross-validation revealed high sensitivity (0.74 ± 0.167; 0.91 ± 0.17; 0.91 ± 0.14; 0.91 ± 0.16, respectively) and specificity (0.97 ± 0.03; 0.98 ± 0.02; 0.98 ± 0.02; 0.98 ± 0.02, respectively). Five main predictors of relapse in cured MDR TB patients were identified: repeated courses of chemotherapy; length of history of tuberculosis; destructive process in the lungs; total duration of treatment less than 22 months; and use of fewer than five effective anti-TB drugs in the chemotherapy regimen.
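
The stratified validation and the "mean ± spread" reporting of sensitivity and specificity described above can be sketched as follows. This is an illustration only, assuming synthetic imbalanced data rather than patient records, five folds, the standard deviation as the spread, and logistic regression as the single model shown; none of these details are confirmed by the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold

# Synthetic, imbalanced binary data (hypothetical; not the patient records).
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.8, 0.2], random_state=0
)

# Stratification keeps the class ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

sensitivity, specificity = [], []
for train_idx, test_idx in cv.split(X, y):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    y_pred = model.predict(X[test_idx])
    tn, fp, fn, tp = confusion_matrix(y[test_idx], y_pred, labels=[0, 1]).ravel()
    sensitivity.append(tp / (tp + fn))  # true positive rate
    specificity.append(tn / (tn + fp))  # true negative rate

print(f"sensitivity: {np.mean(sensitivity):.2f} ± {np.std(sensitivity):.2f}")
print(f"specificity: {np.mean(specificity):.2f} ± {np.std(specificity):.2f}")
```

The same loop works unchanged for `DecisionTreeClassifier`, `RandomForestClassifier`, or `GradientBoostingClassifier` in place of the logistic regression.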

48

Madhavi, Onkar, Shivani Khente, Sumit Kolipyaka, and Pallavi Chandratre. "Automated Social Distancing and Face Mask Detection System." International Journal for Research in Applied Science and Engineering Technology 10, no. 4 (April 30, 2022): 1372–76. http://dx.doi.org/10.22214/ijraset.2022.41530.

Abstract:

The COVID-19 epidemic has rapidly affected our day-to-day life, disrupting world trade and movement. Wearing a protective face mask has become the new normal. In the near future, many public service providers will ask customers to wear masks correctly in order to use their services. Face mask recognition has therefore become a crucial task to help global society. This paper presents a simplified approach to achieve this purpose using basic machine learning packages such as TensorFlow, Keras, OpenCV and Scikit-Learn. The proposed system detects the face in the image and then identifies whether it has a mask on it. As a surveillance tool, it can also detect a face along with a mask in motion. The system attains precision of up to 95.77 and 94.58, respectively, on two different datasets. We explore optimized values of parameters using the Convolutional Neural Network model to detect the presence of masks correctly without causing over-fitting. Keywords: Coronavirus, Covid-19, Machine Learning, Face Mask Recognition, Convolutional Neural Network, TensorFlow

49

Hajirahimova, Makrufa, and Marziya Ismayilova. "Machine learning-based sentiment analysis of Twitter data." Problems of Information Society 13, no. 1 (January 25, 2022): 52–60. http://dx.doi.org/10.25045/jpis.v13.i1.07.

Abstract:

The paper analyzes the views of Twitter users on the COVID-19 coronavirus pandemic using machine learning algorithms. The role of sentiment analysis has increased with the advent of the social network era and the rapid spread of microblogging applications and forums. Social networks are the main sources for gathering information about users’ thoughts on various themes. People spend more time on social media to share their thoughts with others. One of the themes discussed on the social networking platform Twitter is the COVID-19 coronavirus pandemic. In the paper, machine learning methods such as Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), and Neural Network (NN) are used to analyze the emotional “color” (positive, negative, and neutral) of tweets related to the COVID-19 coronavirus pandemic. The experiments are conducted in Python using the scikit-learn library. A tweet database related to the COVID-19 coronavirus pandemic from the Kaggle website is used for the experiments. The RF classifier shows the highest performance in the experiments.

50

Dagogo-George, Tamunopriye Ene, Hammed Adeleye Mojeed, Abdulateef Oluwagbemiga Balogun, Modinat Abolore Mabayoje, and Shakirat Aderonke Salihu. "Tree-based homogeneous ensemble model with feature selection for diabetic retinopathy prediction." Jurnal Teknologi dan Sistem Komputer 8, no. 4 (October 13, 2020): 297–303. http://dx.doi.org/10.14710/jtsiskom.2020.13669.

Abstract:

Diabetic Retinopathy (DR) is a condition that emerges from prolonged diabetes, causing severe damage to the eyes. Early diagnosis of this disease is highly imperative, as late diagnosis may be fatal. Existing studies employed machine learning approaches, with Support Vector Machines (SVM) having the highest performance in most analyses and Decision Trees (DT) having the lowest. However, SVM has been known to suffer from parameter and kernel selection problems, which undermine its predictive capability. Hence, this study presents homogeneous ensemble classification methods with DT as the base classifier to optimize predictive performance. Boosting and Bagging ensemble methods with feature selection were employed, and experiments were carried out using Python Scikit Learn libraries on DR datasets extracted from the UCI Machine Learning repository. Experimental results showed that Bagged and Boosted DT were better than SVM. Specifically, Bagged DT performed best with accuracy 65.38 %, f-score 0.664, and AUC 0.731, followed by Boosted DT with accuracy 65.42 %, f-score 0.655, and AUC 0.724, when compared to SVM (accuracy 65.16 %, f-score 0.652, and AUC 0.721). These results indicate that DT's predictive performance can be optimized by employing the homogeneous ensemble methods to outperform SVM in predicting DR.
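
Homogeneous ensembles with a decision tree as the base classifier, as described in this abstract, might be set up in scikit-learn along these lines. This is a minimal sketch with several assumptions: the data are synthetic rather than the UCI DR data, the ensemble sizes and tree depths are arbitrary, AdaBoost stands in for the unspecified boosting method, and the feature selection step is left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (hypothetical; not the UCI DR data).
X, y = make_classification(
    n_samples=500, n_features=19, n_informative=8, random_state=0
)

# The base tree is passed positionally so the sketch does not depend on the
# keyword name, which changed between scikit-learn releases.
models = {
    "single_tree": DecisionTreeClassifier(random_state=0),
    "bagged_trees": BaggingClassifier(
        DecisionTreeClassifier(random_state=0), n_estimators=50, random_state=0
    ),
    "boosted_trees": AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=1, random_state=0),
        n_estimators=50,
        random_state=0,
    ),
}

# Mean 5-fold cross-validated accuracy for each model.
results = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, acc in results.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

Bagging trains full-depth trees on bootstrap samples, while this boosting variant reweights the data for a sequence of depth-1 trees; both are "homogeneous" in that every ensemble member is a decision tree.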
