Academic literature on the topic 'Scikit-learn'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Scikit-learn.'

Journal articles on the topic "Scikit-learn"

1

Varoquaux, G., L. Buitinck, G. Louppe, O. Grisel, F. Pedregosa, and A. Mueller. "Scikit-learn." GetMobile: Mobile Computing and Communications 19, no. 1 (June 2015): 29–33. http://dx.doi.org/10.1145/2786984.2786995.

2

Kashikar, Sudhnya, Sumedha Patil, Ameya Vedantwar, Shivani Katpatal, and Sofia Pillai. "Weather Prediction using Scikit-Learn." International Journal of Computer Sciences and Engineering 7, no. 4 (April 30, 2019): 36–40. http://dx.doi.org/10.26438/ijcse/v7i4.3640.

3

Bengfort, Benjamin, and Rebecca Bilbro. "Yellowbrick: Visualizing the Scikit-Learn Model Selection Process." Journal of Open Source Software 4, no. 35 (March 24, 2019): 1075. http://dx.doi.org/10.21105/joss.01075.

4

Hao, Jiangang, and Tin Kam Ho. "Machine Learning Made Easy: A Review of Scikit-learn Package in Python Programming Language." Journal of Educational and Behavioral Statistics 44, no. 3 (February 20, 2019): 348–61. http://dx.doi.org/10.3102/1076998619832248.

Abstract:
Machine learning is a popular topic in data analysis and modeling. Many different machine learning algorithms have been developed and implemented in a variety of programming languages over the past 20 years. In this article, we first provide an overview of machine learning and clarify its difference from statistical inference. Then, we review Scikit-learn, a machine learning package in the Python programming language that is widely used in data science. The Scikit-learn package includes implementations of a comprehensive list of machine learning methods under unified data and modeling procedure conventions, making it a convenient toolkit for educational and behavior statisticians.
5

Kalimuthu, Sathyavikasini, and Vijaya Vijayakumar. "Shallow learning model for diagnosing neuro muscular disorder from splicing variants." World Journal of Engineering 14, no. 4 (August 7, 2017): 329–36. http://dx.doi.org/10.1108/wje-09-2016-0075.

Abstract:
Purpose: Diagnosing a genetic neuromuscular disorder such as muscular dystrophy is complicated when the defect occurs during splicing. This paper aims to predict the type of muscular dystrophy from gene sequences by extracting well-defined descriptors related to splicing mutations. An automatic model is built to classify the disease through pattern recognition techniques coded in Python using the scikit-learn framework.
Design/methodology/approach: In this paper, the cloned gene sequences are synthesized based on the mutation position and its location on the chromosome by using the positional cloning approach. For instance, in the Human Gene Mutation Database (HGMD), the mutational information for a splicing mutation is specified in a form such as IVS1-5 T > G, which indicates the first intron (IVS: intervening sequence, or intron) and five nucleotides before the consensus intron site AG, where the variant is nucleotide T altered to G. IVS (+ve) denotes the forward strand 3′, with positive numbers starting from the G of the invariant donor site, and IVS (−ve) denotes the backward strand 5′, with negative numbers starting from the G of the acceptor site. The key idea in this paper is to identify discriminative descriptors from diseased gene sequences based on splicing variants and to provide an effective machine learning solution for predicting the type of muscular dystrophy from splicing mutations. Multi-class classification is carried out through data modeling of gene sequences. Synthetic mutational gene sequences are created, as diseased gene sequences are not readily obtainable for this intricate disease. The positional cloning approach supports generating diseased gene sequences based on mutational information acquired from HGMD. SNP-, gene- and exon-based discriminative features are identified and used to train the model. An effective muscular dystrophy prediction model is built using supervised learning techniques in the scikit-learn environment. The data frame is built with the extracted features as a NumPy array. The data are normalized by transforming the feature values into the range between 0 and 1, which scales the input attributes for a model. Naïve Bayes, decision tree, K-nearest neighbor and SVM models are developed using the scikit-learn Python library.
Findings: To the best of the authors' knowledge, this is the first pattern recognition model to classify muscular dystrophy with respect to splicing mutations. Certain essential SNP-, gene- and exon-based descriptors related to splicing mutations are proposed and extracted from the cloned gene sequences. An effective model is built using statistical learning techniques through scikit-learn in the Anaconda framework. This paper also discusses the results of statistical learning carried out on the same set of gene sequences with synonymous and non-synonymous mutational descriptors.
Research limitations/implications: The data frame is built with a NumPy array. Normalizing the data by transforming the feature values into the range between 0 and 1 scales the input attributes for a model. Naïve Bayes, decision tree, K-nearest neighbor and SVM models are developed using the scikit-learn Python library. While training the SVM model, the cost, gamma and kernel parameters are tuned to attain good results. The classifiers are scored with tenfold cross-validation using the metric functions of the scikit-learn library. Results of the disease identification model based on non-synonymous, synonymous and splicing mutations were analyzed.
Practical implications: Certain essential SNP-, gene- and exon-based descriptors related to splicing mutations are proposed and extracted from the cloned gene sequences. An effective model is built using statistical learning techniques through scikit-learn in the Anaconda framework. The performance of the classifiers is increased by using different estimators from the scikit-learn library. Several types of mutations, such as missense, nonsense and silent mutations, are also considered to build models through statistical learning techniques, and their results are analyzed.
Originality/value: To the best of the authors' knowledge, this is the first pattern recognition model to classify muscular dystrophy with respect to splicing mutations.
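The two generic steps this abstract names, scaling feature values into the range 0 to 1 and tenfold cross-validation, can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the function names `min_max_scale` and `k_fold_indices` are hypothetical, and contiguous folds are an assumption (the abstract does not say how folds were drawn). scikit-learn provides comparable utilities such as `MinMaxScaler` and `KFold`.

```python
from typing import List, Sequence, Tuple


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Rescale one feature column linearly into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return (train, test) index lists for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds
```

In practice the scaling bounds would usually be computed on the training folds only and then applied to the held-out fold, so that no information from the test data leaks into preprocessing.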
6

Bhattacharya, Sounak, and Ankit Lundia. "MOVIE RECOMMENDATION SYSTEM USING BAG OF WORDS AND SCIKIT-LEARN." International Journal of Engineering Applied Sciences and Technology 04, no. 05 (October 1, 2019): 526–28. http://dx.doi.org/10.33564/ijeast.2019.v04i05.076.

7

Kravchenko, S. N., E. O. Grishkun, and O. V. Vlasenko. "CLASSIFICATION METHODS FOR MACHINE LEARNING USING THE SCIKIT-LEARN LIBRARY." Scientific notes of Taurida National V.I. Vernadsky University. Series: Technical Sciences 1, no. 3 (2020): 121–25. http://dx.doi.org/10.32838/tnu-2663-5941/2020.3-1/19.

8

Beckner, Wesley, Coco M. Mao, and Jim Pfaendtner. "Statistical models are able to predict ionic liquid viscosity across a wide range of chemical functionalities and experimental conditions." Molecular Systems Design & Engineering 3, no. 1 (2018): 253–63. http://dx.doi.org/10.1039/c7me00094d.

Abstract:
Herein we present a method of developing predictive models of viscosity for ionic liquids (ILs) using publicly available data in the ILThermo database and the open-source software toolkits PyChem, RDKit, and SciKit-Learn.
9

Castejón-Limas, Manuel, Laura Fernández-Robles, Héctor Alaiz-Moretón, Jaime Cifuentes-Rodriguez, and Camino Fernández-Llamas. "A Framework for the Optimization of Complex Cyber-Physical Systems via Directed Acyclic Graph." Sensors 22, no. 4 (February 15, 2022): 1490. http://dx.doi.org/10.3390/s22041490.

Abstract:
Mathematical modeling and data-driven methodologies are frequently required to optimize industrial processes in the context of Cyber-Physical Systems (CPS). This paper introduces the PipeGraph software library, an open-source Python toolbox for easing the creation of machine learning models by using Directed Acyclic Graph (DAG)-like implementations that can be used for CPS. scikit-learn’s Pipeline is a very useful tool to bind a sequence of transformers and a final estimator in a single unit that can itself work as an estimator. It sequentially assembles several steps that can be cross-validated together while setting different parameters. Encapsulating the steps protects the experiment from data leakage during the training phase. The scientific goal of PipeGraph is to extend the concept of Pipeline by using a graph structure that can handle scikit-learn’s objects in DAG layouts. It allows performing diverse operations, instead of only transformations, following the topological ordering of the steps in the graph; it provides access to all the data generated along the intermediate steps; and it is compatible with GridSearchCV for tuning the hyperparameters of the steps. It is also not limited to (X, y) entries. Moreover, it has been proposed as part of the scikit-learn-contrib supported project, and is fully compatible with scikit-learn. Documentation and unit tests are publicly available together with the source code. Two case studies are analyzed in which PipeGraph proves to be essential in improving CPS modeling and optimization: the first is about the optimization of a heat exchange management system, and the second deals with the detection of anomalies in manufacturing processes.
10

Bac, Jonathan, Evgeny M. Mirkes, Alexander N. Gorban, Ivan Tyukin, and Andrei Zinovyev. "Scikit-Dimension: A Python Package for Intrinsic Dimension Estimation." Entropy 23, no. 10 (October 19, 2021): 1368. http://dx.doi.org/10.3390/e23101368.

Abstract:
Dealing with uncertainty in applications of machine learning to real-life data critically depends on the knowledge of intrinsic dimensionality (ID). A number of methods have been suggested for the purpose of estimating ID, but no standard package to easily apply them one by one or all at once has been implemented in Python. This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension estimation. The scikit-dimension package provides a uniform implementation of most of the known ID estimators based on the scikit-learn application programming interface to evaluate the global and local intrinsic dimension, as well as generators of synthetic toy and benchmark datasets widespread in the literature. The package is developed with tools assessing the code quality, coverage, unit testing and continuous integration. We briefly describe the package and demonstrate its use in a large-scale (more than 500 datasets) benchmarking of methods for ID estimation for real-life and synthetic data.

Dissertations / Theses on the topic "Scikit-learn"

1

Кулініч, Маргарита Миколаївна, Маргарита Николаевна Кулинич, and Marharyta Mykolaivna Kulinich. "Дослідження та розробка інтелектуальних систем керування проектами." Магістерська робота, ЗДІА, 2018. https://dspace.znu.edu.ua/jspui/handle/12345/357.

Abstract:
Kulinich, M. M. Research and Development of Intelligent Project Management Systems [Electronic resource]: thesis for the qualification degree of Master; specialty 121, Software Engineering / M. M. Kulinich; ЗДІА; academic supervisor V. H. Verbytskyi. Zaporizhzhia, 2018. 114 p.
EN : Objective: To research methods for the automatic allocation of tasks and to create an automated system that distributes project tasks among performers so that project implementation time is allocated optimally. Results: Methods for the automatic allocation of tasks and the problems of modern project management systems were investigated, and the Python programming language was chosen. The Django framework, used as a frontend framework, and various machine learning methods were used for development. The principles of operation and the capabilities of the chosen technologies were investigated. The result of the work is a software product that makes it possible to distribute projects and tasks among performers automatically.
2

Nguyen, John, and Kasper Lindén. "Creating a Back Stock to Increase Order Delivery and Pickup Availability." Thesis, KTH, Hälsoinformatik och logistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-252798.

Abstract:
Apotek Hjärtat wants to keep developing its e-commerce website and improve the retrieval and delivery of orders to customers. Click and Collect and Click and Express are two options for retrieving e-commerce orders that are available if all products in the order are present in the store. By implementing a back stock of popular e-commerce items in the stores, all products of an order will more often be present in the store. The back stock will in this way increase the availability of Click and Collect and Click and Express. The goals of the study are to conduct a pilot study, to compare methods and possible solutions, and to implement a model that reaches these goals. The pilot study was made by studying previous work on mathematical statistics methods and machine learning methods. The statistical method was carried out with the analytical tool Statistical Package for the Social Sciences (SPSS) and Java. The machine learning method was carried out with Python and the Scikit-learn library, using a regression algorithm to find relations between category sales and pollen forecasts. The statistical and machine learning methods were compared with each other. Both gave identical results, but the machine learning method was more functional and easier to develop further, and was consequently chosen. Several models were created for a few selected product categories. The categories for which the models did not work produced unrealistic amounts of sold products: the predicted amounts could be negative or extremely high when unknown inputs were introduced. A simulation of the back stock was made to estimate how it would increase the availability of Click and Collect and Click and Express. The machine learning models would need more data for more accurate predictions. It could nevertheless be concluded that it is possible to predict the amount of sold products in certain categories, such as Allergy and Child Medicine, when pollen levels are taken into account.
3

Paulavets, Anastasiya. "Návrh systému pro doporučování pracovních příležitostí." Master's thesis, Vysoká škola ekonomická v Praze, 2014. http://www.nusl.cz/ntk/nusl-193343.

Abstract:
This thesis deals with recommender systems in the field of e-recruitment. The main objective is to design a job recommender system for career portal UNIjobs.cz. First, the theoretical background of recommender systems is provided. In the following part, specific properties of job recommender systems are discussed, as well as existing approaches to recommendation in the e-recruitment environment. The last part of the thesis is dedicated to designing a recommender system for career portal UNIjobs.cz. The output of that part is the main contribution of the thesis.
4

Panopoulos, Vasileios. "Near Real-time Detection of Masquerade attacks in Web applications : catching imposters using their browsing behavor." Thesis, KTH, Kommunikationsnät, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-183777.

Abstract:
This Thesis details the research on Machine Learning techniques that are central in performing Anomaly and Masquerade attack detection. The main focus is put on Web Applications because of their immense popularity and ubiquity. This popularity has led to an increase in attacks, making them the most targeted entry point to violate a system. Specifically, a group of attacks that range from identity theft using social engineering to cross site scripting attacks, aim at exploiting and masquerading users. Masquerading attacks are even harder to detect due to their resemblance with normal sessions, thus posing an additional burden. Concerning prevention, the diversity and complexity of those systems makes it harder to define reliable protection mechanisms. Additionally, new and emerging attack patterns make manually configured and Signature based systems less effective with the need to continuously update them with new rules and signatures. This leads to a situation where they eventually become obsolete if left unmanaged. Finally the huge amount of traffic makes manual inspection of attacks and False alarms an impossible task. To tackle those issues, Anomaly Detection systems are proposed using powerful and proven Machine Learning algorithms. Gravitating around the context of Anomaly Detection and Machine Learning, this Thesis initially defines several basic definitions such as user behavior, normality and normal and anomalous behavior. Those definitions aim at setting the context in which the proposed method is targeted and at defining the theoretical premises. To ease the transition into the implementation phase, the underlying methodology is also explained in detail. Naturally, the implementation is also presented, where, starting from server logs, a method is described on how to pre-process the data into a form suitable for classification. 
This preprocessing phase was constructed from several statistical analyses and normalization methods (Univariate Selection, ANOVA) to clean and transform the given logs and perform feature selection. Furthermore, given that the proposed detection method is based on the source and request URLs, a method of aggregation is proposed to limit the user privacy and classifier over-fitting issues. Subsequently, two popular classification algorithms (Multinomial Naive Bayes and Support Vector Machines) have been tested and compared to determine which one performs better in the given situations. Each of the implementation steps (pre-processing and classification) requires a number of different parameters to be set, and thus a method called Hyper-parameter optimization is defined. This method searches for the parameters that improve the classification results. Moreover, the training and testing methodology is also outlined alongside the experimental setup. The Hyper-parameter optimization and the training phases are the most computationally intensive steps, especially given a large number of samples/users. To overcome this obstacle, a scaling methodology is also defined and evaluated to demonstrate its ability to handle larger data sets. To complete this framework, several other options have also been evaluated and compared to each other to challenge the method and implementation decisions. Examples of this are the "Transitions-vs-Pages" dilemma, the block restriction effect, the DR usefulness and the classification parameters optimization. Moreover, a Survivability Analysis is performed to demonstrate how the produced alarms could be correlated, affecting the resulting detection rates and interval times. The implementation of the proposed detection method and outlined experimental setup lead to interesting results. In addition, the data-set that has been used to produce this evaluation is provided online to promote further investigation and research in this field.
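The abstract describes hyper-parameter optimization only as a search for the parameters that improve classification results. One common form of such a search is an exhaustive grid search, sketched below in plain Python. This is an assumption about the general technique, not the thesis's actual procedure: `grid_search` and `toy_score` are hypothetical names, and `toy_score` merely stands in for a cross-validated classifier score.

```python
from itertools import product
from typing import Any, Callable, Dict, Iterable, Mapping, Tuple


def grid_search(param_grid: Mapping[str, Iterable[Any]],
                score: Callable[..., float]) -> Tuple[Dict[str, Any], float]:
    """Try every parameter combination and return the best one with its score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(**params)
        if current > best_score:
            best_params, best_score = params, current
    if best_params is None:
        raise ValueError("param_grid produced no combinations")
    return best_params, best_score


def toy_score(alpha: float, n_features: int) -> float:
    """Hypothetical stand-in for a cross-validated accuracy (higher is better)."""
    return -(alpha - 0.5) ** 2 - (n_features - 20) ** 2 / 1000.0


best, value = grid_search(
    {"alpha": [0.1, 0.5, 1.0], "n_features": [10, 20, 40]},
    toy_score,
)
```

The cost grows with the product of the grid sizes, which is consistent with the abstract's remark that this phase is among the most computationally intensive steps.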
5

Gustavsson, Vilhelm. "Machine Learning for a Network-based Intrusion Detection System : An application using Zeek and the CICIDS2017 dataset." Thesis, KTH, Hälsoinformatik och logistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-253273.

Abstract:
Cyber security is an emerging field in the IT sector. As more devices are connected to the internet, the attack surface for hackers is steadily increasing. Network-based Intrusion Detection Systems (NIDS) can be used to detect malicious traffic in networks, and Machine Learning is an up-and-coming approach for improving the detection rate. In this thesis the NIDS Zeek is used to extract features based on time and data size from network traffic. The features are then analyzed with Machine Learning in Scikit-Learn in order to detect malicious traffic. A 98.58% Bayesian detection rate was achieved for the CICIDS2017 dataset, which is about the same level as the results from previous works on CICIDS2017 (without Zeek). The best performing algorithms were K-Nearest Neighbors, Random Forest and Decision Tree.
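The abstract does not define "Bayesian detection rate". A definition commonly used in the intrusion-detection literature, assumed here, is the posterior probability that an alarm corresponds to a real intrusion: P(I|A) = P(I)·P(A|I) / (P(I)·P(A|I) + P(¬I)·P(A|¬I)). The sketch below computes it from a base rate, a true positive rate, and a false positive rate; the function name and the example numbers are illustrative and are not taken from the thesis.

```python
def bayesian_detection_rate(base_rate: float,
                            true_positive_rate: float,
                            false_positive_rate: float) -> float:
    """Posterior probability P(intrusion | alarm) via Bayes' theorem.

    base_rate           -- P(I), fraction of traffic that is malicious
    true_positive_rate  -- P(A | I), alarms raised on malicious traffic
    false_positive_rate -- P(A | not I), alarms raised on benign traffic
    """
    numerator = base_rate * true_positive_rate
    denominator = numerator + (1.0 - base_rate) * false_positive_rate
    if denominator == 0:
        raise ValueError("no alarms are possible with these rates")
    return numerator / denominator


# Illustrative numbers only: 1% malicious traffic, 90% of it detected,
# and 1% of benign traffic wrongly flagged.
example = bayesian_detection_rate(0.01, 0.90, 0.01)
```

With these illustrative inputs fewer than half of the alarms are real intrusions, which shows why this measure depends strongly on how rare attacks are in the evaluated traffic.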
6

Avena, Anna. "Tecniche di data mining applicate alla decodifica di dati neurali." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14800/.

Abstract:
Studies on decoding neuronal activity make it possible to map the electrical impulses of the cerebral cortex into signals that can be sent to particular devices in order to monitor them. Scientific research is concentrating on this topic, with the aim of helping people affected by severe physical injuries gain a greater degree of autonomy in small everyday actions. In this work, data from neuronal activity were analyzed; they were collected in experiments on non-human primates carried out by the research group of Professor Patrizia Fattori in the Department of Pharmacy and Biotechnology of the University of Bologna. For this experiment, the animal subject was trained to perform a task consisting of grasping the objects presented, one at a time, in random order. During the exercise, the subject's neuronal activity was recorded in vectors containing the spiking activity. The aim of this thesis is to reconstruct the information about the activity of a population of neurons, given its spike vector. Several classification algorithms and features were tested to establish which configuration is most reliable for recognizing the motor activity performed by the subject during the experiment. To this end, a data mining process was implemented using the Python language and the Scikit-learn framework, which makes it possible to run multiple classifications and determine which gives the best performance. The results of the analysis show that some features yield high recognition rates and that, depending on the problem domain, one type of preprocessing is more suitable than another.
7

Valešová, Nikola. "Bioinformatický nástroj pro klasifikaci bakterií do taxonomických kategorií na základě sekvence genu 16S rRNA." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403138.

Abstract:
This thesis deals with the automated classification and recognition of bacteria after their DNA has been obtained by sequencing. A new classification method based on the 16S rRNA segment is proposed and described. The presented principle follows the tree structure of taxonomic categories and uses well-known machine learning algorithms to classify bacteria into one of the classes at the lower taxonomic level. The thesis also includes an implementation of the described algorithm and an evaluation of its prediction accuracy. The classification accuracy of various types of classifiers and their settings is examined, and the setting that achieves the best results is determined. The accuracy of the implemented algorithm is also compared with several existing methods. During validation, the implemented application KTC achieved more than 45% accuracy in genus prediction on both the BLAST 16S and BLAST V4 datasets. Finally, several possibilities for improving and extending the current implementation of the algorithm are mentioned.
8

Ramanayaka, Mudiyanselage Asanga. "Data Engineering and Failure Prediction for Hard Drive S.M.A.R.T. Data." Bowling Green State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1594957948648404.

9

Haglund, Robin. "Automated analysis of battery articles." Thesis, Uppsala universitet, Strukturkemi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-403738.

Abstract:
Journal articles are the formal medium for the communication of results among scientists, and often contain valuable data. However, manually collecting article data from a large field like lithium-ion battery chemistry is tedious and time-consuming, which is an obstacle when searching for statistical trends and correlations to inform research decisions. To address this, a platform for the automatic retrieval and analysis of large numbers of articles is created and applied to the field of lithium-ion battery chemistry. Example data produced by the platform are presented and evaluated, and sources of error limiting this type of platform are identified, with problems related to text extraction and pattern matching being especially significant. Some solutions to these problems are presented, and potential future improvements are proposed.
10

Urbanczyk, Martin. "Webový simulátor fotbalových lig a turnajů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403171.

Abstract:
This thesis describes the creation of a simulator of football leagues and championships. I studied the organization of football competitions and their systems, as well as the basics of machine learning. I also analyzed similar existing solutions and drew inspiration from them for my proposal. I then designed the overall structure of the simulator and all of its key parts, after which the simulator was implemented and tested. The application allows simulating the top five competitions in the UEFA club coefficient rankings.

Books on the topic "Scikit-learn"

1

Paper, David. Hands-on Scikit-Learn for Machine Learning Applications. Berkeley, CA: Apress, 2020. http://dx.doi.org/10.1007/978-1-4842-5373-1.

2

Garreta, Raul, Guillermo Moncecchi, Trent Hauck, and Gavin Hackeling. scikit-learn: Machine Learning Simplified: Implement scikit-learn into every step of the data science pipeline. Packt Publishing, 2017.

3

scikit-learn Cookbook - Second Edition: Over 80 recipes for machine learning in Python with scikit-learn. Packt Publishing - ebooks Account, 2017.

4

Hackeling, Gavin. Mastering Machine Learning with scikit-learn - Second Edition: Apply effective learning algorithms to real-world problems using scikit-learn. Packt Publishing, 2017.

5

Shapiro, Bruce, and Isabella Romeo. Getting Started in Machine Learning: Easy Recipes for Python 3, Scikit-Learn, and Jupyter. Sherwood Forest Books, 2020.

6

Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow, 2nd Edition. Packt Publishing, 2017.

7

Aprende Machine Learning con Scikit-Learn, Keras y TensorFlow: Conceptos, herramientas y técnicas para conseguir sistemas inteligentes. Anaya, 2020.

8

Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2, 3rd Edition. Packt Publishing, 2019.

9

Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. Shroff - O'Reilly, 2017.

10

Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly Media, 2017.


Book chapters on the topic "Scikit-learn"

1

Kramer, Oliver. "Scikit-Learn." In Studies in Big Data, 45–53. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-33383-0_5.

2

Porcu, Valentina. "Scikit-learn." In Python for Data Mining Quick Syntax Reference, 235–53. Berkeley, CA: Apress, 2018. http://dx.doi.org/10.1007/978-1-4842-4113-4_11.

3

Bisong, Ekaba. "Introduction to Scikit-learn." In Building Machine Learning and Deep Learning Models on Google Cloud Platform, 215–29. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-4470-8_18.

4

Paper, David. "Introduction to Scikit-Learn." In Hands-on Scikit-Learn for Machine Learning Applications, 1–35. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-5373-1_1.

5

Paper, David. "Scikit-Learn Regression Tuning." In Hands-on Scikit-Learn for Machine Learning Applications, 189–213. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-5373-1_7.

6

Agrawal, Tanay. "Hyperparameter Optimization Using Scikit-Learn." In Hyperparameter Optimization in Machine Learning, 31–51. Berkeley, CA: Apress, 2020. http://dx.doi.org/10.1007/978-1-4842-6579-6_2.

7

Nelli, Fabio. "Machine Learning with scikit-learn." In Python Data Analytics, 237–64. Berkeley, CA: Apress, 2015. http://dx.doi.org/10.1007/978-1-4842-0958-5_8.

8

Nelli, Fabio. "Machine Learning with scikit-learn." In Python Data Analytics, 313–47. Berkeley, CA: Apress, 2018. http://dx.doi.org/10.1007/978-1-4842-3913-1_8.

9

Bucher, Tabea-Clara, Xuehui Jiang, Ole Meyer, Stephan Waitz, Sven Hertling, and Heiko Paulheim. "scikit-learn Pipelines Meet Knowledge Graphs." In The Semantic Web: ESWC 2021 Satellite Events, 9–14. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-80418-3_2.

10

Balfer, Jenny, Jürgen Bajorath, and Martin Vogt. "Compound Classification Using the scikit-learn Library." In Tutorials in Chemoinformatics, 223–39. Chichester, UK: John Wiley & Sons, Ltd, 2017. http://dx.doi.org/10.1002/9781119161110.ch14.


Conference papers on the topic "Scikit-learn"

1

Zhang, Robert F., and Ryan J. Urbanowicz. "A scikit-learn compatible learning classifier system." In GECCO '20: Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3377929.3398097.

2

Komer, Brent, James Bergstra, and Chris Eliasmith. "Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn." In Python in Science Conference. SciPy, 2014. http://dx.doi.org/10.25080/majora-14bd3278-006.

3

Doukas, Michail, Sotirios Xydis, and Dimitrios Soudris. "Dataflow Acceleration of scikit-learn Gaussian Process Regression." In the 8th Workshop and 6th Workshop. New York, New York, USA: ACM Press, 2017. http://dx.doi.org/10.1145/3029580.3029587.

4

Kolte, Ashish, Bodireddy Mahitha, and N. V. Ganapathi Raju. "Stratification of Parkinson Disease using python scikit-learn ML library." In 2019 International Conference on Emerging Trends in Science and Engineering (ICESE). IEEE, 2019. http://dx.doi.org/10.1109/icese46178.2019.9194627.

5

Brites, Daniel, and Mingkui Wei. "PhishFry - A Proactive Approach to Classify Phishing Sites Using SCIKIT Learn." In 2019 IEEE Globecom Workshops (GC Wkshps). IEEE, 2019. http://dx.doi.org/10.1109/gcwkshps45667.2019.9024428.

6

Shushkevich, Elena, Mikhail Alexandrov, and John Cardiff. "Detecting fake news about Covid-19 using classifiers from Scikit-learn." In 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies (CSIT). IEEE, 2021. http://dx.doi.org/10.1109/csit52700.2021.9648767.

7

Jatturas, Chinnavat, Sornsawan Chokkoedsakul, Pisitpong Devahasting Na Ayudhya, Sukit Pankaew, Cherdkul Sopavanit, and Widhyakorn Asdornwised. "Recurrent Neural Networks for Environmental Sound Recognition using Scikit-learn and Tensorflow." In 2019 16th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON). IEEE, 2019. http://dx.doi.org/10.1109/ecti-con47248.2019.8955382.

8

Hishamuddin, Muhammad Nur Fikri, Mohd Fadzil Hassan, Duc Chung Tran, and Ainul Akmar Mokhtar. "Improving Classification Accuracy of Scikit-learn Classifiers with Discrete Fuzzy Interval Values." In 2020 International Conference on Computational Intelligence (ICCI). IEEE, 2020. http://dx.doi.org/10.1109/icci51257.2020.9247696.

9

Susanto, Deris Stiawan, M. Agus Syamsul Arifin, Mohd Yazid Idris, and Rahmat Budiarto. "IoT Botnet Malware Classification Using Weka Tool and Scikit-learn Machine Learning." In 2020 7th International Conference on Electrical Engineering, Computer Sciences and Informatics (EECSI). IEEE, 2020. http://dx.doi.org/10.23919/eecsi50503.2020.9251304.

10

Elbagir, Shihab, and Jing Yang. "Sentiment Analysis of Twitter Data Using Machine Learning Techniques and Scikit-learn." In ACAI 2018: 2018 International Conference on Algorithms, Computing and Artificial Intelligence. New York, NY, USA: ACM, 2018. http://dx.doi.org/10.1145/3302425.3302492.
