Theses on the topic "Computer software. Software engineering. Machine learning"

Cite a source in APA, MLA, Chicago, Harvard, and many other citation styles


See the top 50 dissertations (master's or doctoral theses) for research on the topic "Computer software. Software engineering. Machine learning".

Next to every source in the reference list there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf and read its abstract (summary) online, when it is included in the metadata.

Browse theses from many research areas and compile a correct bibliography.

1

Cao, Bingfei. "Augmenting the software testing workflow with machine learning". Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119752.

Full text
Abstract (summary):
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 67-68).
This work presents the ML Software Tester, a system for augmenting software testing processes with machine learning. It allows users to plug in a Git repository of their choice, specify a few features and methods specific to that project, and create a full machine learning pipeline. This pipeline will generate software test result predictions that the user can easily integrate with their existing testing processes. To do so, a novel test result collection system was built to collect the necessary data on which the prediction models could be trained. Test data was collected for Flask, a well-known Python open-source project. This data was then fed through SVDFeature, a matrix factorization model, to generate new test result predictions. Several methods for the test result prediction procedure were evaluated to demonstrate various ways of using the system.
by Bingfei Cao.
M. Eng.
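The abstract above describes predicting test outcomes with SVDFeature, a feature-based matrix factorization toolkit. As a rough illustration of the underlying idea only (not the thesis's actual pipeline), the sketch below factorizes an invented test-by-commit matrix of pass/fail outcomes with plain truncated SVD and scores unobserved results from the low-rank reconstruction; all data and shapes are hypothetical.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Hypothetical (tests x commits) matrix of outcomes: 1 = pass, 0 = fail.
# In the thesis, such outcomes come from a test-result collection system.
rng = np.random.default_rng(0)
results = (rng.random((50, 30)) > 0.2).astype(float)

# Factorize into a low-rank representation and reconstruct.
svd = TruncatedSVD(n_components=5, random_state=0)
latent = svd.fit_transform(results)          # (50, 5) per-test factors
reconstruction = latent @ svd.components_    # (50, 30) predicted scores

# Score an unobserved (test, commit) pair: higher => more likely to pass.
print(reconstruction[3, 7])
```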
2

Brun, Yuriy. "Software fault identification via dynamic analysis and machine learning". Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/17939.

Full text
Abstract (summary):
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003.
Includes bibliographical references (p. 65-67).
I propose a technique that identifies program properties that may indicate errors. The technique generates machine learning models of run-time program properties known to expose faults, and applies these models to program properties of user-written code to classify and rank properties that may lead the user to errors. I evaluate an implementation of the technique, the Fault Invariant Classifier, that demonstrates the efficacy of the error-finding technique. The implementation uses dynamic invariant detection to generate program properties. It uses support vector machine and decision tree learning tools to classify those properties. Given a set of properties produced by the program analysis, some of which are indicative of errors, the technique selects a subset of properties that are most likely to reveal an error. The experimental evaluation, over 941,000 lines of code, showed that on average a user must examine only the 2.2 highest-ranked properties for C programs and 1.7 for Java programs to find a fault-revealing property. The technique increases the relevance (the concentration of properties that reveal errors) by a factor of 50 on average for C programs and 4.8 for Java programs.
by Yuriy Brun.
M.Eng.
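A hedged sketch of the classify-and-rank step described above (not the actual Fault Invariant Classifier): a decision tree is trained on invented feature vectors derived from program properties, and unseen properties are ranked by their predicted probability of being fault-revealing.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Invented feature vectors for program properties, with labels:
# 1 = known to be fault-revealing, 0 = not.
X_train = np.array([[2, 0, 1], [1, 1, 0], [3, 0, 1],
                    [1, 0, 0], [2, 1, 0], [3, 1, 1]])
y_train = np.array([1, 0, 1, 0, 0, 1])

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Rank new properties so a user inspects the most suspicious first.
X_new = np.array([[2, 1, 1], [1, 0, 0], [3, 1, 1]])
scores = clf.predict_proba(X_new)[:, 1]
ranking = np.argsort(scores)[::-1]
print(ranking, scores[ranking])
```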
3

Bayana, Sreeram. "Learning to deal with COTS (commercial off the shelf)". Morgantown, W. Va. : [West Virginia University Libraries], 2005. https://etd.wvu.edu/etd/controller.jsp?moduleName=documentdata&jsp%5FetdId=3859.

Full text
Abstract (summary):
Thesis (M.S.)--West Virginia University, 2005.
Title from document title page. Document formatted into pages; contains vii, 66 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 61-66).
4

Liljeson, Mattias, and Alexander Mohlin. "Software defect prediction using machine learning on test and source code metrics". Thesis, Blekinge Tekniska Högskola, Institutionen för kreativa teknologier, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-4162.

Full text
Abstract (summary):
Context. Software testing is the process of finding faults in software while executing it. The results of the testing are used to find and correct faults. Software defect prediction estimates where faults are likely to occur in source code. The results from the defect prediction can be used to optimize testing and ultimately improve software quality. Machine learning, which concerns computer programs learning from data, is used to build prediction models which can then be used to classify data.
Objectives. In this study we, in collaboration with Ericsson, investigated whether software metrics from source code files combined with metrics from their respective tests predict faults with better prediction performance compared to using only metrics from the source code files.
Methods. A literature review was conducted to identify inputs for an experiment. The experiment was applied on one repository from Ericsson to identify the best performing set of metrics.
Results. The prediction performance results of three metric sets are presented and compared with each other. Wilcoxon's signed-rank tests are performed on four different performance measures for each metric set and each machine learning algorithm to demonstrate significant differences in the results.
Conclusions. We conclude that metrics from tests can be used to predict faults. However, the combination of source code metrics and test metrics does not outperform using only source code metrics. Moreover, we conclude that models built with metrics from the test metric set, with minimal information about the source code, can in fact predict faults in the source code.
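The Wilcoxon signed-rank test mentioned in the abstract is available in SciPy; a minimal sketch on invented paired scores (e.g. one performance measure per cross-validation fold for two metric sets):

```python
from scipy.stats import wilcoxon

# Invented paired scores for models trained on source-code metrics
# alone vs. source-code plus test metrics.
source_only = [0.71, 0.68, 0.74, 0.70, 0.73, 0.69, 0.72, 0.70]
combined = [0.72, 0.67, 0.75, 0.71, 0.72, 0.70, 0.71, 0.71]

stat, p_value = wilcoxon(source_only, combined)
print(f"statistic={stat:.1f}, p={p_value:.3f}")  # large p: no significant difference
```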
5

Chi, Yuan. "Machine learning techniques for high dimensional data". Thesis, University of Liverpool, 2015. http://livrepository.liverpool.ac.uk/2033319/.

Full text
Abstract (summary):
This thesis presents data processing techniques for three different but related application areas: embedding learning for classification, fusion of low bit depth images, and 3D reconstruction from 2D images.
For embedding learning for classification, a novel manifold embedding method is proposed for the automated processing of large, varied data sets. The method is based on binary classification, where the embeddings are constructed so as to determine one or more unique features for each class individually from a given dataset. The proposed method is applied to examples of multiclass classification that are relevant for large scale data processing for surveillance (e.g. face recognition), where the aim is to augment decision making by reducing extremely large sets of data to a manageable level before displaying the selected subset of data to a human operator. In addition, an indicator for a weighted pairwise constraint is proposed to balance the contributions from different classes to the final optimisation, in order to better control the relative positions between the important data samples from either the same class (intraclass) or different classes (interclass). The effectiveness of the proposed method is evaluated through comparison with seven existing techniques for embedding learning, using four established databases of faces, consisting of various poses, lighting conditions and facial expressions, as well as two standard text datasets. The proposed method performs better than these existing techniques, especially for cases with small sets of training data samples.
For fusion of low bit depth images, using low bit depth images instead of full images offers a number of advantages for aerial imaging with UAVs, where there is a limited transmission rate/bandwidth: for example, reducing the need for data transmission, removing superfluous details, and reducing the computational loading of on-board platforms (especially for small or micro-scale UAVs). The main drawback of using low bit depth imagery is discarding image details of the scene. Fortunately, these can be reconstructed by fusing a sequence of related low bit depth images which have been properly aligned. To reduce computational complexity and obtain a less distorted result, a similarity transformation is used to approximate the geometric alignment between two images of the same scene. The transformation is estimated using a phase correlation technique. It is shown that the phase correlation method is capable of registering low bit depth images without any modification, or any pre- and/or post-processing.
For 3D reconstruction from 2D images, a method is proposed to deal with dense reconstruction after a sparse reconstruction (i.e. a sparse 3D point cloud) has been created employing the structure from motion technique. Instead of generating a dense 3D point cloud, this proposed method forms a triangle from three points in the sparse point cloud, and then maps the corresponding components in the 2D images back to the point cloud. Compared to the existing methods that use a similar approach, this method reduces the computational cost. Instead of utilising every triangle in the 3D space to do the mapping from 2D to 3D, it uses a large triangle to replace a number of small triangles for flat and almost flat areas. Compared to the reconstruction result obtained by existing techniques that aim to generate a dense point cloud, the proposed method can achieve a better result while the computational cost is comparable.
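Phase correlation, used in the second part of the thesis to register low bit depth images, can be sketched in a few lines of NumPy: the translation between two images is recovered from the peak of the inverse FFT of the normalized cross-power spectrum. The images and shift below are synthetic, and only integer shifts are handled.

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer (row, col) shift taking image b to image a."""
    fa, fb = np.fft.fft2(a), np.fft.fft2(b)
    cross_power = fa * np.conj(fb)
    cross_power /= np.abs(cross_power) + 1e-12   # keep phase only
    correlation = np.fft.ifft2(cross_power)
    return np.unravel_index(np.argmax(np.abs(correlation)), a.shape)

# Synthetic test: shift an image by (5, 12) and recover the shift.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
shifted = np.roll(img, shift=(5, 12), axis=(0, 1))
print(phase_correlation(shifted, img))  # -> (5, 12)
```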
6

Richmond, James Howard. "Bayesian Logistic Regression Models for Software Fault Localization". Case Western Reserve University School of Graduate Studies / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=case1326658577.

Full text
7

Kaloskampis, Ioannis. "Recognition of complex human activities in multimedia streams using machine learning and computer vision". Thesis, Cardiff University, 2013. http://orca.cf.ac.uk/59377/.

Full text
Abstract (summary):
Modelling human activities observed in multimedia streams as temporal sequences of their constituent actions has been the object of much research effort in recent years. However, most of this work concentrates on tasks where the action vocabulary is relatively small and/or each activity can be performed in a limited number of ways. In this Thesis, a novel and robust framework for modelling and analysing composite, prolonged activities arising in tasks which can be effectively executed in a variety of ways is proposed. Additionally, the proposed framework is designed to handle cognitive tasks, which cannot be captured using conventional types of sensors. It is shown that the proposed methodology is able to efficiently analyse and recognise complex activities arising in such tasks and also detect potential errors in their execution. To achieve this, a novel activity classification method comprising a feature selection stage based on the novel Key Actions Discovery method and a classification stage based on the combination of Random Forests and Hierarchical Hidden Markov Models is introduced. Experimental results captured in several scenarios arising from real-life applications, including a novel application to a bridge design problem, show that the proposed framework offers higher classification accuracy compared to current activity identification schemes.
8

Hossain, Md Billal. "QoS-Aware Intelligent Routing For Software Defined Networking". University of Akron / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=akron1595086618729923.

Full text
9

Percival, Graham Keith. "Physical modelling meets machine learning : performing music with a virtual string ensemble". Thesis, University of Glasgow, 2013. http://theses.gla.ac.uk/4253/.

Full text
Abstract (summary):
This dissertation describes a new method of computer performance of bowed string instruments (violin, viola, cello) using physical simulations and intelligent feedback control. Computer synthesis of music performed by bowed string instruments is a challenging problem. Unlike instruments whose notes originate with a single discrete excitation (e.g., piano, guitar, drum), bowed string instruments are controlled with a continuous stream of excitations (i.e. the bow scraping against the string). Most existing synthesis methods utilize recorded audio samples, which perform quite well for single-excitation instruments but not continuous-excitation instruments. This work improves the realism of synthesis of violin, viola, and cello sound by generating audio through modelling the physical behaviour of the instruments. A string's wave equation is decomposed into 40 modes of vibration, which can be acted upon by three forms of external force: A bow scraping against the string, a left-hand finger pressing down, and/or a right-hand finger plucking. The vibration of each string exerts force against the instrument bridge; these forces are summed and convolved with the instrument body impulse response to create the final audio output. In addition, right-hand haptic output is created from the force of the bow against the string. Physical constants from ten real instruments (five violins, two violas, and three cellos) were measured and used in these simulations. The physical modelling was implemented in a high-performance library capable of simulating audio on a desktop computer one hundred times faster than real-time. The program also generates animated video of the instruments being performed. To perform music with the physical models, a virtual musician interprets the musical score and generates actions which are then fed into the physical model. The resulting audio and haptic signals are examined with a support vector machine, which adjusts the bow force in order to establish and maintain a good timbre. This intelligent feedback control is trained with human input, but after the initial training is completed the virtual musician performs autonomously. A PID controller is used to adjust the position of the left-hand finger to correct any flaws in the pitch. Some performance parameters (initial bow force, force correction, and lifting factors) require an initial value for each string and musical dynamic; these are calibrated automatically using the previously-trained support vector machines. The timbre judgements are retained after each performance and are used to pre-emptively adjust bowing parameters to avoid or mitigate problematic timbre for future performances of the same music. The system is capable of playing sheet music with approximately the same ability level as a human music student after two years of training. Due to the number of instruments measured and the generality of the machine learning, music can be performed with ensembles of up to ten stringed instruments, each with a distinct timbre. This provides a baseline for future work in computer control and expressive music performance of virtual bowed string instruments.
10

Osgood, Thomas J. "Semantic labelling of road scenes using supervised and unsupervised machine learning with lidar-stereo sensor fusion". Thesis, University of Warwick, 2013. http://wrap.warwick.ac.uk/60439/.

Full text
Abstract (summary):
At the highest level, the aim of this thesis is to review and develop reliable and efficient algorithms for classifying road scenery, primarily using vision based technology mounted on vehicles. The purpose of this technology is to enhance vehicle safety systems in order to prevent accidents which cause injuries to drivers and pedestrians. This thesis uses LIDAR–stereo sensor fusion to analyse the scene in the path of the vehicle and apply semantic labels to the different content types within the images. It details every step of the process from raw sensor data to automatically labelled images. At each stage of the process, currently used methods are investigated and evaluated. In cases where existing methods do not produce satisfactory results, improved methods have been suggested. In particular, this thesis presents a novel, automated method for aligning LIDAR data to the stereo camera frame without the need for specialised alignment grids. For image segmentation, a hybrid approach is presented, combining the strengths of both edge detection and mean-shift segmentation. For texture analysis, the presented method uses GLCM metrics, which allow texture information to be captured and summarised using only four feature descriptors compared to the hundreds produced by SURF descriptors. In addition to texture descriptors, the 3D information provided by the stereo system is also exploited. The segmented point cloud is used to determine orientation and curvature using polynomial surface fitting, a technique not yet applied to this application. Regarding classification methods, a comprehensive study was carried out comparing the performance of the SVM and neural network algorithms for this particular application. The outcome shows that for this particular set of learning features the SVM classifiers offer slightly better performance in the context of image and depth based classification, which was not made clear in existing literature. Finally, a novel method of making unsupervised classifications is presented. Segments are automatically grouped into sub-classes which can then be mapped to more expressive super-classes as needed. Although the method in its current state does not yet match the performance of supervised methods, it does produce usable classification results without the need for any training data. In addition, the method can be used to automatically sub-class classes with significant inter-class variation into more specialised groups prior to being used as training targets in a supervised method.
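The GLCM texture descriptors mentioned above are available in scikit-image; a minimal sketch, assuming scikit-image >= 0.19 (where the functions are named graycomatrix/graycoprops) and an invented image patch:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Invented 8-bit patch; in the thesis, patches come from segmented
# road-scene regions.
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)

# Co-occurrence matrix for one pixel offset and four directions.
glcm = graycomatrix(patch, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)

# Four compact descriptors, instead of hundreds of SURF dimensions.
features = [graycoprops(glcm, prop).mean()
            for prop in ("contrast", "correlation", "energy", "homogeneity")]
print(features)
```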
11

Grills, Blake E. "Automatic Identification and Analysis of Commented Out Code". Bowling Green State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1587646144001317.

Full text
12

Shahdad, Mir Abubakr. "Engineering innovation (TRIZ based computer aided innovation)". Thesis, University of Plymouth, 2015. http://hdl.handle.net/10026.1/3317.

Full text
Abstract (summary):
This thesis describes the approach and results of research to create TRIZ-based computer-aided innovation tools: AEGIS (Accelerated Evolutionary Graphics Interface System) and Design for Wow. Both of these tools are discussed in this thesis in detail, along with the test data, design methodology, test cases, and research. Design for Wow (http://www.designforwow.com) is an attempt to summarize successful inventions/designs from all over the world on a web portal with multiple capabilities. These designs/innovations are then linked to the TRIZ Principles in order to determine whether the innovative aspects of these successful innovations are fully covered by the forty TRIZ Principles. In Design for Wow, a framework is created and implemented through a review tool. The Design for Wow website includes this tool, which has been used by the researcher, the users of the site, and reviewers to analyse the uploaded data in terms of the strength of the TRIZ Principles linked to it. AEGIS is a software tool developed in this research to help graphic designers make innovative graphic designs. Again, it uses the forty TRIZ Principles as a set of guiding rules in the software. AEGIS creates graphic design prototypes according to the user input and uses the TRIZ Principles framework as a guide to generate innovative graphic design samples. The AEGIS tool is based on a subset of the TRIZ Principles discussed in Chapter 3. In AEGIS, the TRIZ Principles are used to create innovative graphic design effects. The literature review on innovative graphic design (in Chapter 3) was analysed for links with TRIZ Principles, and the DNA of AEGIS was built on the basis of this study. Results from various surveys/questionnaires were used to collect the innovative graphic design samples, and TRIZ was then mapped to them (see section 3.2). The TRIZ effects were mapped to the basic graphic design elements, and the anatomy of graphic design letters was studied to analyse the TRIZ effects in the collected samples. This study was used to build the TRIZ-based AEGIS tool. Hence, the AEGIS tool applies innovative effects using TRIZ to basic graphic design elements (as described in section 3.3). The working of AEGIS is based on Genetic Algorithms coded specifically to implement TRIZ Principles specialized for graphic design; Chapter 4 discusses the process followed to apply TRIZ Principles to graphic design and to code them using Genetic Algorithms, resulting in the AEGIS tool. Similarly, in Design for Wow, the uploaded content has been analysed for its links with the TRIZ Principles (see section 3.1). The tool created in Design for Wow is based on the framework of analysing the TRIZ links in the uploaded content. The 'Wow' concept discussed in sections 5.1 and 5.2 is the basis of the Design for Wow website, whereby users upload content they classify as 'Wow'. This content is then further analysed for the 'Wow factor' and mapped to TRIZ Principles following the TRIZ tagging methodology (section 5.5). From the results of the research, it appears that the TRIZ Principles are a comprehensive set of basic innovation building blocks.
Some surveys suggest that, amongst other tools, TRIZ Principles were the first choice and the most used. They thus have the potential to be used in other innovation domains, to help in their analysis, understanding and further development.
13

Ewö, Christian. "A machine learning approach in financial markets". Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik och datavetenskap, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-5571.

Full text
Abstract (summary):
In this work we compare the prediction performance of three optimized technical indicators with a Support Vector Machine neural network. For the indicator part we picked the commonly used indicators: Relative Strength Index, Moving Average Convergence Divergence and Stochastic Oscillator. For the Support Vector Machine we used a radial-basis kernel function and regression mode. The techniques were applied to financial time series taken from the Swedish stock market. The comparison and the promising results should be of interest both to finance professionals using the techniques in practice and to software companies and others considering implementing the techniques in their products.
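A hedged sketch of the support vector regression setup described above, using scikit-learn's SVR with an RBF kernel in place of whatever toolkit the thesis used, on a synthetic price series:

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic price series standing in for Swedish stock-market data.
rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0, 1, 300)) + 100

# Supervised windows: predict the next price from the last 5 prices.
window = 5
X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
y = prices[window:]

model = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X[:-50], y[:-50])
print(model.predict(X[-50:])[:5])  # out-of-sample one-step forecasts
```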
14

Robeson, Aaron. "Airwaves: A Broadcasting Web Application Supplemented by a Neural Network Transcription Model". Ohio University Honors Tutorial College / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ouhonors155603038153628.

Full text
15

Artchounin, Daniel. "Tuning of machine learning algorithms for automatic bug assignment". Thesis, Linköpings universitet, Programvara och system, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139230.

Full text
Abstract (summary):
In software development projects, bug triage consists mainly of assigning bug reports to software developers or teams (depending on the project). The partial or total automation of this task would have a positive economic impact on many software projects. This thesis introduces a systematic four-step method to find some of the best configurations of several machine learning algorithms intended to solve the automatic bug assignment problem. These four steps are respectively used to select a combination of pre-processing techniques, a bug report representation, a potential feature selection technique, and to tune several classifiers. The aforementioned method has been applied to three software projects: 66 066 bug reports of a proprietary project, 24 450 bug reports of Eclipse JDT and 30 358 bug reports of Mozilla Firefox. 619 configurations have been applied and compared on each of these three projects. In production, using the approach introduced in this work on the bug reports of the proprietary project would have increased the accuracy by up to 16.64 percentage points.
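The tuning method itself is specific to the thesis, but the general shape of an automatic bug-assignment pipeline can be sketched with scikit-learn: a TF-IDF representation of the report text plus a linear classifier whose hyperparameters are tuned by grid search. The reports, teams, and parameter grid below are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Invented bug reports and the teams that resolved them.
reports = ["crash when saving large file", "button misaligned on settings page",
           "memory leak in network layer", "tooltip text overlaps label",
           "timeout while syncing over http", "wrong color in dark theme"]
teams = ["core", "ui", "core", "ui", "core", "ui"]

pipeline = Pipeline([("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
                     ("clf", LinearSVC())])

# Tune one hyperparameter, standing in for the thesis's tuning step.
search = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=2)
search.fit(reports, teams)
print(search.best_params_, search.predict(["leak when saving over http"]))
```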
16

Le, Khanh Duc. "A Study of Face Embedding in Face Recognition". DigitalCommons@CalPoly, 2019. https://digitalcommons.calpoly.edu/theses/1989.

Full text
Abstract (summary):
Face Recognition has been a long-standing topic in the computer vision and pattern recognition field because of its wide and important applications in our daily lives, such as surveillance systems, access control, and so on. The current modern face recognition model, which keeps only a couple of images per person in the database, can now recognize a face with high accuracy. Moreover, the model does not need to be retrained every time a new person is added to the database. Using the face dataset from Digital Democracy, the thesis will explore the capability of this model by comparing it with a standard convolutional neural network based on pose variations and training set sizes. First, we compare different types of pose to see their effect on the accuracy of the algorithm. Second, we train the system using different numbers of training images per person to see how many training samples are actually needed to maintain a reasonable accuracy. Finally, to push the limit, we decide to train the model using only a single image per person, with the help of a face generation technique to synthesize more faces. The performance obtained by this integration is found to be competitive with the previous results, which are trained on multiple images.
17

Fahlén, Erik. "Androidapplikation för digitalisering av formulär : Minimering av inlärningstid, kostnad och felsannolikhet". Thesis, Mittuniversitetet, Avdelningen för informationssystem och -teknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-35623.

Full text
Abstract (summary):
This study was performed by creating an Android application that uses custom object recognition to scan and digitalize a series of checkbox forms, for example to grade multiple-choice questions or compile surveys in a spreadsheet. The purpose of this study was to see which dataset and hardware, with the machine learning library TensorFlow, was cheapest, best value, sufficiently reliable, and fastest. A dataset of filled example forms with annotated checkboxes was created and used in the learning process. The model used for the object recognition was the Single Shot MultiBox Detector, MobileNet version, because it can detect multiple objects in the same image and does not have high hardware requirements, making it suited for phones. The learning process was run in Google Cloud's Machine Learning Engine with different image resolutions and cloud configurations. After the learning process on the cloud, the finished TensorFlow model was converted to a TensorFlow Lite model for use on phones. The TensorFlow Lite model was used in the compilation of the Android application so that the object recognition could work. The Android application worked and could recognize all the inputs in the checkbox form. Different image resolutions and cloud configurations during the learning process gave different results regarding which was fastest and cheapest. In the end, the conclusion was that Google's hardware setup STANDARD_1 was 20% faster than BASIC, which was 91% cheaper and the best value with this dataset.
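The TensorFlow-to-TensorFlow Lite conversion step described in the abstract follows a standard TensorFlow 2 API; a minimal sketch, assuming a trained detector exported in SavedModel format at a hypothetical path:

```python
import tensorflow as tf

# Convert a trained detector (hypothetical path) to TensorFlow Lite
# so it can run on an Android device, as in the thesis.
converter = tf.lite.TFLiteConverter.from_saved_model("exported_ssd_mobilenet")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional size/latency optimization
tflite_model = converter.convert()

with open("detector.tflite", "wb") as f:
    f.write(tflite_model)
```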
18

Hsu, Samantha. "CLEAVER: Classification of Everyday Activities Via Ensemble Recognizers". DigitalCommons@CalPoly, 2018. https://digitalcommons.calpoly.edu/theses/1960.

Full text
Abstract (summary):
Physical activity can have immediate and long-term benefits on health and reduce the risk for chronic diseases. Valid measures of physical activity are needed in order to improve our understanding of the exact relationship between physical activity and health. Activity monitors have become a standard for measuring physical activity; accelerometers in particular are widely used in research and consumer products because they are objective, inexpensive, and practical. Previous studies have experimented with different monitor placements and classification methods. However, the majority of these methods were developed using data collected in controlled, laboratory-based settings, which is not reliably representative of real life data. Therefore, more work is required to validate these methods in free-living settings. For our work, 25 participants were directly observed by trained observers for two two-hour activity sessions over a seven day timespan. During the sessions, the participants wore accelerometers on the wrist, thigh, and chest. In this thesis, we tested a battery of machine learning techniques, including a hierarchical classification schema and a confusion matrix boosting method to predict activity type, activity intensity, and sedentary time in one-second intervals. To do this, we created a dataset containing almost 100 hours worth of observations from three sets of accelerometer data from an ActiGraph wrist monitor, a BioStampRC thigh monitor, and a BioStampRC chest monitor. Random forest and k-nearest neighbors are shown to consistently perform the best out of our traditional machine learning techniques. In addition, we reduce the severity of error from our traditional random forest classifiers on some monitors using a hierarchical classification approach, and combat the imbalanced nature of our dataset using a multi-class (confusion matrix) boosting method. Out of the three monitors, our models most accurately predict activity using either or both of the BioStamp accelerometers (with the exception of the chest BioStamp predicting sedentary time). Our results show that we outperform previous methods while still predicting behavior at a more granular level.
19

Gokyer, Gokhan. "Identifying Architectural Concerns From Non-functional Requirements Using Support Vector Machine". Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609964/index.pdf.

Full text
Abstract (summary):
There has been no common consensus on how to identify problem domain concerns in the architectural modeling of software systems. There is not even a commonly accepted method for effectively modeling the Non-Functional Requirements (NFRs) associated with architectural aspects of the solution domain. This thesis introduces the use of a Machine Learning (ML) method based on Support Vector Machines to relate NFRs to classified "architectural concerns" in an automated way. This method uses Natural Language Processing techniques to fragment plain NFR texts under the supervision of domain experts. The contribution of this approach lies in continuously applying ML techniques to previously discovered "NFR - architectural concerns" associations to improve the intelligence of repositories for requirements engineering. The study illustrates a charted roadmap and demonstrates the automated requirements engineering toolset for this roadmap. It also validates the approach and the effectiveness of the toolset on a snapshot of a real-life project.
20

Ekström, Hagevall Adam, e Carl Wikström. "Increasing Reproducibility Through Provenance, Transparency and Reusability in a Cloud-Native Application for Collaborative Machine Learning". Thesis, Uppsala universitet, Avdelningen för datorteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-435349.

Full text
Abstract (summary):
The purpose of this thesis paper was to develop new features in the cloud-native and open-source machine learning platform STACKn, aiming to strengthen the platform's support for conducting reproducible machine learning experiments through provenance, transparency and reusability. Adhering to the definition of reproducibility as the ability of independent researchers to exactly duplicate scientific results with the same material as in the original experiment, two concepts were explored as alternatives for this specific goal: 1) Increased support for standardized textual documentation of machine learning models and their corresponding datasets; and 2) Increased support for provenance to track the lineage of machine learning models by making code, data and metadata readily available and stored for future reference. We set out to investigate to what degree these features could increase reproducibility in STACKn, both when used in isolation and when combined.  When these features had been implemented through an exhaustive software engineering process, an evaluation of the implemented features was conducted to quantify the degree of reproducibility that STACKn supports. The evaluation showed that the implemented features, especially provenance features, substantially increase the possibilities to conduct reproducible experiments in STACKn, as opposed to when none of the developed features are used. While the employed evaluation method was not entirely objective, these features are clearly a good first initiative in meeting current recommendations and guidelines on how computational science can be made reproducible.
21

Waqas, Muhammad. "A simulation-based approach to test the performance of large-scale real time software systems". Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20133.

Full text
Abstract (summary):
Background: A real-time system operates with time constraints, and its correctness depends upon the time at which results are generated. Different industries use different types of real-time systems, such as telecommunications, air traffic control, power generation, and spacecraft systems. There is a category of real-time systems required to handle millions of users and operations at the same time; those systems are called large-scale real-time systems. In the telecommunication sector, many real-time systems are large scale, as they need to handle millions of users and resources in parallel. Performance is an essential aspect of this type of system; unpredictable behavior can cost telecom operators millions of dollars in a matter of seconds. The problem is that existing models for performance analysis of these types of systems are not cost-effective and require a lot of knowledge to deploy. In this context, we have developed a performance simulator tool based on XGBoost, Random Forest, and Decision Tree modeling.
Objectives: The thesis aims to develop a cost-effective approach to support the analysis of the performance of large-scale real-time telecommunication systems. The idea is to develop and implement a solution that simulates the telecommunication system using some of the most promising identified factors that affect the performance of the system.
Methods: We performed an improvement case study at Ericsson. Performance factors were identified through a dataset generated in a performance testing session, an investigation conducted on the same system, and unstructured interviews with the system experts. The approach was selected through a literature review. Validation of the Performance Simulator was performed through static analysis and user feedback received from a questionnaire.
Results: The results show that the Performance Simulator could be helpful for performance analysis of large-scale real-time telecommunication systems. Its ability to support performance analysis of other real-time systems was assessed through a collection of multiple expert opinions.
Conclusions: The developed and validated approach demonstrates potential usefulness in performance analysis and can benefit significantly from further enhancements. The specific amount of data used for training might impact the generalization of the research to other real-time systems. In the future, this study can be extended with more inputs from large-scale real-time systems.
22

Salem, Tawfiq. "Learning to Map the Visual and Auditory World". UKnowledge, 2019. https://uknowledge.uky.edu/cs_etds/86.

Full text
Abstract (summary):
The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Billions of images that capture this complex relationship are uploaded to social-media websites every day and often are associated with precise time and location metadata. This rich source of data can be beneficial to improve our understanding of the globe. In this work, we propose a general framework that uses these publicly available images for constructing dense maps of different ground-level attributes from overhead imagery. In particular, we use well-defined probabilistic models and a weakly-supervised, multi-task training strategy to provide an estimate of the expected visual and auditory ground-level attributes consisting of the type of scenes, objects, and sounds a person can experience at a location. Through a large-scale evaluation on real data, we show that our learned models can be used for applications including mapping, image localization, image retrieval, and metadata verification.
23

França, André Luiz Pereira de. "Estudo, desenvolvimento e implementação de algoritmos de aprendizagem de máquina, em software e hardware, para detecção de intrusão de rede: uma análise de eficiência energética". Universidade Tecnológica Federal do Paraná, 2015. http://repositorio.utfpr.edu.br/jspui/handle/1/1166.

Full text
Abstract (summary):
CAPES; CNPq
The increasing network speeds, number of attacks, and need for energy efficiency are pushing software-based network security to its limits. A common kind of threat is probing attacks, in which an attacker tries to find vulnerabilities by sending a series of probe packets to a target machine. This work presents the study, development, and implementation of a network packets feature extraction algorithm in hardware and three machine learning classifiers (Decision Tree, Naive Bayes, and k-nearest neighbors), in software and hardware, for the detection of probing attacks. The work also presents detailed results of classification accuracy, throughput, and energy consumption for each implementation.
24

Håkansson, Fredrik, and Carl-Johan Larsson. "User-Based Predictive Caching of Streaming Media". Thesis, Linköpings universitet, Institutionen för datavetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-151008.

Full text
Abstract (summary):
Streaming media is a growing market all over the world, which sets strict requirements on mobile connectivity. The foundation for a good user experience when supplying a streaming media service on a mobile device is to ensure that the user can access the requested content. Due to the varying availability of mobile connectivity, measures have to be taken to remove as much dependency as possible on the quality of the connection. This thesis investigates the use of a Long Short-Term Memory machine learning model for predicting a future geographical location for a mobile device. The predicted location, in combination with information about cellular connectivity in the geographical area, is used to schedule prefetching of media content in order to improve user experience and to reduce mobile data usage. The Long Short-Term Memory model suggested in this thesis achieves an accuracy of 85.15% averaged over 20000 routes, and the predictive caching managed to retain user experience while decreasing the amount of data consumed.

This thesis is written as a joint thesis between two students from different universities. This means the exact same thesis is published at two universities (LiU and KTH) but with different style templates. The other report has identification number: TRITA-EECS-EX-2018:403

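A minimal sketch of the kind of LSTM model described above, using Keras on invented data: fixed-length sequences of (latitude, longitude) points are mapped to the next point. The real model's features, sequence lengths, and training regime are not specified here.

```python
import numpy as np
from tensorflow import keras

# Invented GPS traces: 1000 sequences of 11 (lat, lon) points each,
# standing in for real user routes.
rng = np.random.default_rng(0)
routes = rng.random((1000, 11, 2))
X, y = routes[:, :10, :], routes[:, 10, :]  # predict the 11th point

model = keras.Sequential([
    keras.Input(shape=(10, 2)),
    keras.layers.LSTM(64),
    keras.layers.Dense(2),  # predicted (lat, lon)
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
print(model.predict(X[:1]))
```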
25

Farhat, Md Tanzin. "An Artificial Neural Network based Security Approach of Signal Verification in Cognitive Radio Network". University of Toledo / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=toledo153511563131623.

Full text
26

Bessinger, Zachary. "An Automatic Framework for Embryonic Localization Using Edges in a Scale Space". TopSCHOLAR®, 2013. http://digitalcommons.wku.edu/theses/1262.

Full text
Abstract (summary):
Localization of Drosophila embryos in images is a fundamental step in an automatic computational system for the exploration of gene-gene interaction on Drosophila. Contour extraction of embryonic images is challenging due to many variations in embryonic images. In the thesis work, we develop a localization framework based on the analysis of connected components of edge pixels in a scale space. We propose criteria to select optimal scales for embryonic localization. Furthermore, we propose a scale mapping strategy to compress the range of a scale space in order to improve the efficiency of the localization framework. The effectiveness of the proposed framework and the scale mapping strategy are validated in our experiments.
27

Partin, Michael. "Scalable, Pluggable, and Fault Tolerant Multi-Modal Situational Awareness Data Stream Management Systems". Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1567073723628721.

Full text
28

Pradhan, Shameer Kumar. "Investigation of Event-Prediction in Time-Series Data : How to organize and process time-series data for event prediction?" Thesis, Högskolan Kristianstad, Fakulteten för naturvetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:hkr:diva-19416.

Full text
Abstract (summary):
The thesis determines the type of deep learning algorithms to compare for a particular dataset that contains time-series data. The research method includes a study of multiple literature sources and the execution of 12 tests. It deals with the organization and processing of the data so as to prepare it for the prediction of an event in the time series. It also includes an explanation of the algorithms selected. Similarly, it provides a detailed description of the steps taken for classification and prediction of the event. It includes multiple tests over varied timeframes in order to compare which algorithm provides better results in different timeframes. The comparison between the two selected deep learning algorithms identified that for shorter timeframes Convolutional Neural Networks perform better, and for longer timeframes Recurrent Neural Networks have higher accuracy on the provided dataset. Furthermore, it discusses possible improvements that can be made to the experiments and the research as a whole.
29

Santiago, Dionny. "A Model-Based AI-Driven Test Generation System". FIU Digital Commons, 2018. https://digitalcommons.fiu.edu/etd/3878.

Full text
Abstract (summary):
Achieving high software quality today involves manual analysis, test planning, documentation of testing strategy and test cases, and development of automated test scripts to support regression testing. This thesis is motivated by the opportunity to bridge the gap between current test automation and true test automation by investigating learning-based solutions to software testing. We present an approach that combines a trainable web component classifier, a test case description language, and a trainable test generation and execution system that can learn to generate new test cases. Training data was collected and hand-labeled across 7 systems, 95 web pages, and 17,360 elements. A total of 250 test flows were also manually hand-crafted for training purposes. Various machine learning algorithms were evaluated. Results showed that Random Forest classifiers performed well on several web component classification problems. In addition, Long Short-Term Memory neural networks were able to model and generate new valid test flows.
30

van Schaik, Sebastiaan Johannes. "A framework for processing correlated probabilistic data". Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:91aa418d-536e-472d-9089-39bef5f62e62.

Full text
Abstract (summary):
The amount of digitally-born data has surged in recent years. In many scenarios, this data is inherently uncertain (or: probabilistic), such as data originating from sensor networks, image and voice recognition, location detection, and automated web data extraction. Probabilistic data requires novel and different approaches to data mining and analysis, which explicitly account for the uncertainty and the correlations therein. This thesis introduces ENFrame: a framework for processing and mining correlated probabilistic data. Using this framework, it is possible to express both traditional and novel algorithms for data analysis in a special user language, without having to explicitly address the uncertainty of the data on which the algorithms operate. The framework will subsequently execute the algorithm on the probabilistic input, and perform exact or approximate parallel probability computation. During the probability computation, correlations and provenance are succinctly encoded using probabilistic events. This thesis contains novel contributions in several directions. An expressive user language – a subset of Python – is introduced, which allows a programmer to implement algorithms for probabilistic data without requiring knowledge of the underlying probabilistic model. Furthermore, an event language is presented, which is used for the probabilistic interpretation of the user program. The event language can succinctly encode arbitrary correlations using events, which are the probabilistic counterparts of deterministic user program variables. These highly interconnected events are stored in an event network, a probabilistic interpretation of the original user program. Multiple techniques for exact and approximate probability computation (with error guarantees) of such event networks are presented, as well as techniques for parallel computation. Adaptations of multiple existing data mining algorithms are shown to work in the framework, and are subsequently subjected to an extensive experimental evaluation. Additionally, a use-case is presented in which a probabilistic adaptation of a clustering algorithm is used to predict faults in energy distribution networks. Lastly, this thesis presents techniques for integrating a number of different probabilistic data formalisms for use in this framework and in other applications.
31

Vatandoust, Arman. "Machine Learning for Software Bug Categorization". Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-395253.

Full text
Abstract (summary):
The pursuit of flawless software is often an exhausting task for software developers. Code defects can range from soft issues to hard issues that lead to unforgiving consequences. DICE have their own system which automatically collects these defects and groups them into buckets; however, this system suffers from the flaw of sometimes incorrectly grouping unrelated issues and missing apparent duplicates. This time-consuming flaw places excessive work on software developers and leads to wasted resources in the company. These flaws also impact the data quality of the system's defect-tracking datasets, which turns into a never-ending vicious circle. In this thesis, we investigate a method of measuring the similarity between reports in order to reduce incorrectly grouped issues and duplicate reports. Prototype models have been built for bug categorization and bucketing using convolutional neural networks. For each report, the prototype is able to provide developers with candidates of related issues, with a likelihood metric indicating whether the issues are related. The similarity measurement is made in the representation phase of the neural networks, which we call the latent space. We also use Kullback–Leibler divergence in this space in order to get better similarity metrics. The results show important findings and insights for further improvement in the future. In addition to this, we discuss methods and strategies for detecting outliers using Mahalanobis distance in order to prevent incorrectly grouped reports.
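Mahalanobis-distance outlier detection, mentioned at the end of the abstract, can be sketched with SciPy on invented latent-space embeddings; reports far from their bucket's centre are candidate mis-groupings.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

# Invented latent-space embeddings of bug reports in one bucket.
rng = np.random.default_rng(0)
embeddings = rng.normal(0, 1, size=(200, 8))

mean = embeddings.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(embeddings, rowvar=False))

# A report far from the bucket centre is a candidate mis-grouping.
distances = np.array([mahalanobis(e, mean, inv_cov) for e in embeddings])
threshold = np.percentile(distances, 97.5)
print(np.where(distances > threshold)[0])  # indices of suspected outliers
```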
32

Khan, Mohammed Salman. "A Topic Modeling approach for Code Clone Detection". UNF Digital Commons, 2019. https://digitalcommons.unf.edu/etd/874.

Full text
Abstract (summary):
In this thesis work, the potential benefits of Latent Dirichlet Allocation (LDA) as a technique for code clone detection have been described. The objective is to propose a language-independent, effective, and scalable approach for identifying similar code fragments in relatively large software systems. The main assumption is that the latent topic structure of software artifacts gives an indication of the presence of code clones. It can be hypothesized that artifacts with similar topic distributions contain duplicated code fragments, and to prove this hypothesis, an experimental investigation using multiple datasets from various application domains was conducted. In addition, CloneTM, an LDA-based working prototype for code clone detection, was developed. Results showed that, if calibrated properly, topic modeling can deliver a satisfactory performance in capturing different types of code clones, showing particularly good performance in detecting Type III clones. CloneTM also achieved levels of performance comparable to already existing practical tools that adopt different clone detection strategies.
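A hedged sketch of the central idea (comparing topic distributions of code fragments), using scikit-learn's LDA rather than CloneTM itself, on invented token streams:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented token streams extracted from three code fragments.
fragments = [
    "open file read buffer close file return buffer",
    "open file read buffer close file return buffer length",
    "connect socket send packet receive ack close socket",
]

counts = CountVectorizer().fit_transform(fragments)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics = lda.fit_transform(counts)  # one topic distribution per fragment

# Fragments with similar topic distributions are clone candidates.
print(cosine_similarity(topics))
```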
33

Hickman, Björn, e Victor Holmqvist. "Predict future software defects through machine learning". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301864.

Full text
Abstract (summary):
The thesis aims to investigate the implications for project management of predicting software defects through machine learning. In addition, the study aims to examine what features of a code base are useful for making such predictions. The features examined are of both organisational and technical nature, indicated by previous studies to correlate with the introduction of software defects. The machine learning algorithms used in the study are Random forest, logistic regression and naive Bayes. The data was collected from an open source git-repository, VSCode, where the correct classifications of reported defects originated from GitHub Issues. The results of the study indicate that both technical features of a code base and organisational factors can be useful when predicting future software defects. All three algorithms showed similar performance. Furthermore, the ML models presented in this study show some promise as a complementary tool in project management decision making, more specifically decisions regarding planning, risk assessment and resource allocation. However, further studies in this area are of interest, in order to confirm the findings of this study and address its limitations.
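A hedged sketch of the three-model comparison described above, assuming scikit-learn and invented file-level features (e.g. churn, number of authors, complexity) labeled defective or not; the real study's features and evaluation protocol are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Invented per-file features: [lines changed, distinct authors, complexity].
rng = np.random.default_rng(0)
X = rng.random((300, 3))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(0, 0.2, 300) > 0.9).astype(int)

models = {"random forest": RandomForestClassifier(random_state=0),
          "logistic regression": LogisticRegression(),
          "naive Bayes": GaussianNB()}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```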
34

Watson, Cody. "Deep Learning In Software Engineering". W&M ScholarWorks, 2020. https://scholarworks.wm.edu/etd/1616444371.

Testo completo
Abstract (sommario):
Software evolves and therefore requires an evolving field of Software Engineering. The evolution of software can be seen on an individual project level through the software life cycle, as well as on a collective level, as we study the trends and uses of software in the real world. As the needs and requirements of users change, so must software evolve to reflect those changes. This cycle is never ending and has led to continuous and rapid development of software projects. More importantly, it has put a great responsibility on software engineers, causing them to adopt practices and tools that allow them to increase their efficiency. However, these tools suffer the same fate as software designed for the general population; they need to change in order to reflect the users' needs. Fortunately, the demand for this evolving software has given software engineers a plethora of data and artifacts to analyze. The challenge arises when attempting to identify and apply patterns learned from this vast amount of data. In this dissertation, we explore and develop techniques to take advantage of the wealth of software data and to aid developers in software development tasks. Specifically, we exploit deep learning to automatically learn patterns discovered within previous software data and automatically apply those patterns to present-day software development. We first set out to investigate the current impact of deep learning in software engineering by performing a systematic literature review of top-tier conferences and journals. This review provides guidelines and common pitfalls for researchers to consider when implementing DL (Deep Learning) approaches in SE (Software Engineering). In addition, the review provides a research road map for areas within SE where DL could be applicable. Our next piece of work developed an approach that simultaneously learned different representations of source code for the task of clone detection. We found that the use of multiple representations, such as identifiers, ASTs, CFGs and bytecode, can lead to the identification of similar code fragments. Through the use of deep learning strategies, we automatically learned these different representations without the requirement of hand-crafted features. Lastly, we designed a novel approach for automating the generation of assert statements through seq2seq learning, with the goal of increasing the efficiency of software testing. Given the test method and the context of the associated focal method, we automatically generated semantically and syntactically correct assert statements for unseen test methods. We demonstrate that the techniques presented in this dissertation provide a meaningful advancement to the field of software engineering and the automation of software development tasks. We provide analytical evaluations and empirical evidence that substantiate the impact of our findings and the usefulness of our approaches to the software engineering community.
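The assert-generation work rests on sequence-to-sequence learning. A minimal encoder-decoder in Keras, shown below, illustrates the training-time shape of such a model; the vocabulary size, layer dimensions and plain LSTM cells (rather than whatever attention or copy mechanisms the dissertation's model actually uses) are assumptions for illustration only.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

vocab_size, embed_dim, latent_dim = 5000, 128, 256

# Encoder: reads the tokenized test method plus focal-method context.
enc_in = layers.Input(shape=(None,), dtype="int32")
enc_emb = layers.Embedding(vocab_size, embed_dim, mask_zero=True)(enc_in)
_, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: predicts the assert-statement tokens one step at a time,
# conditioned on the encoder's final state.
dec_in = layers.Input(shape=(None,), dtype="int32")
dec_emb = layers.Embedding(vocab_size, embed_dim, mask_zero=True)(dec_in)
dec_seq = layers.LSTM(latent_dim, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
logits = layers.Dense(vocab_size)(dec_seq)

model = Model([enc_in, dec_in], logits)
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# Trained with teacher forcing: inputs are (context tokens, assert tokens
# shifted right); targets are the assert tokens themselves.
```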
Gli stili APA, Harvard, Vancouver, ISO e altri
35

Martin, Andrew Philip. "Machine-assisted theorem-proving for software engineering". Thesis, University of Oxford, 1994. http://ora.ox.ac.uk/objects/uuid:728d3cee-1dfe-4186-a49f-52b33cbc6551.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
36

Husseini, Orabi Ahmed. "Multi-Modal Technology for User Interface Analysis including Mental State Detection and Eye Tracking Analysis". Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36451.

Testo completo
Abstract (sommario):
We present a set of easy-to-use methods and tools to analyze human attention, behaviour, and physiological responses. A potential application of our work is evaluating user interfaces being used in a natural manner. Our approach is designed to be scalable and to work remotely on regular personal computers using inexpensive and noninvasive equipment. The data sources our tool processes are nonintrusive and captured from video, i.e. eye tracking and facial expressions. For video data retrieval, we use a basic webcam. We investigate combinations of observation modalities to detect and extract affective and mental states. Our tool provides a pipeline-based approach that 1) collects observational data, 2) incorporates and synchronizes the signal modalities mentioned above, 3) detects users' affective and mental states, 4) records user interaction with applications and pinpoints the parts of the screen users are looking at, and 5) analyzes and visualizes the results. We describe the design, implementation, and validation of a novel multimodal signal fusion engine, the Deep Temporal Credence Network (DTCN). The engine uses Deep Neural Networks to provide 1) a generative and probabilistic inference model, and 2) robust handling of multimodal data, such that its performance does not degrade due to the absence of some modalities. We report on the recognition accuracy of basic emotions for each modality. Then, we evaluate our engine in terms of its effectiveness at recognizing the six basic emotions and six mental states, namely agreeing, concentrating, disagreeing, interested, thinking, and unsure. Our principal contributions include 1) the implementation of a multimodal signal fusion engine, 2) real-time recognition of affective and primary mental states from nonintrusive and inexpensive modalities, and 3) novel mental-state-based visualization techniques: 3D heatmaps, 3D scanpaths, and widget heatmaps that find parts of the user interface where users are perhaps unsure, annoyed, frustrated, or satisfied.
Gli stili APA, Harvard, Vancouver, ISO e altri
37

Jonsson, Nicklas. "Ways to use Machine Learning approaches for software development". Thesis, Umeå universitet, Institutionen för datavetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-152812.

Testo completo
Abstract (sommario):
With machine learning, and deep learning in particular, entering many different fields, including software development, it can be hard to know where to begin searching for tools when one wants to apply machine learning to a problem. This thesis surveys some of the technologies available today for applying machine learning in applications, including cloud services, frameworks, and libraries, and presents three different implementation structures that can be used with these technologies for the problem of image classification.
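As an illustration of the "framework or library" route the thesis surveys, the sketch below builds an image classifier in TensorFlow/Keras by reusing a pretrained backbone. This is one plausible implementation structure, not the thesis's specific design; the input size, class count and data-loading line are assumptions.

```python
import tensorflow as tf

# Transfer learning: a pretrained backbone with a small task-specific head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False,
    weights="imagenet", pooling="avg")
base.trainable = False  # reuse ImageNet features; train only the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),  # 5 classes, assumed
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# train_ds would be a tf.data.Dataset of (image, label) batches, e.g. from
# tf.keras.utils.image_dataset_from_directory("photos", image_size=(160, 160))
# model.fit(train_ds, epochs=3)
```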
Gli stili APA, Harvard, Vancouver, ISO e altri
38

Sharma, Oliver. "Detecting worm mutations using machine learning". Thesis, University of Glasgow, 2008. http://theses.gla.ac.uk/469/.

Testo completo
Abstract (sommario):
Worms are malicious programs that spread over the Internet without human intervention. Since worms generally spread faster than humans can respond, the only viable defence is to automate their detection. Network intrusion detection systems typically detect worms by examining packet or flow logs for known signatures. Not only does this approach mean that new worms cannot be detected until the corresponding signatures are created, but also that mutations of known worms will remain undetected, because each mutation will usually have a different signature. The intuitive and seemingly most effective solution is to write more generic signatures, but this has been found to increase false alarm rates and is thus impractical. This dissertation investigates the feasibility of using machine learning to automatically detect mutations of known worms. First, it investigates whether Support Vector Machines can detect mutations of known worms. Support Vector Machines have been shown to be well suited to pattern recognition tasks such as text categorisation and hand-written digit recognition. Since detecting worms is effectively a pattern recognition problem, this work investigates how well Support Vector Machines perform at this task. The second part of this dissertation compares Support Vector Machines to other machine learning techniques in detecting worm mutations. Gaussian Processes, unlike Support Vector Machines, automatically return confidence values as part of their result. Since confidence values can be used to reduce false alarm rates, this dissertation determines how Gaussian Processes compare to Support Vector Machines in terms of detection accuracy. For further comparison, this work also compares Support Vector Machines to K-nearest neighbours, known for its simplicity and solid results in other domains. The third part of this dissertation investigates the automatic generation of training data. Classifier accuracy depends on good quality training data -- the wider the training data spectrum, the higher the classifier's accuracy. This dissertation describes the design and implementation of a worm mutation generator whose output is fed to the machine learning techniques as training data. This dissertation then evaluates whether the training data can be used to train classifiers of sufficiently high quality to detect worm mutations. The findings of this work demonstrate that Support Vector Machines can be used to detect worm mutations, and that the optimal configuration for detection of worm mutations is a linear kernel with unnormalised bi-gram frequency counts. Moreover, the results show that Gaussian Processes and Support Vector Machines exhibit similar accuracy on average in detecting worm mutations, while K-nearest neighbours consistently produces lower quality predictions. The generated worm mutations are shown to be of sufficiently high quality to serve as training data. Combined, the results demonstrate that machine learning is capable of accurately detecting mutations of known worms.
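The winning configuration reported above, a linear kernel over unnormalised bi-gram counts, maps directly onto a short scikit-learn pipeline. The toy payload strings and labels below are invented stand-ins for real packet data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Payloads rendered as strings; labels: 1 = worm (or mutation), 0 = benign.
payloads = ["GET /index.html HTTP/1.0", "\x90\x90\x90\x90/bin/sh",
            "POST /login HTTP/1.1",     "\x90\x90\x90\x91/bin/sh"]
labels = [0, 1, 0, 1]

# Character-level bi-gram counts, left unnormalised, fed to a linear SVM --
# matching the configuration the dissertation found optimal.
clf = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(2, 2), lowercase=False),
    LinearSVC(),
)
clf.fit(payloads, labels)
print(clf.predict(["\x90\x90\x90\x92/bin/sh"]))  # ideally flags the mutation
```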
Gli stili APA, Harvard, Vancouver, ISO e altri
39

Phadke, Amit Ashok. "Predicting open-source software quality using statistical and machine learning techniques". Master's thesis, Mississippi State : Mississippi State University, 2004. http://library.msstate.edu/etd/show.asp?etd=etd-11092004-105801.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
40

Forsberg, Fredrik, e Gonzalez Pierre Alvarez. "Unsupervised Machine Learning: An Investigation of Clustering Algorithms on a Small Dataset". Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-16300.

Testo completo
Abstract (sommario):
Context: With the rising popularity of machine learning, looking at its shortcomings is valuable in seeing how widely machine learning is applicable. Is it possible to apply clustering to a small dataset? Objectives: This thesis consists of a literature study, a survey and an experiment. It investigates how two different unsupervised machine learning algorithms, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and K-means, perform on a dataset gathered from a survey. Methods: We conducted a survey to establish statistically what most respondents chose, then applied clustering to the survey data to confirm whether the clusters exhibit the same patterns as the respondents' statistical choices. Results: It was possible to identify patterns with clustering algorithms using a small dataset. The literature study shows examples of both algorithms being used successfully. Conclusions: It is possible to see patterns using DBSCAN and K-means on a small dataset. The size of the dataset is not necessarily the only aspect to take into consideration; feature and parameter selection are equally important, since the algorithms need to be tuned and customized to the data.
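A minimal version of such an experiment, assuming a small matrix of numeric survey answers, can be run with scikit-learn; the answer values, eps and k below are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.preprocessing import StandardScaler

# A small "survey" dataset: each row is one respondent's numeric answers.
answers = np.array([
    [1, 5, 4], [2, 5, 5], [1, 4, 5],   # answer pattern A
    [5, 1, 1], [4, 2, 1], [5, 1, 2],   # answer pattern B
])
X = StandardScaler().fit_transform(answers)

# K-means needs k up front; DBSCAN instead needs eps/min_samples tuned.
print("k-means:", KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X))
print("DBSCAN: ", DBSCAN(eps=1.2, min_samples=2).fit_predict(X))
```

Note how the parameter choices differ between the two algorithms, which is exactly the tuning sensitivity the conclusions point at.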
Gli stili APA, Harvard, Vancouver, ISO e altri
41

Mayo, Quentin R. "Detection of Generalizable Clone Security Coding Bugs Using Graphs and Learning Algorithms". Thesis, University of North Texas, 2018. https://digital.library.unt.edu/ark:/67531/metadc1404548/.

Testo completo
Abstract (sommario):
This research methodology isolates coding properties and identifies the probability of security vulnerabilities using machine learning and historical data. Several approaches characterize the effectiveness of detecting security-related bugs that manifest as vulnerabilities, but none utilize vulnerability patch information. The main contribution of this research is a framework that analyzes LLVM Intermediate Representation code and merges core source code representations using source code properties. This research is beneficial because it allows source programs to be transformed into a graphical form from which users can extract specific code properties related to vulnerable functions. The result is an improved approach to detect, identify, and track software system vulnerabilities, based on a performance evaluation. The methodology uses historical function-level vulnerability information, unique feature extraction techniques, a novel code property graph, and learning algorithms to minimize the amount of end-user domain knowledge necessary to detect vulnerabilities in applications. The analysis shows approximately 99% precision and recall in detecting known vulnerabilities in the National Institute of Standards and Technology (NIST) Software Assurance Metrics and Tool Evaluation (SAMATE) project. Furthermore, 72% of the historical vulnerabilities in the OpenSSL testing environment were detected using a linear support vector classifier (SVC) model.
Gli stili APA, Harvard, Vancouver, ISO e altri
42

Kanneganti, Alekhya. "Using Ensemble Machine Learning Methods in Estimating Software Development Effort". Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20691.

Testo completo
Abstract (sommario):
Background: Software Development Effort Estimation is a process that focuses on estimating the effort required to develop a software project within a minimal budget. Estimating effort includes interpretation of the required manpower, resources, time and schedule. Project managers are responsible for estimating the required effort. A model that can predict software development effort efficiently comes in handy and acts as a decision support system for project managers, enhancing the precision of effort estimates. The context of this study is therefore to increase the efficiency of estimating software development effort. Objective: The main objective of this thesis is to identify an effective ensemble method and to build and implement it for estimating software development effort. In addition, parameter tuning is implemented to improve the performance of the model. Finally, we compare the results of the developed model with those of existing models. Method: In this thesis, we adopted two research methods. Initially, a literature review was conducted to gain knowledge of the existing studies, machine learning techniques, datasets and ensemble methods previously used in estimating software development effort. Then a controlled experiment was conducted in order to build an ensemble model and to evaluate its performance, determining whether the developed model performs better than the existing models. Results: After conducting the literature review and collecting evidence, we decided to build and implement a stacked generalization ensemble method, using individual machine learning techniques such as Support Vector Regressor (SVR), K-Nearest Neighbors Regressor (KNN), Decision Tree Regressor (DTR), Linear Regressor (LR), Multi-Layer Perceptron Regressor (MLP), Random Forest Regressor (RFR), Gradient Boosting Regressor (GBR), AdaBoost Regressor (ABR) and XGBoost Regressor (XGB). Likewise, we decided to implement Randomized Parameter Optimization and the SelectKBest function for feature selection. Datasets such as COCOMO81, MAXWELL, ALBRECHT and DESHARNAIS were used. Results of the experiment show that the developed ensemble model performs at its best for three out of four datasets. Conclusion: After evaluating and analyzing the results obtained, we can conclude that the developed model works well with datasets that have continuous, numeric values. We can also conclude that the developed ensemble model outperforms other existing models when implemented with the COCOMO81, MAXWELL and ALBRECHT datasets.
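A compressed sketch of the experiment's shape, stacked generalization over several of the listed base learners plus SelectKBest feature selection, is shown below. The synthetic data stands in for COCOMO81-style effort datasets, and the chosen subset of base learners and k=5 are assumptions, not the thesis's tuned configuration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

# Stand-in for an effort dataset (rows: projects, columns: cost drivers).
X, y = make_regression(n_samples=60, n_features=10, noise=0.1, random_state=0)

base_learners = [
    ("svr", SVR()),
    ("knn", KNeighborsRegressor()),
    ("rfr", RandomForestRegressor(random_state=0)),
    ("gbr", GradientBoostingRegressor(random_state=0)),
]
pipe = Pipeline([
    ("select", SelectKBest(f_regression, k=5)),          # feature selection
    ("stack", StackingRegressor(estimators=base_learners,
                                final_estimator=LinearRegression())),
])
# RandomizedSearchCV over this pipeline would add the thesis's randomized
# parameter optimization step.
pipe.fit(X, y)
print("R^2 on training data:", pipe.score(X, y))
```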
Gli stili APA, Harvard, Vancouver, ISO e altri
43

Kasianenko, Stanislav. "Predicting Software Defectiveness by Mining Software Repositories". Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-78729.

Testo completo
Abstract (sommario):
One of the important aims of the continuous software development process is to localize and remove all existing program bugs as fast as possible. This goal is closely related to software engineering and defectiveness estimation. Many big companies started to store source code in software repositories as the latter grew in popularity. These repositories usually include static source code as well as detailed data on defects in software units, which allows analyzing all the data without interrupting the programming process. The main problem with large, complex software is the impossibility of controlling everything manually, while the price of an error can be very high. This might result in developers missing defects at the testing stage and in increased maintenance cost. The general research goal is to find a way of predicting future software defectiveness with high precision. Reducing maintenance and development costs will contribute to reducing time-to-market and increasing software quality. To address the problem of estimating residual defects, an approach was found to predict the residual defectiveness of software by means of machine learning. As the primary machine learning algorithm, a regression decision tree was chosen as a simple and reliable solution. Data for this tree is extracted from a static source code repository and divided into two parts: software metrics and defect data. Software metrics are formed from the static code, and defect data is extracted from reported issues in the repository. In addition to already reported bugs, the data is augmented with unreported bugs found in the “discussions” section of the repository and parsed by a natural language processor. Metrics were filtered by applying a correlation algorithm to remove those unrelated to the defect data. The remaining metrics were weighted so that the most correlated combination could be used as a training set for the decision tree. As a result, the built decision tree model can forecast defectiveness with an 89% chance for the particular product. This experiment was conducted using a GitHub repository of a Java project and predicted the number of possible bugs in a single file (Java class). The experiment resulted in a designed method for predicting possible defectiveness from the static code of a single large (more than 1,000 files) software version.
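The core of the method, a correlation filter over software metrics followed by a regression decision tree, can be sketched as follows. The metric names, values, and the 0.5 correlation cut-off are hypothetical, not taken from the thesis.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Hypothetical per-file metrics mined from the repository, plus a defect
# count assembled from issues and parsed "discussions" posts.
df = pd.DataFrame({
    "loc":        [800, 120, 1500, 300, 90, 2000],
    "methods":    [40, 8, 70, 15, 5, 95],
    "complexity": [55, 6, 110, 20, 4, 140],
    "defects":    [7, 0, 12, 2, 0, 15],
})

# Correlation filter: keep only metrics clearly related to defect counts.
corr = df.drop(columns="defects").corrwith(df["defects"]).abs()
kept = corr[corr > 0.5].index.tolist()  # 0.5 cut-off is an assumption
print("metrics kept:", kept)

tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(df[kept], df["defects"])
print("predicted defects for first file:", tree.predict(df[kept][:1]))
```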
Gli stili APA, Harvard, Vancouver, ISO e altri
44

Zam, Anton. "Evaluating Distributed Machine Learning using IoT Devices". Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42388.

Testo completo
Abstract (sommario):
The Internet of Things is growing every year, with new devices being added all the time. Although some of the devices are continuously in use, a large share of them are mostly idle, sitting on untapped processing power that could be used for machine learning computations. There currently exist many methods to combine the processing power of multiple devices to compute machine learning tasks; these are often called distributed machine learning methods. The main focus of this thesis is to evaluate these distributed machine learning methods to see whether they can be implemented on IoT devices and, if so, to measure how efficient and scalable they are. The method chosen for implementation was called “MultiWorkerMirrorStrategy”, and it was evaluated by comparing the training time, training accuracy and evaluation accuracy of two, three and four Raspberry Pis against a non-distributed machine learning method on a single Raspberry Pi. The results showed that although the computational power increased with every added device, the training time increased while the other measurements stayed the same. After the results were analyzed and discussed, the conclusion was that the overhead added by communication between the devices was too high, making the method very inefficient; it would not scale without some form of optimization being added.
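The strategy named in the thesis appears to correspond to TensorFlow's tf.distribute.MultiWorkerMirroredStrategy. A minimal sketch of how each Raspberry Pi would run the same training script follows; the host names, ports, model and dataset are illustrative assumptions, and the script blocks until all listed workers join.

```python
import json
import os
import tensorflow as tf

# Each Raspberry Pi runs this same script; TF_CONFIG tells it who its
# peers are and which worker index it holds (index 0 shown here).
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": ["pi1.local:12345", "pi2.local:12345"]},
    "task": {"type": "worker", "index": 0},
})

strategy = tf.distribute.MultiWorkerMirroredStrategy()

with strategy.scope():  # variables are created mirrored across workers
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"])

(x, y), _ = tf.keras.datasets.mnist.load_data()
dataset = tf.data.Dataset.from_tensor_slices((x / 255.0, y)).batch(64)
model.fit(dataset, epochs=3)  # gradients are all-reduced after each batch
```

The all-reduce after every batch is exactly the communication overhead the thesis measured as the bottleneck on low-power devices.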
Gli stili APA, Harvard, Vancouver, ISO e altri
45

Lloyd, Katherine L. "Machine learning stratification for oncology patient survival". Thesis, University of Warwick, 2017. http://wrap.warwick.ac.uk/107703/.

Testo completo
Abstract (sommario):
Personalised medicine for cancer treatment promises benefits for patient survival and effective use of medical resources. This goal requires the development of predictive models for the identification and implementation of biomarkers for the prediction of patient survival given treatment options. This thesis addresses research questions in this area. The systematic review detailed in Chapter 2 investigates the literature concerning the prediction of resistance to chemotherapy for ovarian cancer patients using statistical methods and gene expression measurements. The range of models used by studies in the systematic review highlights the popularity of traditional models, such as Cox proportional hazards, with few more complex models being utilised. In Chapters 3 and 4, new methods are presented for modelling right-censored survival data. Due to the nature of biomedical data, the methods used need to be flexible and adequately account for high dimensional, noisy data. Gaussian processes fulfil these requirements and were hence used for the development of three Gaussian process models for right-censored survival data. Chapter 3 details these models, and they are applied to synthetic and cancer data in Chapter 4. In all cases the Gaussian processes for survival were found to equal or outperform all comparison models, as measured by concordance index. Given the application to molecular cancer data, it was expected that the data would be high dimensional. Two feature selection methods are investigated in Chapter 5 for use with Gaussian processes to address this. In Chapter 6 a program is developed for the analysis of the data produced by a test for cancer mutations using qPCR. The automated program was designed to isolate the analysis from the user and produce results and reports for clinical use. It is observed that this approach of automated analysis would be suitable for application to any form of clinical test or complex predictive model without the requirement of user guidance.
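All model comparisons in the thesis use the concordance index for right-censored survival data. A small self-contained implementation is sketched below; the risk scores and toy example are illustrative, not the thesis's data.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Fraction of comparable patient pairs ordered correctly by risk.

    A pair (i, j) is comparable when i's event is observed and occurs
    before j's (event or censoring) time; it is concordant when i was
    assigned the higher predicted risk. Ties in risk count as 0.5.
    """
    time, event, risk = map(np.asarray, (time, event, risk))
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy check: a perfectly ordered model scores 1.0.
print(concordance_index(time=[2, 4, 6], event=[1, 1, 0], risk=[0.9, 0.5, 0.1]))
```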
Gli stili APA, Harvard, Vancouver, ISO e altri
46

Bakhshi, Taimur. "User-centric traffic engineering in software defined networks". Thesis, University of Plymouth, 2017. http://hdl.handle.net/10026.1/8202.

Testo completo
Abstract (sommario):
Software defined networking (SDN) is a relatively new paradigm that decouples individual network elements from the control logic, offering real-time network programmability by translating high-level policy abstractions into low-level device configurations. The framework comprises the data (forwarding) plane incorporating network devices, while the control logic and network services reside in the control and application planes respectively. Operators can optimize the network fabric to yield performance gains for individual applications and services utilizing flow metering and application-awareness, the default traffic management method in SDN. Existing approaches to traffic optimization, however, do not explicitly consider user application trends. Recent SDN traffic engineering designs either offer improvements for typical time-critical applications or focus on devising monitoring solutions aimed at measuring performance metrics of the respective services. The performance cost of isolated service differentiation for end users may be substantial, considering the growth in Internet and network applications on offer and the resulting diversity in user activities. Application-level flow metering schemes therefore fall short of fully exploiting the real-time network provisioning capability offered by SDN, relying instead on the rather static traffic control primitives common in legacy networking. For individual users, SDN may lead to substantial improvements if the framework allows operators to allocate resources while accounting for a user-centric mix of applications. This thesis explores user application traffic trends in different network environments and proposes a novel user traffic profiling framework to aid the SDN control plane (controller) in accurately configuring network elements for a broad spectrum of users without impeding specific application requirements. The thesis starts with a critical review of existing traffic engineering solutions in SDN and highlights recent and ongoing work in network optimization studies. The predominant segregated, application-policy-based controls in SDN do not consider the cost of isolated application gains on parallel SDN services and the resulting consequences for users with varying application usage. Therefore, attention is given to investigating techniques which may capture user behaviour for possible integration in SDN traffic controls. To this end, profiling of user application traffic trends is identified as a technique which may offer insight into the inherent diversity in user activities and allow possible incorporation in SDN-based traffic engineering. A series of subsequent user traffic profiling studies are carried out in this regard, employing network flow statistics collected from residential and enterprise network environments. Utilizing machine learning techniques, including the prominent unsupervised k-means cluster analysis, user-generated traffic flows are cluster analysed, and the derived profiles in each networking environment are benchmarked for stability before integration in SDN control solutions. In parallel, a novel flow-based traffic classifier is designed to yield high accuracy in identifying user application flows, and the traffic profiling mechanism is automated. The core functions of the novel user-centric traffic engineering solution are validated by the implementation of traffic-profiling-based SDN network control applications in residential, data center and campus-based SDN environments.
A series of simulations highlighting varying traffic conditions and profile-based policy controls is designed and evaluated in each network setting, using the traffic profiles derived from realistic environments to demonstrate the effectiveness of the traffic management solution. The overall network performance metrics per profile show substantive gains, proportional to operator-defined user profile prioritization policies, despite high traffic load conditions. The proposed user-centric SDN traffic engineering framework therefore dynamically provisions data plane resources among different user traffic classes (profiles), capturing user behaviour to define and implement network policy controls, going beyond isolated application management.
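The profiling step, clustering per-user flow statistics with k-means and checking profile stability, might look like the following sketch. The four flow features and their values are invented for illustration; the thesis's actual feature set and stability benchmark are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical per-user flow statistics aggregated over one day:
# [video_bytes, web_bytes, gaming_bytes, mean_flow_duration_s]
flows = np.array([
    [9e9, 1e8, 1e6, 300], [8e9, 2e8, 2e6, 280],   # streaming-heavy users
    [1e8, 3e9, 1e6, 40],  [2e8, 2.5e9, 3e6, 35],  # browsing-heavy users
    [5e7, 1e8, 4e9, 120], [6e7, 9e7, 5e9, 110],   # gaming-heavy users
])
X = StandardScaler().fit_transform(flows)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("profile per user:", km.labels_)
print("cluster-quality proxy (silhouette):", silhouette_score(X, km.labels_))
# Each cluster centroid becomes a user profile the SDN controller can map
# to a provisioning policy (e.g. queue weights per profile).
```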
Gli stili APA, Harvard, Vancouver, ISO e altri
47

Niu, Fei. "Learning-based Software Testing using Symbolic Constraint Solving Methods". Licentiate thesis, KTH, Teoretisk datalogi, TCS, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-41932.

Testo completo
Abstract (sommario):
Software testing remains one of the most important but expensive approaches to ensuring high-quality software today. In order to reduce the cost of testing, over the last several decades, various techniques such as formal verification and inductive learning have been used for test automation in previous research. In this thesis, we present a specification-based black-box testing approach, learning-based testing (LBT), which is suitable for a wide range of systems, e.g. procedural and reactive systems. In the LBT architecture, given the requirement specification of a system under test (SUT), a large number of high-quality test cases can be iteratively generated, executed and evaluated by combining inductive learning with constraint solving. We apply LBT to two types of systems, namely procedural and reactive systems. We specify a procedural system in Hoare logic and model it as a set of piecewise polynomials that can be locally and incrementally inferred. To automate test case generation (TCG), we use a quantifier elimination method, the Hoon-Collins cylindric algebraic decomposition (CAD), which is applied to only one local model (a bounded polynomial) at a time. On the other hand, a reactive system is specified in temporal logic formulas and modeled as an extended Mealy automaton over abstract data types (EMA) that can be incrementally learned as a complete term rewriting system (TRS) using the congruence generator extension (CGE) algorithm. We consider TCG for a reactive system as a bounded model checking problem, which can be further reformulated into a disunification problem and solved by narrowing. The performance of the LBT frameworks is empirically evaluated against random testing for both procedural and reactive systems (executable models and programs). The results show that LBT is significantly more efficient than random testing in fault detection, i.e. fewer test cases and potentially less time are required than for random testing.
Gli stili APA, Harvard, Vancouver, ISO e altri
48

Allamanis, Miltiadis. "Learning natural coding conventions". Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/28791.

Testo completo
Abstract (sommario):
Coding conventions are ubiquitous in software engineering practice. Maintaining a uniform coding style allows software development teams to communicate through code by making the code clear and, thus, readable and maintainable—two important properties of good code since developers spend the majority of their time maintaining software systems. This dissertation introduces a set of probabilistic machine learning models of source code that learn coding conventions directly from source code written in a mostly conventional style. This alleviates the coding convention enforcement problem, where conventions need to first be formulated clearly into unambiguous rules and then be coded in order to be enforced; a tedious and costly process. First, we introduce the problem of inferring a variable’s name given its usage context and address this problem by creating Naturalize — a machine learning framework that learns to suggest conventional variable names. Two machine learning models, a simple n-gram language model and a specialized neural log-bilinear context model, are trained to understand the role and function of each variable and suggest new stylistically consistent variable names. The neural log-bilinear model can even suggest previously unseen names by composing them from subtokens (i.e. sub-components of code identifiers). The suggestions of the models achieve 90% accuracy when suggesting variable names at the top 20% most confident locations, rendering the suggestion system usable in practice. We then turn our attention to the significantly harder method naming problem. Learning to name methods, by looking only at the code tokens within their body, requires a good understanding of the semantics of the code contained in a single method. To achieve this, we introduce a novel neural convolutional attention network that learns to generate the name of a method by sequentially predicting its subtokens. This is achieved by focusing on different parts of the code and potentially directly using body (sub)tokens even when they have never been seen before. This model achieves an F1 score of 51% on the top five suggestions when naming methods of real-world open-source projects. Learning about naming code conventions uses the syntactic structure of the code to infer names that implicitly relate to code semantics. However, syntactic similarities and differences obscure code semantics. Therefore, to capture features of semantic operations with machine learning, we need methods that learn semantic continuous logical representations. To achieve this ambitious goal, we focus our investigation on logic and algebraic symbolic expressions and design a neural equivalence network architecture that learns semantic vector representations of expressions in a syntax-driven way, while solely retaining semantics. We show that equivalence networks learn significantly better semantic vector representations compared to other, existing, neural network architectures. Finally, we present an unsupervised machine learning model for mining syntactic and semantic code idioms. Code idioms are conventional “mental chunks” of code that serve a single semantic purpose and are commonly used by practitioners. To achieve this, we employ Bayesian nonparametric inference on tree substitution grammars.
We present a wide range of evidence that the resulting syntactic idioms are meaningful, demonstrating that they do indeed recur across software projects and that they occur more frequently in illustrative code examples collected from a Q&A site. These syntactic idioms can be used as a form of automatic documentation of the coding practices of a programming language or an API. We also mine semantic loop idioms, i.e. highly abstracted but semantics-preserving idioms of loop operations. We show that semantic idioms provide data-driven guidance during the creation of software engineering tools by mining common semantic patterns, such as candidate refactoring locations. This gives tool, API and language designers data-based evidence about general, domain-specific and project-specific coding patterns; instead of relying solely on their intuition, they can use semantic idioms to achieve greater coverage of their tool or new API or language feature. We demonstrate this by creating a tool that suggests loop refactorings into functional constructs in LINQ. Semantic loop idioms also provide data-driven evidence for introducing new APIs or programming language features.
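The n-gram language model behind Naturalize scores candidate names by how natural the surrounding token stream becomes under each choice. A toy trigram version is sketched below; the training corpus, Laplace smoothing and candidate names are illustrative assumptions, far simpler than the dissertation's models.

```python
from collections import Counter

# Tiny trigram model over tokenized code written in the project's style.
corpus = "for int i = 0 ; i < count ; i ++ { total += items [ i ] ; }".split()
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(set(corpus))  # vocabulary size, for Laplace smoothing

def naturalness(tokens):
    """Product of smoothed trigram probabilities over a token stream."""
    p = 1.0
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        p *= (trigrams[(a, b, c)] + 1) / (bigrams[(a, b)] + V)
    return p

# Rank candidate names by how conventional the surrounding code becomes:
# the name the project's style favours gets the highest score.
context = "for int {v} = 0 ; {v} < count ; {v} ++"
for name in ["i", "idx", "loopCounterValue"]:
    print(name, naturalness(context.format(v=name).split()))
```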
Gli stili APA, Harvard, Vancouver, ISO e altri
49

Andersson, Robin. "CAN-bus Multi-mixed IDS : A combinatory approach for intrusion detection in the controller area network of personal vehicles". Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-43450.

Testo completo
Abstract (sommario):
With the digitalization and increasing computerization of personal vehicles, new attack surfaces are introduced, challenging the security of the in-vehicle network. No computer system can ever be fully secured, nor can every method of attack be learned in advance in order to prevent a break-in. Instead, with sophisticated methods, we can focus on detecting and preventing attacks from being performed inside a system. The current state of the art of such methods, intrusion detection systems (IDS), is divided into two main approaches. One approach produces models that are very confident in detecting malicious activity, but only activity that the model has previously learned. The second approach is very good at constructing models for detecting any type of malicious activity, even activity never studied by the model before, but with less confidence. In this thesis, a new approach is suggested, with a redesigned architecture for an intrusion detection system called Multi-mixed IDS. It takes a middle ground between the two standard approaches, seeking a combination that keeps the strengths of both sides while eliminating their weaknesses. This thesis aims to deliver a proof of concept for a new approach to the current state of the art in the CAN-bus security research field. The thesis also provides background knowledge about CAN and intrusion detection systems, discussing their strengths and weaknesses in further detail, together with a brief overview of a handpicked selection of research contributions from the field. Further, a simple architecture is suggested, and three individual detection models are trained, combined, and tested against a CAN-bus dataset. Finally, the results are examined and evaluated. The suggested approach shows somewhat poor results compared to other algorithms proposed within the field. However, it also shows good potential, if better decision methods can be found for combining the individual algorithms that constitute the model.
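A minimal decision-layer sketch of the middle-ground idea, combining a signature rule with an anomaly detector, is shown below. The CAN frames, the blacklist rule, the use of scikit-learn's IsolationForest and the simple OR policy are all invented for illustration; better combination policies are exactly where the thesis sees room for improvement.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical CAN frames: (arbitration_id, first payload byte, inter-arrival ms)
frames = np.array([
    [0x100, 0x10, 10.1], [0x100, 0x11, 9.8], [0x200, 0x40, 20.3],
    [0x100, 0x12, 10.0], [0x200, 0x41, 19.9], [0x7FF, 0xFF, 0.2],
])

# Signature side: exact rules for known-bad traffic (a toy blacklist).
def signature_alarm(frame):
    return frame[0] == 0x7FF  # assumed known-malicious arbitration ID

# Anomaly side: a model of "normal" traffic flags statistical outliers.
anomaly = IsolationForest(contamination=0.05, random_state=0).fit(frames[:5])

# Decision layer: alert if either detector fires; majority voting or
# confidence weighting would be natural alternatives.
for frame in frames:
    sig = signature_alarm(frame)
    ano = anomaly.predict(frame.reshape(1, -1))[0] == -1
    print(int(frame[0]), "ALERT" if (sig or ano) else "ok")
```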
Gli stili APA, Harvard, Vancouver, ISO e altri
50

Andersson, Robin. "Combining Anomaly- and Signaturebased Algorithms for IntrusionDetection in CAN-bus : A suggested approach for building precise and adaptiveintrusion detection systems to controller area networks". Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-43450.

Testo completo
Abstract (sommario):
With the digitalization and increasing computerization of personal vehicles, new attack surfaces are introduced, challenging the security of the in-vehicle network. No computer system can ever be fully secured, nor can every method of attack be learned in advance in order to prevent a break-in. Instead, with sophisticated methods, we can focus on detecting and preventing attacks from being performed inside a system. The current state of the art of such methods, intrusion detection systems (IDS), is divided into two main approaches. One approach produces models that are very confident in detecting malicious activity, but only activity that the model has previously learned. The second approach is very good at constructing models for detecting any type of malicious activity, even activity never studied by the model before, but with less confidence. In this thesis, a new approach is suggested, with a redesigned architecture for an intrusion detection system called Multi-mixed IDS. It takes a middle ground between the two standard approaches, seeking a combination that keeps the strengths of both sides while eliminating their weaknesses. This thesis aims to deliver a proof of concept for a new approach to the current state of the art in the CAN-bus security research field. The thesis also provides background knowledge about CAN and intrusion detection systems, discussing their strengths and weaknesses in further detail, together with a brief overview of a handpicked selection of research contributions from the field. Further, a simple architecture is suggested, and three individual detection models are trained, combined, and tested against a CAN-bus dataset. Finally, the results are examined and evaluated. The suggested approach shows somewhat poor results compared to other algorithms proposed within the field. However, it also shows good potential, if better decision methods can be found for combining the individual algorithms that constitute the model.
Gli stili APA, Harvard, Vancouver, ISO e altri