
Dissertations / Theses on the topic 'Software classification'

Consult the top 50 dissertations / theses for your research on the topic 'Software classification.'


1

Wang, Hui. "Software Defects Classification Prediction Based On Mining Software Repository." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-216554.

Abstract:
An important goal during the software development cycle is to find and fix existing defects as early as possible. This has much to do with software defect prediction and management. Nowadays, many big software development companies have their own development repository, which typically includes a version control system and a bug tracking system. This has no doubt proved useful for software defect prediction. Since the 1990s researchers have been mining software repositories to get a deeper understanding of the data, and as a result they have come up with a number of software defect prediction models over the past few years. These prediction models fall into two basic categories. One category predicts how many defects still exist based on the defect data already captured in the earlier stages of the software life-cycle. The other category predicts how many defects a newer version of the software will contain based on the defect data of earlier versions. The complexities of software development raise many issues related to software defects, and these issues have to be considered as far as possible to get precise prediction results, which makes the modeling more complex. This thesis presents the current research status of software defect classification prediction and the key techniques in this area, including software metrics, classifiers, data pre-processing and the evaluation of prediction results. We then propose a way to predict software defect classification based on mining a software repository. A method is described for collecting all defects recorded during development from the Eclipse version control system and mapping them to the defect information contained in the bug tracking system, in order to obtain statistical information about the defects. The Eclipse metrics plug-in is then used to obtain software metrics for the files and packages that contain defects. After analysing and pre-processing the dataset, the statistical tool R is used to build prediction models on the training dataset, predict software defect classification at different levels on the testing dataset, evaluate the performance of the models and compare the different models' performance.
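For a concrete picture of the final modelling step, the following is a minimal sketch of a metrics-based defect classifier in Python (the thesis itself used R); the CSV file, column names and the random-forest choice are illustrative assumptions rather than the thesis' actual setup.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Assumed layout: one row per file/package with mined metrics and a defect label.
data = pd.read_csv("eclipse_file_metrics.csv")          # hypothetical file name
features = ["loc", "cyclomatic_complexity", "num_methods", "fan_out", "prior_changes"]
X, y = data[features], data["has_defect"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)                             # train on historical defect data
print(classification_report(y_test, model.predict(X_test)))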
2

Dijkstra, Semme Josua. "Software tools developed for seafloor classification." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape4/PQDD_0031/NQ62170.pdf.

3

Konda, Swetha Reddy. "Classification of software components based on clustering." Morgantown, W. Va. : [West Virginia University Libraries], 2007. https://eidr.wvu.edu/etd/documentdata.eTD?documentid=5510.

Abstract:
Thesis (M.S.)--West Virginia University, 2007. Title from document title page. Document formatted into pages; contains vi, 59 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 57-59).
4

Lester, Neil. "Assisting the software reuse process through classification and retrieval of software models." Thesis, University of Ulster, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.311531.

5

Henningsson, Kennet. "A Fault Classification Approach to Software Process Improvement." Licentiate thesis, Karlskrona : Blekinge Institute of Technology [Blekinge tekniska högskola], 2005. http://www.bth.se/fou/Forskinfo.nsf/allfirst2/2b9d5998e26ed1b2c12571230047386b?OpenDocument.

6

Lamont, Morné Michael Connell. "Binary classification trees : a comparison with popular classification methods in statistics using different software." Thesis, Stellenbosch : Stellenbosch University, 2002. http://hdl.handle.net/10019.1/52718.

Abstract:
Thesis (MComm) -- Stellenbosch University, 2002. Consider a data set with a categorical response variable and a set of explanatory variables. The response variable can have two or more categories and the explanatory variables can be numerical or categorical. This is a typical setup for a classification analysis, where we want to model the response based on the explanatory variables. Traditional statistical methods have been developed under certain assumptions, such as: the explanatory variables are numeric only and/or the data follow a multivariate normal distribution. In practice such assumptions are not always met. Different research fields generate data that have a mixed structure (categorical and numeric) and researchers are often interested in using all these data in the analysis. In recent years robust methods such as classification trees have become the substitute for traditional statistical methods when the above assumptions are violated. Classification trees are not only an effective classification method, but offer many other advantages. The aim of this thesis is to highlight the advantages of classification trees. In the chapters that follow, the theory of and further developments on classification trees are discussed. This forms the foundation for the CART software which is discussed in Chapter 5, as well as other software in which classification tree modeling is possible. We compare classification trees to parametric, kernel and k-nearest-neighbour discriminant analyses. A neural network is also compared to classification trees, and finally we draw some conclusions on classification trees and their comparison with other methods.
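As a small illustration of the kind of comparison the thesis performs, the sketch below fits a classification tree and a k-nearest-neighbour classifier on a bundled scikit-learn dataset; the dataset and parameter choices are placeholders, not the data analysed in the thesis.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)    # stand-in data set, numeric features only

models = [
    ("classification tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("k-nearest neighbours", KNeighborsClassifier(n_neighbors=5)),
]
for name, model in models:
    scores = cross_val_score(model, X, y, cv=10)        # 10-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")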
7

Williams, Byron Joseph. "A FRAMEWORK FOR ASSESSING THE IMPACT OF SOFTWARE CHANGES TO SOFTWARE ARCHITECTURE USING CHANGE CLASSIFICATION." MSSTATE, 2006. http://sun.library.msstate.edu/ETD-db/theses/available/etd-04172006-150444/.

Abstract:
Software developers must produce software that can be changed without the risk of degrading the software architecture. One way to address software changes is to classify their causes and effects. A software change classification mechanism allows engineers to develop a common approach for handling changes. This information can be used to show the potential impact of the change. The goal of this research is to develop a change classification scheme that can be used to address causes of architectural degradation. This scheme can be used to model the effects of changes to software architecture. This research also presents a study of the initial architecture change classification scheme. The results of the study indicated that the classification scheme was easy to use and provided some benefit to developers. In addition, the results provided some evidence that changes of different types (in this classification scheme) required different amounts of effort to implement.
8

Graham, Martin. "Visualising multiple overlapping classification hierarchies." Thesis, Edinburgh Napier University, 2001. http://researchrepository.napier.ac.uk/Output/2430.

Abstract:
The revision or reorganisation of hierarchical data sets can result in many possible hierarchical classifications composed of the same or overlapping data sets existing in parallel with each other. These data sets are difficult for people to handle and conceptualise, as they try to reconcile the different perspectives and structures that such data represents. One area where this situation occurs is the study of botanical taxonomy, essentially the classification and naming of plants. Revisions, new discoveries and new dimensions for classifying plants lead to a proliferation of classifications over the same set of plant data. Taxonomists would like a method of exploring these multiple overlapping hierarchies for interesting information, correlations, or anomalies. The application and extension of Information Visualisation (IV) techniques, the graphical display of abstract information, is put forward as a solution to this problem. Displaying the multiple classification hierarchies in a visually appealing manner along with powerful interaction mechanisms for examination and exploration of the data allows taxonomists to unearth previously hidden information. This visualisation gives detail that previous visualisations and statistical overviews cannot offer. This thesis work has extended previous IV work in several respects to achieve this goal. Compact, yet full and unambiguous, hierarchy visualisations have been developed. Linking and brushing techniques have been extended to work on a higher class of structure, namely overlapping trees and hierarchies. Focus and context techniques have been pushed to achieve new effects across the visually distinct representations of these multiple hierarchies. Other data types, such as multidimensional data and large cluster hierarchies have also been displayed using the final version of the visualisation.
9

Lee, Kee Khoon. "Interpretable classification model for automotive material fatigue." Thesis, University of Southampton, 2002. https://eprints.soton.ac.uk/361578/.

10

Manley, Gary W. "The classification and evaluation of Computer-Aided Software Engineering tools." Thesis, Monterey, California: Naval Postgraduate School, 1990. http://hdl.handle.net/10945/34910.

Abstract:
Approved for public release; distribution unlimited. The use of Computer-Aided Software Engineering (CASE) tools has been viewed as a remedy for the software development crisis by achieving improved productivity and system quality via the automation of all or part of the software engineering process. The proliferation and tremendous variety of tools available have stretched the understanding of experienced practitioners and have had a profound impact on the software engineering process itself. Understanding what a tool does and comparing it to similar tools is a formidable task given the existing diversity of functionality. This thesis investigates what tools are available, proposes a general classification scheme to assist those investigating tools in deciding where a tool falls within the software engineering process, and identifies a tool's capabilities and limitations. This thesis also provides guidance for the evaluation of a tool and evaluates three commercially available tools.
11

Fong, Vivian Lin. "Software Requirements Classification Using Word Embeddings and Convolutional Neural Networks." DigitalCommons@CalPoly, 2018. https://digitalcommons.calpoly.edu/theses/1851.

Abstract:
Software requirements classification, the practice of categorizing requirements by their type or purpose, can improve organization and transparency in the requirements engineering process and thus promote requirement fulfillment and software project completion. Automating requirements classification is a prominent area of research, as automation can alleviate the tediousness of manual labeling and reduce its reliance on domain expertise. This thesis explores the application of deep learning techniques to software requirements classification, specifically the use of word embeddings for document representation when training a convolutional neural network (CNN). As past research endeavors mainly utilize information retrieval and traditional machine learning techniques, we examine the potential of deep learning on this particular task. With the support of learning libraries such as TensorFlow and Scikit-Learn and word embedding models such as word2vec and fastText, we build a Python system that trains and validates configurations of Naïve Bayes and CNN requirements classifiers. Applying our system to a suite of experiments on two well-studied requirements datasets, we recreate or establish the Naïve Bayes baselines and evaluate the impact of CNNs equipped with word embeddings trained from scratch versus word embeddings pre-trained on Big Data.
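The sketch below shows the general shape of such a setup in Python: a TF-IDF Naïve Bayes baseline next to a small embedding-plus-CNN text classifier built with TensorFlow/Keras. The tiny requirement strings, labels and layer sizes are invented for illustration and do not reproduce the thesis' architecture or datasets.

import tensorflow as tf
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

requirements = ["The system shall encrypt stored passwords.",
                "The UI must respond within 2 seconds."]
labels = [1, 0]   # toy labels, e.g. 1 = security requirement, 0 = performance

# Naive Bayes baseline on TF-IDF features
nb = MultinomialNB().fit(TfidfVectorizer().fit_transform(requirements), labels)

# CNN over token embeddings (pre-trained word2vec/fastText vectors could seed the Embedding layer)
vectorizer = tf.keras.layers.TextVectorization(max_tokens=5000, output_sequence_length=30)
vectorizer.adapt(tf.constant(requirements))

model = tf.keras.Sequential([
    vectorizer,
    tf.keras.layers.Embedding(input_dim=5000, output_dim=100),
    tf.keras.layers.Conv1D(filters=64, kernel_size=3, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(tf.constant(requirements), tf.constant(labels), epochs=3, verbose=0)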
12

Walia, Gursimran Singh. "Empirical Validation of Requirement Error Abstraction and Classification: A Multidisciplinary Approach." MSSTATE, 2006. http://sun.library.msstate.edu/ETD-db/theses/available/etd-05152006-151903/.

Abstract:
Software quality and reliability is a primary concern for successful development organizations. Over the years, researchers have focused on monitoring and controlling quality throughout the software process by helping developers to detect as many faults as possible using different fault-based techniques. This thesis analyzed the software quality problem from a different perspective by taking a step back from faults to abstract the fundamental causes of faults. The first step in this direction is developing a process for abstracting errors from faults throughout the software process. I have described the error abstraction process (EAP) and used it to develop an error taxonomy for the requirement stage. This thesis presents the results of a study which uses techniques based on the error abstraction process and investigates their application to requirement documents. The initial results show promise and provide some useful insights. These results are important for our further investigation.
13

Britto, Ricardo. "Knowledge Classification for Supporting Effort Estimation in Global Software Engineering Projects." Licentiate thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-10520.

Abstract:
Background: Global Software Engineering (GSE) has become a widely applied operational model for the development of software systems; it can increase profits and decrease time-to-market. However, there are many challenges associated with the development of software in a globally distributed fashion. There is evidence that these challenges affect many processes related to software development, such as effort estimation. To the best of our knowledge, there are no empirical studies gathering evidence on effort estimation in the GSE context. In addition, there is no common terminology for classifying GSE scenarios with a focus on effort estimation. Objective: The main objective of this thesis is to support effort estimation in the GSE context by providing a taxonomy to classify the existing knowledge in this field. Method: A systematic literature review (to identify and analyze the state of the art), a survey (to identify and analyze the state of the practice), a systematic mapping study (to identify practices for designing software engineering taxonomies), and a literature survey (to complement the states of the art and practice) were the methods employed in this thesis. Results: The results on the states of the art and practice show that the effort estimation techniques employed in the GSE context are the same techniques used in the collocated context. It was also identified that global aspects, e.g. temporal, geographical and socio-cultural distances, are accounted for as cost drivers, although it is not clear how they are measured. As a result of the conducted mapping study, we reported a method that can be used to design new SE taxonomies. The aforementioned results were combined to extend and specialize an existing GSE taxonomy so that it is suitable for effort estimation. The usage of the specialized GSE effort estimation taxonomy was illustrated by classifying 8 finished GSE projects. The results show that the specialized taxonomy proposed in this thesis is comprehensive enough to classify GSE projects with a focus on effort estimation. Conclusions: The taxonomy presented in this thesis will help researchers and practitioners to report new research on effort estimation in the GSE context; researchers and practitioners will be able to gather evidence, compare new studies and find new gaps more easily. The findings from this thesis show that more research must be conducted on effort estimation in the GSE context. For example, the way the cost drivers are measured should be further investigated. It is also necessary to conduct further research to clarify the role and impact of sourcing strategies on the accuracy of effort estimates. Finally, we believe that it is possible to design an instrument based on the specialized GSE effort estimation taxonomy that helps practitioners to perform the effort estimation process in a way tailored to the specific needs of the GSE context.
14

Tahir, Muhammad Atif. "Hardware and software solutions for prostate cancer classification using multispectral images." Thesis, Queen's University Belfast, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.431457.

15

Nguyen, Victor Allen. "A Simplified Faceted Approach To Information Retrieval for Reusable Software Classification." NSUWorks, 1998. http://nsuworks.nova.edu/gscis_etd/749.

Abstract:
Software reuse is widely recognized as the most promising technique presently available for reducing the cost of software production. It is the adaptation or incorporation of previously developed software components, designs or other software-related artifacts (i.e. test plans) into new software or software development regimes. Researchers and vendors are doubling their efforts and devoting their time primarily to the topic of software reuse. Most have focused on mechanisms to construct reusable software, but few have focused on the problem of discovering components or designs to meet specific needs. In order for software reuse to be successful, it must be perceived to be less costly to discover a software component or related artifact that satisfies a given need than to develop one anew. As a result, this study describes a method to classify software components that meet a specified need. Specifically, the purpose of the present research study is to provide a flexible system, comprising a classification scheme and a searcher system, entitled Guides-Search, in which processes can be retrieved by carrying out a structured dialogue with the user. The classification scheme provides both the structure of the questions to be posed to the user and the set of possible answers to each question. The model is not an attempt to replace current structures; rather, it seeks to provide a conceptual and structural method to support the improvement of software reuse methodology. The investigation focuses on the following goals and objectives for the classification scheme and searcher system: the classification will be flexible and extensible, but usable by the Searcher; the user will not be presented with a large number of questions; and the user will never be required to answer a question not known to be germane to the query.
16

Worthy, Paul James. "Investigation of artificial neural networks for forecasting and classification." Thesis, City University London, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.264247.

17

Winter, Mark J. "Knowledge refinement in constraint satisfaction and case classification problems." Thesis, University of Aberdeen, 1999. http://digitool.abdn.ac.uk/R?func=search-advanced-go&find_code1=WSN&request1=AAIU106810.

Abstract:
Knowledge-Base Refinement (KBR) systems are an attempt to help with the difficulties of detecting and correcting errors in a knowledge-base. This thesis investigates knowledge-base refinement within the two problem-solving paradigms of Case-Based Reasoning and Constraint-Based Reasoning. Case-based reasoners make use of cases which represent previous problem-solving incidents. Constraint satisfaction problems are represented by a set of variables, the possible values these variables can take and a set of constraints further restricting their possible values. This thesis argues that if the problem-solving paradigms of Case-Based Reasoning and Constraint-Based Reasoning are to become truly viable, then research has to be directed at providing support for knowledge-base refinement, but aimed at the knowledge representation formalisms used by the two paradigms rather than at more traditional rule-based representations. The CRIMSON system has been developed within the context of an industrial inventory management problem and uses constraint satisfaction techniques. The system makes use of design knowledge to form a constraint satisfaction problem (CSP) which is solved to determine which items from an inventory are suitable for a given problem. Additionally, the system is equipped with a KBR facility allowing the designer to criticise the results of the CSP, leading to knowledge being refined. The REFINER systems are knowledge-base refinement systems that detect and help remove inconsistencies in case-bases. The systems detect and report inconsistencies to the domain expert, together with a set of refinements which, if implemented, would remove the corresponding inconsistency. REFINER+ attempts to overcome the problems associated with REFINER, mainly its inefficiency with large case-bases. The systems can make use of background knowledge to aid in the refinement process, although they can function without any. However, care must be taken to ensure that any background knowledge that is used is correct. If this is not the case, then the refinement process may be adversely affected. Countering this problem is the main aim of BROCKER, which further extends the ideas of REFINER+ to include a facility allowing incorrect background knowledge to be refined in response to expert criticism of the system's performance. The systems were mainly developed making use of a medical dataset.
18

Ammari, Faisal Tawfiq. "Securing financial XML transactions using intelligent fuzzy classification techniques." Thesis, University of Huddersfield, 2013. http://eprints.hud.ac.uk/id/eprint/19506/.

Abstract:
The eXtensible Markup Language (XML) has been widely adopted by many financial institutions in their daily transactions; this adoption was due to the flexible nature of XML, which provides a common syntax for systems messaging in general and for financial messaging in particular. The extensive use of XML in financial transaction messaging created a corresponding interest in security protocols integrated into XML solutions in order to protect exchanged XML messages in an efficient yet powerful manner. However, financial institutions (i.e. banks) perform a large volume of transactions on a daily basis, which requires securing XML messages on a large scale. Securing a large volume of messages results in performance and resource issues. Therefore, an approach is needed to secure only specified portions of an XML document, together with syntax and processing rules for representing the secured parts. In this research we have developed a smart approach for securing financial XML transactions using effective and intelligent fuzzy classification techniques. Our approach defines the process of classifying XML content using a set of fuzzy variables. After the fuzzy classification phase, a unique value is assigned to a defined attribute named "Importance Level". The assigned value indicates the data sensitivity of each XML tag. This thesis also defines the process of securing classified financial XML message content by performing element-wise XML encryption on the selected parts defined in the fuzzy classification phase. Element-wise encryption is performed using symmetric encryption with the AES algorithm and different key sizes: a 128-bit key is used for tags classified with a "Medium" importance level, and a 256-bit key for tags classified with a "High" importance level. An implementation has been performed in a real-life environment using the online banking system of Jordan Ahli Bank, one of the leading banks in Jordan, to demonstrate its flexibility, feasibility, and efficiency. Our experimental results verified tangible enhancements in encryption efficiency and reductions in processing time and resulting XML message sizes. Finally, our proposed system was designed, developed, and evaluated using live data extracted from an internet banking service in one of the leading banks in Jordan. The results obtained from our experiments are promising, showing that our model can provide effective yet resilient support for financial systems that need to secure exchanged financial XML messages.
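To make the key-size mapping concrete, here is a minimal, hedged Python sketch of the element-wise idea: an element whose assigned importance level is "High" is encrypted with a 256-bit AES key, a "Medium" one with a 128-bit key. The tag names, the example message, and the omission of the fuzzy-classification and key-management steps are simplifications, not the thesis' actual design.

import os
import xml.etree.ElementTree as ET
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

KEY_BITS = {"High": 256, "Medium": 128}   # importance level -> AES key size

def encrypt_element(element, importance_level):
    """Replace an element's text with its AES-GCM ciphertext (hex-encoded)."""
    key = AESGCM.generate_key(bit_length=KEY_BITS[importance_level])
    nonce = os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, element.text.encode(), None)
    element.text = (nonce + ciphertext).hex()
    element.set("importance", importance_level)
    return key   # key management is out of scope for this sketch

message = ET.fromstring("<payment><iban>JO71AHLI0000000000001234567</iban>"
                        "<note>monthly rent</note></payment>")
encrypt_element(message.find("iban"), "High")      # sensitive tag -> 256-bit key
encrypt_element(message.find("note"), "Medium")    # less sensitive -> 128-bit key
print(ET.tostring(message, encoding="unicode"))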
19

Ndenga, Malanga Kennedy. "Predicting post-release software faults in open source software as a means of measuring intrinsic software product quality." Electronic Thesis or Diss., Paris 8, 2017. http://www.theses.fr/2017PA080099.

Abstract:
Faulty software has expensive consequences. To mitigate these consequences, software developers have to identify and fix faulty software components before releasing their products. Similarly, users have to gauge the delivered quality of software before adopting it. However, the abstract nature and multiple dimensions of software quality impede organizations from measuring it. Software quality metrics can be used as proxies of software quality. There is a need for a software process metric that can guarantee consistent, superior fault prediction performance across different contexts. This research sought to determine a predictor for software faults that exhibits the best prediction performance, requires the least effort to detect software faults, and has a minimum cost of misclassifying components. It also investigated the effect of combining predictors on the performance of software fault prediction models. Experimental data were derived from four OSS projects. Logistic regression was used to predict bug status, while linear regression was used to predict the number of bugs per file. Models built with Change Burst metrics registered overall better performance relative to those built with Change, Code Churn, Developer Networks and Source Code software metrics. Change Burst metrics recorded the highest values for numerical performance measures, exhibited the highest fault detection probabilities and had the least cost of misclassification of components. The study found that Change Burst metrics could effectively predict software faults.
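A minimal sketch of the two model types named above, using scikit-learn; the feature names and the synthetic data are placeholders standing in for change-burst-style metrics from the four OSS projects.

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((200, 3))            # e.g. [burst_count, churn_in_bursts, people_in_bursts]
is_faulty = (X[:, 0] + rng.normal(0, 0.2, 200) > 0.5).astype(int)
bug_count = np.round(5 * X[:, 0] + rng.normal(0, 0.5, 200)).clip(min=0)

clf = LogisticRegression().fit(X, is_faulty)      # predicts bug status per file
reg = LinearRegression().fit(X, bug_count)        # predicts number of bugs per file

print("P(faulty) for a new file:", clf.predict_proba([[0.8, 0.4, 0.2]])[0, 1])
print("Expected bug count:", reg.predict([[0.8, 0.4, 0.2]])[0])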
20

Brendler, Silvio. "Neuere Hilfsmittel der Namenforschung: III. Kartographische Software." Gesellschaft für Namenkunde e.V, 2005. https://ul.qucosa.de/id/qucosa%3A31427.

Abstract:
This article draws attention to technological advances in cartography and reminds us that cartographic software makes work in modern geography of names (areal onomastics) much easier than it used to be. Especially geographic information systems (GIS) allow name students not only to present but also to interpret data.
21

Rusch, Thomas, and Achim Zeileis. "Discussion on Fifty Years of Classification and Regression Trees." Wiley, 2014. http://dx.doi.org/10.1111/insr.12062.

Abstract:
In this discussion paper, we argue that the literature on tree algorithms is very fragmented. We identify possible causes and discuss the good and bad sides of this situation. Among the latter is the lack of free open-source implementations for many algorithms. We argue that if the community adopts a standard of creating and sharing free open-source implementations of newly developed algorithms and creates easy access to these programs, the bad sides of the fragmentation will be actively combated and the whole scientific community will benefit. (authors' abstract)
22

Clauß, Matthias, and Günther Fischer. "Automatisches Software-Update." Universitätsbibliothek Chemnitz, 2003. http://nbn-resolving.de/urn:nbn:de:swb:ch1-200301040.

Abstract:
This article presents a new service for self-managed software updates of PC systems running Red Hat Linux 7.3. The service is based on the YARU (Yum based Automatic RPM Update) procedure, which is part of the administration technology for Linux machines deployed at the URZ.
23

Otero, Fernando E. B. "New ant colony optimisation algorithms for hierarchical classification of protein functions." Thesis, University of Kent, 2010. http://www.cs.kent.ac.uk/pubs/2010/3057.

Abstract:
Ant colony optimisation (ACO) is a metaheuristic for solving optimisation problems, inspired by the foraging behaviour of ant colonies. It has been successfully applied to several types of optimisation problems, such as scheduling and routing, and more recently to the discovery of classification rules. The classification task in data mining aims at predicting the value of a given goal attribute for an example, based on the values of a set of predictor attributes for that example. Since real-world classification problems are generally described by nominal (categorical or discrete) and continuous (real-valued) attributes, classification algorithms are required to be able to cope with both nominal and continuous attributes. Current ACO classification algorithms have been designed with the limitation of discovering rules using only nominal attributes describing the data. Furthermore, they also have the limitation of not coping with more complex types of classification problems, e.g. hierarchical multi-label classification problems. This thesis investigates the extension of ACO classification algorithms to cope with the aforementioned limitations. Firstly, a method is proposed to extend the rule construction process of ACO classification algorithms to cope with continuous attributes directly. Four new ACO classification algorithms are presented, as well as a comparison between them and well-known classification algorithms from the literature. Secondly, an ACO classification algorithm is presented for the hierarchical problem of protein function prediction, a major type of bioinformatics problem addressed in this thesis. Finally, three different approaches to extending ACO classification algorithms to the more complex case of hierarchical multi-label classification are described, elaborating on the ideas of the proposed hierarchical classification ACO algorithm. These algorithms are compared against state-of-the-art decision tree induction algorithms for hierarchical multi-label classification in the context of protein function prediction. The computational results cover experiments on a wide range of data sets, including challenging protein function prediction data sets with a very large number of class labels.
24

Mahmood, Qazafi. "LC - an effective classification based association rule mining algorithm." Thesis, University of Huddersfield, 2014. http://eprints.hud.ac.uk/id/eprint/24274/.

Abstract:
Classification using association rules is a research field in data mining that primarily uses association rule discovery techniques in classification benchmarks. It has been confirmed by many research studies in the literature that classification using association tends to generate more predictive classification systems than traditional classification data mining techniques like probabilistic, statistical and decision tree methods. In this thesis, we introduce a novel data mining algorithm based on classification using association called "Looking at the Class" (LC), which can be used for mining a range of classification data sets. Unlike known algorithms in the classification-using-association approach, such as the Classification based on Association rule (CBA) system and the Classification based on Predictive Association (CPAR) system, which merge disjoint items in the rule learning step without considering class label similarity, the proposed algorithm merges only items with identical class labels. This avoids many unnecessary item combinations during the rule learning step and consequently results in large savings in computational time and memory. Furthermore, the LC algorithm uses a novel prediction procedure that employs multiple rules to make the prediction decision instead of a single rule. The proposed algorithm has been evaluated thoroughly on real-world security data sets collected using an automated tool developed at Huddersfield University. The security application considered in this thesis is the categorisation of websites, based on their features, as legitimate or fake, which is a typical binary classification problem. Experiments on a number of UCI data sets have also been conducted, with classification accuracy, memory usage and other measures used for evaluation. The results show that the LC algorithm outperformed traditional classification algorithms such as C4.5, PART and Naïve Bayes, as well as known classification based association algorithms like CBA, with respect to classification accuracy, memory usage and execution time on most of the data sets considered.
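The toy sketch below illustrates the general class-aware idea described in the abstract: items are only counted together with their own class label, and several matching rules vote at prediction time. It is a deliberately simplified illustration, not the LC algorithm itself, and the website features are invented.

from collections import Counter

transactions = [({"https", "has_logo"}, "legit"),
                ({"ip_url", "no_https"}, "fake"),
                ({"https", "long_url"}, "legit"),
                ({"ip_url", "has_logo"}, "fake")]

MIN_SUPPORT = 2
support = Counter()
for items, label in transactions:
    for item in items:
        support[(item, label)] += 1          # an item is only counted with its own class

# Keep single-item rules "item -> label" that reach the support threshold
rules = {item: label for (item, label), count in support.items() if count >= MIN_SUPPORT}

def predict(items):
    """Vote over all rules whose antecedent appears in the new example."""
    votes = Counter(rules[item] for item in items if item in rules)
    return votes.most_common(1)[0][0] if votes else "unknown"

print(predict({"https", "long_url"}))   # -> 'legit'
print(predict({"ip_url"}))              # -> 'fake'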
25

Akpinar, Kutalmis. "Human Activity Classification Using Spatio-temporal." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614587/index.pdf.

Abstract:
This thesis compares state-of-the-art methods and proposes solutions for human activity classification from video data. Human activity classification is the task of identifying the human activities captured in a video. Classification of human activity is needed in order to improve surveillance video analysis and summarization, video data mining and robot intelligence. This thesis focuses on the classification of low-level human activities, which are used as an important information source for determining high-level activities. In this study, the feature-relation-histogram-based activity description proposed by Ryoo et al. (2009) is implemented and extended. The feature histogram is widely used in feature-based approaches; however, the feature relation histogram has the ability to represent the locational information of the features. Our extension defines a new set of relations between the features, which makes the method more effective for action description. Classifications are performed and results are compared using the feature histogram, Ryoo's feature relation histogram and our feature relation histogram on the same datasets and feature type. Our experiments show that the feature relation histogram performs slightly better than the feature histogram, and that our feature relation histogram performs better than both. Although the difference is not clearly observable in the datasets containing periodic actions, a 12% improvement is observed for the non-periodic action datasets. Our work shows that the spatio-temporal relations represented by our new set of relations are a better way to represent an activity for classification.
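As a rough illustration of what a relation histogram over spatio-temporal features looks like, the sketch below counts, for every pair of interest points, which codeword pair and which coarse temporal and spatial relations they exhibit. The relation set and the toy features are invented and much simpler than either Ryoo's relations or the extended set proposed in the thesis.

from collections import Counter
from itertools import combinations

# Each feature: (t, x, y, codeword) for one spatio-temporal interest point
features = [(0, 10, 12, "A"), (5, 11, 14, "B"), (40, 50, 20, "A"), (42, 52, 22, "C")]

def relations(f1, f2, t_near=10, xy_near=15):
    t1, x1, y1, _ = f1
    t2, x2, y2, _ = f2
    temporal = "co-occurring" if abs(t1 - t2) <= t_near else "sequential"
    spatial = "nearby" if abs(x1 - x2) + abs(y1 - y2) <= xy_near else "far-apart"
    return temporal, spatial

histogram = Counter()
for f1, f2 in combinations(features, 2):
    pair = tuple(sorted((f1[3], f2[3])))        # unordered codeword pair
    histogram[(pair,) + relations(f1, f2)] += 1

print(histogram)   # this vector (after normalisation) would feed a classifier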
26

Hafezi, Nazila. "An integrated software package for model-based neuro-fuzzy classification of small airway dysfunction." To access this resource online via ProQuest Dissertations and Theses @ UTEP, 2009. http://0-proquest.umi.com.lib.utep.edu/login?COPT=REJTPTU0YmImSU5UPTAmVkVSPTI=&clientId=2515.

27

Felipe, Gilvan Ferreira. "Development and evaluation of software for use in hosting with risk classification in pediatrics." Universidade Federal do Ceará, 2016. http://www.teses.ufc.br/tde_busca/arquivo.php?codArquivo=19249.

Abstract:
Risk classification (RC) is a relevant strategy for assessing and stratifying the risk and vulnerabilities of patients treated at emergency units, making it possible to identify which cases require immediate assistance and which can safely wait for care. In addition, technological evolution, especially in computing, supports the idea that the use of such technologies in the daily routine of health services can contribute to improving the quality and safety of the service provided. The objective was to develop and evaluate software for the RC process in pediatrics. This methodological study was developed in three stages: Stage 1, development of the software; Stage 2, evaluation of the technical quality and functional performance of the software through analysis of the characteristics covered by ISO/IEC 25010, carried out by eight specialists in informatics and 13 in nursing; and Stage 3, evaluation of the agreement between the software and the printed protocol, performed by three nurses with experience in RC. Data analysis was performed using descriptive statistics (absolute and relative frequencies) and inferential statistics (Kendall's coefficient of concordance, W), with the aid of Microsoft Office Excel, the Statistical Package for the Social Sciences (SPSS) version 20.0 and R. Ethical requirements were respected, and the study was approved by the Research Ethics Committee of the Federal University of Ceará under opinion no. 1,327,959/2015. The software was developed following the prescriptive software engineering process model known as the Incremental Model, the language used was CSharp (C#), and the database chosen was Microsoft SQL Server 2008 R2. The results obtained from the evaluation of the software reveal that it was adequate in all characteristics analysed and was rated as very appropriate and/or completely appropriate in more than 70.0% of the evaluations by the informatics specialists, as follows: functional adequacy, 100.0%; reliability, 82.6%; usability, 84.9%; performance efficiency, 93.4%; compatibility, 85.0%; security, 91.7%; maintainability, 95.0%; and portability, 87.5%; as well as by the nursing specialists: functional adequacy, 96.2%; reliability, 88.5%; usability, 98.7%; performance efficiency, 96.2%; compatibility, 98.1%; security, 100.0%. The risk classification results generated by the software, when compared with those generated using the printed protocol, indicated total agreement for two judges (W = 1.000; p < 0.001) and very high agreement for the third (W = 0.992; p < 0.001). The results allowed the conclusion that the software for RC in pediatrics developed in this study is adequate with respect to technical quality and functional performance. In addition, the software showed high agreement with the printed protocol currently used to perform RC in the city of Fortaleza, evidencing its potential safety in supporting the nurses involved in conducting risk classification in pediatrics.
28

Allanqawi, Khaled Kh S. Kh. "A framework for the classification and detection of design defects and software quality assurance." Thesis, Kingston University, 2015. http://eprints.kingston.ac.uk/34534/.

Abstract:
In current software development lifecycles in heterogeneous environments, a pitfall businesses face is that software defect tracking, measurement and quality assurance do not start early enough in the development process. In fact, the cost of fixing a defect in a production environment is much higher than in the initial phases of the Software Development Life Cycle (SDLC), which is particularly true for Service Oriented Architecture (SOA). Thus the aim of this study is to develop a new framework for defect tracking and detection and quality estimation in the early stages, particularly the design stage, of the SDLC. Part of the objective of this work is to conceptualize, borrow and customize from known frameworks, such as object-oriented programming, to build a solid framework that uses automated, rule-based intelligent mechanisms to detect and classify defects in the software design of SOA systems. The framework on design defects and software quality assurance (DESQA) blends various design defect metrics and quality measurement approaches and provides measurements for both defect and quality factors. Unlike existing frameworks, mechanisms are incorporated for the conversion of defect metrics into software quality measurements. The framework is evaluated using a research tool, supported by a sample used to complete the Design Defects Measuring Matrix, and a data collection process. In addition, the evaluation using a case study aims to demonstrate the use of the framework on a number of designs and to produce an overall picture regarding defects and quality. The implementation part demonstrated how the framework can predict the quality level of the designed software. The results showed that a good level of quality estimation can be achieved based on the number of design attributes, the number of quality attributes and the number of SOA design defects. The assessment shows that metrics provide guidelines to indicate the progress that a software system has made and the quality of its design. Using these guidelines, we can develop more usable and maintainable software systems to fulfil the demand for efficient software applications. Another valuable result coming from this study is that developers try to keep backwards compatibility when they introduce new functionality, and sometimes perform necessary breaking changes to the same newly-introduced elements in future versions. In that way they give their clients time to adapt their systems. This is a very valuable practice for developers because they have more time to assess the quality of their software before releasing it. Other improvements in this research include the investigation of further design attributes and SOA design defects which can be computed by extending the tests we performed.
29

Wang, Chen. "Novel software tool for microsatellite instability classification and landscape of microsatellite instability in osteosarcoma." Miami University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=miami1554829925088174.

30

Zhang, Jing Bai. "Automatic hidden-web database classification." Thesis, University of Macau, 2007. http://umaclib3.umac.mo/record=b1677225.

31

Wang, Yi. "Hierarchical classification of web pages." Thesis, University of Macau, 2008. http://umaclib3.umac.mo/record=b1943013.

32

Magnusson, Ludvig, and Johan Rovala. "AI Approaches for Classification and Attribute Extraction in Text." Thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-67882.

Abstract:
As the amount of data online grows, the urge to use this data for different applications grows as well. Machine learning can be used with the intent to reconstruct and validate the data you are interested in. Although the problem is very domain-specific, this report attempts to shed some light on what we call strategies for classification, which in broad terms means a set of steps in a process whose end goal is to have classified some part of the original data. As a result, we hope to bring clarity to the classification process both in detail and from a broader perspective. The report investigates two classification objectives, one of which depends on many variables found in the input data and one that is more literal and depends only on one or two variables. Specifically, the data we classify are sales objects. Each sales object has a text describing the object and a related image. We attempt to place these sales objects into the correct product category. We also try to derive the year of creation and dimensions such as height and width. Different approaches are presented in the aforementioned strategies in order to classify such attributes. The results showed that for broader attributes such as the product category, supervised learning is indeed an appropriate approach, while the same cannot be said for narrower attributes, which instead had to rely on entity recognition. Experiments on image analytics in conjunction with supervised learning proved image analytics to be a good addition when a higher precision score is required.
33

Heik, Andreas, and Edwin Wegener. "Software unter Windows XP." Universitätsbibliothek Chemnitz, 2003. http://nbn-resolving.de/urn:nbn:de:swb:ch1-200301032.

34

Taing, Nguonly, Thomas Springer, Nicolás Cardozo, and Alexander Schill. "A Rollback Mechanism to Recover from Software Failures in Role-based Adaptive Software Systems." ACM, 2017. https://tud.qucosa.de/id/qucosa%3A75214.

Abstract:
Context-dependent applications are relatively complex due to their multiple variations caused by context activation, especially in the presence of unanticipated adaptation. Testing these systems is challenging, as it is hard to reproduce the same execution environments. Therefore, a software failure caused by bugs is no exception. This paper presents a rollback mechanism to recover from software failures as part of a role-based runtime with support for unanticipated adaptation. The mechanism performs checkpoints before each adaptation and employs specialized sensors to detect bugs resulting from recent configuration changes. When the runtime detects a bug, it assumes that the bug belongs to the latest configuration. The runtime rolls back to the most recent checkpoint to recover and subsequently notifies the developer to fix the bug and re-apply the adaptation through unanticipated adaptation. We prototype the concept as part of our role-based runtime engine LyRT and demonstrate the applicability of the rollback recovery mechanism for unanticipated adaptation in erroneous situations.
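A simplified sketch of the checkpoint-and-rollback control flow described above; the actual LyRT runtime is a role-based engine rather than this toy Python class, and the configuration dictionary, failure callback and role names are placeholders.

import copy

class AdaptiveRuntime:
    def __init__(self, configuration):
        self.configuration = configuration
        self._checkpoints = []

    def adapt(self, change):
        """Checkpoint the current state, then apply a (possibly unanticipated) change."""
        self._checkpoints.append(copy.deepcopy(self.configuration))
        self.configuration.update(change)

    def on_failure_detected(self):
        """Assume the bug belongs to the latest configuration and roll back."""
        if self._checkpoints:
            self.configuration = self._checkpoints.pop()
        print("Rolled back; notify the developer to fix and re-apply the adaptation.")

runtime = AdaptiveRuntime({"role": "baseline"})
runtime.adapt({"role": "experimental-logging"})
runtime.on_failure_detected()            # restores {'role': 'baseline'}
print(runtime.configuration)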
35

Masood, Khalid. "Histological image analysis and gland modelling for biopsy classification." Thesis, University of Warwick, 2010. http://wrap.warwick.ac.uk/3918/.

Abstract:
The area of computer-aided diagnosis (CAD) has undergone tremendous growth in recent years. In CAD, the computer output is used as a second opinion for cancer diagnosis. The development of cancer is a multiphase process in which the mutation of genes is involved over the years. Cancer grows out of normal cells in the body and usually occurs when the growth of cells in the body is out of control. This phenomenon changes the shape and structure of the tissue glands. In this thesis, we have developed three algorithms for the classification of colon and prostate biopsy samples. First, we computed morphological and shape-based parameters from hyperspectral images of colon samples and used linear and non-linear classifiers for the identification of cancerous regions. To investigate the importance of hyperspectral imagery in histopathology, we selected a single spectral band from its hyperspectral cube and performed an analysis based on the texture of the images. Texture refers to an arrangement of the basic constituents of the material and is represented by the interrelationships between the spatial arrangements of the image pixels. A novel feature selection method based on the quality of clustering is developed to discard redundant information. In the third algorithm, we used Bayesian inference for the segmentation of glands in colon and prostate biopsy samples. In this approach, glands in a tissue are represented by polygonal models with varying numbers of vertices depending on the size of the glands. An appropriate set of proposals for the Metropolis-Hastings-Green algorithm is formulated and a series of Markov chain Monte Carlo (MCMC) simulations are run to extract polygonal models for the glands. We demonstrate the performance of 3D spectral and spatial and 2D spatial analyses with over 90% classification accuracy and less than 10% average segmentation error for the polygonal models.
36

Glandberger, Oliver, and Daniel Fredriksson. "Neural Network Regularization for Generalized Heart Arrhythmia Classification." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-19731.

Abstract:
Background: Arrhythmias are a collection of heart conditions that affect almost half of the world’s population and accounted for roughly 32.1% of all deaths in 2015. More importantly, early detection of arrhythmia through electrocardiogram analysis can prevent up to 90% of deaths. Neural networks are a modern and increasingly popular tool of choice for classifying arrhythmias hidden within ECG data. In the pursuit of increased classification accuracy, some of these neural networks can become quite complex, which can result in overfitting. To combat this phenomenon, a technique called regularization is typically used. Thesis’ Problem Statement: Practically all of today’s research on utilizing neural networks for arrhythmia detection incorporates some form of regularization. However, most of this research has chosen not to focus on, and experiment with, regularization. In this thesis we measured and compared different regularization techniques in order to improve arrhythmia classification accuracy. Objectives: The main objective of this thesis is to expand upon a baseline neural network model by incorporating various regularization techniques and to compare how these new models perform in relation to the baseline model. The regularization techniques used are L1, L2, L1 + L2, and Dropout. Methods: The study used quantitative experimentation in order to gather metrics from all of the models. Information regarding related works and relevant scientific articles was collected from Summon and Google Scholar. Results: The study shows that Dropout generally produces the best results, on average improving performance across all parameters and metrics. The Dropout model with a regularization parameter of 0.1 performed particularly well. Conclusions: The study concludes that there are multiple models which can be considered to have the greatest positive impact on the baseline model. Depending on how much one values the consequences of False Negatives vs. False Positives, there are multiple candidates which can be considered to be the best model. For example, is it worth choosing a model which misses 11 people suffering from arrhythmia but simultaneously catches 1651 mistakenly classified arrhythmia cases?
Background: Arrhythmias are a group of cardiovascular diseases that affect almost half of the world’s population and accounted for roughly 32.1% of all deaths in 2015. 90% of the deaths caused by arrhythmia can be prevented if the arrhythmia is identified earlier. Neural networks have become a popular tool for detecting arrhythmia from ECG data. In striving for better classification accuracy, these networks can run into the problem of overfitting. Overfitting can, however, be counteracted with regularization techniques. Problem statement: Practically all research that uses neural networks to classify arrhythmia includes some form of regularization. However, the majority of this research has chosen not to focus on or experiment with regularization. In this thesis we test different regularization techniques to compare how they improve the baseline model’s ability to classify arrhythmia. Objectives: The main objective of this thesis is to modify a neural network that uses transfer learning to classify arrhythmia from two-dimensional ECG data. The baseline model was extended with different regularization techniques in order to compare them and thereby determine which technique has the greatest positive impact. The techniques compared are L1, L2, L1 + L2, and Dropout. Methods: Quantitative experiments were used to collect data on how the techniques performed, and this data was then analysed and presented. A literature study was conducted using Summon and Google Scholar to find information from relevant articles. Results: The research indicates that, in general, Dropout performs better than the other techniques. Dropout with the parameter 0.1 improved the metrics the most. Conclusions: In this specific context, Dropout(0.1) performed best. However, we consider false negatives and false positives not to be equivalent. Some models perform better than others depending on how much these variables are valued, and the best model is therefore subjective. For example, is it worth letting 11 people die if it means that 1651 people will not undergo unnecessary further testing?
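The abstract names four regularization techniques (L1, L2, L1 + L2, and Dropout) applied to a baseline CNN for ECG classification. As a rough illustration only, the following Keras sketch shows how such regularizers could be attached to a small convolutional model; the architecture, input shape, and parameter values are assumptions for illustration and not the thesis’ actual baseline model.

```python
# Minimal sketch (not the thesis' actual model): attaching L1, L2, L1+L2,
# or Dropout regularization to a small ECG classifier in Keras.
from tensorflow.keras import layers, regularizers, models

def build_model(reg_kind="dropout", reg_param=0.1, n_classes=5):
    """reg_kind: one of 'l1', 'l2', 'l1_l2', 'dropout'; reg_param is the strength."""
    if reg_kind == "l1":
        reg = regularizers.l1(reg_param)
    elif reg_kind == "l2":
        reg = regularizers.l2(reg_param)
    elif reg_kind == "l1_l2":
        reg = regularizers.l1_l2(l1=reg_param, l2=reg_param)
    else:
        reg = None  # Dropout is added as a layer instead of a weight penalty

    model = models.Sequential([
        layers.Input(shape=(128, 128, 1)),   # assumed 2-D ECG representation
        layers.Conv2D(16, 3, activation="relu", kernel_regularizer=reg),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu", kernel_regularizer=reg),
    ])
    if reg_kind == "dropout":
        model.add(layers.Dropout(reg_param))  # e.g. 0.1, the best setting reported
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```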
APA, Harvard, Vancouver, ISO, and other styles
37

Wright, Hamish Michael. "A Homogeneous Hierarchical Scripted Vector Classification Network with Optimisation by Genetic Algorithm." Thesis, University of Canterbury. Electrical and Computer Engineering, 2007. http://hdl.handle.net/10092/1191.

Full text
Abstract:
A simulated learning hierarchical architecture for vector classification is presented. The hierarchy used homogeneous scripted classifiers, maintaining similarity tables, and self-organising maps for the input. The scripted classifiers produced output, and guided learning with permutable script instruction tables. A large space of parametrised script instructions was created, from which many different combinations could be implemented. The parameter space for the script instruction tables was tuned using a genetic algorithm with the goal of optimising the network's ability to predict class labels for bit pattern inputs. The classification system, known as Dura, was presented with various visual classification problems, such as detecting overlapping lines, locating objects, or counting polygons. The network was trained with a random subset from the input space, and was then tested over a uniformly sampled subset. The results showed that Dura could successfully classify these and other problems. The optimal scripts and parameters were analysed, allowing inferences about which scripted operations were important, and what roles they played in the learning classification system. Further investigations were undertaken to determine Dura's performance in the presence of noise, as well as the robustness of the solutions when faced with highly stochastic training sequences. It was also shown that robustness and noise tolerance in solutions could be improved through certain adjustments to the algorithm. These adjustments led to different solutions which could be compared to determine what changes were responsible for the increased robustness or noise immunity. The behaviour of the genetic algorithm tuning the network was also analysed, leading to the development of a super solutions cache, as well as improvements in convergence, the fitness function, and simulation duration. The entire network was simulated using a program written in C++, with FLTK libraries used for the graphical user interface.
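The abstract describes a genetic algorithm tuning the parameter space of Dura’s script instruction tables. The sketch below is a generic, minimal genetic-algorithm loop of that general kind, assuming a real-valued encoding and a caller-supplied fitness function; the encoding, operators, and rates are placeholders and do not reproduce Dura’s actual optimiser.

```python
# Generic sketch of genetic-algorithm parameter tuning of the kind the abstract
# describes; the encoding, fitness function, and rates are placeholders.
import random

def evolve(fitness, n_params, pop_size=50, generations=100,
           mutation_rate=0.1, elite=2):
    """Maximise `fitness` over vectors of n_params values in [0, 1]."""
    pop = [[random.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        next_pop = scored[:elite]                           # keep the best unchanged
        while len(next_pop) < pop_size:
            a, b = random.sample(scored[:pop_size // 2], 2)  # parents from the fitter half
            cut = random.randrange(1, n_params)
            child = a[:cut] + b[cut:]                        # one-point crossover
            child = [g if random.random() > mutation_rate else random.random()
                     for g in child]                         # per-gene mutation
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Example: a toy fitness that rewards parameters close to 0.5
best = evolve(lambda p: -sum((x - 0.5) ** 2 for x in p), n_params=8)
```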
APA, Harvard, Vancouver, ISO, and other styles
38

Hafeez, Abdul. "A Software Framework For the Detection and Classification of Biological Targets in Bio-Nano Sensing." Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/50490.

Full text
Abstract:
Detection and identification of important biological targets, such as DNA, proteins, and diseased human cells, are crucial for early diagnosis and prognosis. The key to discriminating healthy cells from diseased cells is the biophysical properties that differ radically. Micro- and nanosystems, such as solid-state micropores and nanopores, can measure and translate these properties of biological targets into electrical spikes to decode useful insights. Nonetheless, such approaches result in sizable data streams that are often plagued with inherent noise and baseline wander. Moreover, the extant detection approaches are tedious, time-consuming, and error-prone, and there is no error-resilient software that can analyze large data sets instantly. The ability to effectively process and detect biological targets in larger data sets lies in the automated and accelerated data processing strategies using state-of-the-art distributed computing systems. In this dissertation, we design and develop techniques for the detection and classification of biological targets and a distributed detection framework to support data processing from multiple bio-nano devices. In a distributed setup, the collected raw data stream on a server node is split into data segments and distributed across the participating worker nodes. Each node reduces noise in the assigned data segment using moving-average filtering, and detects the electric spikes by comparing them against a statistical threshold (based on the mean and standard deviation of the data), in a Single Program Multiple Data (SPMD) style. Our proposed framework enables the detection of cancer cells in a mixture of cancer cells, red blood cells (RBCs), and white blood cells (WBCs), and achieves a maximum speedup of 6X over a single-node machine by processing 10 gigabytes of raw data using an 8-node cluster in less than a minute, which would otherwise take hours using manual analysis. Diseases such as cancer can be mitigated if detected and treated at an early stage. Micro- and nanoscale devices, such as micropores and nanopores, enable the translocation of biological targets at finer granularity. These devices are tiny orifices in silicon-based membranes, and the output is a current signal, measured in nanoamperes. A solid-state micropore is capable of electrically measuring the biophysical properties of human cells when a blood sample is passed through it. The passage of cells through such pores results in an interesting pattern (pulse) in the baseline current, which can be measured at a very high rate, such as 500,000 samples per second, or even higher. The pulse is essentially a sequence of temporal data samples that abruptly falls below and then reverts to the normal baseline within an acceptable predefined time interval, i.e., the pulse width. The pulse features, such as width and amplitude, correspond to the translocation behavior and the extent to which the pore is blocked, under a constant potential. These features are crucial in discriminating diseased cells from healthy cells, such as identifying cancer cells in a mixture of cells.
Ph. D.
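The abstract spells out the per-node processing: moving-average filtering followed by thresholding against the mean and standard deviation of the data. A minimal single-node sketch of that step, with an assumed window size and threshold factor, might look as follows.

```python
# Single-node sketch of the per-segment processing the abstract describes:
# moving-average smoothing followed by a mean/standard-deviation threshold.
# Window size and threshold factor are illustrative assumptions.
import numpy as np

def detect_pulses(segment, window=25, k=3.0):
    """Return indices where the smoothed current drops below mean - k*std."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(segment, kernel, mode="same")  # moving-average filter
    threshold = smoothed.mean() - k * smoothed.std()
    return np.flatnonzero(smoothed < threshold)

# In the distributed setup each worker would run detect_pulses on its own
# data segment (e.g. via mpi4py or multiprocessing) and report hits back.
```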
APA, Harvard, Vancouver, ISO, and other styles
39

Goeschel, Kathleen. "Feature Set Selection for Improved Classification of Static Analysis Alerts." Diss., NSUWorks, 2019. https://nsuworks.nova.edu/gscis_etd/1091.

Full text
Abstract:
With the extreme growth in third party cloud applications, increased exposure of applications to the internet, and the impact of successful breaches, improving the security of software being produced is imperative. Static analysis tools can alert to quality and security vulnerabilities of an application; however, they present developers and analysts with a high rate of false positives and unactionable alerts. This problem may lead to the loss of confidence in the scanning tools, possibly resulting in the tools not being used. The discontinued use of these tools may increase the likelihood of insecure software being released into production. Insecure software can be successfully attacked, resulting in the compromise of one or several information security principles such as confidentiality, availability, and integrity. Feature selection methods have the potential to improve the classification of static analysis alerts and thereby reduce the false positive rates. Thus, the goal of this research effort was to improve the classification of static analysis alerts by proposing and testing a novel method leveraging feature selection. The proposed model was developed and subsequently tested on three open source PHP applications spanning several years. The results were compared to a classification model utilizing all features to gauge the classification improvement of the feature selection model. The model presented did result in improved classification accuracy and a reduced false positive rate on a reduced feature set. This work contributes a real-world static analysis dataset based upon three open source PHP applications. It also enhanced an existing data set generation framework to include additional predictive software features. However, the main contribution is a feature selection methodology that may be used to discover optimal feature sets that increase the classification accuracy of static analysis alerts.
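As a rough sketch of the general approach described above (a feature selection step feeding a classifier of static analysis alerts), the following scikit-learn pipeline is illustrative only; the particular selector, classifier, and parameters are assumptions rather than the dissertation’s exact method.

```python
# Hedged sketch of feature selection ahead of alert classification with
# scikit-learn; the selector, classifier, and parameters are assumptions,
# not the exact pipeline used in the dissertation.
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def alert_classifier(k_features=20):
    return Pipeline([
        ("select", SelectKBest(score_func=f_classif, k=k_features)),
        ("clf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ])

# X: one row per static-analysis alert (software metrics, alert attributes);
# y: 1 for actionable / true positive, 0 for false positive.
# scores = cross_val_score(alert_classifier(), X, y, cv=10, scoring="f1")
```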
APA, Harvard, Vancouver, ISO, and other styles
40

Vogel, Ronny. "eCommerce-Software: Genügt da nicht eine HTML-Seite?" Universitätsbibliothek Chemnitz, 1999. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-199900351.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Hönel, Sebastian. "Efficient Automatic Change Detection in Software Maintenance and Evolutionary Processes." Licentiate thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-94733.

Full text
Abstract:
Software maintenance is such an integral part of a software's evolutionary process that it consumes much of the total resources available. Some estimate the cost of maintenance to be up to 100 times that of developing the software. Software that is not maintained builds up technical debt, and if no countermeasures are undertaken, that debt will eventually outweigh the value of the software. A software system must adapt to changes in its environment, or to new and changed requirements. It must further receive corrections for emerging faults and vulnerabilities. Constant maintenance can prepare a software system to accommodate future changes. While there may be plenty of rationale for future changes, the reasons behind historical changes may no longer be accessible. Understanding change in software evolution provides valuable insights into, e.g., the quality of a project or aspects of the underlying development process. These insights are worth exploiting for, e.g., fault prediction, managing the composition of the development team, or effort estimation models. The size of software is a metric often used in such models, yet it is not well-defined. In this thesis, we seek to establish a robust, versatile and computationally cheap metric that quantifies the size of changes made during maintenance. We operationalize this new metric and exploit it for automated and efficient commit classification. Our results show that the density of a commit, that is, the ratio between its net- and gross-size, is a metric that can replace other, more expensive metrics in existing classification models. Models using this metric represent the current state of the art in automatic commit classification. The density provides a more fine-grained and detailed insight into the types of maintenance activities in a software project. Additional properties of commits, such as their relations or intermediate sojourn times, have not previously been exploited for improved classification of changes. We reason about their potential, and suggest and implement dependent mixture and Bayesian models that exploit joint conditional densities, models that each have their own trade-offs with regard to computational cost, complexity, and prediction accuracy. Such models can outperform well-established classifiers, such as Gradient Boosting Machines. All of our empirical evaluations comprise large datasets, software, and experiments, all of which we have published alongside the results as open access. We have reused, extended and created datasets, and released software packages for change detection and the Bayesian models used in all of the studies conducted.
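The abstract defines the density of a commit as the ratio between its net and gross size. A minimal sketch of that computation, with simplified and assumed notions of “net” and “gross” line counts, could look like this.

```python
# Hedged sketch of the commit-density idea described in the abstract:
# density = net size / gross size of a change. The notion of "net" lines
# (lines that remain after ignoring whitespace, comments, clones, etc.) is
# simplified here; field names are assumptions.
from dataclasses import dataclass

@dataclass
class FileChange:
    gross_lines: int   # all added + deleted lines in the diff
    net_lines: int     # lines that remain after filtering non-functional changes

def commit_density(changes: list[FileChange]) -> float:
    gross = sum(c.gross_lines for c in changes)
    net = sum(c.net_lines for c in changes)
    return net / gross if gross else 0.0

# A density near 1.0 suggests mostly functional change; a low density suggests
# formatting, comment, or otherwise non-functional churn.
print(commit_density([FileChange(120, 80), FileChange(30, 5)]))  # ~0.567
```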
APA, Harvard, Vancouver, ISO, and other styles
42

Labarge, Isaac E. "Neural Network Pruning for ECG Arrhythmia Classification." DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2136.

Full text
Abstract:
Convolutional Neural Networks (CNNs) are a widely accepted means of solving complex classification and detection problems in imaging and speech. However, problem complexity often leads to considerable increases in computation and parameter storage costs. Many successful attempts have been made in effectively reducing these overheads by pruning and compressing large CNNs with only a slight decline in model accuracy. In this study, two pruning methods are implemented and compared on the CIFAR-10 database and an ECG arrhythmia classification task. Each pruning method employs a pruning phase interleaved with a finetuning phase. It is shown that when performing the scale-factor pruning algorithm on ECG, finetuning time can be expedited by 1.4 times over the traditional approach with only 10% of expensive floating-point operations retained, while experiencing no significant impact on accuracy.
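The abstract describes pruning phases interleaved with fine-tuning phases. The PyTorch sketch below illustrates that loop using simple L1 magnitude pruning as a stand-in; the thesis’ scale-factor method, layer choices, pruning ratios, and training step are not reproduced here, and the helper train_one_epoch is assumed to be supplied by the caller.

```python
# Hedged sketch of a prune-then-finetune loop of the kind the abstract describes.
# Uses L1 magnitude pruning as a stand-in for the thesis' scale-factor method.
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_and_finetune(model, train_one_epoch, rounds=3, amount=0.3,
                       finetune_epochs=2):
    """Alternate pruning and fine-tuning phases on the model's conv layers."""
    conv_layers = [m for m in model.modules() if isinstance(m, nn.Conv1d)]
    for _ in range(rounds):
        for layer in conv_layers:
            prune.l1_unstructured(layer, name="weight", amount=amount)
        for _ in range(finetune_epochs):
            train_one_epoch(model)          # caller-supplied training step
    for layer in conv_layers:
        prune.remove(layer, "weight")       # make the pruning permanent
    return model
```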
APA, Harvard, Vancouver, ISO, and other styles
43

Gädke, Achim, Markus Rosenstihl, Christopher Schmitt, Holger Stork, and Nikolaus Nestle. "DAMARIS – a flexible and open software platform for NMR spectrometer control: DAMARIS – a flexible and open software platform for NMR spectrometer control." Diffusion fundamentals 5 (2007) 6, S. 1-9, 2007. https://ul.qucosa.de/id/qucosa%3A14270.

Full text
Abstract:
Home-built NMR spectrometers with self-written control software have a long tradition in porous media research. Advantages of such spectrometers are not just lower costs but also more flexibility in developing new experiments (while commercial NMR systems are typically optimized for standard applications such as spectroscopy, imaging or quality control). The increasing complexity of computer operating systems, higher expectations with respect to user-friendliness and graphical user interfaces, as well as the increasing complexity of the NMR experiments themselves, have made spectrometer control software development a more complex task than it used to be some years ago. As a result, it becomes more and more complicated for an individual lab to maintain and develop an infrastructure of purely home-built NMR systems and software. Possible ways out are either commercial NMR hardware with full-blown proprietary software, or semi-standardized home-built equipment with a common open-source software environment for spectrometer control. Our present activities in Darmstadt aim at providing a nucleus for the second option: DArmstadt MAgnetic Resonance Instrument Software (DAMARIS) [1]. Based on an ordinary PC, pulse control cards and ADC cards, we have developed an NMR spectrometer control platform that comes at a price tag of about 8000 Euro. The present functionalities of DAMARIS are mainly focused on TD-NMR: the software was successfully used in single-sided NMR [2] and in pulsed and static field gradient NMR diffusometry [3]. Further work with respect to multipulse/multitriggering experiments in the time domain [4] and solid-state NMR spectroscopy multipulse experiments is under development.
APA, Harvard, Vancouver, ISO, and other styles
44

Musa, Mohamed Elhafiz Mustafa. "Towards Finding Optimal Mixture Of Subspaces For Data Classification." Phd thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/1104512/index.pdf.

Full text
Abstract:
In pattern recognition, when data has different structures in different parts of the input space, fitting one global model can be slow and inaccurate. Learning methods can quickly learn the structure of the data in local regions, consequently offering faster and more accurate model fitting. Breaking the training data set into smaller subsets may, however, lead to the curse of dimensionality problem, as a training sample subset may not be enough for estimating the required set of parameters for the submodels. Increasing the size of the training data may not be feasible in many situations. Interestingly, the data in local regions becomes more correlated. Therefore, by decorrelation methods we can reduce the data dimensions and hence the number of parameters. In other words, we can find uncorrelated low-dimensional subspaces that capture most of the data variability. The current subspace modelling methods have shown better performance than global modelling methods for this type of training data structure. Nevertheless, these methods still need more research work, as they suffer from two limitations: (1) there is no standard method to specify the optimal number of subspaces; (2) there is no standard method to specify the optimal dimensionality for each subspace. In the current models these two parameters are determined beforehand. In this dissertation we propose and test algorithms that try to find a suboptimal number of principal subspaces and a suboptimal dimensionality for each principal subspace automatically.
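One simple way to realise the “mixture of subspaces” idea sketched in the abstract is to partition the input space and fit a low-dimensional PCA subspace per region. The sketch below fixes the number and dimensionality of the subspaces by hand, which is exactly the limitation the dissertation aims to remove; the clustering step and parameters are illustrative assumptions.

```python
# Hedged sketch: partition the input space with k-means, then fit a local PCA
# subspace per partition. The fixed n_subspaces and n_dims are placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def fit_local_subspaces(X, n_subspaces=4, n_dims=2):
    """Partition X with k-means, then fit a PCA subspace in each partition."""
    km = KMeans(n_clusters=n_subspaces, n_init=10, random_state=0).fit(X)
    pcas = []
    for k in range(n_subspaces):
        Xk = X[km.labels_ == k]
        n_comp = min(n_dims, Xk.shape[0], Xk.shape[1])
        pcas.append(PCA(n_components=n_comp).fit(Xk))
    return km, pcas

def reconstruction_error(x, pca):
    """Distance of x to one local subspace; a small error means x fits it well."""
    z = pca.transform(x.reshape(1, -1))
    x_hat = pca.inverse_transform(z)[0]
    return np.linalg.norm(x - x_hat)

# A point (or class) can then be associated with the subspace that
# reconstructs it with the smallest error.
```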
APA, Harvard, Vancouver, ISO, and other styles
45

Gramlich, Ludwig. "Juristische Probleme bei der Entwicklung und Nutzung von Software." Universitätsbibliothek Chemnitz, 1998. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-199800372.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Hsu, Samantha. "CLEAVER: Classification of Everyday Activities Via Ensemble Recognizers." DigitalCommons@CalPoly, 2018. https://digitalcommons.calpoly.edu/theses/1960.

Full text
Abstract:
Physical activity can have immediate and long-term benefits on health and reduce the risk for chronic diseases. Valid measures of physical activity are needed in order to improve our understanding of the exact relationship between physical activity and health. Activity monitors have become a standard for measuring physical activity; accelerometers in particular are widely used in research and consumer products because they are objective, inexpensive, and practical. Previous studies have experimented with different monitor placements and classification methods. However, the majority of these methods were developed using data collected in controlled, laboratory-based settings, which is not reliably representative of real life data. Therefore, more work is required to validate these methods in free-living settings. For our work, 25 participants were directly observed by trained observers for two two-hour activity sessions over a seven day timespan. During the sessions, the participants wore accelerometers on the wrist, thigh, and chest. In this thesis, we tested a battery of machine learning techniques, including a hierarchical classification schema and a confusion matrix boosting method, to predict activity type, activity intensity, and sedentary time in one-second intervals. To do this, we created a dataset containing almost 100 hours' worth of observations from three sets of accelerometer data from an ActiGraph wrist monitor, a BioStampRC thigh monitor, and a BioStampRC chest monitor. Random forest and k-nearest neighbors are shown to consistently perform the best out of our traditional machine learning techniques. In addition, we reduce the severity of error from our traditional random forest classifiers on some monitors using a hierarchical classification approach, and combat the imbalanced nature of our dataset using a multi-class (confusion matrix) boosting method. Out of the three monitors, our models most accurately predict activity using either or both of the BioStamp accelerometers (with the exception of the chest BioStamp predicting sedentary time). Our results show that we outperform previous methods while still predicting behavior at a more granular level.
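A minimal sketch of the hierarchical idea described above (first separate sedentary from active seconds, then assign an activity type only to the active ones) is given below; the labels, features, and the choice of random forests are illustrative assumptions, not the thesis’ exact models.

```python
# Hedged sketch of a two-stage hierarchical activity classifier: stage 1
# separates sedentary from active seconds, stage 2 labels the active ones.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

class HierarchicalActivityClassifier:
    def __init__(self):
        self.stage1 = RandomForestClassifier(n_estimators=100, random_state=0)
        self.stage2 = RandomForestClassifier(n_estimators=100, random_state=0)

    def fit(self, X, y_sedentary, y_activity):
        self.stage1.fit(X, y_sedentary)                 # 1 = sedentary, 0 = active
        active = y_sedentary == 0
        self.stage2.fit(X[active], y_activity[active])  # activity type for active rows
        return self

    def predict(self, X):
        sed = self.stage1.predict(X)
        out = np.array(["sedentary"] * len(X), dtype=object)
        active = sed == 0
        if active.any():
            out[active] = self.stage2.predict(X[active])
        return out
```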
APA, Harvard, Vancouver, ISO, and other styles
47

Chen, Hui. "Identification and classification of shareable tacit knowledge associated with experience in the Chinese software industry sector." Thesis, Loughborough University, 2015. https://dspace.lboro.ac.uk/2134/19659.

Full text
Abstract:
The study reported in this thesis aimed to provide an ontology of professional activities in the software industry that require and enable the acquisition of experience, which, in turn, is the basis for tacit knowledge creation. The rationale behind the creation of such an ontology was based on the need to externalise this tacit knowledge and then record such externalisations so that they can be shared and disseminated across organisations through electronic records management. The research problem here is to reconcile highly theoretical principles associated with tacit knowledge and the ill-defined and quasi-colloquial concept of experience into a tool that can be used by more technical, explicit-knowledge-minded practitioners of electronic records management. The ontology produced and proposed here provides exactly such a bridge, by identifying what aspects of professional and personal experience should be captured and organising these aspects into an explicit classification that can be used to capture the tacit knowledge and codify it into explicit knowledge. Since such ontologies are always closely related to actual contexts of practice, the researcher decided to choose her own national context of China, where she had worked before and had good guarantees of industrial access. This study used a multiple case-study Straussian Grounded Theory inductive approach. Data collection was conducted through semi-structured interviews in order to get direct interaction with practitioners in the field and capture individuals' opinions and perceptions, as well as interpret individuals' understandings associated with these processes. The interviews were conducted in three different and representative types of company (SMEs, State Owned and Large Private) in an attempt to capture a rich variety of possible contexts in the Chinese SW sector. Data analysis was conducted according to the coding procedures advocated by Grounded Theory, namely open, axial and selective coding. Data collection and analysis were conducted until the emergent theory reached theoretical saturation. The theory generated identified 218 different codes out of 797 representative quotations. These codes were grouped and organised into a category hierarchy that includes 6 main categories and 31 sub-categories, which are, in turn, represented in the ontology proposed. This emergent theory indicates in a very concise manner that experienced SW development practitioners in China should be able to understand the nature and value of experience in the SW industry, effectively communicate with other stakeholders in the SW development process, be able and motivated to actively engage with continuous professional development, be able to share knowledge with peers and the profession at large, effectively work on projects, and exhibit a sound professional attitude both internally to their own company and externally to customers, partners and even competitors. This basic theory was then further analysed by applying selective coding. This resulted in a main theory centred on Working in Projects, which was clearly identified as the core activity in the SW industry, reflecting its design and development nature. Directly related to the core category, three other significant categories were identified as enablers: Communication, Knowledge Sharing and Individual Development.
Additionally, Understanding the Nature of Experience in the SW Industry and Professional Attitude were identified as drivers for the entire process of reflection, experience acquisition and tacit knowledge construction by the individual practitioners. Finally, as an integral part of any inductive process of research, the last stage in this study was to position the emergent theory within the body of knowledge. This resulted in the understanding that the theory presented in this study bridges two extremely large bodies of literature: employability skills and competencies. Both of these bodies of literature place their emphasis on explicit knowledge concerning skills and competencies that are defined so that they can be measured and assessed. The focus of the theory proposed in this thesis on experience and the resulting acquisition of tacit knowledge allows a natural link between employability skills and competencies in the SW industry that was hitherto lacking in the body of knowledge. The ontology proposed is of interest to academics in the areas of knowledge management, electronic records management and information systems. The same ontology may also be of interest to human resources practitioners, for selecting and developing experienced personnel, as well as to knowledge and information professionals in organisations.
APA, Harvard, Vancouver, ISO, and other styles
48

Vlas, Radu. "A Requirements-Based Exploration of Open-Source Software Development Projects – Towards a Natural Language Processing Software Analysis Framework." Digital Archive @ GSU, 2012. http://digitalarchive.gsu.edu/cis_diss/48.

Full text
Abstract:
Open source projects do have requirements; they are, however, mostly informal text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming and, for large projects, error-prone. Automated analysis of natural language requirements, even partial, will be of great benefit. Towards that end, I describe the design and validation of an automated natural language requirements classifier for open source software development projects. I compare two strategies for recognizing requirements in open forums of software features. The results suggest that classifying text at the forum post aggregation and sentence aggregation levels may be effective. Initial results suggest that it can reduce the effort required to analyze requirements of open source software development projects. Software development organizations and communities currently employ a large number of software development techniques and methodologies. This complexity is further increased by a wide range of software project types and development environments. The resulting lack of consistency in the software development domain leads to one important challenge that researchers encounter while exploring this area: specificity. This results in an increased difficulty of maintaining a consistent unit of measure or analysis approach while exploring a wide variety of software development projects and environments. The problem of specificity is more prominently exhibited in an area of software development characterized by a dynamic evolution, a unique development environment, and a relatively young history of research when compared to traditional software development: the open-source domain. While performing research on open source and the associated communities of developers, one can notice the same challenge of specificity being present in requirements engineering research as in the case of closed-source software development. Whether research is aimed at performing longitudinal or cross-sectional analyses, or attempts to link requirements to other aspects of software development projects and their management, specificity calls for a flexible analysis tool capable of adapting to the needs and specifics of the explored context. This dissertation covers the design, implementation, and evaluation of a model, a method, and a software tool comprising a flexible software development analysis framework. These design artifacts use a rule-based natural language processing approach and are built to meet the specifics of a requirements-based analysis of software development projects in the open-source domain. This research follows the principles of design science research as defined by Hevner et al. and includes the stages of problem awareness, suggestion, development, evaluation, and results and conclusion (Hevner et al. 2004; Vaishnavi and Kuechler 2007). The long-term goal of the research stream stemming from this dissertation is to propose a flexible, customizable, requirements-based natural language processing software analysis framework which can be adapted to meet the research needs of multiple different types of domains or different categories of analyses.
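As a loose illustration of sentence-level requirements spotting of the kind described above, the sketch below flags forum-post sentences matching a few requirement-like patterns; the patterns are illustrative placeholders, not the dissertation’s actual rule set.

```python
# Hedged sketch of rule-based, sentence-level requirement spotting in forum
# posts; the patterns below are illustrative placeholders only.
import re

REQUIREMENT_PATTERNS = [
    r"\b(should|shall|must|needs? to|has to|have to)\b",
    r"\bit would be (nice|great|useful) (if|to)\b",
    r"\b(add|support|allow|provide)\b.*\b(option|feature|ability)\b",
]

def requirement_sentences(post: str):
    """Return the sentences of a forum post that look like requirements."""
    sentences = re.split(r"(?<=[.!?])\s+", post.strip())
    return [s for s in sentences
            if any(re.search(p, s, flags=re.IGNORECASE)
                   for p in REQUIREMENT_PATTERNS)]

print(requirement_sentences(
    "Thanks for the quick reply. The importer should support UTF-8 files. "
    "It would be great if you could add an option to skip headers."))
```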
APA, Harvard, Vancouver, ISO, and other styles
49

Horáková, Jana. "The gestures that software culture is made of." Hochschule für Musik und Theater 'Felix Mendelssohn Bartholdy' Leipzig, 2016. https://slub.qucosa.de/id/qucosa%3A15860.

Full text
Abstract:
This paper demonstrates the relevance of Vilém Flusser’s concept of the post-industrial (programmed) apparatus in contemporary programmed media theory, represented in the paper by software studies. Software art projects that investigate the limits of apparatus programmability are introduced as examples of artistic gestures of freedom. The interpretation is supported by references to the general theory of gesture proposed by Flusser. The paper suggests that this new interpretative method, described by the author as a discipline for the ‘new people’ of the future, can serve alongside software studies as an appropriate theory of software art, understood as gestures of freedom within the apparatus of programmed media.
APA, Harvard, Vancouver, ISO, and other styles
50

Herrmann, Paul, Udo Kebschull, and Wilhelm G. Spruth. "Hard- und Software-Entwicklung eines ATM-Testgenerator/ Monitors." Universität Leipzig, 1998. https://ul.qucosa.de/id/qucosa%3A34538.

Full text
Abstract:
The B-ISDN model describes a general-purpose digital network that supports the transmission of data and voice alike over a common network. ATM (Asynchronous Transfer Mode) was specified by the ITU (International Telecommunication Union) as the transport protocol for B-ISDN. Since ATM networks are connection-oriented high-speed networks with transmission rates of 155 Mbit/s and above, the methods for error verification differ substantially from those used in conventional networks. This paper presents an ATM test generator/monitor that is being developed at the University of Leipzig on the basis of a specific development environment for embedded systems from the FZI Karlsruhe. Starting from the ATM test devices and test strategies currently available on the market, the hardware and software architecture of the ATM test generator/monitor is explained and the problems that arise are discussed. To make the ATM test generator/monitor accessible from remote locations as well, a user interface is to communicate with the Netscape browser via a suitable protocol. With the help of a CORBA ORB, different application programs are made available to the user.
APA, Harvard, Vancouver, ISO, and other styles