Dissertations / Theses on the topic 'Datasety'

Author: Grafiati

Published: 28 June 2021

Last updated: 29 July 2025

Consult the top 50 dissertations / theses for your research on the topic 'Datasety.'

1

Zembjaková, Martina. "Prieskum a taxonómia sieťových forenzných nástrojov." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2021. http://www.nusl.cz/ntk/nusl-445488.

Abstract:

This master's thesis presents a survey and taxonomy of network forensic tools. It describes the fundamentals of network forensic analysis, including process models, techniques, and the data sources used in forensic analysis. The thesis then surveys and compares existing taxonomies of network forensic tools, followed by a survey of the network forensic tools themselves. Besides the tools mentioned in the taxonomy survey, the tools discussed include some additional network tools. The thesis then describes and compares in detail the datasets that are the basis

2

Kratochvíla, Lukáš. "Trasování objektu v reálném čase." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403748.

Abstract:

Tracking a generic object in real time on a resource-constrained device is difficult. Many algorithms addressing this problem already exist, and this thesis surveys them. Various approaches to the problem are discussed, including deep learning. Object representations, datasets, and evaluation metrics are presented. Many tracking algorithms are introduced; eight of them are implemented and evaluated on the VOT dataset.

3

Singh, Manjeet. "A Comparison of Rule Extraction Techniques with Emphasis on Heuristics for Imbalanced Datasets." Ohio University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1282139633.

4

Silva, Jesús, Palma Hugo Hernández, Núñez William Niebles, David Ovallos-Gazabon, and Noel Varela. "Parallel Algorithm for Reduction of Data Processing Time in Big Data." Institute of Physics Publishing, 2020. http://hdl.handle.net/10757/652134.

Abstract:

Technological advances have made it possible to collect and store large volumes of data over the years. It is also important that today's applications perform well and can analyze these large datasets effectively. It remains a challenge for data mining to keep its algorithms and applications efficient as data size and dimensionality increase [1]. To achieve this goal, many applications rely on parallelism, which can reduce cost as a function of the algorithms' execution time because it takes advantage of the characteristics

5

Munyombwe, Theresa. "The harmonisation of stroke datasets : a case study of four UK datasets." Thesis, University of Leeds, 2016. http://etheses.whiterose.ac.uk/13511/.

Abstract:

Longitudinal studies of stroke patients play a critical part in developing stroke prognostic models. Stroke longitudinal studies are often limited by small sample sizes, poor recruitment, and high attrition levels. Some of these limitations can be addressed by harmonising and pooling data from existing studies. Thus, this thesis evaluated the feasibility of harmonising and pooling secondary stroke datasets to investigate the factors associated with disability after stroke. Data from the Clinical Information Management System for Stroke study (n=312), Stroke Outcome Study 1 (n=448), Stroke Outcom

6

YADAV, DEEPIKA. "SENTIMENT ANALYSIS ON TWITTER DATA." Thesis, DELHI TECHNOLOGICAL UNIVERSITY, 2020. http://dspace.dtu.ac.in:8080/jspui/handle/repository/18821.

Abstract:

Before purchasing an item, people typically visit different shops in the market, ask about the item, its cost, and its guarantee, and finally buy the item based on the impressions they formed about cost and quality of service. This process is tedious, and the chances of being cheated by the seller are higher because no one advises the buyer on where to find a genuine item at a fair price. Nowadays, however, a good number of people rely on the online market to buy the items they need. This is on the grou

7

Furman, Yoel Avraham. "Forecasting with large datasets." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:69f2833b-cc53-457a-8426-37c06df85bc2.

Abstract:

This thesis analyzes estimation methods and testing procedures for handling large data series. The first chapter introduces the use of the adaptive elastic net, and the penalized regression methods nested within it, for estimating sparse vector autoregressions. That chapter shows that under suitable conditions on the data generating process this estimation method satisfies an oracle property. Furthermore, it is shown that the bootstrap can be used to accurately conduct inference on the estimated parameters. These properties are used to show that structural VAR analysis can also be validly cond

8

Mumtaz, Shahzad. "Visualisation of bioinformatics datasets." Thesis, Aston University, 2015. http://publications.aston.ac.uk/25261/.

Abstract:

Analysing the molecular polymorphism and interactions of DNA, RNA and proteins is of fundamental importance in biology. Predicting functions of polymorphic molecules is important in order to design more effective medicines. Analysing major histocompatibility complex (MHC) polymorphism is important for mate choice, epitope-based vaccine design and transplantation rejection etc. Most of the existing exploratory approaches cannot analyse these datasets because of the large number of molecules with a high number of descriptors per molecule. This thesis develops novel methods for data projection in

9

Mazumdar, Suvodeep. "Visualising large semantic datasets." Thesis, University of Sheffield, 2013. http://etheses.whiterose.ac.uk/5932/.

Abstract:

This thesis aims at addressing a major issue in Semantic Web and organisational Knowledge Management: consuming large scale semantic data in a generic, scalable and pleasing manner. It proposes two solutions by de-constructing the issue into two sub problems: how can large semantic result sets be presented to users; and how can large semantic datasets be explored and queried. The first proposed solution is a dashboard-based multi-visualisation approach to present simultaneous views over different facets of the data. Challenges imposed by existing technology infrastructure resulted in the devel

10

De, León Eduardo Enrique. "Medical abstract inference dataset." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/119516.

Abstract:

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (page 35).

In this thesis, I built a dataset for predicting clinical outcomes from medical abstracts and their title. Medical Abstract Inference consists of 1,794 data points. Titles were filtered to include the abstract's repor

11

Schöner, Holger. "Working with real world datasets preprocessing and prediction with large incomplete and heterogeneous datasets /." [S.l.] : [s.n.], 2005. http://deposit.ddb.de/cgi-bin/dokserv?idn=973424672.

12

Schmidt, Heiko A. "Phylogenetic trees from large datasets." [S.l. : s.n.], 2003. http://deposit.ddb.de/cgi-bin/dokserv?idn=968534945.

13

Gemulla, Rainer. "Sampling Algorithms for Evolving Datasets." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2008. http://nbn-resolving.de/urn:nbn:de:bsz:14-ds-1224861856184-11644.

Abstract:

Perhaps the most flexible synopsis of a database is a uniform random sample of the data; such samples are widely used to speed up the processing of analytic queries and data-mining tasks, to enhance query optimization, and to facilitate information integration. Most of the existing work on database sampling focuses on how to create or exploit a random sample of a static database, that is, a database that does not change over time. The assumption of a static database, however, severely limits the applicability of these techniques in practice, where data is often not static but continuously evol

14

Jones, Martin. "Multigene datasets for deep phylogeny." Thesis, University of Edinburgh, 2007. http://hdl.handle.net/1842/2575.

Abstract:

Though molecular phylogenetics has been very successful in reconstructing the evolutionary history of species, some phylogenies, particularly those involving ancient events, have proven difficult to resolve. One approach to improving the resolution of deep phylogenies is to increase the amount of data by including multiple genes assembled from public sequence databases. Using modern phylogenetic methods and abundant computing power, the vast amount of sequence data available in public databases can be brought to bear on difficult phylogenetic problems. In this thesis I outline the motivation f

15

Alawini, Abdussalam. "Identifying Relationships between Scientific Datasets." Thesis, Portland State University, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10127966.

Abstract:

Scientific datasets associated with a research project can proliferate over time as a result of activities such as sharing datasets among collaborators, extending existing datasets with new measurements, and extracting subsets of data for analysis. As such datasets begin to accumulate, it becomes increasingly difficult for a scientist to keep track of their derivation history, which complicates data sharing, provenance tracking, and scientific reproducibility. Understanding what relationships exist between datasets can help scientists recall their original derivation history. For instance,

16

Traore, Michael. "Interactive visualization for volumetric datasets." Thesis, Toulouse, ISAE, 2018. http://www.theses.fr/2018ESAE0028.

Abstract:

Occlusion is a problem in volumetric visualization because it prevents direct visualization of a region of interest. While most existing systems use a combination of direct volume rendering (DVR) techniques and their corresponding transfer function (TF), we considered alternative interaction techniques for exploring these datasets. First, we proposed a new interactive visualization system for 3D-scanned baggage, accelerated by GPGPU techniques, in accordance with the requirements we extracted from the

17

Giritharan, Balathasan. "Incremental Learning with Large Datasets." Thesis, University of North Texas, 2012. https://digital.library.unt.edu/ark:/67531/metadc149595/.

Abstract:

This dissertation focuses on a novel learning strategy based on geometric support vector machines to address the difficulties of processing immense data sets. Support vector machines find the hyperplane that maximizes the margin between two classes; because the decision boundary is represented by a few training samples, they are a favorable choice for incremental learning. The dissertation presents a novel method, Geometric Incremental Support Vector Machines (GISVMs), to address both efficiency and accuracy issues in handling massive data sets. In GISVM, skin of convex hulls is defined and an

18

Barnathan, Michael. "Mining Complex High-Order Datasets." Diss., Temple University Libraries, 2010. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/82058.

Abstract:

Computer and Information Science, Ph.D.

Selection of an appropriate structure for storage and analysis of complex datasets is a vital but often overlooked decision in the design of data mining and machine learning experiments. Most present techniques impose a matrix structure on the dataset, with rows representing observations and columns representing features. While this assumption is reasonable when features are scalar and do not exhibit co-dependence, the matrix data model becomes inappropriate when dependencies between non-target features must be modeled in parallel, or when features

19

Brugnara, Martin. "UNDERSTANDING AND MANAGING COMPLEX DATASETS." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/337818.

Abstract:

Nowadays, we are producing and collecting data at an unprecedented rate, measured in the order of petabytes per minute, together with a substantial increase in data volume, complexity, and variety. While tabular and unstructured data still dominate the scene, graphs are becoming ever more prominent, bringing new challenges. The size and complexity of graph datasets have increased, thus renewing the interest in graph databases and distributed graph processing. The current abundance of data and content complicates even simply accessing data. Web users are constantly overwhelmed by the availabili

20

STOCCHI, MARCO. "Inference Engines for Streaming Datasets." Doctoral thesis, Università degli Studi di Cagliari, 2017. http://hdl.handle.net/11584/249556.

Abstract:

The problem of forecasting streaming datasets, particularly financial time series, has been largely explored in the past, but we believe the advancement of technologies such as the Internet of Things, which will connect an exponentially increasing number of sensors and devices, endowed with limited computational resources, yet capable of producing enormous amounts of sampled data, and the progressively higher social need to deploy intelligent systems, will make the prediction of time series a core industrial issue in the near future. Consequently, we also believe that investigating efficie

21

Jayaraman, Jayakumar. "Dental age assessment of Southern Chinese using Demirjian's dataset and the United Kingdom dataset." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2010. http://hub.hku.hk/bib/B45447767.

22

Horečný, Peter. "Metody segmentace obrazu s malými trénovacími množinami." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-412996.

Abstract:

The goal of this thesis was to propose an image segmentation method that can segment images effectively when only small datasets are available. A recently published ODE neural network was used for this method, because its features should provide better generalization for tasks with only small datasets available. The proposed ODE-UNet network was created by combining the UNet architecture with the ODE neural network, using the benefits of both networks. ODE-UNet reached the following results on the ISBI dataset: Rand: 0.950272 and Info: 0.978061. These results are better than the ones received from UNet mo

23

Koufakou, Anna. "SCALABLE AND EFFICIENT OUTLIER DETECTION IN LARGE DISTRIBUTED DATA SETS WITH MIXED-TYPE ATTRIBUTES." Doctoral diss., University of Central Florida, 2009. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3431.

Abstract:

An important problem that often appears when analyzing data involves identifying irregular or abnormal data points called outliers. This problem broadly arises under two scenarios: when outliers are to be removed from the data before analysis, and when useful information or knowledge can be extracted from the outliers themselves. Outlier Detection in the context of the second scenario is a research field that has attracted significant attention in a broad range of useful applications. For example, in credit card transaction data, outliers might indicate potential fraud; in network traffic data,

24

Sysoev, Oleg. "Monotonic regression for large multivariate datasets /." Linköping : Department of Computer and Information Science, Linköping University, 2010. http://www2.bibl.liu.se/liupubl/disp/disp2010/stat11s.pdf.

25

Mahmood, Muhammad Habib. "Motion annotation in complex video datasets." Doctoral thesis, Universitat de Girona, 2018. http://hdl.handle.net/10803/667583.

Abstract:

Motion segmentation refers to the process of separating regions and trajectories from a video sequence into coherent subsets of space and time. In this thesis, we created a new multifaceted motion segmentation dataset enclosing real-life long and short sequences, with different numbers of motions and frames per sequence, and real distortions with missing data. Trajectory- and region-based ground-truth is provided on all the frames of all the sequences. We also proposed a new semi-automatic tool for delineating the trajectories in complex videos, even in videos captured from moving cameras. Wit

26

Shi, Xiaojin. "Visual learning from small training datasets /." Diss., Digital Dissertations Database. Restricted to UC campuses, 2005. http://uclibs.org/PID/11984.

27

Fraser, Ross Macdonald. "Computational analysis of nucleosome positioning datasets." Thesis, University of Edinburgh, 2006. http://hdl.handle.net/1842/29110.

Abstract:

Monomer extension (ME) is an established in vitro experimental technique which maps the positions adopted by reconstituted core histone octamers on a defined DNA sequence. It provides quantitative positioning information, at high resolution, over long continuous stretches of DNA sequence. This technique has been employed to map several genes: globin genes (8 kbp), the beta-lactoglobulin gene (10 kbp) and various imprinting genes (4 kbp). This study explores and analyses this unique dataset, utilising computational and stochastic techniques, to gain insight into the potential influence o

28

Cotter, Andrew. "Regression on datasets containing missing elements." Diss., Connect to online resource, 2005. http://wwwlib.umi.com/cr/colorado/fullcit?p1425786.

29

Blum, Joshua (Joshua M. ). "Pinky : interactively analyzing large EEG datasets." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/105939.

Abstract:

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 75-77).

In this thesis, I describe a system I designed and implemented for interactively analyzing large electroencephalogram (EEG) datasets. Trained experts, known as encephalographers, analyze EEG data to determine if a

30

Hilton, Erwin. "Visual datasets for artificial intelligence agents." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119553.

Abstract:

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from PDF version of thesis. Includes bibliographical references (page 41).

In this thesis, I designed and implemented two visual dataset generation tool frameworks. With these tools, I introduce procedurally generated new data to test VQA agents and other visual AI models on. The first tool is Spatial IQ Gene

31

Liu, Fang. "Mining Security Risks from Massive Datasets." Diss., Virginia Tech, 2017. http://hdl.handle.net/10919/78684.

Abstract:

Cyber security risk has been a problem ever since the appearance of telecommunication and electronic computers. Over the past 30 years, researchers have developed various tools to protect the confidentiality, integrity, and availability of data and programs. However, new challenges are emerging as the amount of data grows rapidly in the big data era. On one hand, attacks are becoming stealthier by concealing their behaviors in massive datasets. On the other hand, it is becoming more and more difficult for existing tools to handle massive datasets with various data types. This thesis presen

32

Smith, Zach. "Joining and aggregating datasets using CouchDB." Master's thesis, University of Cape Town, 2018. http://hdl.handle.net/11427/29530.

Abstract:

Data mining typically requires implementing operations that involve cross-cutting entity boundaries and are awkward to implement in document-oriented databases. CouchDB, for example, models entities as documents, with highly isolated entity boundaries, and on which joins cannot be directly performed. This project shows how join and aggregation can be achieved across entity boundaries in such systems, as encountered for example in the pre-processing and exploration stages of educational data mining. A software stack is presented as a means by which this can be achieved; first, datasets are proc

33

Somasundaram, Jyothilakshmi. "Releasing Recommendation Datasets while Preserving Privacy." Miami University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=miami1306427987.

34

Han, Qian. "Mining Shared Decision Trees between Datasets." Wright State University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=wright1274807201.

35

Joshi, Vineet. "Unsupervised Anomaly Detection in Numerical Datasets." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1427799744.

36

Siddique, Nahian A. "PATTERN RECOGNITION IN CLASS IMBALANCED DATASETS." VCU Scholars Compass, 2016. http://scholarscompass.vcu.edu/etd/4480.

Abstract:

Class imbalanced datasets constitute a significant portion of the machine learning problems of interest, where recognizing the ‘rare class’ is the primary objective for most applications. Traditional linear machine learning algorithms are often not effective in recognizing the rare class. In this research work, a specifically optimized feed-forward artificial neural network (ANN) is proposed and developed to train from moderate to highly imbalanced datasets. The proposed methodology deals with the difficulty in classification task in multiple stages—by optimizing the training dataset, modifyi

37

Zhang, Xiaoyu. "Scalable isocontour visualization for large datasets /." Full text (PDF) from UMI/Dissertation Abstracts International, 2001. http://wwwlib.umi.com/cr/utexas/fullcit?p3064695.

38

Yang, Chaozheng. "Sufficient Dimension Reduction in Complex Datasets." Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/404627.

Abstract:

Statistics, Ph.D.

This dissertation focuses on two problems in dimension reduction. One is using a permutation approach to test predictor contribution. The permutation approach applies to marginal coordinate tests based on dimension reduction methods such as SIR, SAVE and DR. This approach no longer requires calculation of the method-specific weights to determine the asymptotic null distribution. The other is combining a clustering method with robust regression (least absolute deviation) to estimate the dimension reduction subspace. Compared with ordinary least squares, the proposed m

39

Katcoff, Abigail. "Aligning heterogenous single cell assay datasets." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/123030.

Abstract:

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 51-53).

Pluripotent stem cells offer strong promise for regenerative medicine but the pluripotent cell state is poorly understood. The goal of this thesis is the development of methods to analyze how the multiple facets of

40

Gerick, Steven Anthony. "Information Engineering with E-Learning Datasets." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-265008.

Abstract:

The rapid growth of the E-learning industry necessitates a streamlined process for identifying actionable information in the user databases maintained by E-learning companies. This paper applies several traditional mathematical and some machine learning techniques to one such dataset with the goal of identifying patterns in user proficiency that are not readily apparent from simply viewing the data. We also analyze the applicability of such methods to the dataset in question and datasets like it. We find that many of the methods can reveal useful insights into the dataset, even if some methods

41

Concepcion, Miranda Tomas Javier. "Profiling and Visualizing Android Malware Datasets." Electronic Thesis or Diss., CentraleSupélec, 2022. http://www.theses.fr/2022CSUP0005.

Abstract:

Mobile devices are ubiquitous: today, most people own a mobile phone. As a result, these devices are a target of interest for attackers. The attacks are delivered through malicious applications that can harm mobile devices. Malware analysis researchers work to recognize these kinds of programs before they are installed on a user's device. To do so, they run experiments to detect such malware automatically, using sets of malware and applicatio

42

Roizman, Violeta. "Flexible clustering algorithms for heterogeneous datasets." Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG002.

Abstract:

The goal of data segmentation, or clustering, is to find homogeneous groups according to a predetermined distance. Because it is unsupervised, clustering can be applied to any type of data and avoids labeling processes, which can prove very costly. Among the most popular clustering algorithms, the one based on the Gaussian mixture model (GMM) is of particular interest: it is very intuitive and works very well when the groups have an elliptical shape. However, the GMM model

43

Talár, Ondřej. "Redukce šumu audionahrávek pomocí hlubokých neuronových sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2017. http://www.nusl.cz/ntk/nusl-317118.

Abstract:

The thesis focuses on the use of a deep recurrent neural network with the Long Short-Term Memory (LSTM) architecture for robust denoising of audio signals. LSTM is currently very attractive due to its ability to remember previous weights, or to edit them not only according to the algorithms used but also by examining changes in neighboring cells. The work describes the selection of the initial dataset and the noise used, along with the creation of optimal test data. The Keras framework for Python is selected for building the training network, and possible candidates for a viable solution are explored and discussed

44

Tao, F. "Data mining for relationships in large datasets." Thesis, Queen's University Belfast, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.273298.

45

Romuld, Daniel, and Markus Ruhmén. "Compiling attention datasets : Developing a method for annotating face datasets with human performance attention labels using crowdsourcing." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-166708.

Abstract:

This essay expands on the problem of human attention detection in computer vision. This is achieved by providing a method for annotating existing face datasets with attention labels through the use of human intelligence. The work described in this essay is justified by a lack of human performance attention datasets and the potential uses of the developed method. Several images of crowds were generated using the Labeled Faces in the Wild dataset of images depicting faces, thus enabling evaluation of the level of attention of the depicted subjects as part of a crowd. The data collection methodol

46

Liu, Qing Computer Science & Engineering Faculty of Engineering UNSW. "Summarization of very large spatial dataset." Awarded by:University of New South Wales. School of Computer Science and Engineering, 2006. http://handle.unsw.edu.au/1959.4/25489.

Abstract:

Nowadays there are a large number of applications, such as digital library information retrieval, business data analysis, CAD/CAM, multimedia applications with images and sound, real-time process control and scientific computation, with data sets of gigabytes, terabytes, or even petabytes. Because data distributions are too large to be stored accurately, maintaining compact and accurate summarized information about underlying data is crucially important. The summarizing problem for Level 1 (disjoint and non-disjoint) topological relationship has been well studied for the past few years. Ho

47

TAMPONI, EMANUELE. "Dataset analysis for classifier ensemble enhancement." Doctoral thesis, Università degli Studi di Cagliari, 2015. http://hdl.handle.net/11584/266597.

Abstract:

We developed three different methods for dataset analysis and ensemble enhancement. They share the underlying idea that an accurate preprocessing and adaptation of the data can improve the system performance, without changing the classification model. Correlation Score is a generic framework for assessing encoding techniques by measuring the correlation between the encoded feature vectors and the corresponding class labels; experiments show its effectiveness in discovering the best encoding configurations among those tested, on a wide range of classification domains. Multi-Resolut

48

Lamichhane, Niraj. "Prediction of Travel Time and Development of Flood Inundation Maps for Flood Warning System Including Ice Jam Scenario. A Case Study of the Grand River, Ohio." Youngstown State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1463789508.

49

Giommi, Luca. "Predicting CMS datasets popularity with machine learning." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/9136/.

Abstract:

A Data Analytics project has been launched in CMS and, within it, a specific pilot activity that aims to use Machine Learning techniques to predict the popularity of CMS datasets. This is a very delicate observable whose prediction would allow CMS to build smarter data placement models, make broad optimizations in storage use at all Tier levels, and form the basis for introducing a solid, dynamic, and adaptive data management system. This thesis describes the work done using a new pilot prototype

50

Matsubara, Yasuko. "Statistical Data Mining for Time-series Datasets." 京都大学 (Kyoto University), 2012. http://hdl.handle.net/2433/157475.
