Dissertations / Theses on the topic 'Shluková analýza dat'

Consult the top 44 dissertations / theses for your research on the topic 'Shluková analýza dat.'

1

Sobíšek, Lukáš. "Shluková a regresní analýza mikropanelových dat." Doctoral thesis, Vysoká škola ekonomická v Praze, 2010. http://www.nusl.cz/ntk/nusl-261941.

Abstract:
The main purpose of panel studies is to analyze changes in the values of the studied variables over time. In micro panel research, a large number of elements are observed periodically over a relatively short period of just a few years, so the number of repeated measurements is small. This dissertation deals with contemporary approaches to the regression and cluster analysis of micro panel data. One approach to micro panel analysis is to take multivariate statistical models originally designed for cross-sectional data and modify them to account for within-subject correlation. The thesis summarizes the available tools for the regression analysis of micro panel data. The known and currently used linear mixed effects models for a normally distributed dependent variable are recapitulated. In addition, new approaches to the analysis of a response variable with a non-normal distribution are presented. These approaches include the generalized marginal linear model, the generalized linear mixed effects model and the Bayesian modelling approach. Besides describing these models, the thesis gives a brief overview of their implementation in the R software. The difficulty with regression models adjusted for micro panel data is the ambiguity of their parameter estimates. The thesis proposes a way to improve the estimates through cluster analysis, and therefore also describes methods for the cluster analysis of micro panel data. Because the supply of such methods is limited, the main goal of the thesis is to devise its own two-step approach to clustering micro panel data. In the first step, the panel data are transformed into a static form using a set of proposed characteristics of dynamics. These characteristics represent different features of the time course of the observed variables. In the second step, the elements are clustered by conventional spatial clustering techniques (agglomerative clustering and C-means partitioning). The clustering is based on a dissimilarity matrix of the values of the clustering variables calculated in the first step. Another goal is to find out whether the suggested procedure improves the quality of regression models for this type of data. By means of a simulation study, the proposed procedure is compared to the procedure applied in the kml package of the R software, as well as to the clustering characteristics proposed by Urso (2004). The simulation study demonstrated better results for the proposed combination of clustering variables than for the other combinations currently in use. A corresponding script written in R is a further contribution of the thesis. It is available on the attached CD and can be used to analyze readers' own micro panel data.
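The two-step idea in this abstract can be illustrated with a small sketch. It is not the thesis's R script: the abstract does not list the proposed characteristics of dynamics, so the three used here (mean level, least-squares slope, largest change between consecutive waves) and all function names are assumptions made for the example.

```python
from math import sqrt

def dynamics_features(series):
    """Step 1: reduce one short trajectory to static characteristics.
    The choice of characteristics is hypothetical, not taken from the thesis."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    # Least-squares slope of the values against the time index 0..n-1.
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    # Largest absolute change between two consecutive measurements.
    max_jump = max(abs(b - a) for a, b in zip(series, series[1:]))
    return (y_mean, slope, max_jump)

def dissimilarity_matrix(panel):
    """Input for step 2: Euclidean distances between the feature vectors."""
    feats = [dynamics_features(s) for s in panel]
    return [[sqrt(sum((a - b) ** 2 for a, b in zip(f, g))) for g in feats]
            for f in feats]

# Invented micro panel: three elements, four waves each.
panel = [[1, 2, 3, 4], [2, 3, 4, 5], [9, 7, 5, 3]]
d = dissimilarity_matrix(panel)
```

The resulting matrix could then be passed to any agglomerative or partitioning routine. In practice the characteristics would probably need to be standardized first, since they are on different scales.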
2

Helcl, Zbyněk. "Shluková analýza víceroměrných dat neuronovou sítí." Master's thesis, Vysoká škola ekonomická v Praze, 2008. http://www.nusl.cz/ntk/nusl-3810.

Abstract:
The topic of the present thesis is an analysis of a sample data archive containing measured values of real and reactive power. The measurement in question took place in late 2006 and early 2007 using MEg40 recording measurement devices disposed in a station for transforming high voltage to low voltage in the Pražská energetika distribution network. The procedure of processing measured values, the preparation thereof for a subsequent processing by a neural network, and a final statistical evaluation of determined individual clusters -- typical daily take-off diagrams -- will be described. The results of the present thesis may be applied in the making of predictions of electrical energy consumption at a particular transformer station.
3

Žambochová, Marta. "Shluková analýza rozsáhlých souborů dat: nové postupy založené na metodě k-průměrů." Doctoral thesis, Vysoká škola ekonomická v Praze, 2005. http://www.nusl.cz/ntk/nusl-77061.

Abstract:
Cluster analysis has become one of the main tools for extracting knowledge from data, which is known as data mining. In this area of data analysis, large data sets are often processed, large both in the number of objects and in the number of variables that characterize the objects. Many methods for data clustering have been developed. One of the most widely used is the k-means method, which is suitable for clustering data sets containing a large number of objects. It is based on finding the best clustering relative to the initial distribution of objects into clusters, followed by step-by-step redistribution of objects among the clusters according to the optimization function. The aim of this Ph.D. thesis was to compare selected variants of existing k-means methods, to characterize their strengths and weaknesses in detail, to propose new alternatives of the method and to compare them experimentally with existing approaches. These objectives were met. The work focuses on modifications of the k-means method for clustering a large number of objects, specifically on the BIRCH k-means, filtering, k-means++ and two-phase algorithms. I examined the time complexity of the algorithms, the effect of the initial distribution and of outliers, and the validity of the resulting clusters. Two real data files and several generated data sets were used. The common and distinct features of the methods under investigation are summarized at the end of the work. The main aim and benefit of the work is to devise my own modifications that address the bottlenecks of the basic procedure and of the existing variants, and to program and verify them. Some modifications accelerated the processing. Applying the main ideas of the k-means++ algorithm to other variants of the k-means method improved the clustering results. The most significant of the proposed changes is a modification of the filtering algorithm that gives the algorithm an entirely new feature, the detection of outliers. An accompanying CD is enclosed. It includes the source code of programs written in the MATLAB development environment. The programs were created specifically for this work and are intended for experimental use. The CD also contains the data files used in the various experiments.
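The initialization idea behind k-means++ mentioned in this abstract can be sketched as follows. This is a generic illustration in Python, not the thesis's MATLAB code. The function names and the sample points are made up, and the sketch assumes the data contain at least k distinct points.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centers: the first uniformly at random, each further one
    with probability proportional to its squared distance to the nearest
    center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

# Invented two-dimensional data with three visually separate groups.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (10.0, 0.0)]
seeds = kmeans_pp_seeds(points, 3, random.Random(0))
```

Because points far from every existing center are favoured, the seeds tend to land in different groups, which is the property the abstract reports as improving other k-means variants.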
4

Langer, Tomáš. "Shluková analýza 100 českých firem na základě účetních výkazů." Master's thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2017. http://www.nusl.cz/ntk/nusl-318609.

Abstract:
The diploma thesis, titled Cluster Analysis of 100 Czech Companies on the Basis of Financial Statements, tests two hypotheses using a multidimensional statistical method, cluster analysis. The input data are the publicly available financial statements of selected companies for the years 2014 and 2015. These data are digitized and subjected to methods of financial analysis.
5

Klus, Roman. "Analýza velkých dat v kontextu optimalizace mobilních sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-400542.

Abstract:
This thesis deals with big data technologies in the context of measuring network parameters. It describes the topic of big data and its uses, and introduces the basic network parameters, their measurement and evaluation methods. It evaluates the RTR NetTest application, its test procedure and the measured parameters. A set of tools was created for assessing the basic quantitative parameters of a mobile network based on data from the RTR database. An analysis of the daily effect summarizes the temporal variability of the network. Spatial behaviour is assessed by binning and cluster analysis, together with a comparison of controlled testing and crowdsourcing.
6

Vilikus, Ondřej. "Shlukovací metody pro velké soubory dat." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-4408.

Abstract:
With the growing volume of collected and stored data, there is a need for clustering methods that can cope with large data sets. A number of new algorithms have therefore appeared, drawing on both statistical approaches and machine learning. The aim of this diploma thesis is to briefly introduce the available methods of cluster analysis and to assess their strengths and weaknesses when analyzing large data sets. The theoretical part summarizes the basic concepts and principles common to all the methods and describes the best-known methods of cluster analysis. The description briefly explains the principle on which each method works and the advantages or possible shortcomings to expect when using it. The practical part is devoted to testing eight methods available in commercial (SPSS, S-PLUS, STATISTICA) or academic (Weka) software. Testing uses artificial data sets with specific characteristics, which I generated with my own algorithm. The algorithm is an extension of the Neyman-Scott process and generates clusters of irregular shapes in addition to spherical ones. The results confirm the expectations based on theoretical assumptions. They also make it possible to quantify the influence of the character of the data on the suitability of the individual methods.
7

Riedl, Pavel. "Modul shlukové analýzy systému pro dolování z dat." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237095.

Abstract:
This master's thesis deals with the development of a module for a data mining system that is being developed at FIT. The first part describes the general knowledge discovery process and cluster analysis, including cluster validation; it also describes Oracle Data Mining and the algorithms it uses for clustering. It closes with the system itself and the technologies it uses, such as NetBeans Platform and DMSL. The second part describes the design of a clustering module and of a module used to compare its results. It also deals with the visualization of cluster analysis results and presents the results achieved.
8

Pánková, Barbara. "Analýza úrovně kvality života pomocí shlukové analýzy a porovnání s Human Development Indexem." Master's thesis, Vysoká škola ekonomická v Praze, 2015. http://www.nusl.cz/ntk/nusl-264466.

Abstract:
Quality of life is a frequently discussed topic nowadays. The term is defined with considerable ambiguity and disunity, since there is neither a universally accepted definition nor a theoretically sophisticated model. Several international organizations monitor quality of life using a variety of indicators; one of them is the United Nations Development Programme. This organization annually publishes the Human Development Index, which divides the world's countries into four groups according to their level of development: low, medium, high and very high. The aim of this thesis is to analyze the quality of life in 125 countries using cluster analysis, specifically Ward's method. Quality of life is evaluated on the basis of 19 demographic and economic indicators, which include life expectancy, literacy rate, access to drinking water and infant mortality rate. The cluster analysis divided the countries into clusters by their similarity. The analysis produced six clusters, which were compared with the results of the Human Development Index. The clusters reflect very well the division commonly used to characterize developing and developed countries. Each of the six clusters can be described and characterized well in terms of quality of life. The clusters can also be labelled as poorest developing, low developed, moderately developed, medium development, high development and very high development countries. Based on the results, it can be stated that this analysis is consistent with other indicators of quality of life and that the resulting clusters coincide with the commonly used division of countries.
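Ward's method, named in this abstract, merges at each step the two clusters whose union increases the within-cluster sum of squares the least. The following is a minimal, unoptimized sketch of that rule. The two indicators per country and their values are invented for illustration and are not data from the thesis.

```python
def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return [sum(col) / n for col in zip(*cluster)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares caused by merging a and b."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2

def ward_clustering(points, k):
    """Start from singletons and merge the cheapest pair until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical (life expectancy, literacy rate) pairs for five countries.
countries = [(80.0, 99.0), (79.0, 98.0), (55.0, 40.0), (57.0, 45.0), (68.0, 80.0)]
groups = ward_clustering(countries, 2)
```

With real indicators measured in different units, the variables would normally be standardized before clustering so that no single indicator dominates the distances.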
9

Abraham, Lukáš. "Analýza dat síťové komunikace mobilních zařízení." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-432938.

Abstract:
The thesis first describes the DNS and SSL/TLS protocols, focusing mainly on communication between devices that use them. It then covers data preprocessing and data cleaning, followed by basic data mining techniques such as data classification, association rules, information retrieval, regression analysis and cluster analysis. The next chapter explains how mobile devices can be identified on a network. The data sets used in the practical part, which contain data collected from communication over the above protocols, are then evaluated. After that comes the design of a system for analyzing network communication data, with a description of the libraries used and of the whole implementation. Finally, a large number of experiments are performed and evaluated.
10

Hlosta, Martin. "Modul pro shlukovou analýzu systému pro dolování z dat." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237158.

Abstract:
This thesis deals with the design and implementation of a cluster analysis module for DataMiner, a data mining system currently being developed at FIT BUT. Until now, the system has lacked a cluster analysis module. The main objective of the thesis was therefore to extend the system with such a module. Pavel Riedl worked on the module together with me. We created a part common to all the algorithms so that the system can easily be extended with further clustering algorithms. In the second part, I extended the clustering module with three density-based clustering algorithms: DBSCAN, OPTICS and DENCLUE. The algorithms were implemented, and appropriate sample data were chosen to verify their functionality.
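Of the three density-based algorithms named in this abstract, DBSCAN is the simplest to sketch. The version below is a plain quadratic-time illustration in Python, unrelated to the DataMiner module's actual code. The parameter names `eps` and `min_pts` follow common usage, and the sample points are invented.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: expand through it
                queue.extend(nbrs)
    return labels

# Two dense groups and one isolated point (invented data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The isolated point ends up labelled as noise, which is the behaviour that distinguishes density-based methods from k-means style partitioning.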
11

Geschwinder, Lukáš. "Možnosti využití metod vícerozměrné statistické analýzy dat při hodnocení spolehlivosti distribučních sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-217824.

Abstract:
The aim of this study is to evaluate multidimensional statistical analysis methods as a tool for simulating the reliability of a distribution network. The preferred methods are cluster analysis (CLU) and principal component analysis (PCA). CLU divides objects into groups with similar characteristics on the basis of their features and the calculated distances between objects. The result can reveal hidden structure in the data. PCA is used to find structure in the features of a multidimensional data matrix. The features are the separate quantities that describe a given object. PCA decomposes the original data matrix into a structural matrix and a noise matrix, transforming the original data into a new coordinate system of principal components. The transformed data are called scores, and the principal components form an orthogonal system of new axes. From the reliability point of view, a distribution network can be characterized by a number of statistical quantities. Reliability indicators include the number of interruptions and the interruption time. Integral reliability indicators include the system average interruption frequency index (SAIFI) and the system average interruption duration index (SAIDI). In conclusion, a SAIFI simulation based on the negative binomial distribution is compared with values provided by a distribution company. A test describing the dependences between features and the output distributions is also performed.
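The two integral indicators named in this abstract have standard definitions: SAIFI is the total number of customer interruptions divided by the number of customers served, and SAIDI is the total customer interruption duration divided by the number of customers served. A small sketch with invented outage records (the event list and the customer count are assumptions for illustration):

```python
def saifi(interruptions, customers_served):
    """Average number of interruptions per customer served."""
    return sum(n for n, _ in interruptions) / customers_served

def saidi(interruptions, customers_served):
    """Average interruption duration per customer served,
    in the same time unit as the input durations."""
    return sum(n * minutes for n, minutes in interruptions) / customers_served

# Each tuple: (customers affected, outage duration in minutes); invented values.
events = [(200, 30), (50, 120), (1000, 10)]
customers = 5000

saifi_value = saifi(events, customers)  # interruptions per customer
saidi_value = saidi(events, customers)  # minutes per customer
```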
12

Hezoučký, Ladislav. "Nástroj pro shlukovou analýzu." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237169.

Abstract:
The master's thesis deals with cluster analysis of data. It explains the basic concepts and methods of this domain. The result of the thesis is a cluster analysis tool that implements the K-Medoids and DBSCAN methods. The results obtained on real data are compared with those of the programs Rapid Miner and SAS Enterprise Miner.
13

Kejkula, Martin. "Zpracování asociačních pravidel metodou vícekriteriálního shlukování." Doctoral thesis, Vysoká škola ekonomická v Praze, 2002. http://www.nusl.cz/ntk/nusl-77103.

Abstract:
Association rule mining is one of several approaches to knowledge discovery in databases. Paradoxically, data mining itself can produce so many association rules that a new knowledge management problem arises: thousands or even more association rules can easily hold in a data set. The goal of this work is to design a new method for post-processing association rules. The method should be independent of software and domain. Its output should be a structured description of the whole set of discovered association rules, which should help the user work with the discovered rules. The path I took to reach this goal is to split the association rules into clusters. Each cluster should contain rules that are more similar to each other than to rules from another cluster. The output of the method is the definition and description of these clusters. The main contribution of this Ph.D. thesis is the new multicriterial clustering method for association rules that it describes. A secondary contribution is the discussion of previously published methods for post-processing association rules. The output of the new method is clusters of rules that cannot be obtained by any of the earlier post-processing methods. According to user expectations, the clusters are more relevant and more effective than any earlier results of association rule clustering. The method is based on two orthogonal clusterings of the same set of association rules. One clustering is based on interestingness measures (confidence, support, interest, etc.). The second clustering is inspired by document clustering in information retrieval. Representing rules as vectors, as is done for documents, is fundamental to this thesis. The thesis is organized as follows. Chapter 2 identifies the role of association rules in the KDD (knowledge discovery in databases) process, using KDD methodologies (CRISP-DM, SEMMA, GUHA, RAMSYS). Chapter 3 defines an association rule and introduces the characteristics of association rules (including interestingness measures). Chapter 4 introduces current methods for post-processing association rules. Chapter 5 is an introduction to cluster analysis. Chapter 6 describes the new multicriterial clustering method for association rules. Chapter 7 consists of several experiments. Chapter 8 discusses possible uses and further development of the new method.
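Two of the interestingness measures named in this abstract, support and confidence, have simple standard definitions that can be sketched as follows. The transactions are invented, and the code only shows how a rule could be turned into numbers suitable for clustering; it is not the thesis's method.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Invented market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# The rule bread -> milk described by a (support, confidence) vector.
rule_vector = (
    support(transactions, {"bread", "milk"}),
    confidence(transactions, {"bread"}, {"milk"}),
)
```

A set of rules represented by such vectors can be clustered with any distance-based method, which corresponds to the first of the two clusterings the abstract describes.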
14

Slezák, Milan. "Dolování dat z databází." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-218944.

Abstract:
The thesis provides an introduction to data mining, which focuses on finding hidden correlations in data. Interest in this area dates back to the 1960s. Data analysis was first used in marketing; it later expanded to other areas, and some of its possibilities are still unused. The data mining process is built with the help of a methodology, which offers a concise guide to creating a data mining procedure. Data mining analysis encompasses a wide range of algorithms for data modification. Growing interest in data mining means that the amount of data mining software is increasing. The thesis gives an overview of some of these programs, along with examples and an assessment.
15

Zahradníčková, Jana. "Faktory ovlivňující finanční situaci studentů doktorských studijních programů v České republice." Master's thesis, Vysoká škola ekonomická v Praze, 2015. http://www.nusl.cz/ntk/nusl-193075.

Abstract:
Ph.D. students are an integral part of the tertiary education system. Encouragement for doctoral programs and their students is very important because they are the ones who will participate in research projects in the future and they will contribute to society as a whole. The majority of scholarships for Ph.D. students comes from public sources. An important question to be asked is whether the scholarships are sufficient to finance Ph.D. studies and whether there are differences in the amount depending on gender, field of study or region. This thesis aims to answer these questions by applying statistical methods to the results of the survey DOKTORANDI 2014.
16

Rychnovský, Martin. "Získávání znalostí na webu - shlukování." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2008. http://www.nusl.cz/ntk/nusl-235960.

Abstract:
This work presents the topic of data mining on the web, with a focus on clustering. The aim of the project was to study the field of clustering and to implement clustering with the k-means algorithm. The algorithm was then tested on a dataset of text documents and on data extracted from the web. The clustering method was implemented using Java technologies.
17

David, Lukáš. "Dolování v prostřední MS SQL pomocí inkrementálních algoritmů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236484.

Abstract:
This work deals with data stream mining, which is currently a very dynamic area of information technology. The thesis describes the general principles of data mining as well as the principles of mining in data streams. Special attention is given to the implemented algorithm, CluStream. In the practical part, a data stream processing solution using this algorithm was designed and implemented with MSSQL technology. The functionality of the algorithm was verified using a custom data stream generator.
18

Brychta, Jan. "Aplikace pro zpracování dat z oblasti genového inženýrství." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2008. http://www.nusl.cz/ntk/nusl-235899.

Abstract:
This master's thesis has several objectives. The first is to become acquainted with the problems of genetic engineering, especially DNA fragmentation, the DNA macromolecule, methods for purifying and separating nucleic acids, the enzymes used to modify these acids, and amplification, and also with cluster and gradient analysis. The second is to examine the existing application and compare it with the design of the proposed application, which is the third objective. The last objective is the implementation and a report on how the application was tested on real data. The results are discussed, as are the possibilities for further extension.
19

Kantor, Jan. "Učení bez učitele." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217651.

Abstract:
The purpose of this work is to describe some techniques commonly used for cluster analysis in unsupervised learning. The thesis consists of two parts. The first part covers the theory of selected algorithms, describing the advantages and disadvantages of each method discussed, and the validation of cluster quality. There are many ways to estimate and compute clustering quality based on internal and external knowledge, and they are covered in this part. A good technique for validating clustering quality is one of the most important parts of cluster analysis. The second part deals with the application of different clustering techniques and programs to real datasets and compares the results with the true dataset partitioning and with published related work.
20

Zapletal, Petr. "Dolovací modul systému pro získávání znalostí z dat FIT-Miner." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2011. http://www.nusl.cz/ntk/nusl-236921.

Abstract:
This master's thesis deals with FIT-Miner, a system for knowledge discovery in databases. The first part describes the data mining process, the issues of mixture models and the FIT-Miner system. The second part deals with the design, implementation and testing of the created module, which performs cluster analysis with the Expectation-Maximization algorithm. The final part focuses on designing modules using Java Stored Procedures technology.
21

Šulc, Zdeněk. "Similarity Measures for Nominal Data in Hierarchical Clustering." Doctoral thesis, Vysoká škola ekonomická v Praze, 2013. http://www.nusl.cz/ntk/nusl-261939.

Abstract:
This dissertation thesis deals with similarity measures for nominal data in hierarchical clustering, which can cope with variables with more than two categories, and which aspire to replace the simple matching approach standardly used in this area. These similarity measures take into account additional characteristics of a dataset, such as frequency distribution of categories or number of categories of a given variable. The thesis recognizes three main aims. The first one is an examination and clustering performance evaluation of selected similarity measures for nominal data in hierarchical clustering of objects and variables. To achieve this goal, four experiments dealing both with the object and variable clustering were performed. They examine the clustering quality of the examined similarity measures for nominal data in comparison with the commonly used similarity measures using a binary transformation, and moreover, with several alternative methods for nominal data clustering. The comparison and evaluation are performed on real and generated datasets. Outputs of these experiments lead to knowledge, which similarity measures can generally be used, which ones perform well in a particular situation, and which ones are not recommended to use for an object or variable clustering. The second aim is to propose a theory-based similarity measure, evaluate its properties, and compare it with the other examined similarity measures. Based on this aim, two novel similarity measures, Variable Entropy and Variable Mutability are proposed; especially, the former one performs very well in datasets with a lower number of variables. The third aim of this thesis is to provide a convenient software implementation based on the examined similarity measures for nominal data, which covers the whole clustering process from a computation of a proximity matrix to evaluation of resulting clusters. 
This goal was also achieved by creating the nomclust package for the software R, which covers this issue, and which is freely available.
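The simple matching approach that this thesis takes as its baseline can be sketched in a few lines. This is a generic Python illustration and does not reproduce the nomclust package or its interface. The objects and categories are invented.

```python
def simple_matching(x, y):
    """Share of variables on which two objects have the same category."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def proximity_matrix(objects):
    """Dissimilarity (1 - similarity) between every pair of objects."""
    return [[1 - simple_matching(x, y) for y in objects] for x in objects]

# Three objects described by three nominal variables (invented data).
objects = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
m = proximity_matrix(objects)
```

Every match counts equally here regardless of how common the category is. The measures examined in the thesis differ precisely in that they also use information such as category frequencies or the number of categories of a variable.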
22

Peroutka, Lukáš. "Návrh a implementace Data Mining modelu v technologii MS SQL Server." Master's thesis, Vysoká škola ekonomická v Praze, 2012. http://www.nusl.cz/ntk/nusl-199081.

Abstract:
This thesis focuses on the design and implementation of a data mining solution with real-world data. The task is analysed and processed, and its results are evaluated. The mined data set contains study records of students from the University of Economics, Prague (VŠE) over the past three years. The first part of the thesis focuses on the theory of data mining, the definition of the term, and the history and development of the field. Current best practices and methodology are described, as well as methods for determining the quality of data and for pre-processing data ahead of the actual data mining task. The most common data mining techniques are introduced, including their basic concepts, advantages and disadvantages. The theoretical basis is then used to implement a concrete data mining solution with educational data. The source data set is described and analysed, and some of the data are chosen as input for the created models. The solution is based on the MS SQL Server data mining platform, and its goal is to find, describe and analyse potential associations and dependencies in the data. The results of the respective models are evaluated, including their potential added value. Possible extensions and suggestions for further development of the solution are also mentioned.
23

Sirota, Sergej. "Hodnocení úspěšnosti metod využívaných ve shlukové analýze." Master's thesis, Vysoká škola ekonomická v Praze, 2014. http://www.nusl.cz/ntk/nusl-192829.

Abstract:
The aim of the thesis is to compare how well methods of cluster analysis classify the objects of a dataset into known groups. The theoretical section first describes the steps needed to prepare a data file for cluster analysis. It then turns to cluster analysis itself, describing ways of measuring the similarity of objects and clusters and the clustering methods used in the practical part of the thesis. The practical part describes and analyzes 20 files. Each file contains only quantitative variables plus a classification variable by which the objects are grouped. For each file, the success rate of assigning objects to groups is calculated for each clustering method. The practical part ends with a summary of the results of the clustering methods. The main contribution of the thesis is an evaluation of how successfully clustering methods classify objects into known groups.
24

Rozinek, Michal. "Aplikace shlukové analýzy při zpracování biomedicínských dat." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-218223.

Abstract:
The goal of this study is to survey object classification methods used in medicine and to explain how they work. It focuses on the functionality and reliability of these methods on a data file from the medical field, after the algorithms have been implemented in MATLAB. Simple tests are used to examine each classification procedure and to find out where it excels and where it lags. The choice of input data parameters is very important; it is tested and discussed in the conclusions.
25

Málik, Peter. "Získávání znalostí z multimediálních databází." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2011. http://www.nusl.cz/ntk/nusl-235525.

Abstract:
This master's thesis deals with knowledge discovery in multimedia databases. It presents the general principles of knowledge discovery in databases and, in particular, describes methods of cluster analysis used for data mining in large and multidimensional databases. The next chapter introduces multimedia databases, focusing on the extraction of low-level features from image and video data. The practical part is an implementation of the BIRCH, DBSCAN and k-means methods for cluster analysis. The final part is dedicated to experiments on the TRECVid 2008 data set and a description of the results achieved.
26

Ševčík, Radim. "Klasifikace elektronických dokumentů s využitím shlukové analýzy." Master's thesis, Vysoká škola ekonomická v Praze, 2009. http://www.nusl.cz/ntk/nusl-17157.

Abstract:
The current age is characterised by unprecedented growth of information, in both amount and complexity. Most of it is available in digital form, so it can be analyzed using cluster analysis. We tried to classify the documents of the 20 Newsgroups collection in terms of their content only. The aim was to assess the available clustering methods in a variety of applications. After transforming the documents into a binary vector representation, we performed several experiments and measured entropy, purity and execution time in the CLUTO application. For a small number of clusters the direct method (generally a hierarchical method) gave the best results, but for larger numbers it was repeated bisection (a divisive method). The agglomerative method proved unsuitable. Using simulation, we estimated the optimal number of clusters to be 10. For this solution we described the features of each cluster in detail, using the repeated bisection method and the i2 criterion function. Future work should focus on implementing binary clustering in programming languages such as Perl or C++. The results of this work might be of interest to web search engine developers and electronic catalogue administrators.
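The two quality measures named in this abstract, purity and entropy, can be illustrated with a minimal sketch. It uses the common textbook definitions (majority-label share for purity; size-weighted, normalised label entropy per cluster for entropy). That these match the exact values CLUTO reports for the thesis is an assumption, and the function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log


def _label_counts(clusters, labels):
    """Group true labels by cluster id: {cluster: Counter(label -> count)}."""
    by_cluster = {}
    for cluster, label in zip(clusters, labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    return by_cluster


def purity(clusters, labels):
    """Share of objects that carry the majority label of their own cluster."""
    by_cluster = _label_counts(clusters, labels)
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(labels)


def entropy(clusters, labels):
    """Size-weighted mean of per-cluster label entropy, scaled to [0, 1]."""
    n_classes = len(set(labels))
    if n_classes < 2:
        return 0.0  # a single class cannot be mixed
    total = 0.0
    for counts in _label_counts(clusters, labels).values():
        size = sum(counts.values())
        h = -sum((k / size) * log(k / size) for k in counts.values())
        total += (size / len(labels)) * h / log(n_classes)
    return total


# Hypothetical toy data: six documents, two clusters, two newsgroups.
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_labels = ["a", "a", "b", "b", "b", "b"]
print(purity(toy_clusters, toy_labels), entropy(toy_clusters, toy_labels))
```

Higher purity and lower entropy both indicate clusters that agree better with the known newsgroup labels.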
27

Gal, Pavel. "Identifikace podobných řešení při stochastické simulaci v oblasti odpadového hospodářství." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2015. http://www.nusl.cz/ntk/nusl-231793.

Abstract:
The Master’s thesis deals with the collection of mixed municipal waste from producers and its transport to waste-to-energy plants or landfills. The initial chapters are devoted to waste legislation and to the transport of waste by road freight across Europe. The objective is to collect the data required for calculations in the NERUDA tool. The next part describes cluster analysis and the different approaches to it. In the final chapters, selected methods of cluster analysis are applied to the logistic task. The cluster analysis is considered from different aspects. The results are visualized using the ArcGIS software.
28

Hebelka, Tomáš. "Analýza dat z mikročipů pro zjišťování genové exprese." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-235549.

Abstract:
This work is concerned with the analysis of DNA microarray data using cluster analysis. It explains the biological terms gene expression and DNA microarray. It then gives a mathematical and computational description of clustering methods and describes a way to apply these methods to microarray data. Next, the work presents implementation details of the clustering methods k-means and DBSCAN and introduces an original clustering algorithm, Strom++. A description of the implementation and an application manual follow. Finally, the results achieved are evaluated.
29

Prokop, David. "Zpracování tomografických dat metodou analýzy hlavních komponent pro archeologické aplikace." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2019. http://www.nusl.cz/ntk/nusl-402523.

Abstract:
X-ray computed tomography is a method for 3D imaging of the internal structure of objects. The microstructure of objects hides important information that can be used to characterize them. This thesis links data sets acquired by X-ray computed microtomography with the field of statistical data processing. The output of the method is a classification of samples based on information about their microstructure. From the classification results, various hypotheses about the origin of the samples can be derived. Among other things, this thesis could serve as a new insight into the problem of combining data of different origin by means of statistical analysis methods.
30

Podzimková, Michaela. "Využití statistických metod v data miningu při predikci chování zákazníků internetového obchodu." Master's thesis, Vysoká škola ekonomická v Praze, 2015. http://www.nusl.cz/ntk/nusl-193125.

Abstract:
Data mining is a new discipline that has emerged with the growing amount of stored data and the growing need to obtain the information hidden in it. It focuses on mining potentially useful information from large data sets and lies at the intersection of statistics, machine learning, artificial intelligence, databases and other areas. The aim of this thesis is to present the data mining process with an emphasis on its connection with statistics, and to describe a selection of statistical methods that are widely used in this field and were also applied to the data mining problem solved in the thesis. Real data on purchases in an online store show that different methods give different results and interesting information about purchasing behavior, and also demonstrate that not all methods are applicable to all types of tasks.
31

Šalplachta, Jakub. "Analýza 3D CT obrazových dat se zaměřením na detekci a klasifikaci specifických struktur tkání." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2017. http://www.nusl.cz/ntk/nusl-316836.

Abstract:
This thesis deals with the segmentation and classification of paraspinal muscle and subcutaneous adipose tissue in 3D CT image data, so that these tissues can subsequently be used as internal calibration phantoms for measuring the bone mineral density (BMD) of vertebrae. The chosen methods were tested and then evaluated in terms of classification correctness and overall suitability for the subsequent calculation of BMD values. The algorithms were tested in the Matlab® programming environment on a purpose-built patient database containing the lumbar spines of twelve patients. The following sections of the thesis contain a theoretical review of bone mineral density measurement and of segmentation and classification methods, as well as a description of the practical part of the work.
32

Jarolím, Jordán. "Analýza a získávání informací ze souboru dokumentů spojených do jednoho celku." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2018. http://www.nusl.cz/ntk/nusl-385929.

Abstract:
This thesis deals with mining relevant information from documents and with the automatic splitting of multiple documents that have been merged together. It also describes the design and implementation of software for both tasks. The thesis describes methods for acquiring textual data from scanned documents, named entity recognition, document clustering, their supporting algorithms, and metrics for the automatic splitting of documents. Furthermore, the algorithm of the implemented software is explained, and the tools and techniques it uses are described. The success rate of the implemented software is then evaluated. Finally, possible extensions and further development of the work are discussed.
33

Labounek, René. "Fúze simultánních EEG-FMRI dat za pomoci zobecněných spektrálních vzorců." Doctoral thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2018. http://www.nusl.cz/ntk/nusl-371799.

Abstract:
Many different fusion strategies have been developed during the last 15 years of simultaneous EEG-fMRI research. This dissertation summarizes the current state of the art in the fusion of simultaneous EEG-fMRI data and aims to improve the visualization of task-evoked brain networks by blind analysis directly from the acquired data. Two different models intended to achieve this are proposed in the thesis (i.e. a generalized spectral heuristic model and a generalized spatial-frequency heuristic model). The generalized frequency heuristic model uses fluctuations of relative EEG power in certain frequency bands, averaged over electrodes of interest, and compares them with delayed fluctuations of BOLD signals by means of a general linear model. The results show that the model visualizes several distinct, frequency-dependent task-evoked EEG-fMRI networks. The model outperforms both the approach based on fluctuations of absolute EEG power and the classic (original) heuristic approach. Absolute power visualized a task-unrelated broadband EEG-fMRI component, and the classic heuristic approach was not sensitive enough to visualize the task-related visual network that was observed for the relative band on the data of the visual oddball experiment. For the EEG-fMRI data with a semantic decision task, the frequency dependence was not as evident in the final results, since all bands visualized the visual network and none showed activations in the speech centres. These results were probably corrupted by an eye-blink artifact in the EEG data. Mutual information coefficients between different EEG-fMRI statistical parametric maps showed that the similarities across frequency bands are alike across tasks (i.e. visual oddball and semantic decision). Moreover, the coefficients showed that averaging across different electrodes of interest brings no new information into the joint analysis, i.e. the signal on a single lead is a heavily blurred signal from the whole scalp.
For these reasons it became necessary to incorporate the information from the individual leads into the EEG-fMRI analysis more effectively. We therefore proposed a more general spatial-frequency heuristic model, together with a way to estimate it using spatial-frequency group independent component analysis of the relative power of the EEG spectrum. The results show that the spatial-frequency heuristic model visualizes the statistically most significant task-related brain networks (compared with the results of spatial-frequency patterns of absolute power and with the results of the generalized frequency heuristic model). The spatial-frequency heuristic model was the only one that detected task-related activations in the speech centres on the semantic decision data. Besides fusing spatial-frequency patterns with fMRI data, we tested the stability of the spatial-frequency pattern estimates across different paradigms (i.e. visual oddball, semantic decision and resting state) using the k-means clustering algorithm. We obtained 14 stable patterns for absolute EEG power and 12 stable patterns for relative EEG power. Although 10 of these patterns look similar across the two power types, the spatial-frequency patterns of relative power (i.e. the patterns of the spatial-frequency heuristic model) show stronger evidence of a relation to the tasks.
34

Zemanová, Barbora. "Shluková analýza pro funkcionální data." Master's thesis, 2012. http://www.nusl.cz/ntk/nusl-304110.

Abstract:
In this work we deal with cluster analysis for functional data. Functional data contain a set of subjects that are characterized by repeated measurements of a variable. Based on these measurements we want to split the subjects into groups (clusters). The subjects in a single cluster should be similar and differ from subjects in the other clusters. The first approach we use is the reduction of data dimension followed by the clustering method K-means. The second approach is to use a finite mixture of normal linear mixed models. We estimate parameters of the model by maximum likelihood using the EM algorithm. Throughout the work we apply all described procedures to real meteorological data.
35

Kocanda, Stanislav. "Klastrovací analýza elektrofyziologických dat." Master's thesis, 2010. http://www.nusl.cz/ntk/nusl-295830.

36

Muchová, Natália. "Príjmová nerovnosť a jej vplyv na ekonomický rast krajín EU." Master's thesis, 2018. http://www.nusl.cz/ntk/nusl-429578.

Abstract:
This diploma thesis focuses on income inequality and its impact on economic development in the EU member states. The literature overview deals with description of income inequality and identifies its causes, consequences, methods of measurement and possible solutions. The relationship between income inequality and economic growth is examined through panel data analysis and cluster analysis.
37

Pavelková, Adéla. "Účinky vybraných opatření k prevenci malárie: analýza panelových dat." Master's thesis, 2020. http://www.nusl.cz/ntk/nusl-412149.

Abstract:
The main aim of this diploma thesis was to explore malaria preventive measures: specifically, to study which preventive measures are useful and how they are distributed around the world. This is very important for international organizations, which need to know whether funds allocated for malaria aid are distributed effectively. The study uses manually compiled data from the World Health Organization for all countries threatened by malaria, mostly from 2001 to 2018. Panel data regression methods with robust standard errors, bootstrapping and cluster analysis were used. The results showed that, in general, the most useful preventive measures are indoor residual spraying, a combination of spraying and insecticide-treated nets, and rapid diagnostic tests. Furthermore, the effect of the population living in rural areas is significant, and gross domestic product is a very important factor for African countries. The stability analysis (bootstrapping) confirmed our results. However, we found that insecticide-treated nets are still the most widely distributed measure. From the cluster analysis, we observed that countries on the same continent should not be treated alike, and we highlighted countries that should receive more attention. Overall, the...
38

Bobošová, Terézia. "Faktory ovplyvňujúce ekonomickú úroveň krajiny." Master's thesis, 2017. http://www.nusl.cz/ntk/nusl-430062.

Abstract:
This thesis deals with identifying the factors that influence the level of countries' economic performance. The theoretical framework presents total population, technological level and economic freedom as possible factors of influence. Groups of countries with a similar level of economic performance are identified by cluster analysis. Panel data models are then used to describe the common factors influencing the level of economic performance within each group of countries. The results confirm the influence of total population, technological level and economic freedom in the most developed, moderately developed and developing countries around the world. Each of these indicators has a positive effect on the level of economic performance, thereby creating an opportunity for improvement.
39

Koukalová, Pavla. "Faktory ovlivňující příjmovou nerovnost." Master's thesis, 2019. http://www.nusl.cz/ntk/nusl-428586.

Abstract:
This diploma thesis deals with identifying the factors that affect income inequality in the world. The literature review describes the issue of income inequality, possible methods of measuring it, the expected factors of influence and the options for addressing it. Statistical methods, in particular cluster, regression and panel analysis, are used to quantify and define the significance of the selected factors. The economic and technological level of the country and the education and state of health of the population are the key economic and demographic indicators. According to the results of the regression analysis, each of these indicators has a weak negative effect on income inequality, that is, it reduces it. The results of the panel analysis demonstrate that the indicators of technological level, education and the state of health of the population can be counted among the major factors affecting income inequality in the world. The indicator of economic level, measured by per capita gross domestic product, proved insignificant, which means that while GDP has an impact on the Gini coefficient, the other variables describe the relationship better.
40

Hanzlová, Radka. "Self-tracking a běhání: sociologická analýza." Master's thesis, 2018. http://www.nusl.cz/ntk/nusl-383901.

Abstract:
This thesis focuses on self-tracking, which means monitoring and recording information about oneself using digital technologies, and on its use by runners in the Czech Republic. The main aim of the thesis is to describe the Czech running community through a detailed sociological analysis and to answer the question: why do runners use self-tracking, and how do they benefit from it? The theoretical part first deals with self-tracking itself, then examines the uses and gratifications theory and the theory of online communities. The analytical part is devoted to the description, analysis and interpretation of the results of the author's own survey, in which 844 runners participated, 754 of whom practise self-tracking. Several hypotheses concerning sociodemographic structure, running characteristics, motivation, gratifications and safety were formulated. Five key motives that lead runners to use self-tracking devices (self-control, orientation to result, self-improvement, habit and social interaction) were identified through exploratory factor analysis. The motives vary with gender and running characteristics (experience with running, runner's level, frequency of running, trainer), which are also the main influencing factor for self-tracking in general. Self-tracking is closely related to sharing...
41

Slavíček, Michal. "Konvergence sektoru finančních institucí v zemích Evropské unie." Master's thesis, 2020. http://www.nusl.cz/ntk/nusl-429883.

Abstract:
Slavíček, M. Convergence of the sector financial corporations in the European Union countries. Diploma thesis. Brno: Mendel University, 2020. The diploma thesis deals with possible convergence between 2004 and 2018 within the sector account of financial corporations (as defined by the ESA 2010 methodology) across the countries of the European Union, with an emphasis on Central European countries. The goal of the work was met by selecting suitable indicators, which were used for the cluster analysis and for calculating standard deviations of the indicators. More specifically, these are indicators of financial transactions (F.1–F.8), credit market concentration, the number of credit institutions, the number of employees in credit institutions and premiums written. However, the empirical results neither proved nor refuted financial convergence across EU countries, as the clusters changed over the years and the development of the standard deviations did not indicate demonstrable convergence either.
42

Koutný, Jan. "Konvergence krajů České republiky pod vlivem ekonomické krize." Master's thesis, 2019. http://www.nusl.cz/ntk/nusl-428760.

Abstract:
This diploma thesis examines the influence of the economic crisis of 2008-2009 on the convergence of the regions of the Czech Republic, defined at the NUTS 3 level. Trends in regional convergence are identified by beta convergence and sigma convergence analysis using panel data from 2002-2015. The thesis then discusses the reasons for the observed differences and the consequences of unequal economic development in the regions for private companies that are located or operate there.
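As a rough illustration of the two concepts named in this abstract: sigma convergence is usually read from a falling cross-regional dispersion of log income over time, and (absolute) beta convergence from a negative slope when average growth is regressed on the initial log level. The sketch below follows these standard definitions; the thesis's actual model specification is not given in the abstract, and the region names and figures are made up.

```python
from math import log
from statistics import mean, pstdev


def sigma_series(panel):
    """panel: {year: {region: income}} -> {year: std dev of log income}."""
    return {
        year: pstdev([log(v) for v in values.values()])
        for year, values in sorted(panel.items())
    }


def beta_slope(panel, start, end):
    """OLS slope of average annual log growth on the initial log level."""
    regions = sorted(panel[start])
    x = [log(panel[start][r]) for r in regions]
    y = [(log(panel[end][r]) - log(panel[start][r])) / (end - start)
         for r in regions]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical data: three regions, income per capita in two years.
toy_panel = {
    2002: {"A": 100.0, "B": 200.0, "C": 400.0},
    2015: {"A": 180.0, "B": 300.0, "C": 500.0},
}
print(sigma_series(toy_panel))            # dispersion per year
print(beta_slope(toy_panel, 2002, 2015))  # negative slope suggests catching up
```

In this toy example the poorer regions grow faster, so the dispersion falls between the two years and the slope is negative.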
43

Melichar, Miloš. "Aplikace dataminingových metod na bankovní data." Master's thesis, 2016. http://www.nusl.cz/ntk/nusl-251149.

Abstract:
The thesis deals with the pre-processing of two data sets containing information on clients, loans and debit cards. The data sets were pre-processed separately and modeled in SPSS Modeler using a number of methods and algorithms. For modeling purposes, three classification data mining tasks were defined: loan approval or rejection, loan rating, and debit card type assignment. Classification models were built for each task using selected machine learning methods. Model accuracy was tested by a script written in the SPSS language for automation. All tasks were supplemented by clustering based on latent factors obtained by factor analysis. Factor analysis combined with clustering offers another approach to pattern discovery.
44

Krsová, Lenka. "Čeští novináři na Twitteru: Analýza sociálních interakcí českého mediálního prostoru." Master's thesis, 2018. http://www.nusl.cz/ntk/nusl-373701.

Abstract:
The thesis focuses on the communication inclusion and exclusivity of Czech journalists on Twitter and on how they use the conventions of this platform to connect with other users. By describing the current communication layers and functions of Twitter, the thesis shows how it became one of the main sources of news and how it pushed journalists to reinterpret their traditional roles in society. It also describes how digital humanities and digital trace data gathered from social media can be used to analyze the social interactions of their users. The practical part presents a cluster analysis based on Twitter data of 457 Czech journalists, which shows how Twitter is used to communicate within and outside the Czech media system.