Dissertations / Theses on the topic 'Fuzzy c-means clustering analysis'

Consult the top 50 dissertations / theses for your research on the topic 'Fuzzy c-means clustering analysis.'

1

Kanade, Parag M. "Fuzzy ants as a clustering concept." [Tampa, Fla.] : University of South Florida, 2004. http://purl.fcla.edu/fcla/etd/SFE0000397.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Camara, Assa. "Využití fuzzy množin ve shlukové analýze se zaměřením na metodu Fuzzy C-means Clustering." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2020. http://www.nusl.cz/ntk/nusl-417051.

Full text
Abstract:
This master's thesis deals with cluster analysis, more specifically with clustering methods that use fuzzy sets. Basic clustering algorithms and the necessary multivariate transformations are described in the first chapter. In the practical part, which is in the third chapter, we apply fuzzy c-means clustering and k-means clustering to real data: the inputs of the chemical transport model CMAQ, which is used to approximate the concentration of air pollutants in the atmosphere. We applied the two clustering methods to the data and used two different methods to select the optimal weighting exponent when looking for structure in the data. We then compared all three resulting data structures. The structures resembled each other, but with fuzzy c-means clustering one of the clusters did not resemble any of the clustering inputs. The end of the third chapter is dedicated to an attempt to find a regression model that captures the relationship between the inputs and outputs of the CMAQ model.
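The fuzzy c-means iteration that theses like this one apply, with the weighting (fuzzifier) exponent m appearing in both update steps, can be sketched in a few lines of pure Python. This is an illustrative sketch of the standard algorithm, not the thesis's implementation:

```python
import math
import random

def fcm(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Minimal fuzzy c-means: returns (centers, memberships).

    X is a list of points (tuples), c the number of clusters, m > 1 the
    weighting exponent controlling how fuzzy the partition is."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # random initial memberships, each row normalised to sum to 1
    U = []
    for _ in range(n):
        row = [rng.random() for _ in range(c)]
        s = sum(row)
        U.append([u / s for u in row])
    for _ in range(iters):
        # center update: weighted means with weights u_ij^m
        V = []
        for j in range(c):
            w = [U[i][j] ** m for i in range(n)]
            tot = sum(w)
            V.append(tuple(sum(w[i] * X[i][k] for i in range(n)) / tot
                           for k in range(d)))
        # membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        new_U = []
        for i in range(n):
            dist = [math.dist(X[i], V[j]) or 1e-12 for j in range(c)]
            new_U.append([1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c)) for j in range(c)])
        shift = max(abs(new_U[i][j] - U[i][j])
                    for i in range(n) for j in range(c))
        U = new_U
        if shift < tol:
            break
    return V, U
```

Larger values of m give softer memberships; as m approaches 1 the partition approaches the hard k-means assignment, which is why exponent selection (as studied in the thesis) matters.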
APA, Harvard, Vancouver, ISO, and other styles
3

Stetco, Adrian. "An investigation into fuzzy clustering quality and speed : fuzzy C-means with effective seeding." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/an-investigation-into-fuzzy-clustering-quality-and-speed-fuzzy-cmeans-with-effective-seeding(fac3eab2-919a-436c-ae9b-1109b11c1cc2).html.

Full text
Abstract:
Cluster analysis, the automatic procedure by which large data sets can be split into similar groups of objects (clusters), has innumerable applications in a wide range of problem domains. Improvements in clustering quality (as captured by internal validation indexes) and speed (number of iterations until cost function convergence), the main focus of this work, have many desirable consequences. They can result, for example, in faster and more precise detection of illness onset based on symptoms, or provide investors with rapid detection and visualization of patterns in financial time series. Partitional clustering, one of the most popular ways of doing cluster analysis, can be classified into two main categories: hard (where the clusters discovered are disjoint) and soft (also known as fuzzy; clusters are non-disjoint, or overlapping). In this work we consider how improvements in the speed and solution quality of the soft partitional clustering algorithm Fuzzy C-means (FCM) can be achieved through more careful and informed initialization based on data content. The resulting FCM++ approach samples starting cluster centers during the initialization phase in a way that disperses them through the data space. Because the cluster centers are well spread in the input space, both faster convergence times and higher quality solutions result. Moreover, we allow the user to specify a parameter indicating how far apart the cluster centers should be picked in the data space right at the beginning of the clustering procedure. We show FCM++'s superior behaviour in both convergence times and quality compared with existing methods, on a wide range of artificially generated and real data sets. We consider a case study where we propose a methodology based on FCM++ for pattern discovery on synthetic and real world time series data.
We discuss a method that uses both Pearson correlation and Multi-Dimensional Scaling to reduce data dimensionality, remove noise and make the dataset easier to interpret and analyse. We show that by using FCM++ we can make a positive impact on the quality (with the Xie-Beni index being lower in nine out of ten cases for FCM++) and speed (with on average 6.3 iterations compared with 22.6 iterations) when clustering these lower dimensional, noise-reduced representations of the time series. This methodology provides a clearer picture of the cluster analysis results and helps in detecting similarly behaving time series, which could come from any domain. Further, we investigate the use of Spherical Fuzzy C-Means (SFCM) with the seeding mechanism used for FCM++ on news text data retrieved from a popular British newspaper. The methodology allows us to visualize and group hundreds of news articles based on the topics discussed within. The positive impact made by SFCM++ translates into a faster process (with on average 12.2 iterations compared with the 16.8 needed by the standard SFCM) and a higher quality solution (with the Xie-Beni index being lower for SFCM++ in seven out of every ten runs).
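The key idea behind FCM++, spreading the initial centers through the data space, can be illustrated with a simple farthest-first seeding routine. This is a deterministic relative of k-means++ seeding, not the actual FCM++ sampling scheme, which is probabilistic and exposes a spread parameter:

```python
import math
import random

def spread_seeds(X, c, seed=0):
    """Choose c well-dispersed initial cluster centers: the first at random,
    each subsequent one the point farthest from its nearest chosen center."""
    rng = random.Random(seed)
    centers = [X[rng.randrange(len(X))]]
    while len(centers) < c:
        # farthest-first: maximise the distance to the nearest chosen center
        centers.append(max(X, key=lambda p: min(math.dist(p, q)
                                                for q in centers)))
    return centers
```

Seeding FCM from such dispersed centers, rather than from purely random ones, is what yields the faster convergence and better validation indexes reported in the abstract.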
APA, Harvard, Vancouver, ISO, and other styles
4

Rodgers, Sarah. "Application of the fuzzy c-means clustering algorithm to the analysis of chemical structures." Thesis, University of Sheffield, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.412772.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

FANEGAN, JULIUS BOLUDE. "A FUZZY MODEL FOR ESTIMATING REMAINING LIFETIME OF A DIESEL ENGINE." University of Cincinnati / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1188951646.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Zubková, Kateřina. "Text mining se zaměřením na shlukovací a fuzzy shlukovací metody." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2018. http://www.nusl.cz/ntk/nusl-382412.

Full text
Abstract:
This thesis is focused on cluster analysis in the field of text mining and its application to real data. The aim of the thesis is to find suitable categories (clusters) in the transcribed calls recorded in the contact center of Česká pojišťovna a.s. by transferring these textual documents into the vector space using basic text mining methods and the implemented clustering algorithms. From the formal point of view, the thesis contains a description of preprocessing and representation of textual data, a description of several common clustering methods, cluster validation, and the application itself.
APA, Harvard, Vancouver, ISO, and other styles
7

Pondini, Alessio. "Tenacizzazione di laminati compositi mediante l'utilizzo di nanofibre in PVDF." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/8463/.

Full text
Abstract:
This thesis analyses the toughening of the matrix of composite laminates. The aim is to increase the mode I fracture toughness; to this end, the interlayers of some specimens were modified by introducing layers, of different thicknesses, of polyvinylidene fluoride (PVDF) nanofibers. This reinforcement method was evaluated using data obtained from laboratory tests carried out directly by the author, who processed the data using recently developed techniques and algorithms. The primary motivation for reinforcing the matrix lies in the most serious problem affecting composite laminates in long-term service: delamination. In addition to verifying the mechanical properties of the modified specimens by subjecting them to DCB tests, an acoustic emission technique was used to gain deeper insight into the onset of delamination and the failure mechanisms occurring during the tests. The latter are characterised using a clustering algorithm, fuzzy c-means, through which each signal could be identified as belonging (or not) to a particular failure mode. The results show that PVDF, applied as described, can increase mode I fracture toughness while at the same time causing a different mode of crack propagation. Finally, the thesis presents some micrographs of the fracture surfaces, which support the results obtained in the preceding stages of analysis.
APA, Harvard, Vancouver, ISO, and other styles
8

Ataeian, Seyed Mohsen, and Mehrnaz Jaberi Darbandi. "Analysis of Quality of Experience by applying Fuzzy logic : A study on response time." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-5742.

Full text
Abstract:
To be successful in today's competitive market, service providers should regard user satisfaction as a critical key. In order to gain a better understanding of customers' expectations, a proper evaluation which considers the intrinsic characteristics of perceived quality of service is needed. Due to the subjective nature of quality, the vagueness of human judgment and the uncertainty about the degree of users' linguistic satisfaction, fuzziness is associated with quality of experience. Considering the capability of fuzzy logic in dealing with imprecision and qualitative knowledge, it is well suited as a powerful mathematical tool for analyzing the quality of experience (QoE). This thesis proposes a fuzzy procedure to evaluate the quality of experience. In our proposed methodology, we provide a fuzzy relationship between QoE and Quality of Service (QoS) parameters. To identify this fuzzy relationship, a new term called the Fuzzified Opinion Score (FOS), representing a fuzzy quality scale, is introduced. A fuzzy data mining method is applied to construct the required number of fuzzy sets. Then, the appropriate membership functions describing the fuzzy sets are modeled and compared with each other. The proposed methodology will assist service providers in better decision-making and resource management.
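The idea of a fuzzified quality scale — mapping a measured QoS value such as response time onto overlapping linguistic satisfaction terms — can be sketched with triangular and shoulder membership functions. The breakpoints below are illustrative assumptions, not values from the thesis:

```python
def tri(x, a, b, c):
    """Triangular membership: 0 at a, peak 1 at b, back to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def shoulder(x, a, b):
    """Right shoulder: 0 below a, rising linearly to 1 at b and beyond."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

def fuzzify_response_time(ms):
    """Map a response time in milliseconds to memberships in linguistic
    quality terms (hypothetical breakpoints, for illustration only)."""
    return {
        "good": 1.0 - shoulder(ms, 200, 600),
        "fair": tri(ms, 200, 600, 1000),
        "poor": shoulder(ms, 600, 1000),
    }
```

A response time of 400 ms, for instance, belongs partly to "good" and partly to "fair" — exactly the kind of graded, overlapping judgment that a crisp threshold cannot express.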
APA, Harvard, Vancouver, ISO, and other styles
9

Zettervall, Hang. "Fuzzy Set Theory Applied to Make Medical Prognoses for Cancer Patients." Doctoral thesis, Blekinge Tekniska Högskola [bth.se], Faculty of Engineering - Department of Mathematics and Natural Sciences, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-00574.

Full text
Abstract:
As we all know, classical set theory has a deep-rooted influence on traditional mathematics. According to two-valued logic, an element either belongs to a set or it does not. In the former case, the element's membership degree is assigned the value one, whereas in the latter case it takes the value zero. In other words, a notion of imprecision or fuzziness does not exist in two-valued logic. With the rapid development of science and technology, more and more scientists gradually came to realize the vital importance of multi-valued logic. Thus, in 1965, Professor Lotfi A. Zadeh of the University of California, Berkeley put forward the concept of a fuzzy set. In less than 60 years, people have become more and more familiar with fuzzy set theory, which has turned out to be applicable to many fields. The study aims to apply some classical and extended methods of fuzzy set theory to life expectancy and treatment prognoses for cancer patients. The research is based on real-life problems encountered in clinical work by physicians. From the introductory items of fuzzy set theory to the medical applications, a collection of detailed analyses of fuzzy set theory and its extensions is presented in the thesis. Concretely speaking, Mamdani fuzzy control systems and the Sugeno controller have been applied to predict the survival length of gastric cancer patients. In order to keep already-examined gastric cancer patients away from unnecessary suffering caused by surgical operation, fuzzy c-means clustering analysis has been adopted to investigate the possibilities for operation versus nonoperation. Furthermore, the approach of point set approximation has been adopted to estimate the operation possibilities against nonoperation for an arbitrary gastric cancer patient.
In addition, in the domain of multi-expert decision-making, the probabilistic model, the model of 2-tuple linguistic representations and the hesitant fuzzy linguistic term sets (HFLTS) have been utilized to select the most consensual treatment scheme(s) for two separate prostate cancer patients. The obtained results have supplied the physicians with reliable and helpful information. Therefore, the research work can be seen as the mathematical complements to the physicians’ queries.
APA, Harvard, Vancouver, ISO, and other styles
10

Moura, Ronildo Pinheiro de Ara?jo. "Algoritmos de agrupamentos fuzzy intervalares e ?ndice de valida??o para agrupamento de dados simb?licos do tipo intervalo." Universidade Federal do Rio Grande do Norte, 2014. http://repositorio.ufrn.br:8080/jspui/handle/123456789/18111.

Full text
Abstract:
Symbolic Data Analysis (SDA) aims mainly to provide tools for reducing large databases in order to extract knowledge, and techniques to describe such data as complex units, such as intervals or histograms. The objective of this work is to extend classical clustering methods to symbolic interval data using interval-based distances. The main advantage of using an interval-based distance for interval data is that it preserves the underlying imprecision of the intervals, which is usually lost when real-valued distances are applied. This work also includes an approach that allows existing cluster validation indices to be adapted to the interval context. The proposed methods with interval-based distances are compared with the point-based distances existing in the literature through experiments with simulated and real interval data.
A Análise de Dados Simbólicos (SDA) tem como objetivo prover mecanismos de redução de grandes bases de dados para extração do conhecimento e desenvolver métodos que descrevem esses dados em unidades complexas, tais como, intervalos ou um histograma. O objetivo deste trabalho é estender métodos de agrupamento clássicos para dados simbólicos intervalares baseados em distâncias essencialmente intervalares. A principal vantagem da utilização de uma distância essencialmente intervalar está no fato da preservação da imprecisão inerente aos intervalos, pois a imprecisão é normalmente perdida quando as distâncias valoradas em R são aplicadas. Este trabalho inclui uma abordagem que permite adaptar índices de validação de agrupamento existentes para o contexto intervalar. Os métodos propostos com distâncias essencialmente intervalares são comparados a distâncias pontuais existentes na literatura através de experimentos realizados com dados sintéticos e reais intervalares
APA, Harvard, Vancouver, ISO, and other styles
11

Parker, Jonathon Karl. "Accelerated Fuzzy Clustering." Scholar Commons, 2013. http://scholarcommons.usf.edu/etd/4929.

Full text
Abstract:
Clustering algorithms are a primary tool in data analysis, facilitating the discovery of groups and structure in unlabeled data. They are used in a wide variety of industries and applications. Despite their ubiquity, clustering algorithms have a flaw: they take an unacceptable amount of time to run as the number of data objects increases. The need to compensate for this flaw has led to the development of a large number of techniques intended to accelerate their performance. This need grows greater every day, as collections of unlabeled data grow larger and larger. How does one increase the speed of a clustering algorithm as the number of data objects increases and at the same time preserve the quality of the results? This question was studied using the Fuzzy c-means clustering algorithm as a baseline. Its performance was compared to the performance of four of its accelerated variants. Four key design principles of accelerated clustering algorithms were identified. Further study and exploration of these principles led to four new and unique contributions to the field of accelerated fuzzy clustering. The first was the identification of a statistical technique that can estimate the minimum amount of data needed to ensure a multinomial, proportional sample. This technique was adapted to work with accelerated clustering algorithms. The second was the development of a stopping criterion for incremental algorithms that minimizes the amount of data required, while maximizing quality. The third and fourth techniques were new ways of combining representative data objects. Five new accelerated algorithms were created to demonstrate the value of these contributions. One additional discovery made during the research was that the key design principles most often improve performance when applied in tandem. This discovery was applied during the creation of the new accelerated algorithms. 
Experiments show that the new algorithms improve speedup with minimal quality loss, are demonstrably better than related methods and occasionally are an improvement in both speedup and quality over the base algorithm.
APA, Harvard, Vancouver, ISO, and other styles
12

Naik, Vaibhav C. "Fuzzy C-means clustering approach to design a warehouse layout." [Tampa, Fla.] : University of South Florida, 2004. http://purl.fcla.edu/fcla/etd/SFE0000437.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

FURUHASHI, Takeshi, and Makoto YASUDA. "Fuzzy Entropy Based Fuzzy c-Means Clustering with Deterministic and Simulated Annealing Methods." Institute of Electronics, Information and Communication Engineers, 2009. http://hdl.handle.net/2237/15060.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Hong, Sui. "Experiments with K-Means, Fuzzy c-Means and Approaches to Choose K and C." Honors in the Major Thesis, University of Central Florida, 2006. http://digital.library.ucf.edu/cdm/ref/collection/ETH/id/1224.

Full text
Abstract:
This item is only available in print in the UCF Libraries. If this is your Honors Thesis, you can help us make it available online for use by researchers around the world by following the instructions on the distribution consent form at http://library.ucf
Bachelors
Engineering and Computer Science
Computer Engineering
APA, Harvard, Vancouver, ISO, and other styles
15

Altinel, Fatih. "An Empirical Study On Fuzzy C-means Clustering For Turkish Banking System." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12615027/index.pdf.

Full text
Abstract:
The banking sector is very sensitive to macroeconomic and political instabilities and is prone to crises. Since banks are integrated with almost all economic agents and with other banks, these crises affect entire societies. Therefore, classification or rating of banks with respect to their credibility becomes important. In this study we examine different models for the classification of banks. Choosing one of those models, fuzzy c-means clustering, banks are grouped into clusters using 48 different ratios, which can be classified under capital, asset quality, liquidity, profitability, income-expenditure structure, share in sector, share in group and branch ratios. To determine the inter-dependency between these variables, the covariance and correlation between variables are analyzed. Principal component analysis is used to decrease the number of factors. As a result, the representation space of the data has been reduced from 48 variables to a 2-dimensional space; 94.54% of the total variance is explained by these two factors. Empirical results indicate that as the number of clusters is increased, the number of iterations required for minimizing the objective function fluctuates and is not monotonic. Also, as the number of clusters increases, the initial non-optimized maximum objective function values and the optimized final minimum objective function values monotonically decrease together. Another observation is that the difference between the initial non-optimized and final optimized values of the objective function starts to diminish as the number of clusters increases.
APA, Harvard, Vancouver, ISO, and other styles
16

Chahine, Firas Safwan. "A Genetic Algorithm that Exchanges Neighboring Centers for Fuzzy c-Means Clustering." NSUWorks, 2012. http://nsuworks.nova.edu/gscis_etd/116.

Full text
Abstract:
Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithms, but has a major shortcoming: it is extremely sensitive to the choice of initial centers used to seed the algorithm. Unless k-means is carefully initialized, it converges to an inferior local optimum and results in poor quality partitions. Developing improved methods for selecting initial centers for k-means is an active area of research. Genetic algorithms (GAs) have been successfully used to evolve a good set of initial centers. Among the most promising GA-based methods are those that exchange neighboring centers between candidate partitions in their crossover operations. K-means is best suited to datasets with well-separated, non-overlapping clusters. Fuzzy c-means (FCM) is a popular variant of k-means that is designed for applications where clusters are less well-defined. Rather than assigning each point to a unique cluster, FCM determines the degree to which each point belongs to a cluster. Like k-means, FCM is also extremely sensitive to the choice of initial centers. Building on GA-based methods for initial center selection for k-means, this dissertation developed an evolutionary program for center selection in FCM called FCMGA. The proposed algorithm utilized region-based crossover and other mechanisms to improve the GA. To evaluate the effectiveness of FCMGA, three independent experiments were conducted using real and simulated datasets. The results from the experiments demonstrate the effectiveness and consistency of the proposed algorithm in identifying better quality solutions than extant methods.
Moreover, the results confirmed the effectiveness of region-based crossover in enhancing the search process for the GA and the convergence speed of FCM. Taken together, findings in these experiments illustrate that FCMGA was successful in solving the problem of initial center selection in partitional clustering algorithms.
APA, Harvard, Vancouver, ISO, and other styles
17

Rapstine, Thomas D. "Gravity gradiometry and seismic interpretation integration using spatially guided fuzzy c-means clustering inversion." Thesis, Colorado School of Mines, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=1602383.

Full text
Abstract:

Gravity gradiometry has been used as a geophysical tool to image salt structure in hydrocarbon exploration. Knowledge of the location, orientation, and spatial extent of salt bodies helps characterize possible petroleum prospects. Imaging around and underneath salt bodies can be challenging given the petrophysical properties and complicated geometry of salt. Methods for imaging beneath salt using seismic data exist but are often iterative and expensive, requiring a refinement of a velocity model at each iteration. Fortunately, the relatively strong density contrast between salt and the background density structure provides the opportunity for gravity gradiometry to be useful in exploration, especially when integrated with other geophysical data such as seismic. Quantitatively integrating multiple geophysical data sets is not trivial, but can improve the recovery of salt body geometry and petrophysical composition using inversion. This thesis provides two options for quantitatively integrating seismic, airborne gravity gradiometry (AGG), and petrophysical data that may aid the imaging of salt bodies. Both methods leverage and expand upon previously developed deterministic inversion methods, using seismically derived information, such as horizon slope and salt body interpretation, to constrain the inversion of AGG data and arrive at a density contrast model. The first method involves constraining a top-of-salt inversion using slope in a seismic image. The second method expands fuzzy c-means (FCM) clustering inversion to include spatial control on clustering based on a seismically derived salt body interpretation. The effectiveness of the methods is illustrated on a 2D synthetic earth model derived from the SEAM Phase 1 salt model. Both methods show that constraining the inversion of AGG data using information derived from seismic images can improve the recovery of salt.

APA, Harvard, Vancouver, ISO, and other styles
18

Beca, Cofre Sebastián. "Clustering Difuso con Selección de Atributos." Tesis, Universidad de Chile, 2007. http://www.repositorio.uchile.cl/handle/2250/104686.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Turner, Kevin Michael. "Estimation of Ocean Water Chlorophyll-A Concentration Using Fuzzy C-Means Clustering and Artificial Neural Networks." Fogler Library, University of Maine, 2007. http://www.library.umaine.edu/theses/pdf/TurnerKM2007.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Hore, Prodip. "Scalable frameworks and algorithms for cluster ensembles and clustering data streams." [Tampa, Fla.] : University of South Florida, 2007. http://purl.fcla.edu/usf/dc/et/SFE0002135.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Chakeri, Alireza. "Scalable Clustering Using the Dempster-Shafer Theory of Evidence." Scholar Commons, 2016. http://scholarcommons.usf.edu/etd/6478.

Full text
Abstract:
Clustering large data sets has become very important as the amount of available unlabeled data increases. Single Pass Fuzzy C-Means (SPFCM) is useful when memory is too limited to load the whole data set. The main idea is to divide the dataset into several chunks and to apply fuzzy c-means (FCM) to each chunk. SPFCM uses the weighted cluster centers of the previous chunk in the next data chunks. However, when the number of chunks is increased, the algorithm shows sensitivity to the order in which the data is processed. Hence, we improved SPFCM by recognizing boundary and noisy data in each chunk and using them to influence clustering in the next chunks. The proposed approach transfers the boundary and noisy data as well as the weighted cluster centers to the next chunks. We show that our proposed approach is significantly less sensitive to the order in which the data is loaded in each chunk.
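The single-pass scheme the abstract builds on — cluster one chunk, then carry its weighted cluster centers into the next chunk — can be sketched as follows. This is a minimal illustration of the SPFCM idea only; it omits the thesis's boundary/noise-point transfer, and the seeding and update details are assumptions of the sketch:

```python
import math

def wfcm(X, w, c, V, m=2.0, iters=50):
    """One weighted fuzzy c-means pass: points X with weights w, warm-started
    from centers V. Returns updated centers and the total membership weight
    each center absorbed (to be carried into the next chunk)."""
    n = len(X)
    for _ in range(iters):
        U = []
        for i in range(n):
            d = [math.dist(X[i], v) or 1e-12 for v in V]
            U.append([1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0))
                                for k in range(c)) for j in range(c)])
        V = []
        for j in range(c):
            wm = [w[i] * U[i][j] ** m for i in range(n)]
            tot = sum(wm)
            V.append(tuple(sum(wm[i] * X[i][k] for i in range(n)) / tot
                           for k in range(len(X[0]))))
    carry_w = [sum(w[i] * U[i][j] for i in range(n)) for j in range(c)]
    return V, carry_w

def spfcm(chunks, c, m=2.0):
    """Single-pass FCM sketch: each chunk is clustered together with the
    weighted centers carried over from the previous chunk."""
    first = list(chunks[0])
    V = [first[0]]                      # deterministic farthest-first seeding
    while len(V) < c:
        V.append(max(first, key=lambda p: min(math.dist(p, v) for v in V)))
    carry, cw = [], []
    for chunk in chunks:
        X = list(chunk) + carry         # fresh points plus carried centers
        w = [1.0] * len(chunk) + cw     # fresh points get unit weight
        V, cw = wfcm(X, w, c, V, m)
        carry = list(V)
    return V
```

Because only the c weighted centers survive between chunks, memory stays constant in the number of chunks — which is also why, as the abstract notes, information lost at chunk boundaries makes the result order-sensitive.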
APA, Harvard, Vancouver, ISO, and other styles
22

Lai, Daphne Teck Ching. "An exploration of improvements to semi-supervised fuzzy c-means clustering for real-world biomedical data." Thesis, University of Nottingham, 2014. http://eprints.nottingham.ac.uk/14232/.

Full text
Abstract:
This thesis explores various detailed improvements to semi-supervised learning (using labelled data to guide clustering or classification of unlabelled data) with fuzzy c-means clustering (a 'soft' clustering technique which allows data patterns to be assigned to multiple clusters using membership values), with the primary aim of creating a semi-supervised fuzzy clustering algorithm that shows good performance on real-world data. Hence, there are two main objectives in this work. The first objective is to explore novel technical improvements to semi-supervised fuzzy c-means (ssFCM) that can address the problem of initialisation sensitivity and can improve results. The second objective is to apply the developed algorithm to real biomedical data, such as the Nottingham Tenovus Breast Cancer (NTBC) dataset, to create an automatic methodology for identifying stable subgroups which have previously been elicited semi-manually. Investigations were conducted into detailed improvements to the ssFCM algorithm framework, including a range of distance metrics, initialisation and feature selection techniques and scaling parameter values. These methodologies were tested on different data sources to demonstrate their generalisation properties. Evaluation results between methodologies were compared to determine suitable techniques on various University of California, Irvine (UCI) benchmark datasets. Results were promising, suggesting that initialisation techniques, feature selection and scaling parameter adjustment can increase ssFCM performance. Based on these investigations, a novel ssFCM framework was developed, applied to the NTBC dataset, and various statistical and biological evaluations were conducted. This demonstrated highly significant improvement in agreement with previous classifications, with solutions that are biologically useful and clinically relevant in comparison with Soria's study [141]. On comparison with the latest NTBC study by Green et al.
[63], similar clinical results have been observed, confirming the stability of the subgroups. Two main contributions to knowledge have been made in this work. Firstly, the ssFCM framework has been improved through various technical refinements, which may be used together or separately. Secondly, the NTBC dataset has been successfully automatically clustered (in a single algorithm) into clinical sub-groups which had previously been elucidated semi-manually. While the results are very promising, it is important to note that full, detailed validation of the framework has only been carried out on the NTBC dataset, and so there is a limit on the general conclusions that may be drawn. Future studies include applying the framework to other biomedical datasets and incorporating distance metric learning into ssFCM. In conclusion, an enhanced ssFCM framework has been proposed and demonstrated to have highly significantly improved accuracy on the NTBC dataset.
APA, Harvard, Vancouver, ISO, and other styles
23

Milagre, Selma Terezinha. "Análise do número de grupos em bases de dados incompletas utilizando agrupamentos nebulosos e reamostragem Bootstrap." Universidade de São Paulo, 2008. http://www.teses.usp.br/teses/disponiveis/18/18153/tde-04032009-150315/.

Full text
Abstract:
A técnica de agrupamento de dados é amplamente utilizada em análise exploratória, a qual é frequentemente necessária em diversas áreas de pesquisa tais como medicina, biologia e estatística, para avaliar potenciais hipóteses a serem utilizadas em estudos subseqüentes. Em bases de dados reais, a ocorrência de dados incompletos, nos quais os valores de um ou mais atributos do dado são desconhecidos, é bastante comum. Este trabalho apresenta um método capaz de identificar o número de grupos presentes em bases de dados incompletas, utilizando a combinação das técnicas de agrupamentos nebulosos e reamostragem bootstrap. A qualidade da classificação é baseada em medidas de comparação tradicionais como F1, Classificação Cruzada, Hubert e outras. Os estudos foram feitos em oito bases de dados. As quatro primeiras são bases de dados artificiais, a quinta e a sexta são a wine e íris. A sétima e oitava bases são formadas por uma coleção brasileira de 119 estirpes de Bradyrhizobium. Para avaliar toda informação sem introduzir estimativas, fez-se a modificação do algoritmo Fuzzy C-Means (FCM) utilizando-se um vetor de índices de atributos, os quais indicam onde o valor de um atributo é observado ou não, modificando-se ento, os cálculos do centro e distância ao centro. As simulações foram feitas de 2 até 8 grupos utilizando-se 100 sub-amostras. Os percentuais de valores faltando utilizados foram 2%, 5%, 10%, 20% e 30%. Os resultados deste trabalho demonstraram que nosso método é capaz de identificar participações relevantes, até em presença de altos índices de dados incompletos, sem a necessidade de se fazer nenhuma suposição sobre a base de dados. As medidas Hubert e índice randômico ajustado encontraram os melhores resultados experimentais.
Clustering in exploratory data analysis is often necessary in several research areas, such as medicine, biology and statistics, to evaluate potential hypotheses for subsequent studies. In real datasets the occurrence of incomplete data, where the values of some of the attributes are unknown, is very common. This work presents a method capable of identifying the number of clusters present in incomplete datasets, using a combination of fuzzy clustering and bootstrap resampling. The quality of the classification is based on traditional comparison measures, like F1, Cross-Classification, Hubert and others. The studies were made on eight datasets. The first four are artificial datasets, and the fifth and sixth are the wine and iris datasets. The seventh and eighth are composed of a Brazilian collection of 119 Bradyrhizobium strains. To evaluate all the information without introducing estimates, a modification of the Fuzzy C-Means (FCM) algorithm was developed using an index vector of attributes, which indicates whether each attribute value is observed or not, changing the center and distance calculations accordingly. The simulations were made from 2 to 8 clusters using 100 sub-samples. The percentages of missing values used were 2%, 5%, 10%, 20% and 30%. Even with missing data and with no special assumptions about the database, the results of this work demonstrate that the proposed method is capable of identifying relevant partitions. The best experimental results were found using the Hubert and adjusted Rand measures.
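The indicator-vector modification of FCM described in the abstract can be sketched with a partial-distance strategy: missing attributes (encoded here as `None`) are skipped in both the center update and the distance computation, and each distance is rescaled by the fraction of observed components. This is only a minimal illustration of the general idea, not the thesis's exact formulation; the function name and the simple round-robin seeding are invented for the example.

```python
def fcm_missing(data, k, m=2.0, iters=100, eps=1e-5):
    """Fuzzy C-Means where None marks a missing attribute value:
    centers and distances use only the observed components."""
    n, d = len(data), len(data[0])
    obs = [[x is not None for x in row] for row in data]  # index vector of attributes
    # simple deterministic start: memberships spread round-robin over clusters
    u = [[1.0 if j == i % k else 0.0 for j in range(k)] for i in range(n)]
    centers = [[0.0] * d for _ in range(k)]
    for _ in range(iters):
        # center update: fuzzy weighted mean over observed values only
        for j in range(k):
            for a in range(d):
                num = den = 0.0
                for i in range(n):
                    if obs[i][a]:
                        w = u[i][j] ** m
                        num += w * data[i][a]
                        den += w
                if den:
                    centers[j][a] = num / den
        def dist2(i, j):
            # partial squared distance, rescaled by d / (observed count)
            s = sum((data[i][a] - centers[j][a]) ** 2
                    for a in range(d) if obs[i][a])
            p = sum(obs[i])
            return (d / p) * s if p else 0.0
        new_u = []
        for i in range(n):
            ds = [dist2(i, j) for j in range(k)]
            zero = [j for j in range(k) if ds[j] == 0.0]
            if zero:  # point coincides with one or more centers
                row = [1.0 / len(zero) if j in zero else 0.0 for j in range(k)]
            else:     # standard FCM membership update on squared distances
                row = [1.0 / sum((ds[j] / ds[l]) ** (1.0 / (m - 1)) for l in range(k))
                       for j in range(k)]
            new_u.append(row)
        shift = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(k))
        u = new_u
        if shift < eps:
            break
    return centers, u
```

Each membership row still sums to one, so the partition remains a valid fuzzy partition even when some attribute values are missing.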
APA, Harvard, Vancouver, ISO, and other styles
24

Hughes, M. Joseph. "Determining biogeochemical assemblages on the Stony River, Grant County, WV, using fuzzy c-means and k-nearest neighbors clustering." Huntington, WV : [Marshall University Libraries], 2006. http://www.marshall.edu/etd/descript.asp?ref=723.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Gu, Yuhua. "Ant clustering with consensus." [Tampa, Fla] : University of South Florida, 2009. http://purl.fcla.edu/usf/dc/et/SFE0002959.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Bacak, Hikmet Ozge. "Decision Making System Algorithm On Menopause Data Set." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12612471/index.pdf.

Full text
Abstract:
A multiple-centered clustering method and a decision-making system algorithm for a menopause data set based on it are described in this study. The method consists of two stages. At the first stage, the fuzzy C-means (FCM) clustering algorithm is applied to the data set under consideration with a high number of cluster centers; as the output of FCM, cluster centers and membership function values for each data member are calculated. At the second stage, the original cluster centers obtained in the first stage are merged until the desired number of clusters is reached. The merging process relies upon a "similarity measure" between clusters defined in the thesis. During merging, the cluster center coordinates do not change, but the data members of the merged clusters are collected into a new cluster; as a result, one obtains clusters that each contain many cluster centers. In the final part of this study, using the clustering algorithms discussed — including the multiple-centered clustering method — a decision-making system is constructed on a special data set on menopause treatment. The decisions are based on the clusterings created by the algorithms already discussed in the previous chapters of the thesis. The decision aid system was verified by a team of experts from the Department of Obstetrics and Gynecology of Hacettepe University under the guidance of Prof. Sinan Beksaç.
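The second-stage merging described above can be sketched as follows: after a first-stage FCM run with many centers, the two groups whose centers are closest are repeatedly united until the desired number of clusters remains; the center coordinates are never moved, only the member lists are merged. Euclidean closeness between centers stands in for the thesis's own similarity measure, and all names here are illustrative.

```python
import math

def merge_clusters(centers, labels, target):
    """Stage-2 merging: union the members of the two nearest groups of
    centers until `target` groups remain.  Center coordinates are kept;
    only the member lists are merged (Euclidean closeness between group
    centers stands in for the thesis's similarity measure)."""
    # each group starts as one first-stage FCM center with its members
    groups = [{'centers': [c], 'members': [i for i, l in enumerate(labels) if l == j]}
              for j, c in enumerate(centers)]
    def gap(a, b):
        # single-linkage gap between two groups of centers
        return min(math.dist(p, q) for p in a['centers'] for q in b['centers'])
    while len(groups) > target:
        i, j = min(((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))),
                   key=lambda ij: gap(groups[ij[0]], groups[ij[1]]))
        groups[i]['centers'] += groups[j]['centers']
        groups[i]['members'] += groups[j]['members']
        del groups[j]
    return groups
```

Each resulting group carries several of the original cluster centers, mirroring the "clusters which include many cluster centers" of the abstract.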
APA, Harvard, Vancouver, ISO, and other styles
27

Soumi, Ghosh. "A Quasi Stationary Service Architecture for Network Monitoring and Connectivity Prediction in Aeronautical Ad Hoc Network Using Fuzzy C Means Clustering." Thesis, Université d'Ottawa / University of Ottawa, 2014. http://hdl.handle.net/10393/31495.

Full text
Abstract:
An Aeronautical Ad Hoc Network (AANET) of airborne elements is a high-speed mobile network. The AANET has a 3D topology spread across the airspace. The high ground speed of the airborne elements changes the network topology rapidly, which makes the AANET highly dynamic in nature. Upholding connectivity in the network in such a dynamic environment is a challenge. Connectivity in the network is primarily influenced by the proximity of airborne elements to each other and their relative velocities. Once an airborne element gets disconnected from the network, it becomes completely oblivious of the network scenario in its neighborhood. In the absence of a monitoring agent in the airspace, a disconnected member of the network largely depends on the ground infrastructure and satellite resources for immediate information regarding its surrounding region. Network monitoring in the dynamic environment of an AANET is a challenging task, mainly due to the mobility of the airborne elements. We propose an intelligent network monitoring system for AANET. Under the monitoring system, disconnected members of the network are directed towards regions in the airspace with "higher" connectivity and "minimum" traffic. Our monitoring system depends on a quasi-stationary layer of Higher Altitude Platforms (HAPs) in the airspace. The primary focus of the monitoring system is to mitigate disconnectivity in the AANET. Monitoring of the network is achieved by a periodic monitoring scheme in every HAP. The proposed HAP monitoring system aims at making the AANET more independent of the ground infrastructure and satellite resources. We also employ Fuzzy C-Means (FCM) data clustering as a means to monitor changes in network topology and traffic in the AANET; FCM clustering is an integral part of our monitoring scheme. Our simulations demonstrate that FCM clustering can efficiently track network changes in the AANET and identify regions of connectivity in the network.
APA, Harvard, Vancouver, ISO, and other styles
28

Fontana, Fabiane Sorbar. "Definição de zonas de manejo utilizando algoritmo de agrupamento fuzzy c-means com variadas métricas de distâncias." Universidade Estadual do Oeste do Paraná, 2017. http://tede.unioeste.br/handle/tede/3764.

Full text
Abstract:
Submitted by Neusa Fagundes (neusa.fagundes@unioeste.br) on 2018-06-15T20:19:22Z No. of bitstreams: 2 Fabiane_Fontana2018.pdf: 2677532 bytes, checksum: 3036328537227cc96b8ea368e893f2fc (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5)
Made available in DSpace on 2018-06-15T20:19:22Z (GMT). No. of bitstreams: 2 Fabiane_Fontana2018.pdf: 2677532 bytes, checksum: 3036328537227cc96b8ea368e893f2fc (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) Previous issue date: 2017-07-19
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
Precision Agriculture (PA) uses technologies aimed at increasing productivity and reducing environmental impact through the localized application of agricultural inputs. To make PA economically feasible, it is essential to improve current methodologies, as well as to propose new ones, such as the delineation of management zones (MZs) from yield data, topographic and soil attributes, among others, used to determine subareas that are heterogeneous among themselves within the same field. In this context, the main objective of this research was to evaluate three distance metrics (Diagonal, Euclidean, and Mahalanobis) in the FUZME and SDUM software (software for the definition of management units), which use the fuzzy c-means algorithm, and, at a further moment, to evaluate the soybean and corn crops, as well as the association between them. In the first scientific paper, using data corresponding to four distinct areas, the three metrics were evaluated with original and normalized data associated with soybean yield. For area A, the Diagonal and Mahalanobis distances made normalization of the variables unnecessary, producing identical zones for both versions. After normalization of the data, the Euclidean distance presented a better delineation of its MZs for area A. For areas B, C, and D it was not possible to reach conclusions regarding the best performance, since only one variable was used in the MZ definition process, which directly influenced the results. In the second scientific paper, data corresponding to three distinct areas were used to analyze the use of soybean and corn yields, as well as the association between them, in the selection of variables to define MZs. Based on the variables available for each of the areas, the selection was carried out using the spatial correlation method, considering, for each area, the three target yields (soybean, corn, and soybean+corn). The type of yield used had two different effects: first in the variable selection process, where its alternation resulted in different selections for the same area, and second in the evaluation of the defined MZs, where even when the same variables were selected, the performances of the MZs were different. After the validation methods were applied, it was verified that the best target yield was soybean+corn, reinforcing the idea that it is better to use these two crops together when defining the MZs of an area under soybean and corn rotation.
A Agricultura de Precisão (AP) utiliza tecnologias objetivando o aumento da produtividade e redução do impacto ambiental por meio de aplicação localizada de insumos agrícolas. Para viabilizar economicamente a AP, é essencial aprimorar as metodologias atuais, bem como propor novas, como, por exemplo, o delineamento de zonas de manejo (ZMs) a partir de dados de produtividade, atributos topográficos e do solo, entre outros, utilizados a fim de determinar subáreas heterogêneas entre si em uma mesma área. Neste contexto, este trabalho teve como principal objetivo avaliar três métricas de distâncias (Diagonal, Euclidiana e Mahalanobis) junto aos Softwares FUZME e SDUM (Software para a definição de unidades de manejo), que utilizam o algoritmo fuzzy c-means, e, em um segundo momento, avaliar também as culturas de soja e milho, assim como a associação entre elas. No primeiro artigo, utilizando dados correspondentes a quatro áreas distintas, avaliaram-se as três métricas com dados originais e normalizados associados à produtividade de soja. Para a área A, as distâncias Diagonal e Mahalanobis dispensaram a necessidade de normalização das variáveis, apresentando áreas idênticas para as duas versões. Após a normalização dos dados, a distância Euclidiana apresentou um melhor delineamento em suas ZMs para a área A. Para as áreas B, C e D não foi possível obter conclusões quanto ao melhor desempenho, visto que o fato de ser utilizado apenas uma variável para o processo de definição de ZMs influenciou diretamente nos resultados obtidos. No segundo artigo, dados correspondentes a três áreas distintas foram utilizados para analisar o uso de produtividades de soja e milho, assim como a associação entre elas, na seleção de variáveis para definição de ZMs. 
A partir das variáveis disponíveis para cada uma das áreas foi realizada a seleção destas através do método da correlação espacial, levando em consideração, para cada uma das áreas, as três produtividades-alvo (soja, milho e soja+milho). O tipo de produtividade utilizada repercutiu de duas formas diferentes: primeiro no processo de seleção de variáveis, onde a sua alternância resultou em seleções diferenciadas para uma mesma área; e em um segundo momento, na avaliação das ZMs definidas, onde mesmo quando as mesmas variáveis foram selecionadas na definição das ZMs, os desempenhos das ZMs foram diferentes. Após os métodos de validação aplicados, verificou-se que a melhor produtividade-alvo foi soja+milho, reforçando a ideia de ser útil a utilização destas duas culturas, em conjunto, na definição das ZMs de uma área com alternância de produção de soja e milho.
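The three distance metrics compared in the first paper can be written as squared distances plugged into the fuzzy c-means objective. This is a generic sketch of the standard formulations (implementation details in FUZME/SDUM may differ): `var` is a vector of per-attribute variances and `inv_cov` is the inverse of the attribute covariance matrix.

```python
def euclidean2(x, c):
    # ordinary squared Euclidean distance
    return sum((a - b) ** 2 for a, b in zip(x, c))

def diagonal2(x, c, var):
    # Diagonal distance: each axis rescaled by the attribute variance
    return sum((a - b) ** 2 / v for (a, b), v in zip(zip(x, c), var))

def mahalanobis2(x, c, inv_cov):
    # Mahalanobis distance: the full inverse covariance also accounts
    # for correlation between attributes
    d = [a - b for a, b in zip(x, c)]
    return sum(d[i] * sum(inv_cov[i][j] * d[j] for j in range(len(d)))
               for i in range(len(d)))
```

With an identity inverse covariance, the Mahalanobis distance reduces to the Euclidean one, and with a purely diagonal inverse covariance it reduces to the Diagonal metric, which is why the latter two can behave like a built-in normalization of the variables.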
APA, Harvard, Vancouver, ISO, and other styles
29

Silva, Ana Claudia Guedes. "Identificação de regiões hidrologicamente homogêneas por agrupamento fuzzy c-means no estado do Paraná." Universidade Estadual do Oeste do Paraná, 2018. http://tede.unioeste.br/handle/tede/3760.

Full text
Abstract:
Submitted by Neusa Fagundes (neusa.fagundes@unioeste.br) on 2018-06-15T17:07:21Z No. of bitstreams: 2 Ana Claudia_Silva2018.pdf: 1741410 bytes, checksum: 83384ab7c02835c3d776f862defc84c1 (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5)
Made available in DSpace on 2018-06-15T17:07:21Z (GMT). No. of bitstreams: 2 Ana Claudia_Silva2018.pdf: 1741410 bytes, checksum: 83384ab7c02835c3d776f862defc84c1 (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) Previous issue date: 2018-02-07
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
The delineation of hydrologically homogeneous regions (RHH) is an essential procedure for providing information indispensable to the modeling, planning, and management of water resources, especially when it is necessary to carry out flow regionalization, aiming to estimate water availability in sections without measurements. The definition of strategies for the management and conservation of natural resources depends on information obtained through the identification of RHH, which is also one of the steps of a flow regionalization study. Thus, this work aims to identify the RHH in the state of Paraná through the Fuzzy C-Means clustering method. A total of 9 variables were used for the 114 fluviometric stations: 4 dependent variables related to the characteristic flows (long-term annual average flow (Qmld), minimum annual flow with seven days duration and 10-year return period (Q7,10), and flows associated with the 95% (Q95) and 90% (Q90) permanences) and 5 independent variables related to the morphometric characteristics of each station (drainage area (AD - m²), sum of drainages (SD - m), drainage density (DD - 1/m), and geographic location (latitude - Lat and longitude - Long)). From the principal component analysis (PCA), the variables Qmld, DD, Lat and Long were identified as the least representative and were discarded from the study, proceeding with the cluster analysis using only the variables AD, SD, Q90, Q95, and Q7,10. Fuzzy C-Means was applied to the chosen variables, and the smallest objective function was found for 4 clusters with a fuzzification index (m) of 1.7. Separating the fluviometric stations by cluster through their degrees of membership, the largest number of stations was obtained in Cluster 3 (83 stations), followed by Cluster 4 (13 stations) and Clusters 1 and 2 (7 stations each); only 4 stations were not inserted in any cluster, being classified as fuzzy. The groups were determined practically by the distribution of the AD and SD variables. The smallest coverage areas, analyzed flows, and numbers of drainages in the coverage area of the stations were found in Cluster 3, whose stations are spread widely across the state of Paraná. Clusters 1 and 4 were intermediate among the other clusters in all evaluated parameters. The Fuzzy C-Means algorithm proved to be efficient for grouping the fluviometric stations in the state of Paraná, making it possible to characterize each cluster formed, without overlap of data in the intervals of the analyzed variables.
O delineamento de regiões hidrologicamente homogêneas (RHH) é um procedimento essencial para provimento de informações indispensáveis aos trabalhos de modelagem, planejamento e gestão de recursos hídricos, principalmente quando se tem a necessidade de realizar a regionalização de vazões, visando estimar a disponibilidade hídrica em seções desprovidas de medições. A definição de estratégias de manejo e conservação dos recursos naturais depende de informações obtidas por meio da identificação de RHH, sendo também um dos passos de um estudo de regionalização de vazões. Assim, este trabalho tem como objetivo a identificação das RHH no estado do Paraná através do método de agrupamento Fuzzy C-Means. Foram utilizadas 9 variáveis, individualizadas para as 114 estações fluviométricas adotadas, sendo 4 variáveis dependentes referentes às vazões características (vazão média anual de longa duração (Qmld), vazão mínima anual com sete dias de duração e período de retorno de 10 anos (Q7,10), vazões associadas às permanências de 95% (Q95) e 90% (Q90)) e 5 independentes referentes às características morfometrias da estação (área de drenagem (AD – m²), soma das drenagens (SD - m), densidade de drenagem (DD – 1/m) e a localização geográfica (latitude - Lat e longitude - Long). A partir da análise de componentes principais (ACP) identificou-se as variáveis Qmld, DD, Lat e Long como as menos representativas, sendo excluídas do estudo, dando procedência à análise de agrupamentos apenas com as variáveis AD, SD, Q90, Q95 e Q7,10. Aplicou-se o Fuzzy C-Means para as variáveis escolhidas, sendo que a menor função objetiva encontrada foi para 4 Clusters no índice de fuzzificação (m) 1,7. 
Separando as estações fluviométricas por clusters através dos graus de pertinência, obtivemos o maior número de estações no Cluster 3 (83 estações), seguidos do Cluster 4 (13 estações) e dos Clusters 1 e 2 (7 estações em cada cluster), e apenas 4 estações não foram inseridas em nenhum cluster, sendo classificadas como nebulosas, sendo que os grupos foram determinados praticamente pela distribuição das variáveis AD e SD. As menores áreas de abrangência, vazões analisadas e as menores quantidade de drenagens na área de cobertura das estações foram encontras no Cluster 3, que estão bem espalhadas no estado do Paraná. Já os Clusters 1 e 4 ficaram intermediários entre os demais clusters em todos os parâmetros avaliados. O algoritmo Fuzzy C-Means se mostrou eficiente para o agrupamento das estações fluviométricas no estado do Paraná, onde foi possível encontrar as características de cada cluster formado, sem haver sobreposição de dados nos intervalos das variáveis analisadas.
APA, Harvard, Vancouver, ISO, and other styles
30

Vargas, Rogerio Rodrigues de. "Uma nova forma de calcular os centros dos Clusters em algoritmos de agrupamento tipo fuzzy c-means." Universidade Federal do Rio Grande do Norte, 2012. http://repositorio.ufrn.br:8080/jspui/handle/123456789/17949.

Full text
Abstract:
Made available in DSpace on 2014-12-17T15:47:00Z (GMT). No. of bitstreams: 1 RogerioRV_TESE.pdf: 769325 bytes, checksum: ddaac964e1c74fba3533b5cdd90927b2 (MD5) Previous issue date: 2012-03-30
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Clustering data is a very important task in data mining, image processing and pattern recognition problems. One of the most popular clustering algorithms is the Fuzzy C-Means (FCM). This thesis proposes a new way of calculating the cluster centers in the FCM algorithm, called ckMeans, which can also be applied in some variants of FCM; in particular, we apply it here to variants that use other distances. The goal of this change is to reduce the number of iterations and the processing time of these algorithms without affecting the quality of the partition, or even to improve the number of correct classifications in some cases. We also developed an algorithm based on ckMeans to manipulate interval data, considering interval membership degrees. This algorithm allows the representation of the data without converting interval data into punctual data, as happens in other extensions of FCM that deal with interval data. To validate the proposed methodologies, the ckMeans clustering was compared with the K-Means and FCM algorithms (since the center calculation proposed in this work is similar to that of K-Means), considering three different distances and several well-known databases. The results of the interval ckMeans were also compared with those of other interval clustering algorithms when applied to an interval database of minimum and maximum monthly temperatures for a given year, for 37 cities distributed across the continents.
Agrupar dados é uma tarefa muito importante em mineração de dados, processamento de imagens e em problemas de reconhecimento de padrões. Um dos algoritmos de agrupamentos mais popular é o Fuzzy C-Means (FCM). Esta tese propõe aplicar uma nova forma de calcular os centros dos clusters no algoritmo FCM, que denominamos de ckMeans, e que pode ser também aplicada em algumas variantes do FCM, em particular aqui aplicamos naquelas variantes que usam outras distâncias. Com essa modificação, pretende-se reduzir o número de iterações e o tempo de processamento desses algoritmos sem afetar a qualidade da partição ou até melhorar o número de classificações corretas em alguns casos. Também, desenvolveu-se um algoritmo baseado no ckMeans para manipular dados intervalares considerando graus de pertinência intervalares. Este algoritmo possibilita a representação dos dados sem conversão dos dados intervalares para pontuais, como ocorre com outras extensões do FCM que lidam com dados intervalares. Para validar as metodologias propostas, comparou-se o agrupamento ckMeans com os algoritmos K-Means (pois o algoritmo proposto neste trabalho para cálculo dos centros se assemelha ao do K-Means) e FCM, considerando três distâncias diferentes. Foram utilizadas várias bases de dados conhecidas. No caso, os resultados do ckMeans intervalar foram comparados com outros algoritmos de agrupamento intervalar quando aplicados a uma base de dados intervalar com a temperatura mínima e máxima do mês de um determinado ano, referente a 37 cidades distribuídas entre os continentes.
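The ckMeans idea, as the abstract describes it, replaces FCM's fuzzy weighted-mean center update with a K-Means-like one. A minimal sketch of such a step (the thesis's exact formula is not reproduced here, and the function name is illustrative): each center becomes the plain mean of the points whose largest membership selects it, while memberships would still follow the usual FCM update.

```python
def ckmeans_center_update(data, u):
    """K-Means-like center step in the spirit of ckMeans: each center is
    the plain mean of the points whose maximum membership selects it,
    instead of FCM's fuzzy weighted mean over all points."""
    k = len(u[0])
    d = len(data[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for row, mem in zip(data, u):
        j = max(range(k), key=lambda c: mem[c])  # crisp assignment
        counts[j] += 1
        for a in range(d):
            sums[j][a] += row[a]
    # empty clusters keep a zero center in this toy sketch
    return [[s / counts[j] if counts[j] else 0.0 for s in sums[j]]
            for j in range(k)]
```

Because the update touches only the points assigned to each cluster rather than all points weighted by membership, each iteration is cheaper, which is consistent with the reduction in processing time reported in the abstract.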
APA, Harvard, Vancouver, ISO, and other styles
31

Filho, Márcio Coutinho Brandão Côrtes. "Reconhecimento de padrão na biodisponibilidade do ferro utilizando o Algoritmo Fuzzy C-Means." Universidade do Estado do Rio de Janeiro, 2012. http://www.bdtd.uerj.br/tde_busca/arquivo.php?codArquivo=4902.

Full text
Abstract:
Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro
Este trabalho apresenta um método para reconhecimento do padrão na biodisponibilidade do ferro, através da interação com substâncias que auxiliam a absorção como vitamina C e vitamina A e nutrientes inibidores como cálcio, fitato, oxalato, tanino e cafeína. Os dados foram obtidos através de inquérito alimentar, almoço e jantar, em crianças de 2 a 5 anos da única Creche Municipal de Paraty-RJ entre 2007 e 2008. A Análise de Componentes Principais (ACP) foi aplicada na seleção dos nutrientes e utilizou-se o Algoritmo Fuzzy C-Means (FCM) para criar os agrupamentos classificados de acordo com a biodisponibilidade do ferro. Uma análise de sensibilidade foi desenvolvida na tentativa de buscar quantidades limítrofes de cálcio a serem consumidas nas refeições. A ACP mostrou que no almoço os nutrientes que explicavam melhor a variabilidade do modelo foram ferro, vitamina C, fitato e oxalato, enquanto no jantar o cálcio se mostrou eficaz na determinação da variabilidade do modelo devido ao elevado consumo de leite e derivados. Para o almoço, a aplicação do FCM na interação dos nutrientes, notou-se que a ingestão de vitamina C foi determinante na classificação dos grupos. No jantar, a classificação de grupos foi determinada pela quantidade de ferro heme na interação com o cálcio. Na análise de sensibilidade realizada no almoço e no jantar, duas iterações do algoritmo determinaram a interferência total do cálcio na biodisponibilidade do ferro.
This dissertation presents a method for pattern recognition in the bioavailability of iron, through its interaction with substances that aid absorption, such as vitamin C and vitamin A, and inhibiting nutrients such as calcium, phytate, oxalate, tannin and caffeine. The data were obtained through a dietary survey (lunch and dinner) of children aged 2 to 5 years at the only Municipal Nursery of Paraty-RJ, between 2007 and 2008. Principal Component Analysis (PCA) was applied to the selection of nutrients, and the Fuzzy C-Means (FCM) algorithm was used to create the groups classified according to the bioavailability of iron. A sensitivity analysis was developed in an attempt to find threshold amounts of calcium to be consumed at meals. The PCA showed that at lunch the nutrients that best explained the variability of the model were iron, vitamin C, phytate and oxalate, while at dinner calcium was effective in determining the variability of the model due to the high consumption of dairy products. For lunch, applying FCM to the interaction of nutrients, it was noted that the intake of vitamin C was decisive in the classification of the groups. At dinner, the classification of groups was determined by the amount of heme iron in its interaction with calcium. In the sensitivity analysis performed for lunch and dinner, two iterations of the algorithm determined the total interference of calcium on iron bioavailability.
APA, Harvard, Vancouver, ISO, and other styles
32

Liu, Xiaofeng. "Machinery fault diagnostics based on fuzzy measure and fuzzy integral data fusion techniques." Queensland University of Technology, 2007. http://eprints.qut.edu.au/16456/.

Full text
Abstract:
With growing demands for reliability, availability, safety and cost efficiency in modern machinery, accurate fault diagnosis is becoming of paramount importance so that potential failures can be better managed. Although various methods have been applied to machinery condition monitoring and fault diagnosis, the diagnostic accuracy that can be attained is far from satisfactory. As most machinery faults lead to increases in vibration levels, vibration monitoring has become one of the most basic and widely used methods to detect machinery faults. However, current vibration monitoring methods largely depend on signal processing techniques. This study is based on the recognition that a multi-parameter data fusion approach to diagnostics can produce more accurate results. Fuzzy measures and fuzzy integral data fusion theory can represent the importance of each criterion and express certain interactions among them. This research developed a novel, systematic and effective fuzzy measure and fuzzy integral data fusion approach for machinery fault diagnosis, which comprises feature set selection schema, feature level data fusion schema and decision level data fusion schema for machinery fault diagnosis. Different feature selection and fault diagnostic models were derived from these schemas. Two fuzzy measures and two fuzzy integrals were employed: the 2-additive fuzzy measure, the fuzzy measure, the Choquet fuzzy integral and the Sugeno fuzzy integral respectively. The models were validated using rolling element bearing and electrical motor experiments. Different features extracted from vibration signals were used to validate the rolling element bearing feature set selection and fault diagnostic models, while features obtained from both vibration and current signals were employed to assess electrical motor fault diagnostic models. 
The results show that the proposed schemas and models perform very well in selecting feature sets and can improve accuracy in diagnosing both rolling element bearing and electrical motor faults.
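The Choquet fuzzy integral used for decision-level fusion aggregates criterion scores with respect to a fuzzy measure, so the importance of each criterion and the interactions among criteria can both be expressed. A minimal sketch of the discrete Choquet integral (the measure values below are a toy example, not ones identified by the thesis):

```python
def choquet(scores, mu):
    """Discrete Choquet integral of criterion scores with respect to a
    fuzzy measure mu, given as a dict mapping frozensets of criteria to
    weights (mu[frozenset()] = 0, mu[set of all criteria] = 1)."""
    items = sorted(scores.items(), key=lambda kv: kv[1])  # ascending scores
    total, prev = 0.0, 0.0
    remaining = set(scores)  # criteria scoring at least the current level
    for crit, val in items:
        # each increment of the score level is weighted by the measure
        # of the set of criteria that reach that level
        total += (val - prev) * mu[frozenset(remaining)]
        prev = val
        remaining.discard(crit)
    return total
```

For an additive measure the Choquet integral reduces to a weighted average: with mu({a}) = 0.3, mu({b}) = 0.7 and scores a = 0.2, b = 0.6 it returns 0.2 * 1.0 + 0.4 * 0.7 = 0.48. Non-additive measures (such as the 2-additive one mentioned in the abstract) let the fusion reward or penalize criteria that agree.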
APA, Harvard, Vancouver, ISO, and other styles
33

Ulucan, Serkan. "A Recommendation System Combining Context-awarenes And User Profiling In Mobile Environment." Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/12606845/index.pdf.

Full text
Abstract:
Up to now various recommendation systems have been proposed for web-based applications such as e-commerce and information retrieval, where a large amount of products or information is available. Basically, the task of the recommendation systems in those applications, for example e-commerce, is to find and recommend the most relevant items to users/customers. In this domain, the most prominent approaches are collaborative filtering and content-based filtering; sometimes these approaches are called user profiling as well. In this work, a context-aware recommendation system is proposed for the mobile environment, which can also be considered an extension of the recommendation systems proposed for web-based information retrieval and e-commerce applications. In web-based information retrieval and e-commerce applications, for example in an online book store (e-commerce), the users' actions are independent of their instant context (location, time, etc.). But in the mobile environment, the users' actions are strictly dependent on their instant context. These dependencies give rise to the need of filtering items/actions with respect to the users' instant context. In this thesis, an approach coupling methods from two different domains, the mobile environment and the web, is proposed. Hence, the whole approach can be separated into two phases: context-aware prediction and user profiling. In the first phase, a combination of two methods, fuzzy c-means clustering and learning automata, is used to predict the mobile user's motions in context space beforehand. This allows the elimination of a large number of items placed in the context space. In the second phase, hierarchical fuzzy clustering for user profiling is used to determine the best recommendation among the remaining items.
APA, Harvard, Vancouver, ISO, and other styles
34

Arnaldo, Heloína Alves. "Novos métodos determinísticos para gerar centros iniciais dos grupos no algoritmo fuzzy C-Means e variantes." Universidade Federal do Rio Grande do Norte, 2014. http://repositorio.ufrn.br:8080/jspui/handle/123456789/18109.

Full text
Abstract:
Made available in DSpace on 2014-12-17T15:48:11Z (GMT). No. of bitstreams: 1 HeloinaAA_DISSERT.pdf: 1661373 bytes, checksum: df9fe39185a27ded472f2f72284acdf6 (MD5) Previous issue date: 2014-02-24
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Data clustering is a technique applied to various fields such as data mining, image processing and pattern recognition. Clustering algorithms split a data set into clusters such that elements within the same cluster have a high degree of similarity, while elements belonging to different clusters have a high degree of dissimilarity. The Fuzzy C-Means (FCM) algorithm is the fuzzy clustering algorithm most used and discussed in the literature. The performance of FCM is strongly affected by the selection of the initial cluster centers, so the choice of a good set of initial centers is very important for the performance of the algorithm. However, in FCM the choice of initial centers is made randomly, making it difficult to find a good set. This work proposes three new methods to obtain the initial cluster centers deterministically for the FCM algorithm, which can also be used in variants of FCM; here, these initialization methods were applied to the ckMeans variant. With the proposed methods, the intention is to obtain a set of initial centers close to the real cluster centers, reducing the number of iterations needed for these algorithms to converge and the processing time, without affecting the quality of the clustering, or even improving it in some cases. Accordingly, cluster validation indices were used to measure the quality of the clusters obtained by the FCM and ckMeans algorithms, modified with the proposed initialization methods, when applied to various data sets.
Agrupamento de dados é uma técnica aplicada a diversas áreas como mineração de dados, processamento de imagens e reconhecimento de padrões. Algoritmos de agrupamento particionam um conjunto de dados em grupos, de tal forma, que elementos dentro de um mesmo grupo tenham alto grau de similaridade, enquanto elementos pertencentes a diferentes grupos tenham alto grau de dissimilaridade. O algoritmo Fuzzy C-Means (FCM) é um dos algoritmos de agrupamento fuzzy de dados mais utilizados e discutidos na literatura. O desempenho do FCM é fortemente afetado pela seleção dos centros iniciais dos grupos. Portanto, a escolha de um bom conjunto de centros iniciais é muito importante para o desempenho do algoritmo. No entanto, no FCM, a escolha dos centros iniciais é feita de forma aleatória, tornando difícil encontrar um bom conjunto. Este trabalho propõe três novos métodos para obter os centros iniciais dos grupos, de forma determinística, no algoritmo FCM, e que podem também ser usados em variantes do FCM. Neste trabalho esses métodos de inicialização foram aplicados na variante ckMeans. Com os métodos propostos, pretende-se obter um conjunto de centros iniciais que esteja próximo dos centros reais dos grupos. Com estas novas abordagens de inicialização deseja-se reduzir o número de iterações para estes algoritmos convergirem e o tempo de processamento, sem afetar a qualidade do agrupamento ou até melhorar a qualidade em alguns casos. Neste sentido, foram utilizados índices de validação de agrupamento para medir a qualidade dos agrupamentos obtidos pelos algoritmos FCM e ckMeans, modificados com os métodos de inicialização propostos, quando aplicados a diversas bases de dados.
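The thesis's three deterministic initialization methods are not reproduced here. As a generic illustration of what deterministic seeding for FCM/ckMeans looks like, the sketch below (a stand-in, with invented names) starts from the point nearest the data mean and then greedily adds the point farthest from the centers already chosen, so repeated runs always produce the same initial centers:

```python
import math

def deterministic_init(data, k):
    """One deterministic seeding scheme (greedy farthest-point):
    start from the point closest to the overall mean, then repeatedly
    add the point farthest from the centers chosen so far.  This is a
    generic stand-in for the thesis's own three methods."""
    d = len(data[0])
    mean = [sum(x[a] for x in data) / len(data) for a in range(d)]
    centers = [min(data, key=lambda x: math.dist(x, mean))]
    while len(centers) < k:
        centers.append(max(data,
                           key=lambda x: min(math.dist(x, c) for c in centers)))
    return centers
```

Because no random draw is involved, the FCM (or ckMeans) run that consumes these centers is reproducible, and seeds that are spread out over the data tend to reduce the number of iterations to convergence, which is the motivation stated in the abstract.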
APA, Harvard, Vancouver, ISO, and other styles
35

Wong, Cheok Meng. "A distributed particle swarm optimization for fuzzy c-means algorithm based on an apache spark platform." Thesis, University of Macau, 2018. http://umaclib3.umac.mo/record=b3950604.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Guder, Mennan. "Data Mining Methods For Clustering Power Quality Data Collected Via Monitoring Systems Installed On The Electricity Network." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/3/12611120/index.pdf.

Full text
Abstract:
Increasing power demand and the wide use of high-technology power electronic devices result in a need for power quality monitoring. The quality of electric power in both transmission and distribution systems should be analyzed in order to sustain power system reliability and continuity. This analysis is possible by examining data collected by power quality monitoring systems. In order to define the characteristics of the power system and reveal the relations between power quality events, a huge amount of data must be processed. In this thesis, clustering methods for power quality events are developed using exclusive and overlapping clustering models. The methods are designed to cluster the huge amount of power quality data obtained from online monitoring of the Turkish Electricity Transmission System. The main issues considered in the design of the clustering methods are the amount of data, the efficiency of the designed algorithm and the queries that should be supplied to the domain experts. This research work is fully supported by the Public Research Grant Committee (KAMAG) of TUBITAK within the scope of the National Power Quality Project (105G129).
APA, Harvard, Vancouver, ISO, and other styles
37

Mohd-Safar, Noor Zuraidin. "Integration of principal component analysis, fuzzy C-means and artificial neural networks for localised environmental modelling of tropical climate." Thesis, University of Portsmouth, 2017. https://researchportal.port.ac.uk/portal/en/theses/integration-of-principal-component-analysis-fuzzy-cmeans-and-artificial-neural-networks-for-localised-environmental-modelling-of-tropical-climate(46e896a0-e712-4e4f-9812-d5c977fe6b1d).html.

Full text
Abstract:
Meteorological processes are highly non-linear and complicated to predict at high spatial resolutions. Weather forecasting provides critical information about future weather that is important for flood disaster prediction systems and disaster management. This information is also important to businesses, industry, the agricultural sector, government and local authorities for a wide range of reasons. The processes leading to rainfall are non-linear, and the relationships between meteorological parameters are dynamic and disproportionate. The uncertainty of future occurrence and rain intensity can have a negative impact on many sectors which depend on the weather. Therefore, accurate rainfall prediction is important for human decision making. Innovative computer technologies such as soft computing can be used to improve the accuracy of rainfall prediction. Soft computing approaches, such as neural networks and fuzzy soft clustering, are computationally intelligent systems that are capable of integrating human-like knowledge within a specific domain and of adapting and learning in changing environments. This study evaluates the performance of a rainfall forecasting model. The data pre-processing method of Principal Component Analysis (PCA) is combined with an Artificial Neural Network (ANN) and the Fuzzy C-Means (FCM) clustering algorithm and used to forecast short-term localized rainfall in a tropical climate. State forecasts (raining or not raining) and value forecasts (rain intensity) are tested using a number of trained networks. Different types of ANN structures were trained, combining a multilayer perceptron with a back-propagation network. Levenberg-Marquardt, Bayesian Regularization and Scaled Conjugate Gradient training algorithms are used in the network training. Each neuron uses a linear, logistic sigmoid or hyperbolic tangent sigmoid transfer function.
Preliminary analysis of input parameter data pre-processing and FCM clustering were used to prepare input data for the ANN forecast model. Meteorological data such as atmospheric pressure, temperature, dew point, humidity and wind speed have been used as input parameters. The magnitude of errors and the correlation coefficient were used to evaluate the performance of the trained neural networks. The predicted rainfall forecasts for one to six hours ahead are compared and analysed. One-hour-ahead state and value forecasts yield more than 80% accuracy. Increasing the forecast horizon reduces accuracy because the input-output mapping of the forecast model reaches its termination criterion early during the validation test, with no improvement in convergence over consecutive epochs. Results show that the combined PCA-FCM-ANN forecast model produces better accuracy than a basic ANN forecast model.
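The PCA pre-processing step described above can be sketched as follows. This is only a rough illustration via SVD, not the thesis's actual pipeline; the five input columns (pressure, temperature, dew point, humidity, wind speed) are simulated here:

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its leading principal components (via SVD)
    and report the variance ratio explained by each component."""
    Xc = X - X.mean(axis=0)                      # center each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)        # variance ratio per component
    return Xc @ Vt[:n_components].T, explained[:n_components]

# Hypothetical columns: pressure, temperature, dew point, humidity, wind speed,
# generated from two latent "weather factors" plus small noise
rng = np.random.default_rng(1)
base = rng.normal(size=(200, 2))
X = base @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(200, 5))
Z, ratio = pca(X, 2)
print(Z.shape)  # two components now summarize the five correlated inputs
```

The reduced matrix `Z` would then be fed to FCM clustering and the ANN, mirroring the PCA-FCM-ANN chain the abstract describes.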
APA, Harvard, Vancouver, ISO, and other styles
38

Rodríguez, Martínez Cecilia. "Software quality studies using analytical metric analysis." Thesis, KTH, Kommunikationssystem, CoS, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-120325.

Full text
Abstract:
Today engineering companies expend a large amount of resources on the detection and correction of bugs (defects) in their software. These bugs are usually due to errors and mistakes made by programmers while writing the code or the specifications. No tool is able to detect all of these bugs, and some remain undetected despite testing of the code. For these reasons, many researchers have tried to find indicators in a software's source code that can be used to predict the presence of bugs. Every bug in the source code is a potential failure of the program to perform as expected. Therefore, programs are tested with many different cases in an attempt to cover all the possible paths through the program and detect all of these bugs. Early prediction of bugs informs the programmers about the location of the bugs in the code, so they can test the more error-prone files more carefully and save a lot of time by not testing error-free files. This thesis project created a tool that is able to predict error-prone source code written in C++. In order to achieve this, we utilized one predictor which has been extremely well studied: software metrics. Many studies have demonstrated that there is a relationship between software metrics and the presence of bugs. In this project a Neuro-Fuzzy hybrid model based on Fuzzy c-means and a Radial Basis Neural Network has been used. The efficiency of the model has been tested on a software project at Ericsson. Testing of this model showed that the program does not achieve high accuracy due to the lack of independent samples in the data set. However, experiments did show that classification models provide better predictions than regression models. The thesis concludes by suggesting future work that could improve the performance of this program.
APA, Harvard, Vancouver, ISO, and other styles
39

Koprnicky, Miroslav. "Towards a Versatile System for the Visual Recognition of Surface Defects." Thesis, University of Waterloo, 2005. http://hdl.handle.net/10012/888.

Full text
Abstract:
Automated visual inspection is an emerging multi-disciplinary field with many challenges; it combines different aspects of computer vision, pattern recognition, automation, and control systems. There does not exist a large body of work dedicated to the design of generalized visual inspection systems; that is, those that might easily be made applicable to different product types. This is an important oversight, in that many improvements in design and implementation times, as well as costs, might be realized with a system that could easily be made to function in different production environments.

This thesis proposes a framework for generalizing and automating the design of the defect classification stage of an automated visual inspection system. It involves using an expandable set of features which are optimized along with the classifier operating on them in order to adapt to the application at hand. The particular implementation explored involves optimizing the feature set in disjoint sets logically grouped by feature type to keep search spaces reasonable. Operator input is kept at a minimum throughout this customization process, since it is limited only to those cases in which the existing feature library cannot adequately delineate the classes at hand, at which time new features (or pools) may have to be introduced by an engineer with experience in the domain.

Two novel methods are put forward which fit well within this framework: cluster-space and hybrid-space classifiers. They are compared in a series of tests against both standard benchmark classifiers, as well as mean and majority vote multi-classifiers, on feature sets comprised of just the logical feature subsets, as well as the entire feature sets formed by their union. The proposed classifiers as well as the benchmarks are optimized with both a progressive combinatorial approach and with an genetic algorithm. Experimentation was performed on true colour industrial lumber defect images, as well as binary hand-written digits.

Based on the experiments conducted in this work, it was found that the sequentially optimized multi hybrid-space methods are capable of matching the performance of the benchmark classifiers on the lumber data, with the exception of the mean-rule multi-classifiers, which dominated most experiments by approximately 3% in classification accuracy. The genetic algorithm optimized hybrid-space multi-classifier achieved the best performance, however: an accuracy of 79.2%.

The numeral dataset results were less promising; the proposed methods could not equal benchmark performance. This is probably because the numeral feature sets were much more conducive to good class separation, with standard benchmark accuracies approaching 95% not uncommon. This indicates that the cluster-space transform inherent to the proposed methods appears to be most useful in highly dependent or confusing feature spaces, a hypothesis supported by the outstanding performance of the single hybrid-space classifier in the difficult texture feature subspace: 42.6% accuracy, a 6% increase over the best benchmark performance.

The generalized framework proposed appears promising, because classifier performance over feature sets formed by the union of independently optimized feature subsets regularly met and exceeded those classifiers operating on feature sets formed by the optimization of the feature set in its entirety. This finding corroborates earlier work with similar results [3, 9], and is an aspect of pattern recognition that should be examined further.
APA, Harvard, Vancouver, ISO, and other styles
40

Budayan, Cenk. "Strategic Group Analysis: Strategic Perspective, Differentiation And Performance In Construction." Phd thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609676/index.pdf.

Full text
Abstract:
The aim of strategic group analysis is to find out if clusters of firms that have a similar strategic position exist within an industry or not. In this thesis, by using a conceptual framework that reflects the strategic context, contents and process of construction companies and utilising alternative clustering methods such as traditional cluster analysis, self-organizing maps, and fuzzy C-means technique, a strategic group analysis was conducted for the Turkish construction industry. Results demonstrate that there are three strategic groups among which significant performance differences exist. Self-organising maps provide a visual representation of group composition and help identification of hybrid structures. Fuzzy C-means technique reveals the membership degrees of a firm to each strategic group. It is recommended that real strategic group structure can only be identified by using alternative cluster analysis methods. The positive effect of differentiation strategy on achieving competitive advantage is widely acknowledged in the literature and proved to be valid for the Turkish construction industry as a result of strategic group analysis. In this study, a framework is proposed to model the differentiation process in construction. The relationships between the modes and drivers of differentiation are analyzed by structural equation modeling. The results demonstrate that construction companies can either differentiate on quality or productivity. Project management related factors extensively influence productivity differentiation whereas they influence quality differentiation indirectly. Corporate management related factors only affect quality differentiation. Moreover, resources influence productivity differentiation directly whereas they have an indirect effect on quality differentiation.
APA, Harvard, Vancouver, ISO, and other styles
41

Sobíšek, Lukáš. "Shluková a regresní analýza mikropanelových dat." Doctoral thesis, Vysoká škola ekonomická v Praze, 2010. http://www.nusl.cz/ntk/nusl-261941.

Full text
Abstract:
The main purpose of panel studies is to analyze changes in the values of studied variables over time. In micro panel research, a large number of elements are periodically observed within a relatively short time period of just a few years, and the number of repeated measurements is small. This dissertation deals with contemporary approaches to the regression and clustering analysis of micro panel data. One approach to micro panel analysis is to use multivariate statistical models originally designed for cross-sectional data and modify them to take into account the within-subject correlation. The thesis summarizes the available tools for the regression analysis of micro panel data. The known and currently used linear mixed effects models for a normally distributed dependent variable are recapitulated. Besides that, new approaches for the analysis of a response variable with other than normal distribution are presented. These approaches include the generalized marginal linear model, the generalized linear mixed effects model and the Bayesian modelling approach. In addition to describing the aforementioned models, the paper also includes a brief overview of their implementation in the R software. The difficulty with regression models adjusted for micro panel data is the ambiguity of their parameter estimation. This thesis proposes a way to improve the estimates through cluster analysis, and for this reason also contains a description of methods for the cluster analysis of micro panel data. Because the supply of such methods is limited, the main goal of this paper is to devise its own two-step approach for clustering micro panel data. In the first step, the panel data are transformed into a static form using a set of proposed characteristics of dynamics. These characteristics represent different features of the time course of the observed variables.
In the second step, the elements are clustered by conventional spatial clustering techniques (agglomerative clustering and C-means partitioning). The clustering is based on a dissimilarity matrix of the values of the clustering variables calculated in the first step. Another goal of this paper is to find out whether the suggested procedure leads to an improvement in the quality of regression models for this type of data. By means of a simulation study, the procedure drafted herein is compared to the procedure applied in the kml package of the R software, as well as to the clustering characteristics proposed by Urso (2004). The simulation study demonstrated better results for the proposed combination of clustering variables compared to the other combinations currently used. A corresponding script written in the R language represents another benefit of this paper. It is available on the attached CD and can be used for analyses of readers' own micro panel data.
APA, Harvard, Vancouver, ISO, and other styles
42

Hunter, Brandon. "Channel Probing for an Indoor Wireless Communications Channel." BYU ScholarsArchive, 2003. https://scholarsarchive.byu.edu/etd/64.

Full text
Abstract:
The statistics of the amplitude, time and angle of arrival of multipaths in an indoor environment are all necessary components of multipath models used to simulate the performance of spatial diversity in receive antenna configurations. The model presented by Saleh and Valenzuela, extended by Spencer et al., included all three of these parameters for a 7 GHz channel. A system was built to measure these multipath parameters at 2.4 GHz for multiple locations in an indoor environment. Another system was built to measure the angle of transmission for a 6 GHz channel. The addition of this parameter allows spatial diversity at the transmitter, along with the receiver, to be simulated. The process of going from raw measurement data to discrete arrivals and then to clustered arrivals is analyzed. Many possible errors associated with discrete arrival processing are discussed along with possible solutions. Four clustering methods are compared and their relative strengths and weaknesses are pointed out. The effects that errors in the clustering process have on parameter estimation and model performance are also simulated.
APA, Harvard, Vancouver, ISO, and other styles
43

Ching-Lin, Lin, and 林敬霖. "Kernel Intuitionistic Fuzzy C-Means Clustering Algorithms with Rough Set for Customer Analysis." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/69926836270687990934.

Full text
Abstract:
Master's thesis
Lunghwa University of Science and Technology
Master's Program, Department of Information Management
Academic year 101 (2012–2013)
Fuzzy C-means (FCM) algorithms have been widely used in a variety of applications. This paper proposes a kernel intuitionistic fuzzy c-means clustering algorithm with rough set (KIFCMRS) and applies it to E-learning data analysis. Rule generation is divided into two stages. In the first stage, KIFCM takes advantage of kernel functions and intuitionistic fuzzy sets to cluster the raw data into similar groups. In the second stage, rough set theory is employed to generate rules for the different groups. Finally, the paper compares the methods at both stages: in the first stage, KIFCM against two other methods (KM, FCM); in the second stage, KIFCMRS against two other methods (ID3, RS). The comparisons demonstrate the superior performance of the proposed KIFCMRS.
APA, Harvard, Vancouver, ISO, and other styles
44

Lin, Chun-Hao, and 林峻皓. "Electric Signal-Based Proactive Operation Condition Monitoring of High-Voltage Motors Using Principal Component Analysis and Fuzzy C-means Clustering." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/3ta44f.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Electrical Engineering
Academic year 107 (2018–2019)
In today's industry, high-voltage motors are indispensable sources of power. They are characterized by a long life cycle, high energy efficiency, low vibration noise and high stability, and they usually need to run for long periods to remain economical. Therefore, how to maintain high-voltage motors is an important issue. Most of today's factories and electric power plants adopt a predetermined maintenance strategy, also known as time-based maintenance (TBM). Although this reduces the probability of failure, it cannot immediately reveal the latent operating condition of the motor. If the operating condition of a high-voltage motor can be predicted early and problems prevented in advance, maintenance costs can be greatly reduced and major accidents avoided. This thesis is dedicated to establishing a proactive high-voltage motor operation condition monitoring method based on electric signals. First, the three-phase voltage and current signals of a high-voltage motor running in an electric power plant are captured by a measuring platform, and one day of normal-operation data is taken from the database. Because no dangerous operation data of the high-voltage motor is available, this study adds additive white Gaussian noise (AWGN) and linear amplification to the normal-state data to synthesize warning-state and dangerous-state data, and conducts a case study. Next, the relevant electrical indexes from international standards are calculated, and the smallest number of characteristic indexes carrying the most structural information is extracted through principal component analysis (PCA). Further, the extracted characteristic indexes are used as inputs to the fuzzy C-means (FCM) clustering method to cluster the data, that is, to distinguish the various operation states of the motor.
Finally, the data is defuzzified, and the membership of each data point is displayed as a percentage, so that the user can assess the high-voltage motor's operating state and make the most suitable maintenance decision.
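The final display step, showing each sample's degree of membership in the operation states as percentages, might be sketched as follows. The three state prototypes and the 2-D electric-index features here are hypothetical, not taken from the thesis:

```python
import numpy as np

def membership_percentages(x, prototypes, m=2.0):
    """FCM-style memberships of a feature vector x in each state prototype,
    returned as percentages that sum to 100."""
    d = np.linalg.norm(prototypes - x, axis=1) + 1e-12
    u = 1.0 / np.sum((d[:, None] / d[None, :]) ** (2 / (m - 1)), axis=1)
    return 100.0 * u

# Hypothetical 2-D electric-index prototypes for three operation states
states = ["normal", "warning", "dangerous"]
P = np.array([[0.0, 0.0], [1.0, 1.0], [2.5, 2.5]])
pct = membership_percentages(np.array([0.2, 0.1]), P)
for name, p in zip(states, pct):
    print(f"{name}: {p:.1f}%")
```

A sample near the normal prototype reports a dominant "normal" percentage, giving the operator the kind of graded warning signal the abstract describes.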
APA, Harvard, Vancouver, ISO, and other styles
45

Sadri, Sara. "Frequency Analysis of Droughts Using Stochastic and Soft Computing Techniques." Thesis, 2010. http://hdl.handle.net/10012/5198.

Full text
Abstract:
In the Canadian Prairies, recurring droughts are one of the realities which can have significant economic, environmental, and social impacts. For example, droughts in 1997 and 2001 cost over $100 million across different sectors. Drought frequency analysis is a technique for analyzing how frequently a drought event of a given magnitude may be expected to occur. In this study the state of the science related to frequency analysis of droughts is reviewed and studied. The main contributions of this thesis include the development of a model in Matlab which uses the qualities of Fuzzy C-Means (FCM) clustering and corrects the formed regions to meet the criteria of effective hydrological regions. In FCM, each site has a degree of membership in each of the clusters. The developed algorithm is flexible, taking the number of regions and the return period as inputs and showing the final corrected clusters as output for most scenarios. While drought is considered a bivariate phenomenon, with the two statistical variables of duration and severity to be analyzed simultaneously, an important step in this study is increasing the complexity of the initial Matlab model to correct regions based on L-comoment statistics (as opposed to L-moments). Implementing a reasonably straightforward approach for bivariate drought frequency analysis using bivariate L-comoments and copulas is another contribution of this study. Quantile estimation at ungauged sites for return periods of interest is studied by introducing two new classes of neural network and machine learning: Radial Basis Function (RBF) and Support Vector Machine Regression (SVM-R). These two techniques are selected based on their strong track record in the literature for function estimation and nonparametric regression. The functionalities of RBF and SVM-R are compared with the traditional nonlinear regression (NLR) method.
As well, a nonlinear regression with regionalization method in which catchments are first regionalized using FCMs is applied and its results are compared with the other three models. Drought data from 36 natural catchments in the Canadian Prairies are used in this study. This study provides a methodology for bivariate drought frequency analysis that can be practiced in any part of the world.
APA, Harvard, Vancouver, ISO, and other styles
46

Yung-Fu, Tsai. "Multiple-pose Face Detection Using Fuzzy C-means Clustering." 2005. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-2607200512112000.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Ssu-Min, Yang, and 楊斯閔. "Kernel-Based Fuzzy c-Means Clustering Algorithm Hardware Implementation." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/72322960297601595511.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Tsai, Yung-Fu, and 蔡永富. "Multiple-pose Face Detection Using Fuzzy C-means Clustering." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/27909134956224960721.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Information Management
Academic year 93 (2004–2005)
The challenges of face detection in images come from variations in pose, facial expression, occlusion, lighting conditions, and so on. We propose a method for multiple-pose face detection in still images. Our proposed method consists of three phases. First, skin pixels are extracted using a skin color model, and connected component analysis is performed to find the skin regions. Second, before extracting the feature vector of a skin region, we apply edge detection to the region. Our feature vector consists of two parts: the first is obtained by dividing the edge image into 3*4 grids and counting the horizontal and vertical edges in each grid; the other is obtained by computing a summary of the color correlogram of the edge image. Third, with a set of training images, the fuzzy c-means (FCM) clustering algorithm is used to build face models. If the Euclidean distance between a feature vector and a face model does not exceed a predefined threshold, the region is classified as a face. The experimental results show that our method can deal with variations in pose, rotation, scale, and so on.
APA, Harvard, Vancouver, ISO, and other styles
49

Ma, Yen-Ting, and 馬嬿婷. "Sample and cluster weighted fuzzy c-means clustering algorithms." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/tpkev7.

Full text
Abstract:
Master's thesis
Chung Yuan Christian University
Graduate Institute of Applied Mathematics
Academic year 102 (2013–2014)
In fuzzy cluster analysis, the fuzzy c-means (FCM) clustering algorithm is the most well-known and widely used method, and various generalizations of FCM have been proposed. In order to reduce the influence of outliers and noisy points on the clustering results, Yu, Yang and Lee (2011) proposed sample-weighted clustering methods that apply the maximum entropy principle to automatically calculate the sample weights, so as to increase the robustness of the algorithm. The purpose of this thesis is to give a cluster-weighted version of sample-weighted FCM, called the sample and cluster weighted fuzzy c-means (SCW-FCM) clustering algorithm. We apply SCW-FCM to real data sets, and the results demonstrate that SCW-FCM is more effective than SW-FCM.
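The sample-weighting idea referenced above might be sketched as follows. This is only the generic maximum-entropy solution form (weights exponential in a per-sample clustering cost), with a hypothetical temperature parameter β; the exact objective in Yu, Yang and Lee (2011) may differ:

```python
import numpy as np

def entropy_sample_weights(costs, beta=1.0):
    """Maximum-entropy-style weights (illustrative form only): samples with
    a larger clustering cost, e.g. outliers far from all cluster centers,
    receive exponentially smaller weight."""
    w = np.exp(-beta * (costs - costs.min()))  # shift for numerical stability
    return w / w.sum()

costs = np.array([0.1, 0.2, 0.15, 5.0])  # last sample looks like an outlier
w = entropy_sample_weights(costs)
print(w.round(3))                         # outlier receives near-zero weight
```

Down-weighting high-cost samples this way is what makes the weighted objective robust: the outlier contributes almost nothing to the center updates.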
APA, Harvard, Vancouver, ISO, and other styles
50

Chu, Chih-Yu, and 朱致宇. "Robust Fuzzy C-Means Clustering Algorithm for Interval Data." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/b8ne4x.

Full text
Abstract:
Master's thesis
Chung Yuan Christian University
Graduate Institute of Applied Mathematics
Academic year 106 (2017–2018)
In fuzzy clustering, the fuzzy c-means (FCM) algorithm is the most widely used clustering method, and many extensions of FCM have been proposed in the literature. However, the FCM algorithm and its extensions are usually affected by initialization and parameter selection, with the number of clusters to be given a priori. Although some works have addressed these problems, no existing FCM variant is simultaneously robust to initialization and parameter selection, free of the fuzziness index m, and free of a pre-specified number of clusters. FCM is also restricted in its ability to handle interval-type measurement scales. In this thesis, we extend the robust-learning fuzzy c-means clustering algorithm to interval data, calling it the robust-learning fuzzy c-means clustering algorithm for interval data (I-RLFCM), and demonstrate its effectiveness on interval datasets.
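A common way to extend FCM-style distances to interval data, and a reasonable reading of the setting above (though the thesis's exact formulation may differ), is to measure the gap between two interval vectors through both their lower and upper bounds:

```python
import numpy as np

def interval_dist2(A, B):
    """Squared distance between interval vectors A and B, each of shape
    (p, 2) with columns [lower, upper]: sum over features of the squared
    gaps between lower bounds plus squared gaps between upper bounds."""
    return np.sum((A[:, 0] - B[:, 0]) ** 2 + (A[:, 1] - B[:, 1]) ** 2)

# Two hypothetical one-feature interval observations
x = np.array([[2.0, 4.0]])   # interval [2, 4]
v = np.array([[3.0, 5.0]])   # interval [3, 5]
print(interval_dist2(x, v))  # (2-3)^2 + (4-5)^2 = 2.0
```

Plugging such a distance into the usual FCM membership and center updates (with centers also represented as intervals) is the standard route to interval-data variants of the algorithm.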
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography