Academic literature on the topic 'Clustering (intelligence artificielle)'
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Clustering (intelligence artificielle).'
Dissertations / Theses on the topic "Clustering (intelligence artificielle)"
Lévy, Loup-Noé. "Advanced Clustering and AI-Driven Decision Support Systems for Smart Energy Management." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG027.
This thesis addresses the clustering of complex and heterogeneous energy systems within a Decision Support System (DSS).
In Chapter 1, we delve into the theory of complex systems and their modeling, recognizing buildings as complex systems, specifically as Sociotechnical Complex Systems. We examine the state of the art of the different agents involved in energy performance within the energy sector, identifying our case study as the Trusted Third Party for Energy Measurement and Performance (TTPEMP). Given our constraints, we opt to concentrate on the need for a DSS to provide energy recommendations. We compare this system to supervision and recommender systems, highlighting their differences and complementarities, and introduce the necessity for explainability in AI-aided decision-making (XAI). Acknowledging the complexity, numerosity, and heterogeneity of buildings managed by the TTPEMP, we argue that clustering serves as a pivotal first step in developing a DSS, enabling tailored recommendations and diagnostics for homogeneous subgroups of buildings.
In Chapter 2, we explore the state of the art of DSSs, emphasizing the need for governance in semi-automated systems for high-stakes decision-making. We investigate European regulations, highlighting the need for accuracy, reliability, and fairness in our decision system, and identify methodologies to address these needs, such as the DevOps methodology and Data Lineage. We propose a DSS architecture that addresses these requirements and the challenges posed by big data, featuring a distributed architecture comprising a data lake for heterogeneous data handling, datamarts for specific data selection and processing, and an ML-Factory populating a model library. Different types of methods are selected for different needs, based on the specificities of the data and of the question to be answered.
Chapter 3 focuses on clustering as a primary machine learning method in our architecture, essential for identifying homogeneous groups of buildings. Because the data describing buildings combine numerical, categorical, and time-series attributes, we coin the term complex clustering to address this combination of data types. After reviewing the state of the art, we identify the need for dimensionality reduction techniques and the most relevant mixed clustering methods. We also introduce Pretopology as an innovative approach for mixed and complex data clustering. We argue that it allows for greater explainability and interactivity in the clustering, as it enables hierarchical clustering and the implementation of logical rules and custom proximity notions. The challenges of evaluating clustering are addressed, and adaptations of numerical clustering to mixed and complex clustering are proposed, taking into account the explainability of the methods.
In the datasets and results chapter, we present the public, private, and generated datasets used for experimentation and discuss the clustering results. We analyze the computational performance of the algorithms and the quality of the clusters obtained on different datasets varying in size, number of clusters, distribution, and number of categorical and numerical parameters. Pretopology and Dimensionality Reduction show promising results compared to state-of-the-art mixed data clustering methods.
Finally, we discuss our system's limitations, including the automation limits of the DSS at each step of the data flow. We focus on the critical role of data quality and the challenges in predicting the behavior of complex systems over time. The objectivity of our clustering evaluation methods is challenged by the absence of ground truth and the reliance on dimensionality reduction to adapt state-of-the-art metrics to complex data. We discuss possible issues regarding the chosen elbow method, as well as future work such as automating hyperparameter tuning and continuing the development of the DSS.
Rastin, Parisa. "Automatic and Adaptive Learning for Relational Data Stream Clustering." Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCD052.
The research work presented in this thesis concerns the development of unsupervised learning approaches adapted to large relational and dynamic datasets. The combination of these three characteristics (size, complexity and evolution) is a major challenge in the field of data mining, and few satisfactory solutions exist at the moment, despite the obvious needs of companies. This is a real challenge, because the approaches adapted to relational data have a quadratic complexity, unsuited to the analysis of dynamic data. We propose here two complementary approaches for the analysis of this type of data. The first approach is able to detect well-separated clusters from a signal created during an incremental reordering of the dissimilarity matrix, with no parameter to choose (e.g., the number of clusters). The second proposes to use support points among the objects in order to build a representation space in which representative prototypes of the clusters are defined. Finally, we apply the proposed approaches to real-time profiling of connected users. Profiling tasks are designed to recognize the "state of mind" of users through their navigation on different websites.
Guillon, Arthur. "Opérateurs de régularisation pour le subspace clustering flou." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS121.
Subspace clustering is a data mining task which consists in simultaneously identifying groups of similar data and making this similarity explicit, for example by selecting features characteristic of the groups. In this thesis, we consider a specific family of fuzzy subspace clustering models, which are based on the minimization of a cost function. We propose three desirable qualities of clustering, which are absent from the solutions computed by the previous models. We then propose simple penalty terms which we use to encode these properties in the original cost functions. Some of these terms are non-differentiable, and the standard techniques of fuzzy clustering cannot be applied to minimize the new cost functions. We thus propose a new, generic optimization algorithm, which extends the standard approach by combining alternate optimization and proximal gradient descent. We then instantiate this algorithm with operators minimizing the three previous penalty terms and show that the resulting algorithms possess the corresponding qualities.
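The proximal gradient descent mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an L1 penalty, whose proximal operator is soft-thresholding; this is a standard textbook case chosen for illustration, not one of the three penalty terms or operators of the thesis, and the function names are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||x||_1, applied coordinate-wise."""
    out = []
    for x in v:
        if x > t:
            out.append(x - t)
        elif x < -t:
            out.append(x + t)
        else:
            out.append(0.0)
    return out


def proximal_gradient(grad, prox, x0, step, n_iter=200):
    """Iterate x <- prox(x - step * grad(x), step) for a fixed number of steps."""
    x = list(x0)
    for _ in range(n_iter):
        g = grad(x)
        x = prox([xi - step * gi for xi, gi in zip(x, g)], step)
    return x


# Toy problem (assumed for illustration): minimize 0.5 * ||x - b||^2 + lam * ||x||_1.
b = [3.0, -0.5, 0.2]
lam = 1.0


def grad(x):
    """Gradient of the smooth part 0.5 * ||x - b||^2."""
    return [xi - bi for xi, bi in zip(x, b)]


def prox(v, step):
    """Proximal step for the non-differentiable part lam * ||x||_1."""
    return soft_threshold(v, lam * step)


solution = proximal_gradient(grad, prox, [0.0, 0.0, 0.0], step=1.0)
```

The smooth part of the cost is handled by an ordinary gradient step, while the non-differentiable penalty is handled only through its proximal operator, which is why such a scheme can be combined with penalties that rule out plain gradient descent.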
Sarazin, Tugdual. "Apprentissage massivement distribué dans un environnement Big Data." Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCD050.
In recent years, the amount of data analysed by companies and research laboratories has increased strongly, opening the era of Big Data. However, these raw data are frequently uncategorized and difficult to use. This thesis aims to improve and ease the preprocessing and comprehension of these large amounts of data by using unsupervised machine learning algorithms.
The first part of this thesis is dedicated to a review of the state of the art in clustering and biclustering algorithms and to an introduction to big data technologies. It then introduces the design of a Self-Organizing Map clustering algorithm [Kohonen, 2001] in a big data environment. Our algorithm (SOM-MR) provides the same advantages as the original algorithm, namely the creation of a data visualisation map based on data clusters. Moreover, it uses the Spark platform, which makes it able to process a large amount of data in a short time. Thanks to the popularity of this platform, it easily fits into many data mining environments. We demonstrated this in our project "Square Predict", carried out in partnership with Axa insurance. The aim of this project was to provide a real-time data analysis platform in order to estimate the severity of natural disasters or improve knowledge of residential risks. Throughout this project, we proved the efficiency of our algorithm through its capacity to analyse and create visualisations from a large volume of data coming from social networks and open data.
The second part of this work is dedicated to a new bi-clustering algorithm. Bi-clustering consists in clustering observations and variables at the same time. In this contribution, we put forward a new approach to bi-clustering based on the self-organizing maps algorithm that can scale to large amounts of data (BiTM-MR). To reach this goal, this algorithm is also based on the Spark platform. It brings out more information than the SOM-MR algorithm because, besides producing observation groups, it also associates variables with these groups, thus creating bi-clusters of variables and observations.
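For readers unfamiliar with Self-Organizing Maps, the following is a minimal single-machine sketch of the classic online SOM update, on which such algorithms build. It is not SOM-MR or BiTM-MR and does not use Spark; the grid size, learning rate, neighborhood width, and function names are assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((w - xi) ** 2 for w, xi in zip(weights[pos], x)),
    )


def som_step(weights, x, lr, sigma):
    """One online update: pull every unit toward x, scaled by its grid distance to the BMU."""
    bmu = best_matching_unit(weights, x)
    for pos, w in list(weights.items()):
        grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
        h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood weight
        weights[pos] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu


# Toy run: a 3x3 map trained on four 2-D points (all values are illustrative).
rng = random.Random(0)
grid = {(i, j): [rng.random(), rng.random()] for i in range(3) for j in range(3)}
data = [[0.1, 0.1], [0.9, 0.9], [0.1, 0.9], [0.9, 0.1]]
for epoch in range(50):
    for x in data:
        som_step(grid, x, lr=0.3, sigma=1.0)
```

Because neighboring units on the grid are updated together, nearby map cells end up representing similar data, which is what makes the map usable as a visualisation of the clusters.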
Thépaut, Solène. "Problèmes de clustering liés à la synchronie en écologie : estimation de rang effectif et détection de ruptures sur les arbres." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS477/document.
In view of current global changes, largely caused by human activities, it becomes urgent to understand the drivers of communities' stability. Synchrony between time series of abundances is one of the most important mechanisms. This thesis offers three different angles in order to answer different questions linked to interspecific and spatial synchrony. The works presented find applications beyond the ecological frame. A first chapter is dedicated to the estimation of the effective rank of matrices in ℝ or ℂ. We offer tools for measuring the synchronisation rate of observation matrices. In the second chapter, we build on existing work on the change-point detection problem on chains in order to offer algorithms which detect change-points on trees. The methods can be used with most data that have to be represented as a tree. In order to study the link between interspecific synchrony and long-term tendencies or traits of butterfly species, we offer in the last chapter adaptations of clustering and supervised machine learning methods, such as Random Forest or Artificial Neural Networks, to ecological data.
Masmoudi, Nesrine. "Modèle bio-inspiré pour le clustering de graphes : application à la fouille de données et à la distribution de simulations." Thesis, Normandie, 2017. http://www.theses.fr/2017NORMLH26/document.
In this work, we present a novel method based on the behavior of real ants for solving the unsupervised non-hierarchical classification problem. This approach dynamically creates data groups. It is based on the concept of artificial ants moving simultaneously, in a complex manner, under simple location rules. Each ant represents a data item in the algorithm. The movements of the ants aim to create homogeneous data groups that evolve together in a graph structure. We also propose a method for incrementally building neighborhood graphs with artificial ants. We propose two approaches derived from biomimetic algorithms. They are hybrid in the sense that the initial search for the number of classes is performed by the classical K-Means algorithm, which is used to initialize the first partition and the graph structure.
Sublemontier, Jacques-Henri. "Classification non supervisée : de la multiplicité des données à la multiplicité des analyses." Phd thesis, Université d'Orléans, 2012. http://tel.archives-ouvertes.fr/tel-00801555.
Falih, Issam. "Attributed Network Clustering : Application to recommender systems." Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCD011/document.
In the field of complex network analysis, much effort has been focused on identifying communities of related nodes with dense internal connections and few external connections. In addition to node connectivity information, which is mostly composed of different types of links, most real-world networks also contain node and/or edge attributes, which can be very relevant during the learning process to find the groups of nodes, i.e., communities. In this case, two types of information are available: graph data representing the relationships between objects, and attribute information characterizing the objects, i.e., nodes. Classic community detection and data clustering techniques handle either one of the two types but not both. Consequently, the resulting clustering may not only miss important information but also lead to inaccurate findings. Therefore, various methods have been developed to uncover communities in networks by combining structural and attribute information, such that nodes in a community are not only densely connected but also share similar attribute values. Such graph-shaped data is often referred to as an attributed graph.
This thesis focuses on developing algorithms and models for attributed graphs. Specifically, I focus in the first part on the different types of edges, which represent different types of relations between vertices. I propose a new clustering algorithm and also present a redefinition of the principal metrics that deal with this type of network.
Then, I tackle the problem of clustering using node attribute information by describing a new, original community detection algorithm that uncovers communities in node-attributed networks using structural and attribute information simultaneously. Finally, I propose a collaborative filtering model in which I apply the proposed clustering algorithms.
Boudane, Abdelhamid. "Fouille de données par contraintes." Thesis, Artois, 2018. http://www.theses.fr/2018ARTO0403/document.
In this thesis, we address the well-known clustering and association rule mining problems. Our first contribution introduces a new clustering framework, where complex objects are described by propositional formulas. First, we extend the two well-known k-means and hierarchical agglomerative clustering techniques to deal with these complex objects. Second, we introduce a new divisive algorithm for clustering objects represented explicitly by sets of models. Finally, we propose a propositional satisfiability based encoding of the problem of clustering propositional formulas without the need for an explicit representation of their models. In a second contribution, we propose a new propositional satisfiability based approach to mine association rules in a single step. The task is modeled as a propositional formula whose models correspond to the rules to be mined. To highlight the flexibility of our proposed framework, we also address other variants, namely the closed, minimal non-redundant, most general and indirect association rule mining tasks. Experiments on many datasets show that on the majority of the considered association rule mining tasks, our declarative approach achieves better performance than the state-of-the-art specialized techniques.
Boutalbi, Rafika. "Model-based tensor (co)-clustering and applications." Electronic Thesis or Diss., Université Paris Cité, 2020. https://wo.app.u-paris.fr/cgi-bin/WebObjects/TheseWeb.woa/wa/show?t=7172&f=55867.
Clustering, which seeks to group together similar data points according to a given criterion, is an important unsupervised learning technique to deal with large-scale data. In particular, given a data matrix where rows represent objects and columns represent features, clustering aims to partition only one dimension of the matrix at a time, by clustering either objects or features. Although successfully applied in several application domains, clustering techniques are often challenged by certain characteristics exhibited by some datasets, such as high dimensionality and sparsity. When it comes to such data, co-clustering techniques, which allow the simultaneous clustering of rows and columns of a data matrix, have proven to be more beneficial. In particular, co-clustering techniques allow the exploitation of the inherent duality between the object set and the feature set, which makes them more effective even if we are interested in the clustering of only one dimension of our data matrix. In addition, co-clustering turns out to be more efficient, since compressed matrices are used at each step of the process instead of the whole matrix, as in traditional clustering. Although co-clustering approaches have been successfully applied in a variety of applications, existing approaches are specially tailored for datasets represented by double-entry tables. However, in several real-world applications, two dimensions are not sufficient to represent the dataset. For example, if we consider the article clustering problem, several kinds of information linked to the articles can be collected, such as common words, co-authors and citations, which naturally leads to a tensorial representation. Intuitively, leveraging all this information would lead to a better clustering quality. In particular, two articles that share a large set of words, authors and citations are very likely to be similar. Despite the great interest of tensor co-clustering models, research works are extremely limited in this context and rely, for most of them, on tensor factorization methods. Inspired by the famous statement made by Jean-Paul Benzécri, "The model must follow the data and not vice versa", we have chosen in this thesis to rely on appropriate mixture models. More explicitly, we propose several new co-clustering models which are specially tailored for tensorial representations as well as robust to data sparsity. Our contribution can be summarized as follows. First, we propose to extend the LBM (Latent Block Model) formalism to take into account tensorial structures. More specifically, we present Tensor LBM (TLBM), a powerful tensor co-clustering model that we successfully applied to diverse kinds of data. Moreover, we highlight that the derived algorithm, VEM-T, reveals the most meaningful co-clusters from tensor data. Second, we develop a novel Sparse TLBM taking into account sparsity. We extend its use to the management of multiple graphs (or multi-view graphs), leading to implicit consensus clustering of multiple graphs. As a last contribution of this thesis, we propose a new co-clusterwise method which integrates co-clustering in a supervised learning framework. These contributions have been successfully evaluated on tensorial data from various fields, ranging from recommendation systems, clustering of hyperspectral images and categorization of documents, to waste management optimization. They also allow us to envisage interesting and immediate future research avenues, for instance the extension of the proposed models to tri-clustering and multivariate time series.
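The remark in the abstract above about compressed matrices can be made concrete with a small sketch for the ordinary two-way case. Given a row partition and a column partition, the data matrix is summarized by one value per block (here the block mean). This is only an illustration of the compression idea under assumed, fixed partitions; it is not the TLBM model or the VEM-T algorithm, and the function name is hypothetical.

```python
def block_means(matrix, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """Summarize a matrix by the mean of each (row cluster, column cluster) block."""
    sums = [[0.0] * n_col_clusters for _ in range(n_row_clusters)]
    counts = [[0] * n_col_clusters for _ in range(n_row_clusters)]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            r, c = row_labels[i], col_labels[j]
            sums[r][c] += value
            counts[r][c] += 1
    # Empty blocks are reported as 0.0 to avoid dividing by zero.
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0
         for c in range(n_col_clusters)]
        for r in range(n_row_clusters)
    ]


# Toy data: 3 objects x 4 features with an obvious two-by-two block structure.
X = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 1, 1],
]
summary = block_means(
    X,
    row_labels=[0, 0, 1],
    col_labels=[0, 0, 1, 1],
    n_row_clusters=2,
    n_col_clusters=2,
)
```

Here the 3 x 4 matrix is reduced to a 2 x 2 summary, which is the kind of smaller object a co-clustering procedure can work with when it alternates between updating row and column partitions.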