Dissertations / Theses on the topic 'Extraction de Connaissances de Données'
Consult the top 50 dissertations / theses for your research on the topic 'Extraction de Connaissances de Données.'
Azé, Jérôme. "Extraction de Connaissances à partir de Données Numériques et Textuelles." Phd thesis, Université Paris Sud - Paris XI, 2003. http://tel.archives-ouvertes.fr/tel-00011196.
The analysis of such data is often constrained by the definition of a minimum support threshold used to filter out uninteresting knowledge. Domain experts often find it difficult to set this threshold. We proposed a method, based on quality measures, that removes the need to fix a minimum support. We focused on extracting knowledge in the form of association rules. To be considered interesting and presented to the expert, these rules must satisfy one or more quality criteria. We proposed two quality measures that combine several criteria and allow interesting rules to be extracted, and we designed an algorithm that extracts these rules without the minimum-support constraint. The behaviour of this algorithm was studied in the presence of noisy data, highlighting the difficulty of automatically extracting reliable knowledge from such data. One of the solutions we proposed consists in evaluating the noise resistance of each rule and reporting it to the expert during the analysis and validation of the extracted knowledge. Finally, a study on real data was carried out as part of a text-mining process. The knowledge sought in these texts consists of association rules between expert-defined concepts specific to the domain under study. We proposed a tool that extracts this knowledge and assists the expert in validating it. The results show that interesting knowledge can be obtained from textual data while minimising the expert's involvement in the association-rule extraction phase.
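As a rough illustration of the support-free, quality-measure-driven mining described above, the sketch below scores candidate association rules with two classical measures (confidence and lift) instead of pruning by minimum support. The toy transactions, the single-antecedent rule format and the thresholds are illustrative assumptions, not the thesis's actual measures or algorithm.

```python
from itertools import combinations

# Toy transaction database (illustrative only).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "tea"},
    {"bread", "milk"},
    {"bread", "butter", "tea"},
]

def freq(itemset):
    """Relative frequency of an itemset in the transactions."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Enumerate single-antecedent rules and keep those passing quality
# criteria (confidence and lift), with no minimum-support filter.
items = set().union(*transactions)
for a, b in combinations(sorted(items), 2):
    for x, y in ((a, b), (b, a)):
        support_xy = freq({x, y})
        if support_xy == 0:
            continue
        confidence = support_xy / freq({x})
        lift = confidence / freq({y})
        if confidence >= 0.6 and lift > 1.0:   # illustrative thresholds
            print(f"{x} -> {y}: conf={confidence:.2f}, lift={lift:.2f}")
```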
Masseglia, Florent. "Extraction de connaissances : réunir volumes de données et motifs significatifs." Habilitation à diriger des recherches, Université de Nice Sophia-Antipolis, 2009. http://tel.archives-ouvertes.fr/tel-00788309.
Dubois, Vincent. "Apprentissage approximatif et extraction de connaissances à partir de données textuelles." Nantes, 2003. http://www.theses.fr/2003NANT2001.
Jouve, Pierre-Emmanuel. "Apprentissage non supervisé et extraction de connaissances à partir de données." Lyon 2, 2003. http://theses.univ-lyon2.fr/documents/lyon2/2003/jouve_pe.
Zeitouni, Karine. "Analyse et extraction de connaissances des bases de données spatio-temporelles." Habilitation à diriger des recherches, Université de Versailles-Saint Quentin en Yvelines, 2006. http://tel.archives-ouvertes.fr/tel-00325468.
Gaumer, Gaëtan. "Résumé de données en extraction de connaissances à partir des données (ECD) : application aux données relationnelles et textuelles." Nantes, 2003. http://www.theses.fr/2003NANT2025.
Godreau, Victor. "Extraction des connaissances à partir des données de la surveillance de l'usinage." Thesis, Nantes, 2017. http://www.theses.fr/2017NANT4104.
In the field of Industry 4.0 research, process monitoring is a key issue. Milling machines sit at the centre of a large flow of measurable information that can be used to improve company processes. These processes (design, industrialization, quality, maintenance) all have an interest in field manufacturing data for their continuous improvement. It is therefore necessary to capitalize on this data flow and transform it into criteria relevant to every department. Chatter is an instability phenomenon of the cut during machining; it deteriorates the surface quality of machined parts. In a first part, a numerical model was created to link the vibrations measured during machining to their impact on finished-part quality, so that new data on quality issues could be collected. In a second part, knowledge discovery in databases methods are adapted and applied to the monitoring data. This study concerns a maintenance issue and seeks to answer the question: which kinds of machining events affect the wear of machining spindles? Finally, the last part discusses the integration of monitoring systems into companies' information systems and the computation of new Key Performance Indicators (KPIs) adapted to the specific needs of each factory, in order to exploit the full potential of the monitoring data.
Bendou, Mohamed. "Extraction de connaissances à partir des données à l'aide des réseaux bayésiens." Paris 11, 2003. http://www.theses.fr/2003PA112053.
The main objective of this thesis is to develop a new kind of learning algorithm for Bayesian networks that is more accurate, more efficient and more robust in the presence of noise, and therefore suited to KDD tasks. Since most local optima in the space of Bayesian network structures are caused directly by the existence of equivalence classes (sets of structures encoding the same conditional independence relations, represented by partially directed graphs), we concentrated an important part of our research on the development of a new family of learning algorithms, EQ, which directly explore the space of equivalence classes. We also developed theoretical and algorithmic tools for the analysis and treatment of partially directed graphs. We were able to demonstrate that this kind of approach brings meaningful precision gains in a time comparable to that of classical approaches. We thereby contributed to the current renewal of interest in learning equivalence classes of Bayesian networks, long considered too complex by the scientific community. Finally, another aspect of our research was dedicated to analysing the effect of noise in the data on the learning of Bayesian networks. We analysed and explained the increased complexity of Bayesian networks learned from noisy data and showed that, unlike the classical over-fitting that affects other classes of learning methods, this phenomenon is theoretically justified by the alteration of the conditional independence relations between variables and is beneficial to the predictive power of the learned models.
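For background on the equivalence classes the abstract discusses: two DAGs are Markov-equivalent iff they share the same skeleton and the same v-structures (the classical Verma-Pearl characterization). The sketch below implements only that test, on a hypothetical parent-set encoding; it is not the thesis's EQ algorithms.

```python
from itertools import combinations

def skeleton(dag):
    """Undirected edge set of a DAG given as {node: set(parents)}."""
    return {frozenset((u, v)) for v, ps in dag.items() for u in ps}

def v_structures(dag):
    """Colliders a -> c <- b with a and b non-adjacent."""
    skel = skeleton(dag)
    return {
        (frozenset((a, b)), c)
        for c, ps in dag.items()
        for a, b in combinations(sorted(ps), 2)
        if frozenset((a, b)) not in skel
    }

def markov_equivalent(g1, g2):
    return skeleton(g1) == skeleton(g2) and v_structures(g1) == v_structures(g2)

# X -> Y -> Z and X <- Y <- Z encode the same independences...
g1 = {"X": set(), "Y": {"X"}, "Z": {"Y"}}
g2 = {"Z": set(), "Y": {"Z"}, "X": {"Y"}}
# ...while the collider X -> Y <- Z does not.
g3 = {"X": set(), "Z": set(), "Y": {"X", "Z"}}

print(markov_equivalent(g1, g2))  # True
print(markov_equivalent(g1, g3))  # False
```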
Munteanu, Paul. "Extraction de connaissances dans les bases de données parole : apport de l'apprentissage symbolique." Grenoble INPG, 1996. http://www.theses.fr/1996INPG0207.
Ghoorah, Anisah. "Extraction de Connaissances pour la Modelisation tri-dimensionnelle de l'Interactome Structural." Phd thesis, Université de Lorraine, 2012. http://tel.archives-ouvertes.fr/tel-00762444.
Ghoorah, Anisah W. "Extraction de connaissances pour la modélisation tri-dimensionnelle de l'interactome structural." Thesis, Université de Lorraine, 2012. http://www.theses.fr/2012LORR0204/document.
Understanding how the protein interactome works at a structural level could provide useful insights into the mechanisms of diseases. Comparative homology modelling and ab initio protein docking are two computational methods for modelling the three-dimensional (3D) structures of protein-protein interactions (PPIs). Previous studies have shown that both methods give significantly better predictions when they incorporate experimental PPI information. In general, however, PPI information is often not available in an easily accessible way and cannot be re-used by 3D PPI modelling algorithms; hence there is currently a need for a reliable framework to facilitate the reuse of PPI data. This thesis presents a systematic knowledge-based approach for representing, describing and manipulating 3D interactions in order to study PPIs on a large scale and to facilitate knowledge-based modelling of protein-protein complexes. Its main contributions are: (1) it describes an integrated database of non-redundant 3D hetero domain interactions; (2) it presents a novel method for describing and clustering domain-domain interactions (DDIs) according to the spatial orientations of the binding partners, thus introducing the notion of "domain family-level binding sites" (DFBS); (3) it proposes a structural classification of DFBSs similar to the CATH classification of protein folds, and it presents a study of the secondary structure propensities and interaction preferences of DFBSs; (4) it introduces a systematic case-based reasoning approach to model, on a large scale, the 3D structures of protein complexes from existing structural DDIs. All these contributions have been made publicly available through a web server (http://kbdock.loria.fr).
Objois, Matthieu. "Langages de requêtes temporels, extraction de connaissances temporelles et application aux flux de données." Paris 11, 2007. http://www.theses.fr/2007PA112092.
A temporal database can be seen as a finite sequence of classical relational databases. Within this framework, we first consider an open problem concerning the relative expressive power of known temporal query languages: mu-TL (Vardi, 1988) on the one hand, and T-FIXPOINT and T-WHILE (Abiteboul et al., 1999) on the other. We prove that these languages are equivalent over most temporal databases. Observing that known temporal query languages do not allow temporal information to be extracted, we then introduce and define query languages able to extract such information, and we analyse their properties. Finally, we consider data streams. In the literature, two paradigms have been introduced to continuously query streams: the single-data approach and the window approach. We formalize both paradigms by means of Turing-like state machines, and we show that the two kinds of machines have the same expressive power under certain hypotheses.
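To make the two continuous-querying paradigms concrete, here is a minimal sketch under my own reading of them (item-at-a-time processing versus re-evaluation over a sliding window); the stream, the query and the window size are invented, and this is not the thesis's formal state-machine model.

```python
from collections import deque

stream = [3, 8, 5, 12, 7, 9, 15, 4]

# Single-data paradigm (hypothetical reading): each arriving item
# updates an incremental answer, e.g. a running maximum.
running_max = None
for x in stream:
    running_max = x if running_max is None else max(running_max, x)
print("running max:", running_max)

# Window paradigm: the query is re-evaluated over a sliding window
# of the most recent items.
window = deque(maxlen=3)
for x in stream:
    window.append(x)
    if len(window) == window.maxlen:
        print("window", list(window), "-> max", max(window))
```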
Dahabiah, Anas. "Extraction de connaissances et indexation de données multimédia pour la détection anticipée d'événements indésirables." Télécom Bretagne, 2010. http://www.theses.fr/2010TELB0117.
Similarity measurement is the cornerstone of most data mining techniques and tasks, in which information elements can be of any type (quantitative, qualitative, binary, ordinal, etc.) and may be affected by various forms of imperfection (uncertainty, imprecision, ambiguity, etc.). Additionally, the points of view of experts and data owners must sometimes be considered and integrated, even when expressed in an ambiguous or imprecise manner. Existing methods and approaches, however, have each handled only some of these aspects while disregarding the others: heterogeneity, imperfection and personalization have been dealt with separately in prior work, under constraints and assumptions that can overburden the procedure, limit its applications, and increase its computing time, a crucial issue in data mining. In this thesis, we propose a novel approach, essentially based on possibility theory, that deals with all the aforementioned aspects within a unified, general, integrated framework. To gain deeper insight into and understanding of the information elements, the possibilistic modelling is materialized via spatial, graphical and structural representations and applied to several data mining tasks using a medical database.
Vandromme, Maxence. "Optimisation combinatoire et extraction de connaissances sur données hétérogènes et temporelles : application à l’identification de parcours patients." Thesis, Lille 1, 2017. http://www.theses.fr/2017LIL10044.
Hospital data exhibit numerous specificities that make traditional data mining tools hard to apply. In this thesis, we focus on the heterogeneity of hospital data and on their temporal aspect. This work was carried out within the framework of the ANR ClinMine research project and a CIFRE partnership with the Alicante company. We propose two new knowledge discovery methods suited to hospital data, each able to perform a variety of tasks: classification, prediction, discovery of patient profiles, etc. In the first part, we introduce MOSC (Multi-Objective Sequence Classification), an algorithm for supervised classification on heterogeneous, numeric and temporal data. In addition to binary and symbolic terms, this method uses numeric terms and sequences of temporal events to form sets of classification rules; MOSC is the first classification algorithm able to handle these types of data simultaneously. In the second part, we introduce HBC (Heterogeneous BiClustering), a biclustering algorithm for heterogeneous data, a problem that had not been studied before. This algorithm is extended to support temporal data of various types: temporal events and unevenly sampled time series. HBC is used in a case study on a set of hospital data whose goal is to identify groups of patients sharing a similar profile. The results make sense from a medical viewpoint; they indicate that relevant, and sometimes new, knowledge is extracted from the data, and they lead to further, more precise case studies. The integration of HBC into a software product is also under way, with the implementation of a parallel version and a visualization tool for biclustering results.
Rioult, François. "Extraction de connaissances dans les bases de données comportant des valeurs manquantes ou un grand nombre d'attributs." Caen, 2005. http://www.theses.fr/2005CAEN2035.
Ben, Ahmed Walid. "SAFE-Next : une approche systémique pour l'extraction de connaissances de données : application à la construction et à l'interprétation de scénarios d'accidents de la route." Châtenay-Malabry, Ecole centrale de Paris, 2005. http://www.theses.fr/2005ECAP0982.
Plantevit, Marc. "Extraction De Motifs Séquentiels Dans Des Données Multidimensionelles." Phd thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2008. http://tel.archives-ouvertes.fr/tel-00319242.
Raïssi, Chedy. "Extraction de Séquences Fréquentes : Des Bases de Données Statiques aux Flots de Données." Phd thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2008. http://tel.archives-ouvertes.fr/tel-00351626.
Inthasone, Somsack. "Techniques d'extraction de connaissances en biodiversité." Thesis, Nice, 2015. http://www.theses.fr/2015NICE4013/document.
Biodiversity data are generally stored in different formats, which makes it difficult for biologists to combine and integrate them in order to retrieve useful information and discover novel knowledge, for example for efficiently classifying specimens. In this work, we present the BioKET data warehouse, a consolidation of heterogeneous data stored in different formats and originating from different sources. For the time being, the scope of BioKET is botanical. Its construction required, among other things, identifying and analysing existing botanical ontologies, and standardizing and relating terms in BioKET. We also developed a methodology for mapping and defining taxonomic terminologies, i.e. controlled vocabularies with hierarchical structures, from authoritative plant ontologies, Google Maps, and the OpenStreetMap geospatial information system. Data from four major biodiversity and botanical data providers and from the two aforementioned geospatial information systems were then integrated into BioKET. The usefulness of such a data warehouse was demonstrated by applying classical knowledge pattern extraction methods, based on the classical Apriori and Galois-closure approaches, to several datasets generated from BioKET extracts. Using these methods, association rules and conceptual bi-clusters were extracted to analyse the risk status of plants endemic to Laos and Southeast Asia. In addition, BioKET is interfaced with other applications and resources, such as the GeoCAT Geospatial Conservation Assessment Tool, to provide a powerful analysis tool for biodiversity data.
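The abstract names the classical Apriori and Galois-closure approaches; as a minimal reminder of the former, here is a compact Apriori sketch. The toy transactions (loosely evoking plant risk-status attributes) and the support threshold are invented for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return all frequent itemsets with their relative support."""
    n = len(transactions)
    def support(s):
        return sum(s <= t for t in transactions) / n
    # Frequent 1-itemsets.
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {s: support(s) for s in level}
    k = 2
    while level:
        # Join step, then prune using the anti-monotonicity of
        # support (every (k-1)-subset must itself be frequent).
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        level = [c for c in candidates if support(c) >= min_support]
        frequent.update({c: support(c) for c in level})
        k += 1
    return frequent

data = [{"endemic", "laos"}, {"endemic", "laos", "threatened"},
        {"laos", "threatened"}, {"endemic", "threatened"}]
for itemset, s in apriori(data, min_support=0.5).items():
    print(set(itemset), round(s, 2))
```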
Plantié, Michel. "Extraction automatique de connaissances pour la décision multicritère." Phd thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne, 2006. http://tel.archives-ouvertes.fr/tel-00353770.
The model underlying our interactive group decision support system (SIADG) relies heavily on automatic knowledge processing. Data mining, multicriteria analysis and optimization are complementary techniques combined to build a decision artefact that amounts to a cybernetic interpretation of the decision model of the economist Simon. The epistemic uncertainty inherent in a decision is measured by the decision risk, which analyses the factors discriminating between alternatives. Several attitudes towards controlling the decision risk can be considered: the SIADG can be used to validate, verify or refute a point of view. In every case, the control exercised over epistemic uncertainty is not neutral with respect to the dynamics of the decision process. Instrumenting the learning phase of the decision process thus leads to building the actuator of a feedback loop aimed at steering the decision dynamics. Our model sheds formal light on the links between epistemic uncertainty, decision risk and decision stability.
The fundamental concepts of actionable knowledge and automatic indexing, on which our NLP models and tools rest, are analysed. In this cybernetic view of decision making, the notion of actionable knowledge receives a new interpretation: it is the knowledge manipulated by the SIADG's actuator to control the decision dynamics. A brief survey of the most proven learning techniques for automatic knowledge extraction in NLP is given. All these notions and techniques are then applied to the specific problem of automatically extracting actionable knowledge in a multicriteria evaluation process. Finally, the example of a video-store manager seeking to optimize his investments according to his customers' preferences illustrates the computerized process as a whole.
Raissi, Chedy. "Extraction de séquences fréquentes : des bases de données statiques aux flots de données." Montpellier 2, 2008. http://www.theses.fr/2008MON20063.
Galarraga, Del Prado Luis. "Extraction des règles d'association dans des bases de connaissances." Thesis, Paris, ENST, 2016. http://www.theses.fr/2016ENST0050/document.
The continuous progress of information extraction (IE) techniques has led to the construction of large general-purpose knowledge bases (KBs). These KBs contain millions of computer-readable facts about real-world entities such as people, organizations and places. KBs are important nowadays because they allow computers to "understand" the real world. They are used in multiple applications in Information Retrieval, Query Answering and Automatic Reasoning, among other fields. Furthermore, the plethora of information available in today's KBs allows for the discovery of frequent patterns in the data, a task known as rule mining. Such patterns or rules convey useful insights about the data and can be used in several applications ranging from data analytics and prediction to data maintenance tasks. The contribution of this thesis is twofold: first, it proposes a method to mine rules on KBs, relying on a mining model tailored to potentially incomplete web-extracted KBs. Second, it shows the applicability of rule mining to several data-oriented tasks in KBs, namely fact prediction, schema alignment, canonicalization of (open) KBs, and prediction of completeness.
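As a hedged miniature of rule mining on a KB of triples, the sketch below evaluates one invented Horn rule with its support and standard confidence. Real systems of this kind (e.g. AMIE) additionally use a PCA confidence better suited to incomplete KBs; the facts and the rule here are made up for illustration.

```python
# Toy KB as a set of (subject, relation, object) triples.
kb = {
    ("alice", "isMarriedTo", "bob"), ("bob", "livesIn", "paris"),
    ("alice", "livesIn", "paris"), ("carol", "isMarriedTo", "dan"),
    ("dan", "livesIn", "rome"),
}

def facts(rel):
    return {(s, o) for s, r, o in kb if r == rel}

# Rule: isMarriedTo(x, z) & livesIn(z, y)  =>  livesIn(x, y)
married, lives = facts("isMarriedTo"), facts("livesIn")
predictions = {(x, y) for x, z in married for z2, y in lives if z == z2}

support = len(predictions & lives)        # predictions confirmed by the KB
confidence = support / len(predictions)   # standard confidence
print(predictions, support, round(confidence, 2))
```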
Oudni, Amal. "Fouille de données par extraction de motifs graduels : contextualisation et enrichissement." Thesis, Paris 6, 2014. http://www.theses.fr/2014PA066437/document.
This thesis's work belongs to the framework of knowledge extraction and data mining applied to numerical or fuzzy data, in order to extract linguistic summaries in the form of gradual itemsets: the latter express correlations between attribute values, of the form « the more the temperature increases, the more the pressure increases ». Our goal is to contextualize and enrich these gradual itemsets by proposing different types of additional information that increase their quality and provide a better interpretation. We propose four new types of itemsets. First, reinforced gradual itemsets, in the case of fuzzy data, perform a contextualization by integrating additional attributes linguistically introduced by the expression « all the more », as in « the more the temperature decreases, the more the volume of air decreases, all the more its density increases »; reinforcement is interpreted as increased validity of the gradual itemset. We also study the extension of the reinforcement concept to association rules, discussing its possible interpretations and showing its limited contribution. We then propose to process the contradictory itemsets that arise, for example, when « the more the temperature increases, the more the humidity increases » and « the more the temperature increases, the less the humidity decreases » are extracted simultaneously. To manage these contradictions, we define a constrained variant of the gradual itemset support, which depends not only on the considered itemset but also on its potential contradictors, together with two extraction methods: the first filters after all itemsets have been generated, and the second integrates the filtering into the generation step. We further introduce characterized gradual itemsets, defined by adding a clause linguistically introduced by the expression « especially if », as in « the more the temperature decreases, the more the humidity decreases, especially if the temperature varies in [0, 10] °C »: the additional clause specifies value ranges over which the validity of the itemset is increased. We formalize the quality of this enrichment as a trade-off between two constraints imposed on the identified interval, namely high validity and large size, together with an extension taking the data density into account. Finally, we propose a method to automatically extract characterized gradual itemsets, based on appropriate mathematical morphology tools and the definition of a suitable filter and its transcription.
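For intuition, gradual itemsets such as « the more the temperature increases, the more the pressure increases » are often scored by the fraction of object pairs ordered concordantly on all attributes. The sketch below uses that common pair-based support on invented data; it is only one of several definitions in the literature and not necessarily the one used in the thesis.

```python
from itertools import combinations

# Toy records: each row is (temperature, pressure).
rows = [(10, 1.0), (12, 1.1), (15, 1.3), (14, 1.2), (20, 1.6)]

def gradual_support(data, i, j):
    """Fraction of object pairs ordered the same way on both
    attributes i and j ('the more A, the more B')."""
    pairs = list(combinations(data, 2))
    concordant = sum(
        (a[i] < b[i] and a[j] < b[j]) or (a[i] > b[i] and a[j] > b[j])
        for a, b in pairs
    )
    return concordant / len(pairs)

# "The more the temperature increases, the more the pressure increases"
print(gradual_support(rows, 0, 1))  # 1.0 on this toy data
```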
Valétudie, Georges. "Nouvelles méthodes en Data-Mining et extraction de connaissances à partir de données : application au complexe mycobacterium tuberculosis." Antilles-Guyane, 2006. http://www.theses.fr/2006AGUY0151.
The need for knowledge processing from increasingly large databases has driven the development of techniques and methods related to Data Mining (also called Knowledge Discovery from Databases). This field comprises various subfields, in particular techniques for database management, learning and prediction. In epidemiology, data processing and analysis are both expensive and lengthy, so we are interested in models tailored to knowledge extraction from sequential data, in order to determine the most discriminating sequences of classes of data defined a priori by the experts of the field, and to automate the treatment of DNA sequences with knowledge rules. We therefore implement systems for supervised classification, in order to learn from and predict sequential data, i.e. spoligotypes in our case. For this objective, we introduced methods adapted to our application field (expert rules, Markov chains, decision trees, etc.), including classifier systems, which have the advantage of constant interaction with their environment and of exploiting genetic algorithms for their evolution. We measured their performance, taking their constraints into account. In addition, we devised an index that better accounts for the sequential form of our data, and we presented a method based on statistical inference which allows rules to be defined from the condensed representation of a DFA. Our experiments display promising results, although it is too early to select among the methods; rather, a cooperative approach combining methods seems more promising. In any case, the contribution of sequence-mining methods to knowledge extraction remains a major asset for this application field.
Haddad, Mohamed Hatem. "Extraction et impact des connaissances sur les performances des systèmes de recherche d'information." Phd thesis, Université Joseph Fourier (Grenoble), 2002. http://tel.archives-ouvertes.fr/tel-00004459.
Serrano, Laurie. "Vers une capitalisation des connaissances orientée utilisateur : extraction et structuration automatiques de l'information issue de sources ouvertes." Caen, 2014. http://www.theses.fr/2014CAEN2011.
Due to the considerable increase of freely available data (especially on the Web), the discovery of relevant information in textual content is a critical challenge. Open Source Intelligence (OSINT) specialists are particularly concerned by this phenomenon, as they try to mine large amounts of heterogeneous information to acquire actionable intelligence. This collection process is still largely done by hand in order to build knowledge sheets summarizing all the knowledge acquired about a specific entity. Given this context, the main goal of this thesis is to reduce and facilitate the daily work of intelligence analysts. Our research revolves around three main axes: knowledge modeling, text mining and knowledge gathering. We explored the literature related to these domains in order to develop a global knowledge gathering system. Our first contribution is the building of a domain ontology dedicated to knowledge representation for OSINT purposes, comprising a specific definition and modeling of the event concept for this domain. Second, we developed and evaluated an event recognition system based on two different extraction approaches: the first relies on hand-crafted rules and the second on a frequent-pattern learning technique. As our third contribution, we proposed a semantic aggregation process as a necessary post-processing step to enhance the quality of the extracted events and to convert extraction results into actionable knowledge. This is achieved by means of multiple similarity measures between events, expressed according to a qualitative scale designed to match our end users' needs.
Coulet, Adrien. "Construction et utilisation d'une base de connaissances pharmacogénomique pour l'intégration de données et la découverte de connaissances." Phd thesis, Université Henri Poincaré - Nancy I, 2008. http://tel.archives-ouvertes.fr/tel-00332407.
Jouve, Pierre-Emmanuel Nicoloyannis Nicolas. "Apprentissage non supervisé et extraction de connaissances à partir de données." Lyon : Université Lumière Lyon 2, 2003. http://demeter.univ-lyon2.fr/sdx/theses/lyon2/2003/jouve_pe.
Duthil, Benjamin. "De l'extraction des connaissances à la recommandation." Phd thesis, Montpellier 2, 2012. http://tel.archives-ouvertes.fr/tel-00771504.
Bouguessa, Mohamed. "Classification non supervisée des données de hautes dimensions et extraction des connaissances dans les services WEB de question-réponse." Thèse, Université de Sherbrooke, 2009. http://savoirs.usherbrooke.ca/handle/11143/5096.
Badra, Fadi. "Extraction de connaissances d'adaptation en raisonnement à partir de cas." Phd thesis, Université Henri Poincaré - Nancy I, 2009. http://tel.archives-ouvertes.fr/tel-00438140.
Fiot, Céline. "Extraction de séquences fréquentes : des données numériques aux valeurs manquantes." Phd thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2007. http://tel.archives-ouvertes.fr/tel-00179506.
Rioult, François. "Extraction de connaissances dans les bases de données comportant des valeurs manquantes ou un grand nombre d'attributs." Phd thesis, Université de Caen, 2005. http://tel.archives-ouvertes.fr/tel-00252089.
Wajnberg, Mickaël. "Analyse relationnelle de concepts : une méthode polyvalente pour l'extraction de connaissances." Electronic Thesis or Diss., Université de Lorraine, 2020. http://www.theses.fr/2020LORR0136.
At a time when data, often interpreted as "ground truth", are produced in gigantic quantities, a parallel need for understanding and interpretability emerges. Datasets are nowadays mainly relational, so developing methods that extract relevant information describing both objects and the relations among them is a necessity. Association rules, along with their support and confidence metrics, describe co-occurrences of object features, and hence explicitly express and evaluate the information contained in a dataset. In this thesis, we present and develop the relational concept analysis approach to extract association rules that capture objects' own features together with their links to other sets of objects. A first part presents the mathematical side of the method, while a second part develops three case studies assessing the pertinence of the approach. The case studies cover various domains, demonstrating the method's versatility: the first deals with error analysis in industrial production, the second with psycholinguistics for dictionary analysis, and the last shows the method applied to knowledge engineering.
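Relational concept analysis builds on Formal Concept Analysis. As minimal background (with an invented context, and without the relational scaling that RCA adds on top), the Galois closure below is the operator from which formal concepts and implication-style rules are derived.

```python
# Toy formal context: objects -> attributes.
context = {
    "duck":  {"flies", "swims", "feathered"},
    "swan":  {"flies", "swims", "feathered"},
    "eagle": {"flies", "feathered"},
}

def extent(attrs):
    """Objects having all the given attributes."""
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):
    """Attributes shared by all the given objects."""
    sets = [context[o] for o in objs]
    return set.intersection(*sets) if sets else set()

def closure(attrs):
    """Galois closure: attributes implied by `attrs` in this context."""
    return intent(extent(attrs))

# {swims} closes to {swims, flies, feathered}: every swimmer here
# also flies and is feathered, i.e. the implication swims -> flies.
print(closure({"swims"}))
```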
Pennerath, Frédéric. "Méthodes d'extraction de connaissances à partir de données modélisables par des graphes : Application à des problèmes de synthèse organique." Phd thesis, Université Henri Poincaré - Nancy I, 2009. http://tel.archives-ouvertes.fr/tel-00436568.
Traoré, Boukaye Boubacar. "Modélisation des informations et extraction des connaissances pour la gestion des crises." Thesis, Toulouse, INPT, 2018. http://www.theses.fr/2018INPT0153.
The rise of emerging data collection technologies offers new opportunities for various scientific disciplines. Computer science is expected to play its part by developing intelligent data analysis techniques that provide insight into solving complex problems. This doctoral dissertation addresses the general problem of extracting knowledge from data by computational techniques. It focuses, first, on the problem of information modeling for the management of crises requiring medical care, using collaborating telemedicine applications. We proposed a three-stage methodology for managing a crisis remotely, mainly centered on the collaboration of telemedicine acts (teleconsultation, tele-expertise, telemonitoring, remote assistance, and medical regulation), from the transport of victims to medical treatment within and/or between health structures. This methodology not only provides crisis managers with a computerized decision aid system, but also minimizes financial costs and reduces emergency response time through organized crisis management. Second, we studied in detail knowledge extraction using data mining techniques on satellite images to discover epidemic risk areas, with a case study on the cholera epidemic in the Mopti region of Mali. A six-phase methodology was presented, relating data collected in the field to satellite data in order to prevent and more effectively monitor epidemic crises. The results show that 66% of the contamination rate is related to the Niger River, in addition to certain societal factors such as garbage dumps in winter. We were thus able to establish the link between the epidemic and the environment in which it develops, which will enable decision makers to better manage a possible epidemic crisis. Finally, for epidemic crisis situations, we focused on medical analysis, more specifically on the use of portable microscopes to confirm the presence or absence of pathogens in samples from suspected cases. To this end, we presented a six-phase methodology based on deep learning techniques, in particular convolutional neural networks and transfer learning, which take advantage of complex systems and the analysis of large amounts of data. The idea is to train convolutional neural networks for the automatic classification of pathogen images; in our case study, this approach was used to distinguish a microscopic image containing the cholera pathogen Vibrio cholerae from a microscopic image containing the malaria pathogen Plasmodium, with good performance: a classification accuracy of 99%. The next step is to deploy this pathogen image recognition solution in intelligent portable microscopes for routine analysis and medical diagnostic applications in crisis management. This will compensate for the lack of specialists in microscopic manipulation and save considerable time in the analysis of samples, with precise measurements enabling the work to be accomplished under better conditions.
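As a hedged sketch of the transfer-learning step described above (not the thesis's actual network or data), the following PyTorch code freezes an ImageNet-pretrained backbone and retrains a two-class head, standing in for the Vibrio cholerae vs. Plasmodium discrimination. It assumes torchvision >= 0.13 and uses a random batch in place of real microscopy images.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained backbone and retrain only the
# classification head for two classes (weights are downloaded on first use).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                      # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, 2)    # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on a random batch standing in for
# 224x224 RGB microscopy images with binary labels.
x = torch.randn(8, 3, 224, 224)
y = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```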
Armand, Stéphane. "Analyse Quantifiée de la Marche : extraction de connaissances à partir de données pour l'aide à l'interprétation clinique de la marche digitigrade." Phd thesis, Université de Valenciennes et du Hainaut-Cambresis, 2005. http://tel.archives-ouvertes.fr/tel-00010618.
Armand, Stéphane. "Analyse quantifiée de la marche : extraction de connaissances à partir de données pour l'aide à l'interprétation clinique de la marche digitigrade." Valenciennes, 2005. http://ged.univ-valenciennes.fr/nuxeo/site/esupversions/6cfbb62f-d5e4-4bd3-b7b3-96618bf3ceea.
Clinical Gait Analysis (CGA) is used to identify and quantify gait deviations from biomechanical data. Interpreting a CGA, which provides the explanations for the identified gait deviations, is a complex task. Toe-walking is one of the most common gait deviations, and identifying its causes is difficult. This research aimed to provide a support tool for interpreting the CGAs of toe-walkers. To reach this objective, a Knowledge Discovery in Databases (KDD) method combining unsupervised and supervised machine learning was used to objectively extract intrinsic and discriminant knowledge from CGA data. Unsupervised learning (fuzzy c-means) allowed three toe-walking patterns to be identified from ankle kinematics extracted from a database of more than 2500 CGAs (Institut Saint-Pierre, Palavas, 34). Supervised learning was then employed to explain these three gait patterns in terms of clinical measurements, using rules induced by fuzzy decision trees. The most significant and interpretable rules (12) were selected to create a knowledge base that was validated against the literature and by experts. These rules can be used to facilitate the interpretation of toe-walker CGA data. This research opens several prospective paths of investigation, ranging from the development of a generic, movement-oriented version of the proposed method to the creation of a pathological gait simulator.
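The unsupervised step named in the abstract is fuzzy c-means. Below is a minimal NumPy implementation of the standard update equations, run on invented one-dimensional features rather than real ankle kinematics.

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means: returns (centers, membership matrix)."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)          # memberships sum to 1
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-10)                   # avoid division by zero
        # Standard update: u_ki = 1 / sum_j (d_ki / d_kj)^(2/(m-1))
        u = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1))).sum(axis=2)
    return centers, u

# Toy 1D features with three loose groups (illustrative only).
X = np.array([[0.1], [0.2], [0.15], [1.0], [1.1], [2.0], [2.2]])
centers, u = fuzzy_c_means(X, c=3)
print(np.round(centers, 2))
print(np.round(u, 2))
```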
Le, Duff Franck. "Enrichissement quantitatif et qualitatif de relations conceptuelles des bases de connaissances médicales par extraction et héritage automatique par des méthodes informatiques et probabilistes." Rennes 1, 2006. http://www.theses.fr/2006REN1B094.
Pugeault, Florence. "Extraction dans les textes de connaissances structurées : une méthode fondée sur la sémantique lexicale linguistique." Toulouse 3, 1995. http://www.theses.fr/1995TOU30164.
Grissa, Dhouha. "Etude comportementale des mesures d'intérêt d'extraction de connaissances." Phd thesis, Université Blaise Pascal - Clermont-Ferrand II, 2013. http://tel.archives-ouvertes.fr/tel-01023975.
Elmi, Rayaleh Waïss. "Extraction de connaissances en imagerie microspectrométrique par analyse chimiométrique : application à la caractérisation des constituants d'un calcul urinaire." Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2006. http://tel.archives-ouvertes.fr/tel-00270116.
Maillot, Pierre. "Nouvelles méthodes pour l'évaluation, l'évolution et l'interrogation des bases du Web des données." Thesis, Angers, 2015. http://www.theses.fr/2015ANGE0007/document.
The web of data is a means to share and broadcast both user-readable and machine-readable data. This is possible thanks to RDF, which formats data as short sentences (subject, relation, object) called triples. Bases from the web of data, called RDF bases, are sets of triples. In an RDF base, the ontology (the structural data) organizes the description of the factual data. Since the creation of the web of data in 2001, the number and size of RDF bases have been constantly rising. This increase has accelerated since the appearance of Linked Data, which promotes the sharing and interlinking of publicly available bases by user communities. These communities exploit the bases (by querying and editing them) without adequate solutions to evaluate the quality of new data, check the current state of the bases, or query a set of bases together. This thesis proposes three methods to support the factual and ontological expansion of bases from the web of data, as well as their querying. We propose a method designed to help an expert check factual data that conflict with the ontology. Finally, we propose a method for distributed querying that limits the sending of queries to the bases that may contain answers.
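To make the triple format concrete, here is a toy base of (subject, relation, object) triples with a single wildcard pattern matcher in the spirit of one SPARQL triple pattern; the facts and the helper function are invented for illustration.

```python
# A base from the web of data is a set of (subject, relation, object)
# triples. Hypothetical mini-base and pattern matcher:
base = {
    ("Tim_Berners-Lee", "type", "Person"),
    ("Tim_Berners-Lee", "invented", "World_Wide_Web"),
    ("World_Wide_Web", "type", "Technology"),
}

def match(pattern, triples):
    """Yield triples matching a pattern; None acts as a wildcard,
    like a single SPARQL triple pattern."""
    for t in triples:
        if all(p is None or p == v for p, v in zip(pattern, t)):
            yield t

# "What did Tim Berners-Lee invent?"
for s, r, o in match(("Tim_Berners-Lee", "invented", None), base):
    print(s, r, o)
```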
Li, Jinpeng. "Extraction de connaissances symboliques et relationnelles appliquée aux tracés manuscrits structurés en-ligne." Phd thesis, Nantes, 2012. http://tel.archives-ouvertes.fr/tel-00785984.
Candillier, Christophe. "Méthodes d'Extraction de Connaissances à partir de Données (ECD) appliquées aux Systèmes d'Information Géographiques (SIG)." Phd thesis, Université de Nantes, 2006. http://tel.archives-ouvertes.fr/tel-00101491.
Ben, Salamah Janan. "Extraction de connaissances dans des textes arabes et français par une méthode linguistico-computationnelle." Thesis, Paris 4, 2017. http://www.theses.fr/2017PA040137.
In this thesis, we propose a generic multilingual approach to automatic information extraction, in particular the extraction of price-variation events and of temporal information linked to a temporal referential. Our approach is based on the constitution of several semantic maps by textual analysis, in order to formalize the linguistic traces expressed by categories. We created a database for an expert system that identifies and annotates information (categories and their characteristics) based on groups of contextual rules. Two algorithms, AnnotEC and AnnotEV, were applied in the SemanTAS platform to validate our assumptions. We obtained satisfactory results: accuracy and recall are around 80%. Extracted knowledge is presented in a summary file. To confirm the multilingual aspect of our approach, we carried out experiments on French and Arabic, and we confirmed its scalability by annotating large corpora.
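As a hypothetical miniature of the contextual-rule annotation idea (not the thesis's SemanTAS rules), the sketch below tags price-variation events and their direction in a French sentence with a single regular expression.

```python
import re

# Hypothetical contextual rule: a variation trigger followed, within a
# few words, by "prix" marks a price-variation event.
RULE = re.compile(
    r"(?P<trigger>hausse|baisse|augmentation|chute)\s+"
    r"(?:de\s+)?(?:\w+\s+){0,3}?prix",
    re.IGNORECASE,
)
DIRECTION = {"hausse": "up", "augmentation": "up",
             "baisse": "down", "chute": "down"}

text = "Le rapport annonce une hausse des prix du pétrole, puis une chute des prix en 2015."
for m in RULE.finditer(text):
    print(m.group(0), "->", DIRECTION[m.group("trigger").lower()])
```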
Tang, My Thao. "Un système interactif et itératif extraction de connaissances exploitant l'analyse formelle de concepts." Thesis, Université de Lorraine, 2016. http://www.theses.fr/2016LORR0060/document.
In this thesis, we present a methodology for interactively and iteratively extracting knowledge from texts: the KESAM system, a tool for Knowledge Extraction and Semantic Annotation Management. KESAM is based on Formal Concept Analysis (FCA) for extracting knowledge from textual resources while supporting expert interaction. In the KESAM system, knowledge extraction and semantic annotation are unified into one single process that benefits both. Semantic annotations are used for formalizing the source of knowledge in texts and keeping the traceability between the knowledge model and the source of knowledge; the knowledge model is, in return, used for improving the semantic annotations. The KESAM process has been designed to permanently preserve the link between the resources (texts and semantic annotations) and the knowledge model. The core of the process is Formal Concept Analysis, which builds the knowledge model, i.e. the concept lattice, and ensures the link between the knowledge model and the annotations. In order to get the resulting lattice as close as possible to domain experts' requirements, we introduce an iterative process that enables expert interaction on the lattice: experts are invited to evaluate and refine it, and can make changes until they reach an agreement between the model and their own knowledge or the application's needs. Thanks to the link between the knowledge model and the semantic annotations, the two can co-evolve so as to improve their quality with respect to the experts' requirements. Moreover, by using FCA to build concepts with definitions as sets of objects and sets of attributes, KESAM is able to take into account both atomic concepts and defined concepts, i.e. concepts defined by a set of attributes. In order to bridge the possible gap between the representation model based on a concept lattice and the representation model of a domain expert, we then introduce a formal method for integrating expert knowledge into concept lattices in such a way that the lattice structure is maintained. The expert knowledge is encoded as a set of attribute dependencies that is aligned with the set of implications provided by the concept lattice, leading to modifications of the original lattice. The method also allows the experts to keep a trace of the changes between the original lattice and the final constrained version, and to see how concepts in practice are related to concepts automatically issued from the data. The method uses extensional projections to build the constrained lattices without changing the original data and to provide the trace of changes. From an original lattice, two different projections produce two different constrained lattices; the gap between the lattice-based representation model and the expert's representation model is thus bridged by projections.
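For background on the lattice KESAM manipulates, the brute-force sketch below enumerates all formal concepts (extent, intent) of a small invented context; real systems use incremental or Next-Closure-style algorithms instead.

```python
from itertools import combinations

# Toy formal context (objects x attributes) standing in for
# term/document data; real KESAM contexts come from texts.
context = {
    "doc1": {"gene", "protein"},
    "doc2": {"gene", "disease"},
    "doc3": {"gene", "protein", "disease"},
}
attributes = set().union(*context.values())

def extent(attrs):
    return frozenset(o for o, a in context.items() if attrs <= a)

def intent(objs):
    return (frozenset(set.intersection(*(context[o] for o in objs)))
            if objs else frozenset(attributes))

# A formal concept is a pair (extent, intent) where each part
# exactly determines the other; enumerate them all by brute force.
concepts = set()
for r in range(len(attributes) + 1):
    for attrs in combinations(sorted(attributes), r):
        e = extent(set(attrs))
        concepts.add((e, intent(e)))

for e, i in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(e), sorted(i))
```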
Karouach, Saïd. "Visualisations interactives pour la découverte de connaissances, concepts, méthodes et outils." Toulouse 3, 2003. http://www.theses.fr/2003TOU30082.
Voisin, Bruno. "Approche extraction de connaissance de l'analyse de données astronomiques : application à l'identification croisée multi-λ." Toulon, 2002. http://www.theses.fr/2002TOUL0011.
Ghemtio, Wafo Léo Aymar. "Simulation numérique et approche orientée connaissance pour la découverte de nouvelles molécules thérapeutiques." Thesis, Nancy 1, 2010. http://www.theses.fr/2010NAN10103/document.
Therapeutic innovation has traditionally benefited from the combination of experimental screening and molecular modelling. In practice, however, the latter is often limited by the shortage of structural and biological information. Today, the situation has completely changed with the high-throughput sequencing of the human genome and the advances made in determining the three-dimensional structures of proteins. This gives access to an enormous amount of data which can be used to search for new treatments for a large number of diseases. In this respect, computational approaches have been used for high-throughput virtual screening (HTVS) and offer an alternative or a complement to experimental methods, leaving more time for the discovery of new treatments. However, most of these approaches suffer from the same limitations. One is the cost and computing time required for estimating the binding of all the molecules from a large data bank to a target, which can be considerable in a high-throughput context. The accuracy of the results obtained is another very real challenge in the domain, as is the need to manage large amounts of heterogeneous data. To overcome the current limitations of HTVS and to optimize the first stages of the drug discovery process, I set up an innovative methodology presenting two advantages. First, it makes it possible to manage a large mass of heterogeneous data and to extract knowledge from it. Second, it distributes the necessary calculations on a grid computing platform containing several thousand processors. The whole methodology is integrated into a multiple-step virtual screening funnel; the purpose is to take into account, in the form of constraints, the knowledge available about the problem at hand, in order to optimize both the accuracy of the results and the costs, in time and money, at the various stages of high-throughput virtual screening.