Dissertations / Theses on the topic 'Optimisation de l'échange de données'
Consult the top 50 dissertations / theses for your research on the topic 'Optimisation de l'échange de données.'
Ouertani, Mohamed Zied. "DEPNET : une approche support au processus de gestion de conflits basée sur la gestion des dépendances de données de conception." Phd thesis, Université Henri Poincaré - Nancy I, 2007. http://tel.archives-ouvertes.fr/tel-00163113.
The work presented in this dissertation focuses on the management of this phenomenon, the conflict, and more particularly on conflict management through negotiation. We propose the DEPNET approach (product Data dEPendencies NETwork identification and qualification) to support the conflict management process based on the management of dependencies between data. These data, exchanged and shared between the various stakeholders, are the very essence of the design activity and play a key role in the progress of the design process.
This approach provides methodological elements to: (1) identify the negotiation team that will be responsible for resolving the conflict, and (2) manage the impacts of the solution adopted after the conflict has been resolved. The contributions of this research work are implemented in the DEPNET software prototype, which we validate on an industrial case study drawn from the design of a turbocharger.
De, Vlieger P. "Création d'un environnement de gestion de base de données " en grille ". Application à l'échange de données médicales." Phd thesis, Université d'Auvergne - Clermont-Ferrand I, 2011. http://tel.archives-ouvertes.fr/tel-00654660.
De, Vlieger Paul. "Création d'un environnement de gestion de base de données "en grille" : application à l'échange de données médicales." Phd thesis, Université d'Auvergne - Clermont-Ferrand I, 2011. http://tel.archives-ouvertes.fr/tel-00719688.
Full textEl, Khalkhali Imad. "Système intégré pour la modélisation, l'échange et le partage des données de produits." Lyon, INSA, 2002. http://theses.insa-lyon.fr/publication/2002ISAL0052/these.pdf.
In Virtual Enterprise and Concurrent Engineering environments, a wide variety of information is used. A crucial issue is the data communication and exchange between heterogeneous systems and distant sites. To solve this problem, the STEP project was introduced. The STandard for the Exchange of Product model data (STEP) is an evolving international standard for the representation and exchange of product data. The objective of STEP is to provide an unambiguous, computer-interpretable representation of product data in all phases of the product's lifecycle. In collaborative product development, different types of experts in different disciplines are concerned by the product (Design, Manufacturing, Marketing, Customers, etc.). Each of these experts has his own viewpoint on the same product. STEP models are unable to represent these experts' viewpoints. The objective of our research work is to propose a methodology for the representation and integration of different experts' viewpoints in the design and manufacturing phases. An information infrastructure for modelling, exchanging and sharing product data models is also proposed.
Stoeklé, Henri-Corto. "Médecine personnalisée et bioéthique : enjeux éthiques dans l'échange et le partage des données génétiques." Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCB175.
In the context of medicine and life sciences, personalized medicine (PM) is all too often reduced to the idea of adapting a diagnosis, predisposition or treatment according to the genetic characteristics of an individual. However, in human and social sciences, PM may be considered as a complex social phenomenon, due to the proper existence and unique composition of the constraints it imposes on individuals, the large number of interactions and interferences between a large number of units, rich in uncertainties, indeterminations, chance, order and disorder. We feel that this alternative point of view makes it possible to study PM more effectively by bioethics research approaches, but with a new objective, contrasting but complementary to those of law and moral philosophy, and a new method. Indeed, the objective of bioethics should be prospective studies questioning established norms in the face of emerging complex social phenomena, rather than the other way round. This makes it possible to determine the benefits, to society and its individuals, of allowing the phenomenon to emerge fully, and to study possible and probable solutions, rather than certainties, for the present and the future. This may allow the identified benefits to occur. However, this objective requires a method for studying the functioning of the phenomenon as a whole, at the scale of society, without a priori restriction to certain individuals, thereby favoring its interactions over its elements. Qualitative inductive systemic theoretical modeling is just such an approach. The key idea here is a rationale of discovery, rather than of proof. This new approach allowed us to understand that PM should not be called "personalized", or even "genomic" or "precision" medicine, and that the term "data medicine" (DM) should be favored, given the key role of data in its functioning. Indeed, the goal of this phenomenon seems to be to use a large mass of (genetic) data to deduce (data mining) or induce (big data) different types of information useful for medical care, research and industry. The means of achieving this end seems to be the development of a network for exchanging or sharing biological samples, genetic data and information between patients, clinicians, researchers and industrial partners, through electronic communication, with the central storage of biological samples and genetic data, and with treatment and analysis carried out at academic care and research centers (France) or in private companies (United States), with or without the involvement of a clinician. The major ethical issues thus seem to relate to the means and mode of access to, and the storage and use of, genetic data, which may lead to radically opposed (social/liberal) organizations and functioning, calling into question certain moral and legal standards. Finally, our method provided several arguments in favor of the use of dynamic electronic informed consent (e-CE) as a solution optimizing the development of PM in terms of genetic data access, storage and use, for the sharing (France) or exchange (United States) of genetic data.
Azizi, Leila. "Pratique et problèmes légaux de l'échange de données informatisées, le cas du crédit documentaire dématérialisé." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape9/PQDD_0020/MQ47163.pdf.
Darlay, Julien. "Analyse combinatoire de données : structures et optimisation." Phd thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00683651.
Full textGamoudi, Oussama. "Optimisation adaptative appliquée au préchargement de données." Paris 6, 2012. http://www.theses.fr/2012PA066192.
Data prefetching is an effective way to bridge the increasing performance gap between processor and memory. Prefetching can improve performance, but it has side effects which may lead to no performance improvement while increasing memory pressure, or even to performance degradation. Adaptive prefetching aims at reducing the negative effects of prefetching while keeping its advantages. This thesis proposes an adaptive prefetching method based on runtime activity, which corresponds to the processor and memory activities retrieved by hardware counters, to predict the prefetch efficiency. Our approach highlights and relies on the correlation between the prefetch effects and runtime activity. Our method learns this correlation all along the execution to predict the prefetch efficiency, in order to filter out predicted inefficient prefetches. Experimental results show that the proposed filter is able to cancel the negative impact of prefetching when it is unprofitable, while keeping the performance improvement due to prefetching when it is beneficial. Our filter works equally well when several threads are running simultaneously, which shows that runtime activity enables an efficient adaptation of prefetching by providing information on running applications' behaviors and interactions.
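The abstract above describes the idea only at a high level; the following is a minimal, hypothetical sketch of runtime-activity-based prefetch filtering, assuming invented counter names, a made-up benefit metric and a simple online linear predictor rather than the thesis's actual mechanism.

```python
# A minimal, hypothetical sketch of prefetch filtering driven by runtime activity:
# learn online how hardware-counter activity correlates with observed prefetch
# benefit, then drop prefetches predicted to be inefficient. Counter names and the
# benefit metric are invented placeholders, not the thesis's actual design.

class PrefetchFilter:
    def __init__(self, learning_rate=0.05):
        self.weights = {}              # one weight per runtime-activity counter
        self.learning_rate = learning_rate

    def predict_benefit(self, activity):
        # activity: dict such as {"l2_miss_rate": 0.3, "bus_occupancy": 0.7}
        return sum(self.weights.get(k, 0.0) * v for k, v in activity.items())

    def observe(self, activity, measured_benefit):
        # Online (LMS-style) update toward the observed prefetch benefit.
        error = measured_benefit - self.predict_benefit(activity)
        for k, v in activity.items():
            self.weights[k] = self.weights.get(k, 0.0) + self.learning_rate * error * v

    def allow_prefetch(self, activity, threshold=0.0):
        # Filter out prefetches predicted to hurt (negative benefit).
        return self.predict_benefit(activity) > threshold


if __name__ == "__main__":
    f = PrefetchFilter()
    # Pretend high bus occupancy correlates with harmful prefetches.
    for _ in range(200):
        f.observe({"l2_miss_rate": 0.8, "bus_occupancy": 0.2}, measured_benefit=+1.0)
        f.observe({"l2_miss_rate": 0.1, "bus_occupancy": 0.9}, measured_benefit=-1.0)
    print(f.allow_prefetch({"l2_miss_rate": 0.7, "bus_occupancy": 0.3}))   # should print True
    print(f.allow_prefetch({"l2_miss_rate": 0.1, "bus_occupancy": 0.95}))  # should print False
```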
Travers, Nicolas. "Optimisation Extensible dans un Mediateur de Données Semi-Structurées." Phd thesis, Université de Versailles-Saint Quentin en Yvelines, 2006. http://tel.archives-ouvertes.fr/tel-00131338.
Full textcontexte de médiation de données XML. Un médiateur doit fédérer des sources de données
distribuées et hétérogènes. A cette fin, un modèle de représentation des requêtes est néces-
saire. Ce modèle doit intégrer les problèmes de médiation et permettre de définir un cadre
d'optimisation pour améliorer les performances. Le modèle des motifs d'arbre est souvent
utilisé pour représenter les requêtes XQuery, mais il ne reconnaît pas toutes les spécifica-
tions du langage. La complexité du langage XQuery fait qu'aucun modèle de représentation
complet n'a été proposé pour reconna^³tre toutes les spécifications. Ainsi, nous proposons un
nouveau modèle de représentation pour toutes les requêtes XQuery non typées que nous appe-
lons TGV. Avant de modéliser une requête, une étape de canonisation permet de produire une
forme canonique pour ces requêtes, facilitant l'étape de traduction vers le modèle TGV. Ce
modèle prend en compte le contexte de médiation et facilite l'étape d'optimisation. Les TGV
définis sous forme de Types Abstraits de Données facilitent l'intégration du modèle dans tout
système en fonction du modèle de données. De plus, une algèbre d'évaluation est définie pour
les TGV. Grâce µa l'intégration d'annotations et d'un cadre pour règles de transformation, un
optimiseur extensible manipule les TGV. Celui-ci repose sur des règles transformations, un
modèle de coût générique et une stratégie de recherche. Les TGV et l'optimiseur extensible
sont intégrés dans le médiateur XLive, développé au laboratoire PRiSM.
Amstel, Duco van. "Optimisation de la localité des données sur architectures manycœurs." Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAM019/document.
The continuous evolution of computer architectures has been an important driver of research in code optimization and compiler technologies. A trend in this evolution that can be traced back over decades is the growing ratio between the available computational power (IPS, FLOPS, ...) and the corresponding bandwidth between the various levels of the memory hierarchy (registers, cache, DRAM). As a result, the reduction of the amount of memory communications that a given code requires has been an important topic in compiler research. A basic principle for such optimizations is the improvement of temporal data locality: grouping all references to a single data-point as close together as possible so that it is only required for a short duration and can be quickly moved to distant memory (DRAM) without any further memory communications. Yet another architectural evolution has been the advent of the multicore era and, in the most recent years, the first generation of manycore designs. These architectures have considerably raised the bar on the amount of parallelism that is available to programs and algorithms, but this is again limited by the available bandwidth for communications between the cores. This brings issues that previously were the sole preoccupation of distributed computing to the world of compiling and code optimization techniques. In this document we present a first dive into a new optimization technique which has the promise of offering both a high-level model for data reuses and a large field of potential applications, a technique which we refer to as generalized tiling. It finds its source in the already well-known loop tiling technique, which has been applied with success to improve data locality for both registers and cache memory in the case of nested loops. This new "flavor" of tiling has a much broader perspective and is not limited to the case of nested loops. It is built on a new representation, the memory-use graph, which is tightly linked to a new model for both memory usage and communication requirements and which can be used for all forms of iterative code. Generalized tiling expresses data locality as an optimization problem for which multiple solutions are proposed. With the abstraction introduced by the memory-use graph it is possible to solve this optimization problem in different environments. For experimental evaluations we show how this new technique can be applied in the contexts of loops, nested or not, as well as for computer programs expressed within a dataflow language. With the anticipation of also using generalized tiling to distribute computations over the cores of a manycore architecture, we also provide some insight into the methods that can be used to model communications and their characteristics on such architectures. As a final point, and in order to show the full expressiveness of the memory-use graph and even more the underlying memory usage and communication model, we turn towards the topic of performance debugging and the analysis of execution traces. Our goal is to provide feedback on the evaluated code and its potential for further improvement of data locality. Such traces may contain information about memory communications during an execution and show strong similarities with the previously studied optimization problem. This brings us to a short introduction to the algorithmics of directed graphs and the formulation of some new heuristics for the well-studied topic of reachability and the much less known problem of convex partitioning.
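As background for the notion of tiling that the abstract builds on, here is a small illustrative sketch of classic loop tiling only (not of generalized tiling or the memory-use graph themselves); matrix and tile sizes are arbitrary toy values chosen for this example.

```python
# An illustrative sketch of classic loop tiling: the tiled version groups all
# accesses to one block of B together, improving temporal locality compared with
# the naive traversal that walks B column by column.

N, TILE = 64, 16

def matmul_naive(A, B, C):
    for i in range(N):
        for j in range(N):
            acc = 0.0
            for k in range(N):
                acc += A[i][k] * B[k][j]   # B is walked column by column: poor reuse
            C[i][j] = acc

def matmul_tiled(A, B, C):
    # C must be zero-initialized; results accumulate block by block.
    for ii in range(0, N, TILE):
        for kk in range(0, N, TILE):
            for jj in range(0, N, TILE):
                for i in range(ii, ii + TILE):
                    row_c, row_a = C[i], A[i]
                    for k in range(kk, kk + TILE):
                        a, row_b = row_a[k], B[k]
                        for j in range(jj, jj + TILE):
                            row_c[j] += a * row_b[j]

if __name__ == "__main__":
    A = [[float(i + j) for j in range(N)] for i in range(N)]
    B = [[float(i - j) for j in range(N)] for i in range(N)]
    C1 = [[0.0] * N for _ in range(N)]
    C2 = [[0.0] * N for _ in range(N)]
    matmul_naive(A, B, C1)
    matmul_tiled(A, B, C2)
    print(C1 == C2)   # both orderings compute the same result
```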
Travers, Nicolas. "Optimisation extensible dans un médiateur de données semi-structurées." Versailles-St Quentin en Yvelines, 2006. http://www.theses.fr/2006VERS0049.
This thesis proposes to evaluate XQuery queries in a mediation context. The mediator must federate several heterogeneous data sources with an appropriate query model. On this model, an optimization framework must be defined to increase performance. The well-known tree pattern model can represent a subset of XPath queries in a tree form. Because of the complexity of XQuery, no model has been proposed that is able to represent all the structural components of the language. We therefore propose a new logical model for XQuery queries called TGV. It aims at supporting the whole of XQuery through a canonical form, in order to cover more of the XQuery specifications. This form allows us to translate queries in a unique way into our TGV model. This model takes into account a distributed, heterogeneous context and eases the optimization process. It integrates transformation rules, cost evaluation, and therefore the execution of XQuery queries. The TGV can be used as a basis for processing XQuery queries, since it is flexible and provides abstract data types which can be implemented according to the underlying data model. Moreover, it allows user-defined annotations as well as cost-related annotations for cost estimation. Although the model is useful, it relies on the complex specifications of XQuery. TGVs are illustrated in this thesis with several figures based on the W3C's use cases. Finally, a framework for defining transformation rules is added to the extensible optimizer to increase the performance of the XLive mediator. The XLive mediation system has been developed at the PRISM laboratory.
Verlaine, Lionel. "Optimisation des requêtes dans une machine bases de données." Paris 6, 1986. http://www.theses.fr/1986PA066532.
Jouini, Khaled. "Optimisation de la localité spatiale des données temporelles et multiversions." Paris 9, 2008. https://bu.dauphine.psl.eu/fileviewer/index.php?doc=2008PA090016.
The efficient management of temporal and multiversion data is crucial for many traditional and emerging database applications. A major performance bottleneck for database systems is the memory hierarchy. One of the main means of optimizing the utilization of the memory hierarchy is to optimize data spatial locality, i.e. to store contiguously data that are likely to be read simultaneously. The problem studied in this thesis is to optimize temporal and multiversion data spatial locality at all levels of the memory hierarchy, using index structures and storage policies. In particular, this thesis proposes a cost model, the steady state analysis, allowing an accurate estimation of the performance of different index structures. The analysis provides database designers with tools allowing them to determine the index structure most suitable for given data and application characteristics. This thesis also studies the impact of version redundancy on L2 cache utilization. It proposes two storage models which, in contrast with the standard storage models, avoid version redundancy and optimize L2 cache and main memory bandwidth utilization.
Saidi, Selma. "Optimisation des transferts de données sur systèmes multiprocesseurs sur puce." Phd thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00875582.
Poulliat, Charly. "Allocation et optimisation de ressources pour la transmission de données multimédia." Cergy-Pontoise, 2004. http://www.theses.fr/2004CERG0271.
Full textDesroziers, Gérald. "Mise en œuvre, diagnostic et optimisation des schémas d'assimilation de données." Habilitation à diriger des recherches, Université Paul Sabatier - Toulouse III, 2007. http://tel.archives-ouvertes.fr/tel-00525615.
Gillet, Noel. "Optimisation de requêtes sur des données massives dans un environnement distribué." Thesis, Bordeaux, 2017. http://www.theses.fr/2017BORD0553/document.
Distributed data stores are massively used in the current context of Big Data. In addition to providing data management features, these systems have to deal with an increasing amount of queries sent by distant users in order to perform data mining or data visualization operations. One of the main challenges is to evenly distribute the workload of queries between the nodes which compose these systems, in order to minimize the treatment time. In this thesis, we tackle the problem of query allocation in a distributed environment. We consider that data are replicated and that a query can be handled only by a node storing the concerned data. First, near-optimal algorithmic proposals are given when communications between nodes are asynchronous. We also consider that some nodes can be faulty. Second, we study more deeply the impact of data replication on query treatment. In particular, we present an algorithm which manages the data replication based on the demand for these data. Combined with our allocation algorithm, we guarantee a near-optimal allocation. Finally, we focus on the impact of data replication when queries are received as a stream by the system. We perform an experimental evaluation using the distributed database Apache Cassandra. The experiments confirm the interest of our algorithmic proposals for improving query treatment compared to the native allocation scheme in Cassandra.
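To make the allocation problem concrete, here is a deliberately simplified sketch (not the thesis's near-optimal algorithms): each query may only be served by a node holding a replica of its data, and a greedy rule sends it to the least-loaded eligible node. The replica map, query list and uniform query cost are invented.

```python
# A simplified sketch of query allocation under a replication constraint:
# a query on item x may only go to a node storing a replica of x; among those,
# pick the currently least-loaded node (greedy load balancing).

def allocate(queries, replicas, cost=1.0):
    """queries: list of data-item ids; replicas: dict item -> set of node ids."""
    load = {n: 0.0 for nodes in replicas.values() for n in nodes}
    assignment = {}
    for q, item in enumerate(queries):
        eligible = replicas[item]                      # replication constraint
        node = min(eligible, key=lambda n: load[n])    # greedy choice
        assignment[q] = node
        load[node] += cost
    return assignment, load

replicas = {"a": {1, 2}, "b": {2, 3}, "c": {1, 3}}
queries = ["a", "a", "b", "c", "b", "a"]
print(allocate(queries, replicas))
```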
Yagoub, Khaled. "Spécification et optimisation de sites Web à usage intensif de données." Versailles-St Quentin en Yvelines, 2001. http://www.theses.fr/2001VERS007V.
A data-intensive web site (DIWS) is a Web site that accesses large numbers of pages whose content is dynamically extracted from a database. In this context, returning a Web page may require a costly interaction with the database system, for connection and querying, to dynamically extract its content. The database interaction cost adds up to the non-negligible base cost of Web page delivery, thereby greatly increasing the client waiting time. In this thesis, we address this performance problem. Our approach relies on the declarative specification of the Web site. We propose a customized cache system architecture and its implementation, in the context of Weave, a Web site management system developed at INRIA. The system can cache database data (as materialized views), XML fragments, or HTML files. In addition, Weave comes along with the WeaveRPL language for specifying both the Web site's content and customized data materialization within the site. We also develop a basic framework for the automatic compilation of Web site specifications into optimal caching strategies. Our solution has been illustrated using a Web site derived from the TPC-D benchmark database. Based on experiments using our test platform WeaveBench, we assess the performance of various caching strategies. The results clearly show that a mixed strategy is generally optimal.
Bradai, Benazouz. "Optimisation des Lois de Commande d’Éclairage Automobile par Fusion de Données." Mulhouse, 2007. http://www.theses.fr/2007MULH0863.
Night-time driving with conventional headlamps is particularly unsafe. Indeed, although people drive much less at night, more than half of driving fatalities occur during this period. To reduce these figures, several automotive manufacturers and suppliers participated in the European project "Adaptive Front lighting System" (AFS). This project aims to define new lighting functions based on adapting the beam to the driving situation, and it was to end in 2008 with a change in automotive lighting regulations allowing the realisation of all new AFS functions. To that end, the partners explore the possible realisation of such new lighting functions and study their relevance and efficiency according to the driving situation, but also the dangers associated with the use, for these lighting functions, of information from the vehicle or from the environment. Since 2003, some vehicles have been equipped with bending lights, which take into account only the actions of the driver on the steering wheel. These solutions make it possible to improve visibility by directing the beam towards the inside of the bend. However, since the road profile (intersections, bends, etc.) is not always known to the driver, the performance of these solutions is consequently limited. Embedded navigation systems, on the other hand, can contain information on this road profile as well as contextual information (engineering works, road type, curve radius, speed limits, ...). This thesis aims to optimize lighting control laws based on the fusion of navigation system information with that of vehicle embedded sensors (cameras, ...), taking their efficiency and reliability into account. This information fusion, applied here to decision-making, makes it possible to identify driving situations and contexts of the environment in which the vehicle evolves (motorway, city, etc.) and to choose the appropriate law among the various lighting control laws developed (motorway lighting, town lighting, bending light). This approach makes it possible to choose between these various lighting control laws in real time and by anticipation. It therefore improves the robustness of the lighting system. Two points are at the origin of this improvement. Firstly, using the navigation system information, we developed a virtual sensor performing event-based electronic horizon analysis, allowing an accurate determination of the various driving situations. It uses a finite state machine and thus makes it possible to mitigate the problems due to the punctual nature of the navigation system information. Secondly, we developed a generic virtual sensor for determining driving situations, based on evidence theory and using a navigation system and vision. This sensor combines confidences coming from the two sources to better distinguish between the various driving situations and contexts and to mitigate the problems of the two sources taken independently. It also allows building a confidence measure for the navigation system using some of its criteria. This generic sensor can be generalized to driver assistance systems (ADAS) other than lighting. This was shown by applying it to a speed limit detection system, SLS (Speed Limit Support). The two virtual sensors developed were applied to the optimization of the lighting system (AFS) and to the SLS system. These two systems were implemented on an experimental vehicle (demonstration vehicle) and are currently operational.
They were evaluated by various types of drivers, ranging from non-experts to experts. They were also shown to car manufacturers (PSA, Audi, Renault, Honda, etc.) and during different tech days. They proved their reliability during these demonstrations on open roads with various driving situations and contexts.
Jeudy, Baptiste. "Optimisation de requêtes inductives : application à l'extraction sous contraintes de règles d'association." Lyon, INSA, 2002. http://theses.insa-lyon.fr/publication/2002ISAL0090/these.pdf.
The increasingly generalized use of data processing makes it possible to collect more and more data in an automatic way, e.g. in the sciences (biology, astronomy, etc.) or in commerce (the Internet). The analysis of such quantities of data is problematic. Knowledge Discovery in Databases (KDD) techniques were conceived to meet this need. In this thesis, we used the inductive database as the framework for our work. An inductive database is a generalization of traditional databases in which the user can query not only the data but also properties learned from the data. One can then see the whole KDD process as the interrogation of an inductive database. In this thesis, we particularly studied the optimization of inductive queries relating to the extraction of association rules and itemsets. In this case, the user can specify the rules or the itemsets of interest by using constraints. These constraints can, e.g., specify a frequency threshold or impose syntactic restrictions on the itemsets or the rules. We propose various strategies for the evaluation of rule and itemset extraction queries by effectively using the constraints (in particular constraints known as monotonic and anti-monotonic). We studied the use of condensed representations in the optimization of the evaluation of these queries, and our experiments show that the simultaneous use of constraints and condensed representations gives very good results. We also show how to use condensed representations as a cache for the optimization of sequences of queries. Here again, the results are good, and the use of condensed representations makes it possible to obtain a much smaller cache than with previous techniques.
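As a toy illustration of the constraint-based pruning mentioned above (and only of that; the condensed-representation techniques are not shown), the sketch below performs a level-wise itemset search under an anti-monotonic frequency constraint. The small transaction set and threshold are invented.

```python
# A toy sketch of level-wise itemset mining under an anti-monotonic constraint
# (minimum frequency): once an itemset violates the constraint, none of its
# supersets is explored, which prunes the search space.

from itertools import combinations

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
MIN_SUPPORT = 3

def support(itemset):
    return sum(1 for t in transactions if itemset <= t)

items = sorted({i for t in transactions for i in t})
level = [frozenset([i]) for i in items if support(frozenset([i])) >= MIN_SUPPORT]
frequent = list(level)

while level:
    candidates = {a | b for a in level for b in level if len(a | b) == len(a) + 1}
    # Anti-monotonic pruning: keep only candidates meeting the frequency constraint.
    level = [c for c in candidates if support(c) >= MIN_SUPPORT]
    frequent.extend(level)

print([sorted(f) for f in frequent])
```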
Gueni, Billel. "Optimisation de requêtes XQuery imbriquées." Paris, Télécom ParisTech, 2009. http://www.theses.fr/2009ENST0035.
We study in this thesis the optimization of XQuery evaluation in XML databases. As our general approach, we introduce techniques that exploit minimization opportunities on complex XQuery expressions involving composition-style nesting and schema information. Based on a large subset of XQuery, we describe rule-based algorithms that rewrite a query by recursively pruning the subexpressions whose results are not needed for the evaluation of the query. Given an input XQuery expression, our techniques output a simplified yet equivalent XQuery expression. They are thus readily usable as an optimization module in any existing XQuery processor. In practice, our algorithms can drastically impact query evaluation time in various settings such as view-based query answering and access control, or query-by-example interfaces. We demonstrate by experiments the impact of our rewriting approach on query evaluation costs and we formally prove its correctness. We also give extensions to our solution in order to take into account information about the data schema (DTD), to extend the algorithm to other XQuery fragments, and to refine the pruning analysis so as to simplify the expressions further.
Brahem, Mariem. "Optimisation de requêtes spatiales et serveur de données distribué - Application à la gestion de masses de données en astronomie." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLV009/document.
The big scientific data generated by modern observation telescopes raise recurring performance problems, in spite of the advances in distributed data management systems. The main reasons are the complexity of the systems and the difficulty of adapting the access methods to the data. This thesis proposes new physical and logical optimizations of the execution plans of astronomical queries using transformation rules. These methods are integrated in ASTROIDE, a distributed system for large-scale astronomical data processing. ASTROIDE achieves scalability and efficiency by combining the benefits of distributed processing using Spark with the relevance of an astronomical query optimizer. It supports data access using the commonly used query language ADQL. It implements astronomical query algorithms (cone search, kNN search, cross-match, and kNN join) tailored to the proposed physical data organization. Indeed, ASTROIDE offers a data partitioning technique that allows efficient processing of these queries by ensuring load balancing and eliminating irrelevant partitions. This partitioning uses an indexing technique adapted to astronomical data, in order to reduce query processing time.
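For readers unfamiliar with the query types listed above, here is a minimal sketch of the cone-search predicate alone (keeping the objects within a given angular radius of a sky position). ASTROIDE's partitioning, indexing and Spark execution are not shown, and the catalogue rows are invented.

```python
# A minimal sketch of a cone search: filter catalogue objects whose angular
# separation from a centre (ra0, dec0) is below a given radius, all in degrees.

from math import radians, degrees, sin, cos, acos

def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle distance between two sky positions, in degrees."""
    ra1, dec1, ra2, dec2 = map(radians, (ra1, dec1, ra2, dec2))
    cos_d = sin(dec1) * sin(dec2) + cos(dec1) * cos(dec2) * cos(ra1 - ra2)
    return degrees(acos(min(1.0, max(-1.0, cos_d))))

def cone_search(catalog, ra0, dec0, radius_deg):
    return [row for row in catalog
            if angular_separation(row["ra"], row["dec"], ra0, dec0) <= radius_deg]

catalog = [{"id": 1, "ra": 10.68, "dec": 41.27},
           {"id": 2, "ra": 10.70, "dec": 41.20},
           {"id": 3, "ra": 56.75, "dec": 24.12}]
print(cone_search(catalog, 10.68, 41.26, 0.1))   # objects 1 and 2 fall in the cone
```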
De, Oliveira Castro Herrero Pablo. "Expression et optimisation des réorganisations de données dans du parallélisme de flots." Phd thesis, Université de Versailles-Saint Quentin en Yvelines, 2010. http://tel.archives-ouvertes.fr/tel-00580170.
Oliveira, Castro Herrero Pablo de. "Expression et optimisation des réorganisations de données dans du parallélisme de flots." Versailles-St Quentin en Yvelines, 2010. https://tel.archives-ouvertes.fr/tel-00580170.
Embedded systems designers are moving to multi-cores to increase the performance of their applications. Yet multi-core systems are difficult to program. One hard problem is expressing and optimizing data reorganizations. In this thesis we propose a compilation chain that: 1) uses a simple high-level syntax to express the data reorganization in a parallel application; 2) ensures the deterministic execution of the program (critical in an embedded context); 3) optimizes and adapts the programs to the target's constraints. To address point 1) we propose a high-level language, SLICES, describing data reorganizations through multidimensional slicings. To address point 2) we show that it is possible to compile SLICES to a data-flow language, SJD, that is built upon the Cyclostatic Data-Flow formalism and therefore ensures determinism. To address point 3) we define a set of transformations that preserve the semantics of SJD programs. We show that a subset of these transformations generates a finite space of equivalent programs. We show that this space can be efficiently explored with a heuristic to select the program variant best fitted to the target's constraints. Finally, we evaluate this method on two classic problems: reducing memory and reducing communication costs in a parallel application.
Boussahoua, Mohamed. "Optimisation de performances dans les entrepôts de données distribués NoSQL en colonnes." Thesis, Lyon, 2020. http://www.theses.fr/2020LYSE2007.
The work presented in this thesis aims at proposing approaches to build data warehouses (DWs) using the columnar NoSQL model. The use of NoSQL models is motivated by the advent of big data and the inability of the relational model, usually used to implement DWs, to allow data scalability. Indeed, NoSQL models are suitable for storing and managing massive data. They are designed to build databases whose storage model is the "key/value" one. Other models then appeared to account for the variability of the data: column-oriented, document-oriented and graph-oriented. We have used the column-oriented NoSQL model for building massive DWs because it is more suitable for decisional queries that are defined by a set of columns (measures and dimensions) from the warehouse. Column-family NoSQL databases offer storage techniques that are well adapted to DWs. Several scenarios are possible to develop DWs on these databases. We present in this thesis new solutions for the logical and physical modeling of columnar NoSQL data warehouses. We have proposed a logical model called NLM (Naive Logical Model) to represent a column-oriented NoSQL DW and enable better management by columnar NoSQL DBMSs. We have proposed a new method to build a distributed DW using a column-family NoSQL database. Our method is based on a strategy of grouping attributes from fact and dimension tables into column families. For this purpose, we used two algorithms: the first one is a meta-heuristic algorithm, in this case Particle Swarm Optimization (PSO), and the second one is the k-means algorithm. Furthermore, we have proposed a new method to build an efficient distributed DW inside column-family NoSQL DBMSs. This method is based on association rules, which allow obtaining groups of attributes that are frequently used together in the workload. Hence, the partition keys (RowKey), necessary to distribute data onto the different cluster nodes, are composed of those attribute groups. To validate our contributions, we have developed a software tool called RDW2CNoSQL (Relational Data Warehouse to Columnar NoSQL) to build a distributed data warehouse using a column-family NoSQL database. We also conducted several tests that have shown the effectiveness of the different methods we proposed. Our experiments suggest that defining good data partitioning and placement schemes during the implementation of the data warehouse with NoSQL HBase significantly increases the computation and querying performance.
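As a rough illustration of the attribute-grouping principle described above (only a k-means variant; the PSO and association-rule methods are not shown), the sketch below clusters warehouse attributes by the binary vectors of the workload queries that use them, so that frequently co-used attributes end up in the same column family. The attribute names and workload are invented, and scikit-learn is assumed to be available.

```python
# A rough sketch of grouping attributes into column families with k-means:
# each attribute is described by the set of workload queries that read it,
# and attributes with similar usage vectors are clustered together.

import numpy as np
from sklearn.cluster import KMeans

attributes = ["date", "store", "product", "qty", "amount", "customer"]
# One row per attribute, one column per query: 1 if the query uses the attribute.
usage = np.array([
    [1, 1, 0, 0, 1],   # date
    [1, 1, 0, 0, 0],   # store
    [0, 0, 1, 1, 0],   # product
    [0, 0, 1, 1, 0],   # qty
    [1, 0, 1, 0, 1],   # amount
    [0, 1, 0, 0, 1],   # customer
])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(usage)
families = {}
for attr, lab in zip(attributes, labels):
    families.setdefault(int(lab), []).append(attr)
print(families)   # e.g. attributes used by the same queries land in the same family
```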
Mahboubi, Hadj. "Optimisation de la performance des entrepôts de données XML par fragmentation et répartition." Phd thesis, Université Lumière - Lyon II, 2008. http://tel.archives-ouvertes.fr/tel-00350301.
Full textPour atteindre cet objectif, nous proposons dans ce mémoire de pallier conjointement ces limitations par fragmentation puis par répartition sur une grille de données. Pour cela, nous nous sommes intéressés dans un premier temps à la fragmentation des entrepôts des données XML et nous avons proposé des méthodes qui sont à notre connaissance les premières contributions dans ce domaine. Ces méthodes exploitent une charge de requêtes XQuery pour déduire un schéma de fragmentation horizontale dérivée.
Nous avons tout d'abord proposé l'adaptation des techniques les plus efficaces du domaine relationnel aux entrepôts de données XML, puis une méthode de fragmentation originale basée sur la technique de classification k-means. Cette dernière nous a permis de contrôler le nombre de fragments. Nous avons finalement proposé une approche de répartition d'un entrepôt de données XML sur une grille. Ces propositions nous ont amené à proposer un modèle de référence pour les entrepôts de données XML qui unifie et étend les modèles existants dans la littérature.
Nous avons finalement choisi de valider nos méthodes de manière expérimentale. Pour cela, nous avons conçu et développé un banc d'essais pour les entrepôts de données XML : XWeB. Les résultats expérimentaux que nous avons obtenus montrent que nous avons atteint notre objectif de maîtriser le volume de données XML et le temps de traitement de requêtes décisionnelles complexes. Ils montrent également que notre méthode de fragmentation basée sur les k-means fournit un gain de performance plus élevé que celui obtenu par les méthodes de fragmentation horizontale dérivée classiques, à la fois en terme de gain de performance et de surcharge des algorithmes.
Lopez-Enriquez, Carlos-Manuel. "HyQoZ - Optimisation de requêtes hybrides basée sur des contrats SLA." Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM060/document.
Today we are witnessing the explosion of data produced massively by largely distributed devices (e.g. sensors, personal computers, laptops, networks) by means of data services. In this context, the queries to be evaluated are called hybrid because they combine aspects of classic, mobile and continuous queries over static or nomadic data services operating in push or pull mode. The objective of my thesis is to propose an approach to optimize hybrid queries based on multi-criteria preferences (i.e. SLA - Service Level Agreement). The principle is to combine data services to construct a query evaluator adapted to the preferences expressed in the SLA, while the state of the services and of the network is taken into account as QoS measures.
Vandromme, Maxence. "Optimisation combinatoire et extraction de connaissances sur données hétérogènes et temporelles : application à l’identification de parcours patients." Thesis, Lille 1, 2017. http://www.theses.fr/2017LIL10044.
Hospital data exhibit numerous specificities that make traditional data mining tools hard to apply. In this thesis, we focus on the heterogeneity associated with hospital data and on their temporal aspect. This work is done within the framework of the ANR ClinMine research project and a CIFRE partnership with the Alicante company. In this thesis, we propose two new knowledge discovery methods suited to hospital data, each able to perform a variety of tasks: classification, prediction, discovering patient profiles, etc. In the first part, we introduce MOSC (Multi-Objective Sequence Classification), an algorithm for supervised classification on heterogeneous, numeric and temporal data. In addition to binary and symbolic terms, this method uses numeric terms and sequences of temporal events to form sets of classification rules. MOSC is the first classification algorithm able to handle these types of data simultaneously. In the second part, we introduce HBC (Heterogeneous BiClustering), a biclustering algorithm for heterogeneous data, a problem that has never been studied so far. This algorithm is extended to support temporal data of various types: temporal events and unevenly-sampled time series. HBC is used for a case study on a set of hospital data, whose goal is to identify groups of patients sharing a similar profile. The results make sense from a medical viewpoint; they indicate that relevant, and sometimes new, knowledge is extracted from the data. These results also lead to further, more precise case studies. The integration of HBC within a software product is also under way, with the implementation of a parallel version and a visualization tool for biclustering results.
Collobert, Ronan. "Algorithmes d'Apprentissage pour grandes bases de données." Paris 6, 2004. http://www.theses.fr/2004PA066063.
Full textHidane, Moncef. "Décompositions multi-échelles de données définies sur des graphes." Caen, 2013. http://www.theses.fr/2013CAEN2088.
This thesis is concerned with approaches to the construction of multiscale decompositions of signals defined on general weighted graphs. This manuscript discusses three approaches that we have developed. The first approach is based on a variational and iterative process. It generalizes the structure-texture decomposition, originally proposed for images. Two versions are proposed: one is based on a quadratic prior while the other is based on a total variation prior. The study of the convergence is performed and the choice of parameters discussed in each case. We describe the application of the decompositions we get to the enhancement of details in images and 3D models. The second approach provides a multiresolution analysis of the space of signals on a given graph. This construction is based on the organization of the graph as a hierarchy of partitions. We have developed an adaptive algorithm for the construction of such hierarchies. Finally, in the third approach, we adapt the lifting scheme to signals on graphs. This adaptation raises a number of practical problems. We focused on the one hand on the subsampling step for which we adopted a greedy approach, and on the other hand on the iteration of the transform on induced subgraphs
Asseraf, Mounir. "Extension et optimisation pour la segmentation de la distance de Kolmogorov-Smirnov." Paris 9, 1998. https://portail.bu.dauphine.fr/fileviewer/index.php?doc=1998PA090026.
Full textBaujoin, Corinne. "Analyse et optimisation d’un système de gestion de bases de données hiérarchique-relationnel : proposition d’une interface d’interrogation." Compiègne, 1985. http://www.theses.fr/1985COMPI209.
Full textAlami, Karim. "Optimisation des requêtes de préférence skyline dans des contextes dynamiques." Thesis, Bordeaux, 2020. http://www.theses.fr/2020BORD0135.
Preference queries are interesting tools to compute small representatives of datasets or to rank tuples based on users' preferences. In this thesis, we mainly focus on the optimization of skyline queries, a special class of preference queries, in dynamic contexts. In a first part, we address the incremental maintenance of the multidimensional indexing structure NSC, which has been shown to be efficient for answering skyline queries in a static context. More precisely, we address (i) the case of dynamic data, i.e. tuples are inserted or deleted at any time, and (ii) the case of streaming data, i.e. tuples are appended only, and discarded after a specific interval of time. In the case of dynamic data, we redesign the structure and propose procedures to handle both insertions and deletions efficiently. In the case of streaming data, we propose MSSD, a data pipeline which operates in batch mode and maintains NSCt, a variation of NSC. In a second part, we address the case of dynamic orders, i.e. some or all attributes of the dataset are nominal and each user expresses his/her own partial order on these attributes' domains. We propose highly scalable parallel algorithms that decompose an issued query into a set of sub-queries and process each sub-query independently. As a further optimization step, we propose the partial materialization of sub-queries and introduce the problem of cost-driven sub-query selection.
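For reference, here is a minimal, naive implementation of the skyline operator that this work optimizes (smaller is assumed better on every dimension); the NSC index, the streaming maintenance and the user-specific partial orders discussed above are not shown, and the example tuples are invented.

```python
# A naive reference implementation of the skyline operator: keep the tuples that
# are not dominated by any other tuple ("lower is better" on every dimension).

def dominates(p, q):
    """p dominates q if p is <= q everywhere and strictly < somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

hotels = [(50, 3.0), (80, 1.0), (60, 2.5), (90, 0.5), (55, 3.5)]  # (price, distance)
print(skyline(hotels))   # [(50, 3.0), (80, 1.0), (60, 2.5), (90, 0.5)]
```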
Guehis, Sonia. "Modélisation, production et optimisation des programmes SQL." Paris 9, 2009. https://bu.dauphine.psl.eu/fileviewer/index.php?doc=2009PA090076.
Full textPiat, Jonathan. "Modélisation flux de données et optimisation pour architecture multi-cœurs de motifs répétitifs." Phd thesis, INSA de Rennes, 2010. http://tel.archives-ouvertes.fr/tel-00564522.
Full textFernandez, Pernas Jesus. "Optimisation et automatisation du traitement informatique des données spectroscopiques cérébrales (proton simple volume)." Caen, 2002. http://www.theses.fr/2002CAEN3079.
Full textAbbas, Issam. "Optimisation d'un langage fonctionnel de requêtes pour une base de données orienté-objet." Aix-Marseille 1, 1999. http://www.theses.fr/1999AIX11003.
Full textBekara, Maïza. "Optimisation de critères de choix de modèles pour un faible nombre de données." Paris 11, 2004. http://www.theses.fr/2004PA112139.
In this work we propose a model selection criterion based on Kullback's symmetric divergence. The developed criterion, called KICc, is a bias-corrected version of the asymptotic criterion KIC (Cavanaugh, Statistics and Probability Letters, vol. 42, 1999). The correction is of particular use when the sample size is small or when the number of fitted parameters is a moderate to large fraction of the sample size. KICc is an exactly unbiased estimator for linear regression models and approximately unbiased for autoregressive and nonlinear regression models. The two criteria KIC and KICc are developed under the assumption that the true model is correctly specified or overfitted by the candidate models. We investigate the bias properties and the model selection performance of the two criteria in the underfitted case. An extension of KICc, called PKIC, is also developed for the case of future experiments where the data of interest are missing or indirectly observed. KICc is applied to solve the problem of denoising by using orthogonal projection and thresholding. The threshold is obtained as the absolute value of the kth largest coefficient that minimizes KICc. Finally, we propose a computational optimization of a cross-validation-based model selection criterion that uses the Bayesian predictive density as the candidate model and the marginal likelihood as a cost function. The developed criterion, CVBPD, is a consistent model selection criterion for linear regression.
Piat, Jonathan. "Modélisation flux de données et optimisation pour architecture multi-coeurs de motifs répétitifs." Rennes, INSA, 2010. https://tel.archives-ouvertes.fr/tel-00564522.
Since applications such as video coding/decoding or digital communications with advanced features are becoming more complex, the need for computational power is rapidly increasing. In order to satisfy software requirements, the use of parallel architectures is a common answer. To reduce the software development effort for such architectures, it is necessary to provide the programmer with efficient tools capable of automatically solving communication and software partitioning/scheduling concerns. The algorithm-architecture matching methodology helps the programmer by providing automatic transformation, partitioning and scheduling of an application for a given architecture. This methodology relies on an application model that allows the available parallelism to be extracted. The contributions of this thesis tackle both the problem of the model and the associated optimizations for parallelism extraction.
Ziane, Mikal, and François Bouillé. "Optimisation de requêtes pour un système de gestion de bases de données parallèle." Paris 6, 1992. http://www.theses.fr/1992PA066689.
Full textLu, Yanping. "Optimisation par essaim de particules application au clustering des données de grandes dimensions." Thèse, Université de Sherbrooke, 2009. http://savoirs.usherbrooke.ca/handle/11143/5112.
Full textMartinez, Medina Lourdes. "Optimisation des requêtes distribuées par apprentissage." Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM015.
Distributed data systems are becoming increasingly complex. They interconnect devices (e.g. smartphones, tablets, etc.) that are heterogeneous, autonomous, either static or mobile, and with physical limitations. Such devices run applications (e.g. virtual games, social networks, etc.) for the online interaction of users producing/consuming data on demand or continuously. The characteristics of these systems add new dimensions to the query optimization problem, such as multiple optimization criteria, scarce information on data, and the lack of a global system view, among others. Traditional query optimization techniques focus on semi-autonomous (or not at all autonomous) systems. They rely on information about data and make strong assumptions about the system behavior. Moreover, most of these techniques are centered on the optimization of execution time only. The difficulty of evaluating queries efficiently in today's applications motivates this work to revisit traditional query optimization techniques. This thesis faces these challenges by adapting the Case Based Reasoning (CBR) paradigm to query processing, providing a way to optimize queries when there is no prior knowledge of the data. It focuses on optimizing queries using cases generated from the evaluation of similar past queries. A query case comprises: (i) the query, (ii) the query plan and (iii) the measures (computational resources consumed) of the query plan. The thesis also concerns the way the CBR process interacts with the query plan generation process. This process uses classical heuristics and makes decisions randomly (e.g. when there are no statistics for join ordering and for the selection of algorithms or routing protocols). It also (re)uses cases (existing query plans) for similar query parts, improving the query optimization and therefore evaluation efficiency. The propositions of this thesis have been validated within the CoBRa optimizer developed in the context of the UBIQUEST project.
Tang, Zhao Hui. "Optimisation de requêtes avec l'expression de chemin pour les bases de données orientées objets." Versailles-St Quentin en Yvelines, 1996. http://www.theses.fr/1996VERS0009.
Full textCoveliers, Alexandre. "Sensibilité aux jeux de données de la compilation itérative." Paris 11, 2007. http://www.theses.fr/2007PA112255.
In the context of processor architecture design, the search for performance leads to a constant growth in architecture complexity. This growth in complexity has made it more difficult to exploit the potential performance of these architectures. To improve the exploitation of architecture performance, new optimization techniques based on dynamic behavior (i.e. run-time behavior) have been proposed. Iterative compilation is such an optimization approach. It makes it possible to determine more relevant transformations than those obtained by static analysis. The main drawback of this optimization method is that the information that leads to the code transformation is specific to a particular data set; the chosen optimizations therefore depend on the data set used during the optimization process. In this thesis, we study the performance variations of the optimized application according to the data set used, for two iterative code transformation techniques. We introduce different metrics to quantify this sensitivity. We also propose data set selection methods for choosing which data sets to use during the code transformation process. The selected data sets make it possible to obtain an optimized code with good performance on all the other available data sets.
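To make the idea concrete, here is a schematic sketch of an iterative-compilation loop: build the program with several candidate optimization configurations, time it on a training data set, and keep the best configuration. The benchmark source, input files and flag list are placeholders; the thesis's point is precisely that the winning configuration may change with the data set, so the selection should be re-validated on other inputs.

```python
# A schematic sketch of iterative compilation: try several optimization
# configurations, measure the resulting binary on a training data set, keep the
# best. "bench.c" and the input files are hypothetical placeholders.

import subprocess, time

CANDIDATE_FLAGS = ["-O1", "-O2", "-O3", "-O3 -funroll-loops"]

def build(flags):
    subprocess.run(f"gcc {flags} -o bench bench.c", shell=True, check=True)

def measure(input_file, runs=3):
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(["./bench", input_file], check=True)
        best = min(best, time.perf_counter() - start)
    return best

def search(training_input):
    timings = {}
    for flags in CANDIDATE_FLAGS:
        build(flags)
        timings[flags] = measure(training_input)
    return min(timings, key=timings.get), timings

# best_flags, _ = search("train_data.txt")
# The selected best_flags should then be re-validated on other data sets,
# which is precisely the sensitivity studied in this thesis.
```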
Delot, Thierry. "Interrogation d'annuaires étendus : modèles, langage et optimisation." Versailles-St Quentin en Yvelines, 2001. http://www.theses.fr/2001VERS0028.
Full textBen, Saad Myriam. "Qualité des archives web : modélisation et optimisation." Paris 6, 2011. http://www.theses.fr/2011PA066446.
Full textCollard, Martine. "Fouille de données, Contributions Méthodologiques et Applicatives." Habilitation à diriger des recherches, Université Nice Sophia Antipolis, 2003. http://tel.archives-ouvertes.fr/tel-01059407.
Full textDukan, Laurent. "Étude critique et optimisation d'un système d'acquisition analogique-numérique rapide à hautes performances." Paris 11, 1987. http://www.theses.fr/1987PA112043.
This work deals principally with the question of high-speed analog data acquisition. We constructed an acquisition system based on 2 cascaded A/D converters (each with a capacity of 50 MHz, 8 bits), with a sample rate of 100 MHz, a total accuracy of 8 bits, and a memory of 8 Kbytes. This system has a number of different functions, for example the "burst function", which consists in encoding only certain parts of the input signal, or a function that permits the advance programming of the total volume of the acquisition memory. The ensemble is subsequently connected to a computer specifically tailored for the system (IN110). In addition, we investigated the possibility of a high-speed sample-and-hold circuit, constructed in order to be connected to the input of the acquisition system. This circuit was designed to be more performant than what would have been strictly necessary for the connection: the aim was a sample rate of 120 MHz with a corresponding accuracy of 12 bits. The study of the entire system (the sample-and-hold circuit in conjunction with the analog data acquisition system) thus allowed for the development of new structures that made optimal use of the available technology.
Dupuis, Sophie. "Optimisation automatique des chemins de données arithmétiques par l’utilisation des systèmes de numération redondants." Paris 6, 2009. http://www.theses.fr/2009PA066131.
Full textLe, Hung-Cuong. "Optimisation d'accès au médium et stockage de données distribuées dans les réseaux de capteurs." Besançon, 2008. http://www.theses.fr/2008BESA2052.
Wireless sensor networks have been a very active research topic over the last few years. This technology can be applied in different domains such as the environment, industry, commerce, medicine, the military, etc. Depending on the application type, the problems and requirements might be different. In this thesis, we are interested in two major problems: medium access control and distributed data storage. The document is divided into two parts: the first part is a state of the art of different existing works, and the second part describes our contribution. In the first contribution, we have proposed two MAC protocols. The first one optimizes the wireless sensor network lifetime for surveillance applications, and the second one reduces the transmission latency in event-driven wireless sensor networks for critical applications. In the second contribution, we have worked with several data storage models in wireless sensor networks, focusing on the data-centric storage model. We have proposed a clustering structure for sensors to improve routing and reduce the number of transmissions in order to prolong the network lifetime.