Dissertations / Theses on the topic 'Analyse statistique de classement'
Consult the top 50 dissertations / theses for your research on the topic 'Analyse statistique de classement.'
Ouni, Zaïd. "Statistique pour l’anticipation des niveaux de sécurité secondaire des générations de véhicules." Thesis, Paris 10, 2016. http://www.theses.fr/2016PA100099/document.
Road safety is a worldwide, European, and French priority. Because light vehicles (or simply “vehicles”) are obviously among the main actors in road activity, improving road safety necessarily requires analyzing their characteristics in terms of road traffic accidents (or simply “accidents”). Although new vehicles are developed in engineering departments and validated in the laboratory, it is real-life accidents that ultimately characterize them in terms of secondary safety, i.e., that demonstrate the level of safety they offer their occupants in an accident. This is why car makers want to rank generations of vehicles according to their real-life levels of safety. We address this problem by exploiting a French data set of accidents called BAAC (Bulletin d’Analyse d’Accident Corporel de la Circulation). In addition, fleet data are used to associate a generational class (GC) with each vehicle. We develop two methods for ranking GCs in terms of secondary safety. The first yields contextual rankings, i.e., rankings of GCs in specified accident contexts. The second yields global rankings, i.e., rankings of GCs determined relative to a distribution of accident contexts. For the contextual ranking, we proceed by “scoring”: we look for a score function that associates a real number with any combination of a GC and an accident context; the smaller this number, the safer the GC in the given context. The optimal score function is estimated by “ensemble learning”, in the form of an optimal convex combination of scoring functions produced by a library of scoring-based ranking algorithms. An oracle inequality illustrates the performance of the resulting meta-algorithm. The global ranking is also based on “scoring”: we look for a scoring function that associates any GC with a real number; the smaller this number, the safer the GC. Causal arguments are used to adapt the above meta-algorithm by averaging out the context. The results of the two ranking procedures are in line with the experts’ expectations.
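The idea of an "optimal convex combination of scoring functions" described in the abstract above can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions, not the thesis's meta-algorithm: it assumes exactly two base scorers, a pairwise ranking loss, and a plain grid search over the weight. The function names `pairwise_ranking_loss` and `best_convex_combination` are hypothetical.

```python
from itertools import combinations


def pairwise_ranking_loss(scores, severities):
    """Fraction of pairs whose score order contradicts their severity order.

    Convention (as in the abstract): a smaller score means a safer case,
    so a lower severity should receive a lower score. Ties in score count
    as errors.
    """
    pairs = [(i, j) for i, j in combinations(range(len(scores)), 2)
             if severities[i] != severities[j]]
    if not pairs:
        return 0.0
    discordant = sum(
        1 for i, j in pairs
        if (scores[i] - scores[j]) * (severities[i] - severities[j]) <= 0
    )
    return discordant / len(pairs)


def best_convex_combination(base_scores, severities, steps=20):
    """Grid-search w in [0, 1] minimising the loss of w*s1 + (1 - w)*s2.

    Assumption: only two base scorers, so the convex weights are (w, 1 - w).
    Returns the first (w, loss) pair that attains the smallest loss.
    """
    s1, s2 = base_scores
    best = None
    for k in range(steps + 1):
        w = k / steps
        combined = [w * a + (1 - w) * b for a, b in zip(s1, s2)]
        loss = pairwise_ranking_loss(combined, severities)
        if best is None or loss < best[1]:
            best = (w, loss)
    return best


if __name__ == "__main__":
    # Hypothetical data: four cases with increasing severity and two
    # imperfect scorers that each misorder one pair.
    severities = [0, 1, 2, 3]
    scorer_1 = [0.1, 0.9, 0.5, 1.0]
    scorer_2 = [0.2, 0.1, 0.8, 0.9]
    print(best_convex_combination((scorer_1, scorer_2), severities))
```

In this toy data each scorer alone misorders one pair, while a suitable mixture orders all pairs correctly, which is the intuition behind combining a library of scorers.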
Paris, Nicolas. "Formalisation algorithmique des classements au tennis : mise en perspective longitudinale par simulation probabiliste." Bordeaux 2, 2008. http://www.theses.fr/2008BOR21603.
Sibony, Eric. "Analyse multirésolution de données de classements." Thesis, Paris, ENST, 2016. http://www.theses.fr/2016ENST0036/document.
This thesis introduces a multiresolution analysis framework for ranking data. Initiated in the 18th century in the context of elections, the analysis of ranking data has attracted major interest in many fields of the scientific literature: psychometry, statistics, economics, operations research, machine learning, and computational social choice, among others. It has been further revitalized by modern applications such as recommender systems, where the goal is to infer users' preferences in order to make them the best personalized suggestions. In these settings, users express their preferences only on small and varying subsets of a large catalog of items. The analysis of such incomplete rankings, however, poses both a great statistical and a great computational challenge, leading industrial actors to use methods that exploit only a fraction of the available information. This thesis introduces a new representation for the data, which by construction overcomes the two aforementioned challenges. Though it relies on results from combinatorics and algebraic topology, it shares several analogies with multiresolution analysis, offering a natural and efficient framework for the analysis of incomplete rankings. As it does not involve any assumption on the data, it already leads to estimators with superior performance in small-scale settings and can be combined with many regularization procedures for large-scale settings. For all these reasons, we believe that this multiresolution representation paves the way for a wide range of future developments and applications.
Martins, Da Cruz José Márcio. "Contribution au classement statistique mutualisé de messages électroniques (spam)." Phd thesis, École Nationale Supérieure des Mines de Paris, 2011. http://pastel.archives-ouvertes.fr/pastel-00637173.
Cruz, José Marcio Martins da. "Contribution au classement statistique mutualisé de messages électroniques (spam)." Paris, ENMP, 2011. https://pastel.archives-ouvertes.fr/pastel-00637173.
Since the 1990s, different machine learning methods have been investigated and applied to the email classification problem (spam filtering), with very good but not perfect results. It has always been considered that these methods are well suited to filtering messages for a single user, but not for a large set of users such as a community. Our approach was first to seek a better understanding of the data being handled, with the help of a corpus of real messages, before studying new algorithms. Using a logistic regression classifier with online active learning, we show empirically that a simple classification algorithm coupled with a learning strategy well adapted to the real context can achieve results as good as those obtained with more complex algorithms. We also show empirically, using messages from a small group of users, that the loss of efficiency is not very high when the classifier is shared by a group of users.
Bourdel, Ghislaine. "Les Sociétés de conseil : analyse, classement et prospective." Paris 1, 1992. http://www.theses.fr/1992PA010007.
Denoyer, Ludovic. "Apprentissage et inférence statistique dans les bases de documents structurés : application aux corpus de documents textuels." Paris 6, 2004. http://www.theses.fr/2004PA066087.
Durel, Marie. "Classement et analyse des brouillons de Madame Bovary de Gustave Flaubert." Rouen, 2000. http://www.theses.fr/2000ROUEL352.
Suprême, Hussein. "Analyse et classement des contingences d’un réseau électrique pour la stabilité transitoire." Mémoire, École de technologie supérieure, 2012. http://espace.etsmtl.ca/1100/1/SUPR%C3%8AME_Hussein.pdf.
Jourdan-Marias, Astrid. "Analyse statistique et échantillonage d'expériences simulées." Pau, 2000. http://www.theses.fr/2000PAUU1014.
Mahé, Cédric. "Analyse statistique de delais d'evenement correles." Paris 7, 1998. http://www.theses.fr/1998PA077254.
Maestracci, Olivier. "Analyse des critères de sélection et de classement des OPCVM : étude théorique et empirique." Aix-Marseille 3, 2004. http://www.theses.fr/2004AIX32003.
The validation of market efficiency theory has given rise to numerous empirical studies, which have contributed to the development of performance measurement models. The analysis of performance and of performance persistence makes it possible to judge the value added by active management. We followed three main lines of analysis. First, we compare OPCVM (mutual funds), with a critique of the models and tools proposed by professionals. We then examine what multi-factor analysis can contribute to a better representation of the different sources of mutual fund returns. Finally, we construct a model for selecting and ranking OPCVM adapted to investors' needs.
Lavialle, Olivier. "Décision sensorielle multicritère : classement de produits alimentaires soumis à des jugements hédoniques multicritères." Bordeaux 1, 1994. http://www.theses.fr/1994BOR10606.
Cigana, John. "Analyse statistique de sensibilité du modèle SANCHO." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ38667.pdf.
Célimène, Fred. "Analyse statistique et économétrique des DOM-TOM." Paris 10, 1985. http://www.theses.fr/1985PA100002.
Olivier, Adelaïde. "Analyse statistique des modèles de croissance-fragmentation." Thesis, Paris 9, 2015. http://www.theses.fr/2015PA090047/document.
This work is concerned with growth-fragmentation models, implemented for investigating the growth of a population of cells which divide according to an unknown splitting rate that depends on a structuring variable (age and size being the two paradigmatic examples). The mathematical framework includes statistics of processes, nonparametric estimation, and analysis of partial differential equations. The three objectives of this work are the following: to obtain a nonparametric estimate of the division rate (as a function of age or size) for different observation schemes (genealogical or continuous); to study the transmission of a biological feature from one cell to another and the feature of one typical cell; and to compare different populations of cells through their Malthus parameter, which governs the global growth (for instance, when introducing variability in the growth rate among cells).
Hassène, Belguith. "Analyse psycho-sociologique des équipes tunisiennes de football en fonction de leurs niveaux de classement." Doctoral thesis, Universite Libre de Bruxelles, 1998. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/211985.
Lassalle, Hugues. "Statistical physics for materials classification." Université Louis Pasteur (Strasbourg) (1971-2008), 2003. http://www.theses.fr/2003STR13103.
Genetic algorithms (GA) and clustering techniques are used to study and classify materials. An analysis of the convergence speed of GA is carried out using advanced probability theory and random walk concepts. The ground state of multicomponent alloys and of Ising models with long-range interactions is determined using a genetic algorithm. A new GA operator, the domain-flip, is introduced and its efficiency is compared to that of the traditional GA operators, crossover and mutation. The domain-flip operator destroys phase boundaries by flipping all bits of a given domain at the same time. This operator turns out to be crucial in extracting the system from low local minima, and its presence is therefore essential to speed up GA convergence. A study of GA convergence in its last stages, where all chromosomes present in the population are assumed to consist of two well-ordered domains, is performed using random walk theory and probability theory. Exact expressions are derived for the average time needed for at least one chromosome to find the ground state. The probability for two chromosomes to undergo a successful crossover, meaning that the result is the ground state, is also given. Finally, clustering techniques, which belong to the field of data mining, are applied to the classification of materials. An improved version of the widely used clustering algorithm, K-means, is developed. A comparison of the two clustering techniques on a two-dimensional data set shows that the guide-point approach is more powerful than the K-means algorithm. The guide-point algorithm is used successfully to partition a materials data set. This clustering extracts useful information from a data set for which no a priori knowledge was assumed.
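The domain-flip operator described in the abstract above ("flipping all bits of a given domain at the same time") can be sketched for a one-dimensional bit string. This is a hedged illustration only: it assumes that a domain is a maximal run of equal bits in a linear chain and that the domain to flip is chosen uniformly at random. Neither choice is confirmed by the abstract, and the names `find_domains` and `domain_flip` are hypothetical.

```python
import random


def find_domains(bits):
    """Return (start, end) pairs, end exclusive, of maximal runs of equal bits."""
    domains = []
    start = 0
    for i in range(1, len(bits) + 1):
        # Close the current run at the end of the list or when the bit changes.
        if i == len(bits) or bits[i] != bits[start]:
            domains.append((start, i))
            start = i
    return domains


def domain_flip(bits, rng=random):
    """Flip every bit of one randomly chosen domain and return a new list.

    Flipping a whole domain removes the boundaries it shares with its
    neighbours, which have the opposite bit value.
    """
    if not bits:
        return []
    start, end = rng.choice(find_domains(bits))
    return bits[:start] + [1 - b for b in bits[start:end]] + bits[end:]


if __name__ == "__main__":
    chromosome = [0, 0, 1, 1, 1, 0]
    print(find_domains(chromosome))
    print(domain_flip(chromosome, random.Random(0)))
```

A single-bit mutation would only move a phase boundary by one site, whereas this operator removes the boundary in one step, which is consistent with the role the abstract attributes to it.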
Goulard, Michel. "Champs spatiaux et statistique multidimensionnelle." Grenoble 2 : ANRT, 1988. http://catalogue.bnf.fr/ark:/12148/cb376138909.
Lacombe, Jean-Pierre. "Analyse statistique de processus de poisson non homogènes. Traitement statistique d'un multidétecteur de particules." Phd thesis, Grenoble 1, 1985. http://tel.archives-ouvertes.fr/tel-00318875.
Yousfi, Elqasyr Khadija. "MODÉLISATION ET ANALYSE STATISTIQUE DES PLANS D'EXPÉRIENCE SÉQUENTIELS." Phd thesis, Université de Rouen, 2008. http://tel.archives-ouvertes.fr/tel-00377114.
Guillaume, Jean-Loup. "Analyse statistique et modélisation des grands réseaux d'interactions." Phd thesis, Université Paris-Diderot - Paris VII, 2004. http://tel.archives-ouvertes.fr/tel-00011377.
The first part focuses on network analysis and offers a critical review of the networks studied and of the parameters introduced to better understand their structure. A number of these parameters are shared by most of the networks studied, which justifies studying them as a whole.
The second part, which forms the core of this thesis, deals with the modelling of large interaction networks, that is, the construction of artificial graphs similar to those encountered in practice. This involves first presenting the existing models, then introducing a model based on certain non-trivial properties that is simple enough for its properties to be studied formally, yet still realistic.
Finally, the third part is purely methodological. It presents the practical application of the preceding parts and the resulting contribution, based on three particular cases: a study of exchanges in a peer-to-peer network, a study of the robustness of networks to failures and attacks, and a set of simulations aimed at estimating the quality of the Internet maps currently in use.
This thesis highlights the need to continue work on large interaction networks and points to several promising directions, in particular the finer study of networks, whether weighted or dynamic, as well as the need to study many problems related to network metrology in order to capture their structure more precisely.
Ledauphin, Stéphanie. "Analyse statistique d'évaluations sensorielles au cours du temps." Phd thesis, Université de Nantes, 2007. http://tel.archives-ouvertes.fr/tel-00139887.
Over the past twenty years or so, time-intensity (TI) curves, which describe the evolution of a sensation over the course of an experiment, have become increasingly popular among practitioners of sensory analysis. The major difficulty in analyzing TI curves comes from a strong judge effect, which shows up as a signature specific to each judge. We propose a functional approach based on B-spline functions that reduces the judge effect by using a curve alignment procedure.
Other sensory data collected over time exist, such as the monitoring of the organoleptic degradation of food products. To study them, we propose modelling by hidden Markov chains, so that the progress of the degradation can then be visualized graphically.
Alsheh, Ali Maya. "Analyse statistique de populations pour l'interprétation d'images histologiques." Thesis, Sorbonne Paris Cité, 2015. http://www.theses.fr/2015PA05S001/document.
During the last decade, digital pathology has improved thanks to advances in image analysis algorithms and computing power. However, diagnosis from histopathology images by an expert remains the gold standard for a considerable number of diseases, especially cancer. This type of image preserves the tissue structures as close as possible to their living state. It thus makes it possible to quantify the biological objects and to describe their spatial organization in order to provide a more specific characterization of diseased tissues. The automated analysis of histopathological images can have three objectives: computer-aided diagnosis, disease grading, and the study and interpretation of the underlying disease mechanisms and their impact on biological objects. The main goal of this dissertation is first to understand and address the challenges associated with the automated analysis of histology images. It then aims to describe the populations of biological objects present in histology images and their relationships using spatial statistics, and to assess the significance of their differences according to the disease through statistical tests. After a color-based separation of the biological object populations, their locations are extracted automatically according to their types, which can be point or areal data. Distance-based spatial statistics for point data are reviewed, and an original function to measure the interactions between point and areal data is proposed. Since it has been shown that tissue texture is altered by the presence of a disease, local binary pattern methods are discussed, and an approach based on a modification of the image resolution to enhance their description is introduced. Finally, descriptive and inferential statistics are applied in order to interpret the extracted features and to study their discriminative power in the application context of animal models of colorectal cancer.
This work advocates measuring associations between different types of biological objects to better understand and compare the underlying mechanisms of diseases and their impact on the tissue structure. In addition, our experiments confirm that texture information plays an important part in differentiating two implemented models of the same disease.
Duvernet, Laurent. "Analyse statistique des processus de marche aléatoire multifractale." Phd thesis, Université Paris-Est, 2010. http://tel.archives-ouvertes.fr/tel-00567397.
Gautier, Christian. "Analyse statistique et évolution des séquences d'acides nucléiques." Grenoble 2 : ANRT, 1987. http://catalogue.bnf.fr/ark:/12148/cb37605346q.
Garoche, Pierre-Loïc. "Analyse statistique d'un calcul d'acteurs par interprétation abstraite." Toulouse, INPT, 2008. http://ethesis.inp-toulouse.fr/archive/00000629/.
The Actor model, introduced by Hewitt and Agha in the late 80s, describes a concurrent communicating system as a set of autonomous agents with non-uniform interfaces that communicate using labeled messages. The CAP process calculus, proposed by Colaço, is based on this model and makes it possible to describe non-trivial realistic systems without the need for complex encodings. CAP is a higher-order calculus: messages can carry actor behaviors. Several works address the analysis of CAP properties, mainly through inference-based type systems using behavioral types and sub-typing. More recent works, by Venet and later Feret, propose the use of abstract interpretation to analyze process calculi. These approaches make it possible to compute non-uniform properties. For example, they are able to differentiate recursive instances of the same thread. This thesis is at the crossroads of these two approaches, applying abstract interpretation to the analysis of CAP. Following the framework of Feret, CAP is first expressed in a non-standard form that eases its analysis. The set of reachable states is then over-approximated via a sound-by-construction representation within existing abstract domains. [. . . ]
Gautier, Christian. "Analyse statistique et évolution des séquences d'acides nucléiques." Lyon 1, 1987. http://www.theses.fr/1987LYO19034.
Dupuis, Jérôme. "Analyse statistique bayesienne de modèles de capture-recapture." Paris 6, 1995. http://www.theses.fr/1995PA066077.
Larrere, Guy. "Contribution à l'étude asymptotique en analyse statistique multivariée." Pau, 1994. http://www.theses.fr/1994PAUU3026.
Vu, Thi Lan Huong. "Analyse statistique locale de textures browniennes multifractionnaires anisotropes." Thesis, Aix-Marseille, 2019. http://www.theses.fr/2019AIXM0094.
We deal with some anisotropic extensions of multifractional Brownian fields that account for spatial phenomena whose properties of regularity and directionality may both vary in space. Our aim is to set up statistical tests to decide whether an observed field of this kind is heterogeneous or not. The statistical methodology relies upon a field analysis by quadratic variations, which are averages of squared field increments. Specific to our approach, these variations are computed locally in several directions. We establish an asymptotic result showing a linear Gaussian relationship between these variations and parameters related to the regularity and directional properties of the model. Using this result, we then design a test procedure based on Fisher statistics of linear Gaussian models. We then evaluate this procedure on simulated data. Next, we design some algorithms for the segmentation of an image into regions of homogeneous texture. The first algorithm is based on a K-means procedure which takes estimated parameters as input and takes into account their theoretical probability distributions. The second is based on an EM algorithm, which iterates a two-step loop, (E) and (M); the values found in (E) and (M) at each iteration are used for the calculations in the next one. Finally, we present an application of these algorithms in the context of a multidisciplinary project which aims at optimizing the deployment of photovoltaic panels on the ground. We deal with a preprocessing step of the project which concerns the segmentation of images from the Sentinel-2 satellite into regions where the cloud cover is homogeneous.
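The abstract above defines quadratic variations as averages of squared field increments, computed locally and in several directions. The following minimal sketch shows that computation on a small grid. It assumes first-order increments between a pixel and its neighbour at a fixed offset, and a square window; the thesis may use different increments or windows, and the function name `local_quadratic_variation` is hypothetical.

```python
def local_quadratic_variation(field, top, left, size, direction):
    """Average squared increment of `field` over a square window.

    field     : list of rows (list of lists of numbers)
    top, left : upper-left corner of the window
    size      : side length of the window
    direction : (d_row, d_col) offset defining the increment direction

    Increments whose end point falls outside the field are skipped.
    """
    d_row, d_col = direction
    n_rows, n_cols = len(field), len(field[0])
    total = 0.0
    count = 0
    for r in range(top, top + size):
        for c in range(left, left + size):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r < n_rows and 0 <= c < n_cols and 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                increment = field[r2][c2] - field[r][c]
                total += increment * increment
                count += 1
    if count == 0:
        raise ValueError("window contains no valid increment")
    return total / count


if __name__ == "__main__":
    # Hypothetical anisotropic field: values change along columns only.
    ramp = [[float(c) for c in range(4)] for _ in range(4)]
    for direction in [(0, 1), (1, 0), (1, 1)]:
        print(direction, local_quadratic_variation(ramp, 0, 0, 3, direction))
```

Comparing the values obtained for several directions in the same window gives a crude indication of local anisotropy, which is the kind of information the tests described in the abstract are built on.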
Douspis, Marian. "Analyse statistique des anisotropies du fond diffus cosmologique." Toulouse 3, 2000. http://www.theses.fr/2000TOU30185.
Elqasyr, Khadija. "Modélisation et analyse statistique des plans d’expérience séquentiels." Rouen, 2008. http://www.theses.fr/2008ROUES023.
This thesis consists of two distinct parts. The first part concerns the study of sequential experimental designs applied to clinical trials. We study the modelling of these designs and develop a generalization of the "play-the-winner" rule. Theoretical and numerical results show that these designs perform better than recently developed designs, set in the framework of Freedman's urn models, which generalize the "randomized play-the-winner" rule or a modified version of this rule. In the second part, we develop inference methods for analyzing the data from the sequential designs considered. In the case of two treatments, and for the "play-the-winner" rule, we make explicit the sampling distributions and their factorial moments. We derive frequentist inference procedures (tests and conditional confidence intervals) and Bayesian methods for these designs. In the Bayesian framework, for a family of appropriate priors, we obtain the posterior distributions and the credible intervals for the relevant parameters, as well as the predictive distributions. The link between conditional tests and Bayesian procedures is made explicit. The Bayesian methods are generalized to cover more complex designs (several treatments and delayed responses). Non-informative Bayesian procedures have remarkable frequentist properties.
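For readers unfamiliar with the rule named in the abstract above, here is a minimal simulation of the classic two-treatment play-the-winner allocation: after a success the next patient receives the same treatment, after a failure the other one. This is a sketch of the standard rule only, assuming immediate responses. It is not the generalization developed in the thesis, and the function name and the success probabilities are hypothetical.

```python
import random


def play_the_winner(success_prob, n_patients, first="A", rng=random):
    """Simulate the classic play-the-winner rule for treatments "A" and "B".

    success_prob : dict mapping "A" and "B" to a success probability
    n_patients   : number of patients allocated sequentially
    first        : treatment given to the first patient
    Returns a list of (treatment, success) pairs in allocation order.
    """
    other = {"A": "B", "B": "A"}
    treatment = first
    history = []
    for _ in range(n_patients):
        success = rng.random() < success_prob[treatment]
        history.append((treatment, success))
        if not success:
            # A failure sends the next patient to the other treatment.
            treatment = other[treatment]
    return history


if __name__ == "__main__":
    trial = play_the_winner({"A": 0.7, "B": 0.4}, 20, rng=random.Random(1))
    n_a = sum(1 for treatment, _ in trial if treatment == "A")
    print("patients on A:", n_a, "patients on B:", len(trial) - n_a)
```

Because each treatment keeps being used until it fails, the better treatment tends to receive more patients, which is why the allocation counts themselves are random and require the dedicated sampling distributions discussed in the abstract.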
Romefort, Dominique Villedieu. "Analyse statistique des circuits intégrès : caractérisation des modèles." Toulouse 3, 1990. http://www.theses.fr/1990TOU30087.
Aubert, Julie. "Analyse statistique de données biologiques à haut débit." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS048/document.
The technological progress of the last twenty years has allowed the emergence of high-throughput biology based on large-scale data obtained automatically. Statisticians have an important role to play in the modelling and analysis of these data, which are numerous, noisy, sometimes heterogeneous, and collected at various scales. This role can take several forms. The statistician can propose new concepts, or new methods inspired by the questions this biology raises. He or she can propose a fine modelling of the phenomena observed by means of these technologies. And when methods exist and require only an adaptation, the statistician can act as an expert who knows the methods, their limits, and their advantages.
In the first part, I introduce different methods developed with my co-authors for the analysis of high-throughput biological data, based on latent variable models. These models make it possible to explain an observed phenomenon using hidden, or latent, variables. The simplest latent variable model is the mixture model. The first two methods presented are two examples: the first in a context of multiple testing, and the second in the framework of defining a hybridization threshold for data derived from microarrays. I also present a model of coupled hidden Markov chains for detecting copy number variations in genomics, taking into account the dependence between individuals, due for example to genetic proximity. For this model we propose an approximate inference based on a variational approximation, since exact inference becomes intractable as the number of individuals increases. We also define a latent block model, which models an underlying block structure of rows and columns, adapted to count data from microbial ecology.
Metabarcoding and metagenomic data correspond to the abundance of each microorganism in a microbial community within its environment (plant rhizosphere, human digestive tract, or ocean, for example). These data have the particularity of showing a stronger dispersion than expected under the most conventional models (we speak of over-dispersion). Biclustering is a way to study the interactions between the structure of microbial communities and the biological samples from which they are derived. We proposed to model this phenomenon using a Poisson-Gamma distribution and developed another variational approximation for this particular latent block model, as well as a model selection criterion. The model's flexibility and performance are illustrated on three real datasets.
The second part is devoted to the analysis of transcriptomic data derived from DNA microarrays and RNA sequencing. The first section is devoted to the normalization of data (detection and correction of technical biases) and presents two new methods that I proposed with my co-authors, as well as a comparison of methods to which I contributed. The second section, devoted to experimental design, presents a method for analyzing the so-called dye-switch design.
In the last part, I show through two examples of collaboration, derived respectively from an analysis of differentially expressed genes from microarray data and an analysis of the translatome in sea urchins from RNA-sequencing data, how statistical skills are mobilized and the added value that statistics brings to genomics projects.
Kollia, Aikaterini. "Analyse statistique de la diversité en anthropometrie tridimensionnelle." Thesis, Lyon, 2016. http://www.theses.fr/2016EMSE0812.
Anthropometry is the scientific field that studies human body dimensions (from the Greek άνθρωπος (human) + μέτρον (measure)). Anthropometric analysis is currently based on 1D measurements (head circumference, length, etc.). However, the body's morphological complexity requires 3D analysis, which has become possible thanks to recent progress in 3D scanners. The objective of this study is to compare the anthropometry of populations and to use the results to adapt sporting goods to users' morphology. For this purpose, 3D measurement campaigns were carried out worldwide, and automated processing algorithms were created to analyze the subjects' point clouds. Based on image processing methods and on shape geometry, these algorithms detect anatomical landmarks, calculate 1D measurements, align subjects, and create representative 3D anthropometric models. In order to analyze morphological characteristics, different statistical methods, including component analysis, were adapted for use in 3D space. The methods were applied to three body parts: the foot, the head, and the bust. The morphological differences between and within populations were studied. For example, the difference at each point of the head between Chinese and European heads was calculated. Statistics in three dimensions also made it possible to show the asymmetry of the head. The method for creating anthropometric models is better suited to our applications than the methods used in the literature. Analysis in three dimensions can give results that are not visible in 1D analyses. The knowledge gained in this thesis is used in the design of different products that are sold in DECATHLON stores around the world.
Gerville-Réache, Léo. "Analyse statistique de modèles probabilistes appliqués aux processus sociaux." Bordeaux 1, 1998. http://www.theses.fr/1998BOR10606.
Gonzalez, Ignacio Baccini Alain Leon José. "Analyse canonique régularisée pour des données fortement multidimensionnelles." Toulouse (Université Paul Sabatier, Toulouse 3), 2008. http://thesesups.ups-tlse.fr/99.
Salem, André. "Méthodes de la statistique textuelle." Paris 3, 1993. http://www.theses.fr/1994PA030010.
Methods for textual statistics, a multidisciplinary work, presents a critical overview of statistical studies on vocabulary. The first part is devoted to the definition of textual units and to the adaptation of a set of statistical methods (mainly multidimensional statistical methods) to textual studies. This set of lexicometric methods has also been used in various fields dealing with textual data. Beyond the diversity of the domains, lexicometric methods reveal contrasts between the distributions of forms and of repeated segments throughout the texts. These contrasts found a pertinent interpretation in each case. Numerous studies performed on chronological textual series show the importance of one and the same phenomenon: the qualitative and quantitative evolution of vocabulary over time. Taking the time variable into account leads to a better characterization of the successive time periods, or groups of periods, based on the vocabulary they use. Coefficients calculated from the distribution of textual units (forms and repeated segments) across the different periods of the corpus make it possible to compare the empirical periodizations resulting from the chronological analysis of the lexical material with a priori periodizations based on important dates of the period covered by the corpus.
Peyre, Julie. "Analyse statistique des données issues des biopuces à ADN." Phd thesis, Université Joseph Fourier (Grenoble), 2005. http://tel.archives-ouvertes.fr/tel-00012041.
In the first chapter, we study the problem of data normalization, whose objective is to eliminate spurious variation between the samples of the populations so as to retain only the variation explained by biological phenomena. We present several existing methods, for which we propose improvements. To guide the choice of a normalization method, a method for simulating microarray data is developed.
In the second chapter, we address the problem of detecting genes that are differentially expressed between two series of experiments. This reduces to a multiple hypothesis testing problem. Several approaches are considered: model selection and penalization, an FDR method based on a wavelet decomposition of the test statistics, and Bayesian thresholding.
In the last chapter, we consider supervised classification problems for microarray data. To address the "curse of dimensionality", we developed a semi-parametric dimension reduction method based on maximizing a local likelihood criterion in single-index generalized linear models. The dimension reduction step is then followed by a local polynomial regression step to perform the supervised classification of the individuals considered.
Vatsiou, Alexandra. "Analyse de génétique statistique en utilisant des données pangénomiques." Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAS002/document.
The complex phenotypes observed nowadays in human populations are determined by genetic as well as environmental factors. For example, nutrition and lifestyle play important roles in the development of multifactorial diseases such as obesity and diabetes. Adaptation of such complex phenotypic traits may occur via allele frequency shifts at multiple loci, a phenomenon known as polygenic selection. Recent advances in statistical approaches and the emergence of high-throughput Next Generation Sequencing data have enabled the detection of such signals. Here we aim to understand the extent to which environmental changes lead to shifts in selective pressures, as well as the impact of those shifts on disease susceptibility. To achieve that, we propose a gene set enrichment analysis using SNP selection scores, which quantify the selection pressure on SNPs and can be derived from genome-scan methods. Initially we carry out a sensitivity analysis to investigate which of the recent genome-scan methods accurately identify the selected region. A simulation approach was used to assess their performance under a wide range of complex demographic structures under both hard and soft selective sweeps. Then, we develop SEL-GSEA, a tool to identify pathways enriched for evolutionary pressures, which is based on SNP data. Finally, to examine the effect of potential environmental changes that could represent changes in selection pressures, we apply SEL-GSEA as well as Gowinda, an available online tool, on a population-based study. We analyzed three populations (Africans, Europeans and Asians) from the HapMap database. To acquire the SNP selection scores that are the basis for SEL-GSEA, we used a combination of two genome scan methods (iHS and XPCLR) that performed the best in our sensitivity analysis.
The results of our analysis show extensive selection pressures on immune-related pathways, mainly in the African population, as well as on the glycolysis and gluconeogenesis pathway in Europeans, which is related to metabolism and diabetes.
Guerineau, Lise. "Analyse statistique de modèles de fiabilité en environnement dynamique." Lorient, 2013. http://www.theses.fr/2013LORIS297.
We propose models which integrate time-varying stresses for assessing the reliability of the electrical network. Our approach is based on observation of the network and consists of statistical and probabilistic modelling of failure occurrence. The great flexibility allowed by the piecewise exponential distribution makes it appropriate for modelling the time to failure of a component under varying environmental conditions. We study properties of this distribution and carry out statistical inference for different observation schemes. Models relating component reliability to environmental constraints, and relying on the piecewise exponential distribution, are proposed. Maximum likelihood estimation is assessed on both simulated and real data sets. Then, we consider a multi-component system whose evolution is linked with the corrective maintenance performed. The reliability of this system can be described using stochastic processes. We present inference methods according to the nature of the observation. Discrete observation can be formulated in terms of missing data; the EM algorithm is used to obtain estimates in this situation. Stochastic versions of this algorithm have been considered to overcome a possible combinatorial explosion that would prevent implementation of the EM algorithm. Numerical examples are presented for the proposed algorithms.
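To illustrate the piecewise exponential distribution named in the abstract above, here is a minimal Python sketch. It is not the thesis's own code. It assumes a hazard rate that is constant on each interval between breakpoints, so that the survival function is S(t) = exp(-H(t)), where H is the cumulative hazard. The function names, breakpoints, and rates are hypothetical and chosen only for illustration.

```python
import math
import random


def cumulative_hazard(t, breakpoints, rates):
    """Cumulative hazard H(t) for a piecewise-constant hazard.

    breakpoints: increasing interval starts, beginning at 0.0 (hypothetical).
    rates: rates[j] is the hazard on [breakpoints[j], breakpoints[j + 1]);
           the last rate applies from breakpoints[-1] onwards.
    """
    total = 0.0
    for j, rate in enumerate(rates):
        start = breakpoints[j]
        end = breakpoints[j + 1] if j + 1 < len(breakpoints) else math.inf
        if t <= start:
            break
        total += rate * (min(t, end) - start)
    return total


def survival(t, breakpoints, rates):
    """Probability that the component is still working at time t."""
    return math.exp(-cumulative_hazard(t, breakpoints, rates))


def sample_failure_time(breakpoints, rates, rng=random):
    """Draw a time to failure by solving H(t) = E with E ~ Exp(1)."""
    target = rng.expovariate(1.0)
    for j, rate in enumerate(rates):
        start = breakpoints[j]
        end = breakpoints[j + 1] if j + 1 < len(breakpoints) else math.inf
        if rate > 0:
            mass = rate * (end - start)  # hazard accumulated on this interval
            if target <= mass:
                return start + target / rate
            target -= mass
    raise ValueError("the last hazard rate must be positive")
```

For example, `survival(2.0, [0.0, 1.0, 3.0], [0.5, 2.0, 0.1])` evaluates exp(-(0.5 * 1 + 2.0 * 1)), reflecting a component that spends one time unit under a mild stress and one under a harsher one.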
Meddeb, Ali. "Analyse théorique et statistique du phénomène de l'émergence financière." Montpellier 1, 1999. http://www.theses.fr/1999MON10031.
Zabalza-Mezghani, Isabelle. "Analyse statistique et planification d'expérience en ingénierie de réservoir." Pau, 2000. http://www.theses.fr/2000PAUU3009.
Colin, Pascal. "Analyse statistique d'images de fluorescence dans des jets diphasiques." Rouen, 1998. http://www.theses.fr/1998ROUES069.
Marchaland, Catherine. "Analyse statistique d'un tableau de notes : comparaisons d'analyses factorielles." Paris 5, 1987. http://www.theses.fr/1987PA05H123.
Jaunâtre, Kévin. "Analyse et modélisation statistique de données de consommation électrique." Thesis, Lorient, 2019. http://www.theses.fr/2019LORIS520.
In October 2014, the French Environment & Energy Management Agency (ADEME), together with the ENEDIS company, started a research project named SOLENN ("SOLidarité ENergie iNovation") with multiple objectives, such as studying the control of electricity consumption by monitoring households and securing the electricity supply. The SOLENN project was led by ADEME and took place in Lorient, France. The main goal of this project is to improve households' knowledge about saving electrical energy. In this context, we describe a method to estimate extreme quantiles and probabilities of rare events, which is implemented in an R package. Then, we propose an extension of the well-known Cox proportional hazards model which allows the estimation of the probabilities of rare events. Finally, we apply some of the statistical models developed in this document to electricity consumption data sets that were useful for the SOLENN project. A first application is linked to the electric constraint program directed by ENEDIS in order to secure the electrical network. Households are subjected to a reduction of their maximum power for a short period of time, and the goal is to study how they behave during this period. A second application concerns the use of the multiple regression model to study the effect of individual visits on electricity consumption. The goal is to study the impact on electricity consumption during the week or the month following a visit.
Huang, Weibing. "Dynamique des carnets d’ordres : analyse statistique, modélisation et prévision." Thesis, Paris 6, 2015. http://www.theses.fr/2015PA066525/document.
This thesis is made of two connected parts, the first one about limit order book modeling and the second one about tick value effects. In the first part, we present our framework for Markovian order book modeling. The queue-reactive model is first introduced, in which we revise the traditional zero-intelligence approach by adding state dependency in the order arrival processes. An empirical study shows that this model is very realistic and reproduces many interesting microscopic features of the underlying asset, such as the distribution of the order book. We also demonstrate that it can be used as an efficient market simulator, allowing for the assessment of complex placement tactics. We then extend the queue-reactive model to a general Markovian framework for order book modeling. Ergodicity conditions are discussed in detail in this setting. Under some rather weak assumptions, we prove the convergence of the order book state towards an invariant distribution and that of the rescaled price process to a standard Brownian motion. In the second part of this thesis, we study the role played by the tick value at both microscopic and macroscopic scales. First, an empirical study of the consequences of a tick value change is conducted using data from the 2014 Japanese tick size reduction pilot program. A prediction formula for the effects of a tick value change on trading costs is derived and successfully tested. Then, an agent-based model is introduced in order to explain the relationships between market volume, price dynamics, bid-ask spread, tick value and the equilibrium order book state.
Zreik, Rawya. "Analyse statistique des réseaux et applications aux sciences humaines." Thesis, Paris 1, 2016. http://www.theses.fr/2016PA01E061/document.
Over the last two decades, network structure analysis has experienced rapid growth and has found applications in many fields, such as communication networks, financial transaction networks, gene regulatory networks, disease transmission networks, and mobile telephone networks. Social networks are now commonly used to represent the interactions between groups of people; for instance, we ourselves, our professional colleagues, and our friends and family are often part of online networks such as Facebook, Twitter, and email. In a network, many factors can exert influence or make analyses easier to understand. Among these, we find two important ones: the time factor and the network context. The former involves the evolution of connections between nodes over time. The network context can be characterized by different types of information, such as text messages (emails, tweets, Facebook posts, etc.) exchanged between nodes, categorical information on the nodes (age, gender, hobbies, status, etc.), interaction frequencies (e.g., number of emails sent or comments posted), and so on. Taking these factors into consideration can lead to the capture of increasingly complex and hidden information from the data. The aim of this thesis is to define new models for graphs which take into consideration the two factors mentioned above, in order to develop the analysis of network structure and allow extraction of the hidden information from the data. These models aim at clustering the vertices of a network depending on their connection profiles and network structures, which are either static or dynamically evolving. The starting point of this work is the stochastic block model, or SBM. This is a mixture model for graphs which was originally developed in the social sciences. It assumes that the vertices of a network are spread over different classes, so that the probability of an edge between two vertices depends only on the classes they belong to.
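The stochastic block model assumption described in the abstract above can be sketched in a few lines of Python. This is an illustrative sampler, not code from the thesis. As a simplifying assumption, class memberships are fixed in advance through hypothetical class sizes rather than drawn at random, and the graph is undirected without self-loops. The names `sample_sbm`, `class_sizes`, and `edge_probs` are invented for this example.

```python
import random


def sample_sbm(class_sizes, edge_probs, seed=None):
    """Sample an undirected graph from a stochastic block model.

    class_sizes: number of vertices in each class (hypothetical values).
    edge_probs: symmetric matrix; edge_probs[k][l] is the probability of an
                edge between a vertex of class k and a vertex of class l.
    Returns (labels, edges), where labels[i] is the class of vertex i and
    edges is a list of pairs (i, j) with i < j.
    """
    rng = random.Random(seed)
    # Assign vertices to classes in order: class 0 first, then class 1, etc.
    labels = [k for k, size in enumerate(class_sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends only on the two classes involved.
            if rng.random() < edge_probs[labels[i]][labels[j]]:
                edges.append((i, j))
    return labels, edges
```

Calling `sample_sbm([30, 20], [[0.5, 0.05], [0.05, 0.4]], seed=0)` would produce a graph with two densely connected groups and few links between them, which is the kind of structure that clustering under an SBM tries to recover.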
Tessier, Alexandre Oliver. "Bloc batterie li-ion pour véhicules électriques : méthode de classement novatrice en temps réel des paramètres électriques des cellules." Mémoire, Université de Sherbrooke, 2015. http://hdl.handle.net/11143/8026.