Dissertations / Theses on the topic 'Analyse et statistique spatiale des données'
Consult the top 50 dissertations / theses for your research on the topic 'Analyse et statistique spatiale des données.'
Gharbi, Zied. "Contribution à l’économétrie spatiale et l’analyse de données fonctionnelles." Thesis, Lille 1, 2019. http://www.theses.fr/2019LIL1A012/document.
This thesis covers two important fields of research in inferential statistics: spatial econometrics and functional data analysis. More precisely, we focus on the analysis of real spatial or spatio-functional data by extending certain inferential methods to account for possible spatial dependence. We first consider the estimation of a spatial autoregressive (SAR) model with a functional explanatory variable and a real-valued response variable, using observations on a given geographical unit. This is a regression model with the specificity that each observation of the dependent variable collected at a geographical location depends on observations of the same variable at neighboring locations. This relationship between neighbors is generally encoded by a square matrix called the spatial weighting matrix, which measures the interaction effect between neighboring spatial units. This matrix is assumed to be exogenous, i.e. the metric used to construct it does not depend on the explanatory variable. The contribution of this thesis to this model lies in the fact that the explanatory variable is functional, taking values in an infinite-dimensional space. Our estimation methodology is based on a dimension reduction of the functional explanatory variable through functional principal component analysis, followed by maximization of the truncated likelihood of the model. We study asymptotic properties of the estimators, illustrate their performance in a Monte Carlo study, and present an application to real environmental data. In the second contribution, we revisit the functional SAR model studied in the first part, now with an endogenous spatial weighting matrix. Instead of using a geographical criterion to calculate the dependencies between neighboring locations, we calculate them via an endogenous process, i.e. one that depends on explanatory variables. We apply the same two-step estimation approach described above and study the performance of the proposed estimator in finite samples and asymptotically. In the third part of this thesis, we focus on heteroskedasticity in partially linear models with real-valued exogenous variables and a binary response variable. We propose a spatial probit model containing a nonparametric part. Spatial dependence is introduced through the errors (disturbances) of the model. The estimation of the parametric and nonparametric parts of the model is recursive: the parametric component is first held fixed and the nonparametric part is estimated by the weighted likelihood method; this estimate is then used to construct a profile likelihood from which the parametric part is estimated. The performance of the proposed method is investigated in a Monte Carlo study. An empirical study of the relationship between economic growth and environmental quality in Sweden, using spatial econometric tools, concludes the document.
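The functional SAR model described in this abstract can be written compactly. The following is only a sketch in notation chosen here for illustration: the symbols $\rho$, $w_{ij}$, $\theta$, $\mathcal{T}$, $\xi_{ik}$, $\varphi_k$ and the truncation level $p$ are assumptions and are not taken from the thesis.

```latex
% SAR model with a functional explanatory variable X_i(.) and a real response Y_i
Y_i = \rho \sum_{j=1}^{n} w_{ij}\, Y_j + \int_{\mathcal{T}} X_i(t)\,\theta(t)\,dt + \varepsilon_i,
\qquad i = 1, \dots, n.

% Dimension reduction: expand X_i and \theta on the first p eigenfunctions \varphi_k
X_i(t) \approx \sum_{k=1}^{p} \xi_{ik}\,\varphi_k(t), \quad
\theta(t) \approx \sum_{k=1}^{p} \beta_k\,\varphi_k(t)
\;\Longrightarrow\;
Y_i \approx \rho \sum_{j=1}^{n} w_{ij}\, Y_j + \sum_{k=1}^{p} \xi_{ik}\,\beta_k + \varepsilon_i .
```

Here $w_{ij}$ denotes the entries of the spatial weighting matrix and $\rho$ the spatial autoregressive parameter. Because the eigenfunctions are orthonormal, the integral reduces to a finite sum over principal component scores, and the truncated model is an ordinary finite-dimensional SAR regression whose likelihood can be maximized.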
Ahmed, Mohamed Salem. "Contribution à la statistique spatiale et l'analyse de données fonctionnelles." Thesis, Lille 3, 2017. http://www.theses.fr/2017LIL30047/document.
This thesis is about statistical inference for spatial and/or functional data. We are interested in the estimation of unknown parameters of some models from random or non-random (stratified) samples composed of independent or spatially dependent variables. The specificity of the proposed methods lies in the fact that they take into account the nature of the sample considered (stratified or spatial). We begin by studying data valued in an infinite-dimensional space, so-called "functional data". First, we study a functional binary choice model in a case-control or choice-based sample design context. The specificity of this study is that the proposed method takes into account the sampling scheme. We describe a conditional likelihood function under the sampling distribution and a dimension reduction strategy to define a feasible conditional maximum likelihood estimator of the model. Asymptotic properties of the proposed estimates as well as their application to simulated and real data are given. Secondly, we explore a functional linear spatial autoregressive model whose particularity lies in the functional nature of the explanatory variable and the structure of the spatial dependence. The estimation procedure consists of reducing the infinite dimension of the functional variable and maximizing a quasi-likelihood function. We establish the consistency and asymptotic normality of the estimator. The usefulness of the methodology is illustrated via simulations and an application to some real data. In the second part of the thesis, we address some estimation and prediction problems for real-valued random spatial variables. We start by generalizing the k-nearest neighbors method (k-NN) to predict a spatial process at non-observed locations using some covariates. The specificity of the proposed k-NN predictor lies in the fact that it is flexible and allows for some heterogeneity in the covariate. We establish the almost complete convergence, with rates, of the spatial predictor, whose performance is illustrated by an application to simulated and environmental data. In addition, we generalize the partially linear probit model for independent data to the spatial case. We use a linear process for the disturbances, allowing various spatial dependencies, and propose a semiparametric estimation approach based on weighted likelihood and the generalized method of moments. We establish the consistency and asymptotic distribution of the proposed estimators and investigate their finite sample performance on simulated data. We end with an application of spatial binary choice models to identify UADT (upper aerodigestive tract) cancer risk factors in the north region of France, which displays the highest rates of such cancer incidence and mortality in the country.
Ollier, Sébastien. "Des outils pour l'intégration des contraintes spatiales, temporelles et évolutives en analyse des données écologiques." Lyon 1, 2004. http://www.theses.fr/2004LYO10293.
Goulard, Michel. "Champs spatiaux et statistique multidimensionnelle." Grenoble 2 : ANRT, 1988. http://catalogue.bnf.fr/ark:/12148/cb376138909.
Toupin, Marie-Hélène. "La copule khi-carré et son utilisation en statistique spatiale et pour la modélisation de données multidimensionnelles." Doctoral thesis, Université Laval, 2017. http://hdl.handle.net/20.500.11794/27977.
This thesis studies the properties of the family of chi-square copulas. This family is a generalization of the multidimensional normal copulas, obtained by squaring the components of a normal random vector. These copulas are indexed by a correlation matrix and by a shape parameter. This thesis shows how this family can be used to perform spatial interpolation and to model multidimensional data. First, the usefulness of this class of dependence structures is demonstrated with an application in spatial statistics. An important problem in that context is to predict the value of a stationary random field at a position where it has not been observed. This thesis shows how to construct such predictions using spatial models based on copulas, with a focus on the family of chi-square copulas. One first assumes that the correlation matrix has a standard parametric form, such as that of Matérn, indexed by an unknown parameter associated with the strength of the spatial association. This parameter is first estimated using a composite pseudo-likelihood constructed from the bivariate distributions of the observed data. Then, a spatial interpolation method using the ranks of the observations is suggested to approximate the best prediction of the random field at an unobserved position under a chi-square copula. In a second work, the fundamental properties of the chi-square copulas are studied in detail. This family allows a lot of flexibility in modeling multidimensional data. In the bivariate case, this family is adapted to symmetric and asymmetric dependence structures. In larger dimensions, the shape parameter controls the degree of radial asymmetry of the two-dimensional marginal distributions. Parameter estimation procedures for the correlation matrix and the shape parameter are compared under independent and identically distributed repetitions. 
Finally, the formulas of the conditional expectation for the best prediction in a spatial context are established. Goodness-of-fit tests for the family of chi-square copulas are then developed. These new tests can be applied to data in any dimension. These procedures are based on two association measures based on the ranks of the observations, which avoids having to specify the marginal distributions. It is shown that the joint behavior of these two measures is asymptotically normal. The efficiency of the new goodness-of-fit procedures is demonstrated through a simulation study and is compared to a classical goodness-of-fit test based on the empirical copula.
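The construction summarized in this abstract (squaring the components of a normal random vector) can be illustrated with a small simulation. The sketch below is an illustration only, not the thesis's code: it assumes that the shape parameter enters as a shift `a` added to each standard normal component before squaring, and it is restricted to the bivariate case with a single correlation `rho`. The function names are hypothetical.

```python
import math
import random


def std_normal_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def shifted_square_cdf(v, a):
    """CDF of (Z + a)^2 at v >= 0, for Z ~ N(0, 1).

    P((Z + a)^2 <= v) = P(-sqrt(v) - a <= Z <= sqrt(v) - a).
    """
    root = math.sqrt(v)
    return std_normal_cdf(root - a) - std_normal_cdf(-root - a)


def sample_chi_square_copula(n, rho, a, seed=0):
    """Draw n pairs (U1, U2) whose dependence comes from squared shifted normals.

    ASSUMPTION: the shape parameter is modeled as the shift `a`; this
    parameterization is a guess made for illustration.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Correlated standard normals with correlation rho.
        z1 = g1
        z2 = rho * g1 + math.sqrt(1.0 - rho * rho) * g2
        # Square the shifted components, then map each margin to [0, 1]
        # with its exact CDF so that only the dependence structure remains.
        v1, v2 = (z1 + a) ** 2, (z2 + a) ** 2
        pairs.append((shifted_square_cdf(v1, a), shifted_square_cdf(v2, a)))
    return pairs


if __name__ == "__main__":
    for u1, u2 in sample_chi_square_copula(5, rho=0.7, a=1.0, seed=42):
        print(round(u1, 3), round(u2, 3))
```

With `a = 0` the two signs of each normal component are indistinguishable after squaring; a nonzero shift breaks that symmetry, which is one plausible reading of how a shape parameter could control asymmetry in the dependence.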
Cucala, Lionel. "ESPACEMENTS BIDIMENSIONNELS ET DONNÉES ENTACHÉES D'ERREURS DANS L'ANALYSE DES PROCESSUS PONCTUELS SPATIAUX." Phd thesis, Université des Sciences Sociales - Toulouse I, 2006. http://tel.archives-ouvertes.fr/tel-00135890.
Faye, Papa Abdoulaye. "Planification et analyse de données spatio-temporelles." Thesis, Clermont-Ferrand 2, 2015. http://www.theses.fr/2015CLF22638/document.
Spatio-temporal modeling makes it possible to predict a regionalized variable at unobserved points of a given field, based on observations of this variable at some points of the field at different times. In this thesis, we propose an approach that combines numerical and statistical models. Using Bayesian methods, we combine the different sources of information: spatial information provided by the observations, temporal information provided by the black box, and prior information on the phenomenon of interest. This approach yields a good prediction of the variable of interest and a good quantification of the uncertainty of this prediction. We also propose a new method for constructing experimental designs, by establishing an optimality criterion based on the uncertainty and the expected value of the phenomenon.
Chakroun, Hédia. "Concepts et techniques d'intégration du contexte spatial dans les modèles de pondération des données multisources." Sherbrooke : Université de Sherbrooke, 1998.
Saby, Nicolas. "Distribution à l'échelle nationale des charactéristiques des sols et détection des changements. : Apport des bases de données géographiques, des techniques d’analyse spatiale et de la modélisation." Rennes, Agrocampus Ouest, 2009. http://www.theses.fr/2009NSARB026.
The aim of this work is to assess the potential of spatial databases for monitoring soil quality at the national level. Data were collected in the framework of French national programmes. To address this issue, we show that spatio-temporal statistical analyses must be adapted to the sampling design and to the nature of the information studied. Among the possible soil variables, this work focuses on some of those with a high environmental impact: the organic carbon content and the content of some trace elements. Our results show that it is possible to map soil properties at the national scale, to reveal strong spatial structures, and to attribute them to different natural and artificial processes. Large temporal trends could also be detected and explained. I discuss the limitations of the present designs and of the statistical analyses we conducted, and I propose further research developments for the monitoring of soil quality.
Souris, Marc. "La construction d'un système d'information géographique : principes et algorithmes du système Savane." La Rochelle, 2002. http://www.theses.fr/2002LAROS087.
This thesis presents work in computer science and software development. Its purpose is to answer the question: "How can a full geographic information system be built by following the principles of database management and adapting them to geographical data?" Using the complete example of the Savane system, we show how the general theory of geographical data and algorithms from computational geometry may be used to build GIS software. This work is part of a research program of the IRD (Institut de Recherche pour le Développement). The thesis presents the full architecture, methods, and algorithms of the system and explains the design choices made in the different areas: definition and use of geographical information; principles of database management systems and their extension to geographical data; algorithms for implementing these principles in an information system; and construction of an operational system built from the theoretical principles and the functional requirements for use in geography projects and research for development.
Da, Silva Sébastien. "Fouille de données spatiales et modélisation de linéaires de paysages agricoles." Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0156/document.
This thesis is part of a partnership between INRA and INRIA in the field of knowledge extraction from spatial databases. The study focuses on the characterization and simulation of agricultural landscapes. More specifically, we focus on the linear features that structure the agricultural landscape, such as roads, irrigation ditches, and hedgerows. Our goal is to model the spatial distribution of hedgerows because of their role in many ecological and environmental processes. We study how to characterize the spatial structure of hedgerows in two contrasting agricultural landscapes, one located in south-eastern France (mainly composed of orchards) and the other in Brittany (western France, bocage type). We determine whether the spatial distribution of hedgerows is structured by the position of the more perennial linear landscape features, such as roads and ditches. If so, we also identify the circumstances under which this spatial distribution is structured and the scale of these structures. The implementation of the Knowledge Discovery in Databases (KDD) process comprises different preprocessing steps and data mining algorithms that combine mathematical and computational methods. The first part of the thesis focuses on the creation of a statistical spatial index, based on a geometric neighborhood concept, that characterizes the structures of hedgerows in the landscape. The results show that hedgerows depend on more permanent linear elements at short distances, and that their neighborhood is uniform beyond 150 meters. In addition, different neighborhood structures have been identified depending on the orientation of hedgerows in the south-east of France but not in Brittany. The second part of the thesis explores the potential of coupling linearization methods with Markov methods. The linearization methods are based on the use of alternative Hilbert curves: Hilbert adaptive paths. The linearized spatial data thus constructed were then treated with Markov methods. These methods have the advantage of serving both for machine learning and for the generation of new data, for example in the context of the simulation of a landscape. The results show that the combination of these methods for learning and automatic generation of hedgerows captures some characteristics of the different study landscapes. The first simulations are encouraging despite the need for post-processing. Finally, this work has enabled the creation of a spatial data mining method based on different tools that support all stages of a classic KDD process, from the selection of data to the visualization of results. Furthermore, this method was constructed in such a way that it can also be used for data generation, a component necessary for the simulation of landscapes.
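The coupling of linearization and Markov methods mentioned in this abstract can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: it uses the standard Hilbert curve on a square grid whose side is a power of two, not the adaptive Hilbert paths developed in the thesis, and a first-order Markov chain estimated by simple transition counts. The grid labels ("H" for hedgerow, "F" for field, "R" for road) and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def _rotate(n, x, y, rx, ry):
    """Rotate/flip a quadrant so the curve keeps a consistent orientation."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y


def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve of an n-by-n grid.

    n must be a power of two (classic iterative algorithm).
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


def linearize(grid):
    """Turn a square grid of labels into a 1-D sequence along the curve."""
    n = len(grid)
    cells = sorted(
        (hilbert_index(n, x, y), grid[y][x]) for y in range(n) for x in range(n)
    )
    return [label for _, label in cells]


def transition_probabilities(sequence):
    """Estimate first-order Markov transition probabilities by counting."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical 4x4 landscape raster.
    landscape = [
        ["H", "H", "F", "F"],
        ["H", "F", "F", "F"],
        ["F", "F", "R", "R"],
        ["F", "F", "R", "H"],
    ]
    sequence = linearize(landscape)
    print(sequence)
    print(transition_probabilities(sequence))
```

The point of the Hilbert ordering is that consecutive positions in the sequence are always adjacent cells in the grid, so a one-dimensional model such as a Markov chain still sees local spatial context. The estimated transition table can then be sampled to generate new sequences, which is the generative use the abstract alludes to.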
Guillot, Gilles. "Modélisation statistique des champs de pluie sahéliens : application à leur désagrégation spatiale et temporelle." Université Joseph Fourier (Grenoble), 1998. http://www.theses.fr/1998GRE10226.
Terrier, Régis. "Calorimétrie et recherche de sources en astronomie gamma spatiale." Paris 7, 2002. https://tel.archives-ouvertes.fr/tel-00002636.
Ternynck, Camille. "Contributions à la modélisation de données spatiales et fonctionnelles : applications." Thesis, Lille 3, 2014. http://www.theses.fr/2014LIL30062/document.
In this dissertation, we are interested in nonparametric modeling of spatial and/or functional data, more specifically based on kernel methods. Generally, the samples we consider for establishing asymptotic properties of the proposed estimators consist of dependent variables. The specificity of the studied methods lies in the fact that the estimators take into account the dependence structure of the data considered. In the first part, we study spatially dependent real-valued variables. We propose a new kernel approach to estimating the spatial probability density and regression functions, as well as the mode. The distinctive feature of this approach is that it takes into account both the proximity between observations and that between sites. We study the asymptotic behavior of the proposed estimates as well as their application to simulated and real data. In the second part, we are interested in modeling data valued in an infinite-dimensional space, so-called "functional data". As a first step, we adapt the nonparametric regression model introduced in the first part to the framework of spatially dependent functional data. We obtain convergence results as well as numerical results. We then study a time series regression model in which the explanatory variables are functional and the innovation process is autoregressive. We propose a procedure that takes into account the information contained in the error process. After establishing the asymptotic behavior of the proposed kernel estimate, we study its performance on simulated and real data. The third part is devoted to applications. First, we present unsupervised classification results for simulated and real (multivariate) spatial data. The classification method considered is based on the estimation of the spatial mode, obtained from the spatial density function introduced in the first part of this thesis. Then, we apply this mode-based classification method, as well as other unsupervised classification methods from the literature, to hydrological data of a functional nature. Lastly, this classification of hydrological data led us to apply change point detection tools to these functional data.
Le, Gall Caroline. "Algorithmes de détection de ruptures et statistiques spatiales : applications au diagnostic de défaillances dans un procédé de fabrication." Toulouse 3, 2002. http://www.theses.fr/2002TOU30176.
The continuous improvement of the yield of a production line is a significant goal for the competitiveness of the facility. In the context of integrated circuit manufacturing, the introduction of new increasingly complex technologies makes the statistical tools traditionally used insufficient to prevent process failures. Consequently, new statistical techniques have been developed to improve or replace some existing tools and also to form some new ones. Thus, an improvement process is proposed. When a decrease of yield is observed, it first needs to be characterized. The characterization is achieved by a spatial analysis of the silicon wafers on which the integrated circuits are manufactured. . .
Lemort, Sophie. "Analyse spatiale intrasite de l'habitat : méthodologie, procédures et études de cas : les sites protohistoriques de Bucy-le-Long "la Fosselle" (Aisne, Néolithique ancien), et de Changis-sur-Marne "les Pétreaux" (Seine-et-Marne, Âges du Bronze et du Fer)." Thesis, Paris 1, 2018. http://www.theses.fr/2018PA01H079.
Intra-site spatial analysis of settlement does not allow the use of a general model applicable to any archaeological site. However, some sites have similar settlement profiles. Can we look for protocols transposable to typical settlement sites? This study takes an exploratory approach, applied to two protohistoric settlements. On the Bandkeramik site of Bucy-le-Long "la Fosselle", the analysis focuses on the spatial distribution of the material remains within comparable architectural units. Data analysis is used to determine different study parameters. The informative potential of housing units, established according to morphological and taphonomic criteria, is evaluated and compared with the archaeological potential, determined from the richness of the finds and the different categories of artifacts. The global intra-site analysis is made by grouping the finds by functional category, to highlight significant assemblages of remains according to the dwellings. These assemblages make it possible to characterize and to segment the significant sets of food and technical activities at the site scale, based on the partitioning of houses. The site of Changis-sur-Marne "les Pétreaux" saw a long occupation from the Late Bronze Age to the Early La Tène period, which makes the settlements difficult to read. Spatial analysis is first attempted on groups of structures identified during excavation. Then, the distribution of finds is studied at various observation scales. However, those first divisions do not reflect obvious groups of rural settlements. A partitioning of the structures into smaller spatial entities is then undertaken, starting from the search for aggregates highlighted by space-time hot spot analysis. The dynamics of occupation of the site thus become more easily perceptible. Finally, the two case studies are compared with other spatial studies of settlement sites. In addition to the material remains commonly taken as the reference in intra-site spatial analysis of settlement, archaeological structures fully have their place.
Durango, Juan. "Impacts environnementaux de l'exploitation pétrolière en Amazonie équatorienne : de l'étude spatiale de la vulnérabilité à l'évaluation du risque." Thesis, Toulouse 3, 2019. http://www.theses.fr/2019TOU30005.
Ecuador is the fifth-largest oil producer in Latin America. Most crude oil reserves lie beneath the north-eastern Ecuadorian Amazon (NEA), which represents 15% of the entire country yet encompasses high biodiversity and cultural heritage. Crude oil and gas production generates toxic wastes that potentially pollute the environment. The methodology was designed to evaluate hazards and environmental vulnerability as independent components of risk, using score indexes and rankings. They were then combined using spatial overlay methods. An observed hindrance for risk analysis was the quality of the public data used in this study. In this context, the first aim was to determine accidental oil spill volumes in well-documented oil blocks. Then, putative spill volumes were allocated to poorly documented oil blocks to obtain a homogeneous map. The second aim was to map key atmospheric emissions associated with gas flaring, i.e., greenhouse gases (CO2, CH4) and black carbon (BC) particles. The third aim was to assess the potential vulnerability of natural heritage using regional-scale proxies such as protection status and land use. Finally, the fourth aim was to exemplify the presented risk assessment approach by evaluating total petroleum hydrocarbons (TPH) potentially flowing to groundwater from oil pits. The main results indicate that 10,000.2 t (909.1 t.yr-1; SD = 1,219.5) of oil was spilled in the NEA during the 2001-2011 period (11 years), according to recorded events. However, a 54.8% increase was found when extrapolating spill rates from well-documented oil blocks to poorly documented ones. Spatial prediction accuracy ranged from 32 to 97%. Gas flared amounted to 7.6 Gm3 (760 Mm3.yr-1), equivalent to a range of 3.7 - 4.5 kt.yr-1 BC, during the 2003-2012 period. Total petroleum hydrocarbons in unlined oil pits were estimated at 49,436.4 t. Several maps resulted from this thesis. 
Spatial emissions indicate spills and gas flaring are occurring at higher rates in settlements of Joya de los Sachas, Dayuma and Shushufindi. The natural heritage vulnerability maps indicated 42% of highly vulnerable surface at the most eastern side of the studied area. Groundwater vulnerability was low to medium in most areas; furthermore, the example considered for risk assessment of groundwater and unlined oil pits, indicated highest potential impacts in settlements of Nueva Loja, Tarapoa and Shushufindi. Publicly available data quality was found to be acceptable. For instance, when comparing airborne emission estimates with some other independent estimates only 2.5-fold difference was found at most. Spatial allocation accuracy of oil spills showed promising methodology for improving hazard mapping. Vulnerability assessment indicated natural heritage proxies to be suitable for building vulnerability indexes at regional scale as land use is significantly correlated to species richness, and protected areas are efficiently conserved in the long term, thus conveying some information on ecological integrity. Moreover, there was only 8.8% of spatial incongruence between the two proxies. Groundwater vulnerability mapping indicated gaps in knowledge that were discussed; some distance thresholds were proposed to select validation sites in future studies. In conclusion, estimates and maps obtained may be valuable for safety and security monitoring, accountability of public institutions and land use planning to lessen future risks
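As a quick consistency check, the annual rates quoted in this abstract follow directly from the reported totals and period lengths. The worked arithmetic below uses only the figures given in the abstract, counting 2001-2011 as 11 years and 2003-2012 as 10 years:

```latex
\frac{10{,}000.2\ \text{t}}{11\ \text{yr}} \approx 909.1\ \text{t}\,\text{yr}^{-1},
\qquad
\frac{7.6\ \text{Gm}^3}{10\ \text{yr}} = 0.76\ \text{Gm}^3\,\text{yr}^{-1} = 760\ \text{Mm}^3\,\text{yr}^{-1}.
```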
Vannier, Clémence. "Observation et modélisation spatiale de pratiques agricoles territorialisées à partir de données de télédétection : application au paysage bocager." Phd thesis, Université Rennes 2, 2011. http://tel.archives-ouvertes.fr/tel-00651991.
Delaunay, Marie. "Approche géographique appliquée au Réseau National de Vigilance et de Prévention des Pathologies Professionnelles (RNV3P)." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAS031/document.
The field of occupational health is complex because it combines many different types of data (activity sector, occupations, risk exposures, diseases), available at nested scales (communes, activity territories, employment areas, regions, etc.) and from different partners (insurers, stakeholders, monitoring systems). These multiple sources of complementary data, formalized or not, are always analysed independently, ignoring in particular their associated geographic dimension (underlying activity territories). The aim of our work is to consider one of these data sources, the RNV3P (French National Occupational Diseases Surveillance and Prevention Network), as a spatial object. First, using different methods (explained in Part 1) and geomatics tools, and taking into account the underlying workforce, we describe the network in terms of recruitment, shadow areas, and preferential recruitment areas (Part 2). Second, we compare this database with other data sources describing occupational diseases (especially compensated ones), through approaches by industry and by pathology (Part 3). Finally, recommendations are made regarding the development of a mapping tool, built for the RNV3P database for vigilance purposes and to help the various occupational health stakeholders (Part 4). Key words: occupational health, work-related diseases, surveillance network, Geographic Information System (GIS), spatial analysis
Occelli, Florent. "Systèmes d’Information Géographique et Lien Environnement – Santé (SIGLES) : contribution au développement d'outils cartographiques d'aide à la décision face aux risques sanitaires liés à l'environnement." Thesis, Lille 2, 2014. http://www.theses.fr/2014LIL2S043/document.
Environmental and social inequalities in health (ESIH) across territories are related to two cumulative dimensions: the exposure of populations to a poor-quality living environment, and the vulnerability of these populations to environmental risk factors that can affect health. This research deals with Geographic Information Systems (GIS) applied to the field of environmental health. The general purposes are the characterization of environmental media quality and the assessment of ESIH. Achieving these objectives requires a first step of collecting, generating, and formatting spatialized environmental databases. Such data result from physico-chemical monitoring and biomonitoring. They were then mapped using GIS tools, including geostatistical spatial interpolation methods. On the other hand, spatial variability in the incidence of diseases was investigated using disease mapping methods (Standardized Incidence Ratios: SIR) and the detection of atypical clusters of events (scan statistics), both based on disease registries. Finally, geographical ecological studies were developed to associate the environmental maps generated with health and socio-economic status. Thus, this work aims to answer the question "do people in a poor state of health live in a poor-quality environment?" This question has been studied through three main research projects. The first concerns the characterization of trace element burdens in the environment and the assessment of ESIH at the neighborhood scale, over three territories in the Nord-Pas de Calais (NPdC) region. This research is based on measurements of biological burdens performed both in epiphytic lichens and in humans, and on a localized index of deprivation. The measured metals were considered individually, but also holistically by developing an integrated multimetallic index, in order to describe the general status of environmental pollution by metals. Environmental inequalities were observed at the neighborhood scale in Dunkerque. Our results suggest that trace element burdens in populations are affected by environmental burdens. In our second research project, we revealed spatial disparities in the incidence of end-stage renal disease (ESRD) at the small-area level in the NPdC. Unlike other factors (diabetes, cardiovascular disease, medical practices), we highlighted the role of socio-economic status in the occurrence of such disparities. Only a part of the ESRD variability is currently explained. It is therefore necessary to focus on the environmental hypothesis. The third research project focuses on the spatial and spatio-temporal analysis of groundwater contamination by trace elements, in order to identify potential environmental risk factors in the incidence of chronic inflammatory bowel disease. This work is based on several collaborations with the REIN network, the EPIMAD registry, and several research teams (EA4483 and EA2694 Université Lille 2, TVES EA4477 ULCO). Following this thesis, the research prospects are to pursue the development of integrated indicators to assess population exposure to the contamination of multiple environmental media. The results also indicate a lack of information in environmental databases compared to health registries. Work is thus needed to define the content of such databases, which are necessary to characterize environmental quality and to help assess the interaction between populations and their living environment.
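The Standardized Incidence Ratio used for disease mapping in this abstract compares the number of cases observed in an area with the number expected if reference rates applied to the local population. The sketch below shows the usual indirect-standardization calculation; it is a generic illustration, not the thesis's procedure, and the age strata, populations, rates, and function name are all hypothetical.

```python
def standardized_incidence_ratio(observed_cases, populations, reference_rates):
    """Return (SIR, expected cases) for one area.

    populations:      stratum -> person count in the area (hypothetical strata)
    reference_rates:  stratum -> incidence rate in the reference population
    Expected cases are the reference rates applied to the local population.
    """
    expected = sum(populations[s] * reference_rates[s] for s in populations)
    if expected <= 0:
        raise ValueError("expected number of cases must be positive")
    return observed_cases / expected, expected


if __name__ == "__main__":
    # Hypothetical area with three age strata.
    area_population = {"0-39": 10000, "40-64": 6000, "65+": 2000}
    regional_rates = {"0-39": 0.0001, "40-64": 0.0005, "65+": 0.002}
    sir, expected = standardized_incidence_ratio(12, area_population, regional_rates)
    print(f"expected = {expected:.2f}, SIR = {sir:.2f}")
```

An SIR above 1 indicates more cases than expected under the reference rates and an SIR below 1 indicates fewer. For small areas the ratio is unstable because the expected counts are small, which is why disease mapping is usually paired with cluster detection or smoothing rather than read area by area.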
Bounekkar, Ahmed. "Analyse statistique de texture : autocorrélation spatiale et notion de contiguïté." Lyon 1, 1997. http://www.theses.fr/1997LYO10142.
Jaunâtre, Kévin. "Analyse et modélisation statistique de données de consommation électrique." Thesis, Lorient, 2019. http://www.theses.fr/2019LORIS520.
In October 2014, the French Environment & Energy Management Agency (ADEME), together with the company ENEDIS, started a research project named SOLENN ("SOLidarité ENergie iNovation") with multiple objectives, such as studying the control of electricity consumption by monitoring households and securing the electricity supply. The SOLENN project was led by ADEME and took place in Lorient, France. The main goal of this project is to improve households' knowledge about saving electric energy. In this context, we describe a method to estimate extreme quantiles and probabilities of rare events, which is implemented in an R package. Then, we propose an extension of the well-known Cox proportional hazards model which allows the estimation of the probabilities of rare events. Finally, we apply some of the statistical models developed in this document to electricity consumption data sets that were useful for the SOLENN project. A first application is linked to the electricity constraint program run by ENEDIS in order to secure the electricity network. The houses undergo a reduction of their maximal power for a short period of time. The goal is to study how households behave during this period. A second application concerns the use of the multiple regression model to study the effect of individual visits on electricity consumption. The goal is to study the impact on electricity consumption during the week or the month following a visit.
Cappi, Alberto. "Analyse statistique de la distribution spatiale des galaxies et des amas." Paris 11, 1993. http://www.theses.fr/1993PA112104.
Kezouit, Omar Abdelaziz. "Bases de données relationnelles et analyse de données : conception et réalisation d'un système intégré." Paris 11, 1987. http://www.theses.fr/1987PA112130.
Abdali, Abdelkebir. "Systèmes experts et analyse de données industrielles." Lyon, INSA, 1992. http://www.theses.fr/1992ISAL0032.
To analyze the behavior of industrial processes, many kinds of information are needed. As they are mostly numerical, statistical and data analysis methods are well suited to this activity. Their results must be interpreted together with other knowledge about the analyzed process. Our work falls within the framework of the application of Artificial Intelligence techniques to statistics. Its aim is to study the feasibility and development of statistical expert systems in the field of industrial processes. The prototype ALADIN is a knowledge-based system designed to be an intelligent assistant that helps a non-specialist user analyze data collected on industrial processes. Written in Turbo Prolog, it is coupled with the statistical package MODULAD. The architecture of this system is flexible and combines knowledge about plants in general, the studied process and statistical methods. It is validated on continuous manufacturing processes (cement and cast iron processes). At present, we have limited ourselves to principal component analysis problems.
Dang, Van Mô. "Classification de donnees spatiales : modeles probabilistes et criteres de partitionnement." Compiègne, 1998. http://www.theses.fr/1998COMP1173.
Bureau, Jérémie. "Définition et analyse statistique d'une mesure d'intégrité pour données GPS-EGNOS." Toulouse 3, 2012. http://thesesups.ups-tlse.fr/1983/.
Among the GNSS (Global Navigation Satellite System) applications currently in use or in development, some require high performance in terms of precise positioning and reliability for safety of life. These critical performances are evaluated using statistical tools, and the problem of measuring position accuracy or system reliability (integrity) can be modeled as a quantile estimation problem. This inverse problem requires knowledge of the cumulative distribution function of the observations. This function is not available when studying real data, so statistical techniques are needed to estimate it. Specific safety-of-life applications, such as an airborne precision approach, require quantiles at very high levels, with associated probabilities as small as 10^-7. These probabilities correspond to the frequencies of occurrence of rare events, located in the distribution tails. Quantiles associated with such probability levels are called extreme quantiles and generally lie beyond the range of the observations. In this work we propose two methods of extreme quantile estimation that are seldom used in the GNSS field. The first is a direct application of models stemming from extreme value theory, more particularly the model of excesses over a threshold known as POT (Peaks Over Threshold). This theory provides a class of models allowing extrapolation from the observed domain to the unobserved domain, and hence the characterization of rare events that have never been observed. The second method approximates the decay of a distribution tail using analytical techniques adapted to a statistical framework. This method is derived from saddle-point approximation techniques. These two techniques for characterizing distribution tails are valid under certain hypotheses of stationarity and independence of the observations. GPS data do not always satisfy these conditions.
In this work, we propose statistical methods to satisfy these conditions, so that the extreme quantile estimation models can be used appropriately. From the tools studied in this thesis, we outline a statistical analysis methodology for integrity measurement. The problems of calibrating these tools are handled by automated processes in a data analysis platform, a piece of software developed in support of this study.
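The threshold-excess (POT) extrapolation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's procedure: excesses over a threshold u are assumed to follow a generalized Pareto distribution (GPD), the GPD is fitted here by the method of moments (a simple stand-in for maximum likelihood, valid only when the shape is below 1/2), and the function names and sample values are hypothetical.

```python
import math
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit; returns (shape, scale).

    Uses mean = scale / (1 - shape) and
    variance = scale**2 / ((1 - shape)**2 * (1 - 2 * shape)).
    """
    mean = statistics.mean(excesses)
    var = statistics.variance(excesses)  # sample variance (n - 1 denominator)
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)
    return shape, scale


def pot_quantile(threshold, shape, scale, n_total, n_exceed, p):
    """Level exceeded with probability p, extrapolated from the GPD tail."""
    # P(X > x) = (n_exceed / n_total) * P(excess > x - threshold)
    ratio = n_total * p / n_exceed
    if abs(shape) < 1e-12:
        # Exponential tail: limit of the GPD as the shape tends to zero
        return threshold - scale * math.log(ratio)
    return threshold + scale / shape * (ratio ** (-shape) - 1.0)


# Hypothetical excesses over a threshold of 2.0, out of 1000 observations
excesses = [0.2, 0.5, 0.9, 1.4, 2.3, 0.1, 0.7, 3.1]
shape, scale = fit_gpd_moments(excesses)
level = pot_quantile(2.0, shape, scale, n_total=1000, n_exceed=len(excesses), p=1e-4)
```

The extrapolated level is only as trustworthy as the threshold choice and the stationarity and independence assumptions, which is the calibration issue the abstract raises for GPS data.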
Badran, Hussein. "Contribution à la mesure en analyse factorielle des données et applications." Aix-Marseille 3, 2001. http://www.theses.fr/2001AIX30035.
This thesis brings together a number of articles and studies, grouped into two parts. The first part, mostly theoretical, concerns studies in the framework of factorial analysis. It begins by considering several questions related to probability distribution functions appearing in factorial analysis, mainly the evaluation and characterization of missing data. New results are then given on projective transformations that make it possible to approximate probability laws on compact sets. Finally, another result is given on the measure (in the sense of a given mass distribution) of two complementary subsets of convex sets defined by hyperplanes passing through the center of gravity. The second part presents a number of applications of Correspondence Factorial Analysis, showing the diversity of concrete problems that can be addressed. It reports the results of many studies conducted in France and in Lebanon, within several research projects that have facilitated the discovery of new information in very different sectors, from experimental sciences such as earth science to economic, political and social sciences.
Lahatte, Agénor. "Analyse de systèmes de demande des ménages et dépendance spatiale." Université Louis Pasteur (Strasbourg) (1971-2008), 2002. http://www.theses.fr/2002STR1EC05.
The thesis connects two distinct strands of the literature: models of quantity and quality proposed by Deaton (1987, 1988, 1990) and Crawford, Laisney and Preston (1996) (CLP), and spatial econometrics. I first consider spatial patterns of economic agents' decisions in the general context of the analysis of share systems, and I show that the adding-up property of shares implies equality restrictions on the spatial autoregressive parameters of share models with spatial dependence. Then, in a Monte Carlo study, I emphasize the possibility of implementing Moran's (1950) test in a microeconomic context, not only to test for the presence of spatial correlation across error terms, but also to identify its potential sources. To estimate the spatial versions of the CLP model, I combine the methodology of Deaton (1987, 1988, 1990) and Crawford et al. (1996) with the procedure of Kelejian and Prucha (1999). I then show that this estimation method involves an identification problem for the spatial parameters and that it requires spatial matrices with special structures. An illustration of the estimation of spatial versions of the CLP model is given, with an application of the technique to Czech household survey data; the estimates do not suggest that neighbors' budget shares are an important determinant of household expenditures in the data analysed.
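The statistic behind Moran's test referred to in this abstract can be sketched in a few lines. The sketch computes only Moran's I for a vector of values (for instance regression residuals) and a spatial weight matrix; the inference step (expectation, variance and p-value) is omitted, and the function name and the four-unit example are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    denom = sum(zi * zi for zi in z)
    return (n / s0) * (cross / denom)


# Hypothetical example: four units on a line, each a neighbor of the next
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
values = [1.0, 2.0, 3.0, 4.0]
moran = morans_i(values, weights)
print(round(moran, 4))
```

Positive values indicate that neighboring units tend to have similar values; the choice of weight matrix drives the result, which is why the abstract stresses the structure of the spatial matrices.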
Mourad, Georges. "L'analyse factorielle des correspondances et l'études de quelques marchés : flux des marchandises OCDE-OPEP et OCDE-URSS, flux du pétrole OPEP-OCDE, immatriculation de véhicules utilitaires et des voitures particulières en Europe occidentale." Paris 6, 1986. http://www.theses.fr/1986PA066212.
Rahal, Mohamed Cherif. "Classification ascendante spatiale : nouveaux algorithmes et aide à l'interprétation." Paris 9, 2010. https://portail.bu.dauphine.fr/fileviewer/index.php?doc=2010PA090003.
Connault, Pierre. "Calibration d'algorithmes de type Lasso et analyse statistique de données métallurgiques en aéronautique." Thesis, Paris 11, 2011. http://www.theses.fr/2011PA112041.
Our work contains a methodological part and an applied part. In the methodological part we study the Lasso and a variant of this algorithm, the projected Lasso. We develop slope heuristics to calibrate them. Our approach uses sparsity properties of the Lasso, showing how to reduce the problem to a model selection framework. This involves both a penalized criterion and the tuning of a constant. To this end, we adopt the classical approaches of Birgé and Massart on slope heuristics. This leads to the notion of canonical penalty. Slope heuristics and (tenfold) cross-validation are then compared through simulation studies. The results suggest that the user should consider both. In order to increase computation speed, simplified penalties are tried (unsuccessfully). The applied part is about aeronautics. The results of the methodological part apply to reliability: in classical approaches (without the Lasso), the large ratio of the number of variables to the number of observations leads to instability of linear models and to very long computation times. The Lasso provides a helpful solution. In aeronautics, dealing with reliability questions first requires studying the quality of the elaboration and forging processes. Four major axes have to be considered: analysing the factors of the process, discriminating between recipes, studying the impact of time on quality, and detecting outliers. This provides a global statistical strategy of empowerment for processes.
Ahamada, Ibrahim. "Analyse spectrale des données non stationnaires : théories et applications aux tests de stationnarité." Aix-Marseille 2, 2002. http://www.theses.fr/2002AIX24007.
Cardot, Hervé. "Contribution à l'estimation et à la prévision statistique de données fonctionnelles." Toulouse 3, 1997. http://www.theses.fr/1997TOU30162.
Sallah, Kankoe. "Diffusion spatio-temporelle des épidémies : approche comparée des modélisations mathématiques et biostatistiques, cibles d'intervention et mobilité humaine." Thesis, Aix-Marseille, 2017. http://www.theses.fr/2017AIXM0607.
In the first part of this thesis, we developed a malaria transmission metamodel based on the susceptible-infected-resistant (SIR) compartmental modeling framework, taking into consideration human mobility flows between different villages in the center of Senegal. Geographically targeted intervention strategies were shown to be effective in reducing the incidence of malaria both within and outside intervention areas. However, combined interventions targeting both vector and host, coordinated on a large scale, are needed in regions and countries aiming to achieve malaria elimination in the short or medium term. In the second part, we evaluated different methods of estimating human mobility in the absence of real data. These methods included spatio-temporal tracing of mobile phones and the gravity and radiation mathematical models. The transport of the pathogen through geographical space via the mobility of an infected subject is a major determinant of the spread of an epidemic. We introduced the impedance model, which minimized the mean square error on mobility estimates, especially in contexts where population sets are characterized by heterogeneous sizes. Finally, we expanded the framework of assumptions underlying the calibration of gravity models of human mobility. The hypothesis of a zero-inflated distribution provided a better fit and better predictive ability than the classical approaches that do not assume an excess of zeros (Poisson, quasi-Poisson).
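As a rough illustration of the gravity models of human mobility mentioned in this abstract, the sketch below uses the textbook deterministic form, in which the flow between two places grows with their populations and decays with distance. The exponents, the constant k, the function name and the village figures are assumptions for illustration; the thesis's calibrated and zero-inflated variants are not reproduced here.

```python
def gravity_flow(pop_origin, pop_dest, distance, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Textbook gravity model: k * P_i^alpha * P_j^beta / d_ij^gamma."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_origin ** alpha * pop_dest ** beta / distance ** gamma


# Hypothetical villages: name -> population, and pairwise distances in km
populations = {"A": 1000, "B": 2000, "C": 500}
distances = {("A", "B"): 10.0, ("A", "C"): 5.0, ("B", "C"): 20.0}

# Estimated flow for each pair of villages
flows = {
    pair: gravity_flow(populations[pair[0]], populations[pair[1]], d)
    for pair, d in distances.items()
}
```

In practice the parameters are estimated from observed trips, often through a count regression, which is where the abstract's comparison of Poisson, quasi-Poisson and zero-inflated assumptions comes in.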
Chambaz, Antoine. "Segmentation spatiale et sélection de modèle : théorie et applications statistiques." Paris 11, 2003. http://www.theses.fr/2003PA112012.
In this thesis we tackle the elaboration of an original method that refines the localization of mobile telecommunication traffic in urban areas for France Télécom R&D. This work involves both practical and theoretical developments. Our point of view is statistical. The major themes are spatial segmentation and model selection. We first introduce the various datasets from which our approach stems. They cast some light on the original problem. We motivate the choice of a heteroscedastic regression model. We then present a practical nonparametric regression method based on CART regression trees and its Bagging and Boosting extensions by resampling. These classical methods are designed for homoscedastic models. We propose an adaptation to heteroscedastic ones, including an original analysis of variable importance. We apply the method to various traffic datasets and comment on the final results. This practical work motivates the theoretical study of the consistency of a family of estimators of the order of a segmented model and its associated segmentation. We also address, in a general framework of model selection in a nested family of models, the estimation of the order of a model. We are particularly concerned with consistency properties and rates of under- or overestimation. We tackle the problem with a linear functional approach, i.e. an approach where the events of interest are described as events concerning the empirical measure. This makes it possible to derive general results that gather and enhance earlier ones. A large range of techniques is involved: classical arguments of M-estimation, concentration, maximal inequalities for dependent variables, Stein's lemma, penalization, Large and Moderate Deviations Principles for the empirical measure, and a trick à la Huber.
Pettorelli, Nathalie. "Variabilité individuelle et dynamique de population : importance de la composante spatiale." Lyon 1, 2002. http://www.theses.fr/2002LYO10131.
Boly, Aliou. "Fonctions d'oubli et résumés dans les entrepôts de données." Paris, ENST, 2006. http://www.theses.fr/2006ENST0049.
The amount of data stored in data warehouses grows so quickly that they become saturated. To overcome this problem, the usual solution is to archive older data when new data arrive and no space is left. This solution is not satisfactory, because data mining analyses based on long-term historical data become impossible. Indeed, data mining analysis cannot be done on archived data without reloading them into the data warehouse, and the cost of loading back a large archived dataset is too high to be incurred for a single analysis. Archived data must therefore be considered lost with regard to data mining applications. In this thesis, we propose a solution to this problem: a language is defined to specify forgetting functions on older data. The specifications include the definition of summaries of deleted data, which determine what data should be present in the data warehouse at each point in time. These summaries are aggregates and samples of deleted data and are kept in the data warehouse. The goal of these forgetting functions is to control the size of the data warehouse. This control is provided for both the aggregate summaries and the samples. The specification language for forgetting functions is defined in the context of relational databases. Once forgetting functions have been specified, the data warehouse is automatically updated to follow the specifications. This thesis presents the specification language, the structure of the summaries, the algorithms to update the data warehouse, and the possibility of performing interesting analyses of historical data.
Raillard, Nicolas. "Modélisation du comportement extrême d'un processus spatio-temporel : applications en océanographie et météorologie." Rennes 1, 2011. http://www.theses.fr/2011REN1S111.
In this thesis, we study the extremes of an oceanographic variable that is important in applications: the significant wave height. This quantity is observed precisely through satellite remote sensing. However, this data source produces complex data sets, with observations irregularly spaced in space and time. This issue is central to studying extreme values, since few models are suited to such data. Two models are described in this document. First, we introduce an interpolation model based on estimating the displacements of sea-state structures with particle filters. An estimate of the covariance structure of the displaced field is then used to obtain an interpolation scheme. This technique improves on the usual approaches, but is insufficient to cope with extremes. Second, we develop a procedure to model threshold exceedances for a process observed at irregular time steps or with missing observations. We propose a model based on methods from multivariate threshold exceedances and from extremes of stochastic processes, together with an estimation procedure inspired by composite likelihood techniques. We then show both the consistency of the estimators and their practical behaviour in simulations. Finally, we use real datasets of significant wave height and find that taking every excess into account improves the estimation of return levels and the description of the lengths of extreme events.
Baumont, Catherine. "Contribution à l'analyse des espaces urbains multicentriques : la localisation résidentielle : étude théorique et empirique." Dijon, 1990. http://www.theses.fr/1990DIJOE004.
The thesis is divided into three parts. The first part is devoted to the integration of multicenter urban spaces into spatial analysis. Both their fuzzy and non-fuzzy characteristics are taken into account. In the second part we try to solve the problem of the spatial equilibrium of households in a multicenter urban pattern, and we construct two models: a standard model and a fuzzy model. In the third part we present an econometric study based on the models described in the second part. Dijon is the urban area chosen for the test. The fuzzy approach allows us to provide an interesting economic solution to household location in multicenter urban spaces.
Poupeau, Benoît. "Analyse et requêtes de données géographiques 3 D : contributions de la cristallographie géométrique." Phd thesis, Université Paris-Est, 2008. http://tel.archives-ouvertes.fr/tel-00481924.
Kholladi, Mohamed Khureddine. "Représentation, modélisation et manipulation des connaissances spatiales en géomatique"G. R. E. M. A. C. O. S. "." Lyon, INSA, 1991. http://www.theses.fr/1991ISAL0082.
In this dissertation we present a broad survey of problems concerning the representation, modelling and manipulation of three-dimensional objects. This work is described through three applications using various geomatic objects. The manipulation activity is based on spatial reasoning, which is mainly geometric or topological, such as: the creation of new spatial facts through interpolation (as in our geological strata application), and the deduction of new facts from either incomplete or badly structured information. Spatial knowledge manipulation involves several problems. This highlights the multiplicity of reasoning methods and shows how difficult it is to unify them in a single reasoning model that could be applied to several cases. As far as the development of spatial reasoning tools is concerned, our work makes a contribution along the following three axes: control of the representation depending on the application type, adaptation of the modelling of spatial objects, and manipulation and reasoning in different contexts. This has led us, in this dissertation, to highlight the specific characteristics of each reasoning model through several problems taken from very different contexts.
Zreik, Rawya. "Analyse statistique des réseaux et applications aux sciences humaines." Thesis, Paris 1, 2016. http://www.theses.fr/2016PA01E061/document.
Over the last two decades, network structure analysis has grown rapidly and has been applied in many fields, such as communication networks, financial transaction networks, gene regulatory networks, disease transmission networks and mobile telephone networks. Social networks are now commonly used to represent the interactions between groups of people; for instance, we ourselves, our professional colleagues, our friends and family are often part of online networks such as Facebook, Twitter and email. In a network, many factors can exert influence or make analyses easier to understand. Among these, two are particularly important: the time factor and the network context. The former involves the evolution of connections between nodes over time. The network context can be characterized by different types of information, such as text messages (emails, tweets, Facebook posts, etc.) exchanged between nodes, categorical information on the nodes (age, gender, hobbies, status, etc.), interaction frequencies (e.g., number of emails sent or comments posted), and so on. Taking these factors into consideration can help capture increasingly complex and hidden information from the data. The aim of this thesis is to define new models for graphs which take into consideration the two factors mentioned above, in order to develop the analysis of network structure and allow the extraction of hidden information from the data. These models aim at clustering the vertices of a network depending on their connection profiles and network structures, which are either static or dynamically evolving. The starting point of this work is the stochastic block model, or SBM. This is a mixture model for graphs which was originally developed in the social sciences. It assumes that the vertices of a network are spread over different classes, so that the probability of an edge between two vertices depends only on the classes they belong to.
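The generative assumption of the stochastic block model stated at the end of this abstract can be sketched as a small sampler. This is an illustrative simulation of the basic static, undirected SBM only (not the dynamic or text-aware extensions of the thesis, and not an inference algorithm); the function name and the example probabilities are hypothetical.

```python
import random


def sample_sbm(labels, block_probs, seed=None):
    """Sample an undirected graph from a stochastic block model.

    labels[i] is the class of vertex i; block_probs[a][b] is the probability
    of an edge between a vertex of class a and a vertex of class b.
    Returns the set of edges as pairs (i, j) with i < j.
    """
    rng = random.Random(seed)
    edges = set()
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends only on the two vertices' classes
            p = block_probs[labels[i]][labels[j]]
            if rng.random() < p:
                edges.add((i, j))
    return edges


# Hypothetical example: two communities, dense within and sparse between
labels = [0, 0, 0, 1, 1, 1]
block_probs = [[0.9, 0.1], [0.1, 0.8]]
graph = sample_sbm(labels, block_probs, seed=42)
```

Clustering with an SBM runs this process in reverse: given the observed edges, the class labels and the block probabilities are inferred, typically with a variational or EM-type algorithm.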
Bonis, Thomas. "Algorithmes d'apprentissage statistique pour l'analyse géométrique et topologique de données." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLS459/document.
In this thesis, we study data analysis algorithms using random walks on neighborhood graphs, or random geometric graphs. It is known that random walks on such graphs approximate continuous objects called diffusion processes. In the first part of this thesis, we use this approximation result to propose a new soft clustering algorithm based on the mode-seeking framework. For our algorithm, we want to define clusters using the properties of a diffusion process. Since we do not have access to this continuous process, our algorithm uses a random walk on a random geometric graph instead. After proving the consistency of our algorithm, we evaluate its efficiency on both real and synthetic data. We then tackle the issue of the convergence of invariant measures of random walks on random geometric graphs. As these random walks converge to a diffusion process, we can expect their invariant measures to converge to the invariant measure of this diffusion process. Using an approach based on Stein's method, we manage to quantify this convergence. Moreover, the method we use is more general and can be used to obtain other results, such as convergence rates for the Central Limit Theorem. In the last part of this thesis, we use persistent homology, a concept from algebraic topology, to improve the pooling step of the bag-of-words approach for 3D shapes.
Hedli-Griche, Sonia. "Estimation de l'opérateur de régression pour des données fonctionnelles et des erreurs corrélées." Université Pierre Mendès France (Grenoble), 2008. http://www.theses.fr/2008GRE21009.
In the research work presented in this thesis, we study the problem of nonparametric modeling when the statistical data are represented by curves. More precisely, we are interested in problems of prediction from an explanatory random variable that takes values in a possibly infinite-dimensional space. Some recent work has addressed functional operator estimation under the assumption that the functional data are independent. In this thesis, we consider that the functional data are dependent and that the error process is stationary (with short or long memory). We have studied and estimated the regression operator under different set-ups: when the (dependent) functional data are deterministic or random, when the error process has short or long memory, the asymptotic normality when the error process is negatively associated, the local or global choice of the bandwidth, and the relevance of our theoretical results on simulated data and then on real data.
Al, Ayoubi Baydaa. "Analyse des données en distance de type M1 : théorie et algorithmes d'optimisation." Rennes 2, 1991. http://www.theses.fr/1991REN20010.
Our study deals with two aspects of the foundations of data analysis: factor analysis and classification theory. In applications of data analysis, the researcher encounters noise within data that occur in rectangular arrays. One of the goals of the research is a mathematical procedure that leads to a simple graphical representation offering a clear overview of the data. From this perspective, one must introduce a distance measuring the differences between individuals. Classical data analysis, however, uses the usual Euclidean distance. We intend to approach factor analysis using the city block metric. We study the main properties of the city block metric as well as its relations to other distances. As it is important and desirable to present applications of an abstract theory, after our theoretical results we present an optimization algorithm intended to graphically represent the data by a set in R^p equipped with the above metric. In exploring new methods and representations within classification theory, we develop a classification algorithm which entails the graphical representation of the individuals of a particular population by an "additive forest", which is a generalization of the notion of "additive tree".
Bailleul, Marc. "Analyse statistique implicative : variables modales et contribution des sujets : application a la modelisation de l'enseignant dans le systeme didactique." Rennes 1, 1994. http://www.theses.fr/1994REN10061.
Full textLechuga, lopez Olga. "Contributions a l’analyse de données multivoie : algorithmes et applications." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLC038/document.
In this thesis we develop a framework for extending commonly used linear statistical methods (Fisher Discriminant Analysis, Logistic Regression, Cox regression and Regularized Canonical Correlation Analysis) to the multiway context. In contrast to their standard formulation, their multiway generalization relies on structural constraints imposed on the weight vectors, which integrate the original tensor structure of the data into the optimization process. This structural constraint yields a more parsimonious and interpretable model. Different strategies to deal with high dimensionality are also considered. The application of these algorithms is illustrated on two real datasets: (i) the discrimination of spectroscopy data, on which all methods were tested, and (ii) the prediction of the long-term recovery of patients after traumatic brain injury from multi-modal brain Magnetic Resonance Imaging. On both datasets our methods yield valuable results compared to the standard approach.
Durbec, Jean-Pierre. "Traitement statistique des données en océanologie biologique : modèles adaptés à l' "in situ" et à l'expérimentation." Aix-Marseille 2, 1988. http://www.theses.fr/1988AIX22012.
Picard, Jacques. "Structure, classification et discrimination des profils évolutifs incomplets et asynchrones." Lyon 1, 1987. http://www.theses.fr/1987LYO19044.