Dissertations / Theses on the topic 'Modèles linéaires généralisés [GLM]'
Consult the top 35 dissertations / theses for your research on the topic 'Modèles linéaires généralisés [GLM].'
Le Tertre, Alain. "Séries temporelles et analyse combinée des liens pollution atmosphérique et santé." Paris 6, 2005. http://www.theses.fr/2005PA066434.
Zeghnoun, Abdelkrim. "Relation à court terme entre pollution atmosphérique et santé : quelques aspects statistiques et épidémiologiques." Paris 7, 2002. http://www.theses.fr/2002PA077199.
Peyhardi, Jean. "Une nouvelle famille de modèles linéaires généralisés (GLMs) pour l'analyse de données catégorielles ; application à la structure et au développement des plantes." PhD thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2013. http://tel.archives-ouvertes.fr/tel-00936845.
Milhaud, Xavier. "Mélanges de GLMs et nombre de composantes : application au risque de rachat en Assurance Vie." Thesis, Lyon 1, 2012. http://www.theses.fr/2012LYO10097/document.
Insurers have long been concerned about surrenders, especially in the savings business, where huge sums are at stake. The emergence of the European directive Solvency II, which promotes the development of internal risk models (in which a complete unit is dedicated to surrender risk management), strengthens the need to study and understand this risk in depth. In this thesis we investigate the segmentation and modelling of surrenders in order to better know and take into account the main risk factors impacting policyholders' decisions. We find that several complex aspects must be specifically dealt with to predict surrenders, in particular the heterogeneity of behaviours and their correlations, as well as the context faced by the insured. Combining them, we develop a methodology that seems to provide good results on given business lines, and that moreover can be adapted for other products with little effort. However, the model selection step suffers from a lack of parsimony: we suggest another criterion based on a new estimator, and prove its consistency in the framework of mixtures of generalized linear models.
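Milhaud's selection problem, choosing the number of mixture components with an information criterion, can be sketched with a generic BIC-based search. A minimal illustration (scikit-learn's Gaussian mixture here stands in for the thesis's GLM mixtures, and the data are simulated, not insurance data):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Simulate two well-separated behavioural segments (hypothetical stand-in data)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 300)]).reshape(-1, 1)

# Fit mixtures with 1..4 components and keep the number minimising the BIC
bics = {k: GaussianMixture(n_components=k, random_state=0).fit(x).bic(x)
        for k in range(1, 5)}
best_k = min(bics, key=bics.get)
print(best_k)  # the criterion should recover the two simulated components
```

The same loop applies unchanged to any mixture family exposing a penalised-likelihood criterion.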
Trottier, Catherine. "Estimation dans les modèles linéaires généralisés à effets aléatoires." PhD thesis, Grenoble INPG, 1998. http://tel.archives-ouvertes.fr/tel-00004908.
Bonneu, Michel. "Choix de modèles linéaires généralisés en vue de la prédiction." Toulouse 3, 1986. http://www.theses.fr/1986TOU30103.
Lakhal, Chaieb M'hamed Lajmi. "Utilisation des modèles linéaires généralisés pour estimer la taille d'une population animale fermée." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape10/PQDD_0005/MQ44684.pdf.
Chauvet, Jocelyn. "Introducing complex dependency structures into supervised components-based models." Thesis, Montpellier, 2019. http://www.theses.fr/2019MONTS008/document.
High redundancy of explanatory variables results in identification problems and a severe lack of stability of regression model estimates. Even when estimation is possible, a consequence is the near-impossibility of interpreting the results. It is then necessary to combine the model's likelihood with an extra criterion regularising the estimates. In the wake of PLS regression, the regularising strategy considered in this thesis is based on extracting supervised components. Such orthogonal components must not only capture the structural information of the explanatory variables, but also predict as well as possible the response variables, which can be of various types (continuous or discrete, quantitative, ordinal or nominal). Regression on supervised components was developed for multivariate GLMs, but has so far concerned models with independent observations. However, in many situations, the observations are grouped. We propose an extension of the method to multivariate GLMMs, in which within-group correlations are modelled with random effects. At each step of Schall's algorithm for GLMM estimation, we regularise the model by extracting components that maximise a trade-off between goodness-of-fit and structural relevance. Compared to penalty-based regularisation methods such as ridge or LASSO, we show on simulated data that our method not only reveals the important explanatory dimensions for all responses, but often gives a better prediction too. The method is also assessed on real data. We finally develop regularisation methods in the specific context of panel data (involving repeated measures on several individuals at the same time points). Two random effects are introduced: the first models the dependence of measures related to the same individual, while the second models a time-specific effect (thus having a certain inertia) shared by all the individuals.
For Gaussian responses, we first propose an EM algorithm to maximise the likelihood penalised by the L2-norm of the regression coefficients. Then, we propose an alternative which instead gives a bonus to the "strongest" directions in the explanatory subspace. An extension of these approaches is also proposed for non-Gaussian data, and comparative tests are carried out on Poisson data.
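For Gaussian responses without random effects, the L2-penalised likelihood maximiser mentioned above reduces to the familiar ridge closed form. A minimal numpy sketch on simulated data (the penalty weight `lam` is an arbitrary assumption, and the random-effect structure of the thesis is omitted):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0                               # a few informative directions
y = X @ beta + rng.normal(size=n)

lam = 5.0                                    # hypothetical penalty weight
# Ridge estimate: argmin ||y - Xb||^2 + lam * ||b||^2  (closed form)
b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)    # unpenalised fit for comparison
print(np.linalg.norm(b_ridge), np.linalg.norm(b_ols))
```

The penalty shrinks the coefficient vector towards zero, which is exactly the stabilising effect the abstract describes for redundant predictors.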
Veilleux, Lucie. "Modélisation de la trajectoire criminelle de jeunes contrevenants à l'aide de modèles linéaires généralisés mixtes." Thesis, Université Laval, 2005. http://www.theses.ulaval.ca/2005/23128/23128.pdf.
Martinez, Marie-José. "Modèles linéaires généralisés à effets aléatoires : contributions au choix de modèle et au modèle de mélange." PhD thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2006. http://tel.archives-ouvertes.fr/tel-00388820.
Loum, Mor Absa. "Modèle de mélange et modèles linéaires généralisés, application aux données de co-infection (arbovirus & paludisme)." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS299/document.
In this thesis, we are interested in the study of mixture models and generalized linear models, with an application to co-infection data between arboviruses and malaria parasites. After a first part dedicated to the study of co-infection using a multinomial logistic model, we propose in a second part to study mixtures of generalized linear models. The method proposed to estimate the parameters of the mixture combines a moment method and a spectral method. Finally, a last part studies extreme-value mixtures under random censoring; the estimation method proposed there proceeds in two steps based on the maximization of a likelihood.
Semenou, Michel. "Construction de plans expérimentaux et propriétés de modèles linéaires généralisés mal spécifiés : application à une étude de fiabilité." Toulouse 3, 1994. http://www.theses.fr/1994TOU30006.
Kide, Saïkou Oumar. "Analyse de la diversité et de la structuration spatio-temporelle des assemblages démersaux dans la zone économique exclusive mauritanienne." Thesis, Aix-Marseille, 2018. http://www.theses.fr/2018AIXM0085/document.
The Mauritanian exclusive economic zone is the seat of an upwelling phenomenon and constitutes a transition zone where species of temperate and tropical affinities coexist. We sought to understand the spatio-temporal behaviour of demersal assemblages in terms of their composition, structure, probability distributions and diversity, in the face of ecological concerns. Abiotic factors contribute to the structuring of groundfish assemblages that persist over time. Fishing effects were relatively low, although significant in some years and in some specific geographic areas. Temporal trajectories linking groundfish assemblages and environmental conditions were highlighted for some years and some specific areas. In each habitat type, two species groups were identified: a minority of highly aggregative species well fitted by Fisher's log-series distribution, and a majority of little or non-aggregative species well fitted by the truncated negative binomial distribution. The diversity indices analysed reveal that this set can be split into two distinct and complementary groups: one associated with species richness and another associated with evenness; a single component of diversity may not represent the diversity of the groundfish in the study area. GLMs of complementary indices showed essentially a temporal effect and a bathymetric stratum-year interaction. No effect of fishing effort was observed on species richness, nor of chlorophyll-a concentration on evenness. This work provides managers and scientists with further knowledge on the spatio-temporal dynamics of exploited groundfish assemblages in upwelling ecosystems.
Godeau, Ugoline. "Améliorer la pertinence et l’efficacité des modèles statistiques en écologie : extension des fonctions sigmoïdes dans le cadre de l’étude de la distribution de la biodiversité." Thesis, Orléans, 2020. http://www.theses.fr/2020ORLE3049.
Modeling is a major tool in ecology to describe and understand ecosystems or predict their responses. We here focused our attention on non-linear sigmoidal models in macroecology, in order to better define them, understand their limitations and suggest improvements. We first studied them in hierarchical Bayesian biodiversity models and found that taking into account random variations of the different parameters of sigmoidal functions has an impact on the estimation of the effects. We then turned our attention to binary binomial generalized linear models, for which we compared the classical logistic function to other sigmoidal functions whose asymptotes were estimated; we found strong estimation errors induced by the use of the classical logistic function when the data are not consistent with this model. Finally, we applied these logistic functions with estimated asymptotes in the context of hierarchical joint species occurrence models, thanks to which we were able to demonstrate the usefulness of estimating the asymptotes. However, the unstable results did not allow us to develop ecological conclusions. Throughout, we have used various tools to better apprehend model evaluation and proposed that they should be used jointly. In conclusion, we have developed new forms of non-linear sigmoidal statistical models, new tools enriching the ecologist's toolbox for better estimating the relationships between ecological variables and biodiversity data.
Bérubé, Valérie. "Modèles avancés en régression appliqués à la tarification IARD." Thesis, Université Laval, 2007. http://www.theses.ulaval.ca/2007/24329/24329.pdf.
Blazere, Melanie. "Inférence statistique en grande dimension pour des modèles structurels. Modèles linéaires généralisés parcimonieux, méthode PLS et polynômes orthogonaux et détection de communautés dans des graphes." Thesis, Toulouse, INSA, 2015. http://www.theses.fr/2015ISAT0018/document.
This thesis falls within the context of high-dimensional data analysis. Nowadays we have access to an increasing amount of information, and the major challenge relies on our ability to explore huge amounts of data and to infer their dependency structures. The purpose of this thesis is to study, and provide theoretical guarantees for, specific methods that aim at estimating dependency structures for high-dimensional data. The first part of the thesis is devoted to the study of sparse models through Lasso-type methods. In Chapter 1, we present the main results on this topic and then generalize the Gaussian case to any distribution from the exponential family. The major contribution to this field is presented in Chapter 2 and consists in oracle inequalities for a Group Lasso procedure applied to generalized linear models. These results show that this estimator achieves good performance under specific conditions on the model; we illustrate this part by considering the case of the Poisson model. The second part concerns linear regression in high dimension, but the sparsity assumption is replaced by a low-dimensional structure underlying the data. We focus in particular on the PLS method, which attempts to find an optimal decomposition of the predictors given a response; we recall the main idea in Chapter 3. The major contribution to this part is a new explicit analytical expression of the dependency structure that links the predictors to the response. The next two chapters illustrate the power of this formula by establishing new theoretical results for PLS. The third and last part is dedicated to graph modelling and especially to community detection. After presenting the main trends on this topic, we turn our attention to Spectral Clustering, which clusters the nodes of a graph with respect to a similarity matrix. In this thesis, we suggest an alternative to this method by considering a $l_1$ penalty.
We illustrate this method through simulations.
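A minimal sketch of the Group Lasso idea studied in Chapter 2, here for squared loss rather than a general GLM likelihood, fitted by proximal gradient descent with groupwise soft-thresholding (the data, group structure and penalty level `lam` are all simulated assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, n_groups = 100, 20, 4                # four groups of five predictors
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.5                        # only the first group is active
y = X @ beta_true + 0.1 * rng.normal(size=n)

groups = np.repeat(np.arange(n_groups), p // n_groups)
lam = 20.0                                 # group penalty level (assumed, not tuned)
step = 1.0 / np.linalg.norm(X, 2) ** 2     # 1/L step for the squared loss

b = np.zeros(p)
for _ in range(500):
    b = b - step * X.T @ (X @ b - y)       # gradient step on the squared loss
    for j in range(n_groups):              # proximal step: groupwise soft-thresholding
        idx = groups == j
        norm_j = np.linalg.norm(b[idx])
        if norm_j > 0:
            b[idx] *= max(0.0, 1.0 - step * lam / norm_j)

active = [j for j in range(n_groups) if np.linalg.norm(b[groups == j]) > 1e-8]
print(active)
```

The groupwise prox zeroes entire blocks of coefficients at once, which is the variable-selection behaviour the oracle inequalities quantify.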
Schülke, Christophe. "Statistical physics of linear and bilinear inference problems." Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCC058.
The recent development of compressed sensing has led to spectacular advances in the understanding of sparse linear estimation problems, as well as in algorithms to solve them. It has also triggered a new wave of developments in the related fields of generalized linear and bilinear inference problems. These problems have in common that they combine a linear mixing step and a nonlinear, probabilistic sensing step, producing indirect measurements of a signal of interest. Such a setting arises in problems such as medical or astronomical imaging. The aim of this thesis is to propose efficient algorithms for this class of problems and to perform their theoretical analysis. To this end, it uses belief propagation, thanks to which high-dimensional distributions can be sampled efficiently, thus making a Bayesian approach to inference tractable. The resulting algorithms undergo phase transitions that can be analyzed using the replica method, initially developed in the statistical physics of disordered systems. The analysis reveals phases in which inference is easy, hard or impossible, corresponding to different energy landscapes of the problem. The main contributions of this thesis fall into three categories. First, the application of known algorithms to concrete problems: community detection, superposition codes and an innovative imaging system. Second, a new, efficient message-passing algorithm for blind sensor calibration, which could be used in signal processing for a large class of measurement systems. Third, a theoretical analysis of achievable performances in matrix compressed sensing and of instabilities in Bayesian bilinear inference algorithms.
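As a much simpler stand-in for the message-passing algorithms analysed in this thesis, the following sketch recovers a sparse signal from underdetermined, noiseless linear measurements with ISTA (iterative soft-thresholding); the dimensions and penalty are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, k = 80, 200, 5                       # 80 measurements, 200 unknowns, 5 nonzeros
A = rng.normal(size=(n, p)) / np.sqrt(n)   # random sensing matrix
x_true = np.zeros(p)
support = rng.choice(p, size=k, replace=False)
x_true[support] = rng.normal(0.0, 3.0, size=k)
y = A @ x_true                             # noiseless linear measurements

lam = 0.1                                  # l1 penalty level (assumed, not tuned)
step = 0.9 / np.linalg.norm(A, 2) ** 2
x = np.zeros(p)
for _ in range(3000):
    x = x - step * A.T @ (A @ x - y)                        # gradient step
    x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0)  # soft-thresholding
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(rel_err)
```

Approximate message passing adds an Onsager correction term to this iteration and is what makes the replica-predicted phase transitions achievable in practice.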
Rekik, Donia. "Vers une approche dynamique du processus de la notation souveraine." Thesis, Paris 8, 2018. http://www.theses.fr/2018PA080062.
The object of this study is to propose a conceptual and statistical framework to better understand the sovereign rating process. This thesis adopts a multi-level approach aiming (i) to unveil the limits of the credit rating agencies' expertise, as revealed by observed divergences and rating errors; (ii) to conduct a classic reconstitution of sovereign ratings; and (iii) to revisit the rating process through a dynamic reconstitution of the scores. The classic reconstitution revealed that the ratings of developing countries reflected their economic and financial situation, whereas for developed countries they showed the subjective intervention of the experts. The studies conducted in a dynamic perspective are based on the construction and modeling of rating migrations. A first study, driven by the MDS method, uncovered the types of rating systems used; the four identified systems distinguish the most stable countries from the most vulnerable. A second study modeled the rating systems through an ACD model and an ordered probit model. The results highlight an acceleration of downgrade episodes, especially in times of crisis. The lack of heterogeneity in the model drew attention to the socioeconomic dimension of the ratings and led to an advanced composite index. Rating migrations reflect the long-term evolution of a country and carry a larger informational content than a simple rating.
Peyre, Julie. "Analyse statistique des données issues des biopuces à ADN." Phd thesis, Université Joseph Fourier (Grenoble), 2005. http://tel.archives-ouvertes.fr/tel-00012041.
Full textDans un premier chapitre, nous étudions le problème de la normalisation des données dont l'objectif est d'éliminer les variations parasites entre les échantillons des populations pour ne conserver que les variations expliquées par les phénomènes biologiques. Nous présentons plusieurs méthodes existantes pour lesquelles nous proposons des améliorations. Pour guider le choix d'une méthode de normalisation, une méthode de simulation de données de biopuces est mise au point.
Dans un deuxième chapitre, nous abordons le problème de la détection de gènes différentiellement exprimés entre deux séries d'expériences. On se ramène ici à un problème de test d'hypothèses multiples. Plusieurs approches sont envisagées : sélection de modèles et pénalisation, méthode FDR basée sur une décomposition en ondelettes des statistiques de test ou encore seuillage bayésien.
Dans le dernier chapitre, nous considérons les problèmes de classification supervisée pour les données de biopuces. Pour remédier au problème du "fléau de la dimension", nous avons développé une méthode semi-paramétrique de réduction de dimension, basée sur la maximisation d'un critère de vraisemblance locale dans les modèles linéaires généralisés en indice simple. L'étape de réduction de dimension est alors suivie d'une étape de régression par polynômes locaux pour effectuer la classification supervisée des individus considérés.
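The multiple-testing step of the second chapter can be illustrated with the standard Benjamini-Hochberg step-up procedure (the thesis develops an FDR method based on wavelet decompositions of the test statistics; this is only the generic version, on invented p-values):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of rejected hypotheses at FDR level q (BH step-up)."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    thresh = q * np.arange(1, m + 1) / m          # q * i / m for sorted p-values
    below = p[order] <= thresh
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])          # largest i with p_(i) <= q*i/m
        reject[order[: k + 1]] = True
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
print(benjamini_hochberg(pvals, q=0.05))
# → [ True  True False False False False False False]
```

Note the step-up character: a p-value may be rejected even when it exceeds its own threshold, provided a later ordered p-value falls below its threshold.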
Jiang, Wei. "Statistical inference with incomplete and high-dimensional data - modeling polytraumatized patients." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASM013.
The problem of missing data has existed since the beginning of data analysis, as missing values are related to the process of obtaining and preparing data. In applications of modern statistics and machine learning, where the collection of data is becoming increasingly complex and where multiple sources of information are combined, large databases often have an extraordinarily high number of missing values. These data therefore present important methodological and technical challenges for analysis: from visualization to modeling, including estimation, variable selection, prediction, and software implementation. Moreover, although high-dimensional data with missing values are a common difficulty in statistical analysis today, only a few solutions are available. The objective of this thesis is to provide new methodologies for performing statistical inference with missing data, in particular for high-dimensional data. The most important contribution is a comprehensive framework for dealing with missing values, from estimation to model selection, based on likelihood approaches. The proposed method does not rely on a specific pattern of missingness and strikes a good balance between quality of inference and computational efficiency. The contribution of the thesis consists of three parts. In Chapter 2, we focus on performing logistic regression with missing values in a joint modeling framework, using a stochastic approximation of the EM algorithm. We discuss parameter estimation, variable selection, and prediction for incomplete new observations. Through extensive simulations, we show that the estimators are unbiased and have good confidence-interval coverage, outperforming the popular imputation-based approach. The method is then applied to pre-hospital data to predict the risk of hemorrhagic shock, in collaboration with medical partners, the Traumabase group of Paris hospitals.
Indeed, the proposed model improves the prediction of bleeding risk compared to the prediction made by physicians. In Chapters 3 and 4, we focus on model selection for high-dimensional incomplete data, aiming in particular at controlling false discoveries. For linear models, the adaptive Bayesian version of SLOPE (ABSLOPE) we propose in Chapter 3 addresses these issues by embedding the sorted l1 regularization within a Bayesian spike-and-slab framework. Alternatively, in Chapter 4, aiming at more general models beyond linear regression, we consider these questions in a model-X framework, where the conditional distribution of the response as a function of the covariates is not specified. To do so, we combine the knockoff methodology with multiple imputation. Through extensive simulations, we demonstrate satisfactory performance in terms of power, FDR and estimation bias for a wide range of scenarios. In an application to the medical data set, we build a model to predict patient platelet levels from pre-hospital and hospital data. Finally, we provide two open-source software packages with tutorials, in order to help decision making in the medical field and assist users facing missing values.
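A deliberately simplified sketch of the imputation-based baseline this thesis compares against: hot-deck multiple imputation followed by pooling of logistic-regression coefficients. A proper implementation would draw imputations from a fitted conditional model and pool variances with Rubin's rules; all data here are simulated:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 2000
X = rng.normal(size=(n, 2))
logit = 1.0 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Make 20% of the first covariate missing completely at random (MCAR)
X_miss = X.copy()
miss = rng.random(n) < 0.2
X_miss[miss, 0] = np.nan

# Simplified multiple imputation: draw missing values from the observed marginal,
# refit the model on each completed data set, then average the coefficients
M = 10
observed = X_miss[~miss, 0]
coefs = []
for _ in range(M):
    X_comp = X_miss.copy()
    X_comp[miss, 0] = rng.choice(observed, size=miss.sum())
    model = LogisticRegression(C=1e6).fit(X_comp, y)  # near-unpenalised fit
    coefs.append(model.coef_[0])
pooled = np.mean(coefs, axis=0)
print(pooled)
```

Hot-deck draws ignore the covariate-response dependence, so the coefficient of the imputed variable is attenuated, which is one reason likelihood-based approaches such as SAEM can outperform naive imputation.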
Plichard, Laura. "Modélisation multi-échelles de la sélection de l’habitat hydraulique des poissons de rivière." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSE1284/document.
The habitat concept, which defines the place where organisms live, encompasses abiotic and biotic conditions and differs, for example, between species or activities. Habitat selection is the process by which organisms choose where to live among all the habitats available around them. This selection depends on an individual choice specific to the organism, for example its behaviour, and on a common choice shared by organisms with common traits, such as individuals of the same species. Species-specific habitat selection models are developed to understand and represent this common choice and are used to build ecological flow tools. For freshwater fish, most species-specific habitat selection models transfer poorly between reaches and rivers, as they are built from abundance data sampled in a single study reach over a small number of surveys. In order to improve the predictive quality of the models, I developed a modelling approach, both multi-reach and multi-survey, accounting for the non-linear response of habitat selection and the overdispersion of abundance data. Then, despite the high individual variability of habitat selection, I showed, from telemetry data, the relevance of developing species-specific habitat selection models. Finally, as habitat selection also depends on processes that influence community structure at the landscape scale (e.g. dispersal), I demonstrated the benefits of sampling methods such as snorkelling for characterising community structures and their longitudinal distributions at a large spatial scale. These techniques will allow studying the influence of landscape processes on habitat selection models.
Varnet, Léo. "Identification des indices acoustiques utilisés lors de la compréhension de la parole dégradée." Thesis, Lyon 1, 2015. http://www.theses.fr/2015LYO10221/document.
There is today a broad consensus in the scientific community regarding the involvement of acoustic cues in speech perception. Up to now, however, the precise mechanisms underlying the transformation of the continuous acoustic stream into discrete linguistic units remain largely undetermined. This is partly due to the lack of an effective method for identifying and characterizing the auditory primitives of speech. Since the earliest studies on the acoustic-phonetic interface by the Haskins Laboratories in the 1950s, a number of approaches have been proposed; they are nevertheless inherently limited by the non-naturalness of the stimuli used, the constraints of the experimental apparatus, and the a priori knowledge needed. The present thesis aimed at introducing a new method capitalizing on the speech-in-noise situation for revealing the acoustic cues used by listeners. As a first step, we adapted the Classification Image technique, developed in the visual domain, to a phoneme categorization task in noise. The technique relies on a Generalized Linear Model to link each participant's response to the specific configuration of noise on a trial-by-trial basis, thereby estimating the perceptual weighting of the different time-frequency regions for the decision. We illustrated the effectiveness of our Auditory Classification Image method through two examples: an /aba/-/ada/ categorization and a /da/-/ga/ categorization in context /al/ or /aʁ/. Our analysis confirmed that the F2 and F3 onsets were crucial for the tasks, as suggested in previous studies, but also revealed unexpected cues. In a second step, we relied on this new method to compare the results of musical experts (N=19) and dyslexic participants (N=18) to those of controls.
This enabled us to explore the specificities of each group's listening strategies. Taken together, the results show that the Auditory Classification Image method may be a more precise and more straightforward approach to investigating the mechanisms at work at the acoustic-phonetic interface.
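The core of the classification-image idea, a GLM linking trial-by-trial noise to binary responses, can be sketched on simulated data (the simulated listener, its weights and the bin count are assumptions, not the thesis's stimuli):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n_trials, n_bins = 5000, 16               # hypothetical time-frequency bins
w_true = np.zeros(n_bins)
w_true[3], w_true[9] = 1.5, -1.5          # the two cues the simulated listener uses

Z = rng.normal(size=(n_trials, n_bins))   # noise field added on each trial
internal = 0.5 * rng.normal(size=n_trials)
resp = (Z @ w_true + internal > 0).astype(int)   # simulated binary responses

# GLM linking the trial-by-trial noise to the response: the fitted
# coefficients form the classification image
glm = LogisticRegression(C=1e6).fit(Z, resp)
aci = glm.coef_[0]
print(np.argsort(np.abs(aci))[-2:])       # bins carrying the largest weights
```

The estimated weight map recovers which regions of the noise field drove the decisions; the thesis adds smoothness priors suited to time-frequency representations.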
Jin, Qianying. "Ordinal classification with non-parametric frontier methods : overview and new proposals." Thesis, Lille 1, 2020. http://www.theses.fr/2020LIL1A007.
Following the idea of separating two groups with a hypersurface, the convex (C) frontier generated by the data envelopment analysis (DEA) method is employed as a separating hypersurface in classification. No assumption on the shape of the separating hypersurface is required when using a DEA frontier. Moreover, the reasoning behind a membership assignment is quite clear, since it refers to a benchmark observation. Despite these strengths, the DEA frontier-based classifier does not always perform well in classification. Therefore, this thesis focuses on modifying the existing frontier-based classifiers and proposing novel frontier-based classifiers for the ordinal classification problem. In the classification literature, all axioms used to construct the C DEA frontier are kept when generating a separating frontier, without examining their correspondence with the related background information. This motivates our work in Chapter 2, where the connections between the axioms and the background information are explored. First, by reflecting on the monotonic relation, both input-type and output-type characteristic variables are incorporated; moreover, a minimize-sum-of-deviations model is proposed to detect the underlying monotonic relation if it is not given a priori. Second, a nonconvex (NC) frontier classifier is constructed by relaxing the commonly used convexity assumption. Third, the directional distance function (DDF) measure is introduced to provide further managerial implications, although it does not change the classification results compared with the radial measure. The empirical results show that the NC frontier classifier has the highest classification accuracy, and a comparison with six classic classifiers also reveals its superiority. While the relation between the characteristic variables often suggests a monotonic relation, the parallel case of a non-monotonic relation is rarely addressed.
In Chapter 3, a generalized disposal assumption, which limits disposability within a value range, is developed to characterize the non-monotonic relation. Instead of a single separating frontier, a NC separating hull consisting of several frontiers is constructed to separate the groups; by adding the convexity assumption, a C separating hull is then constructed. An illustrative example is used to test the performance: the NC hull classifier outperforms the C hull classifier, and a comparison with some existing frontier classifiers also reveals the superiority of the proposed NC hull classifier. Chapter 4 proposes novel frontier classifiers accommodating different mixes of classification information. Specifically, by reflecting on the monotonic relation, a NC classifier is constructed; if a priori information on the substitution relation is available, a C classifier is generated instead. Both the NC and C classifiers generate two frontiers, each enveloping one group of observations. The intersection of the two frontiers is known as the overlap, which may lead to misclassifications. The overlap is reduced by allowing the two frontiers to shift inwards to the extent that the total misclassification cost is minimized, and the shifted cost-sensitive frontiers are then used to separate the groups. The discriminant rules are also designed to incorporate the cost information. The empirical results show that the NC classifier provides a better separation than the C one, and that the proposed DDF measure outperforms the commonly used radial measure in providing a reasonable separation.
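The nonconvex envelope is closely related to the free disposal hull: under monotonicity, an observation lies inside a group's hull if some member of the group dominates it componentwise. A toy membership check (the data and group labels are invented):

```python
import numpy as np

def in_fdh(x, group):
    """Non-convex (free-disposal-hull style) membership test: x is enveloped
    by the group if at least one observation dominates it componentwise,
    assuming all characteristic variables are monotonically 'the more the better'."""
    return bool(np.any(np.all(group >= x, axis=1)))

# Hypothetical two-dimensional data for one 'accepted' group
accepted = np.array([[5.0, 7.0], [6.0, 4.0], [8.0, 8.0]])

print(in_fdh(np.array([4.0, 6.0]), accepted))  # → True, dominated by (5, 7)
print(in_fdh(np.array([9.0, 9.0]), accepted))  # → False, dominates every observation
```

A convex hull classifier would instead allow convex combinations of observations as benchmarks, which is exactly the extra axiom the thesis questions.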
Nicolas, Delphine. "Des poissons sous influence ? : une analyse à large échelle des relations entre les gradients abiotiques et l’ichtyofaune des estuaires tidaux européens." Thesis, Bordeaux 1, 2010. http://www.theses.fr/2010BOR14045/document.
Based on a macroecological approach, this thesis aims at determining the influence of the abiotic environment on the structure of fish assemblages among European tidal estuaries. The abiotic environment of 135 North-East Atlantic estuaries, from Portugal to Scotland, was characterised by fifteen descriptors using an ecohydrological approach. The fish assemblages of about a hundred estuaries were characterised from data collected during scientific surveys conducted in the context of the European Water Framework Directive (WFD). Nonetheless, differences among sampling protocols resulted in highly heterogeneous datasets; to limit this heterogeneity, rigorous selection and standardisation processes were carried out. Fish assemblages were described by total or functional indices related to species richness or abundance, and relationships between large-scale and intra-estuarine abiotic gradients and fish attributes were identified by fitting generalised linear models. Results showed that the total number of species, and more especially of marine and diadromous species, increased with estuary size. Moreover, the total species richness appeared higher in estuaries associated with a wide continental shelf. The greatest total densities, and more particularly the densities of resident and marine species, were associated with estuaries having a large proportion of intertidal areas. Fish assemblages also appeared strongly structured by the salinity gradient, in terms of both species richness and density. Furthermore, this thesis brought some evidence of a northward migration of estuarine fish species in the context of global warming. The results of this thesis will contribute to improving the fish indicators currently developed in the context of the European WFD.
Sattar, Abdul. "Évitement fiscal des entreprises : déterminants et conséquences pour les pays de l'Union européenne." Thesis, Lille, 2020. http://www.theses.fr/2020LILUA020.
Full text
Multinational corporations (MNCs) seek opportunities to expand their operations on foreign soil to meet their strategic expansion needs. In this regard, they undertake foreign direct investment (FDI) in countries where they find conducive business conditions. From the perspective of countries, FDI is one of the important factors for achieving development objectives. However, over the past few years, MNCs have been criticised for tax avoidance. MNCs channel FDI through offshore financial centres (OFCs), which involve no real economic activity. The flow of FDI towards OFCs has been abnormal, which is hard to explain through orthodox MNC theories because they focus only on the conventional determinants of FDI and hardly analyse the role of tax havens. Multinational firms exploit the competitiveness of tax havens and establish networks of subsidiaries. Because of such activities, non-haven countries suffer billions of dollars of corporate revenue losses every year. The European Union (EU) holds a distinctive position in the debate on tax avoidance, as some of its member countries, such as Luxembourg and the Netherlands, are hubs of unreal FDI. With its unique single market, the EU has become one of the hotspots of tax avoidance for MNCs. Besides, the EU is an active player against tax avoidance, not only at the regional level but also in the international community, where it holds a strong position. Against this background, this thesis analyses the determinants of tax avoidance, its consequences, and the policy response of the EU. Through a literature survey, we build an analytical framework to better understand the drivers of the tax avoidance behaviour of EU-based MNCs. We hypothesise that corporate tax avoidance (tax-avoidance-motivated FDI) is determined by the interface of firm-specific advantages and the competitive characteristics of tax haven countries.
Panel data on firms and ownership information were used to test the hypothesis through a hybrid regression model (a generalised linear mixed model). We show that the strength of firm-specific advantages and tax haven affiliates determines the level of tax avoidance. High-tech or medium-tech firms with a number of subsidiaries in tax havens avoid more taxes. Intangible assets also play a crucial role. Addressing the impact of tax avoidance activities on the fiscal resources of non-haven EU countries, we use unique data on inward FDI from OFCs and FDI income returns to scale the corporate revenue losses. Using country and year fixed-effects linear regression models, we find that an increase in the share of inward FDI from OFCs deflates the rate of return on FDI income. The negative relationship between the two is due to the tax avoidance activities of MNCs. In absolute terms, the large economies suffer more; however, relative to gross domestic product, the smaller economies record significant fiscal revenue losses. To fight tax avoidance, the EU has initiated several policy measures, but with limited success. We elaborate on the reasons by using the Multiple Streams Framework (MSF). We show that, at the beginning, the primary focus was on tax harmonisation: several directives were adopted to eliminate distortions in the single market, and tax avoidance received attention only after the financial crisis. We conclude that entrenched structural constraints in the decision-making process preclude the success of policy outputs against tax avoidance.
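The country and year fixed-effects regressions described above can be sketched as a least-squares fit with dummy variables. The panel below is synthetic, and the coefficient on the OFC share of inward FDI (here −0.8) is an arbitrary illustration, not an estimate from the thesis.

```python
import numpy as np

# Synthetic country-year panel: rate of return on FDI income vs the share of
# inward FDI coming from offshore financial centres (OFCs)
rng = np.random.default_rng(1)
n_countries, n_years = 20, 10
country = np.repeat(np.arange(n_countries), n_years)
year = np.tile(np.arange(n_years), n_countries)
ofc_share = rng.uniform(0.0, 1.0, country.size)
alpha = rng.normal(0.0, 1.0, n_countries)     # country fixed effects
gamma = rng.normal(0.0, 1.0, n_years)         # year fixed effects
ret = (5.0 - 0.8 * ofc_share + alpha[country] + gamma[year]
       + rng.normal(0.0, 0.1, country.size))

# Fixed effects absorbed via dummy variables (one level dropped in each set)
D_country = (country[:, None] == np.arange(1, n_countries)).astype(float)
D_year = (year[:, None] == np.arange(1, n_years)).astype(float)
X = np.column_stack([np.ones(country.size), ofc_share, D_country, D_year])
beta, *_ = np.linalg.lstsq(X, ret, rcond=None)
# beta[1] recovers the (negative) effect of the OFC share on the return
```

Dropping one dummy per set avoids perfect collinearity with the intercept; the dummies soak up all time-invariant country traits and common year shocks, isolating the within variation of interest.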
Davranche, Aurélie. "Suivi de la gestion des zones humides camarguaises par télédétection en référence à leur intérêt avifaunistique." Phd thesis, Université de Provence - Aix-Marseille I, 2008. http://tel.archives-ouvertes.fr/tel-00292694.
Full text
Karimi, Maryam. "Modélisation conjointe de trajectoire socioprofessionnelle individuelle et de la survie globale ou spécifique." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLS120/document.
Full text
Being in a low socioeconomic position is associated with an increased risk of mortality from various causes of death. Previous studies have already shown the importance of considering different dimensions of socioeconomic trajectories across the life course. Analyses of professional trajectories constitute a crucial step towards a better understanding of the association between socioeconomic position and mortality. The main challenge in measuring this association is to decompose the respective share of these factors in explaining the survival level of individuals. The complexity lies in the bidirectional causality underlying the observed associations: are mortality differentials due to differences in initial health conditions that jointly influence employment status and mortality, or does the professional trajectory directly influence health conditions and hence mortality? Standard methods do not consider the interdependence of changes in occupational status and the bidirectional causal effect underlying the observed association, which leads to substantial bias in estimating the causal link between professional trajectory and mortality. It is therefore necessary to propose statistical methods that simultaneously consider repeated measurements (careers) and survival variables. This study was motivated by the Cosmop-DADS database, which is a sample of the French salaried population. The first aim of this dissertation was to consider whole professional trajectories and an accurate occupational classification, instead of the limited number of life-course stages and the simple occupational classification considered previously.
For this purpose, we defined time-dependent variables to capture different life-course dimensions, namely the critical period, the accumulation model and the social mobility model, and we highlighted the association between professional trajectories and cause-specific mortality using the defined variables in a Cox proportional hazards model. The second aim was to incorporate the employment episodes in a longitudinal sub-model within the joint model framework, in order to reduce the bias resulting from the inclusion of internal time-dependent covariates in the Cox model. We proposed a joint model for longitudinal nominal outcomes and competing-risks data in a likelihood-based approach. In addition, we proposed an approach mimicking meta-analysis to address the computational problems of fitting joint models to large datasets, by extracting independent stratified samples from the large dataset, applying the joint model to each sample and then combining the results. With the same objective, that is, fitting joint models to large-scale data, we propose a procedure based on the appeal of the Poisson regression model. This approach consists of finding representative trajectories by means of clustering methods and then applying the joint model to these representative trajectories.
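The Poisson-regression route to survival modelling mentioned above rests on the classical equivalence between a piecewise-exponential hazard model and a Poisson GLM with a log-exposure offset fitted to person-period data. The sketch below shows only the data-expansion step; the interval boundaries are arbitrary assumptions, not the thesis's actual cut points.

```python
import numpy as np

cuts = np.array([0.0, 1.0, 2.0, 5.0])   # hypothetical interval boundaries (years)

def person_periods(time, event):
    """Split one (time, event) survival record into person-period rows:
    (interval index, exposure within the interval, event indicator)."""
    rows = []
    for k in range(len(cuts) - 1):
        a, b = cuts[k], cuts[k + 1]
        if time <= a:
            break
        exposure = min(time, b) - a
        died = int(bool(event) and time <= b)
        rows.append((k, exposure, died))
    return rows

rows = person_periods(time=3.2, event=1)
# Three rows: full exposure in intervals 0 and 1, then 1.2 years of exposure
# and the event in interval 2. A Poisson GLM with offset log(exposure) on
# such rows reproduces the piecewise-exponential likelihood.
```

The attraction for large-scale data is that the expanded table can be aggregated over covariate strata before fitting, which is what makes the clustering-based procedure above computationally feasible.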
Loingeville, Florence. "Modèle linéaire généralisé hiérarchique Gamma-Poisson pour le contrôle de qualité en microbiologie." Thesis, Lille 1, 2016. http://www.theses.fr/2016LIL10005/document.
Full text
In this thesis, we propose an analysis of variance method for discrete data from quality control in microbiology. To identify the issues of this work, we start by studying the analysis of variance method currently used in microbiology, along with its benefits, drawbacks, and limits. We propose a first model to address the problem, corresponding to a linear model with two nested fixed factors. We use the analysis of deviance to develop significance tests, which proved to be efficient on datasets from proficiency testing in microbiology. We then introduce a new model involving random factors. The randomness of the factors makes it possible to assess and characterise the overdispersion observed in count results from proficiency testing in microbiology, which is one of the main objectives of this work. The new model corresponds to a Gamma-Poisson hierarchical generalized linear model with three random factors. We propose a method based on this model to estimate dispersion parameters as well as fixed and random effects. We show practical applications of this method to datasets from proficiency testing in microbiology, which demonstrate the goodness of fit of the model to real data. We also develop significance tests for the random factors of this new model, and a new method to assess the performance of the laboratories taking part in a proficiency test. We finally introduce a near-exact distribution for the product of independent generalized Gamma random variables, in order to characterise the intensity of the Poisson distribution of the model. This approximation, developed from a factorization of the characteristic function, is very precise and can be used to detect outliers.
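The Gamma-Poisson construction behind the hierarchical model can be illustrated numerically: mixing a Poisson count over a Gamma-distributed intensity yields a variance above the mean, Var(Y) = mu + mu²/k, which is exactly the overdispersion pattern seen in microbiological counts. The parameter values below are arbitrary illustrations, not estimates from the thesis.

```python
import numpy as np

rng = np.random.default_rng(6)
mu, k = 20.0, 4.0                          # assumed mean and Gamma shape
lam = rng.gamma(shape=k, scale=mu / k, size=20_000)  # Gamma-distributed intensities
y = rng.poisson(lam)                       # Gamma-Poisson (negative binomial) counts

m, v = y.mean(), y.var()                   # v exceeds m: overdispersion
k_hat = m ** 2 / (v - m)                   # moment estimator of the shape k
```

A plain Poisson model (variance equal to the mean) would understate the spread of such counts, which is why the thesis's random factors are needed to absorb the extra variability.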
Perthame, Emeline. "Stabilité de la sélection de variables pour la régression et la classification de données corrélées en grande dimension." Thesis, Rennes 1, 2015. http://www.theses.fr/2015REN1S122/document.
Full text
The analysis of high-throughput data has renewed the statistical methodology for feature selection. Such data are characterized both by their high dimension and by their heterogeneity, as the true signal and several confounding factors are often observed at the same time. In such a framework, the usual statistical approaches are questioned and can lead to misleading decisions, as they were initially designed under an assumption of independence among variables. The goal of this thesis is to contribute to the improvement of variable selection methods for regression and supervised classification, by accounting for the dependence between selection statistics. All the methods proposed in this thesis are based on a factor model of the covariates, which assumes that the variables are conditionally independent given a vector of latent variables. A part of this thesis focuses on the analysis of event-related potentials (ERP) data. ERPs are now widely collected in psychological research to determine the time course of mental events. In the significance analysis of the relationships between event-related potentials and experimental covariates, the psychological signal is often both rare, since it occurs only on short intervals, and weak, given the huge between-subject variability of the ERP curves. Indeed, these data are characterized by a temporal dependence pattern that is both strong and complex. Moreover, studying the effect of the experimental condition on brain activity at each instant is a multiple testing issue. We propose to decorrelate the test statistics by jointly modeling the signal and the time dependence among test statistics, using prior knowledge of the time points at which the signal is null. Second, an extension of decorrelation methods is proposed in order to handle variable selection in the framework of linear supervised classification models. The contribution of the factor model assumption in the general framework of Linear Discriminant Analysis is studied.
It is shown that the optimal linear classification rule conditional on these factors is more efficient than the unconditional rule. Next, an Expectation-Maximization algorithm for the estimation of the model parameters is proposed. This method of data decorrelation is compatible with a prediction purpose. Finally, the issues of detection and identification of a signal when features are dependent are addressed more analytically. We focus on the Higher Criticism (HC) procedure, defined under the assumptions of a sparse signal of low amplitude and independence among tests. It is shown in the literature that this method reaches theoretical detection bounds. The properties of HC under dependence are studied, and the bounds of detectability and estimability are extended to arbitrarily complex situations of dependence. In the context of signal identification, an extension of Higher Criticism Thresholding based on innovations is finally proposed.
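The Higher Criticism statistic named above compares the empirical distribution of p-values with the uniform distribution expected under the global null. A minimal sketch in the usual Donoho-Jin form, with the maximisation restricted to the smallest half of the p-values, might look like this; the planted signal is synthetic.

```python
import numpy as np

def higher_criticism(pvals):
    """Higher Criticism statistic: standardised gap between the empirical
    CDF of the p-values and the uniform CDF, maximised over the smallest
    half of the sorted p-values."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = p.size
    i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p))
    return hc[: n // 2].max()

rng = np.random.default_rng(7)
null = rng.uniform(size=1000)          # global null: uniform p-values
signal = null.copy()
signal[:10] = 1e-8                     # sparse, strong planted effects
hc_null = higher_criticism(null)
hc_signal = higher_criticism(signal)   # far larger: the sparse signal is detected
```

Under dependence among the tests, the null distribution of this statistic changes, which is precisely the regime the thesis extends the detectability bounds to.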
Le, Rest Kévin. "Méthodes statistiques pour la modélisation des facteurs influençant la distribution et l’abondance de populations : application aux rapaces diurnes nichant en France." Thesis, Poitiers, 2013. http://www.theses.fr/2013POIT2330/document.
Full text
In the context of global biodiversity loss, more and more surveys are carried out over broad spatial extents and long time periods, in order to understand the processes driving the distribution, abundance and trends of populations at the relevant biological scales. These studies then make it possible to define more precise conservation statuses for species and to establish pertinent conservation measures. However, the statistical analysis of such datasets raises several concerns. Usually, generalized linear models (GLM) are used, linking the variable of interest (e.g. presence/absence or abundance) to external variables suspected to influence it (e.g. climatic and habitat variables). The main unresolved concern is the selection of these external variables from a spatial dataset. This thesis details several possibilities and proposes a widely usable method based on a cross-validation procedure that accounts for spatial dependencies. The method is evaluated through simulations and applied to several case studies, including datasets with higher than expected variability (overdispersion). A focus is also placed on methods accounting for an excess of zeros (zero-inflation). The last part of this manuscript applies these methodological developments to modelling the distribution, abundance and trends of raptors breeding in France.
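Cross-validation that respects spatial dependence is often implemented by holding out whole spatial blocks rather than random observations, so that autocorrelation between nearby sites cannot inflate the validation score. A minimal sketch with synthetic site coordinates follows; the grid size and the least-squares model are assumptions for illustration, not the thesis's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(5)
xy = rng.uniform(0.0, 100.0, (400, 2))        # site coordinates
x = rng.normal(size=400)                      # environmental covariate
y = 2.0 * x + rng.normal(size=400)            # response with unit noise

# Assign each site to a 4x4 grid of spatial blocks
block = (xy[:, 0] // 25).astype(int) * 4 + (xy[:, 1] // 25).astype(int)

errors = []
for b in np.unique(block):                    # leave one whole block out
    train, test = block != b, block == b
    X = np.column_stack([np.ones(train.sum()), x[train]])
    beta, *_ = np.linalg.lstsq(X, y[train], rcond=None)
    pred = beta[0] + beta[1] * x[test]
    errors.append(np.mean((y[test] - pred) ** 2))
cv_mse = float(np.mean(errors))               # spatially blocked CV error
```

With spatially autocorrelated residuals, the same loop gives a much more honest error estimate than random k-fold splitting, because held-out sites are never adjacent to training sites.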
Kristensen, Emmanuelle. "Méthodologie de traitement conjoint des signaux EEG et oculométriques : applications aux tâches d'exploration visuelle libre." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAS020/document.
Full text
Our research focuses on the issue of overlapping in evoked potential estimation. More specifically, this issue is a significant limitation for the estimation of Eye-Fixation Related Potentials and Eye-Saccade Related Potentials during joint EEG and eye-tracking recordings. Indeed, the usual estimation, by averaging the signal time-locked to the event of interest, is based on the assumption that a single evoked potential occurs during a trial. However, depending on the inter-stimulus intervals, this assumption is not always verified. This is especially the case for Eye-Fixation Related Potentials and Eye-Saccade Related Potentials, given that the intervals between fixations (or saccades) are not controlled by the experimenter and can be shorter than the latencies of the potentials of interest. When this assumption is not verified, the estimate of the evoked potential is distorted by the overlaps between evoked potentials. We therefore used the General Linear Model (GLM), a well-known linear regression method, to estimate the potentials evoked by ocular movements while taking overlaps into account. First, we introduced a Tikhonov regularization term into this model in order to improve the signal-to-noise ratio of the estimate for a small number of trials. Then, we compared the GLM to the ADJAR algorithm in a context of joint EEG and eye-tracking recording during a task of visual exploration of natural scenes. The ADJAR ("ADJAcent Response") algorithm is an iterative algorithm for estimating temporal overlaps, developed in 1993 by M. Woldorff. The results showed that the GLM was more flexible and robust than the ADJAR algorithm for estimating Eye-Fixation Related Potentials. Further, two GLM configurations were compared in their estimation of the potential evoked at stimulus onset and of the eye-fixation related potential at the beginning of exploration.
Both configurations took the overlaps between evoked potentials into account, but one additionally distinguished the potential evoked by the first fixation of the exploration from the potential evoked by the following fixations. It became clear that the choice of GLM configuration was a compromise between the estimation quality of the potentials and the assumptions made about the underlying cognitive processes. Finally, we conducted an extensive joint EEG and eye-tracking experiment on the exploration of static and dynamic natural emotional facial expressions, and presented the first results for the static modality. After discussing the estimation method for the evoked potentials according to the impact of the ocular movements on their latency window, we studied the influence of the type of emotion. We found modulations of the differential EPN (Early Posterior Negativity) potential, between 230 and 350 ms after stimulus onset, and of the Late Positive Potential (LPP), between 400 and 600 ms after stimulus onset. We also observed variations in the Eye-Fixation Related Potentials. Regarding the LPP component, a marker of conscious recognition of emotion, we have shown that it is important to dissociate information that is immediately encoded at the onset of the emotional stimulus from information encoded at the first fixations. This reveals a differentiated pattern of activation according to the valence of the emotional stimulus, in agreement with the hypothesis that negative emotional stimuli are processed faster than positive ones.
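The overlap-correcting GLM can be sketched as a regression of the continuous EEG on lagged event indicators, with a Tikhonov (ridge) penalty on the coefficients: overlapping responses are disentangled by least squares instead of being distorted by averaging. Everything below (response shapes, event timings, the regularization strength) is synthetic and illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, L = 6000, 40                          # EEG samples, response length
true_stim = np.hanning(L)                # assumed stimulus-evoked response
true_fix = 0.5 * np.hanning(L)           # assumed fixation-evoked response

stim_onsets = np.arange(50, n - 200, 300)
# Fixations follow each stimulus at jittered latencies, so responses overlap
fix_onsets = np.concatenate(
    [s + np.array([25, 70, 120]) + rng.integers(0, 30, 3) for s in stim_onsets])

def lagged_design(onsets):
    """Column l is 1 at time (onset + l): a time-expanded event regressor."""
    X = np.zeros((n, L))
    for t in onsets:
        X[t:t + L, :] += np.eye(L)
    return X

X = np.hstack([lagged_design(stim_onsets), lagged_design(fix_onsets)])
y = X @ np.concatenate([true_stim, true_fix]) + 0.1 * rng.standard_normal(n)

lam = 1.0                                # Tikhonov regularization strength
beta = np.linalg.solve(X.T @ X + lam * np.eye(2 * L), X.T @ y)
stim_est, fix_est = beta[:L], beta[L:]   # disentangled overlapping responses
```

The jitter in fixation latencies is what makes the two responses separable; with perfectly constant latencies the design matrix would be collinear and no regression could split the overlap.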
Maumet, Camille. "From group to patient-specific analysis of brain function in arterial spin labelling and BOLD functional MRI." Phd thesis, Université Rennes 1, 2013. http://tel.archives-ouvertes.fr/tel-00863908.
Full text
Ouellette, Marie-Hélène. "L’arbre de régression multivariable et les modèles linéaires généralisés revisités : applications à l’étude de la diversité bêta et à l’estimation de la biomasse d’arbres tropicaux." Thèse, 2011. http://hdl.handle.net/1866/5906.
Full text
In ecology, in ecosystem services studies for example, descriptive, explanatory and predictive modelling all have relevance in different situations. Particular circumstances may require one or the other type of modelling; it is important to choose the method properly to ensure that the final model fits the study's goal. In this thesis, we first explore the explanatory power of the multivariate regression tree (MRT). This modelling technique is based on a recursive bipartitioning algorithm. The tree is fully grown by successive bipartitions and is then pruned by resampling in order to reveal the tree providing the best predictions. This asymmetric analysis of two tables produces groups that are homogeneous in terms of the response and constrained by splitting levels in the values of some of the most important explanatory variables. We show that, to calculate the explanatory power of an MRT, an appropriate adjusted coefficient of determination must include an estimate of the degrees of freedom of the MRT model, obtained through an algorithm. This estimate of the population coefficient of determination is practically unbiased. Since MRT is based on discontinuity premises whereas canonical redundancy analysis (RDA) models continuous linear gradients, comparing their explanatory powers enables one to distinguish between these two patterns of species distribution along the explanatory variables. The extensive use of RDA for the study of beta diversity motivated the comparison between its explanatory power and that of MRT. In an explanatory perspective again, we define a new procedure called a cascade of multivariate regression trees (CMRT). This procedure makes it possible to compute an MRT model in which an order is imposed on nested explanatory hypotheses. CMRT provides a framework to study the exclusive effects of a main and a subordinate set of explanatory variables by calculating their explanatory powers.
The final model is interpreted as in nested MANOVA. New information may arise from this analysis about the relationship between the response and the explanatory variables, for example interaction effects between the two explanatory data sets that were not evidenced by the usual MRT model. On the other hand, we study the predictive power of generalized linear models (GLM) for predicting individual tropical tree biomass as a function of allometric shape variables. In particular, we examine the capacity of Gaussian and gamma error structures to provide the most precise predictions. We show that, for a particular species, the gamma error structure is superior in terms of predictive power. This study is set in a practical framework; it is meant to be used as a tool for managers who need to precisely estimate the amount of carbon captured by tropical tree plantations. Our conclusions could be integrated within a program of carbon emission reduction through land use changes.
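A gamma-error GLM with a log link for allometric biomass prediction can be sketched with Fisher scoring; for this family and link the working weights are constant, so each update is an ordinary least-squares step. The allometric coefficients and gamma shape below are arbitrary illustrations, not values estimated in the thesis.

```python
import numpy as np

def fit_gamma_glm_log(X, y, n_iter=25):
    """Gamma GLM with log link via Fisher scoring; with V(mu) = mu^2 and a
    log link, the IRLS working weights are constant."""
    eta = np.full(y.size, np.log(y.mean()))     # starting values
    for _ in range(n_iter):
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                 # working response
        beta, *_ = np.linalg.lstsq(X, z, rcond=None)
        eta = X @ beta
    return beta

# Hypothetical allometry: biomass = a * DBH^b with multiplicative gamma error
rng = np.random.default_rng(3)
dbh = rng.uniform(5.0, 60.0, 300)               # diameter at breast height (cm)
mu = 0.05 * dbh ** 2.4
y = rng.gamma(shape=10.0, scale=mu / 10.0)      # gamma noise with mean mu
X = np.column_stack([np.ones(dbh.size), np.log(dbh)])
beta = fit_gamma_glm_log(X, y)                  # beta[1] estimates the exponent b
```

The gamma family encodes a constant coefficient of variation (error proportional to the mean), which is often more realistic for tree biomass than the constant absolute error implied by a Gaussian fit.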
Tomas, Julien. "Mesure des risques biométriques liés à l'assurance vie avec des méthodes non-paramétriques." Phd thesis, 2013. http://tel.archives-ouvertes.fr/tel-00778755.
Full text
Sautié Castellanos, Miguel. "Assessing the robustness of genetic codes and genomes." Thesis, 2020. http://hdl.handle.net/1866/24333.
Full text
There are two main approaches to assessing the robustness of genetic codes and coding sequences. The statistical approach is based on empirical estimates of probabilities computed from random samples of permutations representing assignments of amino acids to codons, whereas the optimization-based approach relies on the optimization percentage, frequently computed using metaheuristics. We propose a method based on the first two moments of the distribution of robustness values over all possible genetic codes. Based on a polynomially solvable instance of the Quadratic Assignment Problem, we also propose an exact greedy algorithm to find the minimum value of genome robustness. To reduce the number of operations needed to compute the scores and Cantelli's upper bound, we developed methods based on, among others, the genetic code neighborhood structure and pairwise comparisons between genetic codes. To assess the robustness of natural genetic codes and genomes, we chose 23 natural genetic codes, 235 amino acid properties, and 324 thermophilic and 418 non-thermophilic prokaryotes. Among our results, we found that although the standard genetic code is more robust than most genetic codes, some mitochondrial and nuclear genetic codes are more robust than the standard code at the third and first codon positions, respectively. We also observed that synonymous codon usage tends to be highly optimized to buffer the impact of single-base changes, mainly in thermophilic prokaryotes.
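The statistical approach described above can be sketched for a single amino-acid property: score a code by the mean squared property change over all single-base substitutions, then compare the standard code with random reassignments of amino acids to its synonymous codon blocks. Kyte-Doolittle hydropathy is used here purely for illustration (the thesis evaluates 235 properties).

```python
import numpy as np

BASES = "TCAG"
# Standard genetic code (NCBI transl_table=1), codons enumerated in TCAG order
CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Kyte-Doolittle hydropathy as the amino-acid property (illustrative choice)
HYDRO = {"I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
         "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
         "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
         "R": -4.5}

codons = [a + b + c for a in BASES for b in BASES for c in BASES]
table = dict(zip(codons, CODE))

def robustness(code):
    """Mean squared hydropathy change over all single-base substitutions
    (pairs involving a stop codon are skipped); lower = more robust."""
    diffs = []
    for cod, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for b in BASES:
                if b != cod[pos]:
                    aa2 = code[cod[:pos] + b + cod[pos + 1:]]
                    if aa2 != "*":
                        diffs.append((HYDRO[aa] - HYDRO[aa2]) ** 2)
    return float(np.mean(diffs))

std_score = robustness(table)

# Statistical approach: permute amino acids among synonymous codon blocks
rng = np.random.default_rng(4)
aas = sorted(set(CODE) - {"*"})
rand_scores = []
for _ in range(100):
    perm = dict(zip(aas, rng.permutation(aas)))
    rand_scores.append(robustness(
        {c: (a if a == "*" else perm[a]) for c, a in table.items()}))
# The standard code scores below the random mean: it buffers point mutations
```

Permuting whole synonymous blocks preserves the code's degeneracy structure, so the comparison isolates the effect of which amino acid sits in which block, which is the permutation space the statistical approach samples.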