Dissertations / Theses: 'Generalised Estimating Equation'

1

Lange, Christoph. "Generalized estimating equation methods in statistical genetics." Thesis, University of Reading, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.269921.

2

Alnaji, Lulah A. "Generalized Estimating Equations for Mixed Models." Bowling Green State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1530292694012892.

3

Zheng, Xueying, and 郑雪莹. "Robust joint mean-covariance model selection and time-varying correlation structure estimation for dependent data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2013. http://hub.hku.hk/bib/B50899703.

Abstract:

In longitudinal and spatio-temporal data analysis, repeated measurements from a subject can be either regionally or temporally dependent. Correct specification of the within-subject covariance matrix yields efficient estimation of the mean regression coefficients. In this thesis, robust joint estimation of the mean and covariance for regression models of longitudinal data is developed within the framework of generalized estimating equations (GEE). The proposed approach integrates the robust method and joint mean-covariance regression modeling. Robust generalized estimating equations using bounded scores and leverage-based weights are employed for the mean and covariance to achieve robustness against outliers. The resulting estimators are shown to be consistent and asymptotically normally distributed. A robust variable selection method for the joint mean and covariance model is also considered, based on a set of penalized robust generalized estimating equations that simultaneously estimate the mean regression coefficients, the generalized autoregressive coefficients and the innovation variances introduced by the modified Cholesky decomposition. These estimating equations select important covariates in both the mean and covariance models as part of the estimation procedure. Under some regularity conditions, the oracle property of the proposed robust variable selection method is established. For these two robust joint mean and covariance models, simulation studies and a hormone data set analysis are carried out to assess and illustrate the small-sample performance. They show that the proposed methods perform favorably by combining the robustifying and penalized estimating techniques in the joint mean and covariance model.
Capturing dynamic change in a time-varying correlation structure is both interesting and scientifically important in spatio-temporal data analysis. The time-varying empirical estimator of the spatial correlation matrix is approximated by groups of selected basis matrices which represent substructures of the correlation matrix. After projecting the correlation structure matrix onto the space spanned by basis matrices, varying-coefficient model selection and estimation for signals associated with relevant basis matrices are incorporated. The unique feature of the proposed model and estimation is that time-dependent local region signals can be detected by the proposed penalized objective function. In theory, model selection consistency on detecting local signals is provided. The proposed method is illustrated through simulation studies and a functional magnetic resonance imaging (fMRI) data set from an attention deficit hyperactivity disorder (ADHD) study.
Statistics and Actuarial Science
Doctoral
Doctor of Philosophy

4

Zhang, Xiaohong. "Generalized estimating equations for clustered survival data." [Ames, Iowa : Iowa State University], 2006.

5

Jang, Mi Jin. "Working correlation selection in generalized estimating equations." Diss., University of Iowa, 2011. https://ir.uiowa.edu/etd/2719.

Abstract:

Longitudinal data analysis is common in biomedical research. The generalized estimating equations (GEE) approach is widely used for longitudinal marginal models. The GEE method is known to provide consistent regression parameter estimates regardless of the choice of working correlation structure, provided that √n-consistent estimates of the nuisance parameters are used. However, it is important to use an appropriate working correlation structure in small samples, since it improves the statistical efficiency of the β estimate. Several working correlation selection criteria have been proposed (Rotnitzky and Jewell, 1990; Pan, 2001; Hin and Wang, 2009; Shults et al., 2009). However, these selection criteria share the limitation that they perform poorly when over-parameterized structures are considered as candidates. In this dissertation, new working correlation selection criteria are developed based on generalized eigenvalues. A set of generalized eigenvalues is used to measure the disparity between the bias-corrected sandwich variance estimator under the hypothesized working correlation matrix and the model-based variance estimator under a working independence assumption. A summary measure based on the set of generalized eigenvalues provides an indication of the disparity between the true correlation structure and the misspecified working correlation structure. Motivated by the test statistics in MANOVA, three working correlation selection criteria are proposed: PT (Pillai's trace type criterion), WR (Wilks' ratio type criterion) and RMR (Roy's Maximum Root type criterion). The relationship between these generalized eigenvalues and the CIC measure is revealed. In addition, this dissertation proposes a method to penalize over-parameterized working correlation structures. The over-parameterized structure converges to the true correlation structure, using extra parameters. Thus, the true correlation structure and the over-parameterized structure tend to provide similar variance estimates of the estimated β and similar working correlation selection criterion values. However, the over-parameterized structure is more likely to be chosen as the best working correlation structure by "the smaller the better" rule for criterion values. This is because over-parameterization leads to a negatively biased sandwich variance estimator, and hence a smaller selection criterion value. In this dissertation, the over-parameterized structure is penalized through cluster detection and an optimization function. In order to find the group ("cluster") of working correlation structures that are similar to each other, a cluster detection method is developed, based on spacings of the order statistics of the selection criterion measures. Once a cluster is found, the optimization function considering the trade-off between bias and variability provides the choice of the "best" approximating working correlation structure. The performance of our proposed criterion measures relative to other relevant criteria (QIC, RJ and CIC) is examined in a series of simulation studies.
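To make the notion of candidate working correlation structures concrete, here is a minimal Python sketch of two common structures and of how many correlation parameters each candidate spends. The function names and the parameter-count table are illustrative assumptions, not code or notation from the dissertation, and the sketch does not implement the proposed PT, WR or RMR criteria.

```python
def exchangeable(n, rho):
    """n x n exchangeable (compound-symmetry) correlation matrix."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """n x n first-order autoregressive correlation matrix: rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def n_correlation_params(structure, n):
    """Number of free correlation parameters for a cluster of size n.

    An unstructured matrix spends n(n-1)/2 parameters, which is why it
    can mimic simpler true structures (the over-parameterization issue).
    """
    counts = {
        "independence": 0,
        "exchangeable": 1,
        "ar1": 1,
        "unstructured": n * (n - 1) // 2,
    }
    return counts[structure]


if __name__ == "__main__":
    for row in ar1(4, 0.5):
        print(row)
    print(n_correlation_params("unstructured", 4))
```

With four repeated measurements, the unstructured candidate already uses six parameters against one for the exchangeable or AR(1) structures, which illustrates why a selection criterion may need an explicit penalty.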

6

Sepato, Sandra Moepeng. "Generalized linear mixed model and generalized estimating equation for binary longitudinal data." Diss., University of Pretoria, 2014. http://hdl.handle.net/2263/43143.

Abstract:

The most common analysis used for binary data is the generalized linear model (GLM) with either a binomial or Bernoulli distribution, using a logit, probit, complementary log-log or other type of link function. However, such analyses violate the independence assumption if the binary data are measured repeatedly over time on the same subject or site. Failure to take the correlation into account can lead to incorrect estimation of regression parameters, and the estimates are less efficient, particularly when the correlations are large. Therefore, to obtain the most efficient estimates that are also unbiased, methods that incorporate correlations (McCullagh and Nelder, 1989) should be used. Two of the statistical methodologies that can be used to account for this correlation in longitudinal data are generalized linear mixed models (GLMMs) and generalized estimating equations (GEE). The GLMM method is based on extending the fixed effects GLM to include random effects and covariance patterns. Unlike the GLM and GLMM methods, the GEE method is based on quasi-likelihood theory, and no assumption is made about the distribution of response observations (Liang and Zeger, 1986). The main objective of the study is to investigate the statistical properties and limitations of these three approaches, i.e. GLM, GLMMs and GEE, for analyzing longitudinal data through the use of binary data from an entomology study. The results reaffirm the point made by these authors that misspecification of the working correlation in the GEE approach would still give consistent regression parameter estimates. Further, the results of this study suggest that even with small correlation, ignoring random effects in a binary model can lead to inconsistent estimation.
Dissertation (MSc)--University of Pretoria, 2014.
Statistics
MSc
Unrestricted
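The cost of ignoring within-subject correlation can be seen in a deliberately tiny case. The sketch below assumes an intercept-only model with identity link and an independence working correlation, for which the GEE point estimate is just the overall proportion, and compares the model-based (naive) variance with the cluster-robust sandwich variance. The function name and the toy data are hypothetical and are not taken from the dissertation.

```python
def gee_intercept_only(clusters):
    """Independence-working-correlation GEE for an intercept-only,
    identity-link model of binary outcomes.

    Returns (estimate, naive variance, sandwich variance).
    """
    n_obs = sum(len(c) for c in clusters)
    # Solving sum_i sum_j (y_ij - mu) = 0 gives the overall proportion.
    mu = sum(sum(c) for c in clusters) / n_obs
    # Naive variance treats all n_obs observations as independent.
    naive_var = mu * (1.0 - mu) / n_obs
    # Sandwich "meat": squared sum of residuals within each cluster.
    meat = sum(sum(y - mu for y in c) ** 2 for c in clusters)
    # Sandwich "bread" is n_obs for this model, so var = meat / n_obs^2.
    robust_var = meat / n_obs ** 2
    return mu, naive_var, robust_var


if __name__ == "__main__":
    # Four subjects, four perfectly correlated measurements each.
    clusters = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
    print(gee_intercept_only(clusters))  # (0.5, 0.015625, 0.0625)
```

The point estimate is unaffected by the working independence assumption, but the sandwich variance is four times the naive one here, matching the cluster size when measurements within a subject are perfectly correlated.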

7

Huang, Danwei. "Robustness of generalized estimating equations in credibility models." Thesis, The University of Hong Kong, 2007. http://sunzi.lib.hku.hk/hkuto/record/B38842312.

8

Huang, Danwei, and 黃丹薇. "Robustness of generalized estimating equations in credibility models." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2007. http://hub.hku.hk/bib/B38842312.

9

Clark, Seth K. "Model Robust Regression Based on Generalized Estimating Equations." Diss., Virginia Tech, 2002. http://hdl.handle.net/10919/26588.

Abstract:

One form of model robust regression (MRR) predicts mean response as a convex combination of a parametric and a nonparametric prediction. MRR is a semiparametric method by which an incompletely or an incorrectly specified parametric model can be improved through adding an appropriate amount of a nonparametric fit. The combined predictor can have less bias than the parametric model estimate alone and less variance than the nonparametric estimate alone. Additionally, as shown in previous work for uncorrelated data with linear mean function, MRR can converge faster than the nonparametric predictor alone. We extend the MRR technique to the problem of predicting mean response for clustered non-normal data. We combine a nonparametric method based on local estimation with a global, parametric generalized estimating equations (GEE) estimate through a mixing parameter on both the mean scale and the linear predictor scale. As a special case, when data are uncorrelated, this amounts to mixing a local likelihood estimate with predictions from a global generalized linear model. Cross-validation bandwidth and optimal mixing parameter selectors are developed. The global fits and the optimal and data-driven local and mixed fits are studied under no/some/substantial model misspecification via simulation. The methods are then illustrated through application to data from a longitudinal study.
Ph. D.
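The convex combination at the heart of MRR can be sketched in a few lines of Python. The function name `mrr_mix` and the symbol `lam` for the mixing parameter are assumptions made for illustration; the sketch takes the two sets of predictions as given and does not cover how the dissertation fits them or selects the mixing parameter.

```python
def mrr_mix(parametric, nonparametric, lam):
    """Mix parametric and nonparametric predictions point by point.

    lam = 0 returns the parametric fit, lam = 1 the nonparametric fit.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("mixing parameter must lie in [0, 1]")
    if len(parametric) != len(nonparametric):
        raise ValueError("prediction vectors must have equal length")
    return [(1.0 - lam) * p + lam * q
            for p, q in zip(parametric, nonparametric)]


if __name__ == "__main__":
    global_fit = [1.0, 2.0]   # hypothetical parametric (e.g. GEE) predictions
    local_fit = [3.0, 4.0]    # hypothetical local (nonparametric) predictions
    print(mrr_mix(global_fit, local_fit, 0.25))  # [1.5, 2.5]
```

A small mixing parameter keeps the prediction close to the parametric model and adds only a modest nonparametric correction, which is the behaviour wanted when the parametric model is nearly correct.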

10

Zhao, Chen. "Evaluating Health Policy Effect with Generalized Linear Model and Generalized Estimating Equation Model." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1586377218891854.

11

Cai, Jianwen. "Generalized estimating equations for censored multivariate failure time data." Thesis, University of Washington, 1992. http://hdl.handle.net/1773/9581.

12

Hua, Lei. "Spline-based sieve semiparametric generalized estimating equation for panel count data." Diss., University of Iowa, 2010. https://ir.uiowa.edu/etd/517.

Abstract:

In this thesis, we propose to analyze panel count data using a spline-based sieve generalized estimating equation method with a semiparametric proportional mean model $E(N(t) \mid Z) = \Lambda_0(t)\, e^{\beta_0^T Z}$. The natural log of the baseline mean function, $\log \Lambda_0(t)$, is approximated by a monotone cubic B-spline function. The estimates of regression parameters and spline coefficients are the roots of the spline-based sieve generalized estimating equations (sieve GEE). The proposed method avoids assuming any parametric structure of the baseline mean function and the underlying counting process. Selection of an appropriate covariance matrix that represents the true correlation between the cumulative counts improves estimating efficiency. In addition to the parameters existing in the proportional mean function, the estimation that accounts for the over-dispersion and autocorrelation involves an extra nuisance parameter $\sigma^2$, which could be estimated using a method of moments proposed by Zeger (1988). The parameters in the mean function are then estimated by solving the pseudo generalized estimating equation with $\sigma^2$ replaced by its estimate, $\hat{\sigma}_n^2$. We show that the estimate of $(\beta_0, \Lambda_0)$ based on this two-stage approach is still consistent and could converge at the optimal convergence rate in the nonparametric/semiparametric regression setting. The asymptotic normality of the estimate of $\beta_0$ is also established. We further propose a spline-based projection variance estimating method and show its consistency. Simulation studies are conducted to investigate finite sample performance of the sieve semiparametric GEE estimates, as well as different variance estimating methods with different sample sizes. The covariance matrix that accounts for the overdispersion generally increases estimating efficiency when overdispersion is present in the data. Finally, the proposed method with different covariance matrices is applied to real data from a bladder tumor clinical trial.

13

Brady, Kaitlyn. "Learning Curves in Emergency Ultrasonography." Digital WPI, 2012. https://digitalcommons.wpi.edu/etd-theses/1150.

Abstract:

"This project utilized generalized estimating equations and general linear modeling to model learning curves for sonographer performance in emergency ultrasonography. Performance was measured in two ways: image quality (interpretable vs. possible hindrance in interpretation) and agreement of findings between the sonographer and an expert reviewing sonographer. Records from 109 sonographers were split into two data sets-- training (n=50) and testing (n=59)--to conduct exploratory analysis and fit the final models for analysis, respectively. We determined that the number of scans of a particular exam type required for a sonographer to obtain quality images on that exam type with a predicted probability of 0.9 is highly dependent upon the person conducting the review, the indication of the scan (educational or medical), and the outcome of the scan (whether there is a pathology positive finding). Constructing family-wise 95% confidence intervals for each exam type demonstrated a large amount of variation for the number of scans required both between exam types and within exam types. It was determined that a sonographer's experience with a particular exam type is not a significant predictor of future agreement on that exam type and thus no estimates were made based on the agreement learning curves. In addition, we concluded based on a type III analysis that when already considering exam type related experience, the consideration of experience on other exam types does not significantly impact the learning curve for quality. However, the learning curve for agreement is significantly impacted by the additional consideration of experience on other exam types."

14

Akanda, Md Abdus Salam. "A generalized estimating equations approach to capture-recapture closed population models: methods." Doctoral thesis, Universidade de Évora, 2014. http://hdl.handle.net/10174/18297.

Abstract:

Wildlife population parameters, such as capture or detection probabilities, and density or population size, can be estimated from capture-recapture data. These estimates are of particular interest to ecologists and biologists who rely on accurate inferences for management and conservation of the population of interest. However, researchers face many challenges in making accurate inferences on population parameters. For instance, capture-recapture data can be considered as binary longitudinal observations, since repeated measurements are collected on the same individuals across successive points in time, and these observations are often correlated over time. If these correlations are not taken into account when estimating capture probabilities, then parameter estimates will be biased, possibly producing misleading results. Also, an estimator of population size is generally biased in the presence of heterogeneity in capture probabilities. The use of covariates (or auxiliary variables), when available, has been proposed as a way to cope with the problem of heterogeneous capture probabilities. In this dissertation, we are interested in tackling these two main problems: (i) when capture probabilities are dependent among capture occasions in closed population capture-recapture models, and (ii) when capture probabilities are heterogeneous among individuals. Hence, the capture-recapture literature can be improved by an approach that jointly accounts for these problems.
In summary, this dissertation proposes: (i) a generalized estimating equations (GEE) approach to model possible effects in capture-recapture closed population studies due to correlation over time and individual heterogeneity; (ii) the corresponding estimating equations for each closed population capture-recapture model; (iii) a comprehensive analysis of various real capture-recapture data sets using classical, GEE and generalized linear mixed models (GLMM); (iv) an evaluation of the effect of accounting for correlation structures on capture-recapture model selection based on the Quasi-likelihood Information Criterion (QIC); (v) a comparison of the performance of population size estimators using GEE and GLMM approaches in the analysis of capture-recapture data. The performance of these approaches is evaluated by Monte Carlo (MC) simulation studies resembling real capture-recapture data. The proposed GEE approach provides a useful inference procedure for estimating population parameters, particularly when a large proportion of individuals are captured. For a low capture proportion, it is difficult to obtain reliable estimates for all approaches, but the GEE approach outperforms the other methods. Simulation results show that quasi-likelihood GEE provides lower standard errors than partial likelihood based on generalized linear modelling (GLM) and GLMM approaches. The estimated population sizes vary with the nature of the existing correlation among capture occasions.

15

Penzl, T. "Numerical solution of generalized Lyapunov equations." Universitätsbibliothek Chemnitz, 1998. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-199800893.

Abstract:

Two efficient methods for solving generalized Lyapunov equations and their implementations in FORTRAN 77 are presented. The first one is a generalization of the Bartels--Stewart method and the second is an extension of Hammarling's method to generalized Lyapunov equations. Our LAPACK based subroutines are implemented in a quite flexible way. They can handle the transposed equations and provide scaling to avoid overflow in the solution. Moreover, the Bartels--Stewart subroutine offers the optional estimation of the separation and the reciprocal condition number. A brief description of both algorithms is given. The performance of the software is demonstrated by numerical experiments.

16

Onnen, Nathaniel J. "Estimation of Bivariate Spatial Data." The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1616243660473062.

17

Lin, Wei-Lun. "Selecting the Working Correlation Structure by a New Generalized AIC Index for Longitudinal Data." Digital Archive @ GSU, 2007. http://digitalarchive.gsu.edu/math_theses/37.

Abstract:

The analysis of longitudinal data has been a popular subject in recent years. The Generalized Estimating Equation (GEE) method (Liang & Zeger, 1986) is one of the most influential recent developments in statistical practice in this area. GEE methods are attractive both from a theoretical and a practical standpoint. In this paper, we are interested in the influence of different "working" correlation structures for modeling longitudinal data. Furthermore, we propose a new AIC-like method for model assessment which generalizes AIC from the point of view of data generation. By comparing the difference of the log-likelihood functions between different correlation models, we define the exact value to create an interval for our model selection. In this thesis, we combine the GEE method and a new generalized AIC index for longitudinal data with different correlation structures.

18

Cao, Jiguo. "Generalized profiling method and the applications to adaptive penalized smoothing, generalized semiparametric additive models and estimating differential equations." Thesis, McGill University, 2006. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=102483.

Abstract:

Many statistical models involve three distinct groups of variables: local or nuisance parameters, global or structural parameters, and complexity parameters. In this thesis, we introduce the generalized profiling method to estimate these statistical models, which treats one group of parameters as an explicit or implicit function of other parameters. The dimensionality of the parameter space is reduced, and the optimization surface becomes smoother. The Newton-Raphson algorithm is applied to estimate these three distinct groups of parameters in three levels of optimization, with the gradients and Hessian matrices written out analytically by the Implicit Function Theorem if necessary and allowing for different criteria for each level of optimization. Moreover, variances of global parameters are estimated by the Delta method and include the variation coming from complexity parameters. We also propose three applications of the generalized profiling method.
First, penalized smoothing is extended by allowing for a functional smoothing parameter, which is adaptive to the geometry of the underlying curve, which is called adaptive penalized smoothing. In the first level of optimization, the smoothing coefficients are local parameters, estimated by minimizing sum of squared errors, conditional on the functional smoothing parameter. In the second level, the functional smoothing parameter is a complexity parameter, estimated by minimizing generalized cross-validation (GCV), treating the smoothing coefficients as explicit functions of the functional smoothing parameter. Adaptive penalized smoothing is shown to obtain better estimates for fitting functions and their derivatives.
Next, the generalized semiparametric additive models are estimated by three levels of optimization, allowing response variables in any kind of distribution. In the first level, the nonparametric functional parameters are nuisance parameters, estimated by maximizing the regularized likelihood function, conditional on the linear coefficients and the smoothing parameter. In the second level, the linear coefficients are structural parameters, estimated by maximizing the likelihood function with the nonparametric functional parameters treated as implicit functions of linear coefficients and the smoothing parameter. In the third level, the smoothing parameter is a complexity parameter, estimated by minimizing the approximated GCV with the linear coefficients treated as implicit functions of the smoothing parameter. This method is applied to estimate the generalized semiparametric additive model for the effect of air pollution on the public health.
Finally, parameters in differential equations (DE's) are estimated from noisy data with the generalized profiling method. In the first level of optimization, fitting functions are estimated to approximate DE solutions by penalized smoothing with the penalty term defined by DE's, fixing values of DE parameters. In the second level of optimization, DE parameters are estimated by weighted sum of squared errors, with the smoothing coefficients treated as an implicit function of DE parameters. The effects of the smoothing parameter on DE parameter estimates are explored and the optimization criteria for smoothing parameter selection are discussed. The method is applied to fit the predator-prey dynamic model to biological data, to estimate DE parameters in the HIV dynamic model from clinical trials, and to explore dynamic models for thermal decomposition of alpha-Pinene.
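The idea of treating one group of parameters as a function of another can be illustrated with a much simpler two-level problem than those in the thesis. The Python sketch below is an assumption-laden toy: it fits y = c * exp(-theta * t), where the amplitude c plays the role of a local parameter with a closed-form least-squares solution for each theta, and theta plays the role of the structural parameter found by an outer grid search. It replaces the Newton-Raphson steps and the smoothing penalty described in the abstract with a plain grid search, and all names are hypothetical.

```python
import math


def profile_fit(times, ys, theta_grid):
    """Two-level profiled least squares for y = c * exp(-theta * t).

    Inner level: for a fixed theta, c has the closed form
        c(theta) = sum(y * b) / sum(b * b),  b = exp(-theta * t).
    Outer level: theta minimizes the profiled sum of squared errors.
    """
    best = None
    for theta in theta_grid:
        basis = [math.exp(-theta * t) for t in times]
        c = sum(y * b for y, b in zip(ys, basis)) / sum(b * b for b in basis)
        sse = sum((y - c * b) ** 2 for y, b in zip(ys, basis))
        if best is None or sse < best[0]:
            best = (sse, theta, c)
    return best[1], best[2]


if __name__ == "__main__":
    times = [0.5 * k for k in range(10)]
    # Noise-free synthetic data with c = 2 and theta = 0.5.
    ys = [2.0 * math.exp(-0.5 * t) for t in times]
    theta_grid = [k / 1000 for k in range(1, 3001)]
    theta_hat, c_hat = profile_fit(times, ys, theta_grid)
    print(theta_hat, c_hat)
```

Because c is eliminated analytically, the outer search runs over a single dimension instead of two, which is the dimension reduction the abstract attributes to profiling.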

19

Shin, Janey. "Evaluation of candidate genes in family studies, generalized estimating equations and bootstrap approaches." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0002/MQ40723.pdf.

20

Liu, Fangda, and 刘芳达. "Two results in financial mathematics and bio-statistics." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2011. http://hub.hku.hk/bib/B46976437.

21

Xu, Yanzhi. "Effective GPS-based panel survey sample size for urban travel behavior studies." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/33843.

Abstract:

This research develops a framework to estimate the effective sample size of Global Positioning System (GPS) based panel surveys in urban travel behavior studies for a variety of planning purposes. Recent advances in GPS monitoring technologies have made it possible to implement panel surveys with lengths of weeks, months or even years. The many advantageous features of GPS-based panel surveys make such surveys attractive for travel behavior studies, but the higher cost of such surveys compared to conventional one-day or two-day paper diary surveys requires scrutiny at the sample size planning stage to ensure cost-effectiveness. The sample size analysis in this dissertation focuses on three major aspects in travel behavior studies: 1) to obtain reliable means for key travel behavior variables, 2) to conduct regression analysis on key travel behavior variables against explanatory variables such as demographic characteristics and seasonal factors, and 3) to examine impacts of a policy measure on travel behavior through before-and-after studies. The sample size analyses in this dissertation are based on the GPS data collected in the multi-year Commute Atlanta study. The sample size analysis with regard to obtaining reliable means for key travel behavior variables utilizes Monte Carlo re-sampling techniques to assess the trend of means against various sample size and survey length combinations. The basis for the framework and methods of sample size estimation related to regression analysis and before-and-after studies are derived from various sample size procedures based on the generalized estimating equation (GEE) method. These sample size procedures have been proposed for longitudinal studies in biomedical research. This dissertation adapts these procedures to the design of panel surveys for urban travel behavior studies with the information made available from the Commute Atlanta study. 
The findings from this research indicate that the required sample sizes should be much larger than the sample sizes in existing GPS-based panel surveys. This research recommends a desired range of sample sizes based on the objectives and survey lengths of urban travel behavior studies.
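A rough feel for why repeated observations on the same panel member carry less information than independent ones comes from the textbook design-effect approximation. The Python sketch below assumes an exchangeable within-subject correlation `rho`, an equal number of observation days per subject, and a simple mean as the target; it is not one of the GEE-based sample size procedures adapted in the dissertation, and the function names are hypothetical.

```python
def design_effect(n_waves, rho):
    """Variance inflation for a mean of n_waves equally correlated
    observations per subject: 1 + (n_waves - 1) * rho."""
    return 1.0 + (n_waves - 1) * rho


def effective_sample_size(n_subjects, n_waves, rho):
    """Number of independent observations carrying the same information
    as n_subjects * n_waves correlated panel observations."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho is assumed to lie in [0, 1]")
    return n_subjects * n_waves / design_effect(n_waves, rho)


if __name__ == "__main__":
    # 100 households observed for 7 days each.
    for rho in (0.0, 0.5, 1.0):
        print(rho, effective_sample_size(100, 7, rho))
```

Under these assumptions, 700 person-days are worth 700 independent observations when rho is 0, 175 when rho is 0.5, and only 100 when days within a household are perfectly correlated.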

22

Brewer, Ciara. "Using generalized estimating equations with regression splines to improve analysis of butterfly transect data." St Andrews, 2008. http://hdl.handle.net/10023/488.

23

Campbell, David Alexander. "Bayesian collocation tempering and generalized profiling for estimation of parameters from differential equation models." Thesis, McGill University, 2007. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=103368.

Abstract:

The widespread use of ordinary differential equation (ODE) models has long been underrepresented in the statistical literature. The most common methods for estimating parameters from ODE models are nonlinear least squares and an MCMC based method. Both of these methods depend on a likelihood involving the numerical solution to the ODE. The challenge faced by these methods is parameter spaces that are difficult to navigate, exacerbated by the wide variety of behaviours that a single ODE model can produce with respect to small changes in parameter values.
In this work, two competing methods, generalized profile estimation and Bayesian collocation tempering, are described. Both of these methods use a basis expansion to approximate the ODE solution in the likelihood, where the shape of the basis expansion, or data smooth, is guided by the ODE model. This approximation to the ODE smooths out the likelihood surface, reducing restrictions on parameter movement.
Generalized profile estimation maximizes the profile likelihood for the ODE parameters while profiling out the basis coefficients of the data smooth. The smoothing parameter determines the balance between fitting the data and the ODE model, and consequently is used to build a parameter cascade, reducing the dimension of the estimation problem. Generalized profile estimation is also described under a constraint to ensure the smooth follows known behaviour such as monotonicity or non-negativity.
Bayesian collocation tempering uses a sequence of posterior densities with smooth approximations to the ODE solution. The level of the approximation is determined by the value of the smoothing parameter, which also determines the level of smoothness in the likelihood surface. In an algorithm similar to parallel tempering, parallel MCMC chains are run to sample from the sequence of posterior densities, while allowing ODE parameters to swap between chains. This method is introduced and tested against a variety of alternative Bayesian models, in terms of posterior variance and rate of convergence.
The performance of generalized profile estimation and Bayesian collocation tempering is tested and compared using simulated data sets from the FitzHugh-Nagumo ODE system and real data from nylon production dynamics.
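For orientation, the sketch below simulates a FitzHugh-Nagumo system with a hand-written fourth-order Runge-Kutta step. The parameterization (dV/dt = c(V - V^3/3 + R), dR/dt = -(V - a + bR)/c), the parameter values a = 0.2, b = 0.2, c = 3, and the initial state are assumptions chosen for illustration; the abstract does not state which form or values the thesis uses.

```python
def fitzhugh_nagumo(state, a, b, c):
    """Right-hand side of the (assumed) FitzHugh-Nagumo parameterization."""
    v, r = state
    dv = c * (v - v ** 3 / 3.0 + r)
    dr = -(v - a + b * r) / c
    return (dv, dr)


def rk4_solve(state, a, b, c, dt=0.01, n_steps=2000):
    """Integrate the system with classical fourth-order Runge-Kutta steps."""
    path = [state]
    for _ in range(n_steps):
        k1 = fitzhugh_nagumo(state, a, b, c)
        k2 = fitzhugh_nagumo(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), a, b, c)
        k3 = fitzhugh_nagumo(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), a, b, c)
        k4 = fitzhugh_nagumo(tuple(s + dt * k for s, k in zip(state, k3)), a, b, c)
        state = tuple(
            s + dt * (p + 2.0 * q + 2.0 * u + w) / 6.0
            for s, p, q, u, w in zip(state, k1, k2, k3, k4)
        )
        path.append(state)
    return path


# Illustrative parameter values and initial state (assumptions, not taken from the thesis).
path = rk4_solve((-1.0, 1.0), a=0.2, b=0.2, c=3.0)
print(path[-1])
```

Rerunning the solver with slightly different values of a, b, or c is a quick way to see how sharply the trajectory can change, which is the kind of sensitivity that makes the likelihood surface hard to navigate.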

24

Barbosa, Luciano [UNESP]. "Metodologias estatísticas na análise de germinação de sementes de mamona." Universidade Estadual Paulista (UNESP), 2010. http://hdl.handle.net/11449/101848.

Abstract:

Experiments whose response variables are counts or proportions are very common in agriculture. For this type of data, if the observational units are independent, the methodology of generalized linear models can be appropriate. On the other hand, when responses are dependent or clustered, there is a correlation between the observations, and it has to be taken into consideration in the analysis to avoid incorrect inferences about the regression coefficients. In the literature there are techniques available for modeling and analyzing this type of data, the models being extensions of generalized linear models. The present study explores the use of: 1) generalized estimating equations, which include a correlation matrix to obtain a better fit; and 2) generalized linear mixed models, which introduce a correlation between clustered observations through the addition of random effects in the model. These two methodologies are applied to a data set obtained from an experiment to evaluate certain conditions on the germination of seeds of the castor bean cultivar AL Guarany 2002, with the objective of determining the best estimation model for such data.

25

Valois, Marie-France. "Evaluation of the performance of the generalized estimating equations method for the analysis of crossover designs." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ29805.pdf.

26

MacNeill, Stephanie Jan. "A statistical analysis of the recurrence of gestational diabetes by logistic regression and generalized estimating equations." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/tape15/PQDD_0008/MQ36504.pdf.

27

Jin, Lei. "Generalized score tests for missing covariate data." Texas A&M University, 2007. http://hdl.handle.net/1969.1/ETD-TAMU-1625.

28

Söderdahl, Fabian, and Karl Hammarström. "Measuring the causal effect of air temperature on violent crime." Thesis, Uppsala universitet, Statistiska institutionen, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-243130.

Abstract:

This thesis aimed to apply the causal framework with potential outcomes to examine the causal effect of air temperature on reported violent crimes in Swedish municipalities. The Generalized Estimating Equations method was applied to yearly, monthly, and July-only data for the period 2002-2014. One significant causal effect was established, but the majority of the results pointed to there being no causal effect of air temperature on reported violent crimes.

29

Barbosa, Luciano 1971. "Metodologias estatísticas na análise de germinação de sementes de mamona /." Botucatu : [s.n.], 2010. http://hdl.handle.net/11449/101848.

Abstract:

Advisor: Luiza Aparecida Trinca
Committee member: Liciana Vaz da Arruda
Committee member: Osmar Delmanto Junior
Committee member: Célia Regina Lopes Zimback
Committee member: Marli Teixeira de A. Minhoni
Abstract: Experiments whose response variables are counts or proportions are very common in agriculture. For this type of data, if the observational units are independent, the methodology of generalized linear models can be appropriate. On the other hand, when responses are dependent or clustered, there is a correlation between the observations, and it has to be taken into consideration in the analysis to avoid incorrect inferences about the regression coefficients. In the literature there are techniques available for modeling and analyzing this type of data, the models being extensions of generalized linear models. The present study explores the use of: 1) generalized estimating equations, which include a correlation matrix to obtain a better fit; and 2) generalized linear mixed models, which introduce a correlation between clustered observations through the addition of random effects in the model. These two methodologies are applied to a data set obtained from an experiment to evaluate certain conditions on the germination of seeds of the castor bean cultivar AL Guarany 2002, with the objective of determining the best estimation model for such data.
Degree: Doctorate

30

Lyth, Johan. "En jämförelse mellan individers självuppskattade livskvalitet och samhällets hälsopreferenser : En paneldatastudie av hjärtpatienter." Thesis, Linköpings universitet, Matematiska institutionen, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-15095.

Abstract:

Objective: In recent years there has been increasing interest within clinical (medical) science in measuring people’s health. When estimating quality of life, present practice is to use the EQ-5D questionnaire and an index which weights the different questions. The question is what happens if individuals estimate their own health: would it differ from the public preferences? The aim is to build a new prediction model based on the opinion of patients and compare it to the present model based on public preferences. Method: A sample of 362 patients with unstable coronary artery disease from the Frisc II trial valued their quality of life in the acute phase and after 3, 6 and 12 months. The EQ-5D questionnaire and the Time Trade-off method (TTO), a direct method of valuing health, were used. A regression technique that handles panel data had to be used to estimate TTO from the EQ-5D and other variables such as gender and age. Result: Different regression techniques vary in their estimates of parameters and standard errors. A Generalized Estimating Equation approach with an empirical correlation structure is the most suitable regression technique for the data. A model based on the EQ-5D questionnaire and a continuous age variable proves to be the best model for an index derived from individuals. The difference between heart patients’ own opinion of their health and the public preferences is large for severe health conditions but rather small for healthy patients. Of the total 243 health conditions, only eight were valued higher by the public index. Conclusions: As the differences between the approaches are significantly large, the choice of index could affect decision making in a health economic study.

31

Demirbağ, Mustafa Emin. "Estimation of seismic parameters from multifold reflection seismic data by generalized linear inversion of Zoeppritz equations." Diss., Virginia Tech, 1990. http://hdl.handle.net/10919/37224.

32

Diaz, Pedro, and Grant Skrepnek. "Marginal Tax Rates and Innovative Activity in the Biotech Sector." The University of Arizona, 2013. http://hdl.handle.net/10150/614244.

Abstract:

Class of 2013 Abstract
Specific Aims: To assess the association between marginal tax rates (MTR) and innovative output of biotechnology firms. The MTR plays an important role in firms’ financing choices. Assessment of a firm’s tax status may reveal how firms decide on investment policies that affect R&D. Methods: A retrospective database analysis was used. Subjects included were firms within the biotechnology sector with the Standard Industrial Classification code of 2836 from 1980 to 2011. MTR data were obtained from the S&P Compustat database, and patent data were obtained from the U.S. Patent and Trademark Office. The effects of changes in MTRs on patent outcomes were analyzed by inferential analysis. Generalized estimating equations (GEE) were used, specifically a GEE regression with a negative binomial distributional family with log link, independent correlation structure, and robust standard error variance calculation. Patents were regressed on the lagged change in MTR, after interest deductions. Main Results: The lag years 2 and 5 of the MTR change were statistically significant (p = 0.031 and p = 0.026 for each model, respectively). Every one-unit increase in the change of the MTRs was associated with large and significant drops in patents: 78.8% (IRR = 0.212), 90.7% (IRR = 0.093), and 92.7% (IRR = 0.073) at year 2 lag, and 84.8% (IRR = 0.152) and 92.6% (IRR = 0.074) at year 5 lag. Conclusion: An increase in the change of the MTR results in significant drops in patenting activity.
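The percentage drops quoted in this abstract follow directly from the incidence rate ratios: with a log link, a one-unit increase in the covariate multiplies the expected count by the IRR, so the percent change is (IRR - 1) x 100. The short sketch below reproduces that arithmetic from the reported IRRs; the function name `irr_to_percent_change` is a hypothetical helper, not part of the study.

```python
def irr_to_percent_change(irr):
    """Percent change in the expected count per one-unit covariate increase."""
    return (irr - 1.0) * 100.0


# IRRs as reported in the abstract (year 2 lag first, then year 5 lag).
reported_irrs = [0.212, 0.093, 0.073, 0.152, 0.074]

# Express each change as a positive percentage drop, rounded to one decimal place.
percent_drops = [round(-irr_to_percent_change(irr), 1) for irr in reported_irrs]

print(percent_drops)  # [78.8, 90.7, 92.7, 84.8, 92.6]
```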

33

Wen, Lan. "Methods for handling missing data in cohort studies where outcomes are truncated by death." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/278788.

Abstract:

This dissertation addresses problems found in observational cohort studies where the repeated outcomes of interest are truncated by both death and by dropout. In particular, we consider methods that make inference for the population of survivors at each time point, otherwise known as 'partly conditional inference'. Partly conditional inference distinguishes between the reasons for missingness; failure to make this distinction will cause inference to be based not only on pre-death outcomes which exist but also on post-death outcomes which fundamentally do not exist. Such inference is called 'immortal cohort inference'. Investigations of health and cognitive outcomes in two studies - the 'Origins of Variance in the Old Old' and the 'Health and Retirement Study' - are conducted. Analysis of these studies is complicated by outcomes of interest being missing because of death and dropout. We show, first, that linear mixed models and joint models (that model both the outcome and survival processes) produce immortal cohort inference. This makes the parameters in the longitudinal (sub-)model difficult to interpret. Second, a thorough comparison of well-known methods used to handle missing outcomes - inverse probability weighting, multiple imputation and linear increments - is made, focusing particularly on the setting where outcomes are missing due to both dropout and death. We show that when the dropout models are correctly specified for inverse probability weighting, and the imputation models are correctly specified for multiple imputation or linear increments, then the assumptions of multiple imputation and linear increments are the same as those of inverse probability weighting only if the time of death is included in the dropout and imputation models. Otherwise they may not be. 
Simulation studies show that each of these methods gives negligibly biased estimates of the partly conditional mean when its assumptions are met, but potentially biased estimates if its assumptions are not met. In addition, we develop new augmented inverse probability weighted estimating equations for making partly conditional inference, which offer double protection against model misspecification. That is, as long as one of the dropout and imputation models is correctly specified, the partly conditional inference is valid. Third, we describe methods that can be used to make partly conditional inference for non-ignorable missing data. Both monotone and non-monotone missing data are considered. We propose three methods that use a tilt function to relate the distribution of an outcome at visit j among those who were last observed at some time before j to those who were observed at visit j. Sensitivity analyses to departures from ignorable missingness assumptions are conducted on simulations and on real datasets. The three methods are: i) an inverse probability weighted method that up-weights observed subjects to represent subjects who are still alive but are not observed; ii) an imputation method that replaces missing outcomes of subjects who are alive with their conditional mean outcomes given past observed data; and iii) a new augmented inverse probability method that combines the previous two methods and is doubly-robust against model misspecification.
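The up-weighting idea behind inverse probability weighting can be shown with a toy calculation. The sketch below computes a normalized (Hájek-type) weighted mean among subjects assumed to be alive at a given visit, where each observed subject is weighted by the inverse of their probability of being observed. The data, the probabilities, and the function name `ipw_mean` are all hypothetical; in practice the probabilities would come from a fitted dropout model, and this is not the estimator developed in the dissertation.

```python
import statistics


def ipw_mean(outcomes, observed, obs_prob):
    """Normalized inverse-probability-weighted mean over the observed subjects."""
    numerator = sum(y / p for y, r, p in zip(outcomes, observed, obs_prob) if r)
    denominator = sum(1.0 / p for r, p in zip(observed, obs_prob) if r)
    return numerator / denominator


# Toy data for five subjects alive at the visit (None marks a dropout's missing outcome).
outcomes = [10.0, 12.0, None, 20.0, None]
observed = [True, True, False, True, False]
# Assumed known probabilities of still being observed, given being alive.
obs_prob = [0.8, 0.8, 0.8, 0.4, 0.4]

naive = statistics.fmean(y for y, r in zip(outcomes, observed) if r)
weighted = ipw_mean(outcomes, observed, obs_prob)

print(naive, weighted)
```

The subject with the lower observation probability counts for more in the weighted mean, standing in for similar subjects who are alive but were not observed.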

34

Deng, Wei. "Multiple imputation for marginal and mixed models in longitudinal data with informative missingness." Ohio State University, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1126890027.

Abstract:

Thesis (Ph. D.)--Ohio State University, 2005.
Title from first page of PDF file. Document formatted into pages; contains xiii, 108 p.; also includes graphics. Includes bibliographical references (p. 104-108). Available online via OhioLINK's ETD Center

35

Wang, Xuesong. "SAFETY ANALYSES AT SIGNALIZED INTERSECTIONS CONSIDERING SPATIAL, TEMPORAL AND SITE CORRELATION." Doctoral diss., University of Central Florida, 2006. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3436.

Abstract:

Statistics show that signalized intersections are among the most dangerous locations of a roadway network. Different approaches including crash frequency and severity models have been used to establish the relationship between crash occurrence and intersection characteristics. In order to model crash occurrence at signalized intersections more efficiently and eventually to better identify the significant factors contributing to crashes, this dissertation investigated the temporal, spatial, and site correlations for total, rear-end, right-angle and left-turn crashes. Using the basic regression model for correlated crash data leads to invalid statistical inference, due to incorrect test statistics and standard errors based on the misspecified variance. In this dissertation, the Generalized Estimating Equations (GEEs) were applied, which provide an extension of generalized linear models to the analysis of longitudinal or clustered data. A series of frequency models are presented by using the GEE with a Negative Binomial as the link function. The GEE models for the crash frequency per year (using four correlation structures) were fitted for longitudinal data; the GEE models for the crash frequency per intersection (using three correlation structures) were fitted for the signalized intersections along corridors; the GEE models were applied for the rear-end crash data with temporal or spatial correlation separately. For right-angle crash frequency, models at intersection, roadway, and approach levels were fitted and the roadway and approach level models were estimated by using the GEE to account for the "site correlation"; and for left-turn crashes, the approach level crash frequencies were modeled by using the GEE with a Negative Binomial link function for most patterns and using a binomial logit link function for the pattern having a higher proportion of zeros and ones in crash frequencies. 
All intersection geometry design features, traffic control and operational features, traffic flows, and crashes were obtained for selected intersections. Massive data collection work has been done. The autoregressive structure is found to be the most appropriate correlation structure for both the temporal and spatial intersection analyses, which indicates that the correlation between the multiple observations for a certain intersection decreases as the time gap increases, and that for spatially correlated signalized intersections along corridors the correlation between intersections decreases as spacing increases. The unstructured correlation structure was applied for roadway and approach level right-angle crashes and also for different patterns of left-turn crashes at the approach level. Usually two approaches on the same roadway have a higher correlation. At signalized intersections, differences exist in traffic volumes, site geometry, and signal operations, as well as safety performance on various approaches of intersections. Therefore, modeling the total number of left-turn crashes at intersections may obscure the real relationship between the crash causes and their effects. The dissertation modeled crashes at different levels. In particular, intersection, roadway, and approach level models were compared for right-angle crashes, and different crash assignment criteria of "at-fault driver" or "near-side" were applied for disaggregated models. The comparison shows that for the roadway and approach level models, the "near-side" models outperformed the "at-fault driver" models. Variables in traffic characteristics, geometric design features, traffic control and operational features, corridor level factor, and location type have been identified to be significant in crash occurrence. Specifically, the safety relationship between crash occurrence and traffic volume has been investigated extensively in different studies. 
It has been found that the logarithm of traffic volumes per lane for the entire intersection is the best functional form for the total crashes in both the temporal and spatial analyses. The studies of right-angle and left-turn crashes confirm the assumption that the frequency of collisions is related to the traffic flows to which the colliding vehicles belong and not to the sum of the entering flows; the logarithm of the product of conflicting flows is usually the most significant functional form in the model. This study found that left-turn protection on the minor roadway will increase rear-end crash occurrence, while left-turn protection on the major roadway will reduce rear-end crashes. In addition, left-turn protection reduces Pattern 5 left-turn crashes (left-turning traffic collides with on-coming through traffic) specifically, but it increases Pattern 8 left-turn crashes (left-turning traffic collides with near-side crossing through traffic), and it has no significant effect on other patterns of left-turn crashes. This dissertation also investigated some other factors which have not been considered before. The safety effectiveness of many variables identified in this dissertation is consistent with previous studies. Some variables have unexpected signs and a justification is provided. Injury severity has also been studied for Pattern 5 left-turn crashes. Crashes were assigned to the approach with left-turning vehicles. The "site correlation" among the crashes that occurred at the same approach was considered, since these crashes may have a similar propensity in crash severity. Many methodologies and applications have been attempted in this dissertation. Therefore, the study makes both theoretical and practical contributions to safety analysis at signalized intersections.
Ph.D.
Department of Civil and Environmental Engineering
Engineering and Computer Science
Civil Engineering
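The autoregressive working correlation described in the abstract above can be written down explicitly: under a first-order autoregressive, AR(1), structure the correlation between observations i and j is rho^|i - j|, so it decays as the time gap (or spacing) grows, whereas an exchangeable structure uses the same correlation for every pair. The sketch below builds both matrices in plain Python; the value rho = 0.5, the number of observations, and the function names are illustrative assumptions, not estimates from the dissertation.

```python
def ar1_correlation(n_obs, rho):
    """AR(1) working correlation: corr(i, j) = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_obs)] for i in range(n_obs)]


def exchangeable_correlation(n_obs, rho):
    """Exchangeable working correlation: the same rho for every distinct pair."""
    return [[1.0 if i == j else rho for j in range(n_obs)] for i in range(n_obs)]


# Four repeated observations per intersection with an assumed rho of 0.5.
ar1 = ar1_correlation(4, 0.5)
exch = exchangeable_correlation(4, 0.5)

for row in ar1:
    print(row)
```

Reading across the first row of `ar1` shows the decay with lag (1, 0.5, 0.25, 0.125), which is the pattern the abstract reports for both the temporal and the corridor-spacing analyses.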

36

Green, Brittany. "Ultra-high Dimensional Semiparametric Longitudinal Data Analysis." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1593171378846243.

37

Jankovic, Dina. "Analysis of Longitudinal Data with Missing Responses Adjusted by Inverse Probability Weights." Thesis, Université d'Ottawa / University of Ottawa, 2018. http://hdl.handle.net/10393/37838.

Abstract:

We propose a new method for analyzing longitudinal data which contain responses that are missing at random. This method consists of solving the generalized estimating equation (GEE) of [7] in which the incomplete responses are replaced by values adjusted using the inverse probability weights proposed in [14]. We show that the root estimator is consistent and asymptotically normal, under essentially the same conditions on the marginal distribution and the surrogate correlation matrix as those presented in [12] for complete data, and under minimal assumptions on the missingness probabilities. This method is applied to a real-life dataset taken from [10], which examines the incidence of respiratory disease in a sample of 250 pre-school age Indonesian children who were examined every 3 months for 18 months, using age, gender, and vitamin A deficiency as covariates.

38

Jones, David. "Postnatal depression (PND) and neighborhood effects for women enrolled in a home visitation program." University of Cincinnati / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1459438588.

39

Qi, Xin. "Socio-environmental factors and suicide in Queensland, Australia." Thesis, Queensland University of Technology, 2009. https://eprints.qut.edu.au/30317/1/Xin_Qi_Thesis.pdf.

Abstract:

Suicide has drawn much attention from both the scientific community and the public. Examining the impact of socio-environmental factors on suicide is essential in developing suicide prevention strategies and interventions, because it will provide health authorities with important information for their decision-making. However, previous studies did not examine the impact of socio-environmental factors on suicide using a spatial analysis approach. The purpose of this study was to identify the patterns of suicide and to examine how socio-environmental factors impact on suicide over time and space at the Local Governmental Area (LGA) level in Queensland. The suicide data between 1999 and 2003 were collected from the Australian Bureau of Statistics (ABS). Socio-environmental variables at the LGA level included climate (rainfall, maximum and minimum temperature), Socioeconomic Indexes for Areas (SEIFA) and demographic variables (proportion of Indigenous population, unemployment rate, proportion of population with low income and low education level). Climate data were obtained from Australian Bureau of Meteorology. SEIFA and demographic variables were acquired from ABS. A series of statistical and geographical information system (GIS) approaches were applied in the analysis. This study included two stages. The first stage used average annual data to view the spatial pattern of suicide and to examine the association between socio-environmental factors and suicide over space. The second stage examined the spatiotemporal pattern of suicide and assessed the socio-environmental determinants of suicide, using more detailed seasonal data. In this research, 2,445 suicide cases were included, with 1,957 males (80.0%) and 488 females (20.0%). In the first stage, we examined the spatial pattern and the determinants of suicide using 5-year aggregated data. Spearman correlations were used to assess associations between variables. 
Then a Poisson regression model was applied in the multivariable analysis, as the occurrence of suicide is a small probability event and this model fitted the data quite well. Suicide mortality varied across LGAs and was associated with a range of socio-environmental factors. The multivariable analysis showed that maximum temperature was significantly and positively associated with male suicide (relative risk [RR] = 1.03, 95% CI: 1.00 to 1.07). A higher proportion of Indigenous population was accompanied by more suicide in the male population (male: RR = 1.02, 95% CI: 1.01 to 1.03). There was a positive association between unemployment rate and suicide in both genders (male: RR = 1.04, 95% CI: 1.02 to 1.06; female: RR = 1.07, 95% CI: 1.00 to 1.16). No significant association was observed for rainfall, minimum temperature, SEIFA, or the proportion of population with low individual income and low educational attainment. In the second stage of this study, we undertook a preliminary spatiotemporal analysis of suicide using seasonal data. Firstly, we assessed the interrelations between variables. Secondly, a generalised estimating equations (GEE) model was used to examine the socio-environmental impact on suicide over time and space, as this model is well suited to analyzing repeated longitudinal data (e.g., seasonal suicide mortality in a certain LGA) and it fitted the data better than other models (e.g., the Poisson model). The suicide pattern varied with season and LGA. The north of Queensland had the highest suicide mortality rate in all seasons, while no suicide cases occurred in the southwest. The northwest had consistently higher suicide mortality in spring, autumn and winter. In other areas, suicide mortality varied between seasons. This analysis showed that maximum temperature was positively associated with suicide among the male population (RR = 1.24, 95% CI: 1.04 to 1.47) and the total population (RR = 1.15, 95% CI: 1.00 to 1.32). 
A higher proportion of Indigenous population was accompanied by more suicide among the total population (RR = 1.16, 95% CI: 1.13 to 1.19) and by gender (male: RR = 1.07, 95% CI: 1.01 to 1.13; female: RR = 1.23, 95% CI: 1.03 to 1.48). Unemployment rate was positively associated with total (RR = 1.40, 95% CI: 1.24 to 1.59) and female (RR = 1.09, 95% CI: 1.01 to 1.18) suicide. There was also a positive association between the proportion of population with low individual income and suicide in the total (RR = 1.28, 95% CI: 1.10 to 1.48) and male (RR = 1.45, 95% CI: 1.23 to 1.72) populations. Rainfall was only positively associated with suicide in the total population (RR = 1.11, 95% CI: 1.04 to 1.19). There was no significant association for rainfall, minimum temperature, SEIFA, or the proportion of population with low educational attainment. The second stage is an extension of the first stage. Datasets of different spatial scales were used in the two stages (i.e., mean yearly data in the first stage, and seasonal data in the second stage), but the results are generally consistent with each other. Compared with other studies, this research explored the variety of the impact of a wide range of socio-environmental factors on suicide in different geographical units. Maximum temperature, proportion of Indigenous population, unemployment rate and proportion of population with low individual income were among the major determinants of suicide in Queensland. However, the influence of other factors (e.g. socio-cultural background, alcohol and drug use) on suicide cannot be ignored. An in-depth understanding of these factors is vital in planning and implementing suicide prevention strategies. 
Five recommendations for future research are derived from this study: (1) It is vital to acquire detailed personal information on each suicide case and relevant information among the population in assessing the key socio-environmental determinants of suicide; (2) Bayesian model could be applied to compare mortality rates and their socio-environmental determinants across LGAs in future research; (3) In the LGAs with warm weather, high proportion of Indigenous population and/or unemployment rate, concerted efforts need to be made to control and prevent suicide and other mental health problems; (4) The current surveillance, forecasting and early warning system needs to be strengthened, to trace the climate and socioeconomic change over time and space and its impact on population health; (5) It is necessary to evaluate and improve the facilities of mental health care, psychological consultation, suicide prevention and control programs; especially in the areas with low socio-economic status, high unemployment rate, extreme weather events and natural disasters.

40

Qi, Xin. "Socio-environmental factors and suicide in Queensland, Australia." Queensland University of Technology, 2009. http://eprints.qut.edu.au/30317/.

Abstract:

Suicide has drawn much attention from both the scientific community and the public. Examining the impact of socio-environmental factors on suicide is essential in developing suicide prevention strategies and interventions, because it provides health authorities with important information for their decision-making. However, previous studies did not examine the impact of socio-environmental factors on suicide using a spatial analysis approach. The purpose of this study was to identify the patterns of suicide and to examine how socio-environmental factors affect suicide over time and space at the Local Government Area (LGA) level in Queensland. Suicide data for 1999 to 2003 were collected from the Australian Bureau of Statistics (ABS). Socio-environmental variables at the LGA level included climate (rainfall, maximum and minimum temperature), Socioeconomic Indexes for Areas (SEIFA) and demographic variables (proportion of Indigenous population, unemployment rate, proportion of population with low income and low education level). Climate data were obtained from the Australian Bureau of Meteorology. SEIFA and demographic variables were acquired from the ABS. A series of statistical and geographical information system (GIS) approaches were applied in the analysis. This study included two stages. The first stage used average annual data to view the spatial pattern of suicide and to examine the association between socio-environmental factors and suicide over space. The second stage examined the spatiotemporal pattern of suicide and assessed the socio-environmental determinants of suicide, using more detailed seasonal data. In this research, 2,445 suicide cases were included, with 1,957 males (80.0%) and 488 females (20.0%). In the first stage, we examined the spatial pattern and the determinants of suicide using 5-year aggregated data. Spearman correlations were used to assess associations between variables. 
Then a Poisson regression model was applied in the multivariable analysis, as the occurrence of suicide is a small-probability event and this model fitted the data quite well. Suicide mortality varied across LGAs and was associated with a range of socio-environmental factors. The multivariable analysis showed that maximum temperature was significantly and positively associated with male suicide (relative risk [RR] = 1.03, 95% CI: 1.00 to 1.07). A higher proportion of Indigenous population was associated with more suicide in the male population (RR = 1.02, 95% CI: 1.01 to 1.03). There was a positive association between unemployment rate and suicide in both genders (male: RR = 1.04, 95% CI: 1.02 to 1.06; female: RR = 1.07, 95% CI: 1.00 to 1.16). No significant association was observed for rainfall, minimum temperature, SEIFA, or proportion of population with low individual income and low educational attainment. In the second stage of this study, we undertook a preliminary spatiotemporal analysis of suicide using seasonal data. Firstly, we assessed the interrelations between variables. Secondly, a generalised estimating equations (GEE) model was used to examine the socio-environmental impact on suicide over time and space, as this model is well suited to analyzing repeated longitudinal data (e.g., seasonal suicide mortality in a certain LGA) and it fitted the data better than other models (e.g., the Poisson model). The suicide pattern varied with season and LGA. The north of Queensland had the highest suicide mortality rate in all seasons, while no suicide cases occurred in the southwest. The northwest had consistently higher suicide mortality in spring, autumn and winter. In other areas, suicide mortality varied between seasons. This analysis showed that maximum temperature was positively associated with suicide among the male population (RR = 1.24, 95% CI: 1.04 to 1.47) and the total population (RR = 1.15, 95% CI: 1.00 to 1.32). 
A higher proportion of Indigenous population was associated with more suicide in the total population (RR = 1.16, 95% CI: 1.13 to 1.19) and in each gender (male: RR = 1.07, 95% CI: 1.01 to 1.13; female: RR = 1.23, 95% CI: 1.03 to 1.48). Unemployment rate was positively associated with total (RR = 1.40, 95% CI: 1.24 to 1.59) and female (RR = 1.09, 95% CI: 1.01 to 1.18) suicide. There was also a positive association between the proportion of the population with low individual income and suicide in the total (RR = 1.28, 95% CI: 1.10 to 1.48) and male (RR = 1.45, 95% CI: 1.23 to 1.72) populations. Rainfall was positively associated with suicide only in the total population (RR = 1.11, 95% CI: 1.04 to 1.19). There was no significant association for rainfall, minimum temperature, SEIFA, or proportion of population with low educational attainment. The second stage is an extension of the first stage. The two stages used datasets at different temporal scales (i.e., mean yearly data in the first stage and seasonal data in the second stage), but the results are generally consistent with each other. Compared with other studies, this research explored how the impact of a wide range of socio-environmental factors on suicide varies across geographical units. Maximum temperature, proportion of Indigenous population, unemployment rate and proportion of population with low individual income were among the major determinants of suicide in Queensland. However, the influence of other factors (e.g., sociocultural background, alcohol and drug use) on suicide cannot be ignored. An in-depth understanding of these factors is vital in planning and implementing suicide prevention strategies. 
Five recommendations for future research are derived from this study: (1) It is vital to acquire detailed personal information on each suicide case, and relevant information on the population, when assessing the key socio-environmental determinants of suicide; (2) Bayesian models could be applied to compare mortality rates and their socio-environmental determinants across LGAs in future research; (3) In LGAs with warm weather, a high proportion of Indigenous population and/or a high unemployment rate, concerted efforts need to be made to control and prevent suicide and other mental health problems; (4) The current surveillance, forecasting and early warning system needs to be strengthened to trace climate and socioeconomic change over time and space and its impact on population health; (5) It is necessary to evaluate and improve mental health care facilities, psychological consultation, and suicide prevention and control programs, especially in areas with low socio-economic status, high unemployment rates, extreme weather events and natural disasters.
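
The relative risks quoted in this abstract come from log-linear models (Poisson regression and GEE). In such models a coefficient β on the log scale corresponds to RR = exp(β). The sketch below shows that conversion in Python, assuming a Wald-type 95% interval exp(β ± 1.96·SE); the abstract does not state how its intervals were computed. The function name `relative_risk` and the numeric inputs are hypothetical and are not taken from the thesis.

```python
import math

def relative_risk(beta, se, z=1.96):
    """Convert a log-scale coefficient and its standard error
    into a relative risk with a Wald-type confidence interval."""
    rr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return rr, lower, upper

# Hypothetical coefficient and standard error, for illustration only.
rr, lower, upper = relative_risk(beta=0.10, se=0.04)
print(f"RR = {rr:.2f}, 95% CI: {lower:.2f} to {upper:.2f}")
```

An interval whose lower bound sits at or just above 1.00, like several reported above, corresponds to a log-scale coefficient about two standard errors from zero.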

41

Challa, Subhash. "Nonlinear state estimation and filtering with applications to target tracking problems." Thesis, Queensland University of Technology, 1998.

42

Luppe, Marcos Roberto. "Evidências da sofisticação do padrão de consumo dos domicílios brasileiros: uma análise de cestas de produtos de consumo doméstico." Universidade de São Paulo, 2010. http://www.teses.usp.br/teses/disponiveis/12/12139/tde-05012011-125209/.

Abstract:

A economia brasileira passa por um momento positivo em sua história, devido principalmente a fatores gerados pela estabilidade econômica advinda com o Plano Real. O conjunto de dados apresentados neste trabalho evidencia uma melhora das condições socioeconômicas de grande parte da população, o que levou a um aumento da renda dos indivíduos e um fortalecimento do poder de consumo dos brasileiros. Nesse contexto, esta tese teve como objetivo a busca de evidências que indicassem uma mudança e possível sofisticação do padrão de consumo dos domicílios brasileiros. Além disso, procurou-se verificar em quais níveis socioeconômicos e em quais regiões as mudanças do padrão de consumo foram mais significativas. Os dados utilizados neste trabalho derivam de um painel de consumidores (Homescan) e foram analisadas informações de dez categorias de produtos de consumo doméstico para os anos de 2007, 2008 e 2009, considerando-se as áreas geográficas auditadas pela Nielsen e os níveis socioeconômicos dos domicílios. Nas análises dos dados, utilizaram-se modelos de equações de estimação generalizadas (EEG), além de análises estatísticas descritivas para avaliar a evolução das variáveis não-contempladas nesses modelos. Além disso, utilizaram-se dados de outra pesquisa (Retail Index) para complementar os resultados obtidos com o painel de consumidores. Os resultados das análises realizadas indicam uma mudança do padrão de consumo, primordialmente, nos domicílios de nível socioeconômico médio (classe C) e baixo (classes D e E) no período analisado. Quanto às áreas geográficas pesquisadas, os destaques foram o Nordeste, o grande Rio de Janeiro e a região Sul. Levando-se em consideração que as categorias analisadas são produtos mais elaborados e de maior valor agregado, o aumento do consumo da grande maioria das categorias nesses níveis socioeconômicos evidencia uma sofisticação do consumo desses domicílios. 
Esse ambiente de sofisticação dos padrões de consumo, principalmente das classes de renda média e baixa, exigirá das empresas que atuam no mercado de bens e serviços novas estratégias para atender as demandas de consumidores mais conscientes e exigentes. Assim, o grande desafio dessas empresas será decifrar o caminho da expansão e diversificação da cesta de compra desses consumidores.
The Brazilian economy is going through a positive period in its history, mainly as a result of the economic stability brought about by the Plano Real. The data presented in this work show an improvement in the socioeconomic conditions of a large part of the population, which has led to an increase in individual income and a strengthening of Brazilians' purchasing power. In this context, this thesis looks for evidence of a change, and possible sophistication, in the consumption patterns of Brazilian households. It also seeks to determine at which socioeconomic levels and in which regions the changes in consumption patterns were most significant. The data come from a consumer panel (Homescan); information on ten categories of household consumer goods was analyzed for 2007, 2008 and 2009, considering the geographic areas audited by Nielsen and the socioeconomic levels of the households. Generalized estimating equation (GEE) models were used in the analyses, together with descriptive statistics to evaluate the evolution of variables not included in these models. Data from another survey (Retail Index) were used to complement the results obtained from the consumer panel. The results indicate a change in consumption patterns, primarily in households of middle (class C) and low (classes D and E) socioeconomic level, over the period analyzed. Among the geographic areas studied, the Northeast, Greater Rio de Janeiro and the South region stood out. Given that the categories analyzed are more elaborate products with higher added value, the increased consumption of the great majority of categories at these socioeconomic levels is evidence that consumption in these households has become more sophisticated. 
This increasing sophistication of consumption patterns, particularly among the middle- and low-income classes, will require companies in the goods and services market to adopt new strategies to meet the demands of more aware and demanding consumers. The great challenge for these companies will therefore be to decipher the path of expansion and diversification of these consumers' shopping basket.

43

Li, Daoji. "Empirical likelihood and mean-variance models for longitudinal data." Thesis, University of Manchester, 2011. https://www.research.manchester.ac.uk/portal/en/theses/empirical-likelihood-and-meanvariance-models-for-longitudinal-data(98e3c7ef-fc88-4384-8a06-2c76107a9134).html.

Abstract:

Improving estimation efficiency has always been one of the important aspects of statistical modelling. Our goal is to develop new statistical methodologies that yield more efficient estimators in the analysis of longitudinal data. In this thesis, we consider two different approaches to improving estimation efficiency: empirical likelihood, and joint modelling of the mean and variance. In Part I of this thesis, empirical likelihood-based inference for longitudinal data within the framework of generalized linear models is investigated. The proposed procedure takes into account the within-subject correlation without involving direct estimation of nuisance parameters in the correlation matrix, and retains optimality even if the working correlation structure is misspecified. The proposed approach yields more efficient estimators than conventional generalized estimating equations and achieves the same asymptotic variance as methods based on quadratic inference functions. The second part of this thesis focuses on joint mean-variance models. We propose a data-driven approach to modelling the mean and variance simultaneously, yielding more efficient estimates of the mean regression parameters than the conventional generalized estimating equations approach even if the within-subject correlation structure is misspecified in our joint mean-variance models. Joint mean-variance models in both parametric and semi-parametric form are investigated. Extensive simulation studies are conducted to assess the performance of our proposed approaches. Three longitudinal data sets, the Ohio children's wheeze status data (Ware et al., 1984), cattle data (Kenward, 1987) and CD4+ data (Kaslow et al., 1987), are used to demonstrate our models and approaches.

44

Kauffman, Rudi D. "The Outcomes of Just War: An Empirical Study of the Outcomes Associated with Adherence to Just War Theory, 1960-2000." University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1342105770.

45

Prado, Naimara Vieira do. "Abordagens para análise de dados composicionais." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-17082017-155240/.

Abstract:

Dados composicionais são vetores, chamados de composições, cujos componentes são todos positivos, satisfazem a soma igual a 1 e possuem um espaço amostral próprio chamado Simplex. A restrição da soma induz a correlação entre os componentes. Isso exige que os métodos estatísticos para análise desses conjuntos de dados considerem esse fato. A teoria para dados composicionais foi desenvolvida inicialmente por Aitchison na década de 80. Desde então, várias técnicas e métodos têm sido desenvolvidos para a modelagem dos dados composicionais. Este trabalho apresenta as principais abordagens para a análise estatística de dados composicionais independentes. Sendo, regressão Dirichlet (distribuição natural aos dados composicionais) ou o uso de transformações em razões logarítmicas que saem do espaço simplex para o espaço real. Também descreve os métodos para os casos em que a suposição de independência não pode ser atendida. Por exemplo, dados composionais com dependência espacial. Para esses casos, há na literatura métodos baseados nas teorias desenvolvidas para análise geoestatística de dados univariados; ou, no uso de transformações em razões logarítmicas com a inclusão da dependência espacial. Além de revisitar os métodos já difundidos, propõe-se o uso do método de Equações de Estimação Generalizadas (EEG) como alternativa para a análise de dados composicionais independentes e com dependência espacial. A principal vantagem é que as equações de estimação necessitam apenas da especificação de funções que descrevam a média e a estrutura de covariância. Assim, não é necessário atribuir uma distribuição de probabilidade aos dados ou fazer o uso de transformações. A aplicação do método EEG para dados composicionais independentes apresentou resultados tão eficientes quanto a regressão Dirichlet ou transformação em razões logarítmicas. 
Para os dados composicionais com dependência espacial, o método baseado em verossimilhança foi o que apresentou valores preditos mais próximos aos valores reais. O método EEG foi mais eficaz do que a abordagem geoestatística dos componentes individuais, porém, comparado com os demais métodos, foi o que apresentou maior valor residual.
Compositional data are vectors, called compositions, whose components are all positive and sum to one, and whose sample space is the simplex. The sum constraint induces correlation between the components, and statistical methods for analyzing such datasets must take this into account. The theory of compositional data was developed mainly by Aitchison in the 1980s, and since then several techniques and methods have been developed for compositional data modelling. This work presents the main approaches to the statistical analysis of independent compositional data, such as Dirichlet regression (the natural distribution for compositional data) and log-ratio transformations, which map the simplex to Euclidean space. It also describes methods for cases where the independence assumption cannot be satisfied, for example compositional data with spatial dependence. For these cases, the literature offers methods based on the theory developed for univariate geostatistical analysis, or on log-ratio transformations that include the spatial dependence generated by the distance between points. In addition to revisiting these established methods, this work proposes the Generalized Estimating Equations (GEE) method as an alternative for analyzing compositional data, both independent and spatially dependent. GEE requires only the specification of functions that describe the mean and the correlation matrix (covariance structure); it is therefore not necessary to assign a probability distribution to the data or to transform them. Applied to independent compositional data, the GEE method gave results as efficient as Dirichlet regression or log-ratio transformation. For compositional data with spatial dependence, log-ratio transformations gave predicted values close to the real values. 
The GEE method was more effective than the traditional geostatistical approach; however, compared with the other methods, it was the one with the highest residual values.
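
As an illustration of a log-ratio transformation that maps the simplex to real space, the Python sketch below implements the centered log-ratio (clr), one of the transformations introduced by Aitchison. The abstract does not say which log-ratio transformation the thesis uses, so clr is an assumption here. The function names and the example composition are also hypothetical.

```python
import math

def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs
    (equivalently, the log of each part over the geometric mean)."""
    if any(x <= 0 for x in composition):
        raise ValueError("all components must be strictly positive")
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def clr_inverse(coords):
    """Map clr coordinates back to a composition that sums to 1."""
    exps = [math.exp(v) for v in coords]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical three-part composition, for illustration only.
parts = [0.2, 0.3, 0.5]
coords = clr(parts)
print(coords)               # real-valued coordinates
print(clr_inverse(coords))  # back on the simplex
```

The clr coordinates sum to zero, so they still carry one linear constraint. This is one reason other log-ratio variants exist.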

46

Oesselmann, Clarissa Cardoso. "Equações de estimação generalizadas com resposta binomial negativa: modelando dados correlacionados de contagem com sobredispersão." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-06072017-122423/.

Abstract:

Uma suposição muito comum na análise de modelos de regressão é a de respostas independentes. No entanto, quando trabalhamos com dados longitudinais ou agrupados essa suposição pode não fazer sentido. Para resolver esse problema existem diversas metodologias, e talvez a mais conhecida, no contexto não Gaussiano, é a metodologia de Equações de Estimação Generalizadas (EEGs), que possui similaridades com os Modelos Lineares Generalizados (MLGs). Essas similaridades envolvem a classificação do modelo em torno de distribuições da família exponencial e da especificação de uma função de variância. A única diferença é que nessa função também é inserida uma matriz trabalho que inclui a parametrização da estrutura de correlação dentro das unidades experimentais. O principal objetivo desta dissertação é estudar como esses modelos se comportam em uma situação específica, de dados de contagem com sobredispersão. Quando trabalhamos com MLGs esse problema é resolvido através do ajuste de um modelo com resposta binomial negativa (BN), e a ideia é a mesma para os modelos envolvendo EEGs. Essa dissertação visa rever as teorias existentes em EEGs no geral e para o caso específico quando a resposta marginal é BN, e além disso mostrar como essa metodologia se aplica na prática, com três exemplos diferentes de dados correlacionados com respostas de contagem.
A common assumption in the analysis of regression models is that the responses are independent. However, when working with longitudinal or grouped data this assumption may not make sense. Several methods exist to solve this problem, and perhaps the best known, in the non-Gaussian context, is the one based on Generalized Estimating Equations (GEE), which has similarities with Generalized Linear Models (GLM). These similarities involve classifying the model around the exponential family and specifying a variance function. The only difference is that a working correlation matrix, which parameterizes the correlation structure within the experimental units, is also inserted into this function. The main objective of this dissertation is to study how these models behave in a specific situation, namely count data with overdispersion. With GLMs this kind of problem is solved by fitting a model with a negative binomial (NB) response, and the idea is the same for the GEE methodology. This dissertation aims to review the GEE methodology in general and for the specific case in which the responses follow marginal negative binomial distributions. In addition, we show how this methodology is applied in practice, with three examples of correlated data with count responses.
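
For orientation, the GEE construction described above can be written in the common textbook form due to Liang and Zeger. The notation below is introduced here for illustration and is not taken from the dissertation; the negative binomial variance function is shown in one common parameterization, with shape parameter k.

```latex
% Estimating equation for the regression coefficients \beta,
% summing over the n experimental units (clusters) i:
\sum_{i=1}^{n} D_i^{\top} V_i^{-1} \,\bigl(y_i - \mu_i(\beta)\bigr) = 0,
\qquad D_i = \frac{\partial \mu_i}{\partial \beta^{\top}}

% Working covariance: variance function on the diagonal,
% working correlation matrix R(\alpha) in the middle:
V_i = \phi \, A_i^{1/2} \, R(\alpha) \, A_i^{1/2},
\qquad A_i = \operatorname{diag}\bigl(v(\mu_{i1}), \dots, v(\mu_{i n_i})\bigr)

% Negative binomial variance function (overdispersed counts);
% the Poisson case v(\mu) = \mu is recovered as k \to \infty:
v(\mu) = \mu + \frac{\mu^{2}}{k}
```

Setting R(α) to the identity matrix reduces the estimating equation to the score-type equation of an ordinary GLM with independent responses.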

47

Wang, Shin Cheng. "Analysis of Zero-Heavy Data Using a Mixture Model Approach." Diss., Virginia Tech, 1998. http://hdl.handle.net/10919/30357.

Abstract:

The problem of a high proportion of zeroes has long been of interest in data analysis and modeling; however, there is no unique solution to this problem. The solution to an individual problem depends on its particular situation and the design of the experiment. For example, different biological, chemical, or physical processes may follow different distributions and behave differently. Different mechanisms may generate the zeroes and require different modeling approaches. A unique or general solution would therefore be nearly impossible to devise and too inflexible. In this dissertation, I focus on cases where zeroes are produced by mechanisms that create distinct sub-populations of zeroes. The dissertation is motivated by problems in chronic toxicity testing, where the data sets contain a high proportion of zeroes. The analysis of chronic test data is complicated because there are two different sources of zeroes in the data: mortality and non-reproduction. Researchers therefore have to separate the zeroes due to mortality from those due to fecundity. A mixture model approach, which combines the two mechanisms to model the data, is appropriate here because it can incorporate the extra zeroes due to mortality. A zero-inflated Poisson (ZIP) model is used for modeling fecundity in the Ceriodaphnia dubia toxicity test. A generalized estimating equation (GEE) based ZIP model is developed to handle longitudinal data with zeroes due to mortality. A joint estimate of the inhibition concentration (ICx) is also developed as a potency estimate based on the mixture model approach. It is found that the ZIP model performs better than the regular Poisson model if mortality is high. This kind of toxicity testing also involves longitudinal data, where the same subject is measured over a period of seven days. The GEE model allows the flexibility to incorporate the extra zeroes and a correlation structure among the repeated measures. 
The problem of zero-heavy data also exists in environmental studies in which the growth or reproduction rates of multiple species are measured. This gives rise to multivariate data. Since the inter-relationships between different species are embedded in the correlation structure, the study of the information in the correlation of the variables, which is often accessed through principal component analysis, is one of the major interests in multivariate data. Where mortality influences the variables of interest but is not itself the subject of interest, the mixture approach can be applied to recover the information in the correlation structure. To investigate the effect of zeroes on multivariate data, simulation studies on principal component analysis are performed. A method that recovers the information in the correlation structure is also presented.
Ph. D.
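
To make the mixture idea in this abstract concrete, the Python sketch below evaluates the probability mass function of a zero-inflated Poisson distribution. A structural zero occurs with probability `pi_zero` (mortality, in the abstract's setting); otherwise the count is Poisson with mean `lam`. The function name, parameter names and numeric values are hypothetical, and this is the textbook ZIP form rather than the dissertation's GEE-based model.

```python
import math

def zip_pmf(y, pi_zero, lam):
    """P(Y = y) under a zero-inflated Poisson model.

    pi_zero: probability of a structural zero,
    lam: Poisson mean for the remaining sub-population.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # Zeroes arise from both sources: structural and Poisson.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

# Hypothetical parameters, for illustration only.
pi_zero, lam = 0.3, 4.0
print(zip_pmf(0, pi_zero, lam))  # zero probability under the mixture
print(math.exp(-lam))            # zero probability under a plain Poisson
```

Comparing the two printed values shows how much extra mass the mixture places at zero relative to a plain Poisson model with the same mean parameter.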

48

Menarin, Vinicius. "Modelos estatísticos para dados politômicos nominais em estudos longitudinais com uma aplicação à área agronômica." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-19042016-091641/.

Abstract:

Estudos em que a resposta de interesse é uma variável categorizada são bastante comuns nas mais diversas áreas da Ciência. Em muitas situações essa resposta é composta por mais de duas categorias não ordenadas, denominada então de uma variável politômica nominal, e em geral o objetivo do estudo é associar a probabilidade de ocorrência de cada categoria aos efeitos de variáveis explicativas. Ademais, existem tipos especiais de estudos em que os dados são coletados diversas vezes para uma mesma unidade amostral ao longo do tempo, os estudos longitudinais. Estudos assim requerem o uso de modelos estatísticos que considerem em sua formulação algum tipo de estrutura que suporte a dependência que tende a surgir entre observações feitas em uma mesma unidade amostral. Neste trabalho são abordadas duas extensões do modelo de logitos generalizados, usualmente empregado quando a resposta é politômica nominal com observações independentes entre si. A primeira consiste de uma modificação das equações de estimação generalizadas para dados nominais que se utiliza de razões de chances locais para descrever a dependência entre as observações da variável resposta politômica ao longo dos diversos tempos observados. Este tipo de modelo é denominado de modelo marginal. A segunda proposta abordada consiste no modelo de logitos generalizados com a inclusão de efeitos aleatórios no preditor linear, que também leva em conta uma dependência entre as observações. Esta abordagem caracteriza o modelo de logitos generalizados misto. Há diferenças importantes inerentes às interpretações dos modelos marginais e mistos, que são discutidas e que devem ser levadas em consideração na escolha da abordagem adequada. Ambas as propostas são aplicadas em um conjunto de dados proveniente de um experimento da área agronômica realizado em campo, conduzido sob um delineamento casualizado em blocos com esquema fatorial para os tratamentos. 
O experimento foi acompanhado ao longo de seis estações do ano, caracterizando assim uma estrutura longitudinal, sendo a variável resposta o tipo de vegetação observado no campo (touceiras, plantas invasoras ou espaços vazios). Os resultados encontrados são satisfatórios, embora a dependência presente nos dados não seja tão caracterizada; por meio de testes como da razão de verossimilhanças e de Wald diversas diferenças significativas entre os tratamentos foram encontradas. Ainda, devido às diferenças metodológicas das duas abordagens, o modelo marginal baseado nas equações de estimação generalizadas mostra-se mais adequado para esses dados.
Studies where the response is a categorical variable are quite common in many fields of science. In many situations the response has more than two unordered categories, which characterizes a nominal polytomous outcome, and in general the aim of the study is to relate the probability of occurrence of each category to the effects of explanatory variables. Furthermore, there are special types of study, called longitudinal studies, in which many measurements are taken over time on the same sampling unit. Such studies require statistical models with some kind of structure that accommodates the dependence that tends to arise among repeated measurements on the same sampling unit. This work focuses on two extensions of the baseline-category logit model, which is usually employed when there is a nominal polytomous response with independent observations. The first is a modification of the well-known generalized estimating equations for longitudinal data that uses local odds ratios to describe the dependence between the levels of the response over the repeated measurements. This type of model is known as a marginal model. The second approach adds random effects to the linear predictor of the baseline-category logit model, which also accounts for dependence between the observations. This characterizes a baseline-category mixed model. There are substantial differences in interpretation between marginal and mixed models, which should be considered when choosing the most appropriate approach for each situation. Both methodologies are applied to data from an agronomic field experiment conducted under a randomized complete block design with a factorial arrangement of treatments. The experiment was followed over six seasons, which gives the longitudinal structure, and the response is the type of vegetation observed in the field (tussocks, weeds or regions with bare ground). 
The results are satisfactory, even though the dependence found in the data is not very strong, and likelihood-ratio and Wald tests point to several significant differences between treatments. Moreover, owing to methodological differences between the two approaches, the marginal model based on generalized estimating equations seems more appropriate for these data.
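
The baseline-category logit model that both extensions build on links each non-baseline category to a reference category through a log odds. The Python sketch below turns such linear predictors into category probabilities. The function name and the numeric predictors are hypothetical, and the sketch covers only the independent-observations model, not the marginal or mixed extensions studied in the thesis.

```python
import math

def baseline_category_probs(linear_predictors):
    """Category probabilities for a baseline-category logit model.

    linear_predictors holds eta_j = log(P(Y = j) / P(Y = baseline)) for
    each non-baseline category; the baseline has eta = 0 by construction.
    Returns [P(baseline), P(category 1), ..., P(category J-1)].
    """
    etas = [0.0] + list(linear_predictors)
    shift = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(e - shift) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical linear predictors for two non-baseline categories.
probs = baseline_category_probs([0.4, -1.1])
print(probs)
```

With three response categories, as in the vegetation example, two linear predictors fully determine the three probabilities.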

49

Venezuela, Maria Kelly. "Equação de estimação generalizada e influência local para modelos de regressão beta com medidas repetidas." Universidade de São Paulo, 2008. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-10072008-210246/.

Abstract:

Using the theory of optimal linear estimating functions (Crowder, 1987), we propose generalized estimating equations (GEEs) for beta regression models (Ferrari and Cribari-Neto, 2004) with repeated measures. We also present GEEs for marginal simplex regression models for longitudinal continuous proportion data, based on the proposals of Song and Tan (2000) and Song et al. (2004), and for generalized linear models with repeated measures, based on the proposals of Artes and Jørgensen (2000) and Liang and Zeger (1986). All of these estimating equations are developed under two approaches: modelling the mean with homogeneous dispersion, and jointly modelling the mean and the dispersion so that the model can accommodate possible heterogeneity in the dispersion. As diagnostic techniques, we generalize several diagnostic measures to arbitrary estimating equations, defined either for modelling the position parameter under a homogeneous dispersion parameter or for jointly modelling the position and dispersion parameters. Among these measures, we highlight local influence (Cook, 1986), developed here for estimating equations. In simulations, this measure performed well at correctly flagging influential points. Finally, the theory is applied to real data sets.

50

Carter, Megan Ann. "Do Childhood Excess Weight and Family Food Insecurity Share Common Risk Factors in the Local Environment? An Examination Using a Quebec Birth Cohort." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/23801.

Abstract:

Background: Childhood excess weight and family food insecurity are food-system-related public health problems that exist in Canada. Since both relate to issues of food accessibility and availability, which have elements of “place”, they may share common risk factors in the local environment that are amenable to intervention. In this area of research, the literature derives mostly from a US context, and there is a dearth of high-quality evidence, specifically from longitudinal studies. Objectives: The main objectives of this thesis were to examine the adjusted associations of the place factors (material deprivation, social deprivation, social cohesion, disorder, and living location) with change in child BMI Z-score and with change in family food insecurity status in a Canadian cohort of children. Methods: The Québec Longitudinal Study of Child Development was used to meet the main objectives of this thesis. Response data from six collection cycles (4–10 years of age) were used in three main analyses. The first analysis examined change in child BMI Z-score as a function of the place factors using mixed models regression. The second analysis examined change in child BMI Z-score as a function of the place factors using group-based trajectory modeling. The third and final analysis examined change in family food insecurity status as a function of the place factors using generalized estimating equations. Results: Social deprivation, social cohesion and disorder were strongly and positively associated with family food insecurity, increasing the odds by 45–76%. These place factors, on the other hand, were not consistently associated with child weight status. Material deprivation was not important for either outcome, except for a slight positive association in the mixed models analysis of child weight status. Living location was not important in explaining family food insecurity. On the other hand, it was associated with child weight status in both analyses, but the nature of the relationship is still unclear. Conclusions: Results do not suggest that addressing similar place factors may alleviate both child excess weight and family food insecurity. More high-quality longitudinal and experimental studies are needed to clarify relationships between the local environment and child weight status and family food insecurity.
