Dissertations / Theses on the topic 'Zero-inflated generalized Poisson model'

Consult the top 22 dissertations / theses for your research on the topic 'Zero-inflated generalized Poisson model.'

1

Guo, Yixuan. "Bayesian Model Selection for Poisson and Related Models." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439310177.

2

Prasad, Jonathan P. "Zero-Inflated Censored Regression Models: An Application with Episode of Care Data." BYU ScholarsArchive, 2009. https://scholarsarchive.byu.edu/etd/2226.

Abstract:
The objective of this project is to fit a sequence of increasingly complex zero-inflated censored regression models to a known data set. It is quite common to find censored count data in statistical analyses of health-related data. Modeling such data while ignoring the censoring, zero-inflation, and overdispersion often results in biased parameter estimates. This project develops various regression models that can be used to predict a count response variable that is affected by various predictor variables. The regression parameters are estimated with Bayesian analysis using a Markov chain Monte Carlo (MCMC) algorithm. The tests for model adequacy are discussed and the models are applied to an observed data set.
3

Wang, Shin Cheng. "Analysis of Zero-Heavy Data Using a Mixture Model Approach." Diss., Virginia Tech, 1998. http://hdl.handle.net/10919/30357.

Abstract:
A high proportion of zeroes has long been a problem of interest in data analysis and modeling; however, it has no unique solution. The solution to an individual problem depends on its particular situation and the design of the experiment. For example, different biological, chemical, or physical processes may follow different distributions and behave differently. Different mechanisms may generate the zeroes and require different modeling approaches, so a single general solution would be impractical and inflexible. In this dissertation, I focus on cases where zeroes are produced by mechanisms that create distinct sub-populations of zeroes. The dissertation is motivated by problems in chronic toxicity testing, where the data contain a high proportion of zeroes. The analysis of chronic test data is complicated because there are two different sources of zeroes in the data: mortality and non-reproduction. Researchers therefore have to separate zeroes due to mortality from those due to fecundity. A mixture model approach that combines the two mechanisms is appropriate here because it can incorporate the extra zeroes due to mortality. A zero-inflated Poisson (ZIP) model is used to model fecundity in the Ceriodaphnia dubia toxicity test. A generalized estimating equation (GEE) based ZIP model is developed to handle longitudinal data with zeroes due to mortality. A joint estimate of the inhibition concentration (ICx) is also developed as a potency estimate based on the mixture model approach. The ZIP model is found to perform better than the regular Poisson model when mortality is high. This kind of toxicity testing also involves longitudinal data, where the same subject is measured over a period of seven days. The GEE model allows the flexibility to incorporate the extra zeroes and a correlation structure among the repeated measures.
The problem of zero-heavy data also exists in environmental studies in which the growth or reproduction rates of multiple species are measured. This gives rise to multivariate data. Since the inter-relationships between different species are embedded in the correlation structure, the study of the information in the correlation of the variables, which is often accessed through principal component analysis, is one of the major interests in multivariate data. Where mortality influences the variables of interest but is not itself the subject of interest, the mixture approach can be applied to recover the information in the correlation structure. To investigate the effect of zeroes on multivariate data, simulation studies on principal component analysis are performed. A method that recovers the information in the correlation structure is also presented.
Ph. D.
4

Roemmele, Eric S. "A Flexible Zero-Inflated Poisson Regression Model." UKnowledge, 2019. https://uknowledge.uky.edu/statistics_etds/38.

Abstract:
A practical problem often encountered with observed count data is the presence of excess zeros. Zero-inflation in count data can be handled by zero-inflated models, which are two-component mixtures of a point mass at zero and a discrete distribution for the count data. In the presence of predictors, zero-inflated Poisson (ZIP) regression models are perhaps the most commonly used. However, the fully parametric ZIP regression model can sometimes be restrictive, especially with respect to the mixing proportions. Taking inspiration from some of the recent literature on semiparametric mixtures of regressions models for flexible mixture modeling, we propose a semiparametric ZIP regression model. We present an "EM-like" algorithm for estimation and a summary of asymptotic properties of the estimators. The proposed semiparametric models are then applied to data sets involving clandestine methamphetamine laboratories and Alzheimer's disease.
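The two-component mixture described in this abstract can be written out directly. The sketch below is only an illustration of the fully parametric ZIP distribution, not the semiparametric method proposed in the thesis; it assumes a constant mixing proportion `pi` and Poisson rate `lam` (both names are hypothetical) and no predictors.

```python
import math


def zip_pmf(y, pi, lam):
    """P(Y = y) under a zero-inflated Poisson: a point mass at zero with
    weight pi mixed with a Poisson(lam) component with weight 1 - pi."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    point_mass = 1.0 if y == 0 else 0.0
    return pi * point_mass + (1.0 - pi) * poisson


def zip_mean(pi, lam):
    """Mean of the mixture: only the Poisson component contributes."""
    return (1.0 - pi) * lam


def zip_variance(pi, lam):
    """Variance of the mixture; it exceeds the mean whenever pi > 0."""
    return (1.0 - pi) * lam * (1.0 + pi * lam)
```

Because the variance exceeds the mean for any positive mixing proportion, excess zeros of this kind also show up as overdispersion relative to a plain Poisson model.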
5

Llorens, Aleixandre Noelia. "Evaluación en el modelado de las respuestas de recuento." Doctoral thesis, Universitat de les Illes Balears, 2005. http://hdl.handle.net/10803/9446.

Abstract:
This work presents two lines of research developed in recent years on the evaluation stage in count data. The areas of study have been count data, specifically the Poisson regression model and its extensions, and the evaluation stage as a turning point in the statistical modelling process. The results obtained demonstrate the importance of applying a model suited to the characteristics of the data, as well as of evaluating its fit. In addition, the comparisons of tests, indices, estimators and models aim to indicate the suitability of, or preference for, some over others in certain circumstances and according to the researcher's objectives.
6

Pedersen, Kristen E. "Sample Size Determination in Auditing Accounts Receivable Using a Zero-Inflated Poisson Model." Digital WPI, 2010. https://digitalcommons.wpi.edu/etd-theses/421.

Abstract:
In the practice of auditing, a sample of accounts is chosen to verify whether the accounts are materially misstated, because auditing all accounts would be too expensive. This paper seeks a method for choosing a sample size of accounts that gives a more accurate estimate than the methods currently used for sample size determination. Methods for determining sample size are reviewed under both the frequentist and Bayesian settings, and then our method using the zero-inflated Poisson (ZIP) model, which explicitly considers zero versus non-zero errors, is introduced. This model is favorable because of the excess zeros present in auditing data, which the standard Poisson model does not account for, and it could easily be extended to data similar to accounting populations.
7

Kreider, Scott Edwin Douglas. "A case study in handling over-dispersion in nematode count data." Manhattan, Kan. : Kansas State University, 2010. http://hdl.handle.net/2097/4248.

8

Zeileis, Achim, Christian Kleiber, and Simon Jackman. "Regression Models for Count Data in R." Foundation for Open Access Statistics, 2008. http://epub.wu.ac.at/4986/1/Zeileis_etal_2008_JSS_Regression%2DModels%2Dfor%2DCount%2DData%2Din%2DR.pdf.

Abstract:
The classical Poisson, geometric and negative binomial regression models for count data belong to the family of generalized linear models and are available at the core of the statistics toolbox in the R system for statistical computing. After reviewing the conceptual and computational features of these methods, a new implementation of hurdle and zero-inflated regression models in the functions hurdle() and zeroinfl() from the package pscl is introduced. It re-uses design and functionality of the basic R functions just as the underlying conceptual tools extend the classical models. Both hurdle and zero-inflated models are able to incorporate over-dispersion and excess zeros, two problems that typically occur in count data sets in economics and the social sciences, better than their classical counterparts. Using cross-section data on the demand for medical care, it is illustrated how the classical as well as the zero-augmented models can be fitted, inspected and tested in practice. (authors' abstract)
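The difference between the two model classes named in this abstract is easiest to see in their probability functions. The sketch below is plain Python written for illustration; it is not the pscl code, and the function and parameter names (`p_zero`, `p_extra`, `lam`) are hypothetical. It assumes a Poisson count component and no covariates.

```python
import math


def poisson_pmf(y, lam):
    """Ordinary Poisson probability P(Y = y)."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def hurdle_poisson_pmf(y, p_zero, lam):
    """Hurdle model: every zero comes from the binary part, and positive
    counts follow a zero-truncated Poisson."""
    if y == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))


def zero_inflated_poisson_pmf(y, p_extra, lam):
    """Zero-inflated model: zeros come from both the point mass and the
    untruncated Poisson component."""
    extra = p_extra if y == 0 else 0.0
    return extra + (1.0 - p_extra) * poisson_pmf(y, lam)
```

In the hurdle form the zero probability equals `p_zero` exactly, whereas in the zero-inflated form it is `p_extra` plus the share of zeros produced by the Poisson component.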
9

Gao, Siyu. "The impact of misspecification of nuisance parameters on test for homogeneity in zero-inflated Poisson model: a simulation study." Kansas State University, 2014. http://hdl.handle.net/2097/17804.

Abstract:
Master of Science
Department of Statistics
Wei-Wen Hsu
The zero-inflated Poisson (ZIP) model consists of a Poisson model and a degenerate distribution at zero. Under this model, zero counts are generated from two sources, representing heterogeneity in the population. In practice, it is often of interest to evaluate whether this heterogeneity is consistent with the observed data. Most existing methodologies for examining this heterogeneity assume that the Poisson mean is a function of nuisance parameters, which are simply the coefficients associated with covariates. However, these nuisance parameters can be misspecified when these methodologies are applied. As a result, the validity and the power of the test may be affected. The impact of such misspecification has not been discussed in the literature. This report primarily investigates the impact of misspecification on the performance of the score test for homogeneity in ZIP models. Through an intensive simulation study, we find that: 1) under misspecification, the limiting distribution of the score test statistic under the null is no longer chi-squared, and a parametric bootstrap is suggested for finding the true null limiting distribution of the score test statistic; 2) the power of the test decreases as the number of covariates in the Poisson mean increases. The test with a constant Poisson mean has the highest power, even compared to the test with a well-specified mean. Finally, the simulation findings are applied to the Wuhan Inpatient Care Insurance data, which contain excess zeros.
10

Zeileis, Achim, Christian Kleiber, and Simon Jackman. "Regression Models for Count Data in R." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 2007. http://epub.wu.ac.at/1168/1/document.pdf.

Abstract:
The classical Poisson, geometric and negative binomial regression models for count data belong to the family of generalized linear models and are available at the core of the statistics toolbox in the R system for statistical computing. After reviewing the conceptual and computational features of these methods, a new implementation of zero-inflated and hurdle regression models in the functions zeroinfl() and hurdle() from the package pscl is introduced. It re-uses design and functionality of the basic R functions just as the underlying conceptual tools extend the classical models. Both model classes are able to incorporate over-dispersion and excess zeros - two problems that typically occur in count data sets in economics and the social and political sciences - better than their classical counterparts. Using cross-section data on the demand for medical care, it is illustrated how the classical as well as the zero-augmented models can be fitted, inspected and tested in practice. (author's abstract)
Series: Research Report Series / Department of Statistics and Mathematics
11

Zavaleta, Katherine Elizabeth Coaguila. "Modelo destrutivo com variável terminal em experimentos quimiopreventivos de tumores em animais." Universidade Federal de São Carlos, 2012. https://repositorio.ufscar.br/handle/ufscar/4561.

Abstract:
Financiadora de Estudos e Projetos
The chemical induction of carcinogens in chemopreventive animal experiments is becoming increasingly frequent in biological research. The purpose of these experiments is to evaluate the effect of a particular treatment on the rate of tumor incidence in animals. In this work, the number of promoted tumors per animal is modeled parametrically following the suggestions of Kokoska (1987) and Freedman et al. (1993). The study of these chemopreventive experiments is presented in the context of the destructive model proposed by Rodrigues et al. (2010), with a terminal variable that conditions or censors the experiment at the time of the animal's death. Since the data analyzed in this field contain an excess of zeros (Freedman et al., 1993), we propose for the number of promoted tumors a negative binomial (NB) distribution, a zero-inflated Poisson (ZIP) distribution, and a zero-inflated negative binomial (ZINB) distribution. These models are selected through the likelihood ratio test and the AIC and BIC criteria. Their parameters are estimated by the method of maximum likelihood, and simulation studies are also carried out. As a future extension of this project, Bayesian methodology is suggested as an alternative to maximum likelihood via the EM algorithm.
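The AIC and BIC criteria mentioned in this abstract have simple closed forms once a model's maximized log-likelihood is known. The sketch below states the standard definitions; the function names and the `select_best` helper are hypothetical and are not taken from the thesis.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_best(criterion_by_model):
    """Return the model name with the smallest criterion value."""
    return min(criterion_by_model, key=criterion_by_model.get)
```

Both criteria penalize the number of parameters, so a ZINB model, which has more parameters than NB or ZIP, is preferred only when its gain in log-likelihood outweighs the penalty.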
12

Low, Wan Jing. "Variants of compound models and their application to citation analysis." Thesis, University of Wolverhampton, 2017. http://hdl.handle.net/2436/620467.

Abstract:
This thesis develops two variant statistical models for count data based upon compound models for contexts when the counts may be viewed as derived from two generations, which may or may not be independent. Unlike standard compound models, the variants model the sum of both generations. We consider cases where both generations are negative binomial or one is Poisson and the other is negative binomial. The first variant, denoted SVA, follows a zero restriction, where a zero in the first generation will automatically be followed by a zero in the second generation. The second variant, denoted SVB, is a convolution model that does not possess this zero restriction. The main properties of the SVA and SVB models are outlined and compared with standard compound models. The results show that the SVA distributions are similar to standard compound distributions for some fixed parameters. Comparisons of SVA, Poisson hurdle, negative binomial hurdle and their zero-inflated counterpart using simulated SVA data indicate that different models can give similar results, as the generating models are not always selected as the best fitting. This thesis focuses on the use of the variant models to model citation counts. We show that the SVA models are more suitable for modelling citation data than other previously used models such as the negative binomial model. Moreover, the application of SVA and SVB models may be used to describe the citation process. This thesis also explores model selection techniques based on log-likelihood methods, Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). The suitability of the models is also assessed using two diagrammatic methods, randomised quantile residual plots and Christmas tree plots. 
The Christmas tree plots clearly illustrate whether the observed data are within fluctuation bounds under the fitted model, but the randomised quantile residual plots utilise the cumulative distribution, and hence are insensitive to individual data values. Both plots show the presence of citation counts that are larger than expected under the fitted model in the data sets.
13

Cheng, Lulu. "Statistical Methods for Genetic Pathway-Based Data Analysis." Diss., Virginia Tech, 2013. http://hdl.handle.net/10919/52039.

Abstract:
The wide application of genomic microarray technology has created a strong need for methods of high-dimensional genetic data analysis. Many statistical methods for microarray data analysis consider one gene at a time, but they may miss subtle changes at the single-gene level. This limitation may be overcome by considering a set of genes simultaneously, where the gene sets are derived from prior biological knowledge and are called "pathways". We have made contributions on two specific research topics related to high-dimensional genetic pathway data. One is a semiparametric model for identifying pathways related to zero-inflated clinical outcomes; the other is a multilevel Gaussian graphical model for exploring both pathway-level and gene-level network structures. For the first problem, we develop a semiparametric model via a Bayesian hierarchical framework. We model the pathway effect nonparametrically in a zero-inflated Poisson hierarchical regression model with unknown link function. The nonparametric pathway effect is estimated via the kernel machine, and the unknown link function is estimated by transforming a mixture of beta cumulative density functions. Our approach provides flexible semiparametric settings to describe the complicated association between gene microarray expressions and the clinical outcomes. The Metropolis-within-Gibbs sampling algorithm and the Bayes factor are used to make the statistical inferences. Our simulation results indicate that the semiparametric approach is more accurate and flexible than zero-inflated Poisson regression with the canonical link function, especially when the number of genes is large. The usefulness of our approach is demonstrated through its application to a canine gene expression data set (Enerson et al., 2006). Our approach can also be applied to other settings where a large number of highly correlated predictors are present.
Unlike the first problem, the second takes into account that pathways are not independent of each other because of shared genes and interactions among pathways. Multi-pathway analysis has been a challenging problem because of the complex dependence structure among pathways. By considering the dependency among pathways as well as among genes within each pathway, we propose a multi-level Gaussian graphical model (MGGM): one level is for the pathway network and the second is for the gene network. We develop a multilevel L1 penalized likelihood approach to achieve sparseness on both levels. We also provide an iterative weighted graphical LASSO algorithm (Guo et al., 2011) for MGGM. Some asymptotic properties of the estimator are also illustrated. Our simulation results support the advantages of our approach: our method estimates the network more accurately on the pathway level and more sparsely on the gene level. We also demonstrate the usefulness of our approach using the canine genes-pathways data set.
Ph. D.
14

Milani, Eder Angelo. "Modelos para séries temporais utilizando as distribuições normal generalizada e log-normal generalizada." Universidade Federal de São Carlos, 2016. https://repositorio.ufscar.br/handle/ufscar/7943.

Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Starting from the generalized normal distribution and the concepts of generalized autoregressive moving average models, we introduce the generalized normal ARMA model as an alternative for modelling time series that exhibit symmetry and tails that may be lighter or heavier than those of the normal distribution. We present applications of the proposed model to three time series from hydrology, economics and public policy. The proposed model proved to be a good alternative to the ARMA model with normal distribution. We extend the model to asymmetric time series. In this case we use the Box-Cox transformation, and the resulting model is denoted Box-Cox generalized normal ARMA. The particular case in which the logarithmic transformation is used is called generalized log-normal ARMA. We fit the models with transformation to the series of monthly average affluent streamflow of the Furnas and Sobradinho hydroelectric plants. The predictions obtained from the model with transformation are better than those from the model without transformation. To treat time series whose correlation function is periodic, we define three extensions of the periodic autoregressive model, called the generalized normal periodic autoregressive model, the generalized log-normal periodic autoregressive model and the Box-Cox generalized normal periodic autoregressive model. We observe that the streamflow series of the Furnas and Sobradinho hydroelectric plants have periodic correlation, and we present two applications of the periodic models to these series. In fitting the models, we note that the generalized normal distribution is not needed in every month; only in some months did it give better results than the normal distribution. Finally, we define the zero-inflated generalized normal distribution and the zero-inflated generalized normal ARMA model for time series. Adopting this model for series with zero inflation and the maximum likelihood method for parameter estimation, we analyse the series of rainfall amounts in the city of São Carlos.
15

Teng, Yungchu, and 鄧詠竹. "A Study On Zero-and-K-Inflated Poisson Regression Model." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/74275146426708698381.

Abstract:
Master's
National Taipei University
Department of Statistics
100
In public health, social science, engineering, agricultural science and other disciplines, it is common to use Poisson (POI) regression to analyze discrete count data. However, excessive zeros often occur in the data and cause over-dispersion. Lambert (1992) therefore proposed the zero-inflated Poisson (ZIP) regression model to fit such data. In this research, we extend the zero-inflated Poisson regression model to the zero-and-K-inflated Poisson (ZKIP) regression model. The ZKIP model can be applied to count data that contain extra zeros and extra Ks, where K is a positive integer. For example, a survey question asking how many times young adults visited a dentist in two years yielded zero times (zero) or one time (K) for most people; such data are called zero-and-K-inflated data. The simulation study compares the goodness of fit of the ZKIP, ZIP and POI models and discusses when each model is best used. We also explore the effect of sample size, zero proportion, K proportion and the Poisson mean on data fitting for these models. The simulation study shows that ZKIP fits better than POI and ZIP in all simulation configurations. In the empirical study, we use data from the 2005 National Health Interview Survey to compare the fit of the three models. The results show that the zero-and-K-inflated Poisson regression model outperforms the other two models.
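One natural way to write a zero-and-K-inflated Poisson distribution is as a three-component mixture: a point mass at zero, a point mass at K, and a Poisson component. The sketch below assumes that form; the abstract does not give the thesis's exact parameterization, so the function and the names `pi0`, `pik` and `lam` are hypothetical.

```python
import math


def zkip_pmf(y, k, pi0, pik, lam):
    """P(Y = y) for an assumed zero-and-K-inflated Poisson mixture:
    weight pi0 at zero, weight pik at k, and weight 1 - pi0 - pik on a
    Poisson(lam) component."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if pi0 < 0 or pik < 0 or pi0 + pik > 1:
        raise ValueError("pi0 and pik must be non-negative and sum to at most 1")
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    inflation = (pi0 if y == 0 else 0.0) + (pik if y == k else 0.0)
    return inflation + (1.0 - pi0 - pik) * poisson
```

Setting `pik` to zero recovers the ZIP distribution, and setting both weights to zero recovers the plain Poisson, which is why the three models in the abstract can be compared as nested alternatives under this assumed form.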
16

Wang, Ying-Jhih, and 王英至. "Modeling Spatial Risk Variation of Aftershocks Using Zero-Inflated Poisson Model." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/zbnnvs.

Abstract:
Master's
National Changhua University of Education
Institute of Statistics and Information Science
106
Modeling the spatial risk variation of an event of interest is an active research topic in spatial statistics. For count data, when the response variable contains excessive zero values, the traditional Poisson regression model may not be suitable. To overcome this issue, we combine the zero-inflated Poisson model with a spatial hierarchical Bayesian model to assess the spatial risk variation of the event of interest, where the spatial correlations in the data are modeled by the conditional autoregressive model and a logistic regression is used to model spatially the probabilistic variability of risks. Statistical inference for the model parameters and risk assessment are conducted within a Bayesian framework. We use a real data set on aftershocks of the 921 Chi-Chi earthquake in Taiwan to illustrate the effectiveness of the proposed methodology.
17

Lin, Yi-Heng, and 林翊亨. "Applying Zero-inflated Poisson Model for High-quality process Control Chart." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/90999020600242766344.

Abstract:
Master's
National Cheng Kung University
Department of Industrial and Information Management (in-service program)
98
The attribute c control chart is based on the Poisson distribution and relies on the central limit theorem (CLT). With the emergence of high-quality processes, the number of defects has been substantially reduced. However, the reduction in defects creates an excessive number of zero counts on the c chart. It also drives the lower control limit toward zero or below. The CLT assumption behind the c chart then becomes invalid, and the chart generates many false alarms. The c chart is therefore inadequate. Because defects are so few in such a process, a chart dominated by zero counts is inadequate for monitoring and controlling attribute data. Hence, searching for a more appropriate probability distribution is important. For these reasons, the zero-inflated Poisson (ZIP) distribution is more appropriate than the Poisson distribution. In this research, an empirical approach is proposed to assess the feasibility of ZIP through a case study. The Vuong test and maximum likelihood estimation (MLE) are used for model comparison and parameter estimation, and the average run length (ARL) is then applied for performance evaluation. The approach shows that the ZIP model is a significant improvement and could be a useful reference for high-quality processes.
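The control-limit problem described in this abstract follows from the usual three-sigma c chart formula, in which the limits are the mean count plus or minus three times its square root. The sketch below uses that standard formula and compares the zero probability under Poisson and ZIP; the function names are hypothetical and the thesis's own procedure (Vuong test, MLE, ARL) is not reproduced here.

```python
import math


def c_chart_limits(c_bar):
    """Three-sigma c chart limits for a mean defect count c_bar.
    The raw lower limit c_bar - 3*sqrt(c_bar) is negative whenever
    c_bar < 9, so it is clipped at zero."""
    half_width = 3.0 * math.sqrt(c_bar)
    return max(0.0, c_bar - half_width), c_bar + half_width


def prob_zero_poisson(lam):
    """Probability of a zero count under Poisson(lam)."""
    return math.exp(-lam)


def prob_zero_zip(pi, lam):
    """Probability of a zero count under a zero-inflated Poisson."""
    return pi + (1.0 - pi) * math.exp(-lam)
```

For a high-quality process with a small mean count, the lower limit is clipped at zero and provides no information, while the ZIP form allows a zero probability larger than a Poisson with the same rate can produce.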
18

Lukusa, Martin Tshishimbi Wa, and 盧馬汀. "Semiparametric Estimation of a Zero-Inflated Poisson Regression Model with Missing Covariates." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/73140756254761603751.

Abstract:
Doctoral
Feng Chia University
Department of Statistics, Doctoral Program in Applied Statistics
104
Besides the usual problem of overdispersion encountered when fitting count responses with an excess of zeros, some of the covariates used to model the Poisson mean and the mixing probability in a zero-inflated Poisson (ZIP) regression model are likely to have missing values. When covariates have missing values, inference based only on complete cases may be inefficient because the deleted cases discard available information. To obtain unbiased estimators of the parameters of a ZIP regression model with missing covariates, we propose inverse probability weighting (IPW) methods. These methods estimate the parameters of the ZIP regression model under the missing at random (MAR) assumption, with observations weighted by the inverse of the selection probability. In this IPW framework, we prove that the semiparametric IPW estimator is asymptotically more efficient than the true-weight IPW estimator. We also investigate the asymptotic properties of the estimators. Finally, we conduct Monte Carlo experiments to study their finite-sample performance and use a real example to illustrate the practical use of the proposed methodology. Keywords: count data, overdispersion, estimating equation, missing at random, nonparametric selection probability, large-sample properties.
19

Santos, Jorge Helder Pereira dos. "Modelos para dados de contagem com excesso de zeros." Master's thesis, 2013. http://hdl.handle.net/1822/29402.

Abstract:
Master's dissertation in Statistics
Regression models for count data are widely used in many areas of study to model phenomena. These models call for a special methodological framework because the response variable takes only non-negative integer values. The Poisson distribution is the best known and most widely used distribution for modeling count data; when there is overdispersion, however, other distributions become necessary, notably the negative binomial distribution. Another common problem in count data is an excess of zeros in the response variable. Zero-inflated regression models are widely used to model this type of data. These models treat the counts as a mixture of two distributions with two underlying processes: one that handles the excess zeros, modeled by a point mass, and another that handles the counts, modeled by a Poisson or negative binomial distribution. This work studies regression models for count data and their application to bank data on clients who were granted consumer credit by a bank. Its main objective is to study how the number of missed loan installments for a client depends on the characteristics of the client and of the contract. In particular, we fit Poisson regression models, negative binomial regression models, zero-inflated Poisson regression models and zero-inflated negative binomial regression models, using the EM algorithm to obtain maximum likelihood estimates of the parameters. The results show that zero-inflated regression models fit better than models that do not take the excess zeros into account. They also show that models based on the negative binomial distribution are more suitable for these data than models based on the Poisson distribution.
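The two-process mixture described in this abstract (a point mass at zero plus a Poisson count process) can be sketched as follows. This is a minimal illustration of the standard zero-inflated Poisson probability mass function, not the dissertation's code; the names `zip_pmf`, `zip_mean_var`, `pi` and `lam`, and the parameter values, are assumptions made for the example.

```python
import math

def zip_pmf(k, pi, lam):
    """P(Y = k) for a zero-inflated Poisson.

    With probability `pi` the observation is a structural zero (point mass);
    otherwise it is drawn from a Poisson distribution with mean `lam`.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

def zip_mean_var(pi, lam):
    """Closed-form mean and variance of the zero-inflated Poisson."""
    mean = (1.0 - pi) * lam
    var = (1.0 - pi) * lam * (1.0 + pi * lam)
    return mean, var

# Hypothetical parameters: 30% structural zeros, Poisson mean 2.
pi, lam = 0.3, 2.0
mean, var = zip_mean_var(pi, lam)
print("P(Y=0) under ZIP:    ", zip_pmf(0, pi, lam))
print("P(Y=0) under Poisson:", math.exp(-lam))
print("mean, variance:      ", mean, var)
```

Because the variance exceeds the mean whenever `pi` is strictly between 0 and 1, zero inflation alone already produces overdispersion relative to a plain Poisson model.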
20

Saab, Rabih. "Nonparametric estimation of the mixing distribution in mixed models with random intercepts and slopes." Thesis, 2013. http://hdl.handle.net/1828/4548.

Abstract:
Generalized linear mixed models (GLMMs) are widely used in statistical applications to model count and binary data. We consider the problem of nonparametric likelihood estimation of mixing distributions in GLMMs with multiple random effects. The log-likelihood to be maximized has the general form l(G) = Σ_i log ∫ f(y_i, γ) dG(γ), where f(·, γ) is a parametric family of component densities, y_i is the i-th observed response, and G is a mixing distribution function of the random effects vector γ defined on Ω. The literature presents many algorithms for maximum likelihood estimation (MLE) of G in the univariate random effect case, such as the EM algorithm (Laird, 1978), the intra-simplex direction method, ISDM (Lesperance and Kalbfleisch, 1992), and the vertex exchange method, VEM (Böhning, 1985). In this dissertation, the constrained Newton method (CNM) of Wang (2007), which fits GLMMs with random intercepts only, is extended to fit clustered datasets with multiple random effects. Owing to the general equivalence theorem from the geometry of mixture likelihoods (see Lindsay, 1995), many NPMLE algorithms, including CNM and ISDM, maximize the directional derivative of the log-likelihood to add potential support points to the mixing distribution G. Our method, Direct Search Directional Derivative (DSDD), uses a directional search method to find local maxima of the multi-dimensional directional derivative function. The performance of DSDD is investigated in GLMMs where f is a Bernoulli or Poisson distribution function. The algorithm is also extended to cover GLMMs with zero-inflated data. Goodness-of-fit (GOF) and selection methods for mixed models have been developed in the literature; however, their application to models with nonparametric random effects distributions is vague and ad hoc. Some popular measures, such as the Deviance Information Criterion (DIC), the conditional Akaike Information Criterion (cAIC) and R² statistics, are potentially useful in this context. Additionally, some cross-validation goodness-of-fit methods popular in Bayesian applications, such as the conditional predictive ordinate (CPO) and numerical posterior predictive checks, can be applied with some minor modifications to suit the non-Bayesian approach.
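When the mixing distribution G is discrete, the integral in the log-likelihood above reduces to a weighted sum over support points. The sketch below evaluates it for the simplest case of a Poisson model with a single random intercept on the log scale. It is only an illustration of the formula, not the DSDD or CNM algorithm from the dissertation, and the function names, counts, support points and weights are all hypothetical.

```python
import math

def poisson_pmf(y, rate):
    """Poisson probability of observing count y for a given rate."""
    return math.exp(-rate) * rate ** y / math.factorial(y)

def mixture_loglik(ys, support, weights):
    """Evaluate l(G) = sum_i log sum_j w_j f(y_i; gamma_j) for a discrete G.

    `support` holds the random-intercept values gamma_j (log scale) and
    `weights` their probabilities w_j, which must sum to one.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing weights must sum to 1")
    total = 0.0
    for y in ys:
        density = sum(w * poisson_pmf(y, math.exp(g))
                      for g, w in zip(support, weights))
        total += math.log(density)
    return total

# Made-up overdispersed counts with several zeros.
ys = [0, 0, 1, 5, 7]
one_point = mixture_loglik(ys, [math.log(2.6)], [1.0])
two_point = mixture_loglik(ys, [math.log(0.4), math.log(6.0)], [0.6, 0.4])
print("one support point: ", one_point)
print("two support points:", two_point)
```

For these made-up counts the two-point mixing distribution gives a higher log-likelihood than the single-point (ordinary Poisson) fit, which is the kind of gain that adding support points is meant to capture.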
21

Koemle, Dieter. "The impact of agri-environmental policy and infrastructure on wildlife and land prices." Doctoral thesis, 2018. http://hdl.handle.net/11858/00-1735-0000-002E-E538-E.

22

Sebatjane, Phuti. "Understanding patterns of aggregation in count data." Diss., 2016. http://hdl.handle.net/10500/22067.

Abstract:
In this thesis the term aggregation refers to overdispersion, and the two are used interchangeably. To address the prevalence of infectious parasite species faced by most rural livestock farmers, we model the distribution of faecal egg counts of 15 parasite species (13 internal parasites and 2 ticks) common in sheep and goats. Aggregation and excess zeroes are addressed through the use of generalised linear models. The abundance of each species was modelled using six different distributions: the Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-altered Poisson (ZAP) and zero-altered negative binomial (ZANB), and their fits were then compared. Excess-zero models (ZIP, ZINB, ZAP and ZANB) were found to fit better than standard count models (Poisson and negative binomial) in all 15 cases. We further investigated how the distributional assumption affects aggregation and zero inflation. Aggregation and zero inflation (measured by the dispersion parameter k and the zero-inflation probability) were found to vary greatly with the distributional assumption; this in turn changed the fixed-effects structure. Serial autocorrelation between adjacent observations was then taken into account by fitting observation-driven time series models to the data. Simultaneously accounting for autocorrelation, overdispersion and zero inflation proved successful, as zero-inflated autoregressive models performed better than zero-inflated models in most cases. Apart from contributing to scientific knowledge, predictability of parasite burden will help farmers with effective disease management interventions. Researchers confronted with the task of analysing count data with excess zeroes can use the findings of this illustrative study as a guideline irrespective of their research discipline. Statistical methods ranging from model selection and the quantification of zero inflation through to accounting for serial autocorrelation are described and illustrated.
Statistics
M.Sc. (Statistics)