
Dissertations / Theses on the topic 'Zero-inflated distribution'

1

Wan, Chung-him, and 溫仲謙. "Analysis of zero-inflated count data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2009. http://hub.hku.hk/bib/B43703719.

2

Wan, Chung-him. "Analysis of zero-inflated count data." Thesis, The University of Hong Kong, 2009. http://sunzi.lib.hku.hk/hkuto/record/B43703719.

3

Dai, Xiaogang. "Score Test and Likelihood Ratio Test for Zero-Inflated Binomial Distribution and Geometric Distribution." TopSCHOLAR®, 2018. https://digitalcommons.wku.edu/theses/2447.

Abstract:
The main purpose of this thesis is to compare the performance of the score test and the likelihood ratio test by computing type I and type II error rates when the tests are applied to the geometric distribution and the inflated binomial distribution. We first derive the test statistics of the score test and the likelihood ratio test for both distributions. We then run a simulation in R to study the behavior of the two tests: we write R code to calculate the two types of error for each distribution, and we generate many samples to approximate the probabilities of type I and type II errors under different parameter values. In the first chapter, we discuss the motivation behind the work presented in this thesis and introduce the definitions used throughout. In the second chapter, we derive test statistics for the likelihood ratio test and the score test for the geometric distribution. For the score test, we use both the observed information matrix and the expected information matrix, and obtain the score test statistics z_O and z_I. Chapter 3 discusses the likelihood ratio test and the score test for the inflated binomial distribution. The main parameter of interest is w, so p is a nuisance parameter in this case. We derive the likelihood ratio test statistic and the score test statistics to test w. In both tests, the nuisance parameter p is estimated by its maximum likelihood estimator p̂. We again consider the score test using both the observed and the expected information matrices. Chapter 4 focuses on the score test in the inflated binomial distribution. We generate data that follow the zero-inflated binomial distribution in R, and we plot the ratio of the two score test statistics for the sample data, z_I / z_O, against different values of n0, the number of zero values in the sample.
In chapter 5, we discuss and compare the use of the score test with the two types of information matrices. We perform a simulation study to estimate the two types of error when applying the tests to the geometric distribution and the inflated binomial distribution. We plot the percentages of the two errors while fixing different parameters, such as the probability p and the number of trials m. Finally, chapter 6 briefly summarizes the results.
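As a rough illustration of the data-generating step this abstract describes (the thesis itself uses R, and its code is not reproduced here), the following Python sketch draws from a zero-inflated binomial distribution and counts the zeros. It assumes the usual mixture form, in which w is the probability of a structural zero and the remaining mass follows Binomial(m, p); the function names and parameter values are hypothetical.

```python
import math
import random

def zib_pmf(k, m, p, w):
    """P(X = k) under an assumed zero-inflated binomial mixture."""
    binom = math.comb(m, k) * p**k * (1 - p) ** (m - k)
    return (w if k == 0 else 0.0) + (1 - w) * binom

def zib_sample(n, m, p, w, rng):
    """Draw n values: a structural zero with probability w, else Binomial(m, p)."""
    draws = []
    for _ in range(n):
        if rng.random() < w:
            draws.append(0)
        else:
            # Binomial(m, p) as a sum of m Bernoulli(p) trials
            draws.append(sum(rng.random() < p for _ in range(m)))
    return draws

rng = random.Random(0)  # fixed seed so the run is reproducible
sample = zib_sample(20000, m=5, p=0.3, w=0.2, rng=rng)
n0 = sample.count(0)  # number of zeros in the sample (n0 in the abstract)
print("observed zero fraction:", n0 / len(sample))
print("model zero probability:", zib_pmf(0, 5, 0.3, 0.2))
```

The observed zero fraction should sit close to w + (1 - w)(1 - p)^m, which is the quantity that separates the zero-inflated model from a plain binomial.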
4

Pailden, Junvie Montealto. "Applications of Empirical Likelihood to Zero-Inflated Data and Epidemic Change Point." Bowling Green State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1367579613.

5

Fan, Huihao. "Test of Treatment Effect with Zero-Inflated Over-Dispersed Count Data from Randomized Single Factor Experiments." University of Cincinnati / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1407404513.

6

Ibukun, Michael Abimbola. "Modely s Touchardovým rozdělením." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2021. http://www.nusl.cz/ntk/nusl-445468.

Abstract:
In 2018, Raul Matsushita, Donald Pianto, Bernardo B. De Andrade, Andre Cançado and Sergio Da Silva published a paper titled "Touchard distribution", which presented a model that is a two-parameter extension of the Poisson distribution. The normalizing constant of this model is related to the Touchard polynomials, hence its name. This diploma thesis is concerned with the properties of the Touchard distribution for which delta is known. Two asymptotic tests based on two different statistics were carried out for comparison in a Touchard model with two independent samples, supported by simulations in R.
7

Silva, João Flávio Andrade. "Modelos preditivos para LGD." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/104/104131/tde-13112018-084000/.

Abstract:
Financial institutions that intend to use the advanced Internal Ratings Based (IRB) approach need to develop methods to estimate the Loss Given Default (LGD) risk component. Proposals for modeling the Probability of Default (PD) have been presented since the 1950s; in contrast, LGD forecasting received greater attention only after the publication of the Basel II Accord. The LGD literature is still small compared with that on PD, and there is no method as efficient in terms of accuracy and interpretation as logistic regression is for PD. Regression models for LGD play a key role in the risk management of financial institutions. Given this importance, this work proposes a methodology to quantify the LGD risk component. Considering the characteristics reported for the distribution of LGD and the flexible shapes that the beta distribution can assume, we propose estimating LGD with the zero-inflated bimodal beta regression model. We develop the zero-inflated bimodal beta distribution and present some of its properties, including moments; define maximum likelihood estimators; construct the regression model for this distribution; present asymptotic confidence intervals, hypothesis tests, and model selection criteria; and carry out a simulation study to evaluate the performance of the maximum likelihood estimators of the distribution's parameters. For comparison with our proposal, we select the beta regression and inflated beta regression models, which are the more usual approaches, and the SVR algorithm, because of the significant superiority reported for it in other studies.
8

Zbylut, Joanna. "Modeling proportions to assess the soil nematode community structure in a two year alfalfa crop." Kansas State University, 2014. http://hdl.handle.net/2097/17327.

Abstract:
Master of Science
Department of Statistics
Leigh Murray
The southern root-knot nematode (SRKN) and the weedy perennials, yellow nutsedge (YNS) and purple nutsedge (PNS) are simultaneously occurring pests in the irrigated agricultural soils of southern New Mexico. Previous research has very well characterized SRKN, YNS and PNS as a mutually-beneficial pest complex and has revealed their enhanced population growth and survival when they occur together. The density of nutsedge in a field could be used as a predictor of SRKN juveniles in the soil. In addition to SRKN, which is the most harmful of the plant parasitic nematodes, in southern New Mexico, other species or categories of nematodes could be identified and counted. Some of them are not as damaging to the plant as SRKN, and some of them may be essential for soil health. The nematode species could be grouped into categories according to trophic level (what nematodes eat) and herbivore feeding behavior (how herbivore nematodes eat). Subsequently, three ratios of counts were calculated for trophic level and for feeding behavior level to investigate the soil nematode community structure. These proportions were modeled as functions of the weed hosts YNS and PNS by generalized linear regression models using the logit link function and three probability distributions: the Binomial, Zero Inflated Binomial (ZIB) and Binomial Hurdle (BH). The latter two were used to account for potential high proportions of zeros in the data. The SAS NLMIXED procedure was used to fit models for each of the six sampling dates (May, July and September) over the two years of the alfalfa study. General results showed that the Binomial pmf generally provided the best fit, indicating lower zero-inflation than expected. Importance of YNS and PNS predictors varied over time and the different ratios. Specific results illustrate the differences in estimated probabilities between Binomial, ZIB and BH distributions as YNS counts increase for two selected ratios.
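To make the difference between the three distributions named in this abstract concrete, here is a minimal Python sketch of the Binomial, Zero Inflated Binomial and Binomial Hurdle probability mass functions with a logit link. It is not the thesis's SAS NLMIXED code; the coefficients b0 and b1, the number of trials, and the zero probabilities are hypothetical values chosen only for illustration.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def binom_pmf(k, m, p):
    return math.comb(m, k) * p**k * (1 - p) ** (m - k)

def zib_pmf(k, m, p, w):
    """Zero-inflated binomial: extra zeros are mixed in with probability w."""
    return (w if k == 0 else 0.0) + (1 - w) * binom_pmf(k, m, p)

def hurdle_pmf(k, m, p, pi0):
    """Binomial hurdle: zeros occur with probability pi0;
    positive counts follow a zero-truncated binomial."""
    if k == 0:
        return pi0
    return (1 - pi0) * binom_pmf(k, m, p) / (1 - binom_pmf(0, m, p))

# Hypothetical coefficients: logit(p) as a linear function of a YNS count.
b0, b1 = -2.0, 0.05
m = 20  # hypothetical number of trials (total count in the ratio's denominator)
for yns in (0, 10, 20):
    p = inv_logit(b0 + b1 * yns)
    print(yns,
          round(binom_pmf(0, m, p), 4),
          round(zib_pmf(0, m, p, 0.1), 4),
          round(hurdle_pmf(0, m, p, 0.1), 4))
```

In the ZIB the zero probability can only exceed the binomial one, whereas the hurdle model sets it freely, so the hurdle form can also describe fewer zeros than a binomial would predict.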
9

Ljung, Carolina, and Maria Svedberg. "Estimation of Loss Given Default Distributions for Non-Performing Loans Using Zero-and-One Inflated Beta Regression Type Models." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273593.

Abstract:
This thesis investigates three different techniques for estimating the loss given default (LGD) of non-performing consumer loans. It is a contribution to a credit risk evaluation model compliant with the regulations stipulated by the Basel Accords, which regulate, among other things, the capital requirements of European financial institutions. First, multiple linear regression is applied; thereafter, zero-and-one inflated beta regression is implemented in two versions, with and without Bayesian inference. The model performances confirm that modeling LGD data is challenging. However, the results show that the zero-and-one inflated beta regression, specifically the version without Bayesian inference, is superior to the other models in predicting LGD. It should be noted that all models had difficulty distinguishing low-risk loans, while the prediction accuracy for riskier loans, which result in larger losses, was higher. For future research, it is recommended to include macroeconomic variables in the models to capture economic downturn conditions, and to adopt decision trees, for example by applying machine learning.
10

Silva, Deise Deolindo. "Classe de distribuições série de potências inflacionadas com aplicações." Universidade Federal de São Carlos, 2009. https://repositorio.ufscar.br/handle/ufscar/4536.

Abstract:
Issue date: 2009-04-06.
The central theme of this work is the class of Inflated Modified Power Series Distributions; the aim is to study its main properties and its applicability in the Bayesian context. This class of models includes the simple and generalized Poisson, binomial and negative binomial distributions, and is therefore widely used to model discrete data with inflated values. As a particular case, the zero-inflated Poisson (ZIP) distribution is studied, with the main purpose of verifying the effectiveness of its modeling when compared with the Poisson distribution. The same methodology was considered for the inflated negative binomial distribution, comparing it with the Poisson, negative binomial and ZIP distributions. The Bayes factor and the full Bayesian significance test were used as formal criteria for model selection.
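The comparison of ZIP against Poisson described in this abstract can be sketched in a few lines of Python. The thesis selects models with the Bayes factor and the full Bayesian significance test; the sketch below swaps in a simpler, non-Bayesian stand-in (maximum likelihood via a standard EM iteration, compared by AIC). The data set and all function names are made up for illustration.

```python
import math

def pois_logpmf(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def zip_loglik(data, lam, w):
    """Log-likelihood of a ZIP model with zero-inflation weight w."""
    ll = 0.0
    for k in data:
        if k == 0:
            ll += math.log(w + (1 - w) * math.exp(-lam))
        else:
            ll += math.log(1 - w) + pois_logpmf(k, lam)
    return ll

def fit_zip_em(data, iters=500):
    """EM for ZIP; assumes the data contain both zeros and positive counts."""
    n, total = len(data), sum(data)
    n0 = sum(1 for k in data if k == 0)
    lam = total / (n - n0)  # start from the mean of the positive counts
    w = 0.5 * n0 / n
    for _ in range(iters):
        z = w / (w + (1 - w) * math.exp(-lam))  # E-step: P(structural | zero)
        w = n0 * z / n                          # M-step for the mixing weight
        lam = total / (n - n0 * z)              # M-step for the Poisson mean
    return lam, w

# Made-up counts with many more zeros than a Poisson with this mean would give.
data = [0] * 60 + [1] * 5 + [2] * 10 + [3] * 12 + [4] * 8 + [5] * 4 + [6] * 1

lam_pois = sum(data) / len(data)  # Poisson MLE is the sample mean
ll_pois = sum(pois_logpmf(k, lam_pois) for k in data)
lam_zip, w_zip = fit_zip_em(data)
ll_zip = zip_loglik(data, lam_zip, w_zip)

aic_pois = -2 * ll_pois + 2 * 1  # one parameter
aic_zip = -2 * ll_zip + 2 * 2    # two parameters
print("Poisson AIC:", round(aic_pois, 1), " ZIP AIC:", round(aic_zip, 1))
```

A lower AIC for the ZIP fit indicates that the extra zero-inflation parameter is paying for itself on these data; a Bayesian analysis in the spirit of the thesis would replace the AIC comparison with a Bayes factor.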
11

Llorens, Aleixandre Noelia. "Evaluación en el modelado de las respuestas de recuento." Doctoral thesis, Universitat de les Illes Balears, 2005. http://hdl.handle.net/10803/9446.

Abstract:
This work presents two lines of research developed in recent years around the evaluation stage in count data. The fields of study have been count data, specifically the Poisson regression model and its extensions, and the evaluation stage as a turning point in the statistical modeling process. The results obtained demonstrate the importance of applying a model appropriate to the characteristics of the data, as well as of evaluating its fit. In addition, the comparisons of tests, indices, estimators and models aim to indicate the suitability of, or preference for, some over others in particular circumstances and according to the researcher's objectives.
12

Ling, Wodan. "Quantile regression for zero-inflated outcomes." Thesis, 2019. https://doi.org/10.7916/d8-rre7-sw52.

Abstract:
Zero-inflated outcomes are common in biomedical studies, where the excessive zeros indicate some special but undetectable events. Quantile regression is potentially advantageous in analyzing zero-inflated outcomes due to two reasons. First, compared to parametric models such as the zero-inflated Poisson and two-part model, quantile regression gives robust and accurate estimation by avoiding likelihood specification and can capture the tail events and heterogeneity over the outcome distribution. Second, while the mean-based regression may be misinterpreted for a zero-inflated outcome, the interpretation of quantiles is naturally compatible with the underlying process that such an outcome intends to measure. Unfortunately, uncorrected linear quantile regression is not directly applicable because of two reasons. First, the feasibility of estimation and validity of inference of quantile regression require the conditional distribution of outcomes to be absolutely continuous, which is violated due to zero-inflation. Second, direct quantile regression implicitly assumes a constant chance to observe a positive outcome, but the degree of zero-inflation varies with the covariates in most cases. Thus the conditional quantile function of the outcome depends on the covariates in a nonlinear fashion. To analyze the zero-inflated outcomes by taking advantage of the merits of quantile regression, we propose a novel quantile regression framework that can address all the issues above. In the first part of this dissertation, we propose a two-part model that comprises a logistic regression for the probability of being positive, and a linear quantile regression for the positive part with subject-specific zero-inflation adjusted. Inference on the estimated conditional quantile and covariate effect are not trivial based on such a two-part model. 
We then develop an algorithm to achieve a consistent estimation of the conditional quantiles, while circumventing the unbounded variance at the quantile level where the conditional quantile changes from zero to positive. Furthermore, we develop an inference tool to determine the quantile treatment effect associated with a covariate at a given quantile level. We evaluate the proposed method and compare it with existing approaches by simulation studies and a real data analysis aimed at studying the risk factors for carotid atherosclerosis. In the second part, based on the proposed two-part model mentioned above, we develop ZIQRank, a zero-inflated quantile rank-score based test to detect the difference in distributions. The proposed test extends the local inference in the first part to a simultaneous one. It is powerful to handle zero-inflation and heterogeneity simultaneously. It comprises a valid test of logistic regression for the zero-inflation and rank-score based tests on multiple quantiles for the positive part with zero-inflation adjusted. The p-values are combined with a procedure selected according to the extent of zero-inflation and heterogeneity of the data. Simulation studies show that compared to existing tests, the proposed test has a higher power in detecting differential distributions. Finally, we apply the ZIQRank test to a human scRNA-seq data to study differentially expressed genes in Neoplastic and Regular cells. It successfully discovers a group of crucial genes associated with glioma, while the other methods fail to do so. In the third part, we extend the proposed two-part quantile regression model for zero-inflated outcomes and the ZIQRank test to analyze longitudinal data. Each part of the proposed two-part model is modified as a marginal longitudinal model (GEE), conditioning on the outcome at the previous time point and its zero/positive status. 
We apply the model and the test to study the effect of a recommender system aimed at boosting user engagement of a suite of smartphone apps designed for depressed patients. Our novel model framework demonstrates a dominating performance in model fitting, prediction, and critical feature detection, compared to the existing methods.
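The nonlinearity noted in this abstract follows from a simple identity for a nonnegative outcome: if the outcome is positive with probability π(x), its τ-th conditional quantile is 0 whenever τ ≤ 1 - π(x), and otherwise equals the quantile of the positive part at the adjusted level (τ - (1 - π(x))) / π(x). The Python sketch below illustrates only this identity, not the dissertation's estimation algorithm; the logistic coefficient and the exponential positive part are hypothetical stand-ins.

```python
import math

def zero_inflated_quantile(tau, prob_positive, positive_quantile):
    """tau-th quantile of a nonnegative outcome that is positive with
    probability prob_positive; positive_quantile maps a level in (0, 1)
    to the quantile of the positive part."""
    prob_zero = 1.0 - prob_positive
    if tau <= prob_zero:
        return 0.0
    adjusted_level = (tau - prob_zero) / prob_positive
    return positive_quantile(adjusted_level)

def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def exp_quantile(u, rate=1.0):
    """Quantile function of an exponential distribution (stand-in positive part)."""
    return -math.log(1.0 - u) / rate

# The chance of a positive outcome varies with the covariate x,
# so the quantile leaves zero at a different level tau for each x.
for x in (-1.0, 0.0, 1.0):
    pi = logistic(0.5 * x)  # hypothetical logistic model for P(Y > 0 | x)
    row = [round(zero_inflated_quantile(t, pi, exp_quantile), 3)
           for t in (0.25, 0.5, 0.75)]
    print(x, row)
```

Because the adjusted level depends on π(x), a quantile that is linear in x for the positive part is generally not linear in x for the full outcome, which is why a two-part formulation is natural here.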
13

Wen, Chi-Chun, and 温啟君. "A simulation study for estimating the parameters of zero-inflated negative binomial distribution." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/25612469063687365982.

14

Suharto, Rizki Aviandri. "A Double Sampling Plan for the Zero-Inflated Poisson Distribution in the Food Industry." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/93k37m.

Abstract:
Master's degree
National Taiwan University of Science and Technology
Department of Industrial Management
105
Implementing an appropriate sampling plan is an important task for protecting consumers from pathogenic microorganisms in the food industry. In this study, we develop a double sampling plan (DSP) for the zero-inflated Poisson distribution as an alternative sampling plan. This research provides R code to determine the plan parameters: the sample size, the acceptance number, and the rejection number at the first stage (n_1, c_1, and r_1), and the sample size and the acceptance number at the second stage (n_2 and c_2). The DSP performs better than the single sampling plan (SSP) in terms of average sample number (ASN) and total risk at the same quality levels, namely the acceptable quality level (AQL) and the lot tolerance percent defective (LTPD). Simulated data with various given mixing proportions (φ) and specified values (p_1, α, p_2, β) are used to illustrate the applications.
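The following Python sketch shows how an acceptance probability and an ASN could be computed for a generic double sampling plan by attributes. It is not the thesis's R code, and it rests on assumptions that the abstract does not confirm: the count in a sample of size n is taken to follow a zero-inflated Poisson distribution with mixing proportion φ and Poisson mean n·p, the two stages are treated as independent, and the lot is accepted at stage two when the combined count does not exceed c_2. All function names are hypothetical.

```python
import math

def zip_pmf(d, mean, phi):
    """P(D = d) for an assumed ZIP count with mixing proportion phi."""
    pois = math.exp(-mean) * mean**d / math.factorial(d)
    return (phi if d == 0 else 0.0) + (1 - phi) * pois

def zip_cdf(c, mean, phi):
    """P(D <= c); zero when c is negative."""
    if c < 0:
        return 0.0
    return sum(zip_pmf(d, mean, phi) for d in range(c + 1))

def dsp_accept_prob(p, phi, n1, c1, r1, n2, c2):
    """Accept at stage 1 if d1 <= c1; reject if d1 >= r1;
    otherwise sample again and accept if d1 + d2 <= c2."""
    accept = zip_cdf(c1, n1 * p, phi)
    for d1 in range(c1 + 1, r1):
        accept += zip_pmf(d1, n1 * p, phi) * zip_cdf(c2 - d1, n2 * p, phi)
    return accept

def dsp_asn(p, phi, n1, c1, r1, n2):
    """Average sample number: n1 plus n2 times P(second stage is needed)."""
    second_stage = sum(zip_pmf(d1, n1 * p, phi) for d1 in range(c1 + 1, r1))
    return n1 + n2 * second_stage

# Hypothetical plan and quality levels, for illustration only.
plan = dict(n1=10, c1=0, r1=3, n2=10, c2=2)
for p in (0.01, 0.05, 0.20):
    pa = dsp_accept_prob(p, 0.2, **plan)
    asn = dsp_asn(p, 0.2, plan["n1"], plan["c1"], plan["r1"], plan["n2"])
    print(p, round(pa, 4), round(asn, 2))
```

Under these assumptions, a plan search would look for the smallest sample sizes whose acceptance probability is at least 1 - α at the AQL and at most β at the LTPD.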
15

Herwantoro, Eko. "A Double Sampling Plan by Attributes for Zero-Inflated Negative Binomial Distribution in the Food Industry." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/5nq97v.

Abstract:
Master's degree
National Taiwan University of Science and Technology
Department of Industrial Management
105
Microbiological testing is an important task in the food industry. Pathogenic microorganisms contaminate food material as clusters or groups of individual cells, and the bacterial counts follow a zero-inflated negative binomial (ZINB) distribution across the food material. A double sampling plan (DSP) is developed for the ZINB distribution. We also provide R code to determine the plan parameters: the sample size, the acceptance number, and the rejection number at the first stage (n1, c1, and r1), and the sample size and the acceptance number at the second stage (n2 and c2). The plan is determined for given values of the acceptable quality level (AQL), the lot tolerance percent defective (LTPD), and the two risks α and β. Simulated data with various parameter values (φ, p_1, α, p_2, β) are used to illustrate the applications.
16

Saab, Rabih. "Nonparametric estimation of the mixing distribution in mixed models with random intercepts and slopes." Thesis, 2013. http://hdl.handle.net/1828/4548.

Abstract:
Generalized linear mixed models (GLMMs) are widely used in statistical applications to model count and binary data. We consider the problem of nonparametric likelihood estimation of mixing distributions in GLMMs with multiple random effects. The log-likelihood to be maximized has the general form l(G) = Σ_i log ∫ f(y_i, γ) dG(γ), where f(·, γ) is a parametric family of component densities, y_i is the ith observed response (dependent) variable, and G is a mixing distribution function of the random effects vector γ defined on Ω. The literature presents many algorithms for maximum likelihood estimation (MLE) of G in the univariate random effect case, such as the EM algorithm (Laird, 1978), the intra-simplex direction method, ISDM (Lesperance and Kalbfleisch, 1992), and the vertex exchange method, VEM (Böhning, 1985). In this dissertation, the constrained Newton method (CNM) in Wang (2007), which fits GLMMs with random intercepts only, is extended to fit clustered datasets with multiple random effects. Owing to the general equivalence theorem from the geometry of mixture likelihoods (see Lindsay, 1995), many NPMLE algorithms, including CNM and ISDM, maximize the directional derivative of the log-likelihood to add potential support points to the mixing distribution G. Our method, Direct Search Directional Derivative (DSDD), uses a directional search method to find local maxima of the multi-dimensional directional derivative function. The DSDD's performance is investigated in GLMMs where f is a Bernoulli or Poisson distribution function. The algorithm is also extended to cover GLMMs with zero-inflated data. Goodness-of-fit (GOF) and selection methods for mixed models have been developed in the literature; however, their application in models with nonparametric random effects distributions is vague and ad hoc. Some popular measures, such as the Deviance Information Criterion (DIC), the conditional Akaike Information Criterion (cAIC) and R² statistics, are potentially useful in this context.
Additionally, some cross-validation goodness-of-fit methods popular in Bayesian applications, such as the conditional predictive ordinate (CPO) and numerical posterior predictive checks, can be applied with some minor modifications to suit the non-Bayesian approach.
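For a mixing distribution G with finitely many support points, the integral in l(G) becomes a weighted sum, and the directional derivative toward a candidate point γ takes the standard form D(γ; G) = Σ_i f(y_i, γ) / f(y_i; G) - n from the mixture likelihood literature. The Python sketch below evaluates both in a deliberately simplified setting, a plain Poisson mixture in which γ is the Poisson mean and there are no covariates or multiple random effects. It is not the DSDD algorithm, and the data and function names are hypothetical.

```python
import math

def pois_pmf(y, gamma):
    return math.exp(-gamma) * gamma**y / math.factorial(y)

def mixture_density(y, support, weights):
    """f(y; G) for a discrete G placing weights[j] on support[j]."""
    return sum(w * pois_pmf(y, g) for g, w in zip(support, weights))

def mixture_loglik(ys, support, weights):
    """l(G) = sum_i log f(y_i; G)."""
    return sum(math.log(mixture_density(y, support, weights)) for y in ys)

def directional_derivative(gamma, ys, support, weights):
    """D(gamma; G) = sum_i f(y_i, gamma) / f(y_i; G) - n."""
    ratio_sum = sum(pois_pmf(y, gamma) / mixture_density(y, support, weights)
                    for y in ys)
    return ratio_sum - len(ys)

ys = [0, 0, 0, 5, 6, 7]         # made-up counts with two apparent clusters
support, weights = [3.0], [1.0]  # one-point G at the sample mean

print("l(G):", round(mixture_loglik(ys, support, weights), 3))
for gamma in (0.5, 3.0, 6.0):
    print(gamma, round(directional_derivative(gamma, ys, support, weights), 3))
```

A positive value of D at some γ signals that adding mass there would increase the likelihood, which is the condition that NPMLE algorithms of this family search for when they propose new support points.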