
Dissertations / Theses on the topic 'Count statistics'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Count statistics.'

Next to every source in the list of references there is an 'Add to bibliography' button. Press it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Rota, Bernardo Joao. "Using Finite Mixture of Multivariate Poisson for Detection of Measurement Errors in Count Data." Thesis, Örebro universitet, Handelshögskolan vid Örebro universitet, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-12562.

2

Chanialidis, Charalampos. "Bayesian mixture models for count data." Thesis, University of Glasgow, 2015. http://theses.gla.ac.uk/6371/.

Abstract:
Regression models for count data are usually based on the Poisson distribution. This thesis is concerned with Bayesian inference in more flexible models for count data. Two classes of models and algorithms are presented and studied. The first employs a generalisation of the Poisson distribution called the COM-Poisson distribution, which can represent both overdispersed and underdispersed data. We also propose a density regression technique for count data which, albeit centered around the Poisson distribution, can represent arbitrary discrete distributions. The key contributions of this thesis are MCMC-based methods for posterior inference in these models. One key challenge in COM-Poisson-based models is that the normalisation constant of the COM-Poisson distribution is not known in closed form. We propose two exact MCMC algorithms which address this problem. One is based on the idea of retrospective sampling: we first sample the uniform random variable used to decide on acceptance or rejection of the proposed new state of the unknown parameter, and then evaluate only bounds for the acceptance probability, in the hope that we will not need to know the acceptance probability exactly in order to decide whether to accept or reject the newly proposed value. This strategy rests on an efficient scheme for computing lower and upper bounds for the normalisation constant, and it can be applied to a number of discrete distributions, including the COM-Poisson. The other MCMC algorithm proposed is based on the exchange algorithm, which requires sampling from the COM-Poisson distribution; we describe how this can be done efficiently using rejection sampling. We also present simulation studies which show the advantages of the COM-Poisson regression model over the alternative models commonly used in the literature (Poisson and negative binomial).
Three real-world applications are presented: the number of emergency hospital admissions in Scotland in 2010, the number of papers published by Ph.D. students, and fertility data from the second German Socio-Economic Panel. COM-Poisson distributions are also the cornerstone of the proposed density regression technique based on Dirichlet process mixture models. Density regression can be thought of as a competitor to quantile regression, which estimates the quantiles of the conditional distribution of the response variable given the covariates. This is especially useful when the dispersion changes across the covariates. Instead of estimating only the conditional mean, quantile regression estimates the conditional quantile function across different quantiles; as a result, it models both location and shape shifts of the conditional distribution, allowing a better understanding of how the covariates affect the conditional distribution of the response variable. Almost all quantile regression techniques deal with a continuous response; quantile regression models for count data have so far received little attention. One suggested technique adds uniform random noise ('jittering'), thus overcoming the problem that, for a discrete distribution, the conditional quantile function is not a continuous function of the parameters of interest. Even though this enables us to estimate the conditional quantiles of the response variable, it has disadvantages: for small values of the response variable Y, the added noise can have a large influence on the estimated quantiles, and the problem of 'crossing quantiles' persists. We eliminate these problems by estimating the density of the data rather than the quantiles. Simulation studies show that the proposed approach performs better than the established jittering method.
To illustrate the new method we analyse fertility data from the second German Socio-Economic Panel.
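The normalisation issue described in this abstract can be made concrete with a small sketch: the COM-Poisson pmf is proportional to λ^y / (y!)^ν, and its constant Z(λ, ν) has no closed form, so in practice the series is truncated. The function below is my own illustration of that truncation idea, not the thesis's retrospective-sampling algorithm:

```python
import math

def com_poisson_pmf(y, lam, nu, tol=1e-12, max_terms=1000):
    """COM-Poisson pmf P(Y = y) with the normalisation constant
    Z(lam, nu) = sum_j lam^j / (j!)^nu approximated by truncating
    the series once terms past the mode drop below `tol`."""
    def log_weight(j):
        return j * math.log(lam) - nu * math.lgamma(j + 1)

    z = 0.0
    for j in range(max_terms):
        term = math.exp(log_weight(j))
        z += term
        if j > lam and term < tol:
            break
    return math.exp(log_weight(y)) / z
```

With ν = 1 this reduces to the ordinary Poisson pmf, which gives a quick sanity check on the truncation; ν < 1 yields overdispersion and ν > 1 underdispersion.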
3

Spangler, Ashley. "An Exploration of the First Pitch in Baseball." Bowling Green State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1490300154782369.

4

Michell, Justin Walter. "A review of generalized linear models for count data with emphasis on current geospatial procedures." Thesis, Rhodes University, 2016. http://hdl.handle.net/10962/d1019989.

Abstract:
Analytical problems caused by over-fitting, confounding, and non-independence in the data are a major challenge for variable selection. As more variables are tested against a certain data set, there is a greater risk that some will explain the data merely by chance, but will fail to explain new data. The main aim of this study is to employ a systematic and practicable variable selection process for the spatial analysis and mapping of historical malaria risk in Botswana, using data collected from the MARA (Mapping Malaria Risk in Africa) project and environmental and climatic datasets from various sources. Details of how a spatial database is compiled for a statistical analysis to proceed are provided. The automation of the entire process is also explored. The final Bayesian spatial model, derived from the non-spatial variable selection procedure using Markov chain Monte Carlo simulation, was fitted to the data. Winter temperature had the greatest effect on malaria prevalence in Botswana. Summer rainfall, maximum temperature of the warmest month, annual range of temperature, altitude, and distance to the closest water source were also significantly associated with malaria prevalence in the final spatial model after accounting for spatial correlation. Using this spatial model, malaria prevalence at unobserved locations was predicted, producing a smooth risk map covering Botswana. Automating both the compilation of the spatial database and the variable selection procedure proved challenging and could only be achieved for parts of the process. The non-spatial selection procedure proved practical and was able to identify stable explanatory variables and provide an objective means for selecting one variable over another; ultimately, however, it was not entirely successful, because a unique set of spatial variables could not be selected.
5

Gao, Dexiang. "Analysis of clustered longitudinal count data /." Connect to full text via ProQuest. Limited to UCD Anschutz Medical Campus, 2007.

Abstract:
Thesis (Ph.D. in Analytic Health Sciences, Department of Preventive Medicine and Biometrics) -- University of Colorado Denver, 2007. Typescript. Includes bibliographical references (leaves 75-77). Free to UCD affiliates. Online version available via ProQuest Digital Dissertations.
6

Beeler, Robert A. "How to Count: An Introduction to Combinatorics and Its Applications." Digital Commons @ East Tennessee State University, 2015. https://dc.etsu.edu/etsu_books/179.

Abstract:
Providing a self-contained resource for upper undergraduate courses in combinatorics, this text emphasizes computation, problem solving, and proof technique. In particular, the book places special emphasis on the Principle of Inclusion and Exclusion and the Multiplication Principle. To this end, exercise sets are included at the end of every section, ranging from simple computations (evaluate a formula for a given set of values) to more advanced proofs. The exercises are designed to test students' understanding of new material, while reinforcing a working mastery of the key concepts previously developed in the book. Intuitive descriptions for many abstract techniques are included. Students often struggle with certain topics, such as generating functions, and this intuitive approach is helpful in their understanding. When possible, the book introduces concepts using combinatorial methods (as opposed to induction or algebra) to prove identities. Students are also asked to prove identities using combinatorial methods as part of their exercises. These methods have several advantages over induction or algebra.
7

Roberts, Clint Douglas. "Imputing Missing Values In Time Series Of Count Data Using Hierarchical Models." The Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=osu1211910310.

8

Zheng, Shuo. "New Hierarchical Nonlinear Modeling for Count Data: Estimation and Testing in The Presence of Overdispersion." Diss., Temple University Libraries, 2011. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/132533.

Abstract:
Ph.D., Statistics. In studies of traffic accidents, disease occurrence, mismatches in genetic code, and the impact of pollution on ecological communities, the key observational variable is often a count. For example, daily counts of accidents on a segment of highway will vary with weather and other traffic conditions: highway engineers seek to relate accident counts to these roadway conditions. Formal statistical analysis of this kind of data is critically dependent on using the correct model for the random count data. However, statistical analysis packages use only those random count models that are relatively easy to implement. This thesis shows that applying standard statistical procedures in situations where these limited models are not correct can lead to important errors in statistical inference. A more general and realistic count model is proposed and a statistical analysis is implemented using advanced statistical software. The Poisson log-linear model is a common choice for modeling count data. However, in many applications variability in the observed counts is higher than predicted by the Poisson model; that is, the observed count data are overdispersed relative to the Poisson model. The most common probabilistic model for analyzing overdispersed count data is the Poisson-gamma mixed model, which is identical to the negative binomial model. Since the negative binomial model is a member of the exponential family of distributions, maximum likelihood estimation can be carried out using widely available generalized linear model procedures. Often a more scientifically justifiable model for overdispersion is the Poisson-lognormal model. However, statistical methods based on the Poisson-lognormal model are seldom used in practice because of their computational complexity.
This thesis addresses the following general question: what are the practical implications of using maximum likelihood procedures based on the negative binomial model when the correct distributional model is the Poisson-lognormal? To answer this question we investigate the robustness, bias, and confidence interval coverage of these procedures. A summary conclusion of these extensive studies is that the widely used negative binomial procedure underestimates the variability when the data are from a Poisson-lognormal distribution, leading to hypothesis tests and confidence intervals that are anti-conservative. To set this problem in a classical hypothesis-testing framework, a new hierarchical nonlinear model is developed that includes both the Poisson-lognormal and the Poisson-gamma model within the generalized model's parameter space. Estimation and hypothesis testing can then be carried out using nonlinear mixed procedures available in advanced computational packages. Temple University--Theses.
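The overdispersion at the heart of this comparison follows from the law of total variance, and can be illustrated with a short sketch. The parameter choices and function names below are my own, not the thesis's simulation design:

```python
import math

def pln_moments(mu, sigma):
    """Mean and variance of a Poisson-lognormal: Y | r ~ Poisson(r),
    with log r ~ N(mu, sigma^2). By the law of total variance,
    Var(Y) = E[r] + Var(r), so the counts are always overdispersed."""
    m = math.exp(mu + sigma ** 2 / 2)          # E[r] = E[Y]
    v_r = (math.exp(sigma ** 2) - 1) * m ** 2  # Var(r)
    return m, m + v_r

def nb_variance(mean, size):
    """Variance of a negative binomial (Poisson-gamma) with the same mean."""
    return mean + mean ** 2 / size

mean_y, var_y = pln_moments(1.0, 0.8)
# matching a negative binomial to this mean and variance gives
# size = mean^2 / (var - mean); the two models then agree on two moments
# but still differ in tail behaviour, which is what drives the coverage problems
size = mean_y ** 2 / (var_y - mean_y)
```

The point of the sketch: a negative binomial can always be matched to a Poisson-lognormal's first two moments, so the discrepancies the thesis studies come from higher-order features of the distributions.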
9

Holloway, Jennifer Patricia. "Time series analysis of count data with an application to the incidence of cholera." Master's thesis, University of Cape Town, 2011. http://hdl.handle.net/11427/11088.

Abstract:
Includes bibliographical references (leaves 88-93). This dissertation comprises a study into the application of count data time series models to weekly counts of cholera cases recorded in Beira, Mozambique. The study specifically looks at two classes of time series models for count data, namely observation-driven and parameter-driven, and two models from each class are investigated. The autoregressive conditional Poisson (ACP) and double autoregressive conditional Poisson (DACP) models are considered under the observation-driven class, while the parameter-driven models used are the Poisson-gamma and stochastic autoregressive mean (SAM) models. An in-depth case study of the cholera counts is presented in which the four selected count data time series models are compared. In addition, the time series models are compared to static Poisson and negative binomial regression, thereby indicating the benefits gained from using count data time series models when the counts exhibit serial correlation. In the process of comparing the models, the effects of environmental drivers on the outbreaks of cholera are observed and discussed.
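As a concrete sketch of the observation-driven class mentioned in this abstract, the ACP recursion lets today's Poisson intensity depend on yesterday's count and intensity. The parameter values below are illustrative, not estimates from the cholera data:

```python
import math
import random

def simulate_acp(n, omega, alpha, beta, rng):
    """Simulate an autoregressive conditional Poisson (ACP) series:
    lambda_t = omega + alpha * y_{t-1} + beta * lambda_{t-1},
    y_t ~ Poisson(lambda_t). Requires alpha + beta < 1 for stationarity."""
    def poisson(rate):
        # simple inversion sampler, adequate for modest rates
        u, p, k = rng.random(), math.exp(-rate), 0
        c = p
        while u > c:
            k += 1
            p *= rate / k
            c += p
        return k

    lam = omega / (1 - alpha - beta)  # start at the stationary mean
    series = [poisson(lam)]
    for _ in range(n - 1):
        lam = omega + alpha * series[-1] + beta * lam
        series.append(poisson(lam))
    return series

y = simulate_acp(500, omega=1.0, alpha=0.3, beta=0.4, rng=random.Random(7))
```

Feeding the lagged count back into the intensity is what produces the serial correlation that static Poisson regression cannot capture.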
10

Zhuang, Lili. "Bayesian Dynamical Modeling of Count Data." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1315949027.

11

Haggerty, Kevin Daniel. "Making crime count : a study of the institutional production of criminal justice statistics." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/tape15/PQDD_0003/NQ34528.pdf.

12

Ntushelo, Nombasa Sheroline. "Exploratory and inferential multivariate statistical techniques for multidimensional count and binary data with applications in R." Thesis, Stellenbosch : Stellenbosch University, 2011. http://hdl.handle.net/10019.1/17949.

Abstract:
Thesis (MComm)--Stellenbosch University, 2011. ENGLISH ABSTRACT: The analysis of multidimensional (multivariate) data sets is a very important area of research in applied statistics. Over the decades many techniques have been developed to deal with such datasets. The multivariate techniques that have been developed include inferential analysis, regression analysis, discriminant analysis, cluster analysis, and many more exploratory methods. Most of these methods deal with cases where the data contain numerical variables. However, there are powerful methods in the literature that also deal with multidimensional binary and count data. The primary purpose of this thesis is to discuss the exploratory and inferential techniques that can be used for binary and count data. Chapter 2 gives the details of correspondence analysis and canonical correspondence analysis, methods used to analyze data in contingency tables. Chapter 3 is devoted to cluster analysis; in this chapter we explain four well-known clustering methods and also discuss the distance (dissimilarity) measures available in the literature for binary and count data. Chapter 4 contains an explanation of metric and non-metric multidimensional scaling, methods that can be used to represent binary or count data in a lower-dimensional Euclidean space. Chapter 5 presents a method for inferential analysis called the analysis of distance. This method uses similar reasoning to the analysis of variance, but the inference is based on a pseudo F-statistic, with the p-value obtained using permutations of the data. Chapter 6 contains real-world applications of the above methods to two special data sets, the Biolog data and the Barents Fish data. The secondary purpose of the thesis is to demonstrate how the above techniques can be performed in the software package R. Several R packages and functions are discussed throughout this thesis.
The usage of these functions is also demonstrated with appropriate examples. Attention is also given to the interpretation of the output and graphics. The thesis ends with some general conclusions and ideas for further research.
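The analysis-of-distance idea described in this abstract, a pseudo F-statistic with a permutation p-value, can be sketched in a few lines. This is a simplified toy version of my own; the thesis's exact statistic and the R packages it discusses may differ:

```python
import random

def pseudo_f(d, groups):
    """Simplified pseudo F: ratio of mean squared between-group distance
    to mean squared within-group distance, from a distance matrix d."""
    within = among = n_w = n_a = 0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            if groups[i] == groups[j]:
                within += d[i][j] ** 2
                n_w += 1
            else:
                among += d[i][j] ** 2
                n_a += 1
    return (among / n_a) / (within / n_w)

def permutation_p(d, groups, n_perm=999, seed=0):
    """P-value obtained by permuting the group labels, as in the
    analysis of distance."""
    rng = random.Random(seed)
    observed = pseudo_f(d, groups)
    hits = sum(pseudo_f(d, rng.sample(groups, len(groups))) >= observed
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# toy example: two well-separated groups of points on a line
points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
labels = ['a', 'a', 'a', 'b', 'b', 'b']
d = [[abs(p - q) for q in points] for p in points]
p_value = permutation_p(d, labels)
```

With only six observations there are just 20 distinct labelings, so the smallest achievable p-value is about 0.1; the permutation logic, not the power, is the point of the example.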
13

Nian, Gaowei. "A score test of homogeneity in generalized additive models for zero-inflated count data." Kansas State University, 2014. http://hdl.handle.net/2097/18230.

Abstract:
Master of Science, Department of Statistics. Wei-Wen Hsu. Zero-inflated Poisson (ZIP) models are often used to analyze count data with excess zeros. In the ZIP model, the Poisson mean and the mixing weight are often assumed to depend on covariates through regression techniques. In other words, the effect of covariates on the Poisson mean or the mixing weight is specified using a proper link function coupled with a linear predictor, which is simply a linear combination of unknown regression coefficients and covariates. In practice, however, this predictor may not be linear in the regression parameters but curvilinear or nonlinear. In such situations, a more general and flexible approach should be considered. One popular method in the literature is zero-inflated generalized additive models (ZIGAM), which extend zero-inflated models to incorporate generalized additive models (GAM) and can accommodate a nonlinear predictor in the link function. For ZIGAM, it is also of interest to conduct inference for the mixing weight, particularly evaluating whether the mixing weight equals zero. Many methodologies have been proposed to examine this question, but all of them were developed under classical zero-inflated models rather than ZIGAM. In this report, we propose a generalized score test to evaluate whether the mixing weight is equal to zero under the framework of ZIGAM with a Poisson model. Technically, the proposed score test is developed based on a novel transformation for the mixing weight coupled with proportionality constraints on the ZIGAM, which assume that the smooth components of covariates in both the Poisson mean and the mixing weight have proportional relationships. An intensive simulation study indicates that the proposed score test outperforms the other existing tests when the mixing weight and the Poisson mean truly involve a nonlinear predictor.
The recreational fisheries data from the Marine Recreational Information Program (MRIP) survey conducted by National Oceanic and Atmospheric Administration (NOAA) are used to illustrate the proposed methodology.
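The "excess zeros" that motivate ZIP models are easy to exhibit in a minimal sketch. The parameters here are made up, and the mixing weight is held constant rather than covariate-dependent as in the ZIGAM framework above:

```python
import math
import random

def simulate_zip(n, pi_zero, lam, rng):
    """Zero-inflated Poisson: with probability pi_zero emit a structural
    zero, otherwise draw from Poisson(lam)."""
    def poisson(rate):
        u, p, k = rng.random(), math.exp(-rate), 0
        c = p
        while u > c:
            k += 1
            p *= rate / k
            c += p
        return k
    return [0 if rng.random() < pi_zero else poisson(lam) for _ in range(n)]

rng = random.Random(1)
y = simulate_zip(10000, pi_zero=0.3, lam=2.5, rng=rng)
observed_zero_frac = sum(v == 0 for v in y) / len(y)
poisson_zero_prob = math.exp(-2.5)  # zero probability under a plain Poisson
# expected ZIP zero fraction: 0.3 + 0.7 * exp(-2.5), roughly 0.36
```

Testing whether the mixing weight equals zero, as the proposed score test does, amounts to asking whether anything beyond the plain Poisson component is needed to explain the zeros.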
14

Knapová, Petra. "Výsledky studentů VŠE ve statistice [Results of VŠE Students in Statistics]." Master's thesis, Vysoká škola ekonomická v Praze, 2008. http://www.nusl.cz/ntk/nusl-4228.

Abstract:
The main aim of my thesis is to compare the results of different groups of students who took part in the statistics course at VŠE. The thesis is divided into two parts. The first part focuses on the theoretical approach to comparing the data used. The second part is practical and focuses on comparing results from different angles using descriptive statistics and some statistical tests. The results of these partial comparisons of the data from different points of view are summarized at the end of the thesis.
15

Pihl, Svante, and Leonardo Olivetti. "An Empirical Comparison of Static Count Panel Data Models: the Case of Vehicle Fires in Stockholm County." Thesis, Uppsala universitet, Statistiska institutionen, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-412014.

Abstract:
In this paper we study the occurrences of outdoor vehicle fires recorded by the Swedish Civil Contingencies Agency (MSB) for the period 1998-2019, and build static panel data models to predict future occurrences of fire in Stockholm County. Through comparing the performance of different models, we look at the effect of different distributional assumptions for the dependent variable on predictive performance. Our study concludes that treating the dependent variable as continuous does not hamper performance, with the exception of models meant to predict more uncommon occurrences of fire. Furthermore, we find that assuming that the dependent variable follows a Negative Binomial Distribution, rather than a Poisson Distribution, does not lead to substantial gains in performance, even in cases of overdispersion. Finally, we notice a slight increase in the number of vehicle fires shown in the data, and reflect on whether this could be related to the increased population size.
16

Shobe, Kristin N. "Variable sampling intervals for control charts using count data." Thesis, Virginia Polytechnic Institute and State University, 1988. http://hdl.handle.net/10919/52076.

Abstract:
This thesis examines the use of variable sampling intervals as they apply to control charts that use count data. Papers by Reynolds, Arnold, and R. Amin developed properties for charts with an underlying normal distribution. These properties are extended in this thesis to accommodate an underlying Poisson distribution. Master of Science.
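The variable-sampling-interval idea for a Poisson count chart (a c-chart) can be sketched as follows. The thresholds and interval lengths are illustrative choices of mine, not the optimal values such a thesis would derive:

```python
import math

def c_chart_limits(c_bar):
    """3-sigma limits for a c-chart (Poisson counts); LCL floored at 0."""
    s = math.sqrt(c_bar)
    return max(0.0, c_bar - 3 * s), c_bar + 3 * s

def next_interval(count, c_bar, warning_mult=1.0, short=0.5, long=2.0):
    """Variable sampling interval rule: sample again sooner when the count
    lands outside a central 'warning' band, less often when it is near c_bar."""
    lcl, ucl = c_chart_limits(c_bar)
    s = math.sqrt(c_bar)
    if count < lcl or count > ucl:
        return 0.0           # out-of-control signal: investigate immediately
    if abs(count - c_bar) > warning_mult * s:
        return short         # near a limit: use the short interval
    return long              # near target: use the long interval

# with c_bar = 9 (limits 0 and 18): a count of 9 earns the long interval,
# 14 the short one, and 20 triggers a signal
```

Adapting the interval this way detects shifts faster than a fixed interval at the same average sampling rate, which is the property being extended from the normal to the Poisson case.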
17

Yang, Hui. "Adjusting for Bounding and Time-in-Sample Effects in the National Crime Victimization Survey (NCVS) Property Crime Rate Estimation." The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1452167047.

18

Harper, Gill. "A study of the use of linked, routinely collected, administrative data at the local level to count and profile populations." Thesis, City, University of London, 2017. http://openaccess.city.ac.uk/18244/.

Abstract:
There is increasing evidence that official population statistics are inaccurate at the local authority level, the fundamental administrative unit of the UK. The main source of official population statistics in the UK is the decennial census, last undertaken in 2011. The methodology and results of official population counts have been criticised and described as unfit for purpose. The three main purposes of population statistics are resource allocation, population ratios, and local planning and intelligence. Administrative data are data routinely collected for administrative purposes by organisations, government departments, or companies, not for statistical or research purposes; this is in contrast with surveys, which are designed and carried out as specific information-gathering exercises. This thesis describes a methodology for linking routinely collected administrative data for counting and profiling populations, and for other purposes, at the local level. The benefits of this methodology are that it produces results more quickly than the decennial census, in a format that is more suitable for accurate and detailed analyses. Utilising existing datasets in this way reduces costs and adds value. The need for and the evolution of this innovative methodology are set out, and the success and impact it has had are discussed, including how it has helped shape thinking on statistics in the UK. This research preceded the current paradigm shift in the UK for research and national statistics to move towards the use of linked administrative data. Future censuses after 2021 may no longer be in the traditional survey format, and the Office for National Statistics is exploring a similar administrative data method at the national level as an alternative. The research in this thesis has been part of this evolution and has helped pave the way for it.
19

Kreider, Scott Edwin Douglas. "A case study in handling over-dispersion in nematode count data." Manhattan, Kan. : Kansas State University, 2010. http://hdl.handle.net/2097/4248.

20

Raithel, Seth. "Inferential considerations for low-count RNA-seq transcripts: a case study on an edaphic subspecies of dominant prairie grass Andropogon gerardii." Kansas State University, 2015. http://hdl.handle.net/2097/19712.

Abstract:
Master of Science, Statistics. Nora M. Bello. Big bluestem (Andropogon gerardii) is a wide-ranging dominant prairie grass of ecological and agricultural importance to the US Midwest, while the edaphic subspecies sand bluestem (A. gerardii ssp. Hallii) grows exclusively on sand dunes. Sand bluestem exhibits phenotypic divergence related to epicuticular properties and enhanced drought tolerance relative to big bluestem. Understanding the mechanisms underlying differential drought tolerance is relevant in the face of climate change. For bluestem subspecies, presence or absence of these phenotypes may be associated with RNA transcripts characterized by a low number of read counts. So-called low-count transcripts pose particular inferential challenges and are thus usually filtered out at early steps of data management protocols and ignored in analyses. In this study, we use a plasmode-based approach to assess the relative performance of alternative inferential strategies on RNA-seq transcripts, with special emphasis on low-count transcripts as motivated by differential bluestem phenotypes. Our dataset consists of RNA-seq read counts for 25,582 transcripts (60% of which are classified as low-count) collected from leaf tissue of 4 individual plants of big bluestem and 4 of sand bluestem. We also compare alternative ad-hoc data filtering techniques commonly used in RNA-seq pipelines and assess the performance of recently developed statistical methods for differential expression (DE) analysis, namely DESeq2 and edgeR robust. These methods attempt to overcome the inherently noisy behavior of low-count transcripts by shrinkage or differential weighting of observations, respectively. Our results indicate that proper specification of DE methods can remove the need for ad-hoc data filtering at arbitrary expression thresholds, thus allowing for inference on low-count transcripts.
Practical recommendations for inference are provided for when low-count RNA-seq transcripts are of interest, as in the comparison of subspecies of bluestem grasses. Insights from this study may also be relevant to other applications focused on transcripts with low expression levels.
21

Su, Weizhe. "Bayesian Hidden Markov Model in Multiple Testing on Dependent Count Data." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1613751403094066.

22

Duke, John E. "Land Use and Urbanization Patterns in an Established Enzootic Raccoon Rabies Area." Digital Archive @ GSU, 2012. http://digitalarchive.gsu.edu/iph_theses/205.

Abstract:
We analyzed how land-use patterns and changes in urbanization influence positive raccoon rabies cases in an established enzootic area. County resolution was used, and the study area included all 159 counties in Georgia. We obtained data on raccoons submitted from 2006 through 2010 for testing at the state public health labs following exposure incidents with people or domesticated animals. The land-use patterns were extracted from the US Geological Survey's National Land Cover Database for both 2001 and 2006. Odds ratios were calculated for 16 land-use variables covering natural topography, agricultural development, and urbanization. An additional variable, submissions per population density, was used to normalize counties and to account for the population bias associated with rabies surveillance; the use of this demographic variable was substantiated by GIS clustering analysis. The outcome variable was heavily right-skewed and overdispersed, and therefore negative binomial regression was used to model the counts. The final analysis showed that low-intensity residential development is associated with raccoon rabies cases, while evergreen forest offers protection. This study supports the hypothesis that the raccoon rabies enzootic is maintained in the edge ecosystems of urbanization. It is advocated here that the public health animal rabies database include GPS coordinates when reporting wildlife rabies submissions for testing, to improve resolution when studying the disease ecology of enzootic rabies.
APA, Harvard, Vancouver, ISO, and other styles
23

Rettiganti, Mallikarjuna Rao. "Statistical Models for Count Data from Multiple Sclerosis Clinical Trials and their Applications." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1291180207.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Lenz, Lauren Holt. "Statistical Methods to Account for Gene-Level Covariates in Normalization of High-Dimensional Read-Count Data." DigitalCommons@USU, 2018. https://digitalcommons.usu.edu/etd/7392.

Full text
Abstract:
The goal of genetic-based cancer research is often to identify which genes behave differently in cancerous and healthy tissue. This difference in behavior, referred to as differential expression, may lead researchers to more targeted preventative care and treatment. One way to measure the expression of genes is through a process called RNA-Seq, which takes physical tissue samples and maps gene products and fragments in the sample back to the gene that created them, resulting in a large read-count matrix with genes in the rows and a column for each sample. The read-counts for tumor and normal samples are then compared in a process called differential expression analysis. However, normalization of these read-counts is a necessary pre-processing step, in order to account for differences in the read-count values due to non-expression related variables. It is common in recent RNA-Seq normalization methods to also account for gene-level covariates, namely gene length in base pairs and GC-content, the proportion of bases in the gene that are Guanine and Cytosine. Here a colorectal cancer RNA-Seq read-count data set comprising 30,220 genes and 378 samples is examined. Two of the normalization methods that account for gene length and GC-content, CQN and EDASeq, are extended to account for protein coding status as a third gene-level covariate. The binary nature of protein coding status results in unique computational issues. The normalized read counts from CQN, EDASeq, and four new normalization methods are used for differential expression analysis via the nonparametric Wilcoxon Rank-Sum Test as well as the lme4 pipeline that produces per-gene models based on a negative binomial distribution. The resulting differential expression results are compared for two genes of interest in colorectal cancer, APC and CTNNB1, both of the WNT signaling pathway.
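The Wilcoxon Rank-Sum Test used here for per-gene differential expression can be sketched in plain Python. This is a minimal normal-approximation version without the tie-variance correction, and the read counts below are invented, not taken from the colorectal data set:

```python
import math

def ranks(values):
    """1-based ranks with average ranks assigned to ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def rank_sum_test(x, y):
    """Wilcoxon rank-sum test, normal approximation (no tie correction)."""
    n1, n2 = len(x), len(y)
    r = ranks(list(x) + list(y))
    w = sum(r[:n1])                               # rank sum of sample x
    mu = n1 * (n1 + n2 + 1) / 2                   # null mean of W
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # null SD of W
    z = (w - mu) / sd
    return z, math.erfc(abs(z) / math.sqrt(2))    # two-sided p-value

# Toy normalized read counts for one gene (values invented for illustration)
tumor = [12, 15, 14, 18, 20]
normal = [5, 7, 9, 6, 8]
z, p = rank_sum_test(tumor, normal)
print(round(z, 2), round(p, 3))
```

With every tumor count exceeding every normal count, the statistic is at its maximum for these sample sizes and the test rejects at the 5% level.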
APA, Harvard, Vancouver, ISO, and other styles
25

Li, Xiaobai. "Stochastic models for MRI lesion count sequences from patients with relapsing remitting multiple sclerosis." Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1142907194.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Sonksen, Michael David. "Bayesian Model Diagnostics and Reference Priors for Constrained Rate Models of Count Data." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1312909127.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Bhaktha, Nivedita. "Properties of Hurdle Negative Binomial Models for Zero-Inflated and Overdispersed Count data." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1543573678017356.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Inan, Gul. "A Simulation Study On The Comparison Of Methods For The Analysis Of Longitudinal Count Data." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/2/12610798/index.pdf.

Full text
Abstract:
The longitudinal feature of measurements and the counting process of responses motivate regression models for longitudinal count data (LCD) that take into account phenomena such as within-subject association and overdispersion. One common problem in longitudinal studies is the missing data problem, which adds additional difficulties to the analysis. The missingness can be handled with missing data techniques; however, the amount of missingness in the data and the missingness mechanism affect the performance of those techniques. In this thesis, among the regression models for LCD, the Log-Log-Gamma marginalized multilevel model (Log-Log-Gamma MMM) and the random-intercept model are focused on. The performance of the models is compared via a simulation study under three missing data mechanisms (missing completely at random, missing at random conditional on observed data, and missing not at random), two missingness percentages (10% and 20%), and four missing data techniques (complete case analysis, and subject, occasion and conditional mean imputation). The simulation study shows that while the mean absolute error and mean square error values of the Log-Log-Gamma MMM are larger than those of the random-intercept model, both regression models yield parallel results. The simulation study results confirm that the amount of missingness in the data and the missingness mechanism strongly influence the performance of missing data techniques under both regression models. Furthermore, while occasion mean imputation generally displays the worst performance, conditional mean imputation shows a superior performance over occasion and subject mean imputation and gives results parallel to complete case analysis.
APA, Harvard, Vancouver, ISO, and other styles
29

Landgraf, Andrew J. "Generalized Principal Component Analysis: Dimensionality Reduction through the Projection of Natural Parameters." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1437610558.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Monzón, Montoya Alejandro Guillermo. "Inferencia e diagnostico em modelos para dados de contagem com excesso de zeros." [s.n.], 2009. http://repositorio.unicamp.br/jspui/handle/REPOSIP/306677.

Full text
Abstract:
Advisor: Victor Hugo Lachos Davila. Master's dissertation, Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica. When analyzing count data, a higher frequency of zeros than expected under a given distribution is sometimes observed, so that the usual regression models cannot be applied. This feature may also produce overdispersion in the data set. In this work, four types of models for zero-inflated count data are presented: the zero-inflated binomial (ZIB), the zero-inflated Poisson (ZIP), the zero-inflated negative binomial (ZINB) and the zero-inflated beta-binomial (ZIBB) regression models. We use the EM algorithm to obtain maximum likelihood estimates of the parameters of the proposed models, and by using the complete-data likelihood function we develop local influence measures following the approach of Zhu and Lee (2001) and Lee and Xu (2004). We also discuss the calculation of residuals for the ZIB and ZIP regression models with the aim of identifying atypical observations and/or model misspecification. Finally, results obtained for two real data sets are reported, illustrating the usefulness of the proposed methodology.
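The EM algorithm for a zero-inflated Poisson model can be sketched as follows. This is a simplification of the ZIP regression setting in the dissertation: there are no covariates, only a mixing proportion and a Poisson mean, and the values used to simulate the data are arbitrary illustrative choices:

```python
import math
import random

def rpois(lam, rng):
    """Poisson sampler (Knuth's multiplication method)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def zip_em(y, tol=1e-8, max_iter=1000):
    """EM for a zero-inflated Poisson: with prob. pi a structural zero,
    otherwise Poisson(lam). Returns (pi_hat, lam_hat)."""
    n = len(y)
    n_pos = sum(1 for v in y if v > 0)
    pi, lam = 0.5, sum(y) / n_pos              # crude starting values
    for _ in range(max_iter):
        # E-step: posterior prob. that each observed zero is structural
        p0 = math.exp(-lam)
        z = [pi / (pi + (1 - pi) * p0) if v == 0 else 0.0 for v in y]
        # M-step: update mixing weight and Poisson mean
        pi_new = sum(z) / n
        lam_new = sum(y) / (n - sum(z))
        if abs(pi_new - pi) + abs(lam_new - lam) < tol:
            pi, lam = pi_new, lam_new
            break
        pi, lam = pi_new, lam_new
    return pi, lam

rng = random.Random(42)
# Simulated counts: 30% structural zeros, Poisson(4) otherwise (illustrative values)
y = [0 if rng.random() < 0.3 else rpois(4.0, rng) for _ in range(5000)]
pi_hat, lam_hat = zip_em(y)
print(round(pi_hat, 2), round(lam_hat, 2))
```

The E-step only needs to reweight the zeros, since a positive count can never be a structural zero.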
APA, Harvard, Vancouver, ISO, and other styles
31

Chegancas, Rito Tiago Miguel. "Modelling and comparing protein interaction networks using subgraph counts." Thesis, University of Oxford, 2012. http://ora.ox.ac.uk/objects/uuid:dcc0eb0d-1dd8-428d-b2ec-447a806d6aa8.

Full text
Abstract:
The astonishing progress of molecular biology, engineering and computer science has resulted in mature technologies capable of examining multiple cellular components at a genome-wide scale. Protein-protein interactions are one example of such growing data. These data are often organised as networks with proteins as nodes and interactions as edges. Albeit still incomplete, there is now a substantial amount of data available and there is a need for biologically meaningful methods to analyse and interpret these interactions. In this thesis we focus on how to compare protein interaction networks (PINs) and on the relationship between network architecture and the biological characteristics of proteins. The underlying theme throughout the dissertation is the use of small subgraphs – small interaction patterns between 2-5 proteins. We start by examining two popular scores that are used to compare PINs and network models. When comparing networks of the same model type we find that the typical scores are highly unstable and depend on the number of nodes and edges in the networks. This is unsatisfactory and we propose a method based on non-parametric statistics to make more meaningful comparisons. We also employ principal component analysis to judge model fit according to subgraph counts. From these analyses we show that no current model fits to the PINs; this may well reflect our lack of knowledge on the evolution of protein interactions. Thus, we use explanatory variables such as protein age and protein structural class to find patterns in the interactions and subgraphs we observe. We discover that the yeast PIN is highly heterogeneous and therefore no single model is likely to fit the network. Instead, we focus on ego-networks containing an initial protein plus its interacting partners and their interaction partners. In the final chapter we propose a new, alignment-free method for network comparison based on such ego-networks.
The method compares subgraph counts in neighbourhoods within PINs in an averaging, many-to-many fashion. It clusters networks of the same model type and is able to successfully reconstruct species phylogenies solely based on PIN data providing exciting new directions for future research.
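A minimal example of the subgraph-count idea, restricted to the simplest nontrivial subgraph (the triangle) rather than the 2-5 node patterns used in the thesis; the toy edge list is invented:

```python
def triangle_count(edges):
    """Count triangles in an undirected graph via neighbor-set intersection."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is seen once from each of its three edges
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

# Toy 'protein interaction network': a clique on {a, b, c, d} plus a pendant e
edges = [("a", "b"), ("a", "c"), ("a", "d"),
         ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
print(triangle_count(edges))  # 4
```

Counts like this, computed per network (or per ego-network) and then compared, are the raw material of the comparison methods the abstract describes.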
APA, Harvard, Vancouver, ISO, and other styles
32

Bel, Julien. "Les moments cumulant d'ordre supérieur à deux points des champs cosmologiques : propriétés théoriques et applications." Thesis, Aix-Marseille, 2012. http://www.theses.fr/2012AIXM4715/document.

Full text
Abstract:
The philosophy of this thesis is that our best chance of finding and characterizing the essential ingredients of a well-grounded cosmological model lies in enlarging the arsenal of methods with which we can hunt for new physics. While it is of paramount importance to continue to refine, de-bias and empower the very same testing strategies that contributed to establish the concordance model, it is also crucial to challenge, with new methods, all the sectors of the current cosmological paradigm. This thesis, therefore, engages in the challenge of developing new and performant cosmic probes that aim at optimizing the scientific output of future large redshift surveys. The goal is twofold. From the theoretical side, I aim at developing new testing strategies that are minimally (if not at all) affected by astrophysical uncertainties or by not fully motivated phenomenological models. This will make cosmological interpretations easier and safer. From the observational side, the goal is to gauge the performance of the proposed strategies using current, state-of-the-art redshift data, and to demonstrate their potential for future large cosmological missions such as BigBOSS and EUCLID.
APA, Harvard, Vancouver, ISO, and other styles
33

Ionnidis, G. "Statistics of YSO jets in the galactic plane from UWISH2." Thesis, University of Kent, 2013. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.653050.

Full text
Abstract:
In order to study jets and outflows from Young Stellar Objects (YSOs), I performed an unbiased search of a continuous 33 square degree region in Serpens and Aquila using data taken from the UWISH2 survey, which uses the 1-0 S(1) emission line of H2 as a tracer. I identified 130 molecular hydrogen outflows from YSOs, of which 120 (92%) are new discoveries. Distances were measured by foreground star counts with an accuracy of 25%. Outflows were found in groups of 3-5 members with a size of about 5 pc. Groups were separated by about half a degree on the sky. About half of the objects were assigned potential source candidates. Brighter MHOs had a higher probability of having a source candidate assigned to them. I find an overabundance of outflows with position angles between 130° and 150°, which is almost perpendicular to the Galactic Plane. The fraction of parsec-scale outflows is about 25%, more than twice that found in Orion A by Stanke et al. (2002) and Davis et al. (2009). The outflows are not able to provide a sufficient fraction of energy and momentum to support the turbulence levels in their surrounding molecular clouds. The typical dynamical jet age was of the order of 10^4 yr, while groups of emission knots are ejected every 10^3 yr. This indicates that low-level accretion rate fluctuations, and not FU Ori type events, are responsible for the episodic ejection of material. The luminosity distribution of the outflows shows a power law behaviour with N ∝ L_H2^-0.9. The Milky Way star formation rate was estimated at more than 1.6 ± 0.4 M⊙ yr^-1. The spectral index classification distribution of YSOs indicated that the number of outflows increases in line with α values and has a similar distribution to the one from Davis et al. (2009) for Orion A.
APA, Harvard, Vancouver, ISO, and other styles
34

Ingersoll, Thomas Eugene. "Statistical modeling with counts of bats." UNIVERSITY OF CALIFORNIA, BERKELEY, 2011. http://pqdtopen.proquest.com/#viewpdf?dispub=3448989.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

何志興 and Chi-hing Ho. "The statistical analysis of multivariate counts." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1991. http://hub.hku.hk/bib/B31232218.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Ho, Chi-hing. "The statistical analysis of multivariate counts /." [Hong Kong] : University of Hong Kong, 1991. http://sunzi.lib.hku.hk/hkuto/record.jsp?B12922602.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Santos, Cláudia Susana Pereira dos. "Statistical analysis of count time series with periodic structure." Doctoral thesis, Universidade de Aveiro, 2017. http://hdl.handle.net/10773/21519.

Full text
Abstract:
Doctoral programme in Mathematics. Multivariate INteger-valued AutoRegressive (MINAR) processes play a central role in the statistical analysis of integer-valued time series. Within the reasonably large spectrum of MINAR models proposed in the literature, however, only a few focus on the analysis of time series of count data with periodic structure. The analysis of multivariate counting processes presents many challenging problems ranging from model specification to parameter estimation. This thesis aims at giving a contribution in this direction. Specifically, the purpose of this research is two-fold: first, we introduce the periodic multivariate process of order one (PMINAR(1) for short). The probabilistic and statistical properties of the model are studied in detail. To overcome the computational difficulties arising from the use of the maximum likelihood method we introduce a composite likelihood-based approach. The performance of the proposed method and of competing estimation methods is compared through a simulation study. Forecasting is also addressed. An application to a real data set related to the analysis of fire activity is presented. Secondly, we propose two INAR models (univariate and bivariate) with periodic structure, S-PINAR(1) and BS-PINAR(1), respectively. Both models are based on the signed thinning operator, allowing for positive and negative counts. We examine the basic probabilistic and statistical properties of the periodic models. Innovations are modeled by univariate and bivariate Skellam distributions, respectively. To study the performance of the conditional least squares and conditional maximum likelihood estimators, a simulation study is conducted for the S-PINAR(1) model.
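The thinning operation underlying INAR models can be illustrated with a plain (non-periodic, non-signed) Poisson INAR(1) simulation, a simplification of the S-PINAR(1)/PMINAR(1) models studied in the thesis; the parameter values are arbitrary:

```python
import math
import random

def rpois(lam, rng):
    """Poisson sampler (Knuth's multiplication method)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def thin(x, alpha, rng):
    """Binomial thinning alpha ∘ x: each of the x units survives with prob. alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def simulate_inar1(n, alpha, lam, rng):
    """INAR(1): X_t = alpha ∘ X_{t-1} + eps_t with Poisson(lam) innovations."""
    x = [rpois(lam, rng)]
    for _ in range(n - 1):
        x.append(thin(x[-1], alpha, rng) + rpois(lam, rng))
    return x

rng = random.Random(7)
series = simulate_inar1(20000, 0.5, 2.0, rng)
mean = sum(series) / len(series)
print(round(mean, 2))  # close to the stationary mean lam / (1 - alpha) = 4
```

Thinning keeps the process integer-valued, which is what distinguishes INAR recursions from the ordinary AR(1) with a multiplicative coefficient.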
APA, Harvard, Vancouver, ISO, and other styles
38

Yang, Ming. "Statistical models for count time series with excess zeros." Diss., University of Iowa, 2012. https://ir.uiowa.edu/etd/3019.

Full text
Abstract:
Time series data involving counts are frequently encountered in many biomedical and public health applications. For example, in disease surveillance, the occurrence of rare infections over time is often monitored by public health officials, and the time series data collected can be used for the purpose of monitoring changes in disease activity. For rare diseases with low infection rates, the observed counts typically contain a high frequency of zeros (zero-inflated), but the counts can also be very large during an outbreak period. Failure to account for zero-inflation in the data may result in misleading inference and the detection of spurious associations. In this thesis, we develop two classes of statistical models for zero-inflated time series. The first part of the thesis introduces a class of observation-driven models in a partial likelihood framework. The expectation-maximization (EM) algorithm is applied to obtain the maximum partial likelihood estimator (MPLE). We establish the asymptotic theory of the MPLE under certain regularity conditions. The performances of different partial-likelihood based model selection criteria are compared under model misspecification. In the second part of the thesis, we introduce a class of parameter-driven models in a state-space framework. To estimate the model parameters, we devise a Monte Carlo EM algorithm, where particle filtering and particle smoothing methods are employed to approximate the high-dimensional integrals in the E-step of the algorithm. Upon convergence, Louis' formula is used to find the observed information matrix. The proposed models are illustrated with simulated data and an application based on public health surveillance for syphilis, a sexually transmitted disease (STD) that remains a major public health challenge in the United States. An R package, called ZIM (Zero-Inflated Models), has been developed to fit both observation-driven models and parameter-driven models.
APA, Harvard, Vancouver, ISO, and other styles
39

Osuna, Echavarría Leyre Estíbaliz. "Semiparametric Bayesian Count Data Models." Diss., lmu, 2004. http://nbn-resolving.de/urn:nbn:de:bvb:19-25573.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Hofmann, Mathias. "Statistical Models for Infectious Disease Surveillance Counts." Diss., lmu, 2007. http://nbn-resolving.de/urn:nbn:de:bvb:19-66012.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Keita, Abdoulaye. "The relative ecological effectiveness and economic efficiency of four wastewater treatment plants in East Central Indiana." Virtual Press, 2000. http://liblink.bsu.edu/uhtbin/catkey/1177978.

Full text
Abstract:
The study was conducted to investigate the ecological effectiveness and economic efficiency of four wastewater treatment plants in East Central Indiana (Muncie, Anderson, Alexandria, and Paws). Data were collected from the four plants, then analyzed descriptively and statistically, and compared in terms of ecological effectiveness and economic efficiency. The Muncie, Anderson, and Paws wastewater treatment facilities were not significantly different from one another in terms of biochemical oxygen demand (BOD5) reductions, but each reduced BOD5 more than the Alexandria facility over the three-year period (1996, 1997, and 1998). Plants were not statistically different regarding suspended solids (SS) reductions. The Muncie, Anderson, and Paws wastewater treatment plants were also not significantly different from one another on ammonia reduction, but each plant reduced ammonia significantly more than Alexandria. Muncie and Anderson were not different from each other on dissolved oxygen (DO) levels, but each had a statistically higher level of DO in the final effluent than Alexandria and Paws. The study showed a statistically significant difference in fecal coliform bacteria abatement between Anderson and Alexandria, Anderson and Paws, and Muncie and Alexandria. Furthermore, Muncie, Anderson and Alexandria were different in terms of cost per 1000 gallons of wastewater treated. Muncie has been treating wastewater at a lower cost than the other treatment plants, whereas Anderson had a higher cost over the three-year period. Department of Natural Resources and Environmental Management.
APA, Harvard, Vancouver, ISO, and other styles
42

Bonafede, Elisabetta <1987&gt. "Differential expression analysis for sequence count data via mixtures of negative binomials." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amsdottorato.unibo.it/6741/.

Full text
Abstract:
The recent advent of next-generation sequencing technologies has revolutionized the way of analyzing the genome. This innovation makes it possible to obtain deeper information at a lower cost and in less time, and provides data that are discrete measurements. One of the most important applications of these data is differential analysis, that is, investigating whether a gene exhibits a different expression level in correspondence of two (or more) biological conditions (such as disease states, treatments received and so on). As for the statistical analysis, the final aim is statistical testing, and for modeling these data the negative binomial distribution is considered the most adequate, especially because it allows for overdispersion. However, the estimation of the dispersion parameter is a very delicate issue because little information is usually available for estimating it. Many strategies have been proposed, but they often result in procedures based on plug-in estimates, and in this thesis we show that this discrepancy between the estimation and the testing framework can lead to uncontrolled type I errors. We propose a mixture model that allows each gene to share information with other genes that exhibit similar variability. Afterwards, three consistent statistical tests are developed for differential expression analysis. We show that the proposed method improves the sensitivity of detecting differentially expressed genes with respect to common procedures, since it is the best at reaching the nominal value for the type I error while keeping elevated power. The method is finally illustrated on prostate cancer RNA-seq data.
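As a toy illustration of the quantity whose estimation the abstract calls delicate, here is a naive method-of-moments estimate of the negative binomial size (dispersion) parameter from simulated gamma-Poisson counts. This is not the mixture method proposed in the thesis, only a sketch under invented parameter values:

```python
import math
import random

def rpois(lam, rng):
    """Poisson sampler (Knuth's multiplication method)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def nb_sample(r, mu, rng):
    """Negative binomial via its gamma-Poisson mixture representation."""
    return rpois(rng.gammavariate(r, mu / r), rng)

def mom_size(y):
    """Method-of-moments size: Var = mu + mu^2/r  =>  r = mu^2 / (Var - mu)."""
    n = len(y)
    mu = sum(y) / n
    var = sum((v - mu) ** 2 for v in y) / (n - 1)
    return mu ** 2 / (var - mu)

rng = random.Random(11)
y = [nb_sample(2.0, 10.0, rng) for _ in range(20000)]
r_hat = mom_size(y)
print(round(r_hat, 2))  # should be near the true size r = 2
```

In real RNA-seq settings only a handful of replicates per gene are available, so a per-gene estimate like this is extremely noisy, which is precisely why information-sharing approaches such as the proposed mixture model are needed.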
APA, Harvard, Vancouver, ISO, and other styles
43

Love, Michael I. [Verfasser]. "Statistical analysis of high-throughput sequencing count data / Michael I. Love." Berlin : Freie Universität Berlin, 2013. http://d-nb.info/1043197842/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Duan, Yuanyuan. "Statistical Predictions Based on Accelerated Degradation Data and Spatial Count Data." Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/56616.

Full text
Abstract:
This dissertation aims to develop methods for statistical predictions based on various types of data from different areas. We focus on applications from reliability and spatial epidemiology. Chapter 1 gives a general introduction of statistical predictions. Chapters 2 and 3 investigate the photodegradation of an organic coating, which is mainly caused by ultraviolet (UV) radiation but also affected by environmental factors, including temperature and humidity. In Chapter 2, we identify a physically motivated nonlinear mixed-effects model, including the effects of environmental variables, to describe the degradation path. Unit-to-unit variabilities are modeled as random effects. The maximum likelihood approach is used to estimate parameters based on the accelerated test data from laboratory. The developed model is then extended to allow for time-varying covariates and is used to predict outdoor degradation where the explanatory variables are time-varying. Chapter 3 introduces a class of models for analyzing degradation data with dynamic covariate information. We use a general path model with random effects to describe the degradation paths and a vector time series model to describe the covariate process. Shape restricted splines are used to estimate the effects of dynamic covariates on the degradation process. The unknown parameters of these models are estimated by using the maximum likelihood method. Algorithms for computing the estimated lifetime distribution are also described. The proposed methods are applied to predict the photodegradation path of an organic coating in a complicated dynamic environment. Chapter 4 investigates the Lyme disease emergency in Virginia at census tract level. Based on areal (census tract level) count data of Lyme disease cases in Virginia from 1998 to 2011, we analyze the spatial patterns of the disease using statistical smoothing techniques. 
We also use the space and space-time scan statistics to reveal the presence of clusters in the spatial and spatial/temporal distribution of Lyme disease. Chapter 5 builds a predictive model for Lyme disease based on historical data and environmental/demographical information for each census tract. We propose a Divide-Recombine method to take advantage of parallel computing. We compare prediction results through simulation studies, which show our method can provide comparable fitting and prediction accuracy while achieving much greater computational efficiency. We also apply the proposed method to analyze Virginia Lyme disease spatio-temporal data. Our method makes large-scale spatio-temporal predictions possible. Chapter 6 gives a general review of the contributions of this dissertation, and discusses directions for future research. Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
45

Boose, Lynn Allen. "A Study of Differences between Social/HMO and Other Medicare Beneficiaries Enrolled in Kaiser Permanente under Capitation Contracts Regarding Intermediate Care Facility Use Rates and Expenditures." PDXScholar, 1993. https://pdxscholar.library.pdx.edu/open_access_etds/1135.

Full text
Abstract:
The Social/HMO Demonstration evaluates the feasibility of expanding Medicare Supplemental Insurance benefits to cover a limited amount of ICF and community based long-term care (LTC) services provided under a comprehensive HMO benefit package for capitated Medicare beneficiaries. The policy research question addressed by this study is whether adding an Expanded Care Benefit (ECB) to the capitated HMO benefit package offered by Kaiser Permanente (KP) changes utilization patterns and costs of ICF services, and the probability of becoming Medicaid eligible. This study provides descriptive information regarding this policy research question. The research goal of this study is to measure the extent to which collective ICF use rates and expenditure patterns for S/HMO members are consistently the same, greater or less than baseline data of Risk HMO Medicare members who do not have the S/HMO ECB. The purpose of such measurement is to determine if an empirical basis exists for postulating an ICF utilization and expenditures outcome effect which is influenced by the S/HMO ECB. Utilization and financial data are collected from all SNF and ICF level nursing homes in Multnomah County for all Medicare beneficiaries enrolled in KP between June 1, 1986 and July 31, 1988. Eligibility data are assembled on all Medicare beneficiaries enrolled in KP during the same time period who were residents of Multnomah County. Nursing home use rates and rates for related expenditures are determined for all nursing home residents (1,331) by their eligibility status in KP during the time of each nursing home stay. Days in an ICF are censored by transfers between Cost, Risk and S/HMO enrollment status. Rates are standardized by the age and gender distribution of research population members (19,261) to adjust use rates for differences in age cohort distribution of Risk members and S/HMO members. Risk rates and S/HMO rates are compared and differences in utilization and expenditures are evaluated.
Conclusions about such patterns are used to formulate hypotheses for testing and confirming the descriptive observations. Findings show that overall S/HMO member rates are less than Risk member rates for five of the six Research Questions addressed in this study. Specifically, the probability of admission to an ICF is substantially greater for S/HMO members than for Risk members. However, S/HMO members remained in ICFs fewer days than Risk members over the two-year study period, as measured by age-adjusted rates for ICF days per member-year of eligibility. The difference in mean ICF length of stay between Risk and S/HMO members is statistically significant. The rate of total payments received by nursing homes for S/HMO ICF residents per 1,000 S/HMO members was substantially less than that for Risk members. The rate of spend-down to welfare status was substantially lower for S/HMO members than for Risk members who became ICF residents. Higher proportions of S/HMO members than of Risk members were discharged from ICFs to home, which is consistent with S/HMO Expanded Care Benefit objectives.
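The direct age- and gender-standardization of use rates described in this abstract can be sketched as follows. The strata, event counts and person-years below are invented for illustration and are not the study's data:

```python
import numpy as np

# Direct standardization: weight each stratum's crude rate by the
# standard population's share of that stratum. Strata represent
# hypothetical (age band x gender) cells, not the study's actual cells.
events     = np.array([4, 11, 9, 21])            # ICF admissions per stratum
person_yrs = np.array([800, 950, 600, 700])      # member-years of eligibility
std_pop    = np.array([5000, 6000, 4000, 4200])  # reference population sizes

crude = events / person_yrs                # stratum-specific crude rates
weights = std_pop / std_pop.sum()          # standard-population shares
standardized_rate = float(np.sum(weights * crude))
print(round(standardized_rate * 1000, 2))  # rate per 1,000 member-years
```

Standardizing both the Risk and S/HMO rates to the same reference distribution removes the effect of the two groups' differing age cohort composition before the rates are compared.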
APA, Harvard, Vancouver, ISO, and other styles
46

Jawa, Taghreed Mohammed. "Statistical methods of detecting change points for the trend of count data." Thesis, University of Strathclyde, 2017. http://digitool.lib.strath.ac.uk:80/R/?func=dbin-jump-full&object_id=28854.

Full text
Abstract:
In epidemiology, controlling infection is a crucial element. Since healthcare associated infections (HAIs) are correlated with increased costs and mortality rates, effective healthcare interventions are required. Several healthcare interventions have been implemented in Scotland, and subsequently Health Protection Scotland (HPS) reported a reduction in HAIs [HPS (2015b, 2016a)]. The aim of this thesis is to use statistical methods and change point analysis to detect the time when the rate of HAIs changed and to determine which associated interventions may have impacted such rates. Change points are estimated from polynomial generalized linear models (GLM), and confidence intervals are constructed using bootstrap and delta methods; the two techniques are compared. Segmented regression is also used to look for change points at times when specific interventions took place. A generalization of segmented regression, known as joinpoint analysis, looks for potential change points at each time point in the data, allowing the change to have occurred at any point over time. The joinpoint model is adjusted by adding a seasonal effect to account for additional variability in the rates. Confidence intervals for joinpoints are constructed using bootstrap and profile likelihood methods, and the two approaches are compared. Change points from the smooth trend of the generalized additive model (GAM) are also estimated, and bootstrapping is used to construct confidence intervals. All methods were found to give similar change points. Segmented regression detects the actual point when an intervention took place. Polynomial GLM, spline GAM and joinpoint analysis models are useful when the impact of an intervention occurs after a period of time. Simulation studies are used to compare polynomial GLM, segmented regression and joinpoint analysis models for detecting change points along with their confidence intervals.
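The joinpoint idea described in this abstract, searching every candidate time point for a change in trend, can be sketched as a least-squares grid search over broken-stick fits. The series and change point below are simulated, not the thesis's HAI data:

```python
import numpy as np

# Joinpoint-style grid search: fit a segmented (broken-stick) linear trend
# for every candidate change point and keep the one minimizing squared error.
rng = np.random.default_rng(1)
t = np.arange(60.0)
true_cp = 30                                # the slope changes at t = 30
rate = 20.0 + 0.5 * t - 0.9 * np.clip(t - true_cp, 0, None)
y = rate + rng.normal(0.0, 1.0, t.size)     # noisy observed series

def sse_at(k):
    # Design matrix: intercept, linear trend, and a hinge term at k.
    X = np.column_stack([np.ones_like(t), t, np.clip(t - k, 0, None)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

best = min(range(5, 55), key=sse_at)        # estimated joinpoint
print(best)                                 # should land near 30
```

In practice the thesis fits count data with Poisson-type GLMs rather than ordinary least squares, but the grid-search logic over candidate joinpoints is the same.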
APA, Harvard, Vancouver, ISO, and other styles
47

Madsen, Christopher. "Clustering of the Stockholm County housing market." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-252301.

Full text
Abstract:
In this thesis a clustering of the Stockholm County housing market has been performed using different clustering methods. Data have been derived and different geographical constraints have been used. DeSO areas (Demographic statistical areas), developed by SCB, have been used to divide the housing market into smaller regions for which the derived variables have been calculated. Hierarchical clustering methods, SKATER and Gaussian mixture models have been applied. Methods using different kinds of geographical constraints have also been applied in an attempt to create more geographically contiguous clusters. The different methods are then compared with respect to performance and stability. The best-performing method is the Gaussian mixture model EII, also known as the K-means algorithm. The most stable method when applied to bootstrapped samples is the ClustGeo method.
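The equivalence noted in this abstract, that the EII Gaussian mixture model (spherical, equal-volume components) reduces to K-means, can be illustrated with a minimal Lloyd's-algorithm sketch on toy two-dimensional data, not the DeSO housing variables:

```python
import numpy as np

# Lloyd's algorithm for K-means: alternate assigning points to the nearest
# centre and recomputing each centre as its cluster mean. Toy data: three
# well-separated Gaussian blobs in the plane.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2))
               for loc in ([0, 0], [4, 0], [0, 4])])

def kmeans(X, k, iters=50):
    centers = X[:: len(X) // k][:k].astype(float)  # simple deterministic init
    for _ in range(iters):
        # Assignment step: distance from every point to every centre.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each centre becomes the mean of its assigned points.
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

labels, centers = kmeans(X, 3)
print(np.sort(centers[:, 0]))  # centre x-coordinates, approximately 0, 0, 4
```

The EII mixture model additionally yields soft (probabilistic) assignments, but with equal spherical covariances its hard-assignment limit is exactly this algorithm.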
APA, Harvard, Vancouver, ISO, and other styles
48

Pirozhkova, Daria. "Statistical models for an MTPL portfolio." Master's thesis, Vysoká škola ekonomická v Praze, 2017. http://www.nusl.cz/ntk/nusl-359373.

Full text
Abstract:
In this thesis, we consider several statistical techniques applicable to claim frequency models of a motor third-party liability (MTPL) portfolio, with a focus on overdispersion. The practical part of the work focuses on the application and comparison of the models on real data represented by an MTPL portfolio. The comparison is presented through the results of goodness-of-fit measures. Furthermore, the predictive power of selected models is tested on the given dataset using a simulation method. Hence, this thesis provides a combination of the analysis of goodness-of-fit results and the predictive power of the models.
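A minimal sketch of the kind of overdispersion comparison this abstract describes: simulate overdispersed claim counts, then compare Poisson and negative binomial fits by AIC. The data are simulated, not the MTPL portfolio, and the moment estimator for the NB size parameter is a simplification of full maximum likelihood:

```python
import numpy as np
from scipy import stats

# Simulate overdispersed claim counts (negative binomial, mean 0.8).
rng = np.random.default_rng(7)
counts = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + 0.8), size=5000)

mu, var = counts.mean(), counts.var()
print(round(var / mu, 2))          # dispersion ratio; > 1 signals overdispersion

# Poisson fit (MLE of the mean) vs. method-of-moments negative binomial.
ll_pois = stats.poisson.logpmf(counts, mu).sum()
r = mu**2 / (var - mu)             # moment estimate of the NB size parameter
ll_nb = stats.nbinom.logpmf(counts, r, r / (r + mu)).sum()

aic_pois = 2 * 1 - 2 * ll_pois     # one parameter
aic_nb = 2 * 2 - 2 * ll_nb         # two parameters
print(aic_nb < aic_pois)           # NB wins under overdispersion
```

Under overdispersion the Poisson model understates the variance, so even with its extra parameter the negative binomial achieves the lower AIC, which is the pattern goodness-of-fit comparisons of this kind are designed to reveal.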
APA, Harvard, Vancouver, ISO, and other styles
49

Wright, Joshua P. "Geospatial and Negative Binomial Regression Analysis of Culex nigripalpus, Culex erraticus, Coquillettidia perturbans, and Aedes vexans Counts and Precipitation and Land use Land cover Covariates in Polk County, Florida." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6983.

Full text
Abstract:
Although mosquito monitoring systems in the form of dry-ice-baited CDC light traps and sentinel chickens are used by mosquito control personnel in Polk County, Florida, the placement of these is random and does not necessarily reflect prevalent areas of vector mosquito populations. This can result in significant health, economic, and social impacts during disease outbreaks. Of these vector mosquitoes, Culex nigripalpus, Culex erraticus, Coquillettidia perturbans, and Aedes vexans are present in Polk County and known to transmit multiple diseases, posing a public health concern. This study seeks to evaluate the effect of unique Land use Land cover (LULC) features and precipitation on the spatial and temporal distribution of Cx. nigripalpus, Cx. erraticus, Cq. perturbans, and Ae. vexans in Polk County, Florida, during 2013 and 2014, using negative binomial regression on count data from eight environmentally unique light traps retrieved from Polk County Mosquito Control. The negative binomial regression revealed a statistical association among mosquito species for precipitation and LULC features during the two-year study period, with precipitation proving to be the most significant factor in mosquito count numbers. The findings from this study can aid in more precise targeting of mosquito species, saving time and resources for already stressed public health services.
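A negative binomial regression of counts on a single covariate, of the kind this abstract describes, can be sketched by maximizing the NB log-likelihood under a log link. The "precipitation" covariate and coefficients below are simulated for illustration, not the Polk County trap data:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

# Simulate counts whose log-mean rises with a precipitation covariate.
rng = np.random.default_rng(3)
precip = rng.uniform(0, 5, size=2000)
mu = np.exp(0.5 + 0.4 * precip)            # true intercept 0.5, slope 0.4
r_true = 3.0                               # NB size (overdispersion) parameter
counts = rng.negative_binomial(r_true, r_true / (r_true + mu))

def negloglik(params):
    # NB2 log-likelihood with log link; log_r keeps the size parameter positive.
    b0, b1, log_r = params
    m = np.exp(b0 + b1 * precip)
    r = np.exp(log_r)
    return -np.sum(gammaln(counts + r) - gammaln(r) - gammaln(counts + 1)
                   + r * np.log(r / (r + m)) + counts * np.log(m / (r + m)))

fit = minimize(negloglik, x0=[0.0, 0.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 2000})
b0_hat, b1_hat, _ = fit.x
print(round(b1_hat, 2))                    # close to the true slope 0.4
```

Extending the design matrix with LULC indicator columns, as the study does, changes only the linear predictor inside `negloglik`.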
APA, Harvard, Vancouver, ISO, and other styles
50

Puertas, Monica A. "Statistical and Prognostic Modeling of Clinical Outcomes with Complex Physiologic Data." Scholar Commons, 2014. https://scholarcommons.usf.edu/etd/5106.

Full text
Abstract:
Laboratory tests are a primary resource for diagnosing patient diseases. However, physicians often make decisions based on a single laboratory result and have a limited perspective of the role of commonly-measured parameters in enhancing the diagnostic process. By providing a dynamic patient profile, the diagnosis could be more accurate and timely, allowing physicians to anticipate changes in the recovery trajectory and intervene more effectively. The assessment and monitoring of the circulatory system is essential for patients in intensive care units (ICU). One component of this system is the platelet count, which is used in assessing blood clotting. However, platelet counts represent a dynamic equilibrium of many simultaneous processes, including altered capillary permeability, inflammatory cascades (sepsis), and the coagulation process. To characterize the value of dynamic changes in platelet count, analytical methods are applied to datasets of critically-ill patients in (1) a homogeneous population of ICU cardiac surgery patients and (2) a heterogeneous group of ICU patients with different conditions and several hospital admissions. The objective of this study was to develop a methodology to anticipate adverse events using metrics that capture dynamic changes of platelet counts in a homogeneous population, then redefine the methodology for a more heterogeneous and complex dataset. The methodology was extended to analyze other important physiological parameters of the circulatory system (i.e., calcium, albumin, anion gap, and total carbon dioxide). Finally, the methodology was applied to simultaneously analyze some parameters enhancing the predictive power of various models. This methodology assesses dynamic changes of clinical parameters for a heterogeneous population of ICU patients, defining rates of change determined by multiple point regression and by the simpler fixed time parameter value ratios at specific time intervals. 
Both metrics provide prognostic information, differentiating survivors from non-survivors, and have proved more predictive than complex metrics and risk assessment scores of greater dimensionality. The goal was to determine a minimal set of biomarkers that would better assist care providers in assessing the risk of complications, allowing them to alter the management of patients. These metrics should be simple, and their implementation would be feasible in any environment and under uncertain conditions regarding the specific diagnosis and the onset of the acute event that causes a patient's admission to the ICU. The results provide evidence of the different behaviors of physiologic parameters during the recovery processes of survivors and non-survivors. These differences were observed during the first 8 to 10 days after a patient's admission to the ICU. The application of the presented methodology could enhance physicians' ability to diagnose more accurately, anticipate changes in recovery trajectories, and prescribe effective treatment, leading to more personalized care and reduced mortality rates.
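The two kinds of dynamic-change metrics mentioned, a rate of change from a multi-point regression and a fixed-time value ratio, can be sketched on a hypothetical platelet series (invented values, not patient data):

```python
import numpy as np

# Hypothetical platelet counts (10^3/uL), one value per ICU day.
days = np.array([0, 1, 2, 3, 4, 5], dtype=float)
platelets = np.array([210.0, 180.0, 150.0, 130.0, 120.0, 115.0])

# Metric 1: rate of change as the least-squares slope over a multi-day
# window (here days 0-3), in counts per day.
slope = np.polyfit(days[:4], platelets[:4], deg=1)[0]

# Metric 2: fixed-time ratio, e.g. the day-3 value over the admission value.
ratio = platelets[3] / platelets[0]

print(round(slope, 1), round(ratio, 2))  # falling counts: slope < 0, ratio < 1
```

A steeply negative slope or a ratio well below 1 early in the stay is the kind of simple signal the thesis evaluates for separating survivors from non-survivors.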
APA, Harvard, Vancouver, ISO, and other styles