
Dissertations / Theses on the topic 'Sample size estimation'

Consult the top 50 dissertations / theses for your research on the topic 'Sample size estimation.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Denne, Jonathan S. "Sequential procedures for sample size estimation." Thesis, University of Bath, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.320460.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Che, Huiwen. "Cutoff sample size estimation for survival data: a simulation study." Thesis, Uppsala universitet, Statistiska institutionen, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-234982.

Full text
Abstract:
This thesis demonstrates, using a practical cancer case, the possible cutoff sample size point that balances goodness of estimation against study expenditure. As it is crucial to determine the sample size when designing an experiment, researchers attempt to find a sample size that achieves the desired power and budget efficiency at the same time. The thesis shows how simulation can be used for sample size and precision calculations with survival data. The presentation concentrates on the simulation involved in carrying out the estimates and precision calculations. The Kaplan-Meier estimator and the Cox regression coefficient are chosen as point estimators, and the precision measurements focus on the mean square error and the standard error.
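As an illustration of the kind of simulation the abstract describes, the sketch below tracks how the precision of a survival-effect estimate changes with sample size. It is a minimal sketch, not the thesis's code: it assumes two-arm exponential survival with random censoring and uses the closed-form exponential log hazard-ratio estimate (events divided by total follow-up per arm) in place of a full Cox or Kaplan-Meier analysis; all rates and the hazard ratio are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_log_hr(n_per_arm, lam_ctrl=0.10, hr=0.70, cens_rate=0.05, reps=2000):
    """Empirical SE and MSE of the exponential log hazard-ratio estimate."""
    true_log_hr = np.log(hr)
    estimates = np.empty(reps)
    for r in range(reps):
        t_ctrl = rng.exponential(1 / lam_ctrl, n_per_arm)
        t_trt = rng.exponential(1 / (lam_ctrl * hr), n_per_arm)
        c_ctrl = rng.exponential(1 / cens_rate, n_per_arm)
        c_trt = rng.exponential(1 / cens_rate, n_per_arm)
        obs_c, ev_c = np.minimum(t_ctrl, c_ctrl), t_ctrl <= c_ctrl
        obs_t, ev_t = np.minimum(t_trt, c_trt), t_trt <= c_trt
        # exponential MLE of each hazard: events / total follow-up time
        estimates[r] = np.log((ev_t.sum() / obs_t.sum()) / (ev_c.sum() / obs_c.sum()))
    se = estimates.std(ddof=1)
    mse = np.mean((estimates - true_log_hr) ** 2)
    return se, mse

for n in (25, 50, 100, 200, 400):
    se, mse = simulate_log_hr(n)
    print(f"n/arm={n:4d}  SE={se:.3f}  MSE={mse:.4f}")
```

Plotting SE or MSE against n and looking for the point where further gains flatten out is one simple way to read off a "cutoff" sample size of the kind the thesis discusses.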
APA, Harvard, Vancouver, ISO, and other styles
3

Banton, Dwaine Stephen. "A BAYESIAN DECISION THEORETIC APPROACH TO FIXED SAMPLE SIZE DETERMINATION AND BLINDED SAMPLE SIZE RE-ESTIMATION FOR HYPOTHESIS TESTING." Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/369007.

Full text
Abstract:
Statistics, Ph.D. This thesis considers two related problems that have application in the field of experimental design for clinical trials: (1) fixed sample size determination for parallel-arm, double-blind survival data analysis to test the hypothesis of no difference in survival functions, and (2) blinded sample size re-estimation for the same. For the first problem of fixed sample size determination, a method is developed generally for hypothesis testing and then applied particularly to survival analysis; for the second problem of blinded sample size re-estimation, a method is developed specifically for survival analysis. In both problems, the exponential survival model is assumed. The approach we propose for sample size determination is Bayesian decision theoretic, using explicitly a loss function and a prior distribution. The loss function used is the intrinsic discrepancy loss function introduced by Bernardo and Rueda (2002) and further expounded upon in Bernardo (2011). We use a conjugate prior, and investigate the sensitivity of the calculated sample sizes to the specification of the hyper-parameters. For the second problem of blinded sample size re-estimation, we use prior predictive distributions to facilitate calculation of the interim test statistic in a blinded manner while controlling the Type I error. The determination of the test statistic in a blinded manner continues to be a nettling problem for researchers. The first problem is typical of traditional experimental designs, while the second problem extends into the realm of adaptive designs. To the best of our knowledge, the approaches we suggest for both problems are novel and extend the current research on both topics. The advantages of our approach, as we see it, are the unity and coherence of the statistical procedures, the systematic and methodical incorporation of prior knowledge, and ease of calculation and interpretation. Temple University--Theses.
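The loss function named in the abstract, the intrinsic discrepancy of Bernardo and Rueda (2002), is the smaller of the two directed Kullback-Leibler divergences between the competing models. The snippet below is a minimal sketch of that quantity for the exponential survival model the thesis assumes; it is only the loss function, not the full decision-theoretic sample size calculation, and the rate values are illustrative.

```python
import numpy as np

def kl_exponential(lam_p, lam_q):
    """KL divergence KL(Exp(lam_p) || Exp(lam_q)) between exponential rate parameters."""
    return np.log(lam_p / lam_q) + lam_q / lam_p - 1.0

def intrinsic_discrepancy(lam1, lam2):
    """Bernardo-Rueda intrinsic discrepancy: the smaller of the two directed KL divergences."""
    return min(kl_exponential(lam1, lam2), kl_exponential(lam2, lam1))

# loss incurred if rates 1.0 and 1.5 (per unit time) are confused with one another
print(round(intrinsic_discrepancy(1.0, 1.5), 4))
```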
APA, Harvard, Vancouver, ISO, and other styles
4

Serrano, Daniel Curran Patrick J. "Error of estimation and sample size in the linear mixed model." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2008. http://dc.lib.unc.edu/u?/etd,1653.

Full text
Abstract:
Thesis (M.A.)--University of North Carolina at Chapel Hill, 2008. Title from electronic title page (viewed Sep. 16, 2008). "... in partial fulfillment of the requirements for the degree of Master of Arts in the Department of Psychology." Discipline: Psychology; Department/School: Psychology.
APA, Harvard, Vancouver, ISO, and other styles
5

Song, Juhee. "Bootstrapping in a high dimensional but very low sample size problem." Texas A&M University, 2003. http://hdl.handle.net/1969.1/3853.

Full text
Abstract:
High Dimension, Low Sample Size (HDLSS) problems have received much attention recently in many areas of science. Analysis of microarray experiments is one such area. Numerous studies are ongoing to investigate the behavior of genes by measuring the abundance of mRNA (messenger RiboNucleic Acid), i.e., gene expression. HDLSS data investigated in this dissertation consist of a large number of data sets, each of which has only a few observations. We assume a statistical model in which measurements from the same subject have the same expected value and variance. All subjects have the same distribution up to location and scale. Information from all subjects is shared in estimating this common distribution. Our interest is in testing the hypothesis that the mean of measurements from a given subject is 0. Commonly used tests of this hypothesis, the t-test, sign test and traditional bootstrapping, do not necessarily provide reliable results since there are only a few observations for each data set. We motivate a mixture model having C clusters and 3C parameters to overcome the small sample size problem. Standardized data are pooled after assigning each data set to one of the mixture components. To get reasonable initial parameter estimates when density estimation methods are applied, we apply clustering methods including agglomerative and K-means. The Bayes Information Criterion (BIC) and a new criterion, WMCV (Weighted Mean of within Cluster Variance estimates), are used to choose an optimal number of clusters. Density estimation methods, including a maximum likelihood unimodal density estimator and kernel density estimation, are used to estimate the unknown density. Once the density is estimated, a bootstrapping algorithm that selects samples from the estimated density is used to approximate the distribution of test statistics. The t-statistic and an empirical likelihood ratio statistic are used, since their distributions are completely determined by the distribution common to all subjects. A method to control the false discovery rate is used to perform simultaneous tests on all small data sets. Simulated data sets and a set of cDNA (complementary DeoxyriboNucleic Acid) microarray experiment data are analyzed by the proposed methods.
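The core device in this abstract, resampling from an estimated common density to approximate the null distribution of a test statistic for very small data sets, can be illustrated with a short sketch. The version below uses a kernel density estimate of pooled standardised data and the one-sample t-statistic; it omits the clustering, the unimodal MLE density option, and the FDR step, and all data are synthetic stand-ins.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
pooled = rng.standard_t(df=5, size=500)        # stand-in for pooled standardised data
pooled -= pooled.mean()                        # centre so the null (mean 0) holds
kde = stats.gaussian_kde(pooled)               # estimate the common density

def bootstrap_t_null(kde, n_obs, reps=5000):
    """Null distribution of the one-sample t-statistic for tiny samples, obtained by
    drawing n_obs values at a time from the estimated common density."""
    draws = kde.resample(reps * n_obs).reshape(reps, n_obs)
    return draws.mean(axis=1) / (draws.std(axis=1, ddof=1) / np.sqrt(n_obs))

null_t = bootstrap_t_null(kde, n_obs=4)
observed = np.array([0.8, 1.1, 0.4, 1.3])      # one small data set with only 4 observations
t_obs = observed.mean() / (observed.std(ddof=1) / np.sqrt(len(observed)))
p_value = np.mean(np.abs(null_t) >= abs(t_obs))
print(round(t_obs, 2), round(p_value, 4))
```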
APA, Harvard, Vancouver, ISO, and other styles
6

Knowlton, Nicholas Scott. "Robust estimation of inter-chip variability to improve microarray sample size calculations." Oklahoma City : [s.n.], 2005.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
7

Ntambwe, Lupetu Ives. "Sequential sample size re-estimation in clinical trials with multiple co-primary endpoints." Thesis, University of Warwick, 2014. http://wrap.warwick.ac.uk/66339/.

Full text
Abstract:
In this thesis, we consider interim sample size adjustment in clinical trials with multiple co-primary continuous endpoints. We aim to answer two questions: first, how to adjust the sample size in a clinical trial with multiple continuous co-primary endpoints using adaptive and group sequential designs; second, how to construct a test that controls the family-wise type I error rate and maintains the power even if the correlation ρ between endpoints is not known. To answer the first question, we conduct K different interim tests, each for one endpoint and each at level α/K (i.e. a Bonferroni adjustment). To answer the second question, we either perform a sample size re-estimation, in which the results of the interim analysis are used to estimate one or more nuisance parameters and this information is used to determine the sample size for the rest of the trial, possibly via an inverse normal combination test type approach; or we conduct a group sequential test in which we monitor the information, with the information adjusted to allow the correlation ρ to be estimated at each stage, again possibly via an inverse normal combination test type approach. We show that both methods control the family-wise type I error α and maintain the power, and that the group sequential methodology seems to be more powerful, although this depends on the spending function.
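The Bonferroni device mentioned in the abstract (each of K endpoints tested at level α/K) controls the family-wise type I error rate whatever the correlation ρ between endpoints. The simulation sketch below checks that numerically for correlated normal endpoints under the global null; the design values (K, ρ, n) are illustrative and the code is not taken from the thesis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def fwer_bonferroni(K=3, rho=0.5, n=100, alpha=0.05, reps=20000):
    """Family-wise type I error when each of K correlated endpoints is tested at alpha/K."""
    cov = np.full((K, K), rho) + (1 - rho) * np.eye(K)
    crit = stats.norm.ppf(1 - alpha / (2 * K))      # two-sided per-endpoint critical value
    rejections = 0
    for _ in range(reps):
        # two-sample z-statistics under the global null, one per endpoint
        x = rng.multivariate_normal(np.zeros(K), cov, size=n).mean(axis=0)
        y = rng.multivariate_normal(np.zeros(K), cov, size=n).mean(axis=0)
        z = (x - y) / np.sqrt(2.0 / n)
        rejections += np.any(np.abs(z) > crit)
    return rejections / reps

print(fwer_bonferroni())   # stays at or below 0.05 regardless of rho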
APA, Harvard, Vancouver, ISO, and other styles
8

Oymak, Okan. "Sample size determination for estimation of sensor detection probabilities based on a test variable." Thesis, Monterey, Calif. : Naval Postgraduate School, 2007. http://bosun.nps.edu/uhtbin/hyperion-image.exe/07Jun%5FOymak.pdf.

Full text
Abstract:
Thesis (M.S. in Operations Research)--Naval Postgraduate School, June 2007. Thesis Advisor(s): Lyn R. Whitaker. "June 2007." Includes bibliographical references (p. 95-96). Also available in print.
APA, Harvard, Vancouver, ISO, and other styles
9

Cong, Danni. "The effect of sample size re-estimation on type I error rates when comparing two binomial proportions." Kansas State University, 2016. http://hdl.handle.net/2097/34504.

Full text
Abstract:
Master of Science, Department of Statistics, Christopher I. Vahl. Estimation of sample size is an important and critical procedure in the design of clinical trials. A trial with an inadequate sample size may not produce a statistically significant result. On the other hand, having an unnecessarily large sample size will increase the expenditure of resources and may cause a potential ethical problem due to the exposure of an unnecessary number of human subjects to an inferior treatment. A poor estimate of the necessary sample size is often due to the limited information available at the planning stage. Hence, adjusting the sample size mid-trial has recently become a popular strategy. In this work, we introduce two methods for sample size re-estimation for trials with a binary endpoint utilizing the interim information collected from the trial: a blinded method and a partially unblinded method. The blinded method recalculates the sample size based on the first stage's overall event proportion, while the partially unblinded method performs the calculation based only on the control event proportion from the first stage. We performed simulation studies with different combinations of expected proportions based on fixed ratios of response rates. In this study, equal sample size per group was considered. The study shows that for both methods, the type I error rates were preserved satisfactorily.
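The two re-estimation rules the abstract contrasts can be sketched as follows. This is an illustrative reconstruction, not the thesis's code: it uses the standard normal-approximation sample size formula for two proportions, assumes equal allocation, and backs out the control rate from the pooled interim rate using the assumed rate ratio in the blinded case; function names and the interim data are hypothetical.

```python
import numpy as np
from scipy import stats

def n_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided test of two independent proportions."""
    za, zb = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
    return int(np.ceil((za + zb) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2))

def reestimate(stage1_ctrl, stage1_trt, ratio, blinded=True):
    """Recompute per-group n after stage 1; `ratio` is the assumed treatment/control rate ratio."""
    if blinded:
        # blinded: only the overall event proportion is available
        pooled = (stage1_ctrl.sum() + stage1_trt.sum()) / (len(stage1_ctrl) + len(stage1_trt))
        p_ctrl = 2 * pooled / (1 + ratio)      # back out the control rate from the pooled rate
    else:
        p_ctrl = stage1_ctrl.mean()            # partially unblinded: control data only
    return n_two_proportions(p_ctrl, ratio * p_ctrl)

rng = np.random.default_rng(3)
ctrl, trt = rng.binomial(1, 0.30, 60), rng.binomial(1, 0.45, 60)
print(reestimate(ctrl, trt, ratio=1.5, blinded=True),
      reestimate(ctrl, trt, ratio=1.5, blinded=False))
```

Wrapping this in a simulation loop and recording how often the final test rejects under equal true proportions is essentially the type I error study the abstract reports.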
APA, Harvard, Vancouver, ISO, and other styles
10

Zhao, Songnian. "The impact of sample size re-estimation on the type I error rate in the analysis of a continuous end-point." Kansas State University, 2017. http://hdl.handle.net/2097/35326.

Full text
Abstract:
Master of Science, Department of Statistics, Christopher Vahl. Sample size estimation is generally based on assumptions made during the planning stage of a clinical trial. Often, there is limited information available to estimate the initial sample size. This may result in a poor estimate. For instance, an insufficient sample size may not have the capability to produce statistically significant results, while an over-sized study will lead to a waste of resources or even ethical issues in that too many patients are exposed to potentially ineffective treatments. Therefore, an interim analysis in the middle of a trial may be worthwhile to assure that the significance level is at the nominal level and/or the power is adequate to detect a meaningful treatment difference. In this report, the impact of sample size re-estimation on the type I error rate for a continuous end-point in a clinical trial with two treatments is evaluated through a simulation study. Two sample size estimation methods are taken into consideration: blinded and partially unblinded. For the blinded method, all collected data for the two groups are used to estimate the variance, while only data from the control group are used to re-estimate the sample size for the partially unblinded method. The simulation study is designed with different combinations of assumed variance, assumed difference in treatment means, and re-estimation methods. The end-point is assumed to follow a normal distribution and the variances for both groups are assumed to be identical. In addition, equal sample size is required for each group. According to the simulation results, the type I error rates are preserved for all settings.
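For a continuous end-point the two variance sources the abstract describes differ only in which data feed the variance estimate. The sketch below is a minimal illustration under the usual two-sample normal approximation; the interim data, the clinically meaningful difference delta, and the function names are assumptions for the example, not values from the report.

```python
import numpy as np
from scipy import stats

def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Per-group n for a two-sample comparison of means (normal approximation)."""
    za, zb = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
    return int(np.ceil(2 * (sigma * (za + zb) / delta) ** 2))

rng = np.random.default_rng(11)
interim_ctrl = rng.normal(0.0, 1.2, 40)
interim_trt = rng.normal(0.5, 1.2, 40)
delta = 0.5                                          # assumed meaningful difference

sigma_blinded = np.concatenate([interim_ctrl, interim_trt]).std(ddof=1)  # all data, groups hidden
sigma_unblinded = interim_ctrl.std(ddof=1)                               # control arm only
print(n_per_group(sigma_blinded, delta), n_per_group(sigma_unblinded, delta))
```

Note that the blinded lumped variance tends to overstate the within-group variance when a treatment effect is present, which is one reason the impact on the type I error rate is worth checking by simulation.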
APA, Harvard, Vancouver, ISO, and other styles
11

Rahme, Elham H. "Sample size determination for prevalence estimation in the absence of a gold standard diagnostic test." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/tape16/PQDD_0010/NQ30366.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Rahme, Elham H. "Sample size determination for prevalence estimation in the absence of a gold standard diagnostic test." Thesis, McGill University, 1996. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=34434.

Full text
Abstract:
A common problem in medical research is the estimation of the prevalence of a disease in a given population. This is usually accomplished by applying a diagnostic test to a sample of subjects from the target population. In this thesis, we investigate the sample size requirements for the accurate estimation of disease prevalence for such experiments. When a gold standard diagnostic test is available, estimating the prevalence of a disease can be viewed as a problem in estimating a binomial proportion. In this case, we discuss some anomalies in the classical sample size criteria for binomial parameter estimation. These are especially important with small sample sizes. When a gold standard test is not available, one must take into account misclassification errors in order to avoid misleading results. When the sensitivity and the specificity of the diagnostic test are both known, a new adjustment to the maximum likelihood estimator of the prevalence is suggested, and confidence intervals and sample size estimates that arise from this estimator are given. A Bayesian approach is taken when the sensitivity and specificity of the diagnostic test are not exactly known. Here, a method to determine the sample size needed to satisfy a Bayesian sample size criterion that averages over the preposterior marginal distribution of the data is provided. Exact methods are given in some cases, and a sampling importance resampling algorithm is used for more complex situations. A main conclusion is that the degree to which the properties of a diagnostic test are known can have a very large effect on the sample size requirements.
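When sensitivity and specificity are known, the classical misclassification-corrected prevalence estimate (the Rogan-Gladen estimator) adjusts the apparent test-positive rate as sketched below. This is shown only as background for the adjustment problem the abstract describes; the thesis proposes its own new adjustment, which this snippet does not reproduce, and the numbers are illustrative.

```python
import numpy as np

def corrected_prevalence(test_positive_rate, sensitivity, specificity):
    """Classical misclassification-corrected (Rogan-Gladen) prevalence estimate."""
    raw = (test_positive_rate + specificity - 1.0) / (sensitivity + specificity - 1.0)
    return float(np.clip(raw, 0.0, 1.0))   # truncate to the admissible range [0, 1]

# 120 positives among 400 tested, with a test of known sensitivity and specificity
print(round(corrected_prevalence(120 / 400, sensitivity=0.90, specificity=0.85), 4))
```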
APA, Harvard, Vancouver, ISO, and other styles
13

Ahlers, Zachary. "Estimating the necessary sample size for a binomial proportion confidence interval with low success probabilities." Kansas State University, 2017. http://hdl.handle.net/2097/35762.

Full text
Abstract:
Master of Science, Department of Statistics, Christopher Vahl. Among the most used statistical concepts and techniques, seen even in the most cursory of introductory courses, are the confidence interval, binomial distribution, and sample size estimation. This paper investigates a particular case of generating a confidence interval from a binomial experiment in the case where zero successes are expected. Several current methods of generating a binomial proportion confidence interval are examined by means of large-scale simulations and compared in order to determine an ad-hoc method for generating a confidence interval with coverage as close as possible to nominal while minimizing width. This is then used to construct a formula which allows for the estimation of a sample size necessary to obtain a sufficiently narrow confidence interval (with some predetermined probability of success) using the ad-hoc method given a prior estimate of the probability of success for a single trial. With this formula, binomial experiments could potentially be planned more efficiently, allowing researchers to plan only for the amount of precision they deem necessary, rather than trying to work with methods of producing confidence intervals that result in inefficient or, at worst, meaningless bounds.
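A simple version of the planning problem described here, finding the sample size that makes the interval sufficiently narrow given a prior guess of a small success probability, can be sketched with the Clopper-Pearson interval. This is not the thesis's ad-hoc interval or formula, just an illustration of the search; the target width, the prior guess, and the step size are assumptions.

```python
import numpy as np
from scipy import stats

def clopper_pearson_width(x, n, alpha=0.05):
    """Width of the exact (Clopper-Pearson) interval for x successes in n trials."""
    lo = 0.0 if x == 0 else stats.beta.ppf(alpha / 2, x, n - x + 1)
    hi = 1.0 if x == n else stats.beta.ppf(1 - alpha / 2, x + 1, n - x)
    return hi - lo

def n_for_expected_width(p_guess, target_width, alpha=0.05, n_max=200000, step=50):
    """Smallest n (on a grid) whose expected interval width under p_guess meets the target."""
    for n in range(step, n_max, step):
        xs = np.arange(n + 1)
        probs = stats.binom.pmf(xs, n, p_guess)
        keep = np.flatnonzero(probs > 1e-9)          # only the bulk of the distribution matters
        exp_width = sum(probs[k] * clopper_pearson_width(int(k), n, alpha) for k in keep)
        if exp_width <= target_width:
            return n
    return None

print(n_for_expected_width(p_guess=0.01, target_width=0.01))
```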
APA, Harvard, Vancouver, ISO, and other styles
14

Prohorenko, Didrik. "A forecasting approach to estimating cartel damages : The importance of considering estimation uncertainty." Thesis, Södertörns högskola, Nationalekonomi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:sh:diva-41021.

Full text
Abstract:
In this study, I consider the performance of simple forecast models frequently applied in counterfactual analysis when the information at hand is limited. Furthermore, I discuss the robustness of the standard t-test commonly used to statistically detect cartels. I empirically verify that the standard t-statistic encompasses parameter estimation uncertainty when one of the time series in a two-sided t-test has been estimated. Thereafter, I compare the results with those from a recently proposed corrected t-test in which the uncertainty has been accounted for. The results from the study show that a simple OLS model can be used to detect a cartel and to compute a counterfactual price when data is limited, at least as long as the price overcharge inflicted by the cartel members is relatively large. Yet the level of accuracy may vary, and at the point where the data used for estimating the model become relatively limited, the model predictions tend to be inaccurate.
APA, Harvard, Vancouver, ISO, and other styles
15

Gui, Jiang. "Regularized estimation in the high-dimension and low-sample size settings, with applications to genomic data /." For electronic version search Digital dissertations database. Restricted to UC campuses. Access is free to UC campus dissertations, 2005. http://uclibs.org/PID/11984.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Petrie, John Eric. "The Accuracy of River Bed Sediment Samples." Thesis, Virginia Tech, 1998. http://hdl.handle.net/10919/30957.

Full text
Abstract:
One of the most important factors that influences a stream's hydraulic and ecological health is the streambed's sediment size distribution. This distribution affects streambed stability, sediment transport rates, and flood levels by defining the roughness of the stream channel. Adverse effects on water quality and wildlife can be expected when excessive fine sediments enter a stream. Many chemicals and toxic materials are transported through streams by binding to fine sediments. Increases in fine sediments also seriously impact the survival of fish species present in the stream. Fine sediments fill tiny spaces between larger particles thereby denying fish embryos the necessary fresh water to survive. Reforestation, constructed wetlands, and slope stabilization are a few management practices typically utilized to reduce the amount of sediment entering a stream. To effectively gauge the success of these techniques, the sediment size distribution of the stream must be monitored. Gravel bed streams are typically stratified vertically, in terms of particle size, in three layers, with each layer having its own distinct grain size distribution. The top two layers of the stream bed, the pavement and subpavement, are the most significant in determining the characteristics of the stream. These top two layers are only as thick as the largest particle size contained within each layer. This vertical stratification by particle size makes it difficult to characterize the grain size distribution of the surface layer. The traditional bulk or volume sampling procedure removes a specified volume of material from the stream bed. However, if the bed exhibits vertical stratification, the volume sample will mix different populations, resulting in inaccurate sample results. To obtain accurate results for the pavement size distribution, a surface oriented sampling technique must be employed. The most common types of surface oriented sampling are grid and areal sampling. Due to limitations in the sampling techniques, grid samples typically truncate the sample at the finer grain sizes, while areal samples typically truncate the sample at the coarser grain sizes. When combined with an analysis technique, either frequency-by-number or frequency-by-weight, the sample results can be represented in terms of a cumulative grain size distribution. However, the results of different sampling and analysis procedures can lead to biased results, which are not equivalent to traditional volume sampling results. Different conversions, dependent on both the sampling and analysis technique, are employed to remove the bias from surface sample results. The topic of the present study is to determine the accuracy of sediment samples obtained by the different sampling techniques. Knowing the accuracy of a sample is imperative if the sample results are to be meaningful. Different methods are discussed for placing confidence intervals on grid sample results based on statistical distributions. The binomial distribution and its approximation with the normal distribution have been suggested for these confidence intervals in previous studies. In this study, the use of the multinomial distribution for these confidence intervals is also explored. The multinomial distribution seems to best represent the grid sampling process. Based on analyses of the different distributions, recommendations are made. Additionally, figures are given to estimate the grid sample size necessary to achieve a required accuracy for each distribution. 
This type of sample size determination figure is extremely useful when preparing for grid sampling in the field. Accuracy and sample size determination for areal and volume samples present difficulties not encountered with grid sampling. The variability in number of particles contained in the sample coupled with the wide range of particle sizes present make direct statistical analysis impossible. Limited studies have been reported on the necessary volume to sample for gravel deposits. The majority of these studies make recommendations based on empirical results that may not be applicable to different size distributions. Even fewer studies have been published that address the issue of areal sample size. However, using grid sample results as a basis, a technique is presented to estimate the necessary sizes for areal and volume samples. These areal and volume sample sizes are designed to match the accuracy of the original grid sample for a specified grain size percentile of interest. Obtaining grid and areal results with the same accuracy can be useful when considering hybrid samples. A hybrid sample represents a combination of grid and areal sample results that give a final grain size distribution curve that is not truncated. Laboratory experiments were performed on synthetic stream beds to test these theories. The synthetic stream beds were created using both glass beads and natural sediments. Reducing sampling errors and obtaining accurate samples in the field are also briefly discussed. Additionally, recommendations are also made for using the most efficient sampling technique to achieve the required accuracy. (Master of Science)
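Because each grid point is effectively a Bernoulli trial (is the particle at that point finer than a chosen size or not), a binomial-based grid sample size can be read off from the usual proportion formula. The sketch below shows that simplest normal-approximation version only; the thesis's multinomial-based intervals and its sample size figures are more refined, and the target half-width here is an assumption.

```python
import math
from scipy import stats

def grid_sample_size(p, half_width, alpha=0.05):
    """Grid count needed so the normal-approximation binomial CI for the proportion of
    particles finer than a chosen size has the requested half-width."""
    z = stats.norm.ppf(1 - alpha / 2)
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)

# e.g. pinning down the proportion near the median grain size (p ~ 0.5) to within +/- 5 points
print(grid_sample_size(0.5, 0.05))   # about 385 grid points
```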
APA, Harvard, Vancouver, ISO, and other styles
17

Taylor, Veronica Nell. "The use of three partial areas for establishing bioequivalence and estimation of sample size for equivalence studies /." Search for this dissertation online, 2003. http://wwwlib.umi.com/cr/ksu/main.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Signorini, David F. "Practical aspects of kernel smoothing for binary regression and density estimation." Thesis, n.p, 1998. http://oro.open.ac.uk/19923/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Senteney, Michael H. "A Monte Carlo Study to Determine Sample Size for Multiple Comparison Procedures in ANOVA." Ohio University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou160433478343909.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Lartigue, Thomas. "Mixtures of Gaussian Graphical Models with Constraints Gaussian Graphical Model exploration and selection in high dimension low sample size setting." Thesis, Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAX034.

Full text
Abstract:
Describing the co-variations between several observed random variables is a delicate problem. Dependency networks are popular tools that depict the relations between variables through the presence or absence of edges between the nodes of a graph. In particular, conditional correlation graphs are used to represent the "direct" correlations between nodes of the graph. They are often studied under the Gaussian assumption and consequently referred to as "Gaussian Graphical Models" (GGM). A single network can be used to represent the overall tendencies identified within a data sample. However, when the observed data is sampled from a heterogeneous population, then there exist different sub-populations that all need to be described through their own graphs. What is more, if the sub-population (or "class") labels are not available, unsupervised approaches must be implemented in order to correctly identify the classes and describe each of them with its own graph. In this work, we tackle the fairly new problem of Hierarchical GGM estimation for unlabelled heterogeneous populations.
We explore several key axes to improve the estimation of the model parameters as well as the unsupervised identification of the sub-populations. Our goal is to ensure that the inferred conditional correlation graphs are as relevant and interpretable as possible. First, in the simple, homogeneous population case, we develop a composite method that combines the strengths of the two main state-of-the-art paradigms to correct their weaknesses. For the unlabelled heterogeneous case, we propose to estimate a Mixture of GGM with an Expectation Maximisation (EM) algorithm. In order to improve the solutions of this EM algorithm, and avoid falling for sub-optimal local extrema in high dimension, we introduce a tempered version of this EM algorithm, that we study theoretically and empirically. Finally, we improve the clustering of the EM by taking into consideration the effect of external co-features on the position in space of the observed data.
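The tempering idea mentioned in the abstract can be illustrated on the E-step of a plain Gaussian mixture: raising the component posteriors to a power 1/T flattens the responsibilities so early iterations are less likely to lock onto a poor local optimum. This is only a generic sketch of that tempering mechanism, assuming a standard Gaussian mixture; it does not implement the thesis's hierarchical GGM estimation, its sparsity constraints, or its temperature schedule.

```python
import numpy as np
from scipy.stats import multivariate_normal

def tempered_e_step(X, weights, means, covs, temperature):
    """E-step with responsibilities raised to 1/T: T > 1 flattens the posterior and helps
    EM escape poor local optima; T = 1 recovers the usual E-step."""
    log_r = np.column_stack([
        np.log(w) + multivariate_normal.logpdf(X, m, c)
        for w, m, c in zip(weights, means, covs)
    ]) / temperature
    log_r -= log_r.max(axis=1, keepdims=True)        # stabilise before exponentiating
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)

X = np.random.default_rng(0).normal(size=(200, 5))
r = tempered_e_step(X, [0.5, 0.5], [np.zeros(5), np.ones(5)], [np.eye(5)] * 2, temperature=3.0)
print(r.shape, r.sum(axis=1)[:3])                    # each row of responsibilities sums to 1
```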
APA, Harvard, Vancouver, ISO, and other styles
21

Tanner, Whitney Ford. "Improved Standard Error Estimation for Maintaining the Validities of Inference in Small-Sample Cluster Randomized Trials and Longitudinal Studies." UKnowledge, 2018. https://uknowledge.uky.edu/epb_etds/20.

Full text
Abstract:
Data arising from Cluster Randomized Trials (CRTs) and longitudinal studies are correlated, and generalized estimating equations (GEE) are a popular analysis method for correlated data. Previous research has shown that analyses using GEE could result in liberal inference due to the use of the empirical sandwich covariance matrix estimator, which can yield negatively biased standard error estimates when the number of clusters or subjects is not large. Many techniques have been presented to correct this negative bias; however, use of these corrections can still result in biased standard error estimates and thus test sizes that are not consistently at their nominal level. Therefore, there is a need for an improved correction such that nominal type I error rates will consistently result. First, GEEs are becoming a popular choice for the analysis of data arising from CRTs. We study the use of recently developed corrections for empirical standard error estimation and the use of a combination of two popular corrections. In an extensive simulation study, we find that nominal type I error rates can be consistently attained when using an average of two popular corrections developed by Mancl and DeRouen (2001, Biometrics 57, 126-134) and Kauermann and Carroll (2001, Journal of the American Statistical Association 96, 1387-1396) (AVG MD KC). Use of this new correction was found to notably outperform the use of previously recommended corrections. Second, data arising from longitudinal studies are also commonly analyzed with GEE. We conduct a simulation study, finding two methods to attain nominal type I error rates more consistently than other methods in a variety of settings: first, a recently proposed method by Westgate and Burchett (2016, Statistics in Medicine 35, 3733-3744) that specifies both a covariance estimator and degrees of freedom, and second, AVG MD KC with degrees of freedom equaling the number of subjects minus the number of parameters in the marginal model. Finally, stepped wedge trials are an increasingly popular alternative to traditional parallel cluster randomized trials. Such trials often utilize a small number of clusters and numerous time intervals, and these components must be considered when choosing an analysis method. A generalized linear mixed model containing a random intercept and fixed time and intervention covariates is the most common analysis approach. However, the sole use of a random intercept applies assumptions that will be violated in practice. We show, using an extensive simulation study based on a motivating example and a more general design, that alternative analysis methods are preferable for maintaining the validity of inference in small-sample stepped wedge trials with binary outcomes. First, we show that the use of generalized estimating equations, with an appropriate bias correction and a degrees of freedom adjustment dependent on the study setting type, will result in nominal type I error rates. Second, we show that the use of a cluster-level summary linear mixed model can also achieve nominal type I error rates for equal cluster size settings.
APA, Harvard, Vancouver, ISO, and other styles
22

Asendorf, Thomas [Verfasser]. "Blinded Sample Size Re-estimation for Longitudinal Overdispersed Count Data in Randomized Clinical Trials with an Application in Multiple Sclerosis / Thomas Asendorf." Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2021. http://d-nb.info/1228364591/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Woo, Hin Kyeol. "Multiscale fractality with application and statistical modeling and estimation for computer experiment of nano-particle fabrication." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/45819.

Full text
Abstract:
The first chapter proposes multifractal analysis to measure the inhomogeneity of regularity of a 1H-NMR spectrum using wavelet-based multifractal tools. The geometric summaries of the multifractal spectrum are informative summaries and, as such, are employed to discriminate 1H-NMR spectra associated with different treatments. The methodology is applied to evaluate the effect of sulfur amino acids. The second part of this thesis provides essential materials for understanding the engineering background of a nano-particle fabrication process. The third chapter introduces a constrained random effect model. Since certain combinations of process variables result in unproductive process outcomes, a logistic model is used to characterize such process behavior. For the cases with productive outcomes, a normal regression serves as the second part of the model. Additionally, random effects are included in both the logistic and normal regression models to describe the potential spatial correlation among data. This chapter researches a way to approximate the likelihood function and to find estimates maximizing the approximated likelihood. The last chapter presents a method to decide the sample size under a multi-layer system. The multi-layer is a series of layers, which become smaller and smaller. Our focus is to decide the sample size in each layer. The sample size decision has several objectives, and the most important purpose is that the sample size should be enough to give a right direction to the next layer. Specifically, the bottom layer, which is the smallest neighborhood around the optimum, should meet the tolerance requirement. Performing a hypothesis test of whether the next layer includes the optimum gives the required sample size.
APA, Harvard, Vancouver, ISO, and other styles
24

Potgieter, Ryno. "Minimum sample size for estimating the Bayes error at a predetermined level." Diss., University of Pretoria, 2013. http://hdl.handle.net/2263/33479.

Full text
Abstract:
Determining the correct sample size is of utmost importance in study design. Large samples yield classifiers or parameters with more precision and conversely, samples that are too small yield unreliable results. Fixed sample size methods, as determined by the specified level of error between the obtained parameter and population value, or a confidence level associated with the estimate, have been developed and are available. These methods are extremely useful when there is little or no cost (consequences of action), financial and time, involved in gathering the data. Alternatively, sequential sampling procedures have been developed specifically to obtain a classifier or parameter estimate that is as accurate as deemed necessary by the researcher, while sampling the least number of observations required to obtain the specified level of accuracy. This dissertation discusses a sequential procedure, derived using Martingale Limit Theory, which had been developed to train a classifier with the minimum number of observations to ensure, with a high enough probability, that the next observation sampled has a low enough probability of being misclassified. Various classification methods are discussed and tested, with multiple combinations of parameters tested. Additionally, the sequential procedure is tested on microarray data. Various advantages and shortcomings of the sequential procedure are pointed out and discussed. This dissertation also proposes a new sequential procedure that trains the classifier to such an extent as to accurately estimate the Bayes error with a high probability. The sequential procedure retains all of the advantages of the previous method, while addressing the most serious shortcoming. Ultimately, the sequential procedure developed enables the researcher to dictate how accurate the classifier should be and provides more control over the trained classifier. Dissertation (MSc)--University of Pretoria, 2013. Statistics. Unrestricted.
APA, Harvard, Vancouver, ISO, and other styles
25

Xu, Yanzhi. "Effective GPS-based panel survey sample size for urban travel behavior studies." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/33843.

Full text
Abstract:
This research develops a framework to estimate the effective sample size of Global Positioning System (GPS) based panel surveys in urban travel behavior studies for a variety of planning purposes. Recent advances in GPS monitoring technologies have made it possible to implement panel surveys with lengths of weeks, months or even years. The many advantageous features of GPS-based panel surveys make such surveys attractive for travel behavior studies, but the higher cost of such surveys compared to conventional one-day or two-day paper diary surveys requires scrutiny at the sample size planning stage to ensure cost-effectiveness. The sample size analysis in this dissertation focuses on three major aspects in travel behavior studies: 1) to obtain reliable means for key travel behavior variables, 2) to conduct regression analysis on key travel behavior variables against explanatory variables such as demographic characteristics and seasonal factors, and 3) to examine impacts of a policy measure on travel behavior through before-and-after studies. The sample size analyses in this dissertation are based on the GPS data collected in the multi-year Commute Atlanta study. The sample size analysis with regard to obtaining reliable means for key travel behavior variables utilizes Monte Carlo re-sampling techniques to assess the trend of means against various sample size and survey length combinations. The basis for the framework and methods of sample size estimation related to regression analysis and before-and-after studies are derived from various sample size procedures based on the generalized estimating equation (GEE) method. These sample size procedures have been proposed for longitudinal studies in biomedical research. This dissertation adapts these procedures to the design of panel surveys for urban travel behavior studies with the information made available from the Commute Atlanta study. The findings from this research indicate that the required sample sizes should be much larger than the sample sizes in existing GPS-based panel surveys. This research recommends a desired range of sample sizes based on the objectives and survey lengths of urban travel behavior studies.
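The Monte Carlo re-sampling idea described here, repeatedly drawing subsets of households and survey days and watching how stable the estimated mean travel measure is, can be sketched briefly. The data below are synthetic stand-ins for GPS-derived household-day travel totals, and the subset sizes are illustrative; the sketch does not reproduce the GEE-based procedures the dissertation adapts.

```python
import numpy as np

rng = np.random.default_rng(8)
daily_vmt = rng.lognormal(mean=3.2, sigma=0.6, size=(500, 90))   # synthetic household-days

def resampled_mean_spread(data, n_households, n_days, reps=1000):
    """Spread of the estimated overall mean when only n_households x n_days are observed."""
    means = np.empty(reps)
    for i in range(reps):
        hh = rng.choice(data.shape[0], n_households, replace=False)
        days = rng.choice(data.shape[1], n_days, replace=False)
        means[i] = data[np.ix_(hh, days)].mean()
    return means.std()

for n_hh, n_d in [(50, 7), (100, 7), (100, 28), (200, 28)]:
    print(n_hh, n_d, round(resampled_mean_spread(daily_vmt, n_hh, n_d), 3))
```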
APA, Harvard, Vancouver, ISO, and other styles
26

Houghton, Damon. "Minimum tree height sample sizes necessary for accurately estimating merchantable plot volume in Loblolly pine plantations." Thesis, This resource online, 1991. http://scholar.lib.vt.edu/theses/available/etd-05022009-040541/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Walters, Stephen John. "The use of bootstrap methods for estimating sample size and analysing health-related quality of life outcomes (particularly the SF-36)." Thesis, University of Sheffield, 2003. http://etheses.whiterose.ac.uk/6053/.

Full text
Abstract:
Health-Related Quality of Life (HRQoL) measures are becoming increasingly used in clinical trials and health services research, both as primary and secondary outcome measures. Investigators are now asking statisticians for advice on how to plan (e.g. sample size) and analyse studies using HRQoL outcomes. HRQoL outcomes like the SF-36 are usually measured on an ordinal scale. However, most investigators assume that there exists an underlying continuous latent variable that measures HRQoL, and that the actual measured outcomes (the ordered categories) reflect contiguous intervals along this continuum. The ordinal scaling of HRQoL measures means they tend to generate data that have discrete, bounded and skewed distributions. Thus, standard methods of analysis such as the t-test and linear regression that assume Normality and constant variance may not be appropriate. For this reason, non-parametric methods are often used to analyse HRQoL data. The bootstrap is one such computer intensive non-parametric method for estimating sample sizes and analysing data. From a review of the literature, I found five methods of estimating sample sizes for two-group cross-sectional comparisons of HRQoL outcomes. All five methods (amongst other factors) require the specification of an effect size, which varies according to the method of sample size estimation. The empirical effect sizes calculated from the various datasets suggested that large differences in HRQoL (as measured by the SF-36) between groups are unlikely, particularly from the RCT comparisons. Most of the observed effect sizes were mainly in the 'small' to 'moderate' range (0.2 to 0.5) using Cohen's (1988) criteria. I compared the power of various methods of sample size estimation for two-group cross-sectional study designs via bootstrap simulation. The results showed that under the location shift alternative hypothesis, conventional methods of sample size estimation performed well, particularly Whitehead's (1993) method. Whitehead's method is recommended if the HRQoL outcome has a limited number of discrete values (< 7) and/or the expected proportion of cases at either of the bounds is high. If a pilot dataset is readily available (to estimate the shape of the distribution) then bootstrap simulation may provide a more accurate and reliable estimate than conventional methods. Finally, I used the bootstrap for hypothesis testing and the estimation of standard errors and confidence intervals for parameters, in four datasets (which illustrate the different aspects of study design). I then compared and contrasted the bootstrap with standard methods of analysing HRQoL outcomes as described in Fayers and Machin (2000). Overall, in the datasets studied with the SF-36 outcome the use of the bootstrap for estimating sample sizes and analysing HRQoL data appears to produce results similar to conventional statistical methods. Therefore, the results of this thesis suggest that bootstrap methods are not more appropriate for analysing HRQoL outcome data than standard methods. This result requires confirmation with other HRQoL outcome measures, interventions and populations.
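The bootstrap sample size simulation described here (resample a pilot distribution, apply a location shift to one arm, and count rejections) can be sketched compactly. The pilot scores, the shift, and the use of the two-sample t-test are illustrative assumptions, not the thesis's datasets or its preferred test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
pilot = rng.choice(np.arange(0, 101, 5), size=80)     # stand-in for pilot SF-36-type scores

def bootstrap_power(pilot, n_per_group, shift, alpha=0.05, reps=2000):
    """Power of the two-sample t-test when both arms are resampled from the pilot
    distribution and a location shift is added to one arm."""
    hits = 0
    for _ in range(reps):
        a = rng.choice(pilot, n_per_group, replace=True)
        b = rng.choice(pilot, n_per_group, replace=True) + shift
        hits += stats.ttest_ind(a, b).pvalue < alpha
    return hits / reps

for n in (25, 50, 100, 150):
    print(n, round(bootstrap_power(pilot, n, shift=10), 3))
```

Stepping n upward until the simulated power crosses the target (say 0.80) gives the bootstrap-based sample size that the thesis compares with conventional formulae such as Whitehead's method.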
APA, Harvard, Vancouver, ISO, and other styles
28

Saha, Dibakar. "Improved Criteria for Estimating Calibration Factors for Highway Safety Manual (HSM) Applications." FIU Digital Commons, 2014. http://digitalcommons.fiu.edu/etd/1701.

Full text
Abstract:
The Highway Safety Manual (HSM) estimates roadway safety performance based on predictive models that were calibrated using national data. Calibration factors are then used to adjust these predictive models to local conditions for local applications. The HSM recommends that local calibration factors be estimated using 30 to 50 randomly selected sites that experienced at least a total of 100 crashes per year. It also recommends that the factors be updated every two to three years, preferably on an annual basis. However, these recommendations are primarily based on expert opinions rather than data-driven research findings. Furthermore, most agencies do not have data for many of the input variables recommended in the HSM. This dissertation is aimed at determining the best way to meet three major data needs affecting the estimation of calibration factors: (1) the required minimum sample sizes for different roadway facilities, (2) the required frequency for calibration factor updates, and (3) the influential variables affecting calibration factors. In this dissertation, statewide segment and intersection data were first collected for most of the HSM recommended calibration variables using a Google Maps application. In addition, eight years (2005-2012) of traffic and crash data were retrieved from existing databases from the Florida Department of Transportation. With these data, the effect of sample size criterion on calibration factor estimates was first studied using a sensitivity analysis. The results showed that the minimum sample sizes not only vary across different roadway facilities, but they are also significantly higher than those recommended in the HSM. In addition, results from paired sample t-tests showed that calibration factors in Florida need to be updated annually. To identify influential variables affecting the calibration factors for roadway segments, the variables were prioritized by combining the results from three different methods: negative binomial regression, random forests, and boosted regression trees. Only a few variables were found to explain most of the variation in the crash data. Traffic volume was consistently found to be the most influential. In addition, roadside object density, major and minor commercial driveway densities, and minor residential driveway density were also identified as influential variables.
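The quantity being estimated throughout this abstract is the local calibration factor, the ratio of total observed to total SPF-predicted crashes for a set of sites. The sketch below uses that standard ratio on synthetic site data and then recomputes it on random subsets to mimic the sample size sensitivity question; the predicted-crash distribution and the 20% local inflation are assumptions for illustration, not findings from the dissertation.

```python
import numpy as np

def calibration_factor(observed, predicted):
    """HSM-style local calibration factor: total observed crashes over total predicted."""
    return np.sum(observed) / np.sum(predicted)

rng = np.random.default_rng(2)
predicted = rng.gamma(2.0, 1.5, size=120)                 # SPF-predicted crashes per site
observed = rng.poisson(1.2 * predicted)                   # local sites crash ~20% more
print(round(calibration_factor(observed, predicted), 3))  # close to 1.2

# sensitivity of the estimate to the number of sampled sites
for m in (30, 50, 80, 120):
    idx = rng.choice(120, m, replace=False)
    print(m, round(calibration_factor(observed[idx], predicted[idx]), 3))
```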
APA, Harvard, Vancouver, ISO, and other styles
29

Nåtman, Jonatan. "The performance of inverse probability of treatment weighting and propensity score matching for estimating marginal hazard ratios." Thesis, Uppsala universitet, Statistiska institutionen, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385502.

Full text
Abstract:
Propensity score methods are increasingly being used to reduce the effect of measured confounders in observational research. In medicine, censored time-to-event data is common. Using Monte Carlo simulations, this thesis evaluates the performance of nearest neighbour matching (NNM) and inverse probability of treatment weighting (IPTW) in combination with Cox proportional hazards models for estimating marginal hazard ratios. Focus is on the performance for different sample sizes and censoring rates, aspects which have not been fully investigated in this context before. The results show that, in the absence of censoring, both methods can reduce bias substantially. IPTW consistently had better performance in terms of bias and MSE compared to NNM. For the smallest examined sample size with 60 subjects, the use of IPTW led to estimates with bias below 15 %. Since the data were generated using a conditional parametrisation, the estimation of univariate models violates the proportional hazards assumption. As a result, censoring the data led to an increase in bias.
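As a small illustration of the IPTW half of this comparison, the sketch below fits a propensity model, forms stabilised weights, and estimates a marginal hazard ratio. To stay self-contained it replaces the weighted Cox fit with a weighted exponential rate ratio and simulates uncensored data; the confounding structure, coefficients, and sample size are assumptions. As in the thesis, the marginal estimate is attenuated relative to the conditional coefficient used to generate the data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 2000
x = rng.normal(size=n)                                    # measured confounder
treat = rng.binomial(1, 1 / (1 + np.exp(-0.8 * x)))       # confounded treatment assignment
rate = 0.1 * np.exp(0.5 * x - 0.7 * treat)                # conditional exponential hazard
time = rng.exponential(1 / rate)
event = np.ones(n, dtype=bool)                            # no censoring in this sketch

ps = LogisticRegression().fit(x.reshape(-1, 1), treat).predict_proba(x.reshape(-1, 1))[:, 1]
w = np.where(treat == 1, treat.mean() / ps, (1 - treat.mean()) / (1 - ps))  # stabilised weights

def weighted_rate(mask):
    # weighted events divided by weighted follow-up time (exponential rate estimate)
    return np.sum(w[mask] * event[mask]) / np.sum(w[mask] * time[mask])

print("IPTW marginal hazard ratio:", round(weighted_rate(treat == 1) / weighted_rate(treat == 0), 3))
```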
APA, Harvard, Vancouver, ISO, and other styles
30

Hagen, Clinton Ernest. "Comparing the performance of four calculation methods for estimating the sample size in repeated measures clinical trials where difference in treatment groups means is of interest." Oklahoma City : [s.n.], 2008.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
31

Good, Norman Markus. "Methods for estimating the component biomass of a single tree and a stand of trees using variable probability sampling techniques." Thesis, Queensland University of Technology, 2001. https://eprints.qut.edu.au/37097/1/37097_Good_2001.pdf.

Full text
Abstract:
This thesis developed multistage sampling methods for estimating the aggregate biomass of selected tree components, such as leaves, branches, trunk and total, in woodlands in central and western Queensland. To estimate the component biomass of a single tree, randomised branch sampling (RBS) and importance sampling (IS) were trialed. RBS and IS were found to reduce the amount of time and effort to sample tree components in comparison with other standard destructive sampling methods such as ratio sampling, especially when sampling small components such as leaves and small twigs. However, RBS did not estimate leaf and small twig biomass to an acceptable degree of precision using current methods for creating path selection probabilities. In addition to providing an unbiased estimate of tree component biomass, individual estimates were used for developing allometric regression equations. Equations based on large components such as total biomass produced narrower confidence intervals than equations developed using ratio sampling. However, RBS does not estimate small component biomass such as leaves and small wood components with an acceptable degree of precision, and should be mainly used in conjunction with IS for estimating larger component biomass. A whole tree was completely enumerated to set up a sampling space with which RBS could be evaluated under a number of scenarios. To achieve a desired precision, RBS sample size and branch diameter exponents were varied, and the RBS method was simulated using both analytical and re-sampling methods. It was found that there is a significant amount of natural variation present when relating the biomass of small components to branch diameter, for example. This finding validates earlier decisions to question the efficacy of RBS for estimating small component biomass in eucalypt species. In addition, significant improvements can be made to increase the precision of RBS by increasing the number of samples taken, but more importantly by varying the exponent used for constructing selection probabilities. To further evaluate RBS on trees with differing growth forms from that enumerated, virtual trees were generated. These virtual trees were created using L-systems algebra. Decision rules for creating trees were based on easily measurable characteristics that influence a tree's growth and form. These characteristics included: child-to-child and children-to-parent branch diameter relationships, branch length and branch taper. They were modelled using probability distributions of best fit. By varying the size of a tree and/or the variation in the model describing tree characteristics, it was possible to simulate the natural variation between trees of similar size and form. By creating visualisations of these trees, it is possible to determine using visual means whether RBS could be effectively applied to particular trees or tree species. Simulation also aided in identifying which characteristics most influenced the precision of RBS, namely branch length and branch taper. After evaluation of RBS/IS for estimating the component biomass of a single tree, methods for estimating the component biomass of a stand of trees (or plot) were developed and evaluated. A sampling scheme was developed which incorporated both model-based and design-based biomass estimation methods. This scheme clearly illustrated the strong and weak points associated with both approaches for estimating plot biomass.
Using ratio sampling was more efficient than using RBS/IS in the field, especially for larger tree components. Probability proportional to size sampling (PPS), with size being the trunk diameter at breast height, generated estimates of component plot biomass that were comparable to those generated using model-based approaches. The research did, however, indicate that PPS is more precise than the use of regression prediction (allometric) equations for estimating larger components such as trunk or total biomass, and the precision increases in areas of greater biomass. Using more reliable auxiliary information for identifying suitable strata would reduce the amount of within-plot variation, thereby increasing precision. PPS had the added advantage of being unbiased and unhindered by the numerous assumptions applicable to the population of interest, as is the case with a model-based approach. The application of allometric equations in predicting the component biomass of tree species other than that for which the allometric was developed is problematic. Differences in wood density need to be taken into account, as well as differences in growth form and within-species variability, as outlined in the virtual tree simulations. However, the development and application of allometric prediction equations in local species-specific contexts is more desirable than PPS.
APA, Harvard, Vancouver, ISO, and other styles
32

Domrow, Nathan Craig. "Design, maintenance and methodology for analysing longitudinal social surveys, including applications." Thesis, Queensland University of Technology, 2007. https://eprints.qut.edu.au/16518/1/Nathan_Domrow_Thesis.pdf.

Full text
Abstract:
This thesis describes the design, maintenance and statistical analysis involved in undertaking a longitudinal survey. A longitudinal survey (or study) obtains observations or responses from individuals at several times over a defined period. This enables the direct study of changes in an individual's response over time. In particular, it distinguishes an individual's change over time from the baseline differences among individuals within the initial panel (or cohort). This is not possible in a cross-sectional study. As such, longitudinal surveys give correlated responses within individuals. Longitudinal studies therefore require different considerations for sample design, selection and analysis from standard cross-sectional studies. This thesis looks at the methodology for analysing social surveys. Most social surveys comprise variables described as categorical variables. This thesis outlines the process of sample design and selection, interviewing and analysis for a longitudinal study. Emphasis is given to categorical response data typical of a survey. Included in this thesis are examples relating to the Goodna Longitudinal Survey and the Longitudinal Survey of Immigrants to Australia (LSIA). Analysis in this thesis also utilises data collected from these surveys. The Goodna Longitudinal Survey was conducted by the Queensland Office of Economic and Statistical Research (a portfolio office within Queensland Treasury) and began in 2002. It ran for two years, during which two waves of responses were collected.
APA, Harvard, Vancouver, ISO, and other styles
33

Domrow, Nathan Craig. "Design, maintenance and methodology for analysing longitudinal social surveys, including applications." Queensland University of Technology, 2007. http://eprints.qut.edu.au/16518/.

Full text
Abstract:
This thesis describes the design, maintenance and statistical analysis involved in undertaking a longitudinal survey. A longitudinal survey (or study) obtains observations or responses from individuals at several times over a defined period. This enables the direct study of changes in an individual's response over time. In particular, it distinguishes an individual's change over time from the baseline differences among individuals within the initial panel (or cohort). This is not possible in a cross-sectional study. As such, longitudinal surveys give correlated responses within individuals. Longitudinal studies therefore require different considerations for sample design, selection and analysis from standard cross-sectional studies. This thesis looks at the methodology for analysing social surveys. Most social surveys comprise variables described as categorical variables. This thesis outlines the process of sample design and selection, interviewing and analysis for a longitudinal study. Emphasis is given to categorical response data typical of a survey. Included in this thesis are examples relating to the Goodna Longitudinal Survey and the Longitudinal Survey of Immigrants to Australia (LSIA). Analysis in this thesis also utilises data collected from these surveys. The Goodna Longitudinal Survey was conducted by the Queensland Office of Economic and Statistical Research (a portfolio office within Queensland Treasury) and began in 2002. It ran for two years, during which two waves of responses were collected.
APA, Harvard, Vancouver, ISO, and other styles
34

Kolodziej, Karolina [Verfasser], and Ralf [Akademischer Betreuer] Schulz. "Development of a method for wild boar (Sus scrofa) population size estimation by genotyping of non-invasive samples = Entwicklung einer Methode zur Populationsschätzung von Wildschweinen (Sus scrofa) mittels Genotypisierung nicht-invasiv gewonnener Proben [Elektronische Ressource] / Karolina Kolodziej. Gutachter: Ralf Schulz." Landau : Universitätsbibliothek Landau, 2012. http://d-nb.info/1028023227/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Chen, Hui-Chen, and 陳慧倩. "Sample Size Estimation for Replicated 2*2 Cross-over Design." Thesis, 1997. http://ndltd.ncl.edu.tw/handle/32949108085597784842.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Her, Chi-Way, and 何淇瑋. "The Optimal Sample Size For Interval Estimation Of Correlation Coefficient." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/20731985747790441131.

Full text
Abstract:
Master's thesis, National Chiao Tung University, Department of Management Science, academic year 99. Because the degree of correlation between two variables is of interest in many social science questions, using the sample correlation coefficient to draw inferences about the population correlation coefficient is a common approach, and choosing the optimal sample size before the study begins can save considerable time and cost. Beyond the traditional hypothesis-testing approach to sample size determination, this research introduces the expected interval length method and the expected interval coverage probability method. The expected interval coverage probability method is based on interval estimation, but it can tighten or relax the required sample size according to the coverage probability that is specified. In this dissertation, SAS software is used to construct the model; after the optimal sample size is found, samples of that size are drawn randomly from the two designed populations, and the resulting interval widths and interval coverage probabilities are checked for consistency with the original settings. The results show that the expected interval length method performs well in simulation only when the samples are large enough, and that the expected interval coverage probability method becomes unstable when the population parameter is very close to 0.
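As an editorial illustration of the expected-interval-length idea described above (not code from the thesis), the sketch below searches for the smallest sample size whose average Fisher-z confidence interval width for a correlation falls below a target; the target width, true correlation and simulation settings are arbitrary choices.

import numpy as np
from scipy import stats

def expected_ci_width(n, rho, conf=0.95, reps=1000, seed=0):
    """Average width of the Fisher-z confidence interval for a correlation,
    estimated by simulating bivariate-normal samples of size n."""
    rng = np.random.default_rng(seed)
    z_crit = stats.norm.ppf(1 - (1 - conf) / 2)
    widths = []
    for _ in range(reps):
        x = rng.standard_normal(n)
        y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
        r = np.corrcoef(x, y)[0, 1]
        lo, hi = np.tanh(np.arctanh(r) + np.array([-1.0, 1.0]) * z_crit / np.sqrt(n - 3))
        widths.append(hi - lo)
    return float(np.mean(widths))

# Smallest n whose expected interval width is at most 0.3 when rho = 0.5 (illustrative values).
n = 10
while expected_ci_width(n, rho=0.5) > 0.3:
    n += 10
print("approximate required n:", n)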
APA, Harvard, Vancouver, ISO, and other styles
37

Phatakwala, Shantanu. "Estimation of form-error and determination of sample size in precision metrology." 2005. http://proquest.umi.com/pqdweb?did=1014319851&sid=16&Fmt=2&clientId=39334&RQT=309&VName=PQD.

Full text
Abstract:
Thesis (M.S.)--State University of New York at Buffalo, 2005. Title from PDF title page (viewed on Apr. 13, 2006). Available through UMI ProQuest Digital Dissertations. Thesis adviser: Gosavi, Abhijit.
APA, Harvard, Vancouver, ISO, and other styles
38

Tai, Chu-Chun, and 戴竹君. "Sample Size Calculations for Precise Interval Estimation of Intraclass Correlation Coefficient (2)." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/4mm6rh.

Full text
Abstract:
Master's thesis, National Chiao Tung University, Department of Management Science, academic year 103. In both the natural and the social sciences, the intraclass correlation coefficient is used to measure how the total variation divides into between-group and within-group components, and it therefore reflects the differences between groups of data. It is commonly used to assess how much of the variation in a variable is explained by group membership, in order to decide whether individual-level data should be aggregated into group-level information, and it is also used to examine the consistency of test-retest measurements. In this dissertation, SAS/IML software is used to construct the model, and two simulation criteria, interval width and interval coverage rate, are used to find the optimal sample size for given confidence levels, numbers of measurements and values of the intraclass correlation coefficient. The resulting sample sizes are then analyzed and compared.
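As an editorial illustration of the width-and-coverage criteria described above (not code from the thesis), the sketch below evaluates the exact F-based confidence interval for a one-way random-effects ICC at one candidate design; the number of groups, measurements per group and true ICC are arbitrary.

import numpy as np
from scipy import stats

def icc1_ci(data, conf=0.95):
    """Exact F-based confidence interval for the one-way random-effects ICC;
    `data` is a (groups x measurements) array."""
    g, k = data.shape
    grand = data.mean()
    msb = k * ((data.mean(axis=1) - grand) ** 2).sum() / (g - 1)
    msw = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum() / (g * (k - 1))
    F = msb / msw
    alpha = 1 - conf
    f_hi = stats.f.ppf(1 - alpha / 2, g - 1, g * (k - 1))
    f_lo = stats.f.ppf(alpha / 2, g - 1, g * (k - 1))
    return (F / f_hi - 1) / (F / f_hi + k - 1), (F / f_lo - 1) / (F / f_lo + k - 1)

rng = np.random.default_rng(0)
g, k, rho, reps = 30, 3, 0.6, 2000          # candidate design and true ICC
cover, widths = 0, []
for _ in range(reps):
    between = rng.normal(0, np.sqrt(rho), size=(g, 1))              # between-group variance rho
    data = between + rng.normal(0, np.sqrt(1 - rho), size=(g, k))   # within-group variance 1 - rho
    lo, hi = icc1_ci(data)
    cover += lo <= rho <= hi
    widths.append(hi - lo)
print("coverage:", cover / reps, "mean width:", float(np.mean(widths)))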
APA, Harvard, Vancouver, ISO, and other styles
39

葉家豪. "Interval Estimation and Sample Size Calculation for Fisher Transformation of Intraclass Correlation Coefficient." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/7d84zz.

Full text
Abstract:
Master's thesis, National Chiao Tung University, Department of Management Science, academic year 101. The intraclass correlation coefficient (ICC) is an indispensable tool in social science research, but because its computation relies on the F-distribution, finding the optimal sample size can be difficult in some situations, and researchers at home and abroad have debated ICC estimators for many years. Most researchers choose a normal approximation because it makes the confidence interval and the optimal sample size relatively easy to obtain. This thesis builds on the approach proposed by Bonett (2002), who gave a formula for computing the sample size but did not verify it against the resulting confidence intervals. This research therefore uses two methods to compute the optimal sample size and compares them with the original method, and it uses simulated confidence interval coverage to examine the optimal sample sizes and the circumstances under which each of the two methods is appropriate, in the hope of providing more information to researchers who use the ICC.
APA, Harvard, Vancouver, ISO, and other styles
40

Lin, Shih-Wen, and 林詩紋. "The Mediation Parameter Estimation and Sample Size Calculations for the Multiple Mediation Model." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/r7kfav.

Full text
Abstract:
Master's thesis, National Chiao Tung University, Institute of Statistics, academic year 107. Mediation analysis has been widely applied to investigate causal mechanisms in management, social psychology and public health, and several methods for evaluating mediation effects have been developed. However, current methods address specific causal structures, such as a single mediator or multiple mediators without interactions. This study therefore focuses on the multiple mediation model: it introduces a generalized mediation model, develops a computational estimation approach based on Monte Carlo simulation (G-computation), and provides a method for sample size calculation. For estimation, G-computation is compared with the regression-based (closed-form) method, and the results show that the G-computation estimate is unbiased and consistent with the regression-based method. For sample size calculation, a normal approximation approach and a numerical method are proposed to derive the sample size for a given power, and the results show that, for some path-specific effects, the sample size from the normal approximation approach is similar to that from the numerical method. In conclusion, this research provides new methods for both estimation and sample size calculation in the multiple mediation model.
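As an editorial illustration of Monte Carlo G-computation in a mediation setting (not code from the thesis, which treats a more general model), the sketch below estimates total, direct and indirect effects in a simple parallel two-mediator linear model; all coefficients are hypothetical and the residual standard deviations are taken as known for brevity.

import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulate a parallel two-mediator model with hypothetical coefficients.
x  = rng.binomial(1, 0.5, n)                        # exposure
m1 = 0.5 * x + rng.normal(0, 1, n)                  # mediator 1
m2 = 0.3 * x + rng.normal(0, 1, n)                  # mediator 2
y  = 0.4 * x + 0.6 * m1 + 0.7 * m2 + rng.normal(0, 1, n)

# Fit the working models by least squares.
bm1 = np.linalg.lstsq(np.column_stack([np.ones(n), x]), m1, rcond=None)[0]
bm2 = np.linalg.lstsq(np.column_stack([np.ones(n), x]), m2, rcond=None)[0]
by  = np.linalg.lstsq(np.column_stack([np.ones(n), x, m1, m2]), y, rcond=None)[0]

def g_comp(x_outcome, x_mediators, draws=200000):
    """Draw mediators under exposure x_mediators, then average the predicted
    outcome with the exposure set to x_outcome (residual SDs assumed to be 1)."""
    m1_draw = bm1[0] + bm1[1] * x_mediators + rng.normal(0, 1, draws)
    m2_draw = bm2[0] + bm2[1] * x_mediators + rng.normal(0, 1, draws)
    return np.mean(by[0] + by[1] * x_outcome + by[2] * m1_draw + by[3] * m2_draw)

print("total:   ", g_comp(1, 1) - g_comp(0, 0))   # about 0.4 + 0.5*0.6 + 0.3*0.7
print("direct:  ", g_comp(1, 0) - g_comp(0, 0))   # about 0.4
print("indirect:", g_comp(1, 1) - g_comp(1, 0))   # about 0.51, through both mediators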
APA, Harvard, Vancouver, ISO, and other styles
41

Lee, Chung-Han, and 李宗翰. "Sample Size Calculation for Complete Data and Interval Estimation for the Multinomial Distribution." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/m84c6s.

Full text
Abstract:
Doctoral dissertation, National Chiao Tung University, Institute of Statistics, academic year 107. This dissertation focuses on two topics. The first is interval estimation for the probabilities of the multinomial distribution. Statistical intervals are widely used in many fields, and simultaneous confidence intervals for multinomial proportions have been proposed in many applications, including quality control and clinical data analysis. Because of these wide applications, the multinomial distribution plays an important role in many areas of science, so we propose a method for constructing confidence intervals for multinomial probabilities and conduct a simulation study to compare the performance of different intervals. The second topic is to derive parameter estimators after missing-data imputation under a misspecified (underfitted) model and to determine the sample size for the complete data. Finally, we apply the proposed methodology to analyze stroke data: the time interval known as the pre-hospital delay matters for thrombolytic therapy, so the study explores the association of pre-hospital delay with mode of arrival, stroke severity, initial symptoms and signs, and stroke risk factors.
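For orientation only (this is not the interval proposed in the dissertation), the sketch below computes a simple baseline for simultaneous multinomial intervals: Wilson intervals for each category with a Bonferroni adjustment across categories; the counts are made up.

import numpy as np
from scipy import stats

def simultaneous_wilson(counts, conf=0.95):
    """Bonferroni-adjusted Wilson intervals for multinomial category probabilities."""
    counts = np.asarray(counts, dtype=float)
    n, k = counts.sum(), len(counts)
    z = stats.norm.ppf(1 - (1 - conf) / (2 * k))   # split the error rate over k categories
    p = counts / n
    center = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return np.column_stack([center - half, center + half])

print(simultaneous_wilson([42, 31, 17, 10]))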
APA, Harvard, Vancouver, ISO, and other styles
42

Cheng, Ya-Ching, and 鄭雅靜. "Sampling Properties of the Yield Index Spk With Estimation Accuracy and Sample Size Information." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/63398613030709840137.

Full text
Abstract:
Doctoral dissertation, National Chiao Tung University, Department of Industrial Engineering and Management, academic year 97. The yield index Spk proposed by Boyles (1994) provides an exact measure of the production yield of normal processes, and Lee et al. (2002) considered a normal approximation for estimating Spk. In this thesis, we extend those results and consider the sampling distribution of the yield index in three settings: (i) multiple samples, (ii) a convolution method, and (iii) acceptance sampling. For multiple samples, we derive the sampling distribution of the estimator of Spk and observe that, for the same Spk, the variance of the estimator is largest when the process mean is at the center of the specification limits. Lower bounds of Spk are tabulated for some commonly used capability requirements. To assess the approximate normal distribution of the estimator, we also compute the actual Type I error and compare it with the preset significance level, and we compute the sample sizes required for the normal approximation to converge to the actual Spk within a designated accuracy. A real-world application to one-cell rechargeable Li-ion battery packs illustrates how practitioners can apply the lower bounds to data collected in multiple samples. Next, we consider a convolution approximation for estimating Spk and compare it with the normal approximation; the comparison shows that the convolution method provides a more accurate estimate of Spk, and hence of the production yield, than the normal approximation. An efficient step-by-step procedure based on the convolution method is developed to illustrate how to estimate the production yield, and the accuracy of the convolution method is investigated, which provides useful information about the sample sizes required for designated power levels and for convergence. Finally, we consider acceptance determination based on the Spk index. Acceptance sampling plans provide the vendor and the buyer with decision rules for lot sentencing that meet their product quality needs. A variables sampling plan based on Spk is proposed to handle processes requiring very low PPM fractions of defectives. We develop an effective method for obtaining the required sample size n and critical acceptance value c0 by solving two nonlinear equations simultaneously. Based on the designed sampling plan, practitioners can determine the number of items to be sampled for inspection and the corresponding critical acceptance value for lot sentencing.
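As an editorial illustration of the index itself (not the estimation procedures developed in the thesis), the sketch below computes the usual point estimate of Boyles' Spk from a sample and the process yield it implies; the specification limits and data are invented.

import numpy as np
from scipy import stats

def spk_hat(x, lsl, usl):
    """Point estimate of Boyles' yield index Spk and the implied process yield."""
    mu, sigma = np.mean(x), np.std(x, ddof=1)
    p = 0.5 * stats.norm.cdf((usl - mu) / sigma) + 0.5 * stats.norm.cdf((mu - lsl) / sigma)
    spk = stats.norm.ppf(p) / 3.0
    return spk, 2 * stats.norm.cdf(3 * spk) - 1

# Illustrative data: an in-control normal process with specification limits 9.0 and 11.0.
rng = np.random.default_rng(0)
x = rng.normal(10.05, 0.25, size=100)
print(spk_hat(x, lsl=9.0, usl=11.0))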
APA, Harvard, Vancouver, ISO, and other styles
43

吳佳儒. "Influences of Computerized Adaptive Pretest on Estimation Precision of Difficulty Parameters in Small Sample Size Pretest." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/88598087367244099133.

Full text
Abstract:
Master's thesis, National Taiwan Normal University, Department of Educational Psychology and Counseling, academic year 98. The goal of this research is to investigate the influence of adaptive item selection on the accuracy of pretest item calibration. The success of computerized adaptive testing (CAT) applications depends on the accuracy of the estimated parameters of each item. Pretest calibration of item parameters is typically assumed to require a large calibration sample to reduce estimation error, but such samples can be difficult to obtain in practice. This thesis proposes a Computerized Adaptive Pretest (CAPT) method that selects, for each examinee, the optimal items to take in the pretest, thereby improving the accuracy of item calibration. The research comprises two studies. Study 1 examines the influence of three ability distributions (normal, uniform and multi-group) on pretest item calibration. Study 2 has two parts: the first compares the precision of item estimation under the CAPT design with that under the NEAT design; the second examines the precision of item estimation under three levels of correlation between subjective difficulty and true difficulty, and also examines variables that might influence estimation precision under the CAPT design, such as test length, the number of anchor items, and the difficulty distribution of the anchor items. The results of Study 1 suggest no overall difference in estimation precision among the normal, uniform and multi-group ability distributions; estimation is more precise for the normal distribution at items of average difficulty but less accurate at easy and hard items, whereas uniform and multi-group examinees yield similar accuracy across all items. The results of Study 2 suggest that the CAPT design outperforms the NEAT design when the sample is small. With respect to the correlation between subjective and true difficulty, the higher the correlation, the more precise the item estimation. Item estimation is also more precise with longer tests and fewer anchor items, whereas the difficulty distribution of the anchor items has little effect. Overall, this study informs future pretest designs for test users who cannot obtain large calibration samples, provided the correlation between subjective and true difficulty is at least moderate.
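As a much-simplified editorial illustration of the calibration step (not the CAPT procedure itself), the sketch below estimates a Rasch item difficulty by maximum likelihood from examinees whose abilities are treated as known, comparing a sample whose abilities are matched to the item with a mismatched one; all values are invented.

import numpy as np
from scipy import optimize

def calibrate_difficulty(responses, thetas):
    """ML estimate of a Rasch (1PL) item difficulty given known examinee abilities."""
    def neg_loglik(b):
        p = 1 / (1 + np.exp(-(thetas - b)))
        return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    return optimize.minimize_scalar(neg_loglik, bounds=(-4, 4), method="bounded").x

rng = np.random.default_rng(0)
b_true = 1.0
theta_matched    = rng.normal(1.0, 0.3, 100)     # abilities routed near the item's difficulty
theta_mismatched = rng.normal(-1.5, 0.3, 100)    # abilities far from the item's difficulty
for label, thetas in [("matched", theta_matched), ("mismatched", theta_mismatched)]:
    y = rng.binomial(1, 1 / (1 + np.exp(-(thetas - b_true))))
    print(label, calibrate_difficulty(y, thetas))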
APA, Harvard, Vancouver, ISO, and other styles
44

"Robust Experimental Design for Speech Analysis Applications." Master's thesis, 2020. http://hdl.handle.net/2286/R.I.57412.

Full text
Abstract:
In many biological research studies, including speech analysis, clinical research and prediction studies, the validity of the study depends on how well the training data set represents the target population. For example, in emotion classification from speech, classifier performance depends mainly on the quantity and quality of the training data. With small sample sizes and unbalanced data, classifiers developed in this context may focus on incidental differences in the training set rather than on emotion (e.g., gender, age and dialect). This thesis evaluates several sampling methods and a non-parametric approach to the sample sizes required to minimize the effect of these nuisance variables on classification performance. The work focuses on speech analysis applications and therefore uses speech features such as Mel-Frequency Cepstral Coefficients (MFCC) and Filter Bank Cepstral Coefficients (FBCC). The non-parametric divergence measure (D_p divergence) is used to study the difference between sampling schemes (stratified and multistage sampling) and the changes due to sentence types in the sampling set. Master's thesis, Electrical Engineering, 2020.
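As an editorial illustration of one of the sampling schemes mentioned (not code from the thesis), the sketch below draws a stratified training sample that balances a nuisance variable within each class label; the metadata table, column names and per-stratum size are hypothetical, and pandas 1.1 or later is assumed for GroupBy.sample.

import numpy as np
import pandas as pd

# Hypothetical metadata for speech recordings: balance gender within each emotion label
# before training, so a classifier is less likely to latch onto the nuisance variable.
rng = np.random.default_rng(0)
meta = pd.DataFrame({
    "emotion": rng.choice(["angry", "happy", "neutral"], 900),
    "gender":  rng.choice(["f", "m"], 900, p=[0.7, 0.3]),   # deliberately unbalanced
})

n_per_cell = 60   # assumed per-stratum sample size
train_idx = (meta.groupby(["emotion", "gender"])
                 .sample(n=n_per_cell, replace=False, random_state=0)
                 .index)
print(meta.loc[train_idx].value_counts(["emotion", "gender"]))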
APA, Harvard, Vancouver, ISO, and other styles
45

Asendorf, Thomas. "Blinded Sample Size Re-estimation for Longitudinal Overdispersed Count Data in Randomized Clinical Trials with an Application in Multiple Sclerosis." Thesis, 2021. http://hdl.handle.net/21.11130/00-1735-0000-0005-1581-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Bliss, Caleb Andrew. "Sample size re-estimation for superiority clinical trials with a dichotomous outcome using an unblinded estimate of the control group outcome rate." Thesis, 2014. https://hdl.handle.net/2144/14282.

Full text
Abstract:
Superiority clinical trials are often designed with a planned interim analysis for the purpose of sample size re-estimation (SSR) when limited information is available at the start of the trial to estimate the required sample size. Typically these trials are designed with a two-arm internal pilot where subjects are enrolled to both treatment arms prior to the interim analysis. Circumstances may sometimes call for a trial with a single-arm internal pilot (enroll only in the control group). For a dichotomous outcome, Herson and Wittes proposed a SSR method (HW-SSR) that can be applied to single-arm internal pilot trials using an unblinded estimate of the control group outcome rate. Previous evaluations of the HW-SSR method reported conflicting results regarding the impact of the method on the two-sided Type I error rate and power of the final hypothesis test. In this research we evaluate the HW-SSR method under the null and alternative hypothesis in various scenarios to investigate the one-sided Type I error rate and power of trials with a two-arm internal pilot. We find that the one-sided Type I error rate is sometimes inflated and that the power is sometimes reduced. We propose a new method, the Critical Value and Power Adjusted Sample Size Re-estimation (CVPA-SSR) algorithm to adjust the critical value cutoff used in the final Z-test and the power critical value used in the interim SSR formula to preserve the nominal Type I error rate and the desired power. We conduct simulations for trials with single-arm and two-arm internal pilots to confirm that the CVPA-SSR algorithm does preserve the nominal Type I error rate and the desired power. We investigate the robustness of the CVPA-SSR algorithm for trials with single-arm and two-arm internal pilots when the assumptions used in designing the trial are incorrect. No Type I error inflation is observed but significant over- or under-powering of the trial occurs when the treatment effect used to design the trial is misspecified.
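As an editorial illustration of the basic re-estimation idea (a textbook formula, not the HW-SSR or CVPA-SSR procedures studied in the thesis), the sketch below computes the per-group sample size for a one-sided two-proportion Z-test and recomputes it after the control-group rate estimate changes at an interim look; the rates, alpha and power are illustrative.

from scipy import stats

def n_per_group(p_control, p_treat, alpha=0.025, power=0.90):
    """Approximate per-group sample size for a one-sided Z-test of two proportions
    (unpooled variance)."""
    za, zb = stats.norm.ppf(1 - alpha), stats.norm.ppf(power)
    var = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return (za + zb) ** 2 * var / (p_control - p_treat) ** 2

# Design-stage assumption vs. a re-estimate after an interim look suggests a higher
# control-group rate, keeping the assumed absolute treatment effect of 0.10 fixed.
print(n_per_group(0.30, 0.20))
print(n_per_group(0.40, 0.30))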
APA, Harvard, Vancouver, ISO, and other styles
47

Orr, Aline Pinto. "Effects of sample size, ability distribution, and the length of Markov Chain Monte Carlo burn-in chains on the estimation of item and testlet parameters." Thesis, 2011. http://hdl.handle.net/2152/ETD-UT-2011-05-2684.

Full text
Abstract:
Item Response Theory (IRT) models are the basis of modern educational measurement. In order to increase testing efficiency, modern tests make ample use of groups of questions associated with a single stimulus (testlets). This violates the IRT assumption of local independence. However, a set of measurement models, testlet response theory (TRT), has been developed to address such dependency issues. This study investigates the effects of varying sample sizes and Markov Chain Monte Carlo burn-in chain lengths on the accuracy of estimation of a TRT model's item and testlet parameters. The following outcome measures are examined: descriptive statistics, Pearson product-moment correlations between known and estimated parameters, and indices of measurement effectiveness for final parameter estimates.
APA, Harvard, Vancouver, ISO, and other styles
48

Popp, Eric C. "The effects on parameter estimation of sample size ratio, test length and trait correlation in a two-dimensional, two-parameter, compensatory item response model with dichotomous scoring." 2004. http://purl.galileo.usg.edu/uga%5Fetd/popp%5Feric%5Fc%5F200405%5Fphd.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Jang, Jia-Shin, and 張家鑫. "Estimating the number of species with small sample size." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/44525982626146281225.

Full text
Abstract:
Master's thesis, National Chung Hsing University, Department of Applied Mathematics, academic year 97. Many well-known methods are available for estimating the number of species in a plant community. However, these methods give reasonable estimates only with large sample sizes, which are often not attainable in real applications. Hwang & Shen (2008) proposed a new estimator that reduces bias when the sample size is small; nevertheless, its root mean square error performance is not entirely satisfactory. This study provides an improvement on Hwang & Shen (2008), which is confirmed by several real data sets.
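For context only, the sketch below computes the bias-corrected Chao1 lower-bound estimator, a familiar richness estimator built from the rare-species counts; it is not the Hwang & Shen (2008) estimator or the improvement proposed in the thesis, and the abundance vector is invented.

import numpy as np

def chao1(abundances):
    """Bias-corrected Chao1 lower-bound estimate of species richness."""
    abundances = np.asarray(abundances)
    s_obs = np.sum(abundances > 0)
    f1 = np.sum(abundances == 1)   # singletons
    f2 = np.sum(abundances == 2)   # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Small illustrative sample of species abundance counts.
print(chao1([5, 3, 1, 1, 1, 2, 1, 2, 4, 1]))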
APA, Harvard, Vancouver, ISO, and other styles
50

Chen, Phillip, and 陳仁滄. "The simulation study of estimating populoation size in tri- sample capture-recapture experiments." Thesis, 1994. http://ndltd.ncl.edu.tw/handle/60251023899604503859.

Full text
APA, Harvard, Vancouver, ISO, and other styles