
Journal articles on the topic 'Sample size approximation'


Consult the top 50 journal articles for your research on the topic 'Sample size approximation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Millar, Russell B., and Christopher D. Nottingham. "Improved approximations for estimation of size-transition probabilities within size-structured models." Canadian Journal of Fisheries and Aquatic Sciences 76, no. 8 (2019): 1305–13. http://dx.doi.org/10.1139/cjfas-2017-0444.

Full text
Abstract:
Modelling annual growth of individuals in a size-structured model requires calculation of the size-transition probabilities for moving from one size class to another. This requires evaluation of two-dimensional integrals when there is individual variability in growth. For computational simplicity, it is common to approximate the integrals by setting all individuals in a size class to the midsize of that class or by ignoring the individual variability. We develop a more accurate approximation that assumes a uniform distribution in size within each size class. The approximation is fast and hence feasible for Bayesian models in which the matrix of transition probabilities must be computed for each posterior sample. The improved accuracy of the new approximation is shown to hold over a diverse range of formulations for incremental growth. For the New Zealand Paua 5A (Haliotis iris) stock assessment model, it was found to reduce the average approximation error of the size-transition probabilities by 86% and 98% compared with the midpoint and deterministic growth approximations, respectively. Moreover, the midpoint and deterministic approximations inflated the estimated maximum sustainable yield by 6% and 8%, respectively, and the current biomass by almost 30% in comparison with the more accurate approximation.
APA, Harvard, Vancouver, ISO, and other styles
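For illustration, the midpoint and uniform-within-class approximations contrasted in this abstract can be computed directly. The sketch below assumes hypothetical length classes, a von Bertalanffy-style mean increment and a growth standard deviation; these are not the values used in the Paua 5A assessment.

```python
import numpy as np
from scipy.stats import norm

edges = np.arange(70.0, 151.0, 10.0)        # length-class edges (mm), illustrative
mids = 0.5 * (edges[:-1] + edges[1:])
sigma = 4.0                                  # assumed SD of individual growth variability

def mean_increment(length, L_inf=140.0, k=0.25):
    """Hypothetical von Bertalanffy-style expected annual increment."""
    return np.maximum((L_inf - length) * (1.0 - np.exp(-k)), 0.0)

def class_probs(x):
    """P(length after one year falls in each class | current length x)."""
    mu = x + mean_increment(x)
    cdf = norm.cdf(edges, loc=mu, scale=sigma)
    p = np.diff(cdf)
    p[-1] += 1.0 - cdf[-1]                   # lump growth beyond the last edge into the top class
    return p

# Midpoint approximation: all individuals in a class sit at its midpoint.
P_mid = np.vstack([class_probs(m) for m in mids])

# Uniform-within-class approximation: average class_probs over a grid inside each class.
P_unif = np.vstack([
    np.mean([class_probs(x) for x in np.linspace(lo, hi, 21)], axis=0)
    for lo, hi in zip(edges[:-1], edges[1:])
])

print(np.abs(P_mid - P_unif).max())          # where the two approximations differ most
```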
2

Schwertman, Neil C., and Margaret A. Owens. "Simple approximation of sample size for the bivariate normal." Computational Statistics & Data Analysis 8, no. 2 (1989): 201–7. http://dx.doi.org/10.1016/0167-9473(89)90007-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Ullah, Insha, Sudhir Paul, Zhenjie Hong, and You-Gan Wang. "Significance tests for analyzing gene expression data with small sample sizes." Bioinformatics 35, no. 20 (2019): 3996–4003. http://dx.doi.org/10.1093/bioinformatics/btz189.

Full text
Abstract:
Motivation: Under two biologically different conditions, we are often interested in identifying differentially expressed genes. It is usually the case that the assumption of equal variances on the two groups is violated for many genes where a large number of them are required to be filtered or ranked. In these cases, exact tests are unavailable and Welch's approximate test is the most reliable one. Welch's test involves two layers of approximations: approximating the distribution of the statistic by a t-distribution, which in turn depends on approximate degrees of freedom. This study attempts to improve upon Welch's approximate test by avoiding one layer of approximation. Results: We introduce a new distribution that generalizes the t-distribution and propose a Monte Carlo based test that uses only one layer of approximation for statistical inferences. Experimental results based on extensive simulation studies show that the Monte Carlo based tests enhance the statistical power and perform better than Welch's t-approximation, especially when the equal variance assumption is not met and the sample size of the sample with a larger variance is smaller. We analyzed two gene-expression datasets, namely the childhood acute lymphoblastic leukemia gene-expression dataset with 22,283 genes and the Golden Spike dataset produced by a controlled experiment with 13,966 genes. The new test identified additional genes of interest in both datasets. Some of these genes have been shown in the medical literature to play important roles. Availability and implementation: The R package mcBFtest is available on CRAN, and R scripts to reproduce all reported results are available at the GitHub repository https://github.com/iullah1980/MCTcodes. Supplementary information: Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
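The two layers of approximation referred to above, the t-distribution and its Welch–Satterthwaite degrees of freedom, can be written out in a few lines. The data here are simulated purely for illustration, and the result is checked against scipy's built-in Welch test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=8)     # smaller variance
y = rng.normal(0.5, 3.0, size=5)     # larger variance, smaller sample

vx, vy = x.var(ddof=1) / len(x), y.var(ddof=1) / len(y)
t = (x.mean() - y.mean()) / np.sqrt(vx + vy)                          # Welch's statistic
df = (vx + vy) ** 2 / (vx**2 / (len(x) - 1) + vy**2 / (len(y) - 1))   # approximate df
p = 2 * stats.t.sf(abs(t), df)                                        # t-distribution approximation

print(t, df, p)
print(stats.ttest_ind(x, y, equal_var=False))   # should agree with the values above
```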
4

Lin, Hung-Chin. "USING NORMAL APPROXIMATION ON TESTING AND DETERMINING SAMPLE SIZE FOR Cpk." Journal of the Chinese Institute of Industrial Engineers 23, no. 1 (2006): 1–11. http://dx.doi.org/10.1080/10170660609508991.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Birnbaum, David. "Who Is at Risk of What?" Infection Control & Hospital Epidemiology 20, no. 10 (1999): 706–7. http://dx.doi.org/10.1086/501570.

Full text
Abstract:
If you have calculated the sample size required for an employee survey or an observational study of departmental practices but found that the number of observations required is larger than the number of employees, chances are the error is due to use of approximation formulae. Many of us unknowingly were taught to use approximations that fail to include the finite population correction factor. Depending on the objective of a study and the proportion of a population sampled, it may be necessary to consider this correction factor in order to estimate standard error and sample size accurately.
APA, Harvard, Vancouver, ISO, and other styles
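A minimal sketch of the issue raised above, assuming the usual sample-size formula for estimating a proportion: without the finite population correction the required n can exceed a small workforce, while applying the correction brings it back below the population size.

```python
import math

def sample_size(p=0.5, e=0.05, z=1.96, N=None):
    """Sample size to estimate a proportion within +/- e at ~95% confidence."""
    n0 = z**2 * p * (1 - p) / e**2              # infinite-population approximation
    if N is None:
        return math.ceil(n0)
    return math.ceil(n0 / (1 + (n0 - 1) / N))   # with the finite population correction

print(sample_size())        # about 385 observations -- more than a 120-person department
print(sample_size(N=120))   # about 92 once the correction is applied
```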
6

Pagurova, V. I. "On the approximation accuracy for quantiles in a random-size sample." Moscow University Computational Mathematics and Cybernetics 32, no. 4 (2008): 214–21. http://dx.doi.org/10.3103/s0278641908040043.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Madden, L. V., and G. Hughes. "An Effective Sample Size for Predicting Plant Disease Incidence in a Spatial Hierarchy." Phytopathology® 89, no. 9 (1999): 770–81. http://dx.doi.org/10.1094/phyto.1999.89.9.770.

Full text
Abstract:
For aggregated or heterogeneous disease incidence, one can predict the proportion of sampling units diseased at a higher scale (e.g., plants) based on the proportion of diseased individuals and heterogeneity of diseased individuals at a lower scale (e.g., leaves) using a function derived from the beta-binomial distribution. Here, a simple approximation for the beta-binomial-based function is derived. This approximation has a functional form based on the binomial distribution, but with the number of individuals per sampling unit (n) replaced by a parameter (v) that has similar interpretation as, but is not the same as, the effective sample size (n_deff) often used in survey sampling. The value of v is inversely related to the degree of heterogeneity of disease and generally is intermediate between n_deff and n in magnitude. The choice of v was determined iteratively by finding a parameter value that allowed the zero term (probability that a sampling unit is disease free) of the binomial distribution to equal the zero term of the beta-binomial. The approximation function was successfully tested on observations of Eutypa dieback of grapes collected over several years and with simulated data. Unlike the beta-binomial-based function, the approximation can be rearranged to predict incidence at the lower scale from observed incidence data at the higher scale, making group sampling for heterogeneous data a more practical proposition.
APA, Harvard, Vancouver, ISO, and other styles
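The zero-term matching that defines v can be sketched directly, assuming the common mean/intraclass-correlation parameterisation of the beta-binomial (an assumption made for this illustration). Here the matching condition is solved on the log scale rather than iteratively, and v indeed falls between the classical effective sample size and n.

```python
import numpy as np

def betabinom_zero_term(n, p, rho):
    """P(a sampling unit of n individuals is disease free) under a beta-binomial
    with mean p and intraclass correlation rho (rho > 0)."""
    a = p * (1 - rho) / rho
    b = (1 - p) * (1 - rho) / rho
    k = np.arange(n)
    return np.prod((b + k) / (a + b + k))

def effective_v(n, p, rho):
    """v such that the binomial zero term (1 - p)**v equals the beta-binomial zero term."""
    return np.log(betabinom_zero_term(n, p, rho)) / np.log(1 - p)

n, p, rho = 10, 0.2, 0.2
n_deff = n / (1 + (n - 1) * rho)     # classical effective sample size
print(round(effective_v(n, p, rho), 2), round(n_deff, 2), n)   # v lies between n_deff and n
```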
8

Zhu, Hong, Song Zhang, and Chul Ahn. "Sample size considerations for split-mouth design." Statistical Methods in Medical Research 26, no. 6 (2015): 2543–51. http://dx.doi.org/10.1177/0962280215601137.

Full text
Abstract:
Split-mouth designs are frequently used in dental clinical research, where a mouth is divided into two or more experimental segments that are randomly assigned to different treatments. It has the distinct advantage of removing a lot of inter-subject variability from the estimated treatment effect. Methods of statistical analyses for split-mouth design have been well developed. However, little work is available on sample size consideration at the design phase of a split-mouth trial, although many researchers pointed out that the split-mouth design can only be more efficient than a parallel-group design when within-subject correlation coefficient is substantial. In this paper, we propose to use the generalized estimating equation (GEE) approach to assess treatment effect in split-mouth trials, accounting for correlations among observations. Closed-form sample size formulas are introduced for the split-mouth design with continuous and binary outcomes, assuming exchangeable and “nested exchangeable” correlation structures for outcomes from the same subject. The statistical inference is based on the large sample approximation under the GEE approach. Simulation studies are conducted to investigate the finite-sample performance of the GEE sample size formulas. A dental clinical trial example is presented for illustration.
APA, Harvard, Vancouver, ISO, and other styles
9

Sterling, Grigoriy, Pavel Prikhodko, Evgeny Burnaev, Mikhail Belyaev, and Stephane Grihon. "On Approximation of Reserve Factors Dependency on Loads for Composite Stiffened Panels." Advanced Materials Research 1016 (August 2014): 85–89. http://dx.doi.org/10.4028/www.scientific.net/amr.1016.85.

Full text
Abstract:
We present a two-level approach to building accurate approximations of the dependency of Reserve Factors on loads for composite stiffened panels. This dependency is a continuous, non-smooth function with plateau regions of complex form (i.e., regions where the function has zero gradient), defined on low-dimensional grids. The main problem that arises if one tries to construct a global approximation in such a case is the occurrence of the Gibbs effect (i.e., harmonic oscillations of the prediction) near the borders of the plateaux, which may significantly deteriorate approximation quality. A viable existing solution, approximation based on linear triangular interpolation, avoids the oscillations, but unlike the proposed approach it produces a model that is not smooth outside the plateau regions and generally requires a larger sample size to achieve the same approximation accuracy.
APA, Harvard, Vancouver, ISO, and other styles
10

Christoph, Gerd, and Vladimir V. Ulyanov. "Second Order Expansions for High-Dimension Low-Sample-Size Data Statistics in Random Setting." Mathematics 8, no. 7 (2020): 1151. http://dx.doi.org/10.3390/math8071151.

Full text
Abstract:
We consider high-dimension low-sample-size data taken from the standard multivariate normal distribution under the assumption that the dimension is a random variable. Second-order Chebyshev–Edgeworth expansions for the distributions of the angle between two sample observations and of the corresponding sample correlation coefficient are constructed with error bounds. Depending on the type of normalization, we get three different limit distributions: normal, Student's t-, or Laplace distributions. The paper continues the authors' studies on the approximation of statistics for random-size samples.
APA, Harvard, Vancouver, ISO, and other styles
11

Höglund, Thomas. "Bounds for the sample size to justify normal approximation of the confidence level." Annals of the Institute of Statistical Mathematics 43, no. 3 (1991): 565–78. http://dx.doi.org/10.1007/bf00053373.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

de Valpine, P., H. M. Bitter, M. P. S. Brown, and J. Heller. "A simulation-approximation approach to sample size planning for high-dimensional classification studies." Biostatistics 10, no. 3 (2009): 424–35. http://dx.doi.org/10.1093/biostatistics/kxp001.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Moslim, Nor Hafizah, Yong Zulina Zubairi, Abdul Ghapor Hussin, Siti Fatimah Hassan, and Rossita Mohamad Yunus. "On the approximation of the concentration parameter for von Mises distribution." Malaysian Journal of Fundamental and Applied Sciences 13, no. 4-1 (2017): 390–93. http://dx.doi.org/10.11113/mjfas.v13n4-1.807.

Full text
Abstract:
The von Mises distribution is the 'natural' analogue on the circle of the normal distribution on the real line and is widely used to describe circular variables. The distribution has two parameters, namely the mean direction and the concentration parameter, κ. Solutions for the parameters, however, cannot be derived in closed form. Noting the relationship of κ to the sample size, we examine the asymptotic normal behavior of the parameter. A simulation study is carried out and the Kolmogorov-Smirnov test is used to test the goodness of fit at three significance levels. The study suggests that as the sample size and the concentration parameter increase, the percentage of samples satisfying the normality assumption increases.
APA, Harvard, Vancouver, ISO, and other styles
14

Kiefer, Nicholas M., and Timothy J. Vogelsang. "HETEROSKEDASTICITY-AUTOCORRELATION ROBUST TESTING USING BANDWIDTH EQUAL TO SAMPLE SIZE." Econometric Theory 18, no. 6 (2002): 1350–66. http://dx.doi.org/10.1017/s026646660218604x.

Full text
Abstract:
Asymptotic theory for heteroskedasticity autocorrelation consistent (HAC) covariance matrix estimators requires the truncation lag, or bandwidth, to increase more slowly than the sample size. This paper considers an alternative approach covering the case with the asymptotic covariance matrix estimated by kernel methods with truncation lag equal to sample size. Although such estimators are inconsistent, valid tests (asymptotically pivotal) for regression parameters can be constructed. The limiting distributions explicitly capture the truncation lag and choice of kernel. A local asymptotic power analysis shows that the Bartlett kernel delivers the highest power within a group of popular kernels. Finite sample simulations suggest that, regardless of the kernel chosen, the null asymptotic approximation of the new tests is often more accurate than that for conventional HAC estimators and asymptotics. Finite sample results on power show that the new approach is competitive.
APA, Harvard, Vancouver, ISO, and other styles
15

Wang, Chuanmei, Suxiang He, and Haiying Wu. "An Implementable SAA Nonlinear Lagrange Algorithm for Constrained Minimax Stochastic Optimization Problems." Mathematical Problems in Engineering 2018 (December 9, 2018): 1–13. http://dx.doi.org/10.1155/2018/5498760.

Full text
Abstract:
This paper proposes an implementable SAA (sample average approximation) nonlinear Lagrange algorithm for the constrained minimax stochastic optimization problem, based on the sample average approximation method. In the algorithm, a computable nonlinear Lagrange function built from sample average approximations of the original functions is minimized, and the Lagrange multiplier is updated using the same sample average approximations. It is shown that, under some moderate assumptions, the solution sequences obtained by the algorithm for the subproblem converge to their true counterparts with probability one as the sample size approaches infinity. Finally, numerical experiments are carried out on some typical test problems, and the results preliminarily demonstrate that the proposed algorithm is promising.
APA, Harvard, Vancouver, ISO, and other styles
16

He, Suxiang, Yunyun Nie, and Xiaopeng Wang. "A Nonlinear Lagrange Algorithm for Stochastic Minimax Problems Based on Sample Average Approximation Method." Journal of Applied Mathematics 2014 (2014): 1–8. http://dx.doi.org/10.1155/2014/497262.

Full text
Abstract:
An implementable nonlinear Lagrange algorithm for stochastic minimax problems, based on the sample average approximation method, is presented in this paper; its second step minimizes a nonlinear Lagrange function built from sample average approximations of the original functions, and the sample average approximation of the Lagrange multiplier is adopted. Under a set of mild assumptions, it is proven that the sequences of solutions and multipliers obtained by the proposed algorithm converge to the Kuhn-Tucker pair of the original problem with probability one as the sample size increases. Finally, numerical experiments on five test examples are performed, and the results indicate that the algorithm is promising.
APA, Harvard, Vancouver, ISO, and other styles
17

Shi, Dexin, Taehun Lee, and Alberto Maydeu-Olivares. "Understanding the Model Size Effect on SEM Fit Indices." Educational and Psychological Measurement 79, no. 2 (2018): 310–34. http://dx.doi.org/10.1177/0013164418783530.

Full text
Abstract:
This study investigated the effect the number of observed variables ( p) has on three structural equation modeling indices: the comparative fit index (CFI), the Tucker–Lewis index (TLI), and the root mean square error of approximation (RMSEA). The behaviors of the population fit indices and their sample estimates were compared under various conditions created by manipulating the number of observed variables, the types of model misspecification, the sample size, and the magnitude of factor loadings. The results showed that the effect of p on the population CFI and TLI depended on the type of specification error, whereas a higher p was associated with lower values of the population RMSEA regardless of the type of model misspecification. In finite samples, all three fit indices tended to yield estimates that suggested a worse fit than their population counterparts, which was more pronounced with a smaller sample size, higher p, and lower factor loading.
APA, Harvard, Vancouver, ISO, and other styles
18

Tillier, E. R., and G. B. Golding. "A sampling theory of selectively neutral alleles in a subdivided population." Genetics 119, no. 3 (1988): 721–29. http://dx.doi.org/10.1093/genetics/119.3.721.

Full text
Abstract:
Abstract Ewens' sampling distribution is investigated for a structured population. Samples are assumed to be taken from a single subpopulation that exchanges migrants with other subpopulations. A complete description of the probability distribution for such samples is not a practical possibility but an equilibrium approximation can be found. This approximation extracts the information necessary for constructing a continuous approximation to the complete distribution using known values of the distribution and its derivatives in randomly mating populations. It is shown that this approximation is as complete a description of a single biologically realistic subpopulation as is possible given standard uncertainties about the actual size of the migration rates, relative sizes of each of the subpopulations and other factors that might affect the genetic structure of a subpopulation. Any further information must be gained at the expense of generality. This approximation is used to investigate the effect of population subdivision on Watterson's test of neutrality. It is known that the infinite allele, sample distribution is independent of mutation rate when made conditional on the number of alleles in the sample. It is shown that the conditional, infinite allele, sample distribution from this approximation is also independent of population structure and hence Watterson's test is still approximately valid for subdivided populations.
APA, Harvard, Vancouver, ISO, and other styles
19

Tsutakawa, Robert K., and Michael J. Soltys. "Approximation for Bayesian Ability Estimation." Journal of Educational Statistics 13, no. 2 (1988): 117–30. http://dx.doi.org/10.3102/10769986013002117.

Full text
Abstract:
An approximation is proposed for the posterior mean and standard deviation of the ability parameter in an item response model. The procedure assumes that approximations to the posterior mean and covariance matrix of item parameters are available. It is based on the posterior mean of a Taylor series approximation to the posterior mean conditional on the item parameters. The method is illustrated for the two-parameter logistic model using data from an ACT math test with 39 items. A numerical comparison with the empirical Bayes method using n = 400 examinees shows that the point estimates are very similar but the standard deviations under empirical Bayes are about 2% smaller than those under Bayes. Moreover, when the sample size is decreased to n = 100, the standard deviation under Bayes is shown to increase by 14% in some cases.
APA, Harvard, Vancouver, ISO, and other styles
20

Kemp, Gordon C. R. "THE BEHAVIOR OF FORECAST ERRORS FROM A NEARLY INTEGRATED AR(1) MODEL AS BOTH SAMPLE SIZE AND FORECAST HORIZON BECOME LARGE." Econometric Theory 15, no. 2 (1999): 238–56. http://dx.doi.org/10.1017/s026646669915206x.

Full text
Abstract:
We develop asymptotic approximations to the distribution of forecast errors from an estimated AR(1) model with no drift when the true process is nearly I(1) and both the forecast horizon and the sample size are allowed to increase at the same rate. We find that the forecast errors are the sums of two components that are asymptotically independent. The first is asymptotically normal whereas the second is asymptotically nonnormal. This throws doubt on the suitability of a normal approximation to the forecast error distribution. We then perform a Monte Carlo study to quantify further the effects on the forecast errors of sampling variability in the parameter estimates as we allow both forecast horizon and sample size to increase.
APA, Harvard, Vancouver, ISO, and other styles
21

Hulle, Marc M. Van. "Edgeworth Approximation of Multivariate Differential Entropy." Neural Computation 17, no. 9 (2005): 1903–10. http://dx.doi.org/10.1162/0899766054323026.

Full text
Abstract:
We develop the general, multivariate case of the Edgeworth approximation of differential entropy and show that it can be more accurate than the nearest-neighbor method in the multivariate case and that it scales better with sample size. Furthermore, we introduce mutual information estimation as an application.
APA, Harvard, Vancouver, ISO, and other styles
22

Hanin, Leonid. "Cavalier Use of Inferential Statistics Is a Major Source of False and Irreproducible Scientific Findings." Mathematics 9, no. 6 (2021): 603. http://dx.doi.org/10.3390/math9060603.

Full text
Abstract:
I uncover previously underappreciated systematic sources of false and irreproducible results in natural, biomedical and social sciences that are rooted in statistical methodology. They include the inevitably occurring deviations from basic assumptions behind statistical analyses and the use of various approximations. I show through a number of examples that (a) arbitrarily small deviations from distributional homogeneity can lead to arbitrarily large deviations in the outcomes of statistical analyses; (b) samples of random size may violate the Law of Large Numbers and thus are generally unsuitable for conventional statistical inference; (c) the same is true, in particular, when random sample size and observations are stochastically dependent; and (d) the use of the Gaussian approximation based on the Central Limit Theorem has dramatic implications for p-values and statistical significance essentially making pursuit of small significance levels and p-values for a fixed sample size meaningless. The latter is proven rigorously in the case of one-sided Z test. This article could serve as a cautionary guidance to scientists and practitioners employing statistical methods in their work.
APA, Harvard, Vancouver, ISO, and other styles
23

Li, Yuanyuan, and Dietmar Bauer. "Modeling I(2) Processes Using Vector Autoregressions Where the Lag Length Increases with the Sample Size." Econometrics 8, no. 3 (2020): 38. http://dx.doi.org/10.3390/econometrics8030038.

Full text
Abstract:
In this paper the theory on the estimation of vector autoregressive (VAR) models for I(2) processes is extended to the case of long VAR approximation of more general processes. Hereby the order of the autoregression is allowed to tend to infinity at a certain rate depending on the sample size. We deal with unrestricted OLS estimators (in the model formulated in levels as well as in vector error correction form) as well as with two stage estimation (2SI2) in the vector error correction model (VECM) formulation. Our main results are analogous to the I(1) case: We show that the long VAR approximation leads to consistent estimates of the long and short run dynamics. Furthermore, tests on the autoregressive coefficients follow standard asymptotics. The pseudo likelihood ratio tests on the cointegrating ranks (using the Gaussian likelihood) used in the 2SI2 algorithm show under the null hypothesis the same distributions as in the case of data generating processes following finite order VARs. The same holds true for the asymptotic distribution of the long run dynamics both in the unrestricted VECM estimation and the reduced rank regression in the 2SI2 algorithm. Building on these results we show that if the data is generated by an invertible VARMA process, the VAR approximation can be used in order to derive a consistent initial estimator for subsequent pseudo likelihood optimization in the VARMA model.
APA, Harvard, Vancouver, ISO, and other styles
24

Fowler, Robert L. "Estimating the Standardized Mean Difference in Intervention Studies." Journal of Educational Statistics 13, no. 4 (1988): 337–50. http://dx.doi.org/10.3102/10769986013004337.

Full text
Abstract:
Methods for approximating confidence intervals for the population standardized mean difference, δ, were evaluated analytically in small samples. A procedure based on the large sample approximation of the distribution of the sample standardized mean difference d was quite accurate over most of the range 0.25 ≤ δ ≤ 1.5 and 6 ≤ N ≤ 40 examined. Use of Hedges's adjustment to d yielded confidence intervals that were slightly conservative, whereas the unadjusted d produced intervals that were somewhat liberal. An empirically determined simple linear adjustment to d demonstrated the most consistent precision when used to construct confidence intervals for δ, having a maximum error of less than 2% of the nominal level of confidence over the effect size range of most interest to behavioral scientists.
APA, Harvard, Vancouver, ISO, and other styles
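For reference, the large-sample approximation evaluated above has a simple closed form. The sketch below uses the standard variance approximation for d together with Hedges's bias-correction factor; the summary statistics are made up for the example.

```python
import numpy as np
from scipy.stats import norm

def smd_ci(m1, m2, s1, s2, n1, n2, conf=0.95):
    """Large-sample approximate confidence interval for the standardized mean difference."""
    sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))  # pooled SD
    d = (m1 - m2) / sp
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))   # large-sample variance of d
    z = norm.ppf(0.5 + conf / 2)
    return d, (d - z * np.sqrt(var_d), d + z * np.sqrt(var_d))

def hedges_g(d, n1, n2):
    """Hedges's small-sample bias correction applied to d."""
    return d * (1 - 3 / (4 * (n1 + n2 - 2) - 1))

d, ci = smd_ci(10.2, 9.0, 2.0, 2.2, 20, 20)   # illustrative group means, SDs, sizes
print(round(d, 3), tuple(round(c, 3) for c in ci), round(hedges_g(d, 20, 20), 3))
```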
25

Kaufmann, E., and R. D. Reiss. "Poisson approximation of intermediate empirical processes." Journal of Applied Probability 29, no. 4 (1992): 825–37. http://dx.doi.org/10.2307/3214715.

Full text
Abstract:
We investigate the asymptotic behaviour of empirical processes truncated outside an interval about the (1 – s(n)/n)-quantile where s(n) → ∞ and s(n)/n → 0 as the sample size n tends to ∞. It is shown that extreme value (Poisson) processes and, alternatively, the homogeneous Poisson process may serve as approximations if certain von Mises conditions hold.
APA, Harvard, Vancouver, ISO, and other styles
26

Kaufmann, E., and R. D. Reiss. "Poisson approximation of intermediate empirical processes." Journal of Applied Probability 29, no. 04 (1992): 825–37. http://dx.doi.org/10.1017/s0021900200043709.

Full text
Abstract:
We investigate the asymptotic behaviour of empirical processes truncated outside an interval about the (1 – s(n)/n)-quantile where s(n) → ∞ and s(n)/n → 0 as the sample size n tends to ∞. It is shown that extreme value (Poisson) processes and, alternatively, the homogeneous Poisson process may serve as approximations if certain von Mises conditions hold.
APA, Harvard, Vancouver, ISO, and other styles
27

Kobayashi, Ken, Naoki Hamada, Akiyoshi Sannai, Akinori Tanaka, Kenichi Bannai, and Masashi Sugiyama. "Bézier Simplex Fitting: Describing Pareto Fronts of Simplicial Problems with Small Samples in Multi-Objective Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 2304–13. http://dx.doi.org/10.1609/aaai.v33i01.33012304.

Full text
Abstract:
Multi-objective optimization problems require simultaneously optimizing two or more objective functions. Many studies have reported that the solution set of an M-objective optimization problem often forms an (M − 1)-dimensional topological simplex (a curved line for M = 2, a curved triangle for M = 3, a curved tetrahedron for M = 4, etc.). Since the dimensionality of the solution set increases as the number of objectives grows, an exponentially large sample size is needed to cover the solution set. To reduce the required sample size, this paper proposes a Bézier simplex model and its fitting algorithm. These techniques can exploit the simplex structure of the solution set and decompose a high-dimensional surface fitting task into a sequence of low-dimensional ones. An approximation theorem of Bézier simplices is proven. Numerical experiments with synthetic and real-world optimization problems demonstrate that the proposed method achieves an accurate approximation of high-dimensional solution sets with small samples. In practice, such an approximation will be conducted in the postoptimization process and enable a better trade-off analysis.
APA, Harvard, Vancouver, ISO, and other styles
28

Zhao, Ji, and Deyu Meng. "FastMMD: Ensemble of Circular Discrepancy for Efficient Two-Sample Test." Neural Computation 27, no. 6 (2015): 1345–72. http://dx.doi.org/10.1162/neco_a_00732.

Full text
Abstract:
The maximum mean discrepancy (MMD) is a recently proposed test statistic for the two-sample test. Its quadratic time complexity, however, greatly hampers its availability to large-scale applications. To accelerate the MMD calculation, in this study we propose an efficient method called FastMMD. The core idea of FastMMD is to equivalently transform the MMD with shift-invariant kernels into the amplitude expectation of a linear combination of sinusoid components based on Bochner’s theorem and Fourier transform (Rahimi & Recht, 2007 ). Taking advantage of sampling the Fourier transform, FastMMD decreases the time complexity for MMD calculation from [Formula: see text] to [Formula: see text], where N and d are the size and dimension of the sample set, respectively. Here, L is the number of basis functions for approximating kernels that determines the approximation accuracy. For kernels that are spherically invariant, the computation can be further accelerated to [Formula: see text] by using the Fastfood technique (Le, Sarlós, & Smola, 2013 ). The uniform convergence of our method has also been theoretically proved in both unbiased and biased estimates. We also provide a geometric explanation for our method, ensemble of circular discrepancy, which helps us understand the insight of MMD and we hope will lead to more extensive metrics for assessing the two-sample test task. Experimental results substantiate that the accuracy of FastMMD is similar to that of MMD and with faster computation and lower variance than existing MMD approximation methods.
APA, Harvard, Vancouver, ISO, and other styles
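The random-feature idea underlying this kind of acceleration (Bochner's theorem plus sampled Fourier features) can be sketched as follows for a Gaussian kernel; this is a generic illustration of linear-time MMD approximation, not the exact FastMMD algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def rff_mmd(X, Y, gamma=0.5, L=256):
    """Approximate MMD for the Gaussian kernel exp(-gamma * ||x - y||^2) using L random
    Fourier features, reducing the cost from quadratic to linear in the sample size."""
    d = X.shape[1]
    W = rng.normal(0.0, np.sqrt(2 * gamma), size=(d, L))    # samples from the kernel's spectral density
    b = rng.uniform(0.0, 2 * np.pi, size=L)
    phi = lambda Z: np.sqrt(2.0 / L) * np.cos(Z @ W + b)    # random feature map
    return np.linalg.norm(phi(X).mean(axis=0) - phi(Y).mean(axis=0))

X = rng.normal(0.0, 1.0, size=(1000, 3))
Y = rng.normal(0.3, 1.0, size=(1000, 3))
print(rff_mmd(X[:500], X[500:]))   # same distribution: close to zero
print(rff_mmd(X, Y))               # shifted mean: clearly positive
```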
29

Oderwald, Richard G. "Augmenting inventories with basal area points to achieve desired precision." Canadian Journal of Forest Research 33, no. 7 (2003): 1208–10. http://dx.doi.org/10.1139/x03-046.

Full text
Abstract:
An approximation to the variances of the regression and mean of ratios estimators for double sampling is presented that can be used to determine how many basal area points are needed to augment an existing inventory to achieve a desired precision. The approximation may also be used to determine the effective regular sample size to achieve equal precision to a double sample.
APA, Harvard, Vancouver, ISO, and other styles
30

Darroch, J. N., M. Jirina, and T. P. Speed. "Sampling Without Replacement: Approximation to the Probability Distribution." Journal of the Australian Mathematical Society. Series A. Pure Mathematics and Statistics 44, no. 2 (1988): 197–213. http://dx.doi.org/10.1017/s1446788700029785.

Full text
Abstract:
Let P be the probability distribution of a sample without replacement of size n from a finite population represented by the set N = {1, 2, …, N}. For each r = 0, 1, …, an approximation P_r is described such that the uniform norm ‖P − P_r‖ is of order (n²/N)^(r+1) if n²/N → 0. The approximation P_r is a linear combination of uniform probability product-measures concentrated on certain subspaces of the sample space N^n.
APA, Harvard, Vancouver, ISO, and other styles
31

Bonett, Douglas G., and Robert M. Price. "Inferential Methods for the Tetrachoric Correlation Coefficient." Journal of Educational and Behavioral Statistics 30, no. 2 (2005): 213–25. http://dx.doi.org/10.3102/10769986030002213.

Full text
Abstract:
The tetrachoric correlation describes the linear relation between two continuous variables that have each been measured on a dichotomous scale. The treatment of the point estimate, standard error, interval estimate, and sample size requirement for the tetrachoric correlation is cursory and incomplete in modern psychometric and behavioral statistics texts. A new and simple method of accurately approximating the tetrachoric correlation is introduced. The tetrachoric approximation is then used to derive a simple standard error, confidence interval, and sample size planning formula. The new confidence interval is shown to perform far better than the confidence interval computed by SAS. A method to improve the SAS confidence interval is proposed. All of the new results are computationally simple and are ideally suited for textbook and classroom presentations.
APA, Harvard, Vancouver, ISO, and other styles
32

Durbin, J. "Approximate distributions of Student's t-statistics for autoregressive coefficients calculated from regression residuals." Journal of Applied Probability 23, A (1986): 173–85. http://dx.doi.org/10.2307/3214351.

Full text
Abstract:
We consider a multiple regression model in which the regressors are Fourier cosine vectors. These regressors are intended as approximations to 'slowly changing' regressors of the kind often found in time series regression applications. The errors in the model are assumed to be generated by a special type of autoregressive model defined so that the regressors are eigenvectors of the quadratic forms occurring in the exponent of the probability density of the errors. This autoregression is intended as an approximation to the usual stationary autoregression. Both approximations are adopted for the sake of mathematical convenience. Student's t-statistics are constructed for the autoregressive coefficients in a manner analogous to ordinary regression. It is shown that these statistics are distributed as Student's t to the first order of approximation, that is, with errors in the density of order T^(−1/2), where T is the sample size, while the squares of the statistics are distributed as the square of Student's t to the second order of approximation, that is, with errors in the density of order T^(−1).
APA, Harvard, Vancouver, ISO, and other styles
33

Durbin, J. "Approximate distributions of Student's t-statistics for autoregressive coefficients calculated from regression residuals." Journal of Applied Probability 23, A (1986): 173–85. http://dx.doi.org/10.1017/s0021900200117061.

Full text
Abstract:
We consider a multiple regression model in which the regressors are Fourier cosine vectors. These regressors are intended as approximations to 'slowly changing' regressors of the kind often found in time series regression applications. The errors in the model are assumed to be generated by a special type of autoregressive model defined so that the regressors are eigenvectors of the quadratic forms occurring in the exponent of the probability density of the errors. This autoregression is intended as an approximation to the usual stationary autoregression. Both approximations are adopted for the sake of mathematical convenience. Student's t-statistics are constructed for the autoregressive coefficients in a manner analogous to ordinary regression. It is shown that these statistics are distributed as Student's t to the first order of approximation, that is, with errors in the density of order T^(−1/2), where T is the sample size, while the squares of the statistics are distributed as the square of Student's t to the second order of approximation, that is, with errors in the density of order T^(−1).
APA, Harvard, Vancouver, ISO, and other styles
34

Houshmand, Ali A., and Srinivasarao Panganamamula. "An analytical approximation and a neural network model for optimal sample size in vendor selection." Journal of Statistical Computation and Simulation 53, no. 1-2 (1995): 65–78. http://dx.doi.org/10.1080/00949659508811696.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Luzin, V. "Optimization of Texture Measurements. IV. The Influence of the Grain-Size Distribution on the Quality of Texture Measurements." Textures and Microstructures 31, no. 3 (1999): 177–86. http://dx.doi.org/10.1155/tsm.31.177.

Full text
Abstract:
Texture experiments deal with real samples, which consist of a finite number of crystallites of different sizes. Because of this, the main sample-induced statistics (grain statistics and grain-size statistics) have an essential influence on the quality of experimental data. This article quantitatively analyzes the dependence of the integral error on the parameters of these statistics and on the goodness of the approximation.
APA, Harvard, Vancouver, ISO, and other styles
36

SUN, HAILIN, HUIFU XU, and YONG WANG. "A SMOOTHING PENALIZED SAMPLE AVERAGE APPROXIMATION METHOD FOR STOCHASTIC PROGRAMS WITH SECOND-ORDER STOCHASTIC DOMINANCE CONSTRAINTS." Asia-Pacific Journal of Operational Research 30, no. 03 (2013): 1340002. http://dx.doi.org/10.1142/s0217595913400022.

Full text
Abstract:
In this paper, we propose a smoothing penalized sample average approximation (SAA) method for solving a stochastic minimization problem with second-order dominance constraints. The basic idea is to use the sample average to approximate the expected values of the underlying random functions and then reformulate the discretized problem as an ordinary nonlinear programming problem with a finite number of constraints. An exact penalty function method is proposed to deal with the latter, and an elementary smoothing technique is used to tackle the nonsmoothness of the plus function and the exact penalty function. We investigate the convergence of the optimal value obtained from solving the smoothed penalized sample average approximation problem as the sample size increases and show that, with probability approaching one at an exponential rate as the sample size increases, the optimal value converges to its true counterpart. Some preliminary numerical results are reported.
APA, Harvard, Vancouver, ISO, and other styles
37

XU, HUIFU. "SAMPLE AVERAGE APPROXIMATION METHODS FOR A CLASS OF STOCHASTIC VARIATIONAL INEQUALITY PROBLEMS." Asia-Pacific Journal of Operational Research 27, no. 01 (2010): 103–19. http://dx.doi.org/10.1142/s0217595910002569.

Full text
Abstract:
In this paper we apply the well-known sample average approximation (SAA) method to solve a class of stochastic variational inequality problems (SVIPs). We investigate the existence and convergence of a solution to the sample average approximated SVIP. Under some moderate conditions, we show that the sample average approximated SVIP has a solution with probability one and that, with probability approaching one exponentially fast as the sample size increases, the solution converges to its true counterpart. Finally, we apply the existence and convergence results to the SAA method for solving a class of stochastic nonlinear complementarity problems and stochastic programs with stochastic constraints.
APA, Harvard, Vancouver, ISO, and other styles
38

Soshko, Oksana. "Inventory Management in Multi Echelon Supply Chain using Sample Average Approximation." Scientific Journal of Riga Technical University. Computer Sciences 39, no. 1 (2009): 45–51. http://dx.doi.org/10.2478/v10143-010-0006-x.

Full text
Abstract:
An optimization model of a multi-echelon supply chain is presented in this paper. The decisions to be made are the amounts of beer to be ordered in each echelon of the supply chain over a time horizon of one year. Since the demand of the end customer is stochastic and represented by means of scenarios, the problem is solved using the sample average approximation method. This method uses only a subset of the scenarios, randomly sampled according to the distribution over scenarios, to represent the full scenario space. An important theoretical justification for this method is that, as the number of scenarios sampled increases, the solution to the approximate problem converges to an optimal solution in the expected sense. Computational results are presented for two cases: first the target level and then the order size is chosen as the decision variable of the problem. The target-level strategy is based on holding inventory at each echelon, whereas the order strategy is based on determining an optimal order quantity that is independent of the scenarios. The target-level strategy provides high service at low cost but is less realistic under uncertain demand than the order strategy. Practical experiments on finding the optimal SAA parameters are presented in the paper, along with an analysis of their impact on solution quality.
APA, Harvard, Vancouver, ISO, and other styles
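The sample average approximation step described above can be sketched on a deliberately simplified, single-echelon (newsvendor-style) stand-in for the beer-ordering problem; the prices and the demand distribution are assumptions made purely for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

price, cost, salvage = 5.0, 3.0, 1.0                        # assumed unit economics
scenarios = rng.lognormal(mean=4.0, sigma=0.4, size=2000)   # sampled demand scenarios

def neg_profit(q, demand):
    """Negative profit of ordering q units against a vector of demand scenarios."""
    sold = np.minimum(q, demand)
    return -(price * sold + salvage * np.maximum(q - demand, 0.0) - cost * q)

# SAA: replace the expectation by the average over the sampled scenarios and minimize.
grid = np.arange(0, 201)
saa_objective = np.array([neg_profit(q, scenarios).mean() for q in grid])
q_star = grid[saa_objective.argmin()]
print(q_star)   # approaches the true optimal order quantity as the scenario count grows
```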
39

Ramsey, Philip H., and Patricia P. Ramsey. "Evaluating the Normal Approximation to the Binomial Test." Journal of Educational Statistics 13, no. 2 (1988): 173–82. http://dx.doi.org/10.3102/10769986013002173.

Full text
Abstract:
The normal approximation to the binomial test with and without a continuity correction is evaluated in terms of control of Type I errors and power. The normal approximations are evaluated as robust for a given sample size, N, and at a given level α if the true Type I error rate never exceeds 1.5α. The uncorrected normal test is found to be less robust than is implied by the currently applied guidelines. The most stringent currently used guideline of requiring σ² ≥ 10 is adequate at α = .05 but must be increased to σ² ≥ 35 at α = .01. The corrected test is shown to be robust but not conservative. Both tests are shown to have substantial power loss in comparison to the exact binomial test.
APA, Harvard, Vancouver, ISO, and other styles
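A short sketch of the comparison evaluated above: the exact upper-tail binomial p-value against the uncorrected and continuity-corrected normal approximations, where σ² = Np₀(1 − p₀) is the quantity appearing in the guideline. The sample sizes and counts are illustrative.

```python
import numpy as np
from scipy.stats import binom, norm

def upper_tail_p_values(k, n, p0=0.5):
    """P(X >= k) under Binomial(n, p0): exact, normal, and continuity-corrected normal."""
    exact = binom.sf(k - 1, n, p0)
    mu, sd = n * p0, np.sqrt(n * p0 * (1 - p0))     # sd**2 is the sigma^2 in the guideline
    uncorrected = norm.sf((k - mu) / sd)
    corrected = norm.sf((k - 0.5 - mu) / sd)        # with continuity correction
    return exact, uncorrected, corrected

print(upper_tail_p_values(k=15, n=20))   # sigma^2 = 5: the uncorrected test is noticeably off
print(upper_tail_p_values(k=60, n=80))   # sigma^2 = 20: both approximations are much closer
```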
40

Kabán, Ata. "Sufficient ensemble size for random matrix theory-based handling of singular covariance matrices." Analysis and Applications 18, no. 05 (2020): 929–50. http://dx.doi.org/10.1142/s0219530520400072.

Full text
Abstract:
Singular covariance matrices are frequently encountered in both machine learning and optimization problems, most commonly due to high dimensionality of data and insufficient sample sizes. Among many methods of regularization, here we focus on a relatively recent random matrix-theoretic approach, the idea of which is to create well-conditioned approximations of a singular covariance matrix and its inverse by taking the expectation of its random projections. We are interested in the error of a Monte Carlo implementation of this approach, which allows subsequent parallel processing in low dimensions in practice. We find that [Formula: see text] random projections, where [Formula: see text] is the size of the original matrix, are sufficient for the Monte Carlo error to become negligible, in the sense of expected spectral norm difference, for both covariance and inverse covariance approximation, in the latter case under mild assumptions.
APA, Harvard, Vancouver, ISO, and other styles
41

De Santis, Fulvio, and Stefania Gubbiotti. "Sample Size Requirements for Calibrated Approximate Credible Intervals for Proportions in Clinical Trials." International Journal of Environmental Research and Public Health 18, no. 2 (2021): 595. http://dx.doi.org/10.3390/ijerph18020595.

Full text
Abstract:
In Bayesian analysis of clinical trials data, credible intervals are widely used for inference on unknown parameters of interest, such as treatment effects or differences in treatment effects. Highest Posterior Density (HPD) sets are often used because they guarantee the shortest length. In most standard problems, closed-form expressions for exact HPD intervals do not exist, but they are available for intervals based on the normal approximation of the posterior distribution. For small sample sizes, approximate intervals may not be calibrated in terms of posterior probability, but for increasing sample sizes their posterior probability tends to the correct credible level and they become closer and closer to exact sets. The article proposes a predictive analysis to select appropriate sample sizes needed to have approximate intervals calibrated at a pre-specified level. Examples are given for interval estimation of proportions and log-odds.
APA, Harvard, Vancouver, ISO, and other styles
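A minimal sketch of the approximation whose calibration is studied above, assuming a conjugate Beta prior for a proportion: the normal-approximation credible interval is compared with the exact equal-tailed interval at a small and a larger sample size.

```python
from scipy.stats import beta, norm

def approx_and_exact_ci(x, n, a=1.0, b=1.0, level=0.95):
    """Beta(a + x, b + n - x) posterior for a proportion: normal-approximation interval
    versus the exact equal-tailed credible interval."""
    a_post, b_post = a + x, b + n - x
    mean = a_post / (a_post + b_post)
    sd = (a_post * b_post / ((a_post + b_post)**2 * (a_post + b_post + 1))) ** 0.5
    z = norm.ppf(0.5 + level / 2)
    approx = (mean - z * sd, mean + z * sd)
    exact = (beta.ppf((1 - level) / 2, a_post, b_post),
             beta.ppf((1 + level) / 2, a_post, b_post))
    return approx, exact

print(approx_and_exact_ci(x=6, n=20))     # small n: the two intervals visibly disagree
print(approx_and_exact_ci(x=60, n=200))   # larger n: the approximation is nearly calibrated
```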
42

Yeung, Dit-Yan, Hong Chang, and Guang Dai. "A Scalable Kernel-Based Semisupervised Metric Learning Algorithm with Out-of-Sample Generalization Ability." Neural Computation 20, no. 11 (2008): 2839–61. http://dx.doi.org/10.1162/neco.2008.05-07-528.

Full text
Abstract:
In recent years, metric learning in the semisupervised setting has aroused a lot of research interest. One type of semisupervised metric learning utilizes supervisory information in the form of pairwise similarity or dissimilarity constraints. However, most methods proposed so far are either limited to linear metric learning or unable to scale well with the data set size. In this letter, we propose a nonlinear metric learning method based on the kernel approach. By applying low-rank approximation to the kernel matrix, our method can handle significantly larger data sets. Moreover, our low-rank approximation scheme can naturally lead to out-of-sample generalization. Experiments performed on both artificial and real-world data show very promising results.
APA, Harvard, Vancouver, ISO, and other styles
43

Mohamad Zaidi, Umi Zalilah, A. R. Bushroa, Reza Rahbari Ghahnavyeh, and Reza Mahmoodian. "Crystallite size and microstrain: XRD line broadening analysis of AgSiN thin films." Pigment & Resin Technology 48, no. 6 (2019): 473–80. http://dx.doi.org/10.1108/prt-03-2018-0026.

Full text
Abstract:
Purpose: This paper aims to determine the crystallite size and microstrain values of AgSiN thin films using a potential approach called the approximation method. This method can be used as a replacement for other determination methods such as the Williamson-Hall (W-H) plot and Warren-Averbach analysis. Design/methodology/approach: Monolayer AgSiN thin films on a Ti6Al4V alloy were fabricated using the magnetron sputtering technique. To evaluate the crystallite size and microstrain values, the thin films were deposited under different bias voltages (−75, −150 and −200 V). X-ray diffraction (XRD) broadening profiles, together with the approximation method, were used to determine the crystallite size and microstrain values. The reliability of the method was verified by comparing it with scanning electron microscopy images and the W-H plot method. The second parameter, the microstrain, was used to estimate the residual stress present in the thin films. The thin films are further discussed by relating the residual stress to the adhesion strength and the thickness of the films. Findings: The XRD approximation method results revealed that the crystallite size values obtained from the method were in good agreement with the Scherrer formula and the W-H method. Meanwhile, the calculated residual stresses of the thin films correlated well with the scratch adhesion critical loads, with the lowest residual stress noted for the sample with the lowest microstrain and the greatest thickness among the three samples. Practical implications: The fabricated thin films are intended to be used in antibacterial applications. Originality/value: To the best of the authors' knowledge from the literature review, there are no reports on depositing AgSiN on Ti6Al4V alloy via magnetron sputtering to elucidate the crystallite size and microstrain properties using the approximation method.
APA, Harvard, Vancouver, ISO, and other styles
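For orientation, the two reference methods named above, the Scherrer formula and the Williamson-Hall plot, reduce to a few lines; the peak positions and integral breadths below are made-up values for illustration, not the AgSiN measurements.

```python
import numpy as np

K, lam = 0.9, 0.15406                                    # Scherrer constant, Cu K-alpha (nm)
two_theta = np.array([38.1, 44.3, 64.4, 77.4])           # peak positions (degrees), illustrative
beta_rad = np.array([0.0040, 0.0046, 0.0061, 0.0075])    # instrument-corrected breadths (rad)

theta = np.radians(two_theta / 2)

# Williamson-Hall: beta*cos(theta) = K*lambda/D + 4*epsilon*sin(theta)
slope, intercept = np.polyfit(4 * np.sin(theta), beta_rad * np.cos(theta), 1)
size_wh = K * lam / intercept        # crystallite size D (nm) from the intercept
strain_wh = slope                    # microstrain epsilon from the slope

# Single-peak Scherrer sizes for comparison (strain broadening not separated out)
size_scherrer = K * lam / (beta_rad * np.cos(theta))

print(round(size_wh, 1), f"{strain_wh:.2e}", np.round(size_scherrer, 1))
```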
44

Xu, Mengyu, Danna Zhang, and Wei Biao Wu. "Pearson’s chi-squared statistics: approximation theory and beyond." Biometrika 106, no. 3 (2019): 716–23. http://dx.doi.org/10.1093/biomet/asz020.

Full text
Abstract:
Summary We establish an approximation theory for Pearson’s chi-squared statistics in situations where the number of cells is large, by using a high-dimensional central limit theorem for quadratic forms of random vectors. Our high-dimensional central limit theorem is proved under Lyapunov-type conditions that involve a delicate interplay between the dimension, the sample size, and the moment conditions. We propose a modified chi-squared statistic and introduce an adjusted degrees of freedom. A simulation study shows that the modified statistic outperforms Pearson’s chi-squared statistic in terms of both size accuracy and power. Our procedure is applied to the construction of a goodness-of-fit test for Rutherford’s alpha-particle data.
APA, Harvard, Vancouver, ISO, and other styles
45

Venkatakrishnan, S. V., Jeffrey Donatelli, Dinesh Kumar, et al. "A multi-slice simulation algorithm for grazing-incidence small-angle X-ray scattering." Journal of Applied Crystallography 49, no. 6 (2016): 1876–84. http://dx.doi.org/10.1107/s1600576716013273.

Full text
Abstract:
Grazing-incidence small-angle X-ray scattering (GISAXS) is an important technique in the characterization of samples at the nanometre scale. A key aspect of GISAXS data analysis is the accurate simulation of samples to match the measurement. The distorted-wave Born approximation (DWBA) is a widely used model for the simulation of GISAXS patterns. For certain classes of sample such as nanostructures embedded in thin films, where the electric field intensity variation is significant relative to the size of the structures, a multi-slice DWBA theory is more accurate than the conventional DWBA method. However, simulating complex structures in the multi-slice setting is challenging and the algorithms typically used are designed on a case-by-case basis depending on the structure to be simulated. In this paper, an accurate algorithm for GISAXS simulations based on the multi-slice DWBA theory is presented. In particular, fundamental properties of the Fourier transform have been utilized to develop an algorithm that accurately computes the average refractive index profile as a function of depth and the Fourier transform of the portion of the sample within a given slice, which are key quantities required for the multi-slice DWBA simulation. The results from this method are compared with the traditionally used approximations, demonstrating that the proposed algorithm can produce more accurate results. Furthermore, this algorithm is general with respect to the sample structure, and does not require any sample-specific approximations to perform the simulations.
APA, Harvard, Vancouver, ISO, and other styles
46

McKeigue, Paul. "Sample size requirements for learning to classify with high-dimensional biomarker panels." Statistical Methods in Medical Research 28, no. 3 (2017): 904–10. http://dx.doi.org/10.1177/0962280217738807.

Full text
Abstract:
A common problem in biomedical research is to calculate the sample size required to learn a classifier using a (possibly high-dimensional) panel of biomarkers. This paper describes a simple method based on a Gaussian approximation for calculating the predictive performance of the learned classifier given the size of the biomarker panel, the size of the training sample, and the optimal predictive performance (expressed as C-statistic [Formula: see text]) of the biomarker panel that could be obtained if a training sample of unlimited size were available. Under the assumption that the biomarker effect sizes have the same correlation structure as the biomarkers, the required sample size does not depend upon these correlations, but only upon [Formula: see text] and upon the sparsity of the distribution of effect sizes, defined by the proportion of biomarkers that have nonzero effects. To learn a classifier that extracts 80% of the predictive information, the required case sample size varies from about 0.1 cases per variable for a panel with [Formula: see text] and a sparse distribution of effect sizes (such that 1% of biomarkers have nonzero effect sizes) to nine cases per variable for a panel with [Formula: see text] and a diffuse distribution of effect sizes.
APA, Harvard, Vancouver, ISO, and other styles
47

Dolgov, Sergey, Karim Anaya-Izquierdo, Colin Fox, and Robert Scheichl. "Approximation and sampling of multivariate probability distributions in the tensor train decomposition." Statistics and Computing 30, no. 3 (2019): 603–25. http://dx.doi.org/10.1007/s11222-019-09910-z.

Full text
Abstract:
Abstract General multivariate distributions are notoriously expensive to sample from, particularly the high-dimensional posterior distributions in PDE-constrained inverse problems. This paper develops a sampler for arbitrary continuous multivariate distributions that is based on low-rank surrogates in the tensor train format, a methodology that has been exploited for many years for scalable, high-dimensional density function approximation in quantum physics and chemistry. We build upon recent developments of the cross approximation algorithms in linear algebra to construct a tensor train approximation to the target probability density function using a small number of function evaluations. For sufficiently smooth distributions, the storage required for accurate tensor train approximations is moderate, scaling linearly with dimension. In turn, the structure of the tensor train surrogate allows sampling by an efficient conditional distribution method since marginal distributions are computable with linear complexity in dimension. Expected values of non-smooth quantities of interest, with respect to the surrogate distribution, can be estimated using transformed independent uniformly-random seeds that provide Monte Carlo quadrature or transformed points from a quasi-Monte Carlo lattice to give more efficient quasi-Monte Carlo quadrature. Unbiased estimates may be calculated by correcting the transformed random seeds using a Metropolis–Hastings accept/reject step, while the quasi-Monte Carlo quadrature may be corrected either by a control-variate strategy or by importance weighting. We show that the error in the tensor train approximation propagates linearly into the Metropolis–Hastings rejection rate and the integrated autocorrelation time of the resulting Markov chain; thus, the integrated autocorrelation time may be made arbitrarily close to 1, implying that, asymptotic in sample size, the cost per effectively independent sample is one target density evaluation plus the cheap tensor train surrogate proposal that has linear cost with dimension. These methods are demonstrated in three computed examples: fitting failure time of shock absorbers; a PDE-constrained inverse diffusion problem; and sampling from the Rosenbrock distribution. The delayed rejection adaptive Metropolis (DRAM) algorithm is used as a benchmark. In all computed examples, the importance weight-corrected quasi-Monte Carlo quadrature performs best and is more efficient than DRAM by orders of magnitude across a wide range of approximation accuracies and sample sizes. Indeed, all the methods developed here significantly outperform DRAM in all computed examples.
APA, Harvard, Vancouver, ISO, and other styles
48

Lieberman, Offer, Judith Rousseau, and David M. Zucker. "SMALL-SAMPLE LIKELIHOOD-BASED INFERENCE IN THE ARFIMA MODEL." Econometric Theory 16, no. 2 (2000): 231–48. http://dx.doi.org/10.1017/s0266466600162048.

Full text
Abstract:
The autoregressive fractionally integrated moving average (ARFIMA) model has become a popular approach for analyzing time series that exhibit long-range dependence. For the Gaussian case, there have been substantial advances in the area of likelihood-based inference, including development of the asymptotic properties of the maximum likelihood estimates and formulation of procedures for their computation. Small-sample inference, however, has not to date been studied. Here we investigate the small-sample behavior of the conventional and Bartlett-corrected likelihood ratio tests (LRT) for the fractional difference parameter. We derive an expression for the Bartlett correction factor. We investigate the asymptotic order of approximation of the Bartlett-corrected test. In addition, we present a small simulation study of the conventional and Bartlett-corrected LRT's. We find that for simple ARFIMA models both tests perform fairly well with a sample size of 40 but the Bartlett-corrected test generally provides an improvement over the conventional test with a sample size of 20.
APA, Harvard, Vancouver, ISO, and other styles
49

Eaton, Brett C., R. Dan Moore, and Lucy G. MacKenzie. "Percentile-based grain size distribution analysis tools (GSDtools) – estimating confidence limits and hypothesis tests for comparing two samples." Earth Surface Dynamics 7, no. 3 (2019): 789–806. http://dx.doi.org/10.5194/esurf-7-789-2019.

Full text
Abstract:
Abstract. Most studies of gravel bed rivers present at least one bed surface grain size distribution, but there is almost never any information provided about the uncertainty in the percentile estimates. We present a simple method for estimating the grain size confidence intervals about sample percentiles derived from standard Wolman or pebble count samples of bed surface texture. The width of a grain size confidence interval depends on the confidence level selected by the user (e.g., 95 %), the number of stones sampled to generate the cumulative frequency distribution, and the shape of the frequency distribution itself. For a 95 % confidence level, the computed confidence interval would include the true grain size parameter in 95 out of 100 trials, on average. The method presented here uses binomial theory to calculate a percentile confidence interval for each percentile of interest, then maps that confidence interval onto the cumulative frequency distribution of the sample in order to calculate the more useful grain size confidence interval. The validity of this approach is confirmed by comparing the predictions using binomial theory with estimates of the grain size confidence interval based on repeated sampling from a known population. We also developed a two-sample test of the equality of a given grain size percentile (e.g., D50), which can be used to compare different sites, sampling methods, or operators. The test can be applied with either individual or binned grain size data. These analyses are implemented in the freely available GSDtools package, written in the R language. A solution using the normal approximation to the binomial distribution is implemented in a spreadsheet that accompanies this paper. Applying our approach to various samples of grain size distributions in the field, we find that the standard sample size of 100 observations is typically associated with uncertainty estimates ranging from about ±15 % to ±30 %, which may be unacceptably large for many applications. In comparison, a sample of 500 stones produces uncertainty estimates ranging from about ±9 % to ±18 %. In order to help workers develop appropriate sampling approaches that produce the desired level of precision, we present simple equations that approximate the proportional uncertainty associated with the 50th and 84th percentiles of the distribution as a function of sample size and sorting coefficient; the true uncertainty in any sample depends on the shape of the sample distribution and can only be accurately estimated once the sample has been collected.
APA, Harvard, Vancouver, ISO, and other styles
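The binomial (order-statistic) construction described above can be sketched in a few lines; this is a generic illustration with synthetic grain sizes, not the GSDtools implementation itself.

```python
import numpy as np
from scipy.stats import binom

def percentile_ci(sample, p=0.5, conf=0.95):
    """Order-statistic confidence interval for the p-th quantile of a pebble count:
    a binomial interval on the rank, mapped onto the sorted sample."""
    x = np.sort(np.asarray(sample))
    n = len(x)
    lo_rank = int(max(binom.ppf((1 - conf) / 2, n, p), 0))
    hi_rank = int(min(binom.ppf((1 + conf) / 2, n, p), n - 1))
    return np.quantile(x, p), (x[lo_rank], x[hi_rank])

rng = np.random.default_rng(7)
grains = rng.lognormal(mean=np.log(45), sigma=0.8, size=100)   # synthetic b-axis sizes (mm)
print(percentile_ci(grains, p=0.50))   # D50 estimate with its ~95% confidence interval
print(percentile_ci(grains, p=0.84))   # D84
```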
50

Heath, Anna, Natalia Kunst, Christopher Jackson, et al. "Calculating the Expected Value of Sample Information in Practice: Considerations from 3 Case Studies." Medical Decision Making 40, no. 3 (2020): 314–26. http://dx.doi.org/10.1177/0272989x20912402.

Full text
Abstract:
Background. Investing efficiently in future research to improve policy decisions is an important goal. Expected value of sample information (EVSI) can be used to select the specific design and sample size of a proposed study by assessing the benefit of a range of different studies. Estimating EVSI with the standard nested Monte Carlo algorithm has a notoriously high computational burden, especially when using a complex decision model or when optimizing over study sample sizes and designs. Recently, several more efficient EVSI approximation methods have been developed. However, these approximation methods have not been compared, and therefore their comparative performance across different examples has not been explored. Methods. We compared 4 EVSI methods using 3 previously published health economic models. The examples were chosen to represent a range of real-world contexts, including situations with multiple study outcomes, missing data, and data from an observational rather than a randomized study. The computational speed and accuracy of each method were compared. Results. In each example, the approximation methods took minutes or hours to achieve reasonably accurate EVSI estimates, whereas the traditional Monte Carlo method took weeks. Specific methods are particularly suited to problems where we wish to compare multiple proposed sample sizes, when the proposed sample size is large, or when the health economic model is computationally expensive. Conclusions. As all the evaluated methods gave estimates similar to those given by traditional Monte Carlo, we suggest that EVSI can now be efficiently computed with confidence in realistic examples. No systematically superior EVSI computation method exists as the properties of the different methods depend on the underlying health economic model, data generation process, and user expertise.
APA, Harvard, Vancouver, ISO, and other styles