Academic literature on the topic 'Large Sample Size Problem'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Large Sample Size Problem.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Large Sample Size Problem"

1

Choi, Jai Won, and Balgobin Nandram. "Large Sample Problems." International Journal of Statistics and Probability 10, no. 2 (2021): 81. http://dx.doi.org/10.5539/ijsp.v10n2p81.

Abstract:
Variance is very important in test statistics, as it measures the degree of reliability of estimates. It depends not only on the sample size but also on other factors such as population size, the type of data and its distribution, and the method of sampling or experimentation. Here, however, we assume that these other factors are fixed and that the test statistic depends only on the sample size.

When the sample size is larger, the variance will be smaller. Smaller variance makes test statistics larger and gives more significant results in testing a hypothesis, whatever that hypothesis is. The test result is therefore often misleading, because much of it merely reflects the sample size. We discuss this large sample problem in performing traditional tests and show how to fix it.
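Editorial note: the effect the authors describe is easy to reproduce numerically. Below is a minimal sketch in Python; the one-sample z-test, the fixed effect of 0.05, and sigma = 1 are our illustrative assumptions, not taken from the paper.

```python
import math

def z_test_p_value(effect, sigma, n):
    """Two-sided p-value of a one-sample z-test of H0: mu = 0."""
    z = effect / (sigma / math.sqrt(n))          # statistic grows like sqrt(n)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# the same tiny effect becomes "significant" once n is large enough
for n in (100, 1_000, 10_000, 100_000):
    z, p = z_test_p_value(effect=0.05, sigma=1.0, n=n)
    print(f"n={n:>7}: z={z:6.2f}, p={p:.4f}")
```

Holding the effect fixed, z grows like the square root of n, so any nonzero effect eventually rejects the null hypothesis; this is exactly the misleading behaviour the paper addresses.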
2

Armstrong, Richard A. "Is there a large sample size problem?" Ophthalmic and Physiological Optics 39, no. 3 (2019): 129–30. http://dx.doi.org/10.1111/opo.12618.

3

Kumar, A. "The Sample Size." Journal of Universal College of Medical Sciences 2, no. 1 (2014): 45–47. http://dx.doi.org/10.3126/jucms.v2i1.10493.

Abstract:
Finding an "appropriate sample size" is the most basic and foremost problem a research worker faces in all sampling-based analytical research. This is so since a very large sample results in unnecessary wastage of resources, while a very small sample may adversely affect the accuracy of sample estimates, in turn undermining the very efficacy of the selected sampling plan. The present paper attempts to highlight the main determinant factors and the analytical approach towards estimation of the required sample size, along with a few illustrations.
4

Barreiro-Ures, Daniel, Ricardo Cao, and Mario Francisco-Fernández. "Bandwidth Selection in Nonparametric Regression with Large Sample Size." Proceedings 2, no. 18 (2018): 1166. http://dx.doi.org/10.3390/proceedings2181166.

Abstract:
In the context of nonparametric regression estimation, the behaviour of kernel methods such as the Nadaraya-Watson or local linear estimators is heavily influenced by the value of the bandwidth parameter, which determines the trade-off between bias and variance. This clearly implies that the selection of an optimal bandwidth, in the sense of minimizing some risk function (MSE, MISE, etc.), is a crucial issue. However, the task of estimating an optimal bandwidth using the whole sample can be very expensive in terms of computing time in the context of Big Data, due to the computational complexity of some of the most used algorithms for bandwidth selection (leave-one-out cross-validation, for example, has O(n²) complexity). To overcome this problem, we propose two methods that estimate the optimal bandwidth for several subsamples of our large dataset and then extrapolate the result to the original sample size, making use of the asymptotic expression of the MISE bandwidth. Preliminary simulation studies show that the proposed methods lead to a drastic reduction in computing time, while the statistical precision is only slightly decreased.
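Editorial note: the extrapolation idea can be sketched compactly. A bandwidth is chosen by leave-one-out cross-validation on a subsample and then rescaled to the full sample size via the asymptotic rate h_opt(n) ∝ n^(−1/5). The Nadaraya-Watson estimator with a Gaussian kernel and the grid search below are our assumptions for illustration, not the authors' exact algorithm.

```python
import numpy as np

def loocv_bandwidth(x, y, grid):
    """Leave-one-out CV bandwidth for a Nadaraya-Watson estimator (Gaussian kernel)."""
    best_h, best_err = grid[0], np.inf
    for h in grid:
        K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
        np.fill_diagonal(K, 0.0)                      # drop the i-th point itself
        pred = (K @ y) / K.sum(axis=1)
        err = np.mean((y - pred) ** 2)
        if err < best_err:
            best_h, best_err = h, err
    return best_h

rng = np.random.default_rng(1)
n_full, n_sub = 100_000, 2_000
x = rng.uniform(0.0, 1.0, n_full)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n_full)

idx = rng.choice(n_full, n_sub, replace=False)        # CV cost is O(n_sub^2), not O(n^2)
h_sub = loocv_bandwidth(x[idx], y[idx], np.geomspace(0.005, 0.5, 25))

# asymptotically h_opt(n) ~ C * n**(-1/5), so rescale to the full sample size
h_full = h_sub * (n_sub / n_full) ** 0.2
print(f"subsample bandwidth {h_sub:.4f} -> extrapolated {h_full:.4f}")
```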
5

Feldmann, Rodney M. "Decapod Crustacean Paleobiogeography: Resolving the Problem of Small Sample Size." Short Courses in Paleontology 3 (1990): 303–15. http://dx.doi.org/10.1017/s2475263000001847.

Abstract:
Studies of paleobiogeography have changed markedly in recent decades transforming a once static subject into one which now has great potential as a useful counterpart to systematic and ecological studies in the interpretation of the geological history of organisms. This has resulted, in large part, from the emergence of plate tectonic models which, in turn, have been used as the bases for extremely sophisticated paleoclimatic modeling. As a result, paleobiogeography has attained a level of precision comparable to that of the studies of paleoecology and systematic paleontology. It is now possible to consider causes for global patterns of origin and dispersal of organisms on a much more realistic level than was previously possible.
6

Heckmann, T., K. Gegg, A. Gegg, and M. Becht. "Sample size matters: investigating the effect of sample size on a logistic regression debris flow susceptibility model." Natural Hazards and Earth System Sciences Discussions 1, no. 3 (2013): 2731–79. http://dx.doi.org/10.5194/nhessd-1-2731-2013.

Abstract:
Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial datasets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In view of these results, we argue that researchers applying model selection should explore the behaviour of the model selection for different sample sizes, and that consensus models created from a number of random samples should be given preference over models relying on a single sample.
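Editorial note: the core experiment here — repeating stepwise selection over many random samples of each size and counting how many distinct models appear — can be sketched as below. Synthetic predictors and a greedy forward selection by AIC stand in for the geofactors and the exact selection procedure of the paper.

```python
import numpy as np
import statsmodels.api as sm
from collections import Counter

rng = np.random.default_rng(0)
N, P = 20_000, 6
X_pop = rng.normal(size=(N, P))
eta = 1.2 * X_pop[:, 0] - 0.8 * X_pop[:, 1]            # only two real predictors
y_pop = (rng.uniform(size=N) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

def forward_aic(X, y):
    """Greedy forward selection of logistic-regression predictors by AIC."""
    selected, remaining = [], list(range(X.shape[1]))
    best_aic = sm.Logit(y, np.ones((len(y), 1))).fit(disp=0).aic
    improved = True
    while improved and remaining:
        improved, best_j = False, None
        for j in remaining:
            try:
                aic = sm.Logit(y, sm.add_constant(X[:, selected + [j]])).fit(disp=0).aic
            except Exception:                          # e.g. perfect separation at small n
                continue
            if aic < best_aic:
                best_aic, best_j, improved = aic, j, True
        if improved:
            selected.append(best_j)
            remaining.remove(best_j)
    return tuple(sorted(selected))

for n in (50, 200, 1000, 5000):
    models = Counter()
    for _ in range(50):
        idx = rng.choice(N, n, replace=False)
        models[forward_aic(X_pop[idx], y_pop[idx])] += 1
    print(f"n={n:>5}: {len(models)} distinct selected models")
```

Model diversity shrinking as n grows mirrors the paper's finding; spatial autocorrelation, which caps the useful sample size in the real data, has no analogue in this i.i.d. toy example.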
7

Qin, S., and G. E. O. Widera. "Determination of Sample Size in Service Inspection." Journal of Pressure Vessel Technology 119, no. 1 (1997): 57–60. http://dx.doi.org/10.1115/1.2842267.

Abstract:
When performing in-service inspection on a large volume of identical components, it becomes an almost impossible task to inspect all those in which defects may exist, even if their failure probabilities are known. As a result, an appropriate sample size needs to be determined when setting up an inspection program. In this paper, a probabilistic analysis method is employed to solve this problem. It is assumed that the characteristic data of the components follow a certain distribution, which can be taken as known once the means and standard deviations of the serviceable and defective sets of components are estimated. The sample size can then be determined within an acceptable assigned error range. In this way, both false rejection and false acceptance can be avoided with a high degree of confidence.
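Editorial note: for the normal-distribution case, the textbook version of this calculation picks the smallest n whose estimation error for the mean stays within an assigned bound E at a given confidence: n = (z·σ/E)². A minimal sketch with illustrative numbers (the paper's actual method additionally balances false rejection and false acceptance between the serviceable and defective sets):

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence=0.95):
    """Smallest n keeping |sample mean - true mean| <= error at the given confidence."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)   # two-sided critical value
    return math.ceil((z * sigma / error) ** 2)

# e.g. wall-thickness readings: sigma = 0.8 mm, tolerated error 0.1 mm
print(sample_size_for_mean(sigma=0.8, error=0.1))      # -> 246
```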
8

Heckmann, T., K. Gegg, A. Gegg, and M. Becht. "Sample size matters: investigating the effect of sample size on a logistic regression susceptibility model for debris flows." Natural Hazards and Earth System Sciences 14, no. 2 (2014): 259–78. http://dx.doi.org/10.5194/nhess-14-259-2014.

Abstract:
Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable and reproducible results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and they approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial data sets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In view of these results, we argue that researchers applying model selection should explore the behaviour of the model selection for different sample sizes, and that consensus models created from a number of random samples should be given preference over models relying on a single sample.
9

Jaki, Thomas, Minjung Kim, Andrea Lamont, et al. "The Effects of Sample Size on the Estimation of Regression Mixture Models." Educational and Psychological Measurement 79, no. 2 (2018): 358–84. http://dx.doi.org/10.1177/0013164418791673.

Abstract:
Regression mixture models are a statistical approach used for estimating heterogeneity in effects. This study investigates the impact of sample size on regression mixture’s ability to produce “stable” results. Monte Carlo simulations and analysis of resamples from an application data set were used to illustrate the types of problems that may occur with small samples in real data sets. The results suggest that (a) when class separation is low, very large sample sizes may be needed to obtain stable results; (b) it may often be necessary to consider a preponderance of evidence in latent class enumeration; (c) regression mixtures with ordinal outcomes result in even more instability; and (d) with small samples, it is possible to obtain spurious results without any clear indication of there being a problem.
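Editorial note: the instability the authors report can be reproduced with a compact EM implementation of a two-component Gaussian regression mixture (our own sketch, not the authors' code); the spread of the fitted slopes across replications shrinks only slowly as n grows when class separation is low.

```python
import numpy as np

def em_regression_mixture(x, y, n_iter=150, seed=0):
    """Minimal EM for a two-component Gaussian regression mixture y = a_k + b_k*x."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(x), x])
    R = rng.dirichlet([1.0, 1.0], size=len(x))         # initial responsibilities
    for _ in range(n_iter):
        dens, betas = np.empty_like(R), []
        for k in range(2):
            w = R[:, k] + 1e-9
            # M-step: weighted least squares and residual variance for component k
            beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
            resid = y - X @ beta
            sigma = max(float(np.sqrt((w * resid**2).sum() / w.sum())), 1e-6)
            dens[:, k] = w.mean() * np.exp(-0.5 * (resid / sigma) ** 2) / sigma
            betas.append(beta)
        R = dens / dens.sum(axis=1, keepdims=True)     # E-step
    return sorted(beta[1] for beta in betas)           # the two fitted slopes

rng = np.random.default_rng(42)
for n in (50, 200, 1000):
    slopes = []
    for rep in range(20):
        z = rng.uniform(size=n) < 0.5                  # latent class membership
        x = rng.normal(size=n)
        y = np.where(z, 1.0 * x, -0.5 * x) + rng.normal(0.0, 1.0, n)
        slopes.append(em_regression_mixture(x, y, seed=rep))
    sd = np.array(slopes).std(axis=0).round(2)
    print(f"n={n:>5}: sd of fitted slopes across 20 replications = {sd}")
```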
10

Thomas, Hoben. "Effect Size Standard Errors for the Non-Normal Non-Identically Distributed Case." Journal of Educational Statistics 11, no. 4 (1986): 293–303. http://dx.doi.org/10.3102/10769986011004293.

Abstract:
Suppose there are k independent studies and for each study the experimental and control groups have been sampled from independent but essentially arbitrary populations. The problem is to construct a plausible standard error of the effect size mean (effect sizes are standardized experimental-control group mean differences) when given only minimal sample statistic information. Standard errors based on the sample standard error, or bootstrap, will typically be much too large and have very large variance. A normal theory estimator may prove practically useful in more general settings. Asymptotic distribution-free estimators are provided for two cases.
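Editorial note: in the usual normal-theory setup, study i contributes d_i = (x̄_E − x̄_C)/s_pooled with large-sample variance Var(d_i) ≈ (n_E + n_C)/(n_E·n_C) + d_i²/(2(n_E + n_C)), and the standard error of the unweighted effect size mean follows from the per-study variances. A minimal sketch with illustrative numbers (this is the standard normal-theory estimator, not the paper's distribution-free ones):

```python
import math

def d_variance(d, n_e, n_c):
    """Normal-theory large-sample variance of a standardized mean difference."""
    return (n_e + n_c) / (n_e * n_c) + d**2 / (2 * (n_e + n_c))

# k = 3 studies: (effect size d, experimental n, control n); illustrative numbers
studies = [(0.30, 40, 40), (0.55, 25, 30), (0.10, 100, 90)]

k = len(studies)
mean_d = sum(d for d, _, _ in studies) / k
se_mean = math.sqrt(sum(d_variance(d, ne, nc) for d, ne, nc in studies)) / k
print(f"effect size mean = {mean_d:.3f}, normal-theory SE = {se_mean:.3f}")
```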