Academic literature on the topic 'Unequal sample size problem'


Journal articles on the topic "Unequal sample size problem"

1

Klaassen, Chris A. J. "Dixie cups: sampling with replacement from a finite population." Journal of Applied Probability 31, no. 4 (1994): 940–48. http://dx.doi.org/10.2307/3215319.

Abstract:
At which (random) sample size will every population element have been drawn at least m times? This special coupon collector's problem is often referred to as the Dixie cup problem. Some asymptotic properties of the Dixie cup problem with unequal sampling probabilities are described.
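The process Klaassen studies is easy to simulate directly. The sketch below is illustrative only (the function name and example probabilities are mine, not from the paper): it draws with unequal probabilities until every element has been seen at least m times and returns the random sample size.

```python
import random

def dixie_cup_draws(probs, m, rng=random):
    """Draw elements 0..N-1 with the given (possibly unequal)
    probabilities until every element has appeared at least m times;
    return the total number of draws (the 'Dixie cup' sample size)."""
    counts = [0] * len(probs)
    remaining = len(probs)  # elements still below m occurrences
    draws = 0
    while remaining > 0:
        i = rng.choices(range(len(probs)), weights=probs)[0]
        counts[i] += 1
        draws += 1
        if counts[i] == m:
            remaining -= 1
    return draws

random.seed(1)
t = dixie_cup_draws([0.4, 0.3, 0.2, 0.1], m=2)
# t is necessarily at least m * N = 8; rare elements dominate the wait
```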
2

Klaassen, Chris A. J. "Dixie cups: sampling with replacement from a finite population." Journal of Applied Probability 31, no. 4 (1994): 940–48. http://dx.doi.org/10.1017/s0021900200099472.

Abstract:
At which (random) sample size will every population element have been drawn at least m times? This special coupon collector's problem is often referred to as the Dixie cup problem. Some asymptotic properties of the Dixie cup problem with unequal sampling probabilities are described.
3

Chakraborty, R. "A class of population genetic questions formulated as the generalized occupancy problem." Genetics 134, no. 3 (1993): 953–58. http://dx.doi.org/10.1093/genetics/134.3.953.

Abstract:
In categorical genetic data analysis when the sampling units are classified into an arbitrary number of distinct classes, sometimes the sample size may not be large enough to apply large sample approximations for hypothesis testing purposes. Exact sampling distributions of several statistics are derived here, using combinatorial approaches parallel to the classical occupancy problem to help overcome this difficulty. Since the multinomial probabilities can be unequal, this situation is described as a generalized occupancy problem. The sampling properties derived are used to examine nonrandomness of occurrence of mutagen-induced mutations across loci, to devise tests of Hardy-Weinberg proportions of genotype frequencies in the presence of a large number of alleles, and to provide a global test of gametic phase disequilibrium of several restriction site polymorphisms.
4

Guo, Jiin-Huarng, Hubert J. Chen, and Wei-Ming Luh. "Optimal Sample Sizes for Testing the Equivalence of Two Means." Methodology 15, no. 3 (2019): 128–36. http://dx.doi.org/10.1027/1614-2241/a000171.

Abstract:
Equivalence tests (also known as similarity or parity tests) have become more and more popular in addition to equality tests. However, in testing the equivalence of two population means, approximate sample sizes developed using conventional techniques found in the literature on this topic have usually been undervalued, providing less statistical power than is required. In this paper, the authors first address the reason for this problem and then provide a solution using an exhaustive local search algorithm to find the optimal sample size. The proposed method is not only accurate but also flexible, so that unequal variances or sampling unit costs for different groups can be accommodated using different sample size allocations. Figures and a numerical example are presented to demonstrate various configurations. An R Shiny app is also available for easy use ( https://optimal-sample-size.shinyapps.io/equivalence-of-means/ ).
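For context, the "conventional" normal-approximation sample size that the authors show tends to under-power can be written in a few lines. This is a hedged sketch of the textbook approximation only (function name and parameterization are mine), not the exact exhaustive search the paper proposes:

```python
import math
from scipy.stats import norm

def tost_n_approx(delta, sigma, eps=0.0, alpha=0.05, power=0.80):
    """Conventional normal-approximation sample size per group for a
    two-one-sided-tests (TOST) equivalence test of two means with
    equivalence margin +/- delta, common SD sigma, and true difference
    eps.  Uses the crude z_{1-beta} split (a simplification); this is
    the kind of formula the paper shows to be under-powered."""
    z_a = norm.ppf(1 - alpha)
    z_b = norm.ppf(power)
    n = 2 * sigma**2 * (z_a + z_b)**2 / (delta - abs(eps))**2
    return math.ceil(n)

n = tost_n_approx(delta=1.0, sigma=2.0)  # -> 50 per group
```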
5

Kumar, Narinder, Amar Nath Gill, and Gobind P. Mehta. "Selection Procedures for Location Parameters Based on Two-Sample U-Statistics." Calcutta Statistical Association Bulletin 42, no. 3-4 (1992): 201–20. http://dx.doi.org/10.1177/0008068319920305.

Abstract:
Let π1, ..., πk be k independent populations and let Fi(x) = F(x − θi) be the absolutely continuous cumulative distribution function (cdf) of the i-th population, indexed by the location parameter θi, i = 1, ..., k. A class of subset selection procedures based on sub-sample extrema for unequal sample sizes is proposed for the problem of selecting a subset of (π1, ..., πk) that contains the population with the largest location parameter. The proposed subset selection procedures are then compared with the subset selection procedures of Hsu (1981) in the sense of Pitman ARE (asymptotic relative efficiency). It is shown that these procedures can approximately be implemented with the help of existing tables, and the sample size sufficient for their implementation, based on simulation results, is discussed. AMS (1980) Subject Classification: Primary 62F07; Secondary 62H10.
6

Guo, Jiin-Huarng, and Wei-Ming Luh. "Approximate Sample Size Formulas for Testing Group Mean Differences When Variances Are Unequal in One-Way ANOVA." Educational and Psychological Measurement 68, no. 6 (2008): 959–71. http://dx.doi.org/10.1177/0013164408318759.

Abstract:
This study proposes an approach for determining appropriate sample size for Welch's F test when unequal variances are expected. Given a certain maximum deviation in population means and using the quantile of F and t distributions, there is no need to specify a noncentrality parameter and it is easy to estimate the approximate sample size needed for heterogeneous one-way ANOVA. The theoretical results are validated by a comparison to the results from a Monte Carlo simulation. Simulation results for the empirical power indicate that the sample size needed by the proposed formulas can almost always achieve the desired power level when Welch's F test is applied to data that are conditionally nonnormal and heterogeneous. Two illustrative examples of the use of the proposed procedure are given to calculate balanced and optimal sample sizes, respectively. Moreover, three sample size tables for two-, four-, and six-group problems are provided, respectively, for practitioners.
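Welch's F test, whose sample sizes the formulas above target, is straightforward to compute from group summaries. A minimal sketch (the function name is mine; the statistic and degrees of freedom are the standard Welch definitions):

```python
import numpy as np
from scipy.stats import f

def welch_anova(*groups):
    """Welch's heteroscedastic one-way ANOVA.
    Returns (F, df1, df2, p)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                        # precision weights n_j / s_j^2
    mw = np.sum(w * m) / np.sum(w)   # weighted grand mean
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    A = np.sum(w * (m - mw) ** 2) / (k - 1)
    B = 1 + 2 * (k - 2) / (k**2 - 1) * tmp
    F = A / B
    df1, df2 = k - 1, (k**2 - 1) / (3 * tmp)
    return F, df1, df2, f.sf(F, df1, df2)

F, df1, df2, p = welch_anova([1.0, 2.0, 3.0], [1.1, 2.1, 3.1], [4.0, 6.0, 8.0])
```

With identical groups the numerator vanishes, so F is exactly 0 and the p-value is 1, a quick sanity check on the implementation.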
7

Huillet, Thierry E. "Partitioning Problems Arising From Independent Shifted-Geometric and Exponential Samples With Unequal Intensities." International Journal of Statistics and Probability 8, no. 6 (2019): 31. http://dx.doi.org/10.5539/ijsp.v8n6p31.

Abstract:
Two problems dealing with the random skewed splitting of some population into J different types are considered.

In a first, discrete setup, the sizes of the sub-populations come from independent shifted-geometric distributions with unequal characteristics. Various J → ∞ asymptotics of the induced occupancies are investigated: the total population size, the number of unfilled types, the index of consecutive filled types, the maximum number of individuals in some state, and the index of the type(s) achieving this maximum. Equivalently, this problem is amenable to the classical one of assigning indistinguishable particles (Bosons) to J sites in some random allocation problem.

In a second, parallel setup in the continuum, we consider a large population of, say, J 'stars', the intensities of which have independent exponential distributions with unequal inverse temperatures. Stars are observed only if their intensities exceed some threshold value. Depending on the choice of the inverse temperatures, we investigate the energy partitioning among stars, the total energy emitted by the observed stars, the number of observable stars, and the energy and index of the star emitting the most.
8

Kim, Tae-Hoon, Min-Chul Kang, Ga-Bin Jung, Dong Soo Kim, and Cheol-Woong Yang. "Novel Method for Preparing Transmission Electron Microscopy Samples of Micrometer-Sized Powder Particles by Using Focused Ion Beam." Microscopy and Microanalysis 23, no. 5 (2017): 1055–60. http://dx.doi.org/10.1017/s1431927617012557.

Abstract:
The preparation of transmission electron microscopy (TEM) samples from powders is quite difficult and challenging. For powders with particles in the 1–5 μm size range, it is especially difficult to select an adequate sample preparation technique. Epoxy is commonly used to bind powder, but drawbacks, such as differential milling originating from unequal milling rates between the epoxy and powder, remain. We propose a new, simple method for preparing TEM samples. This method is especially useful for powders with particles in the 1–5 μm size range that are vulnerable to oxidation. The method uses solder as an embedding agent together with focused ion beam (FIB) milling. The powder was embedded in low-temperature solder using a conventional hot-mounting instrument. Subsequently, FIB was used to fabricate thin TEM samples via the lift-out technique. The solder proved to be more effective than epoxy in producing thin TEM samples with large areas. The problem of differential milling was mitigated, and the solder binder was more stable than epoxy under an electron beam. This methodology can be applied for preparing TEM samples from various powders that are either vulnerable to oxidation or composed of high atomic number elements.
9

Chiang, Chieh, and Chin-Fu Hsiao. "Use of interval estimations in design and evaluation of multiregional clinical trials with continuous outcomes." Statistical Methods in Medical Research 28, no. 7 (2018): 2179–95. http://dx.doi.org/10.1177/0962280217751277.

Abstract:
Multiregional clinical trials have been accepted in recent years as a useful means of accelerating the development of new drugs and abridging their approval time. The statistical properties of multiregional clinical trials are being widely discussed. In practice, the variance of a continuous response may differ from region to region, which turns the assessment of the efficacy response into a Behrens–Fisher problem: there is no exact test or interval estimator for the mean difference under unequal variances. As a solution, this study applies interval estimations of the efficacy response based on Howe's, Cochran–Cox's, and Satterthwaite's approximations, which have been shown to have well-controlled type I error rates. However, traditional sample size determination cannot be applied to these interval estimators, so a sample size determination that achieves a desired power based on them is presented. Moreover, the consistency criteria suggested by the Japanese Ministry of Health, Labour and Welfare guidance were also applied to the overall results from the multiregional clinical trial obtained via the proposed interval estimation. A real example is used to illustrate the proposed method. The results of simulation studies indicate that the proposed method can correctly determine the required sample size and evaluate the assurance probability of the consistency criteria.
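Of the approximations compared above, Satterthwaite's is the most familiar. A minimal sketch of the corresponding interval estimator for a mean difference with unequal variances (function name and example data are illustrative, not from the paper):

```python
import numpy as np
from scipy.stats import t

def satterthwaite_ci(x, y, conf=0.95):
    """Welch-Satterthwaite confidence interval for mean(x) - mean(y)
    with unequal variances (one standard approximate answer to the
    Behrens-Fisher problem).  Returns (lower, upper, df)."""
    n1, n2 = len(x), len(y)
    v1 = np.var(x, ddof=1) / n1
    v2 = np.var(y, ddof=1) / n2
    # Satterthwaite's approximate degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    half = t.ppf(0.5 + conf / 2, df) * np.sqrt(v1 + v2)
    diff = np.mean(x) - np.mean(y)
    return diff - half, diff + half, df

lo, hi, df = satterthwaite_ci([4.1, 5.0, 6.2, 5.5], [3.0, 2.8, 3.9])
```

The approximate df always falls between min(n1, n2) − 1 and n1 + n2 − 2, which is a useful check on the computation.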
10

Slinker, B. K., and S. A. Glantz. "Multiple linear regression is a useful alternative to traditional analyses of variance." American Journal of Physiology-Regulatory, Integrative and Comparative Physiology 255, no. 3 (1988): R353–R367. http://dx.doi.org/10.1152/ajpregu.1988.255.3.r353.

Abstract:
Physiologists often wish to compare the effects of several different treatments on a continuous variable of interest, which requires an analysis of variance. Analysis of variance, as presented in most statistics texts, generally requires that there be no missing data and often that each sample group be the same size. Unfortunately, this requirement is rarely satisfied, and investigators are confronted with the problem of how to analyze data that do not strictly fit the traditional analysis of variance paradigm. One can avoid these pitfalls by recasting the analysis of variance as a multiple linear regression problem. When there are no missing data, the results of a traditional analysis of variance and the corresponding multiple regression problem are identical; when the sample sizes are unequal or there are missing data, one can use a regression formulation to analyze data that cannot be easily handled in a traditional analysis of variance paradigm and thus overcome a practical computational limitation of traditional analysis of variance. In addition to overcoming practical limitations of traditional analysis of variance, the multiple linear regression approach is more efficient because in one run of a statistics routine, not only is the analysis of variance done but also one obtains estimates of the size of the treatment effects (as opposed to just an indication of whether such effects are present or not), and many of the pairwise multiple comparisons are done (they are equivalent to t tests for significance of the regression parameter estimates). Finally, interaction between the different treatment factors is easier to interpret than it is in traditional analysis of variance.
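The equivalence the authors exploit is easy to verify numerically: with dummy coding, the regression F statistic matches the classic one-way ANOVA F even when group sizes are unequal. A minimal sketch (the data are made up for illustration):

```python
import numpy as np
from scipy.stats import f_oneway

# Three groups with unequal sizes -- the case Slinker & Glantz address.
groups = [np.array([3.1, 2.9, 3.4]),
          np.array([4.0, 4.4, 3.8, 4.1, 4.6]),
          np.array([2.2, 2.5])]
y = np.concatenate(groups)
N, k = len(y), len(groups)

# Dummy-coded design matrix: intercept plus (k - 1) group indicators.
X = np.zeros((N, k))
X[:, 0] = 1.0
row = 0
for j, g in enumerate(groups):
    if j > 0:
        X[row:row + len(g), j] = 1.0
    row += len(g)

# Fit by least squares and form the overall regression F statistic.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = np.sum((y - X @ beta) ** 2)
sst = np.sum((y - y.mean()) ** 2)
F_reg = ((sst - sse) / (k - 1)) / (sse / (N - k))

F_classic = f_oneway(*groups).statistic
# F_reg and F_classic agree to machine precision despite unequal n
```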

Dissertations / Theses on the topic "Unequal sample size problem"

1

Song, Juhee. "Bootstrapping in a high dimensional but very low sample size problem." Texas A&M University, 2003. http://hdl.handle.net/1969.1/3853.

Abstract:
High Dimension, Low Sample Size (HDLSS) problems have received much attention recently in many areas of science. Analysis of microarray experiments is one such area. Numerous studies are on-going to investigate the behavior of genes by measuring the abundance of mRNA (messenger RiboNucleic Acid), i.e., gene expression. The HDLSS data investigated in this dissertation consist of a large number of data sets, each of which has only a few observations. We assume a statistical model in which measurements from the same subject have the same expected value and variance, and all subjects have the same distribution up to location and scale. Information from all subjects is shared in estimating this common distribution. Our interest is in testing the hypothesis that the mean of measurements from a given subject is 0. Commonly used tests of this hypothesis, the t-test, sign test, and traditional bootstrapping, do not necessarily provide reliable results since there are only a few observations for each data set. We motivate a mixture model having C clusters and 3C parameters to overcome the small sample size problem. Standardized data are pooled after assigning each data set to one of the mixture components. To get reasonable initial parameter estimates when density estimation methods are applied, we apply clustering methods including agglomerative clustering and K-means. The Bayes Information Criterion (BIC) and a new criterion, WMCV (Weighted Mean of within Cluster Variance estimates), are used to choose an optimal number of clusters. Density estimation methods, including a maximum likelihood unimodal density estimator and kernel density estimation, are used to estimate the unknown density. Once the density is estimated, a bootstrapping algorithm that selects samples from the estimated density is used to approximate the distribution of test statistics. The t-statistic and an empirical likelihood ratio statistic are used, since their distributions are completely determined by the distribution common to all subjects. A method to control the false discovery rate is used to perform simultaneous tests on all small data sets. Simulated data sets and a set of cDNA (complementary DeoxyriboNucleic Acid) microarray experiment data are analyzed by the proposed methods.
2

Gao, Hongjiang. "Hypothesis testing based on pool screening with unequal pool sizes." Thesis, Birmingham, Ala. : University of Alabama at Birmingham, 2010. https://www.mhsl.uab.edu/dt/2010p/gao.pdf.

3

Zhang, Xiao. "Confidence Intervals for Population Size in a Capture-Recapture Problem." Digital Commons @ East Tennessee State University, 2007. https://dc.etsu.edu/etd/2022.

Abstract:
In a single capture-recapture problem, two new Wilson methods for interval estimation of population size are derived. Classical Chapman interval, Wilson and Wilson-cc intervals are examined and compared in terms of their expected interval width and exact coverage properties in two models. The new approach performs better than the Chapman in each model. Bayesian analysis also gives a different way to estimate population size.
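The Chapman point estimator mentioned above is a one-liner. The interval below is my own illustrative Wilson-based construction (inverting a Wilson score interval for the marked proportion), not necessarily the construction derived in the thesis:

```python
from scipy.stats import norm

def chapman_estimate(n1, n2, m):
    """Chapman's nearly unbiased estimator of population size N from a
    single mark-recapture experiment: n1 animals marked, n2 recaptured,
    m marked animals seen again."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

def wilson_recapture_ci(n1, n2, m, conf=0.95):
    """Illustrative interval for N: a Wilson score interval for the
    marked proportion p = m/n2, inverted through N ~= n1/p."""
    z = norm.ppf(0.5 + conf / 2)
    p_hat = m / n2
    center = (p_hat + z**2 / (2 * n2)) / (1 + z**2 / n2)
    half = (z / (1 + z**2 / n2)) * ((p_hat * (1 - p_hat) / n2
            + z**2 / (4 * n2**2)) ** 0.5)
    p_lo, p_hi = center - half, center + half
    return n1 / p_hi, n1 / p_lo   # larger p implies smaller N

n_hat = chapman_estimate(100, 100, 20)   # ~ 484.8
lo, hi = wilson_recapture_ci(100, 100, 20)
```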
4

Awuor, Risper Akelo. "Effect of Unequal Sample Sizes on the Power of DIF Detection: An IRT-Based Monte Carlo Study with SIBTEST and Mantel-Haenszel Procedures." Diss., Virginia Tech, 2008. http://hdl.handle.net/10919/28321.

Abstract:
This simulation study focused on determining the effect of unequal sample sizes on the statistical power of the SIBTEST and Mantel-Haenszel procedures for detecting DIF of moderate and large magnitudes. Item parameters were estimated by, and generated with, the 2PLM using WinGen2 (Han, 2006). MULTISIM was used to simulate ability estimates and to generate response data that were analyzed by SIBTEST. The SIBTEST procedure with regression correction was used to calculate the DIF statistics, namely the DIF effect size and the statistical significance of the bias. The older SIBTEST was used to calculate the DIF statistics for the M-H procedure. SAS provided the environment in which the ability parameters were simulated, response data generated, and DIF analyses conducted. Test items were observed to determine if a priori manipulated items demonstrated DIF. The study results indicated that with unequal samples in any ratio, M-H had better Type I error rate control than SIBTEST. The results also indicated that not only the ratios, but also the sample size and the magnitude of DIF, influenced the error rate behavior of SIBTEST and M-H. With small samples and moderate DIF magnitude, Type II errors were committed by both M-H and SIBTEST when the reference-to-focal group sample size ratio was 1:0.10, due to low observed statistical power and inflated Type I error rates.
5

Rožnjik, Andrea. "Optimizacija problema sa stohastičkim ograničenjima tipa jednakosti – kazneni metodi sa promenljivom veličinom uzorka." PhD thesis, Univerzitet u Novom Sadu, Prirodno-matematički fakultet u Novom Sadu, 2019. https://www.cris.uns.ac.rs/record.jsf?recordId=107819&source=NDLTD&language=en.

Abstract:
A stochastic programming problem with equality constraints is considered, that is, a minimization problem with constraints in the form of mathematical expectations. Two iterative methods are proposed for solving it. In each iteration, both procedures use a sample average function in place of the mathematical expectation and exploit the advantages of variable-sample-size methods based on adaptive sample size updating: the sample size for the next iteration is determined from information in the current iteration, specifically the current precision of the approximation of the expectation and the quality of the approximation of the solution. Since the problem is constrained, both procedures are based on a line search combined with a quadratic penalty method adapted to the stochastic setting. The procedures rely on the same ideas but differ in approach. In the first approach, the algorithm solves an SAA reformulation of the stochastic programming problem, i.e., an approximation of the original problem; the sample is fixed before the iterative procedure, so the convergence analysis is deterministic. It is shown that, under standard assumptions, the algorithm generates a subsequence of iterates whose accumulation point is a KKT point of the SAA problem. The algorithm formed by the second approach solves the stochastic programming problem itself, so the convergence analysis is stochastic; under standard assumptions for stochastic optimization, it generates a subsequence whose accumulation point is almost surely a KKT point of the original problem. The proposed algorithms were implemented on the same set of test problems, with the number of function evaluations as the measure of efficiency. The numerical results demonstrate their efficiency compared with procedures in which the sample size is updated according to a predefined schedule, suggesting that adaptive sample size updating can save function evaluations for constrained problems as well. Since the considered problem is deterministic while the proposed procedures are stochastic, the first three chapters of the thesis cover basic notions of deterministic and stochastic optimization, together with a short overview of definitions and theorems from other fields needed to follow the analysis of the original results. The remainder of the thesis presents the algorithms, their convergence analysis, and the numerical implementation.
6

Cook, James Allen. "A decompositional investigation of 3D face recognition." Queensland University of Technology, 2007. http://eprints.qut.edu.au/16653/.

Abstract:
Automated Face Recognition is the process of determining a subject's identity from digital imagery of their face without user intervention. The term in fact encompasses two distinct tasks: Face Verification is the process of verifying a subject's claimed identity, while Face Identification involves selecting the most likely identity from a database of subjects. This dissertation focuses on the task of Face Verification, which has a myriad of applications in security ranging from border control to personal banking. Recently the use of 3D facial imagery has found favour in the research community due to its inherent robustness to the pose and illumination variations which plague the 2D modality. The field of 3D face recognition has, however, yet to fully mature, and there remain many unanswered research questions particular to the modality. The relative expense and specialty of 3D acquisition devices also mean that the availability of databases of 3D face imagery lags significantly behind that of standard 2D face images. Human recognition of faces is rooted in an inherently 2D visual system, and much is known regarding the use of 2D image information in the recognition of individuals. The corresponding knowledge of how discriminative information is distributed in the 3D modality is much less well defined. This dissertation addresses these issues through the use of decompositional techniques. Decomposition alleviates the problems associated with dimensionality explosion and the Small Sample Size (SSS) problem, and spatial decomposition is a technique which has been widely used in face recognition. The application of decomposition in the frequency domain, however, has not received the same attention in the literature. The use of decomposition techniques allows a mapping of the regions (both spatial and frequency) which contain the discriminative information that enables recognition. In this dissertation these techniques are covered in significant detail, both in terms of practical issues in the respective domains and in terms of the underlying distributions which they expose. Significant discussion is given to the manner in which the inherent information of the human face is manifested in the 2D and 3D domains and how these two modalities inter-relate. This investigation is extended to cover the manner in which the decomposition techniques presented can be recombined into a single decision. Two new methods for learning the weighting functions for both the sum and product rules are presented, and extensive testing against established methods is reported. Knowledge acquired from these examinations is then used to create a combined technique termed Log-Gabor Templates. The proposed technique utilises both the spatial and frequency domains to achieve performance superior to either in isolation. Experimentation demonstrates that the spatial and frequency domain decompositions are complementary and can be combined to give improved performance and robustness.
7

Luo, Chao-Wei, and 羅兆為. "Using virtual sample and linear independence to solve small sample size problem." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/89244692498377091129.

Abstract:
Master's thesis, Chaoyang University of Technology, Department of Information Engineering, academic year 103. This research proposes a novel algorithm to solve the small sample size problem. The problem is difficult because statistical methods cannot be used to estimate the distribution of the training samples, so conventional methods that apply to large-sample problems do not apply here. The proposed approach generates virtual samples to increase the number of samples, then calculates the probability that each sample is noise in order to filter the data. After filtering, linear independence is used to select support vectors. The experimental results indicate that the proposed method is effective.
8

Athanasiadis, Savvas. "The small sample size problem in gene expression tasks." Master's thesis, 2015. http://www.nusl.cz/ntk/nusl-339536.

Abstract:
Charles University in Prague, Faculty of Pharmacy in Hradec Králové, Department of Biophysics and Physical Chemistry. Candidate: Savvas Athanasiadis. Supervisor: Jurjen Duintjer Tebbens. Title of diploma thesis: The small sample size problem in gene expression tasks. The thesis addresses classification of genes to tumor types based on their gene expression signatures. The number of variables to be investigated is typically very high (in the thousands), while it is expensive and time-consuming to analyze a high number of genes; usually at most tens of samples are available. The combination of a small sample size with a large number of variables makes standard statistical classification methods inappropriate. The thesis focuses on a modification of a standard classification method, Fisher's linear discriminant analysis, for the case where the number of samples is smaller than the number of variables. It proposes an improved strategy to test this modified method with leave-one-out cross validation. Using so-called low-rank updates of the involved covariance matrices, the computational cost of the cross validation process can be reduced by an order of magnitude. Memory demands are reduced as well.
9

Chang, Kuang-Yu, and 張光佑. "Exploring the Factors of Feature Extractions for Small Sample Size Classification Problem." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/yv7ca2.

Abstract:
Master's thesis, National Taichung University of Education, Graduate Institute of Educational Measurement and Statistics, academic year 94. For high-dimensional data classification, statistics-based classifiers suffer from the Hughes phenomenon because of limited training data. Feature extraction can mitigate this problem, but it in turn suffers from a singular or nearly singular within-class scatter matrix. This thesis explores the key points of designing a feature extraction method for the small sample size case. Many studies show that the definitions of the within-class and between-class scatter matrices, together with regularization techniques, are the key design choices for small-sample feature extraction. Three feature extraction methods, PCA, LDA, and NWFE, are used in this thesis, and several popular and new regularization techniques are compared. Eigenvalue decomposition is introduced and explored together with feature extraction and regularization. Hyperspectral image data and educational testing data are used in the experiments. For the hyperspectral image data there are two training-sample cases, ill-posed and poorly posed; for the educational testing data, the training samples per class are ten and twenty. Besides the ML classifier, 1NN and SVM are compared. The experimental results show that different regularization techniques suit different eigenvalue decompositions, and that nonparametric scatter matrices with RFE regularization and EIG decomposition in the maximum likelihood classifier form a robust combination for small sample size classification.
10

Lin, Hsien-Jen, and 林咸仁. "On Improving Linear Discriminant Analysis for Face Recognition with Small Sample Size Problem." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/26zux8.

Full text
Abstract:
Master's thesis, National Cheng Kung University, Department of Computer Science and Information Engineering, 2002. In the face recognition literature, LDA (Linear Discriminant Analysis) is a popular linear transformation technique for extracting discriminant feature vectors. However, it can run into a singularity problem when computing the inverse of the within-class scatter matrix if the training sample size is small. To overcome this problem, an improved linear discriminant analysis is proposed in this thesis. The proposed method transforms the original feature vector into a new feature space with the same degree of scattering but without the singularity problem; LDA is then applied to extract the new feature vector. In experiments on five face databases, the recognition rates of the proposed method are significantly better than those of other LDA-based techniques when training data are sufficient, and even with very limited training data the proposed method achieves desirable recognition performance at moderate training cost.
APA, Harvard, Vancouver, ISO, and other styles
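Several of the LDA-related theses above revolve around the same core issue: when the number of training samples is smaller than the feature dimension, the within-class scatter matrix is rank-deficient and cannot be inverted. The following minimal NumPy sketch (illustrative only, not code from any of the cited works, with arbitrary dimensions and a hypothetical ridge parameter) demonstrates the singularity and one common regularization remedy:

```python
# Illustrative sketch of the "small sample size problem" in LDA:
# with fewer training samples than feature dimensions, the within-class
# scatter matrix S_w is singular, so S_w^{-1} S_b cannot be formed directly.
import numpy as np

rng = np.random.default_rng(0)
d, n_per_class = 50, 5           # dimension far exceeds samples per class

# Two classes of n_per_class samples each in d dimensions
X1 = rng.normal(0.0, 1.0, (n_per_class, d))
X2 = rng.normal(1.0, 1.0, (n_per_class, d))

def within_class_scatter(*classes):
    """S_w = sum over classes c of sum_x (x - m_c)(x - m_c)^T."""
    Sw = np.zeros((classes[0].shape[1],) * 2)
    for Xc in classes:
        Xc_centered = Xc - Xc.mean(axis=0)
        Sw += Xc_centered.T @ Xc_centered
    return Sw

Sw = within_class_scatter(X1, X2)
rank = np.linalg.matrix_rank(Sw)
print(rank, "of", d)   # rank is at most n_total - n_classes = 8, far below d

# A simple ridge-style regularization (one common remedy, with an
# arbitrarily chosen parameter) restores invertibility:
Sw_reg = Sw + 1e-3 * np.eye(d)
print(np.linalg.matrix_rank(Sw_reg))   # full rank d
```

Each centered class contributes at most n_per_class - 1 to the rank, which is why approaches in the listed works turn to regularization, alternative scatter definitions, or dimensionality reduction before inversion.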

Book chapters on the topic "Unequal sample size problem"

1

Aitkin, Murray, and Mikis Stasinopoulos. "Likelihood Analysis of a Binomial Sample Size Problem." In Contributions to Probability and Statistics. Springer New York, 1989. http://dx.doi.org/10.1007/978-1-4612-3678-8_28.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Aitkin, Murray, and Mikis Stasinopoulos. "Likelihood Analysis of a Binomial Sample Size Problem." In Contributions to Probability and Statistics. Springer New York, 1989. http://dx.doi.org/10.1007/978-1-4612-3678-8_36.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Hothorn, L. "Sample Size Estimation for Several Trend Tests in the k-Sample Problem." In Computational Statistics. Physica-Verlag HD, 1992. http://dx.doi.org/10.1007/978-3-642-48678-4_50.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Zheng, WeiShi, JianHuang Lai, and P. C. Yuen. "Fast Calculation for Fisher Criteria in Small Sample Size Problem." In Advances in Biometric Person Authentication. Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-540-30548-4_38.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Dornaika, Fadi, and Alireza Bosagzadeh. "On Solving the Small Sample Size Problem for Marginal Fisher Analysis." In Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-39094-4_14.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Milov, Vladimir R., and Andrey V. Savchenko. "Classification of Dangerous Situations for Small Sample Size Problem in Maintenance Decision Support Systems." In Communications in Computer and Information Science. Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-52920-2_31.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Kim, Sang-Woon. "On Using a Dissimilarity Representation Method to Solve the Small Sample Size Problem for Face Recognition." In Advanced Concepts for Intelligent Vision Systems. Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11864349_107.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Halldórsson, Bjarni V., Derek Aguiar, Ryan Tarpine, and Sorin Istrail. "The Clark Phase-able Sample Size Problem: Long-Range Phasing and Loss of Heterozygosity in GWAS." In Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-12683-3_11.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Chen, Wensheng, Pong C. Yuen, Jian Huang, and Daoqing Dai. "A Novel One-Parameter Regularized Linear Discriminant Analysis for Solving Small Sample Size Problem in Face Recognition." In Advances in Biometric Person Authentication. Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-540-30548-4_37.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Kim, Sang-Woon, and Robert P. W. Duin. "On Combining Dissimilarity-Based Classifiers to Solve the Small Sample Size Problem for Appearance-Based Face Recognition." In Advances in Artificial Intelligence. Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-72665-4_10.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Unequal sample size problem"

1

CHEN, Wenan, and Hongbin ZHANG. "WEIGHTED PROJECTION APPROACH FOR SMALL SAMPLE SIZE PROBLEM." In 11th Joint International Computer Conference - JICC 2005. WORLD SCIENTIFIC, 2005. http://dx.doi.org/10.1142/9789812701534_0197.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Yu, ChunMei, Quan Pan, YongMei Cheng, and HongCai Zhang. "Small sample size problem of fault diagnosis for process industry." In 2010 8th IEEE International Conference on Control and Automation (ICCA). IEEE, 2010. http://dx.doi.org/10.1109/icca.2010.5524343.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Shao, Yuan-Hai, Zhen Wang, Chun-Na Li, and Nai-Yang Deng. "Locality Sensitive Proximal Classifier with Consistency for Small Sample Size Problem." In 2015 IEEE International Conference on Data Mining Workshop (ICDMW). IEEE, 2015. http://dx.doi.org/10.1109/icdmw.2015.180.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Gao, Liang, and Xiaoyun Liu. "Boundary detection method based on supervising for small sample size problem." In 2011 4th International Congress on Image and Signal Processing (CISP). IEEE, 2011. http://dx.doi.org/10.1109/cisp.2011.6100403.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Zhou, Li-na, Rui Huang, Xian-hua Li, and Ling Chen. "Semi-Supervised Covariance Estimation Using Clustering for Small Sample Size Problem." In 2009 1st International Conference on Information Science and Engineering (ICISE 2009). IEEE, 2009. http://dx.doi.org/10.1109/icise.2009.1056.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Wang, Tong, Xiaoxia Cao, Tian Xia, and Zhizhen Yang. "Solving the small sample size problem in protein subcellular localization prediction." In 2012 5th International Conference on Biomedical Engineering and Informatics (BMEI). IEEE, 2012. http://dx.doi.org/10.1109/bmei.2012.6513152.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Yu-jie Zheng, Jing-yu Yang, Jian Yang, and Xiao-jun Wu. "Effective classification image space which can solve small sample size problem." In 18th International Conference on Pattern Recognition (ICPR'06). IEEE, 2006. http://dx.doi.org/10.1109/icpr.2006.472.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Sun, Wei, Xiaoying Song, and Qilong Zhang. "Resolving energy hole problem based on unequal cluster size in heterogeneous Wireless Sensor Networks." In 2020 7th International Conference on Information Science and Control Engineering (ICISCE). IEEE, 2020. http://dx.doi.org/10.1109/icisce50968.2020.00486.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Prasad, Saurabh, and Lori Mann Bruce. "Overcoming the Small Sample Size Problem in Hyperspectral Classification and Detection Tasks." In IGARSS 2008 - 2008 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2008. http://dx.doi.org/10.1109/igarss.2008.4780108.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Rao, Wei, and Man-Wai Mak. "Alleviating the small sample-size problem in i-vector based speaker verification." In 2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012). IEEE, 2012. http://dx.doi.org/10.1109/iscslp.2012.6423527.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "Unequal sample size problem"

1

Malej, Matt, and Fengyan Shi. Suppressing the pressure-source instability in modeling deep-draft vessels with low under-keel clearance in FUNWAVE-TVD. Engineer Research and Development Center (U.S.), 2021. http://dx.doi.org/10.21079/11681/40639.

Full text
Abstract:
This Coastal and Hydraulics Engineering Technical Note (CHETN) documents the development through verification and validation of three instability-suppressing mechanisms in FUNWAVE-TVD, a Boussinesq-type numerical wave model, when modeling deep-draft vessels with a low under-keel clearance (UKC). Many large commercial ports and channels (e.g., Houston Ship Channel, Galveston, US Army Corps of Engineers [USACE]) are traveled and affected by tens of thousands of commercial vessel passages per year. In a series of recent projects undertaken for the Galveston District (USACE), it was discovered that when deep-draft vessels are modeled using pressure-source mechanisms, they can suffer from model instabilities when low UKC is employed (e.g., a vessel draft of 12 m in a channel of 15 m or less of depth), rendering a simulation unstable and obsolete. As an increasingly large number of deep-draft vessels are put into service, this problem is becoming more severe. This presents an operational challenge when modeling large container-type vessels in busy shipping channels, as these often will come as close as 1 m to the bottom of the channel, or even touch the bottom. This behavior would subsequently exhibit a numerical discontinuity in a given model and could severely limit the sample size of modeled vessels. This CHETN outlines a robust approach to suppressing such instability without compromising the integrity of the far-field vessel wave/wake solution. The three methods developed in this study aim to suppress high-frequency spikes generated near-field of a vessel: a shock-capturing method, a friction method, and a viscosity method, respectively. The tests show that the combined shock-capturing and friction method is the most effective at suppressing the local high-frequency noise while not affecting the far-field solution. A strong test, in which the target draft is larger than the channel depth, shows that no high-frequency noise is generated in the case of ship squat as long as the shock-capturing method is used.
APA, Harvard, Vancouver, ISO, and other styles