Academic literature on the topic 'Variable sample size methods'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Variable sample size methods.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Variable sample size methods"

1

Krejić, Nataša, and Nataša Krklec Jerinkić. "Nonmonotone line search methods with variable sample size." Numerical Algorithms 68, no. 4 (2014): 711–39. http://dx.doi.org/10.1007/s11075-014-9869-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Krejić, Nataša, and Nataša Krklec. "Line search methods with variable sample size for unconstrained optimization." Journal of Computational and Applied Mathematics 245 (June 2013): 213–31. http://dx.doi.org/10.1016/j.cam.2012.12.020.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Kitikidou, K., and G. Chatzilazarou. "Estimating the sample size for fitting taper equations." Journal of Forest Science 54, no. 4 (2008): 176–82. http://dx.doi.org/10.17221/789-jfs.

Full text
Abstract:
Much work has been done fitting taper equations to describe tree bole shapes, but few researchers have investigated how large the sample size should be. In this paper, a method that requires two variables that are linearly correlated was applied to determine the sample size for fitting taper equations. Two cases of sample size estimation were tested, based on the method mentioned above. In the first case, the sample size required is referred to the total number of diameters estimated in the sampled trees. In the second case, the sample size required is referred to the number of sampled trees. The analysis showed that both methods are efficient from a validity standpoint but the first method has the advantage of decreased cost, since it costs much more to incrementally sample another tree than it does to make another diameter measurement on an already sampled tree.
APA, Harvard, Vancouver, ISO, and other styles
4

Ali, Sabz, Amjad Ali, Sajjad Ahmad Khan, and Sundas Hussain. "Sufficient Sample Size and Power in Multilevel Ordinal Logistic Regression Models." Computational and Mathematical Methods in Medicine 2016 (2016): 1–8. http://dx.doi.org/10.1155/2016/7329158.

Full text
Abstract:
For most of the time, biomedical researchers have been dealing with ordinal outcome variable in multilevel models where patients are nested in doctors. We can justifiably apply multilevel cumulative logit model, where the outcome variable represents the mild, severe, and extremely severe intensity of diseases like malaria and typhoid in the form of ordered categories. Based on our simulation conditions, Maximum Likelihood (ML) method is better than Penalized Quasilikelihood (PQL) method in three-category ordinal outcome variable. PQL method, however, performs equally well as ML method where five-category ordinal outcome variable is used. Further, to achieve power more than 0.80, at least 50 groups are required for both ML and PQL methods of estimation. It may be pointed out that, for five-category ordinal response variable model, the power of PQL method is slightly higher than the power of ML method.
APA, Harvard, Vancouver, ISO, and other styles
5

Vinogradov, A. G. "USING R FOR PSYCHOLOGICAL RESEARCH: A TUTORIAL OF BASIC METHODS." Ukrainian Psychological Journal, no. 2 (14) (2020): 28–63. http://dx.doi.org/10.17721/upj.2020.2(14).2.

Full text
Abstract:
The article belongs to a special modern genre of scholar publications, so-called tutorials – articles devoted to the application of the latest methods of design, modeling or analysis in an accessible format in order to disseminate best practices. The article acquaints Ukrainian psychologists with the basics of using the R programming language to the analysis of empirical research data. The article discusses the current state of world psychology in connection with the Crisis of Confidence, which arose due to the low reproducibility of empirical research. This problem is caused by poor quality of psychological measurement tools, insufficient attention to adequate sample planning, typical statistical hypothesis testing practices, and so-called “questionable research practices.” The tutorial demonstrates methods for determining the sample size depending on the expected magnitude of the effect size and desired statistical power, performing basic variable transformations and statistical analysis of psychological research data using language and environment R. The tutorial presents minimal system of R functions required to carry out: modern analysis of reliability of measurement scales, sample size calculation, point and interval estimation of effect size for four the most widespread in psychology designs for the analysis of two variables’ interdependence. These typical problems include finding the differences between the means and variances in two or more samples, correlations between continuous and categorical variables. Practical information on data preparation, import, basic transformations, and application of basic statistical methods in the cloud version of RStudio is provided.
APA, Harvard, Vancouver, ISO, and other styles
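
The sample-size planning step described in entry 5's tutorial abstract (choosing n from a target effect size and statistical power) can be reproduced with standard power-analysis routines. The snippet below is a minimal sketch in Python using statsmodels rather than R; the effect size, alpha, and power values are chosen purely for illustration and are not taken from the article.

```python
# Sketch: sample size for a two-sample t-test from a target effect size and power.
# Values (d = 0.5, alpha = 0.05, power = 0.80) are illustrative, not from the article.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05,
                                    power=0.80, alternative='two-sided')
print(f"Required sample size per group: {n_per_group:.1f}")  # ~64 per group

# The same object can map a fixed budget back to achievable power.
power_at_n = analysis.power(effect_size=0.5, nobs1=40, ratio=1.0, alpha=0.05)
print(f"Power with 40 subjects per group: {power_at_n:.2f}")
```
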
6

Zhao, Naifei, Qingsong Xu, Man-lai Tang, and Hong Wang. "Variable Screening for Near Infrared (NIR) Spectroscopy Data Based on Ridge Partial Least Squares Regression." Combinatorial Chemistry & High Throughput Screening 23, no. 8 (2020): 740–56. http://dx.doi.org/10.2174/1386207323666200428114823.

Full text
Abstract:
Aim and Objective: Near Infrared (NIR) spectroscopy data are featured by few dozen to many thousands of samples and highly correlated variables. Quantitative analysis of such data usually requires a combination of analytical methods with variable selection or screening methods. Commonly-used variable screening methods fail to recover the true model when (i) some of the variables are highly correlated, and (ii) the sample size is less than the number of relevant variables. In these cases, Partial Least Squares (PLS) regression based approaches can be useful alternatives. Materials and Methods : In this research, a fast variable screening strategy, namely the preconditioned screening for ridge partial least squares regression (PSRPLS), is proposed for modelling NIR spectroscopy data with high-dimensional and highly correlated covariates. Under rather mild assumptions, we prove that using Puffer transformation, the proposed approach successfully transforms the problem of variable screening with highly correlated predictor variables to that of weakly correlated covariates with less extra computational effort. Results: We show that our proposed method leads to theoretically consistent model selection results. Four simulation studies and two real examples are then analyzed to illustrate the effectiveness of the proposed approach. Conclusion: By introducing Puffer transformation, high correlation problem can be mitigated using the PSRPLS procedure we construct. By employing RPLS regression to our approach, it can be made more simple and computational efficient to cope with the situation where model size is larger than the sample size while maintaining a high precision prediction.
APA, Harvard, Vancouver, ISO, and other styles
7

Endrenyi, Laszlo, and Laszlo Tothfalusi. "Sample Sizes for Designing Bioequivalence Studies for Highly Variable Drugs." Journal of Pharmacy & Pharmaceutical Sciences 15, no. 1 (2011): 73. http://dx.doi.org/10.18433/j3z88f.

Full text
Abstract:
Purpose. To provide tables of sample sizes which are required, by the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA), for the design of bioequivalence (BE) studies involving highly variable drugs. To elucidate the complicated features of the relationship between sample size and within-subject variation. Methods. 3- and 4-period studies were simulated with various sample sizes. They were evaluated, at various variations and various true ratios of the two geometric means (GMR), by the approaches of scaled average BE and by average BE with expanding limits. The sample sizes required for yielding 80% and 90% statistical powers were determined. Results. Because of the complicated regulatory expectations, the features of the required sample sizes are also complicated. When the true GMR = 1.0 then, without additional constraints, the sample size is independent of the intrasubject variation. When the true GMR is increased or decreased from 1.0 then the required sample sizes rise at above but close to 30% variation. An additional regulatory constraint on the point estimate of GMR and a cap on the use of expanding limits further increase the required sample size at high variations. Fewer subjects are required by the FDA than by the EMA procedures. Conclusions. The methods proposed by EMA and FDA lower the required sample sizes in comparison with unscaled average BE. However, each additional regulatory requirement (applying the mixed procedure, imposing a constraint on the point estimate of GMR, and using a cap on the application of expanding limits) raises the required number of subjects.
APA, Harvard, Vancouver, ISO, and other styles
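
Entry 7 determines required sample sizes by simulating studies at increasing n until the target power is reached. The sketch below illustrates that general workflow for plain (unscaled) average bioequivalence on a simplified paired design; it is not the scaled-ABE or expanding-limits procedure evaluated in the article, and the GMR and within-subject CV values are assumptions.

```python
# Sketch: simulation-based sample size search for (unscaled) average bioequivalence.
# Simplified paired-design illustration, not the scaled-ABE/expanding-limits methods
# from the article; GMR and CV values are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def power_abe(n, gmr=0.95, cv_within=0.40, n_trials=2000, alpha=0.05):
    """Fraction of simulated trials whose 90% CI for the GMR lies in [0.80, 1.25]."""
    sigma_w = np.sqrt(np.log(1.0 + cv_within**2))     # within-subject log-scale SD
    sd_diff = np.sqrt(2.0) * sigma_w                  # SD of within-subject log T-R difference
    d = rng.normal(np.log(gmr), sd_diff, size=(n_trials, n))
    mean, se = d.mean(axis=1), d.std(axis=1, ddof=1) / np.sqrt(n)
    t_crit = stats.t.ppf(1 - alpha, df=n - 1)         # two one-sided tests at 5%
    lo, hi = mean - t_crit * se, mean + t_crit * se
    return np.mean((lo > np.log(0.80)) & (hi < np.log(1.25)))

n = 12
while power_abe(n) < 0.80:
    n += 2
print("Approximate subjects needed for 80% power:", n)
```
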
8

Van Delden, Arnout, Bart J. Du Chatinier, and Sander Scholtus. "Accuracy in the Application of Statistical Matching Methods for Continuous Variables Using Auxiliary Data." Journal of Survey Statistics and Methodology 8, no. 5 (2019): 990–1017. http://dx.doi.org/10.1093/jssam/smz032.

Full text
Abstract:
Statistical matching is a technique to combine variables in two or more nonoverlapping samples that are drawn from the same population. In the current study, the unobserved joint distribution between two target variables in nonoverlapping samples is estimated using a parametric model. A classical assumption to estimate this joint distribution is that the target variables are independent given the background variables observed in both samples. A problem with the use of this conditional independence assumption is that the estimated joint distribution may be severely biased when the assumption does not hold, which in general will be unacceptable for official statistics. Here, we explored to what extent the accuracy can be improved by the use of two types of auxiliary information: the use of a common administrative variable and the use of a small additional sample from a similar population. This additional sample is included by using the partial correlation of the target variables given the background variables or by using an EM algorithm. In total, four different approaches were compared to estimate the joint distribution of the target variables. Starting with empirical data, we show how the accuracy of the joint distribution is affected by the use of administrative data and by the size of the additional sample included via a partial correlation and through an EM algorithm. The study further shows how this accuracy depends on the strength of the relations among the target and auxiliary variables. We found that including a common administrative variable does not always improve the accuracy of the results. We further found that the EM algorithm nearly always yielded the most accurate results; this effect is largest when the explained variance of the separate target variables by the common background variables is not large.
APA, Harvard, Vancouver, ISO, and other styles
9

Hussain, Sarfraz, Abdul Quddus, Pham Phat Tien, Muhammad Rafiq, and Drahomíra Pavelková. "The moderating role of firm size and interest rate in capital structure of the firms: selected sample from sugar sector of Pakistan." Investment Management and Financial Innovations 17, no. 4 (2020): 341–55. http://dx.doi.org/10.21511/imfi.17(4).2020.29.

Full text
Abstract:
The selection of financing is a top priority for businesses, particularly in short- and long-term investment decisions. Mixing debt and equity leads to decisions on the financial structure for businesses. This research analyzes the moderate position of company size and the interest rate in the capital structure over six years (2013–2018) for 29 listed Pakistani enterprises operating in the sugar market. This research employed static panel analysis and dynamic panel analysis on linear and nonlinear regression methods. The capital structure included debt to capital ratio, non-current liabilities, plus current liabilities to capital as a dependent variable. Independent variables were profitability, firm size, tangibility, Non-Debt Tax Shield, liquidity, and macroeconomic variables were exchange rates and interest rates. The investigation reported that profitability, firm size, and Non-Debt Tax Shield were significant and negative, while tangibility and interest rates significantly and positively affected debt to capital ratio. This means the sugar sector has greater financial leverage to manage the funding obligations for the better performance of firms. Therefore, the outcomes revealed that the moderators have an important influence on capital structure.
APA, Harvard, Vancouver, ISO, and other styles
10

Xu, Zhou, Xiaojing Chen, Liuwei Meng, Mingen Yu, Limin Li, and Wen Shi. "Sample Consensus Model and Unsupervised Variable Consensus Model for Improving the Accuracy of a Calibration Model." Applied Spectroscopy 73, no. 7 (2019): 747–58. http://dx.doi.org/10.1177/0003702819852174.

Full text
Abstract:
In the quantitative analysis of spectral data, small sample size and high dimensionality of spectral variables often lead to poor accuracy of a calibration model. We proposed two methods, namely sample consensus and unsupervised variable consensus models, in order to solve the problem of poor accuracy. Three public near-infrared (NIR) or infrared (IR) spectroscopy data from corn, wine, and soil were used to build the partial least squares regression (PLSR) model. Then, Monte Carlo sampling and unsupervised variable clustering methods of a self-organizing map were coupled with the consensus modeling strategy to establish the multiple sub-models. Finally, sample consensus and unsupervised variable consensus models were obtained by assigning the weights to each PLSR sub-model. The calculated results show that both sample consensus and unsupervised variable consensus models can significantly improve the accuracy of the calibration model compared to the single PLSR model. The effectiveness of these two methods points out a new approach to achieve a further accurate result, which can take full advantage of the sample information and valid variable information.
APA, Harvard, Vancouver, ISO, and other styles
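
Entry 10's sample consensus idea — many sub-models fitted on Monte Carlo subsamples and combined with weights — can be sketched as follows. The inverse-validation-RMSE weighting, the number of latent variables, and the synthetic data are placeholders, not the authors' exact scheme.

```python
# Sketch of a sample-consensus PLSR: fit PLS sub-models on Monte Carlo subsamples
# and average their predictions with weights based on held-out error.
# The inverse-RMSE weighting is an assumed, simplified stand-in for the paper's rule.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 200))                   # small-sample, high-dimensional spectra (synthetic)
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=60)

n_sub, frac, models, weights = 30, 0.7, [], []
for _ in range(n_sub):
    idx = rng.choice(len(y), size=int(frac * len(y)), replace=False)
    oob = np.setdiff1d(np.arange(len(y)), idx)   # held-out samples used for weighting
    pls = PLSRegression(n_components=5).fit(X[idx], y[idx])
    rmse = np.sqrt(np.mean((pls.predict(X[oob]).ravel() - y[oob]) ** 2))
    models.append(pls)
    weights.append(1.0 / rmse)

w = np.array(weights) / np.sum(weights)
y_consensus = sum(wi * m.predict(X).ravel() for wi, m in zip(w, models))
print("Consensus RMSE:", np.sqrt(np.mean((y_consensus - y) ** 2)))
```
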
More sources

Dissertations / Theses on the topic "Variable sample size methods"

1

Krklec Jerinkić, Nataša. "Line search methods with variable sample size." PhD thesis, Univerzitet u Novom Sadu, Prirodno-matematički fakultet u Novom Sadu, 2014. http://dx.doi.org/10.2298/NS20140117KRKLEC.

Full text
Abstract:
The problem under consideration is an unconstrained optimization problem with the objective function in the form of mathematical expectation. The expectation is with respect to the random variable that represents the uncertainty. Therefore, the objective function is in fact deterministic. However, finding the analytical form of that objective function can be very difficult or even impossible. This is the reason why the sample average approximation is often used. In order to obtain a reasonably good approximation of the objective function, we have to use a relatively large sample size. We assume that the sample is generated at the beginning of the optimization process and therefore we can consider this sample average objective function as the deterministic one. However, applying some deterministic method on that sample average function from the start can be very costly. The number of evaluations of the function under expectation is a common way of measuring the cost of an algorithm. Therefore, methods that vary the sample size throughout the optimization process are developed. Most of them are trying to determine the optimal dynamics of increasing the sample size. The main goal of this thesis is to develop the class of methods that can decrease the cost of an algorithm by decreasing the number of function evaluations. The idea is to decrease the sample size whenever it seems to be reasonable - roughly speaking, we do not want to impose a large precision, i.e. a large sample size, when we are far away from the solution we search for. The detailed description of the new methods is presented in Chapter 4 together with the convergence analysis. It is shown that the approximate solution is of the same quality as the one obtained by dealing with the full sample from the start. Another important characteristic of the methods that are proposed here is the line search technique which is used for obtaining the subsequent iterates. The idea is to find a suitable direction and to search along it until we obtain a sufficient decrease in the function value. The sufficient decrease is determined through the line search rule. In Chapter 4, that rule is supposed to be monotone, i.e. we are imposing strict decrease of the function value. In order to decrease the cost of the algorithm even more and to enlarge the set of suitable search directions, we use nonmonotone line search rules in Chapter 5. Within that chapter, these rules are modified to fit the variable sample size framework. Moreover, the conditions for the global convergence and the R-linear rate are presented. In Chapter 6, numerical results are presented. The test problems are various - some of them are academic and some of them are real world problems. The academic problems are here to give us more insight into the behavior of the algorithms. On the other hand, data that comes from the real world problems are here to test the real applicability of the proposed algorithms. In the first part of that chapter, the focus is on the variable sample size techniques. Different implementations of the proposed algorithm are compared to each other and to other sample schemes as well. The second part is mostly devoted to the comparison of the various line search rules combined with different search directions in the variable sample size framework. The overall numerical results show that using the variable sample size can improve the performance of the algorithms significantly, especially when the nonmonotone line search rules are used. The first chapter of this thesis provides the background material for the subsequent chapters. In Chapter 2, basics of the nonlinear optimization are presented and the focus is on the line search, while Chapter 3 deals with the stochastic framework. These chapters are here to provide the review of the relevant known results, while the rest of the thesis represents the original contribution.
APA, Harvard, Vancouver, ISO, and other styles
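
A toy illustration of the variable-sample-size idea described in entry 1's abstract: a backtracking (Armijo) line search applied to a sample-average objective, where the sample is enlarged only when the achieved decrease becomes comparable to a rough estimate of the sampling error. The update rule and the one-dimensional problem are simplified placeholders, not the thesis's actual algorithm.

```python
# Toy sketch: gradient descent with Armijo line search on a sample-average objective,
# growing the sample size N only when progress becomes comparable to sampling noise.
# The N-update heuristic is illustrative, not the thesis's scheduling rule.
import numpy as np

rng = np.random.default_rng(0)
xi = rng.normal(1.0, 2.0, size=10_000)          # full sample defining f(x) = E[(x - xi)^2]

def f_and_grad(x, n):
    s = xi[:n]
    return np.mean((x - s) ** 2), 2.0 * np.mean(x - s)

x, N, N_max = 5.0, 50, len(xi)
for it in range(100):
    fx, g = f_and_grad(x, N)
    step, d = 1.0, -g
    while f_and_grad(x + step * d, N)[0] > fx + 1e-4 * step * g * d:
        step *= 0.5                              # Armijo backtracking
    x_new = x + step * d
    decrease = fx - f_and_grad(x_new, N)[0]
    noise = np.std(xi[:N]) / np.sqrt(N)          # crude estimate of sample-average error
    if decrease < noise and N < N_max:
        N = min(2 * N, N_max)                    # demand more precision near the solution
    x = x_new
print(f"x ≈ {x:.3f} with final sample size N = {N}")
```
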
2

Fernandes, Jessica Katherine de Sousa. "Estudo de algoritmos de otimização estocástica aplicados em aprendizado de máquina." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-28092017-182905/.

Full text
Abstract:
In different Machine Learning applications we may be interested in minimizing the expected value of a certain loss function. For the resolution of this problem, stochastic optimization and sample size selection play an important role. In the present work, we present the theoretical analysis of some algorithms from these two areas, including some variations that consider variance reduction. In the practical examples we can observe the advantage of Stochastic Gradient Descent with respect to processing time and memory, but, considering the accuracy of the obtained solution together with the cost of minimization, the variance reduction methodologies obtain the best solutions. The Dynamic Sample Size Gradient and Line Search with variable sample size selection algorithms, despite obtaining better solutions than Stochastic Gradient Descent, have the disadvantage of a high computational cost.
APA, Harvard, Vancouver, ISO, and other styles
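
The dynamic-sample-size gradient idea compared in entry 2 — start with cheap, noisy mini-batch gradients and enlarge the batch as the iterates approach a solution — can be sketched like this. The geometric batch-growth schedule and the synthetic least-squares problem are illustrative assumptions.

```python
# Sketch: mini-batch gradient descent whose batch size grows geometrically,
# trading early cheap noisy steps for later accurate ones (illustrative schedule).
import numpy as np

rng = np.random.default_rng(42)
A = rng.normal(size=(5000, 20))
w_true = rng.normal(size=20)
b = A @ w_true + 0.1 * rng.normal(size=5000)     # synthetic least-squares problem

w, batch, lr = np.zeros(20), 32, 0.1
for it in range(150):
    batch = min(int(batch * 1.2), len(b))        # dynamic sample size: grow the batch
    idx = rng.choice(len(b), size=batch, replace=False)
    grad = A[idx].T @ (A[idx] @ w - b[idx]) / batch
    w -= lr * grad
print("Relative error:", np.linalg.norm(w - w_true) / np.linalg.norm(w_true))
```
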
3

Rožnjik, Andrea. "Optimizacija problema sa stohastičkim ograničenjima tipa jednakosti – kazneni metodi sa promenljivom veličinom uzorka." PhD thesis, Univerzitet u Novom Sadu, Prirodno-matematički fakultet u Novom Sadu, 2019. https://www.cris.uns.ac.rs/record.jsf?recordId=107819&source=NDLTD&language=en.

Full text
Abstract:
Stochastic programming problems with equality constraints are considered within the thesis. More precisely, the problem is a minimization problem with constraints in the form of mathematical expectation. We proposed two iterative methods for solving the considered problem. Both procedures, in each iteration, use a sample average function instead of the mathematical expectation function, and employ the advantages of the variable sample size method based on adaptive updating of the sample size. That means the sample size is determined at every iteration using information from the current iteration. Concretely, the current precision of the approximation of the expectation and the quality of the approximation of the solution determine the sample size for the next iteration. Both iterative procedures are based on the line search technique as well as on the quadratic penalty method adapted to the stochastic environment, since the considered problem has constraints. The procedures rely on the same ideas, but the approach is different. By the first approach, the algorithm is created for solving an SAA reformulation of the stochastic programming problem, i.e., for solving the approximation of the original problem. That means the sample size is determined before the iterative procedure, so the convergence analysis is deterministic. We show that, under the standard assumptions, the proposed algorithm generates a subsequence whose accumulation point is a KKT point of the SAA problem. The algorithm formed by the second approach is for solving the stochastic programming problem itself, and therefore the convergence analysis is stochastic. It generates a subsequence with an accumulation point that is almost surely a KKT point of the original problem, under the standard assumptions for stochastic optimization. The proposed algorithms are implemented on the same test problems. The results of numerical testing show their efficiency in solving the considered problems in comparison with procedures in which the sample size update is based on a predefined scheme; the number of function evaluations is used as the measure of efficiency. The results obtained on the set of tested problems suggest that adaptive sample size scheduling can reduce the number of function evaluations in the case of constrained problems, too. Since the considered problem is deterministic, but the formed procedures are stochastic, the first three chapters of the thesis contain basic notions of deterministic and stochastic optimization, as well as a short overview of definitions and theorems from other fields necessary for easier tracking of the original results analysis. The rest of the thesis consists of the presented algorithms, their convergence analysis and numerical implementation.
APA, Harvard, Vancouver, ISO, and other styles
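
For readers unfamiliar with the construction referred to in entry 3's abstract, a generic sample-average quadratic-penalty formulation looks as follows; the notation is assumed for illustration and is not necessarily the thesis's own.

```latex
% Generic SAA quadratic-penalty formulation (notation assumed for illustration):
% a sample-average objective with sample-average equality constraints, folded into
% a penalty function with parameter \mu_k while the sample size N_k adapts per iteration.
\[
\min_{x}\ \hat{f}_{N}(x)=\frac{1}{N}\sum_{i=1}^{N}F(x,\xi_i)
\quad\text{s.t.}\quad
\hat{h}_{N}(x)=\frac{1}{N}\sum_{i=1}^{N}H(x,\xi_i)=0,
\]
\[
\Phi_{N_k,\mu_k}(x)\;=\;\hat{f}_{N_k}(x)\;+\;\frac{\mu_k}{2}\,\bigl\|\hat{h}_{N_k}(x)\bigr\|^{2},
\qquad \mu_k\uparrow\infty,\quad N_k\ \text{updated adaptively at each iteration.}
\]
```
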
4

Hagen, Clinton Ernest. "Comparing the performance of four calculation methods for estimating the sample size in repeated measures clinical trials where difference in treatment groups means is of interest." Oklahoma City : [s.n.], 2008.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

Suen, Wai-sing Alan, and 孫偉盛. "Sample size planning for clinical trials with repeated measurements." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B31972172.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Winkelried, Diego. "Methods to improve the finite sample behaviour of instrumental variable estimators." Thesis, University of Cambridge, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.609238.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Oymak, Okan. "Sample size determination for estimation of sensor detection probabilities based on a test variable." Thesis, Monterey, Calif. : Naval Postgraduate School, 2007. http://bosun.nps.edu/uhtbin/hyperion-image.exe/07Jun%5FOymak.pdf.

Full text
Abstract:
Thesis (M.S. in Operations Research), Naval Postgraduate School, June 2007. Thesis Advisor(s): Lyn R. Whitaker. Includes bibliographical references (p. 95–96). Also available in print.
APA, Harvard, Vancouver, ISO, and other styles
8

Tan, Say Beng. "Bayesian decision theoretic methods for clinical trials." Thesis, Imperial College London, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.312988.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Bofill, Roig Marta. "Statistical methods and software for clinical trials with binary and survival endpoints : efficiency, sample size and two-sample comparison." Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/670371.

Full text
Abstract:
Defining the scientific question is the starting point for any clinical study. However, even though the main objective is generally clear, how this is addressed is not usually straightforward. Clinical studies very often encompass several questions, defined as primary and secondary hypotheses, and measured through different endpoints. In clinical trials with multiple endpoints, composite endpoints, defined as the union of several endpoints, are widely used as primary endpoints. The use of composite endpoints is mainly motivated because they are expected to increase the number of observed events and to capture more information than by only considering one endpoint. Besides, it is generally thought that the power of the study will increase if using composite endpoints and that the treatment effect on the composite endpoint will be similar to the average effect of its components. However, these assertions are not necessarily true and the design of a trial with a composite endpoint might be difficult. Different types of endpoints might be chosen for different research stages. This is the case for cancer trials, where short-term binary endpoints based on the tumor response are common in early-phase trials, whereas overall survival is the gold standard in late-phase trials. In recent years, there has been a growing interest in designing seamless trials with both early response outcome and later event times. Considering these two endpoints together could provide a wider characterization of the treatment effect and also may reduce the duration of clinical trials and their costs. In this thesis, we provide novel methodologies to design clinical trials with composite binary endpoints and to compare two treatment groups based on binary and time-to-event endpoints. In addition, we present the implementation of the methodologies by means of different statistical tools. Specifically, in Chapter 2, we propose a general strategy for sizing a trial with a composite binary endpoint as primary endpoint based on previous information on its components. In Chapter 3, we present the ARE (Asymptotic Relative Efficiency) method to choose between a composite binary endpoint or one of its components as the primary endpoint of a trial. In Chapter 4, we propose a class of two-sample nonparametric statistics for testing the equality of proportions and the equality of survival functions. In Chapter 5, we describe the software developed to implement the methods proposed in this thesis. In particular, we present CompARE, a web-based tool for designing clinical trials with composite endpoints and its corresponding R package, and the R package SurvBin in which we have implemented the class of statistics presented in Chapter 4. We conclude this dissertation with general conclusions and some directions for future research in Chapter 6.
APA, Harvard, Vancouver, ISO, and other styles
10

Matsouaka, Roland Albert. "Contributions to Imputation Methods Based on Ranks and to Treatment Selection Methods in Personalized Medicine." Thesis, Harvard University, 2012. http://dissertations.umi.com/gsas.harvard:10078.

Full text
Abstract:
The chapters of this thesis focus two different issues that arise in clinical trials and propose novel methods to address them. The first issue arises in the analysis of data with non-ignorable missing observations. The second issue concerns the development of methods that provide physicians better tools to understand and treat diseases efficiently by using each patient's characteristics and personal biomedical profile. Inherent to most clinical trials is the issue of missing data, specially those that arise when patients drop out the study without further measurements. Proper handling of missing data is crucial in all statistical analyses because disregarding missing observations can lead to biased results. In the first two chapters of this thesis, we deal with the "worst-rank score" missing data imputation technique in pretest-posttest clinical trials. Subjects are randomly assigned to two treatments and the response is recorded at baseline prior to treatment (pretest response), and after a pre-specified follow-up period (posttest response). The treatment effect is then assessed on the change in response from baseline to the end of follow-up time. Subjects with missing response at the end of follow-up are assign values that are worse than any observed response (worst-rank score). Data analysis is then conducted using Wilcoxon-Mann-Whitney test. In the first chapter, we derive explicit closed-form formulas for power and sample size calculations using both tied and untied worst-rank score imputation, where the worst-rank scores are either a fixed value (tied score) or depend on the time of withdrawal (untied score). We use simulations to demonstrate the validity of these formulas. In addition, we examine and compare four different simplification approaches to estimate sample sizes. These approaches depend on whether data from the literature or a pilot study are available. In second chapter, we introduce the weighted Wilcoxon-Mann-Whitney test on un-tied worst-rank score (composite) outcome. First, we demonstrate that the weighted test is exactly the ordinary Wilcoxon-Mann-Whitney test when the weights are equal. Then, we derive optimal weights that maximize the power of the corresponding weighted Wilcoxon-Mann-Whitney test. We prove, using simulations, that the weighted test is more powerful than the ordinary test. Furthermore, we propose two different step-wise procedures to analyze data using the weighted test and assess their performances through simulation studies. Finally, we illustrate the new approach using data from a recent randomized clinical trial of normobaric oxygen therapy on patients with acute ischemic stroke. The third and last chapter of this thesis concerns the development of robust methods for treatment groups identification in personalized medicine. As we know, physicians often have to use a trial-and-error approach to find the most effective medication for their patients. Personalized medicine methods aim at tailoring strategies for disease prevention, detection or treatment by using each individual subject's personal characteristics and medical profile. This would result to (1) better diagnosis and earlier interventions, (2) maximum therapeutic benefits and reduced adverse events, (3) more effective therapy, and (4) more efficient drug development. Novel methods have been proposed to identify subgroup of patients who would benefit from a given treatment. 
In the last chapter of this thesis, we develop a robust method for treatment assignment for future patients based on the expected total outcome. In addition, we provide a method to assess the incremental value of new covariate(s) in improving treatment assignment. We evaluate the accuracy of our methods through simulation studies and illustrate them with two examples using data from two HIV/AIDS clinical trials.
APA, Harvard, Vancouver, ISO, and other styles
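
Entry 10's worst-rank imputation can be illustrated with a small numerical sketch: dropouts receive scores worse than any observed change (here, untied scores ordered by dropout time), and the two arms are then compared with a Wilcoxon-Mann-Whitney test. All numbers and the specific scoring rule are made up for illustration.

```python
# Sketch: untied worst-rank score imputation followed by a Wilcoxon-Mann-Whitney test.
# Observed values are change-from-baseline scores; dropouts get scores below every
# observed value, with earlier dropout treated as worse (all data are synthetic).
import numpy as np
from scipy.stats import mannwhitneyu

def worst_rank_scores(observed, dropout_weeks, follow_up=24):
    """Completers keep their observed change; dropouts get untied worst-rank scores."""
    floor = min(observed) - 1.0
    # Earlier withdrawal -> worse (more negative) imputed score.
    imputed = [floor - (follow_up - w) for w in dropout_weeks]
    return np.concatenate([observed, imputed])

treatment = worst_rank_scores(observed=[4.1, 2.5, 3.3, 5.0, 1.8], dropout_weeks=[20])
control   = worst_rank_scores(observed=[1.2, 0.4, 2.1, -0.5],     dropout_weeks=[6, 14])

stat, p = mannwhitneyu(treatment, control, alternative='two-sided')
print(f"U = {stat:.1f}, p = {p:.3f}")
```
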
More sources

Books on the topic "Variable sample size methods"

1

Zarnoch, Stanley J. Determining sample size for tree utilization surveys. U.S. Dept. of Agriculture, Forest Service, Southern Research Station, 2004.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Handbook of sample size guidelines for clinical trials. CRC Press, 1989.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

Brush, Gary G. How to choose the proper sample size. American Society for Quality Control, 1988.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
4

Lwanga, S. Kaggwa. Sample size determination in health studies: A practical manual. World Health Organization, 1991.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

Julious, Steven A. Sample sizes for clinical trials. Taylor & Francis, 2009.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Shuster, Jonathan J. Practical handbook of sample size guidelines for clinical trials. CRC Press, 1992.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
7

Shuster, Jonathan J. Practical handbook of sample size guidelines for clinical trials. CRC Press, 1993.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
8

Heo, Moonseong, Song Zhang, and Mimi Y. Kim, eds. Sample size calculations for clustered and longitudinal outcomes in clinical research. CRC Press, Taylor & Francis, 2015.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
9

Kieser, Meinhard. Methods and Applications of Sample Size Calculation and Recalculation in Clinical Trials. Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-49528-2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Moura, Eduardo C. How to determine sample size and estimate failure rate in life testing. ASQC Quality Press, 1991.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Book chapters on the topic "Variable sample size methods"

1

Chakraborty, Dev P. "Sample size estimation." In Observer Performance Methods for Diagnostic Imaging. CRC Press, 2017. http://dx.doi.org/10.1201/9781351228190-11.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Lee, Hang. "Sample Size and Power." In Foundations of Applied Statistical Methods. Springer International Publishing, 2013. http://dx.doi.org/10.1007/978-3-319-02402-8_8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Indrayan, Abhaya. "Sampling and Sample Size." In Research Methods for Medical Graduates. CRC Press, 2019. http://dx.doi.org/10.1201/9780429435034-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Patten, Mildred L., and Michelle Newhart. "Sample Size in Quantitative Studies." In Understanding Research Methods. Routledge, 2017. http://dx.doi.org/10.4324/9781315213033-36.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Kelley, Ken, and Scott E. Maxwell. "Sample size planning." In APA handbook of research methods in psychology, Vol 1: Foundations, planning, measures, and psychometrics. American Psychological Association, 2012. http://dx.doi.org/10.1037/13619-012.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Lyman, Stephen. "Power and Sample Size." In Basic Methods Handbook for Clinical Orthopaedic Research. Springer Berlin Heidelberg, 2019. http://dx.doi.org/10.1007/978-3-662-58254-1_20.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Mütze, Tobias, and Tim Friede. "Sample Size Re-Estimation." In Handbook of Statistical Methods for Randomized Controlled Trials. Chapman and Hall/CRC, 2021. http://dx.doi.org/10.1201/9781315119694-16.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Wang, Hansheng, and Shein-Chung Chow. "Sample Size for Comparing Means." In Methods and Applications of Statistics in Clinical Trials. John Wiley & Sons, Inc., 2014. http://dx.doi.org/10.1002/9781118596333.ch39.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Wang, Hansheng, and Shein-Chung Chow. "Sample Size for Comparing Proportions." In Methods and Applications of Statistics in Clinical Trials. John Wiley & Sons, Inc., 2014. http://dx.doi.org/10.1002/9781118596333.ch40.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Wang, Hansheng, and Shein-Chung Chow. "Sample Size for Comparing Variabilities." In Methods and Applications of Statistics in Clinical Trials. John Wiley & Sons, Inc., 2014. http://dx.doi.org/10.1002/9781118596333.ch42.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Variable sample size methods"

1

Jalilzadeh, Afrooz, Angelia Nedic, Uday V. Shanbhag, and Farzad Yousefian. "A Variable Sample-Size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization." In 2018 IEEE Conference on Decision and Control (CDC). IEEE, 2018. http://dx.doi.org/10.1109/cdc.2018.8619209.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Radha, P., and K. Rajagopalan. "Reliability of Stiffened Cylindrical Shells Using Random Polar Sampling Technique." In ASME 2004 23rd International Conference on Offshore Mechanics and Arctic Engineering. ASMEDC, 2004. http://dx.doi.org/10.1115/omae2004-51059.

Full text
Abstract:
Rational design of stiffened cylindrical shell structures calls for the calculation of reliability. Random variables occur in modelling loads and strengths. The reliability can be evaluated with ‘Monte-Carlo Simulation’ (MCS) which consists of obtaining cumulative distribution functions for each and every random variable and simulating the ultimate strength of stiffened shells for combinations of random variable values. However for MCS to be successful, the sample size should be very large. Hence methods have been proposed to reduce the sample size without however sacrificing any accuracy on reliability. ‘Point Estimation Method’ (PEM), ‘Response Surface Technique’ (RST), ‘Importance Sampling Procedure Using Design points’ (ISPUD), ‘Latin Hypercube Sampling’ (LHS) etc., are some of these methods. In this paper, a method based on ‘Random Polar Sampling Technique’ (RPST) is proposed, in which combinations of variates are obtained using a polar sampling of Latin Hypercube sampled values. A typical example has been worked out using this method.
APA, Harvard, Vancouver, ISO, and other styles
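
While the 'Random Polar Sampling Technique' in entry 2 is specific to the paper, the Latin Hypercube Sampling it builds on is standard. The sketch below estimates a failure probability for a toy limit-state function with LHS, purely to illustrate the reduced-sample-size setup; the limit state, distributions, and parameters are assumptions, not taken from the paper.

```python
# Sketch: small-sample reliability estimate with Latin Hypercube Sampling.
# Toy limit state g = R - S (strength minus load); failure when g < 0.
# Distributions and parameters are illustrative only.
import numpy as np
from scipy.stats import qmc, norm, lognorm

n, d = 500, 2
sampler = qmc.LatinHypercube(d=d, seed=0)
u = sampler.random(n)                                  # uniform(0,1) LHS design

strength = lognorm(s=0.10, scale=300.0).ppf(u[:, 0])   # resistance R
load = norm(loc=200.0, scale=30.0).ppf(u[:, 1])        # load effect S
g = strength - load

pf = np.mean(g < 0.0)
print(f"Estimated failure probability with {n} LHS samples: {pf:.4f}")
```
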
3

Prots, Andriy, Lars Högner, Matthias Voigt, Ronald Mailach, and Florian Danner. "Improved Quality Assessment of Probabilistic Simulations and Application to Turbomachinery." In ASME Turbo Expo 2020: Turbomachinery Technical Conference and Exposition. American Society of Mechanical Engineers, 2020. http://dx.doi.org/10.1115/gt2020-16147.

Full text
Abstract:
Abstract Probabilistic methods are gaining in importance in aerospace engineering due to their ability to describe the behavior of the system in the presence of input value variance. A frequently employed probabilistic method is the Monte Carlo Simulation (MCS). There, a sample of random representative realizations is evaluated deterministically and their results are afterwards analyzed with statistical methods. Possible statistical results are mean, standard deviation, quantile values and correlation coefficients. Since the sample is generated randomly, the result of a MCS will differ for each repetition. Therefore, it can be regarded as a random variable. Confidence Intervals (CIs) are commonly used to quantify this variance. To gain the true CI, many repetitions of the MCS have to be conducted, which is not desirable due to limitations in time and computational power. Hence, analytical formulations or bootstrapping is used to estimate the CI. In order to reduce the variance of the result of a MCS, sampling techniques with variance reduction properties like Latin Hypercube Sampling (LHS) are commonly used. But the known methods to determine the CI do not consider this variance reduction and tend to overestimate it instead. Furthermore, it is difficult to predict the change of the CI size with increasing size of the sample. In the present work, new methods to calculate the CI are introduced. They allow a more precise CI estimation when LHS is used for a MCS. For this purpose, the system is approximated by means of a meta model. The distribution of the result value is now approximated by repeating the MCS many times. The time consuming deterministic calculations of a MCS are thus replaced with an evaluation on the meta model. These so called virtual MCS can therefore be performed in a short amount of time. The estimated distribution of the result value can be used to estimate the CI. It is, however, not sufficient to use only the meta model. The error ε, defined as the difference between the true value y and the approximated value y, must be considered as well. The generated meta model can also be used to predict the size of the CI at different sample sizes. The suggested methods were applied to two test cases. The first test case examines a structural mechanics application of a bending beam, which features low computational cost. This allows to show that the predicted sizes of the CI are sufficiently precise. The second test case covers the aerodynamic application. Therefore, an aerodynamic Computational Fluid Dynamics (CFD) analysis accounting for geometrical variations of NASA’s Rotor 37 is conducted. For this, the blade is parametrized with the in-house tool Blade2Parameter. For different sample sizes, blades are generated using this parametrization. Their geometrical variance is based on experience values. CFD calculations for these blades are performed with the commercial software NUMECA. Afterwards, the CIs for result values of interest like mechanical efficiency are evaluated with the presented methods. The suggested methods predict a narrower and thus less conservative CI.
APA, Harvard, Vancouver, ISO, and other styles
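
The bootstrap confidence interval that entry 3 takes as a baseline can be sketched in a few lines. The result sample here is a synthetic stand-in, and this plain bootstrap treats the Monte Carlo results as i.i.d., so it ignores the LHS variance-reduction effect discussed in the abstract — which is exactly the limitation the paper addresses.

```python
# Sketch: bootstrap percentile CI for the mean of a Monte Carlo result sample.
# A plain bootstrap assumes i.i.d. results and therefore tends to overstate the CI
# when the design used LHS, which is the paper's point.
import numpy as np

rng = np.random.default_rng(7)
mcs_results = rng.normal(loc=0.91, scale=0.004, size=200)   # e.g. efficiencies from 200 runs

n_boot = 5000
boot_means = np.array([rng.choice(mcs_results, size=len(mcs_results), replace=True).mean()
                       for _ in range(n_boot)])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"Mean = {mcs_results.mean():.5f}, 95% bootstrap CI = [{lo:.5f}, {hi:.5f}]")
```
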
4

Junbo, Liu, Ding Shuiting, and Li Guo. "Influence of Random Variable Dimension on the Fast Numerical Integration Method of Aero Engine Rotor Disk Failure Risk Analysis." In ASME 2020 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 2020. http://dx.doi.org/10.1115/imece2020-23513.

Full text
Abstract:
Abstract In the risk assessment of turbine rotor disks, the probability of failure of a certain disk type (after N flight cycles) is a vital criterion for estimating whether the disk is safe to use. Monte Carlo simulation (MCS) is often used to calculate the failure probability but is costly because it requires a large sample size. The numerical integration (NI) algorithm has been proven more efficient than MCS in conditions entailing three random variables. However, the previous studies on the NI method have not dealt with the influence of random variable dimension on calculation efficiency. Hence, this study aims to summarize the influence of variable dimensions on the time cost of a fastintegration algorithm. The time cost increases exponentially with the number of variables in the NI method. This conclusion provides a reference for the selection of probability algorithms involving multiple variables. The findings are expected to be of interest to the practice of efficient security design that considers multivariable conditions.
APA, Harvard, Vancouver, ISO, and other styles
5

Heredia-Zavoni, Ernesto, and Roberto Montes-Iturrizaga. "Analytical Models for the Probability Distributions of Fatigue Parameters and Crack Size of Offshore Structures Based on Bayesian Updating." In ASME 2002 21st International Conference on Offshore Mechanics and Arctic Engineering. ASMEDC, 2002. http://dx.doi.org/10.1115/omae2002-28111.

Full text
Abstract:
In this paper, a Bayesian framework is used for updating the probability distributions of the parameters of a fatigue model and of crack size in tubular joints using information from inspection reports of fixed offshore structures. For crack detection, the uncertainties are taken into account by means of probability-of-detection (POD) curves. According to the Bayesian procedure, if during an inspection no crack is detected, the updated (posterior) distributions depend on the prior ones at the time of such inspection and on the POD. On the other hand, if during an inspection a crack is detected and measured, the corresponding predicted crack depth at that time is estimated given values of parameters of a selected fatigue model and of the initial crack depth. Then, a sample value of the model and sizing error associated with the inspection performed, defined as the logarithmic difference between the measured and the predicted crack size, is calculated. Such error is considered to be a normally distributed random variable with known mean and uncertain variance. The distribution of the error variance is taken as a conjugate one for samples of normally distributed variables with known mean and uncertain variance. Based on these assumptions, an analytical expression is obtained for the updated (posterior) distributions of the parameters of the fatigue model and of crack size. It is shown that the updated distributions depend on the POD and on the prior and updated parameters of the error variance distribution. Finally, the Bayesian method proposed here is illustrated taking as a fatigue model the Paris-Erdogan relation, which estimates crack growth based on linear elastic fracture mechanics. Joint failure is considered to occur when the crack depth reaches the thickness of the element where the crack propagates. The evolution of reliability with time is assessed.
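For readers unfamiliar with the fatigue model named at the end of the abstract, the following minimal sketch integrates the Paris-Erdogan relation cycle by cycle until the crack depth reaches the wall thickness; all material constants, the stress range, and the thickness are hypothetical, and the Bayesian updating itself is not shown.

```python
# Illustrative sketch of the Paris-Erdogan relation da/dN = C * (dK)**m,
# with dK = Y * dS * sqrt(pi * a). Parameter values are hypothetical.
import math

C, m_exp = 2.0e-12, 3.0        # assumed Paris constants (SI-like units)
Y, dS = 1.0, 80.0              # assumed geometry factor and stress range [MPa]
a, thickness = 0.5e-3, 20e-3   # initial crack depth and wall thickness [m]

cycles = 0
block = 10_000                 # integrate in blocks of cycles to keep the loop short
while a < thickness:
    dK = Y * dS * math.sqrt(math.pi * a)   # stress intensity factor range
    a += block * C * dK ** m_exp           # crack growth over this block of cycles
    cycles += block

print(f"Cycles to grow through the wall thickness: {cycles:,}")
```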
APA, Harvard, Vancouver, ISO, and other styles
6

Lichtenberg, Glen, and Kirsten Carr. "Performance Evaluation of Occupant Classification Systems." In ASME 2004 International Mechanical Engineering Congress and Exposition. ASMEDC, 2004. http://dx.doi.org/10.1115/imece2004-60132.

Full text
Abstract:
The new FMVSS 208 Federal Regulation requires restraint systems to focus on occupants other than the 50th percentile male. The new focus includes small adults and children. As a result, restraint systems may need to perform differently for several occupant classes, thereby creating a need for occupant classification systems (OCS). A typical regulation compliance strategy is to suppress the restraint system when a child occupies the front passenger seat and to enable the restraints when an adult occupies the seat. The regulation provides specific weight and height ranges to define these classes of seat occupants. The evolution of OCS technologies produced a need for test methodologies and objective metrics to measure classification system capability. The application of the statistical one-sided tolerance interval to OCSs has proven invaluable in measuring classification performance and driving system improvements. The one-sided tolerance method is based on a single continuous variable, such as weight. A single common threshold, or tolerance limit, is used to compare two competing populations, such as 6-year-old versus 5th percentile female populations. The method produces graphics demonstrating reliability as a function of the potential threshold, which objectively characterize a system’s classification performance level. This paper also discusses the importance of applying the one-sided tolerance interval method to performance data that captures the noise sources that impact system performance. For occupant classification systems, noise sources include differences in test subjects’ sizes, how they sit in the seat, and how the seat is set up. This paper also discusses the importance of sample size selection. Two methods of determining a sample size are presented. The first method uses the one-sided tolerance interval method equation directly. The second method simulates a noise source and selects a sample size where the noise standard deviation converges to its population value. Once the mean, standard deviation, and sample size for each test case are known, the proposed method computes the reliability of each test case evaluated for a range of potential thresholds. A review of the resulting reliability curves characterizes classification performance. If an acceptable range of thresholds exists, the resulting range is referred to as a “threshold window.” System improvements can be directed toward those test cases that constrain the “threshold window.” This paper proposes a statistical method that can provide a solid measure of the robust capability of an OCS that classifies based on a single continuous variable (such as weight) to distinguish between occupant classes. This statistical method enables the careful balance necessary in setting thresholds.
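A minimal sketch of a one-sided tolerance limit for a single continuous variable such as occupant weight is given below, using the standard noncentral-t factor; the simulated weights and the 90 percent coverage and confidence levels are assumptions, and this is not the authors' exact test procedure.

```python
# Minimal sketch, not the authors' procedure: one-sided upper tolerance limit
# covering a proportion p of the population with confidence gamma.
import numpy as np
from scipy.stats import nct, norm

rng = np.random.default_rng(1)
weights = rng.normal(loc=22.0, scale=2.5, size=30)   # hypothetical child weights [kg]

n, p, gamma = len(weights), 0.90, 0.90               # 90% coverage with 90% confidence
# Exact k-factor via the noncentral t distribution.
k = nct.ppf(gamma, df=n - 1, nc=norm.ppf(p) * np.sqrt(n)) / np.sqrt(n)
upper_limit = weights.mean() + k * weights.std(ddof=1)

print(f"One-sided upper tolerance limit: {upper_limit:.1f} kg (k = {k:.3f})")
```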
APA, Harvard, Vancouver, ISO, and other styles
7

Seita, Daniel, Xinlei Pan, Haoyu Chen, and John Canny. "An Efficient Minibatch Acceptance Test for Metropolis-Hastings." In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/753.

Full text
Abstract:
We present a novel Metropolis-Hastings method for large datasets that uses small expected-size mini-batches of data. Previous work on reducing the cost of Metropolis-Hastings tests yields only constant factor reductions versus using the full dataset for each sample. Here we present a method that can be tuned to provide arbitrarily small batch sizes, by adjusting either proposal step size or temperature. Our test uses the noise-tolerant Barker acceptance test with a novel additive correction variable. The resulting test has similar cost to a normal SGD update. Our experiments demonstrate several order-of-magnitude speedups over previous work.
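The sketch below shows only the classical full-data Barker acceptance step that the paper's minibatch test builds on, namely accepting with probability 1/(1 + exp(-Δ)); the noise-tolerant additive correction variable that is the paper's contribution is omitted, and the target density and proposal are invented.

```python
# Sketch of the standard (full-data) Barker acceptance test; the minibatch
# correction from the paper is not reproduced here. Target and proposal are
# hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(2)

def log_post(theta):
    return -0.5 * np.sum(theta ** 2)      # hypothetical target: standard normal

theta = np.zeros(2)
samples = []
for _ in range(5000):
    proposal = theta + 0.5 * rng.normal(size=theta.shape)   # symmetric random walk
    delta = log_post(proposal) - log_post(theta)
    # Barker test: accept with probability 1 / (1 + exp(-delta)).
    if rng.random() < 1.0 / (1.0 + np.exp(-delta)):
        theta = proposal
    samples.append(theta)

print("Posterior mean estimate:", np.mean(samples, axis=0))
```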
APA, Harvard, Vancouver, ISO, and other styles
8

Mehmani, Ali, Souma Chowdhury, and Achille Messac. "Variable-Fidelity Optimization With In-Situ Surrogate Model Refinement." In ASME 2015 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2015. http://dx.doi.org/10.1115/detc2015-47188.

Full text
Abstract:
Owing to the typical low fidelity of surrogate models, it is often challenging to accomplish reliable optimum solutions in Surrogate-Based Optimization (SBO) for complex nonlinear problems. This paper addresses this challenge by developing a new model-independent approach to refine the surrogate model during optimization, with the objective to maintain a desired level of fidelity and robustness “where” and “when” needed. The proposed approach, called Adaptive Model Refinement (AMR), is designed to work particularly with population-based optimization algorithms. In AMR, reconstruction of the model is performed by sequentially adding a batch of new samples at any given iteration (of SBO) when a refinement metric is met. This metric is formulated by comparing (1) the uncertainty associated with the outputs of the current model, and (2) the distribution of the latest fitness function improvement over the population of candidate designs. Whenever the model-refinement metric is met, the history of the fitness function improvement is used to determine the desired fidelity for the upcoming iterations of SBO. Predictive Estimation of Model Fidelity (an advanced surrogate model error metric) is applied to determine the model uncertainty and the batch size for the samples to be added. The location of the new samples in the input space is determined based on a hypercube enclosing promising candidate designs, and a distance-based criterion that minimizes the correlation between the current sample points and the new points. The powerful mixed-discrete PSO algorithm is used in conjunction with different surrogate models (e.g., Kriging, RBF, SVR) to apply the new AMR method. The performance of the proposed AMR-based SBO is investigated through three different benchmark functions.
APA, Harvard, Vancouver, ISO, and other styles
9

Zhu, Zhifu, and Xiaoping Du. "A System Reliability Method With Dependent Kriging Predictions." In ASME 2016 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2016. http://dx.doi.org/10.1115/detc2016-59030.

Full text
Abstract:
When limit-state functions are highly nonlinear, traditional reliability methods, such as the first order and second order reliability methods, are not accurate. Monte Carlo simulation (MCS), on the other hand, is accurate if a sufficient sample size is used, but is computationally intensive. This research proposes a new system reliability method that combines MCS and the Kriging method with improved accuracy and efficiency. Cheaper surrogate models are created for limit-state functions with the minimal variance in the estimate of the system reliability, thereby producing high accuracy for the system reliability prediction. Instead of employing global optimization, this method uses MCS samples from which training points for the surrogate models are selected. By considering the dependence between responses from a surrogate model, this method captures the true contribution of each MCS sample to the uncertainty in the estimate of the system reliability and therefore chooses training points efficiently. Good accuracy and efficiency are demonstrated by three examples.
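A loose sketch of the general MCS-plus-Kriging idea is given below: a Gaussian-process surrogate trained on a few limit-state evaluations is swept over the MCS pool to estimate the failure probability. The limit-state function, training-point selection, and sample sizes are assumptions; the paper's dependence-aware selection of training points is not reproduced.

```python
# Rough sketch, not the paper's method: estimate P(g < 0) by evaluating a
# Kriging/GP surrogate of the limit-state function g over an MCS pool.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def g(x):                                   # hypothetical limit-state function
    return 3.0 - x[:, 0] ** 2 - 0.5 * x[:, 1]

rng = np.random.default_rng(3)
pool = rng.normal(size=(100_000, 2))        # MCS pool of the random inputs

train_idx = rng.choice(len(pool), size=30, replace=False)   # a few "expensive" runs
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
gp.fit(pool[train_idx], g(pool[train_idx]))

mean, std = gp.predict(pool, return_std=True)
pf_hat = np.mean(mean < 0.0)                # surrogate-based failure probability
print(f"Estimated P_f = {pf_hat:.4f} (direct MCS value: {np.mean(g(pool) < 0.0):.4f})")
```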
APA, Harvard, Vancouver, ISO, and other styles
10

Lee, Ungki, and Ikjin Lee. "Sampling-Based Reliability Analysis Using Deep Feedforward Neural Network (DFNN)." In ASME 2020 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2020. http://dx.doi.org/10.1115/detc2020-22275.

Full text
Abstract:
Abstract Reliability analysis that evaluates a probabilistic constraint is an important part of reliability-based design optimization (RBDO). Inverse reliability analysis evaluates the percentile value of the performance function that satisfies the reliability. To compute the percentile value, analytical methods, surrogate model based methods, and sampling-based methods are commonly used. When the dimension or nonlinearity of the performance function is high, sampling-based methods such as Monte Carlo simulation, Latin hypercube sampling, and importance sampling can be directly used for reliability analysis since no analytical formulation or surrogate model is required in these methods. The sampling-based methods have high accuracy but require a large number of samples, which can be very time-consuming. Therefore, this paper proposes methods that can improve the accuracy of reliability analysis when the number of samples is not sufficient and sampling-based methods are considered to be the better candidates. This study starts with the idea of training the relationship between the realization of the performance function at a small sample size and the corresponding true percentile value of the performance function. A deep feedforward neural network (DFNN), one of the promising artificial neural network models that approximate high-dimensional models using deep layered structures, is trained using the realization of various performance functions at a small sample size and the corresponding true percentile values as input and target training data, respectively. In this study, various polynomial functions and random variables are used to create training data sets consisting of various realizations and corresponding true percentile values. A method that approximates the realization of the performance function through kernel density estimation and trains the DFNN with the discrete points representing the shape of the kernel distribution to reduce the dimension of the training input data is also presented. Along with the proposed reliability analysis methods, a strategy that reuses samples of the previous design point to enhance the efficiency of the percentile value estimation is explained. The results show that the reliability analysis using the DFNN is more accurate than the method using only samples. In addition, compared to the method that trains the DFNN using the realization of the performance function, the method that trains the DFNN with the discrete points representing the shape of the kernel distribution improves the accuracy of reliability analysis and reduces the training time. It is also verified that the proposed sample reuse strategy reduces the burden of function evaluation at a new design point by reusing the samples of the previous design point when the design point changes during RBDO.
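The training idea described above can be sketched as follows: each training input is a small-sample realization of a performance function and the target is the corresponding true percentile, with a feedforward network learning the mapping. The polynomial family, sample sizes, and network architecture below are assumptions, and the kernel-density variant and sample reuse strategy are not shown.

```python
# Loose sketch of the training idea only, not the authors' setup.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
small_n, big_n, q = 20, 20_000, 95

X_train, y_train = [], []
for _ in range(1000):
    a, b, c = rng.uniform(-2, 2, size=3)                  # random quadratic performance function
    u_small = rng.normal(size=small_n)
    u_big = rng.normal(size=big_n)
    realization = np.sort(a * u_small**2 + b * u_small + c)            # network input
    true_percentile = np.percentile(a * u_big**2 + b * u_big + c, q)   # training target
    X_train.append(realization)
    y_train.append(true_percentile)

dfnn = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
dfnn.fit(np.array(X_train), np.array(y_train))

# Estimate the 95th percentile of a new performance function from 20 samples only.
u_new = rng.normal(size=small_n)
new_input = np.sort(1.5 * u_new**2 - 0.4 * u_new + 0.3)
print("DFNN percentile estimate:", dfnn.predict(new_input.reshape(1, -1))[0])
```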
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "Variable sample size methods"

1

Kott, Phillip S. Better Coverage Intervals for Estimators from a Complex Sample Survey. RTI Press, 2020. http://dx.doi.org/10.3768/rtipress.2020.mr.0041.2002.

Full text
Abstract:
Coverage intervals for a parameter estimate computed using complex survey data are often constructed by assuming the parameter estimate has an asymptotically normal distribution and the measure of the estimator’s variance is roughly chi-squared. The size of the sample and the nature of the parameter being estimated render this conventional “Wald” methodology dubious in many applications. I developed a revised method of coverage-interval construction that “speeds up the asymptotics” by incorporating an estimated measure of skewness. I discuss how skewness-adjusted intervals can be computed for ratios, differences between domain means, and regression coefficients.
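The sketch below is not Kott's estimator; it applies a simple moment-based skewness shift to an ordinary Wald interval (in the spirit of Johnson's modified t) only to illustrate why a skewness adjustment changes the interval for a skewed survey variable. The simulated data are hypothetical.

```python
# Illustrative only: a moment-based skewness adjustment to a Wald interval,
# not the skewness-adjusted intervals developed in this report.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
y = rng.lognormal(mean=0.0, sigma=1.0, size=200)    # skewed survey-like variable

n = len(y)
xbar, s = y.mean(), y.std(ddof=1)
m3 = np.mean((y - xbar) ** 3)                       # third central moment
t = stats.t.ppf(0.975, df=n - 1)

wald = (xbar - t * s / np.sqrt(n), xbar + t * s / np.sqrt(n))
shift = m3 / (6 * s ** 2 * n)                       # skewness correction to the center
adjusted = (xbar + shift - t * s / np.sqrt(n), xbar + shift + t * s / np.sqrt(n))

print("Wald 95% CI:     ", np.round(wald, 3))
print("Skew-adjusted CI:", np.round(adjusted, 3))
```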
APA, Harvard, Vancouver, ISO, and other styles
2

Webb, David W. A Comparison of Various Methods Used to Determine the Sample Size Requirements for Meeting a 90/90 Reliability Specification. Defense Technical Information Center, 2011. http://dx.doi.org/10.21236/ada540716.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Bobashev, Georgiy, R. Joey Morris, Elizabeth Costenbader, and Kyle Vincent. Assessing network structure with practical sampling methods. RTI Press, 2018. http://dx.doi.org/10.3768/rtipress.2018.op.0049.1805.

Full text
Abstract:
Using data from an enumerated network of worldwide flight connections between airports, we examine how sampling designs and sample size influence network metrics. Specifically, we apply three types of sampling designs: simple random sampling, nonrandom strategic sampling (i.e., selection of the largest airports), and a variation of snowball sampling. For the latter sampling method, we design what we refer to as a controlled snowball sampling design, which selects nodes in a manner analogous to a respondent-driven sampling design. For each design, we evaluate five commonly used measures of network structure and examine the percentage of total air traffic accounted for by each design. The empirical application shows that (1) the random and controlled snowball sampling designs give rise to more efficient estimates of the true underlying structure, and (2) the strategic sampling method can account for a greater proportion of the total number of passenger movements occurring in the network.
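A toy version of the comparison, using a synthetic scale-free graph instead of the airport network, is sketched below: the same metric (graph density) is computed under simple random node sampling and under the strategic largest-degree design. Graph size, sample size, and the choice of metric are assumptions.

```python
# Toy illustration, not the study's flight-connection data.
import networkx as nx
import numpy as np

rng = np.random.default_rng(6)
G = nx.barabasi_albert_graph(n=2000, m=3, seed=6)      # stand-in for the flight network
k = 200                                                # nodes drawn by each design

random_nodes = [int(i) for i in rng.choice(G.number_of_nodes(), size=k, replace=False)]
hub_nodes = [node for node, _ in sorted(G.degree, key=lambda nd: nd[1], reverse=True)[:k]]

print(f"Full network density: {nx.density(G):.4f}")
for label, nodes in [("simple random", random_nodes), ("strategic (largest degree)", hub_nodes)]:
    print(f"{label:>27} sample density: {nx.density(G.subgraph(nodes)):.4f}")
```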
APA, Harvard, Vancouver, ISO, and other styles
4

McPhedran, R., K. Patel, B. Toombs, et al. Food allergen communication in businesses feasibility trial. Food Standards Agency, 2021. http://dx.doi.org/10.46756/sci.fsa.tpf160.

Full text
Abstract:
Background: Clear allergen communication in food business operators (FBOs) has been shown to have a positive impact on customers’ perceptions of businesses (Barnett et al., 2013). However, the precise size and nature of this effect are not known: there is a paucity of quantitative evidence in this area, particularly in the form of randomised controlled trials (RCTs). The Food Standards Agency (FSA), in collaboration with Kantar’s Behavioural Practice, conducted a feasibility trial to investigate whether a randomised cluster trial – involving the proactive communication of allergen information at the point of sale in FBOs – is feasible in the United Kingdom (UK). Objectives: The trial sought to establish: ease of recruitment of businesses into trials; customer response rates for in-store outcome surveys; fidelity of intervention delivery by FBO staff; sensitivity of outcome survey measures to change; and appropriateness of the chosen analytical approach. Method: Following a recruitment phase – in which one of fourteen multinational FBOs was successfully recruited – the execution of the feasibility trial involved a quasi-randomised matched-pairs clustered experiment. Each of the FBO’s ten participating branches underwent pair-wise matching, with similarity of branches judged according to four criteria: Food Hygiene Rating Scheme (FHRS) score, average weekly footfall, number of staff and customer satisfaction rating. The allocation ratio for this trial was 1:1: one branch in each pair was assigned to the treatment group by a representative from the FBO, while the other continued to operate in accordance with their standard operating procedure. As a business-based feasibility trial, customers at participating branches throughout the fieldwork period were automatically enrolled in the trial. The trial was single-blind: customers at treatment branches were not aware that they were receiving an intervention. All customers who visited participating branches throughout the fieldwork period were asked to complete a short in-store survey on a tablet affixed in branches. This survey contained four outcome measures which operationalised customers’: perceptions of food safety in the FBO; trust in the FBO; self-reported confidence to ask for allergen information in future visits; and overall satisfaction with their visit. Results: Fieldwork was conducted from 3 – 20 March 2020, with cessation occurring prematurely due to the closure of outlets following the proliferation of COVID-19. n=177 participants took part in the trial across the ten branches; however, response rates (which ranged between 0.1 - 0.8%) were likely also adversely affected by COVID-19. Intervention fidelity was an issue in this study: while compliance with delivery of the intervention was relatively high in treatment branches (78.9%), erroneous delivery in control branches was also common (46.2%). Survey data were analysed using random-intercept multilevel linear regression models (due to the nesting of customers within branches). Despite the trial’s modest sample size, there was some evidence to suggest that the intervention had a positive effect for those suffering from allergies/intolerances for the ‘trust’ (β = 1.288, p<0.01) and ‘satisfaction’ (β = 0.945, p<0.01) outcome variables. Due to singularity within the fitted linear models, hierarchical Bayes models were used to corroborate the size of these interactions.
Conclusions: The results of this trial suggest that a fully powered clustered RCT would likely be feasible in the UK. In this case, the primary challenge in the execution of the trial was the recruitment of FBOs: despite high levels of initial interest from four chains, only one took part. However, it is likely that the proliferation of COVID-19 adversely impacted chain participation – two other FBOs withdrew during branch eligibility assessment and selection, citing COVID-19 as a barrier. COVID-19 also likely lowered the on-site survey response rate: a significant negative Pearson correlation was observed between daily survey completions and COVID-19 cases in the UK, highlighting a likely relationship between the two. Limitations: The trial was quasi-random: selection of branches, pair matching and allocation to treatment/control groups were not systematically conducted. These processes were undertaken by a representative from the FBO’s Safety and Quality Assurance team (with oversight from Kantar representatives on pair matching), as a result of the chain’s internal operational restrictions.
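For readers unfamiliar with the analysis family named in the results, the sketch below fits a random-intercept linear model with customers nested in branches on simulated data; the variable names and effect sizes are stand-ins, not the trial's dataset, and the Bayesian corroboration step is not shown.

```python
# Hypothetical illustration of a random-intercept multilevel linear model;
# data, variable names, and effects are simulated stand-ins.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_branches, n_per = 10, 20
branch = np.repeat(np.arange(n_branches), n_per)
treated = np.repeat(np.arange(n_branches) % 2, n_per)          # matched-pairs style allocation
allergy = rng.binomial(1, 0.2, size=branch.size)
branch_effect = rng.normal(0, 0.5, size=n_branches)[branch]    # random intercepts per branch
trust = 5 + 1.0 * treated * allergy + branch_effect + rng.normal(0, 1, size=branch.size)

df = pd.DataFrame(dict(trust=trust, treated=treated, allergy=allergy, branch=branch))
model = smf.mixedlm("trust ~ treated * allergy", data=df, groups=df["branch"])
print(model.fit().summary())
```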
APA, Harvard, Vancouver, ISO, and other styles
5

Malej, Matt, and Fengyan Shi. Suppressing the pressure-source instability in modeling deep-draft vessels with low under-keel clearance in FUNWAVE-TVD. Engineer Research and Development Center (U.S.), 2021. http://dx.doi.org/10.21079/11681/40639.

Full text
Abstract:
This Coastal and Hydraulics Engineering Technical Note (CHETN) documents the development through verification and validation of three instability-suppressing mechanisms in FUNWAVE-TVD, a Boussinesq-type numerical wave model, when modeling deep-draft vessels with a low under-keel clearance (UKC). Many large commercial ports and channels (e.g., Houston Ship Channel, Galveston, US Army Corps of Engineers [USACE]) are traveled and affected by tens of thousands of commercial vessel passages per year. In a series of recent projects undertaken for the Galveston District (USACE), it was discovered that when deep-draft vessels are modeled using pressure-source mechanisms, they can suffer from model instabilities when low UKC is employed (e.g., vessel draft of 12 m in a channel of 15 m or less of depth), rendering a simulation unstable and obsolete. As an increasingly large number of deep-draft vessels are put into service, this problem is becoming more severe. This presents an operational challenge when modeling large container-type vessels in busy shipping channels, as these often will come as close as 1 m to the bottom of the channel, or even touch the bottom. This behavior would subsequently exhibit a numerical discontinuity in a given model and could severely limit the sample size of modeled vessels. This CHETN outlines a robust approach to suppressing such instability without compromising the integrity of the far-field vessel wave/wake solution. The three methods developed in this study aim to suppress high-frequency spikes generated nearfield of a vessel. They are a shock-capturing method, a friction method, and a viscosity method, respectively. The tests show that the combined shock-capturing and friction method is the most effective method to suppress the local high-frequency noises, while not affecting the far-field solution. A strong test, in which the target draft is larger than the channel depth, shows that there are no high-frequency noises generated in the case of ship squat as long as the shock-capturing method is used.
APA, Harvard, Vancouver, ISO, and other styles
6

McDonagh, Marian, Andrea C. Skelly, Amy Hermesch, et al. Cervical Ripening in the Outpatient Setting. Agency for Healthcare Research and Quality (AHRQ), 2021. http://dx.doi.org/10.23970/ahrqepccer238.

Full text
Abstract:
Objectives. To assess the comparative effectiveness and potential harms of cervical ripening in the outpatient setting (vs. inpatient, vs. other outpatient intervention) and of fetal surveillance when a prostaglandin is used for cervical ripening. Data sources. Electronic databases (Ovid® MEDLINE®, Embase®, CINAHL®, Cochrane Central Register of Controlled Trials, and Cochrane Database of Systematic Reviews) to July 2020; reference lists; and a Federal Register notice. Review methods. Using predefined criteria and dual review, we selected randomized controlled trials (RCTs) and cohort studies of cervical ripening comparing prostaglandins and mechanical methods in outpatient versus inpatient settings; one outpatient method versus another (including placebo or expectant management); and different methods/protocols for fetal surveillance in cervical ripening using prostaglandins. When data from similar study designs, populations, and outcomes were available, random effects using profile likelihood meta-analyses were conducted. Inconsistency (using I2) and small sample size bias (publication bias, if ≥10 studies) were assessed. Strength of evidence (SOE) was assessed. All review methods followed Agency for Healthcare Research and Quality Evidence-based Practice Center methods guidance. Results. We included 30 RCTs and 10 cohort studies (73% fair quality) involving 9,618 women. The evidence is most applicable to women aged 25 to 30 years with singleton, vertex presentation and low-risk pregnancies. No studies on fetal surveillance were found. The frequency of cesarean delivery (2 RCTs, 4 cohort studies) or suspected neonatal sepsis (2 RCTs) was not significantly different using outpatient versus inpatient dinoprostone for cervical ripening (SOE: low). In comparisons of outpatient versus inpatient single-balloon catheters (3 RCTs, 2 cohort studies), differences between groups on cesarean delivery, birth trauma (e.g., cephalohematoma), and uterine infection were small and not statistically significant (SOE: low), and while shoulder dystocia occurred less frequently in the outpatient group (1 RCT; 3% vs. 11%), the difference was not statistically significant (SOE: low). In comparing outpatient catheters and inpatient dinoprostone (1 double-balloon and 1 single-balloon RCT), the difference between groups for both cesarean delivery and postpartum hemorrhage was small and not statistically significant (SOE: low). Evidence on other outcomes in these comparisons and for misoprostol, double-balloon catheters, and hygroscopic dilators was insufficient to draw conclusions. In head to head comparisons in the outpatient setting, the frequency of cesarean delivery was not significantly different between 2.5 mg and 5 mg dinoprostone gel, or latex and silicone single-balloon catheters (1 RCT each, SOE: low). Differences between prostaglandins and placebo for cervical ripening were small and not significantly different for cesarean delivery (12 RCTs), shoulder dystocia (3 RCTs), or uterine infection (7 RCTs) (SOE: low). These findings did not change according to the specific prostaglandin, route of administration, study quality, or gestational age. Small, nonsignificant differences in the frequency of cesarean delivery (6 RCTs) and uterine infection (3 RCTs) were also found between dinoprostone and either membrane sweeping or expectant management (SOE: low). These findings did not change according to the specific prostaglandin or study quality. Evidence on other comparisons (e.g., single-balloon catheter vs. 
dinoprostone) or other outcomes was insufficient. For all comparisons, there was insufficient evidence on other important outcomes such as perinatal mortality and time from admission to vaginal birth. Limitations of the evidence include the quantity, quality, and sample sizes of trials for specific interventions, particularly rare harm outcomes. Conclusions. In women with low-risk pregnancies, the risk of cesarean delivery and fetal, neonatal, or maternal harms using either dinoprostone or single-balloon catheters was not significantly different for cervical ripening in the outpatient versus inpatient setting, and similar when compared with placebo, expectant management, or membrane sweeping in the outpatient setting. This evidence is low strength, and future studies are needed to confirm these findings.
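As a compact illustration of the pooling and inconsistency measures mentioned in the review methods, the sketch below performs a DerSimonian-Laird random-effects pooling and computes I²; the review itself used profile-likelihood estimation, and the effect sizes below are invented, not the review's data.

```python
# Simplified sketch: DerSimonian-Laird random-effects pooling with I^2.
# The review used profile-likelihood estimation; DL is shown only because it is compact.
import numpy as np

yi = np.array([-0.10, 0.05, -0.25, 0.15, -0.05])    # hypothetical study effects (log scale)
vi = np.array([0.04, 0.06, 0.05, 0.08, 0.03])       # their within-study variances

w = 1.0 / vi
y_fixed = np.sum(w * yi) / np.sum(w)
Q = np.sum(w * (yi - y_fixed) ** 2)                 # Cochran's Q
k = len(yi)

tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
I2 = max(0.0, (Q - (k - 1)) / Q) * 100 if Q > 0 else 0.0

w_re = 1.0 / (vi + tau2)                            # random-effects weights
pooled = np.sum(w_re * yi) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
print(f"Pooled effect = {pooled:.3f} "
      f"(95% CI {pooled - 1.96 * se:.3f} to {pooled + 1.96 * se:.3f}), I^2 = {I2:.0f}%")
```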
APA, Harvard, Vancouver, ISO, and other styles
7

Tucker-Blackmon, Angelicque. Engagement in Engineering Pathways “E-PATH” An Initiative to Retain Non-Traditional Students in Engineering Year Three Summative External Evaluation Report. Innovative Learning Center, LLC, 2020. http://dx.doi.org/10.52012/tyob9090.

Full text
Abstract:
The summative external evaluation report described the program's impact on faculty and students participating in recitation sessions and active teaching professional development sessions over two years. Student persistence and retention in engineering courses continue to be a challenge in undergraduate education, especially for students underrepresented in engineering disciplines. The program's goal was to use peer-facilitated instruction in core engineering courses known to have high attrition rates to retain underrepresented students, especially women, in engineering to diversify and broaden engineering participation. Knowledge generated around using peer-facilitated instruction at two-year colleges can improve underrepresented students' success and participation in engineering across a broad range of institutions. Students in the program participated in peer-facilitated recitation sessions linked to fundamental engineering courses, such as engineering analysis, statics, and dynamics. These courses have the highest failure rate among women and underrepresented minority students. As a mixed-methods evaluation study, student engagement was measured as students' comfort with asking questions, collaboration with peers, and applying mathematics concepts. SPSS was used to analyze pre- and post-surveys for statistical significance. Qualitative data were collected through classroom observations and focus group sessions with recitation leaders. Semi-structured interviews were conducted with faculty members and students to understand their experiences in the program. Findings revealed that women students perceived marginalization and intimidation primarily in courses with significantly more men than women. However, they shared numerous strategies that could support them towards success through the engineering pathway. Women and underrepresented students perceived that they did not have a network of peers and faculty as role models to identify within engineering disciplines. The recitation sessions had a positive social impact on Hispanic women. As opportunities to collaborate increased, Hispanic women's social engagement was expected to increase. This social engagement level has already been predicted to increase women students' persistence and retention in engineering and result in them not leaving the engineering pathway. An analysis of quantitative survey data from students in the three engineering courses revealed a significant effect of race and ethnicity for comfort in asking questions in class, collaborating with peers outside the classroom, and applying mathematical concepts. Further examination of this effect for comfort with asking questions in class revealed that comfort asking questions was driven by one or two extreme post-test scores of Asian students. A follow-up ANOVA for this item revealed that Asian women reported feeling excluded in the classroom. However, it was difficult to determine whether these differences are stable given the small sample size for students identifying as Asian. Furthermore, gender differences were significant for comfort in communicating with professors and peers. Overall, women reported less comfort communicating with their professors than men. Results from student metrics will inform faculty professional development efforts to increase faculty support and maximize student engagement, persistence, and retention in engineering courses at community colleges.
Summative results from this project could inform the national STEM community about recitation support to further improve undergraduate engineering learning and educational research.
APA, Harvard, Vancouver, ISO, and other styles
8

Paynter, Robin A., Celia Fiordalisi, Elizabeth Stoeger, et al. A Prospective Comparison of Evidence Synthesis Search Strategies Developed With and Without Text-Mining Tools. Agency for Healthcare Research and Quality (AHRQ), 2021. http://dx.doi.org/10.23970/ahrqepcmethodsprospectivecomparison.

Full text
Abstract:
Background: In an era of explosive growth in biomedical evidence, improving systematic review (SR) search processes is increasingly critical. Text-mining tools (TMTs) are a potentially powerful resource to improve and streamline search strategy development. Two types of TMTs are especially of interest to searchers: word frequency (useful for identifying most used keyword terms, e.g., PubReminer) and clustering (visualizing common themes, e.g., Carrot2). Objectives: The objectives of this study were to compare the benefits and trade-offs of searches with and without the use of TMTs for evidence synthesis products in real world settings. Specific questions included: (1) Do TMTs decrease the time spent developing search strategies? (2) How do TMTs affect the sensitivity and yield of searches? (3) Do TMTs identify groups of records that can be safely excluded in the search evaluation step? (4) Does the complexity of a systematic review topic affect TMT performance? In addition to quantitative data, we collected librarians' comments on their experiences using TMTs to explore when and how these new tools may be useful in systematic review search¬¬ creation. Methods: In this prospective comparative study, we included seven SR projects, and classified them into simple or complex topics. The project librarian used conventional “usual practice” (UP) methods to create the MEDLINE search strategy, while a paired TMT librarian simultaneously and independently created a search strategy using a variety of TMTs. TMT librarians could choose one or more freely available TMTs per category from a pre-selected list in each of three categories: (1) keyword/phrase tools: AntConc, PubReMiner; (2) subject term tools: MeSH on Demand, PubReMiner, Yale MeSH Analyzer; and (3) strategy evaluation tools: Carrot2, VOSviewer. We collected results from both MEDLINE searches (with and without TMTs), coded every citation’s origin (UP or TMT respectively), deduplicated them, and then sent the citation library to the review team for screening. When the draft report was submitted, we used the final list of included citations to calculate the sensitivity, precision, and number-needed-to-read for each search (with and without TMTs). Separately, we tracked the time spent on various aspects of search creation by each librarian. Simple and complex topics were analyzed separately to provide insight into whether TMTs could be more useful for one type of topic or another. Results: Across all reviews, UP searches seemed to perform better than TMT, but because of the small sample size, none of these differences was statistically significant. UP searches were slightly more sensitive (92% [95% confidence intervals (CI) 85–99%]) than TMT searches (84.9% [95% CI 74.4–95.4%]). The mean number-needed-to-read was 83 (SD 34) for UP and 90 (SD 68) for TMT. Keyword and subject term development using TMTs generally took less time than those developed using UP alone. The average total time was 12 hours (SD 8) to create a complete search strategy by UP librarians, and 5 hours (SD 2) for the TMT librarians. TMTs neither affected search evaluation time nor improved identification of exclusion concepts (irrelevant records) that can be safely removed from the search set. Conclusion: Across all reviews but one, TMT searches were less sensitive than UP searches. For simple SR topics (i.e., single indication–single drug), TMT searches were slightly less sensitive, but reduced time spent in search design. 
For complex SR topics (e.g., multicomponent interventions), TMT searches were less sensitive than UP searches; nevertheless, in complex reviews, they identified unique eligible citations not found by the UP searches. TMT searches also reduced time spent in search strategy development. For all evidence synthesis types, TMT searches may be more efficient in reviews where comprehensiveness is not paramount, or as an adjunct to UP for evidence syntheses, because they can identify unique includable citations. If TMTs were easier to learn and use, their utility would be increased.
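The three search metrics reported above relate to each other as in the small worked example below; the counts are invented, not the study's data.

```python
# Small worked example of sensitivity, precision, and number-needed-to-read;
# all counts are hypothetical.
retrieved = 1200          # records returned by a search
relevant_total = 50       # includable citations in the final report
relevant_found = 46       # of those, how many the search retrieved

sensitivity = relevant_found / relevant_total          # recall of the search
precision = relevant_found / retrieved
number_needed_to_read = retrieved / relevant_found     # records screened per includable one

print(f"sensitivity = {sensitivity:.1%}, precision = {precision:.2%}, "
      f"NNR = {number_needed_to_read:.0f}")
```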
APA, Harvard, Vancouver, ISO, and other styles
