Academic literature on the topic "Missing observations (Statistics) Sampling (Statistics) Multiple imputation (Statistics)"

Browse the topical lists of articles, books, theses, conference proceedings, and other academic sources on the topic "Missing observations (Statistics) Sampling (Statistics) Multiple imputation (Statistics)".

Journal articles on the topic "Missing observations (Statistics) Sampling (Statistics) Multiple imputation (Statistics)"

1

Liu, Gang, and Jukka-Pekka Onnela. "Bidirectional imputation of spatial GPS trajectories with missingness using sparse online Gaussian Process". Journal of the American Medical Informatics Association 28, no. 8 (June 8, 2021): 1777–84. http://dx.doi.org/10.1093/jamia/ocab069.

Abstract
Objective: We propose a bidirectional GPS imputation method that can recover real-world mobility trajectories even when a substantial proportion of the data are missing. The time complexity of our online method is linear in the sample size, and it provides accurate estimates on daily or hourly summary statistics such as time spent at home and distance traveled. Materials and Methods: To preserve a smartphone’s battery, GPS may be sampled only for a small portion of time, frequently <10%, which leads to a substantial missing data problem. We developed an algorithm that simulates an individual’s trajectory based on observed GPS location traces using sparse online Gaussian Process to address the high computational complexity of the existing method. The method also retains the spherical geometry of the problem, and imputes the missing trajectory in a bidirectional fashion with multiple condition checks to improve accuracy. Results: We demonstrated that (1) the imputed trajectories mimic the real-world trajectories, (2) the confidence intervals of summary statistics cover the ground truth in most cases, and (3) our algorithm is much faster than existing methods if we have more than 3 months of observations; (4) we also provide guidelines on optimal sampling strategies. Conclusions: Our approach outperformed existing methods and was significantly faster. It can be used in settings in which data need to be analyzed and acted on continuously, for example, to detect behavioral anomalies that might affect treatment adherence, or to learn about colocations of individuals during an epidemic.
2

Siswantining, Titin, Muhammad Ihsan, Saskya Mary Soemartojo, Devvi Sarwinda, Herley Shaori Al-Ash, and Ika Marta Sari. "MULTIPLE IMPUTATION FOR ORDINARY COUNT DATA BY NORMAL DISTRIBUTION APPROXIMATION". MEDIA STATISTIKA 14, no. 1 (June 24, 2021): 68–78. http://dx.doi.org/10.14710/medstat.14.1.68-78.

Abstract
Missing values are a problem that is often encountered in various fields and must be addressed to obtain good statistical inference such as parameter estimation. Missing values can be found in any type of data, including Poisson-distributed count data. One solution to overcome that problem is applying multiple imputation techniques. The multiple imputation technique for the case of count data consists of three main stages, namely imputation, analysis, and pooling of the parameter estimates. The use of the normal distribution refers to the sampling distribution using the central limit theorem for discrete distributions. This study is also equipped with numerical simulations which aim to compare accuracy based on the resulting bias value. Based on the study, the solutions proposed to overcome the missing values in the count data yield satisfactory results, as indicated by the small bias of the parameter estimates. However, the bias tends to increase as the percentage of missing observations increases and when the parameter values are small.
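The three stages named in this abstract (imputation, analysis, pooling) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function names, the toy `counts` data, the choice to draw the Poisson mean from a normal approximation before imputing, and the rounding of draws to non-negative integers are all hypothetical choices made for the example.

```python
import math
import random
import statistics

def impute_counts_normal(data, rng):
    """Fill None entries with rounded draws from a normal approximation.

    Assumption for this sketch: counts are Poisson with mean lam, so both
    the estimate of lam and a single count are roughly normal (CLT).
    """
    observed = [x for x in data if x is not None]
    lam_hat = statistics.fmean(observed)
    # Draw lam from its approximate sampling distribution N(lam_hat, lam_hat / n)
    # so that parameter uncertainty carries over into the imputations.
    lam_star = max(rng.gauss(lam_hat, math.sqrt(lam_hat / len(observed))), 1e-9)
    return [
        x if x is not None
        else max(0, round(rng.gauss(lam_star, math.sqrt(lam_star))))
        for x in data
    ]

def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (divisor m - 1)
    return q_bar, within + (1 + 1 / m) * between

def mi_poisson_mean(data, m=20, seed=0):
    """Stage 1: impute m times. Stage 2: analyse each data set. Stage 3: pool."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_counts_normal(data, rng)
        est = statistics.fmean(completed)
        estimates.append(est)
        variances.append(est / len(completed))  # Var(sample mean) = lam / n for Poisson data
    return pool_rubin(estimates, variances)

# Toy data (hypothetical): None marks a missing count.
counts = [3, 5, None, 4, 2, None, 6, 3, 4, None, 5, 4]
estimate, total_var = mi_poisson_mean(counts)
print(f"pooled mean = {estimate:.3f}, total variance = {total_var:.4f}")
```

The normal approximation degrades for small Poisson means, which is consistent with the abstract's remark that bias grows when the parameter values are small.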
3

Srivastava, Muni S., and Mohammad Dolatabadi. "Multiple imputation and other resampling schemes for imputing missing observations". Journal of Multivariate Analysis 100, no. 9 (October 2009): 1919–37. http://dx.doi.org/10.1016/j.jmva.2009.06.003.

4

Lewis, Taylor, Elizabeth Goldberg, Nathaniel Schenker, Vladislav Beresovsky, Susan Schappert, Sandra Decker, Nancy Sonnenfeld, and Iris Shimizu. "The Relative Impacts of Design Effects and Multiple Imputation on Variance Estimates: A Case Study with the 2008 National Ambulatory Medical Care Survey". Journal of Official Statistics 30, no. 1 (March 1, 2014): 147–61. http://dx.doi.org/10.2478/jos-2014-0008.

Abstract
The National Ambulatory Medical Care Survey collects data on office-based physician care from a nationally representative, multistage sampling scheme where the ultimate unit of analysis is a patient-doctor encounter. Patient race, a commonly analyzed demographic, has been subject to a steadily increasing item nonresponse rate. In 1999, race was missing for 17 percent of cases; by 2008, that figure had risen to 33 percent. Over this entire period, single imputation has been the compensation method employed. Recent research at the National Center for Health Statistics evaluated multiply imputing race to better represent the missing-data uncertainty. Given item nonresponse rates of 30 percent or greater, we were surprised to find many estimates’ ratios of multiple-imputation to single-imputation estimated standard errors close to 1. A likely explanation is that the design effects attributable to the complex sample design largely outweigh any increase in variance attributable to missing-data uncertainty.
5

Javadi, Sara, Abbas Bahrampour, Mohammad Mehdi Saber, Behshid Garrusi, and Mohammad Reza Baneshi. "Evaluation of Four Multiple Imputation Methods for Handling Missing Binary Outcome Data in the Presence of an Interaction between a Dummy and a Continuous Variable". Journal of Probability and Statistics 2021 (May 17, 2021): 1–14. http://dx.doi.org/10.1155/2021/6668822.

Abstract
Multiple imputation by chained equations (MICE) is the most common method for imputing missing data. In the MICE algorithm, imputation can be performed using a variety of parametric and nonparametric methods. The default setting in the implementation of MICE is for imputation models to include variables as linear terms only with no interactions, but omission of interaction terms may lead to biased results. It is investigated, using simulated and real datasets, whether recursive partitioning creates appropriate variability between imputations and unbiased parameter estimates with appropriate confidence intervals. We compared four multiple imputation (MI) methods on a real and a simulated dataset. MI methods included using predictive mean matching with an interaction term in the imputation model in MICE (MICE-interaction), classification and regression tree (CART) for specifying the imputation model in MICE (MICE-CART), the implementation of random forest (RF) in MICE (MICE-RF), and MICE-Stratified method. We first selected secondary data and devised an experimental design that consisted of 40 scenarios (2 × 5 × 4), which differed by the rate of simulated missing data (10%, 20%, 30%, 40%, and 50%), the missing mechanism (MAR and MCAR), and imputation method (MICE-Interaction, MICE-CART, MICE-RF, and MICE-Stratified). First, we randomly drew 700 observations with replacement 300 times, and then the missing data were created. The evaluation was based on raw bias (RB) as well as five other measurements that were averaged over the repetitions. Next, in a simulation study, we generated data 1000 times with a sample size of 700. Then, we created missing data for each dataset once. For all scenarios, the same criteria were used as for real data to evaluate the performance of methods in the simulation study. 
It is concluded that, when there is an interaction effect between a dummy and a continuous predictor, substantial gains are possible by using recursive partitioning for imputation compared to parametric methods, and also, the MICE-Interaction method is always more efficient and convenient to preserve interaction effects than the other methods.
6

Kovtun, N. V., and A. N. Ya Fataliieva. "New Trends in Evidence-based Statistics: Data Imputation Problems". Statistics of Ukraine 87, no. 4 (March 12, 2020): 4–13. http://dx.doi.org/10.31767/su.4(87)2019.04.01.

Abstract
The main reasons for omissions are: 1. Exclusion of the subject from the study due to non-compliance with study requirements; 2. The occurrence of an adverse event; 3. Missing result; 4. Lack of registration; 5. Researchers’ act of omission and/or commission. We can define the following data gap limits: 1) Less than 5% of omissions are insignificant and they do not affect the research results; 2) Data losses of 20% and more question the integrity of research results. The higher the share of the missing data, the less reliable the conclusions are, and the more difficult it is to prove the treatment efficiency. Consequently, missing data is a potential source of bias when analyzing data. Exclusion of subjects can affect the compatibility of groups and subgroups, which leads to bias in the estimates. There are different ways to deal with missing data. The simplest is to exclude the subject from the calculations. But the consequences of this approach are: reduction in sample size; compromise in the extent of relevance for statistical inferences; change of a confidence interval (e.g. narrowing resulting from underestimation of variances). Hence, it is important to identify the nature of the omission when dealing with missing data, which can be missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR). This necessitates using an appropriate method of data processing with missing values: exclusion, filling, weighting and modeling. All these methods give different results with different volumes and nature of omissions. We attempted to evaluate the results of different imputation methods by using a sample with different proportions of missing data that were simulated. Thus, with 10% of the MCAR omissions, parameter estimates and p-value for two factors, resulting from the application of the first group of methods, were close to the result from complete data. 
Average square errors that were calculated by using the method of the absolute average, and the method of filling blank spaces with successive selection, were closer to the standard; all other methods overvalued this estimate. Coefficient of determination was almost similar to the initial data when the method of filling blank spaces with successive selection was applied. Data with 25% of missing MCAR: factor – treatment group became insignificant when the method of filling with absolute and conditional averages was applied. The lowest estimate for coefficient of determination was found when the method of filling with absolute average values was applied, and overestimation was the least when the method of filling blank spaces with successive selection was applied. The changes were minimal with other approaches. Thus, parameter estimates and p-value resulting from the application of the analysis method of available cases were closer to the result available from the regression on the complete data. Data with 50% of missing MCAR: Pre-treatment weight became insignificant when the analysis method of complete observations was applied. Factor treatment group became insignificant when the method of filling blank spaces with successive selection was applied. The most accurate estimate of pre-treatment weight variable was received from the result of the method of conditional average. But, the method of filling with absolute average can be singled out - its results were the closest to the initial data. According to the results of imputation with 10% and 50% of missing MAR data by each method, the change in parameter estimate for an intercept and two factors were minimal. 
It is with the application of the methods of multiple imputation that average square error and determination coefficient were the closest to the results received from using complete data. This study identifies the weaknesses and the strengths of different methods of data imputation, and presents the effectiveness of applying one method over another with different shares of missing information. Undisputedly, the result from this study established that the approach to the imputation process cannot be “one-size-fits-all” and the imputation problem should be solved on a case-by-case basis by analysis of the existing database, taking into account not only the characteristics of the data itself and the volume of omissions, but also the expected contribution(s) from a particular study.
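The distinction between MCAR and MAR drawn in this abstract, and the bias that simple exclusion of incomplete cases can introduce, can be illustrated with a small simulation. The sketch below is hypothetical and is not taken from the cited study: the data-generating model, the missingness probabilities, and all names are assumptions chosen for illustration.

```python
import random
import statistics

def simulate(n=20000, seed=1):
    """Compare complete-case means of y under MCAR and MAR missingness."""
    rng = random.Random(seed)
    # x is always observed; y depends on x and may go missing.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]

    # MCAR: every y value has the same 50% chance of being missing.
    y_mcar = [yi for yi in y if rng.random() >= 0.5]

    # MAR: the chance that y is missing depends only on the observed x
    # (80% when x > 0, 20% otherwise), not on y itself.
    y_mar = [
        yi for xi, yi in zip(x, y)
        if rng.random() >= (0.8 if xi > 0 else 0.2)
    ]
    return statistics.fmean(y), statistics.fmean(y_mcar), statistics.fmean(y_mar)

full_mean, mcar_mean, mar_mean = simulate()
print(f"full data:           {full_mean:+.3f}")
print(f"complete cases MCAR: {mcar_mean:+.3f}")
print(f"complete cases MAR:  {mar_mean:+.3f}")
```

Under MCAR the complete-case mean stays close to the full-data mean, while under this MAR mechanism it is pulled clearly downward, because cases with large x (and therefore large y) are dropped more often.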
7

Lu, Kaifeng. "Number of imputations needed to stabilize estimated treatment difference in longitudinal data analysis". Statistical Methods in Medical Research 26, no. 2 (October 10, 2014): 674–90. http://dx.doi.org/10.1177/0962280214554439.

Abstract
Multiple imputation procedures replace each missing value with a set of plausible values based on the posterior predictive distribution of missing data given observed data. In many applications, as few as five imputations are adequate to achieve high efficiency relative to an infinite number of imputations. However, substantially more imputations are often needed to stabilize imputation-based inference at the analysis stage. Imputation-based inference at the analysis stage is considered stable if the conditional variability of the multiple imputation estimator, half-width of 95% confidence interval, test statistic, and estimated fraction of missing information given observed data is within specified thresholds for simulation error. For the estimation of treatment difference at study end for normally distributed responses in longitudinal trials, we calculate the multiple imputation quantities for an infinite number of imputations analytically and use simulations to assess the variability of the number of imputations needed at the analysis stage in repeated sampling.
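The remark that as few as five imputations can be highly efficient is usually justified with Rubin's large-sample approximation, in which the efficiency of m imputations relative to infinitely many is 1 / (1 + γ/m) on the variance scale, where γ is the fraction of missing information. A minimal sketch follows; the function name and the example value γ = 0.3 are assumptions for illustration and do not come from the cited article.

```python
def relative_efficiency(gamma, m):
    """Efficiency of m imputations relative to an infinite number.

    gamma: fraction of missing information, between 0 and 1.
    m:     number of imputations, at least 1.
    """
    if not 0 <= gamma <= 1:
        raise ValueError("gamma must lie in [0, 1]")
    if m < 1:
        raise ValueError("m must be at least 1")
    return 1 / (1 + gamma / m)

# Hypothetical example: 30% missing information.
for m in (5, 20, 100):
    print(m, round(relative_efficiency(0.3, m), 4))
```

With γ = 0.3, five imputations already reach roughly 94% efficiency. This measure concerns only the point estimate, which is why the abstract argues that stabilizing interval widths, test statistics, and the estimated fraction of missing information can require substantially more imputations.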
8

Hughes, RA, JAC Sterne, and K. Tilling. "Comparison of imputation variance estimators". Statistical Methods in Medical Research 25, no. 6 (July 11, 2016): 2541–57. http://dx.doi.org/10.1177/0962280214526216.

Abstract
Appropriate imputation inference requires both an unbiased imputation estimator and an unbiased variance estimator. The commonly used variance estimator, proposed by Rubin, can be biased when the imputation and analysis models are misspecified and/or incompatible. Robins and Wang proposed an alternative approach, which allows for such misspecification and incompatibility, but it is considerably more complex. It is unknown whether in practice Robins and Wang’s multiple imputation procedure is an improvement over Rubin’s multiple imputation. We conducted a critical review of these two multiple imputation approaches, a re-sampling method called full mechanism bootstrapping and our modified Rubin’s multiple imputation procedure via simulations and an application to data. We explored four common scenarios of misspecification and incompatibility. In general, for a moderate sample size (n = 1000), Robins and Wang’s multiple imputation produced the narrowest confidence intervals, with acceptable coverage. For a small sample size (n = 100) Rubin’s multiple imputation, overall, outperformed the other methods. Full mechanism bootstrapping was inefficient relative to the other methods and required modelling of the missing data mechanism under the missing at random assumption. Our proposed modification showed an improvement over Rubin’s multiple imputation in the presence of misspecification. Overall, Rubin’s multiple imputation variance estimator can fail in the presence of incompatibility and/or misspecification. For unavoidable incompatibility and/or misspecification, Robins and Wang’s multiple imputation could provide more robust inferences.
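For reference, the variance estimator attributed to Rubin in this abstract is normally written as follows. The notation is the conventional one and is an assumption here, since the abstract does not give formulas: Q̂_i and U_i denote the point estimate and its estimated variance from the i-th of m imputed data sets.

```latex
\begin{align}
  \bar{Q} &= \frac{1}{m}\sum_{i=1}^{m} \hat{Q}_i,
  &
  \bar{U} &= \frac{1}{m}\sum_{i=1}^{m} U_i,
  &
  B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i - \bar{Q}\bigr)^2, \\
  T &= \bar{U} + \Bigl(1 + \frac{1}{m}\Bigr) B,
  &
  \nu &= (m-1)\left[1 + \frac{\bar{U}}{\bigl(1 + \tfrac{1}{m}\bigr) B}\right]^{2}.
\end{align}
```

Here T is the total variance of the pooled estimate and ν the degrees of freedom of the reference t distribution. The bias discussed in the abstract arises when T stops being a valid estimate of the sampling variance of the pooled estimate because the imputation and analysis models are misspecified or incompatible.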
9

Ni, Daiheng, and John D. Leonard. "Markov Chain Monte Carlo Multiple Imputation Using Bayesian Networks for Incomplete Intelligent Transportation Systems Data". Transportation Research Record: Journal of the Transportation Research Board 1935, no. 1 (January 2005): 57–67. http://dx.doi.org/10.1177/0361198105193500107.

Abstract
The rich data on intelligent transportation systems (ITS) are a precious resource for transportation researchers and practitioners. However, the usability of this resource is greatly limited by missing data. Many imputation methods have been proposed in the past decade. However, some issues are still not addressed or are not sufficiently addressed, for example, the missing of entire records, temporal correlation in observations, natural characteristics in raw data, and unbiased estimates for missing values. This paper proposes an advanced imputation method based on recent developments in other disciplines, especially applied statistics. The method uses a Bayesian network to learn from the raw data and a Markov chain Monte Carlo technique to sample from the probability distributions learned by the Bayesian network. It imputes the missing data multiple times and makes statistical inferences about the result. In addition, the method incorporates a time series model so that it allows data missing in entire rows, an unfavorable missing pattern frequently seen in ITS data. Empirical study shows that the proposed method is robust and accurate. It is ideal for use as a high-quality imputation method for off-line application.
10

Fu, Yingpeng, Hongjian Liao, and Longlong Lv. "A Comparative Study of Various Methods for Handling Missing Data in UNSODA". Agriculture 11, no. 8 (July 30, 2021): 727. http://dx.doi.org/10.3390/agriculture11080727.

Abstract
UNSODA, a free international soil database, is very popular and has been used in many fields. However, missing soil property data have limited the utility of this dataset, especially for data-driven models. Here, three machine learning-based methods, i.e., random forest (RF) regression, support vector (SVR) regression, and artificial neural network (ANN) regression, and two statistics-based methods, i.e., mean and multiple imputation (MI), were used to impute the missing soil property data, including pH, saturated hydraulic conductivity (SHC), organic matter content (OMC), porosity (PO), and particle density (PD). The missing upper depths (DU) and lower depths (DL) for the sampling locations were also imputed. Before imputing the missing values in UNSODA, a missing value simulation was performed and evaluated quantitatively. Next, nonparametric tests and multiple linear regression were performed to qualitatively evaluate the reliability of these five imputation methods. Results showed that RMSEs and MAEs of all features fluctuated within acceptable ranges. RF imputation and MI presented the lowest RMSEs and MAEs; both methods are good at explaining the variability of data. The standard error, coefficient of variance, and standard deviation decreased significantly after imputation, and there were no significant differences before and after imputation. Together, DU, pH, SHC, OMC, PO, and PD explained 91.0%, 63.9%, 88.5%, 59.4%, and 90.2% of the variation in BD using RF, SVR, ANN, mean, and MI, respectively; and this value was 99.8% when missing values were discarded. This study suggests that the RF and MI methods may be better for imputing the missing data in UNSODA.

Theses on the topic "Missing observations (Statistics) Sampling (Statistics) Multiple imputation (Statistics)"

1

Kosler, Joseph Stephen. "Multiple comparisons using multiple imputation under a two-way mixed effects interaction model". Columbus, Ohio: Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1150482904.

2

Guo, Xu. "Checking the adequacy of regression models with complex data structure". HKBU Institutional Repository, 2014. https://repository.hkbu.edu.hk/etd_oa/90.

Abstract
In this thesis, we investigate the model checking problem for parametric regression model with missing response at random and nonignorable missing response. Besides, we also propose a hypothesis-adaptive procedure which is based on the dimension reduction theory. Finally, to extend our methods to missing response situation, we consider the dimension reduction problem with missing response at random. The first part of the thesis introduces the model checking for parametric models with response missing at random which is a more general missing mechanism than missing completely at random. Different from existing approaches, two tests have normal distributions as the limiting null distributions no matter whether the inverse probability weight is estimated parametrically or nonparametrically. Thus, p-values can be easily determined. This observation shows that slow convergence rate of nonparametric estimation does not have significant effect on the asymptotic behaviours of the tests although it may have impact in finite sample scenarios. The tests can detect the alternatives distinct from the null hypothesis at a nonparametric rate which is an optimal rate for locally smoothing-based methods in this area. Simulation study is carried out to examine the performance of the tests. The tests are also applied to analyze a data set on monozygotic twins for illustration. In the second part of the thesis, we consider model checking for general linear regression model with non-ignorable missing response. Based on an exponential tilting model, we first propose three estimators for the unknown parameter in the general linear regression model. Three empirical process-based tests are constructed. We discuss the asymptotic properties of the proposed tests under null and local alternative hypothesis with different scenarios. We find that these three tests perform the same in the asymptotic sense. 
Simulation studies are also carried out to assess the performance of our proposed test procedures. In the third part, we revisit traditional local smoothing model checking procedures. Noticing that the general nonparametric regression model can be considered as a special multi-index model, we propose an adaptive testing procedure based on the dimension reduction theory. To our surprise, our method can detect local alternative at faster rate than the traditional optimal rate. The theory indicates that in model checking problem, dimensionality may not have strong impact. Simulations are carried out to examine the performance of our methodology. A real data analysis is conducted for illustration. In the last part, we study the dimension reduction problem with missing response at random. Based on the work in this part, we can extend the adaptive testing procedure introduced in the third part to the missing response situation. When there are many predictors, how to efficiently impute responses missing at random is an important problem to deal with for regression analysis because this missing mechanism, unlike missing completely at random, is highly related to high-dimensional predictor vector. In sufficient dimension reduction framework, the fusion-refinement (FR) method in the literature is a promising approach. To make estimation more accurate and efficient, two methods are suggested in this paper. Among them, one method uses the observed data to help on missing data generation, and the other one is an ad hoc approach that mainly reduces the dimension in the nonparametric smoothing in data generation. A data-adaptive synthesization of these two methods is also developed. Simulations are conducted to examine their performance and an HIV clinical trial dataset is analysed for illustration. 
Keywords: Model checking; Inverse probability weight; Non-ignorable missing response; Adaptive; Central subspace; Dimension reduction; Data-adaptive Synthesization; Missing recovery; Missing response at random; Multiple imputation.
3

Alemdar, Meltem. "A Monte Carlo study: the impact of missing data in cross-classification random effects models". Atlanta, Ga.: Georgia State University, 2008. http://digitalarchive.gsu.edu/eps_diss/34/.

Abstract
Thesis (Ph. D.)--Georgia State University, 2008.
Title from title page (Digital Archive@GSU, viewed July 20, 2010) Carolyn F. Furlow, committee chair; Philo A. Hutcheson, Phillip E. Gagne, Sheryl A. Gowen, committee members. Includes bibliographical references (p. 96-100).
4

Merkle, Edgar C. "Bayesian estimation of factor analysis models with incomplete data". Ohio State University, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1126895149.

Abstract
Thesis (Ph. D.)--Ohio State University, 2005.
Title from first page of PDF file. Document formatted into pages; contains xi, 106 p.; also includes graphics. Includes bibliographical references (p. 103-106). Available online via OhioLINK's ETD Center
5

Deng, Wei. "Multiple imputation for marginal and mixed models in longitudinal data with informative missingness". Ohio State University, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1126890027.

Abstract
Thesis (Ph. D.)--Ohio State University, 2005.
Title from first page of PDF file. Document formatted into pages; contains xiii, 108 p.; also includes graphics. Includes bibliographical references (p. 104-108). Available online via OhioLINK's ETD Center
6

Kinney, Satkartar K. "Model Selection and Multivariate Inference Using Data Multiply Imputed for Disclosure Limitation and Nonresponse". Diss., 2007. http://hdl.handle.net/10161/437.

7

Amer, Safaa R. "Neural network imputation: a new fashion or a good tool". Thesis, 2004. http://hdl.handle.net/1957/29926.

Abstract
Most statistical surveys and data collection studies encounter missing data. A common solution to this problem is to discard observations with missing data while reporting the percentage of missing observations in different output tables. Imputation is a tool used to fill in the missing values. This dissertation introduces the missing data problem as well as traditional imputation methods (e.g. hot deck, mean imputation, regression, Markov Chain Monte Carlo, Expectation-Maximization, etc.). The use of artificial neural networks (ANN), a data mining technique, is proposed as an effective imputation procedure. During ANN imputation, computational effort is minimized while accounting for sample design and imputation uncertainty. The mechanism and use of ANN in imputation for complex survey designs is investigated. Imputation methods are not all equally good, and none are universally good. However, simulation results and applications in this dissertation show that regression, Markov chain Monte Carlo, and ANN yield comparable results. Artificial neural networks could be considered as implicit models that take into account the sample design without making strong parametric assumptions. Artificial neural networks make few assumptions about the data, are asymptotically good and robust to multicollinearity and outliers. Overall, ANN could be time and resources efficient for an experienced user compared to other conventional imputation techniques.
Graduation date: 2005
8

Hassan, Ali Satty Ali. "Comparative approaches to handling missing data, with particular focus on multiple imputation for both cross-sectional and longitudinal models". Thesis, 2012. http://hdl.handle.net/10413/9119.

Abstract
Much data-based research is characterized by the unavoidable problem of incompleteness as a result of missing or erroneous values. This thesis discusses some of the various strategies and basic issues in statistical data analysis to address the missing data problem, and deals with both the problem of missing covariates and missing outcomes. We restrict our attention to consider methodologies which address a specific missing data pattern, namely monotone missingness. The thesis is divided into two parts. The first part places a particular emphasis on the so-called missing at random (MAR) assumption, but focuses the bulk of attention on multiple imputation techniques. The main aim of this part is to investigate various modelling techniques using application studies, and to specify the most appropriate techniques as well as gain insight into the appropriateness of these techniques for handling incomplete data analysis. This thesis first deals with the problem of missing covariate values to estimate regression parameters under a monotone missing covariate pattern. The study is devoted to a comparison of different imputation techniques, namely Markov chain Monte Carlo (MCMC), regression, propensity score (PS) and last observation carried forward (LOCF). The results from the application study revealed that we have universally best methods to deal with missing covariates when the missing data pattern is monotone. Of the methods explored, the MCMC and regression methods of imputation to estimate regression parameters with monotone missingness were preferable to the PS and LOCF methods. This study is also concerned with comparative analysis of the techniques applied to incomplete Gaussian longitudinal outcome or response data due to random dropout. Three different methods are assessed and investigated, namely multiple imputation (MI), inverse probability weighting (IPW) and direct likelihood analysis. 
The findings in general favoured MI over IPW in the case of continuous outcomes, even when the MAR mechanism holds. The findings further suggest that the use of MI and direct likelihood techniques leads to accurate and equivalent results as both techniques arrive at the same substantive conclusions. The study also compares and contrasts several statistical methods for analyzing incomplete non-Gaussian longitudinal outcomes when the underlying study is subject to ignorable dropout. The methods considered include weighted generalized estimating equations (WGEE), multiple imputation after generalized estimating equations (MI-GEE) and generalized linear mixed model (GLMM). The current study found that the MI-GEE method was considerably robust, doing better than all the other methods in terms of small and large sample sizes, regardless of the dropout rates. The primary interest of the second part of the thesis falls under the non-ignorable dropout (MNAR) modelling frameworks that rely on sensitivity analysis in modelling incomplete Gaussian longitudinal data. The aim of this part is to deal with non-random dropout by explicitly modelling the assumptions that caused the dropout and incorporating this additional sub-model into the model for the measurement data, and to assess the sensitivity of the modelling assumptions. The study pays attention to the analysis of repeated Gaussian measures subject to potentially non-random dropout in order to study the influence on inference that might be caused in the data by the dropout process. We consider the construction of a particular type of selection model, namely the Diggle-Kenward model as a tool for assessing the sensitivity of a selection model in terms of the modelling assumptions. The major conclusions drawn were that there was evidence in favour of the MAR process rather than an MCAR process in the context of the assumed model. 
In addition, there was a need to obtain further insight into the data by comparing various sensitivity analysis frameworks. Lastly, two families of models were compared and contrasted to investigate the potential influence that dropout might exert on inference for the dependent measurement data considered, and to deal with incomplete sequences. The models were based on the selection and pattern mixture frameworks used for sensitivity analysis to jointly model the distribution of the dropout process and the longitudinal measurement process. The results of the sensitivity analysis were in agreement and hence led to similar parameter estimates. Additional confidence in the findings was gained as both models led to similar results for significant effects such as marginal treatment effects.
Thesis (M.Sc.)-University of KwaZulu-Natal, Pietermaritzburg, 2012.
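
As an illustration of the simplest technique named in the abstract above, last observation carried forward (LOCF) under a monotone dropout pattern might be sketched as follows. This is a minimal sketch and not code from the thesis: the function names `locf` and `is_monotone`, the use of `None` to mark a missing value, and the sample visit values are all hypothetical.

```python
from typing import List, Optional


def is_monotone(series: List[Optional[float]]) -> bool:
    """Return True if, once a value is missing, every later value is missing too."""
    seen_missing = False
    for value in series:
        if value is None:
            seen_missing = True
        elif seen_missing:
            # An observed value after a gap means the pattern is intermittent.
            return False
    return True


def locf(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing value with the most recent observed value.

    Values missing before the first observation stay None, because there
    is nothing to carry forward.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical subject measured at five visits who drops out after visit 3.
visits = [4.2, 3.9, 4.5, None, None]
if is_monotone(visits):
    completed = locf(visits)
    print(completed)
```

LOCF produces a single completed data set, so it does not reflect the uncertainty about the missing values; this is one reason the abstract compares it against multiple imputation methods.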
9

Oh, Sohae. "Multiple Imputation on Missing Values in Time Series Data". Thesis, 2015. http://hdl.handle.net/10161/10447.

Abstract

Financial stock market data frequently contain missing values for various reasons. One reason is that, because markets close for holidays, daily stock prices are not always observed. This creates gaps in information, making it difficult to predict the following day’s stock prices. In this situation, information during the holiday can be “borrowed” from other countries’ stock markets, since global stock prices tend to show similar movements and are in fact highly correlated. The main goal of this study is to combine stock index data from various markets around the world and develop an algorithm to impute the missing values in an individual stock index using “information-sharing” between different time series. To develop an imputation algorithm that accommodates time series-specific features, we take a multiple imputation approach using a dynamic linear model for time-series and panel data. This algorithm assumes an ignorable missing data mechanism, since the missingness is due to holidays. The posterior distributions of the parameters, including the missing values, are simulated using Markov chain Monte Carlo (MCMC) methods, and estimates from sets of draws are then combined using Rubin’s combination rule, yielding the final inference for the data set. Specifically, we use the Gibbs sampler and Forward Filtering and Backward Sampling (FFBS) to simulate the joint posterior distribution and the posterior predictive distribution of the latent variables and other parameters. A simulation study is conducted to check the validity and performance of the algorithm using two error-based measures: Root Mean Square Error (RMSE) and Normalized Root Mean Square Error (NRMSE). We compared the overall trend of the imputed time series with the complete data set, and inspected the in-sample predictability of the algorithm using the Last Value Carried Forward (LVCF) method as a benchmark. The algorithm is applied to real stock price index data from the US, Japan, Hong Kong, the UK and Germany. 
From both the simulation and the application, we concluded that the imputation algorithm performs well enough to achieve our original goal, predicting the opening stock price after a holiday, and that it outperforms the benchmark method. We believe this multiple imputation algorithm can be used in many applications that deal with time series with missing values, such as financial, economic and biomedical data.
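
The combining step and the RMSE measure mentioned in the abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the thesis author's implementation: it applies the standard form of Rubin's rules to a scalar estimate, and the function names `pool_rubin` and `rmse` and the sample numbers are hypothetical. NRMSE is left out because the abstract does not say which normalization it uses.

```python
import math
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool M completed-data estimates of a scalar with Rubin's rules.

    estimates: point estimate from each imputed data set.
    variances: squared standard error of each of those estimates.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates and matching variances")
    q_bar = sum(estimates) / m                                   # pooled point estimate
    within = sum(variances) / m                                  # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1) # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between                   # total variance
    return q_bar, total


def rmse(truth: Sequence[float], imputed: Sequence[float]) -> float:
    """Root mean square error between true and imputed values."""
    if len(truth) != len(imputed) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - i) ** 2 for t, i in zip(truth, imputed)) / len(truth))


# Hypothetical estimates from three imputed data sets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, math.sqrt(total_var))
```

The factor `(1 + 1/m)` inflates the between-imputation variance to account for using a finite number of imputations.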


Thesis

Books on the topic "Missing observations (Statistics) Sampling (Statistics) Multiple imputation (Statistics)"

1

Flexible imputation of missing data. Boca Raton: CRC Press, 2012.

2

Stata multiple-imputation reference manual: Release 12. College Station, Tex: Stata Press, 2011.

3

StataCorp LP. Stata multiple-imputation reference manual: Release 11. College Station, Tex: Stata Press, 2009.

4

Schreuder, Hans T. Data estimation and prediction for natural resources public data. [Fort Collins, Colo.?]: U.S. Dept. of Agriculture, Forest Service, Rocky Mountain Research Station, 1998.

5

Shao, Jun, ed. Statistical methods for handling incomplete data. Boca Raton: CRC Press, Taylor & Francis Group, 2014.

6

Armknecht, Paul A. Price imputation and other techniques for dealing with missing observations, seasonality and quality change in price indices. [Washington, D.C.]: International Monetary Fund, Statistics Department, 1999.

7

Selection Models For Nonignorable Missing Data (Anwendungsorientierte Statistik, Bd. 8). Morehouse Publishing, 2005.

8

Reich, Robin M., and Rocky Mountain Research Station (Fort Collins, Colo.), eds. Data estimation and prediction for natural resources public data. [Fort Collins, Colo.?]: U.S. Dept. of Agriculture, Forest Service, Rocky Mountain Research Station, 1998.

9

StataCorp LP, ed. Stata multiple-imputation reference manual: Release 11. College Station, Tex: Stata Press, 2009.

10

Fehlende Daten in Additiven Modellen (Anwendungsorientierte Statistik). Peter Lang Publishing, 2003.
