
Dissertations / Theses on the topic 'Ridge regression (Statistics)'

Consult the top 20 dissertations / theses for your research on the topic 'Ridge regression (Statistics).'


1

Williams, Ulyana P. "On Some Ridge Regression Estimators for Logistic Regression Models." FIU Digital Commons, 2018. https://digitalcommons.fiu.edu/etd/3667.

Full text
Abstract:
The purpose of this research is to investigate the performance of some ridge regression estimators for the logistic regression model in the presence of moderate to high correlation among the explanatory variables. As performance criteria, we use the mean square error (MSE), the mean absolute percentage error (MAPE), the magnitude of bias, and the percentage of times the ridge regression estimator produces a higher MSE than the maximum likelihood estimator. A Monte Carlo simulation study was executed to compare the performance of the ridge regression estimators under different experimental conditions. The degree of correlation, sample size, number of independent variables, and log odds ratio were varied in the design of the experiment. Simulation results show that under certain conditions, the ridge regression estimators outperform the maximum likelihood estimator. Moreover, an empirical data analysis supports the main findings of this study. This thesis proposes and recommends some good ridge regression estimators of the logistic regression model for practitioners in the fields of health, physical and social sciences.
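As a rough illustration of the kind of Monte Carlo comparison this abstract describes, the sketch below (a hypothetical setup in Python with scikit-learn, not the thesis's actual design or estimators) simulates collinear predictors and compares the coefficient MSE of a ridge-penalized logistic fit with that of an approximately unpenalized maximum likelihood fit.

```python
# Hypothetical Monte Carlo comparison of a ridge-penalized logistic fit
# against (approximately) unpenalized maximum likelihood under collinearity.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p, rho, reps = 100, 4, 0.95, 200           # sample size, predictors, correlation, replications
beta = np.array([1.0, -0.5, 0.5, 1.0])        # assumed "true" coefficients
cov = rho * np.ones((p, p)) + (1 - rho) * np.eye(p)

mse_ml, mse_ridge = [], []
for _ in range(reps):
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta)))
    # A very large C makes the l2 penalty negligible, approximating the ML estimator.
    ml = LogisticRegression(C=1e10, max_iter=5000).fit(X, y)
    ridge = LogisticRegression(C=1.0, max_iter=5000).fit(X, y)   # C = 1/k, a fixed ridge penalty
    mse_ml.append(np.sum((ml.coef_.ravel() - beta) ** 2))
    mse_ridge.append(np.sum((ridge.coef_.ravel() - beta) ** 2))

print("MSE(ML):   ", np.mean(mse_ml))
print("MSE(ridge):", np.mean(mse_ridge))
print("ridge worse than ML in", 100 * np.mean(np.array(mse_ridge) > np.array(mse_ml)), "% of runs")
```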
APA, Harvard, Vancouver, ISO, and other styles
2

Zaldivar, Cynthia. "On the Performance of some Poisson Ridge Regression Estimators." FIU Digital Commons, 2018. https://digitalcommons.fiu.edu/etd/3669.

Full text
Abstract:
Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo simulation study was conducted to compare performance of the estimators under three experimental conditions: correlation, sample size, and intercept. It is evident from simulation results that all ridge estimators performed better than the ML estimator. We proposed new estimators based on the results, which performed very well compared to the original estimators. Finally, the estimators are illustrated using data on recreational habits.
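A minimal sketch of the same kind of contrast for the Poisson case, assuming scikit-learn's PoissonRegressor and made-up collinear data rather than the thesis's estimators:

```python
# Contrast an unpenalized Poisson GLM with an L2-penalized (ridge) Poisson fit
# on collinear data; the tuning value alpha=1.0 is arbitrary here.
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(1)
n, p, rho = 80, 4, 0.9
beta, intercept = np.array([0.4, -0.3, 0.2, 0.3]), 0.5
cov = rho * np.ones((p, p)) + (1 - rho) * np.eye(p)

X = rng.multivariate_normal(np.zeros(p), cov, size=n)
y = rng.poisson(np.exp(intercept + X @ beta))

ml = PoissonRegressor(alpha=0.0, max_iter=1000).fit(X, y)      # maximum likelihood
ridge = PoissonRegressor(alpha=1.0, max_iter=1000).fit(X, y)   # ridge-penalized Poisson

print("ML coefficients:   ", ml.coef_)
print("Ridge coefficients:", ridge.coef_)
```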
APA, Harvard, Vancouver, ISO, and other styles
3

Saha, Angshuman. "Application of ridge regression for improved estimation of parameters in compartmental models /." Thesis, Connect to this title online; UW restricted, 1998. http://hdl.handle.net/1773/8945.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Björkström, Anders. "Regression methods in multidimensional prediction and estimation." Doctoral thesis, Stockholm University, Department of Mathematics, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-7025.

Full text
Abstract:

In regression with near collinear explanatory variables, the least squares predictor has large variance. Ordinary least squares regression (OLSR) often leads to unrealistic regression coefficients. Several regularized regression methods have been proposed as alternatives. Well-known are principal components regression (PCR), ridge regression (RR) and continuum regression (CR). The latter two involve a continuous metaparameter, offering additional flexibility.

For a univariate response variable, CR incorporates OLSR, PLSR, and PCR as special cases, for special values of the metaparameter. CR is also closely related to RR. However, CR can in fact yield regressors that vary discontinuously with the metaparameter. Thus, the relation between CR and RR is not always one-to-one. We develop a new class of regression methods, LSRR, essentially the same as CR, but without discontinuities, and prove that any optimization principle will yield a regressor proportional to a RR, provided only that the principle implies maximizing some function of the regressor's sample correlation coefficient and its sample variance. For a multivariate response vector we demonstrate that a number of well-established regression methods are related, in that they are special cases of basically one general procedure. We try a more general method based on this procedure, with two meta-parameters. In a simulation study we compare this method to ridge regression, multivariate PLSR and repeated univariate PLSR. For most types of data studied, all methods do approximately equally well. There are cases where RR and LSRR yield larger errors than the other methods, and we conclude that one-factor methods are not adequate for situations where more than one latent variable are needed to describe the data. Among those based on latent variables, none of the methods tried is superior to the others in any obvious way.
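For readers unfamiliar with the metaparameter-indexed families referred to here, the toy sketch below (illustrative data only, not from the thesis) shows how the ridge regressor beta(k) = (X'X + kI)^(-1) X'y moves away from the unstable OLS solution as k grows:

```python
# Ridge regressor as a function of its metaparameter k on nearly collinear data:
# k = 0 gives OLS (large, unstable coefficients), larger k shrinks the solution.
import numpy as np

rng = np.random.default_rng(2)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)        # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 - x2 + rng.normal(scale=0.5, size=n)

for k in [0.0, 0.1, 1.0, 10.0]:
    beta_k = np.linalg.solve(X.T @ X + k * np.eye(2), X.T @ y)
    print(f"k = {k:5.1f}  beta = {beta_k}")
```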

APA, Harvard, Vancouver, ISO, and other styles
5

Bakshi, Girish. "Comparison of ridge regression and neural networks in modeling multicollinear data." Ohio : Ohio University, 1996. http://www.ohiolink.edu/etd/view.cgi?ohiou1178815205.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Gatz, Philip L. Jr. "A comparison of three prediction based methods of choosing the ridge regression parameter k." Thesis, Virginia Tech, 1985. http://hdl.handle.net/10919/45724.

Full text
Abstract:
A solution to the regression model y = Xβ + ε is usually obtained using ordinary least squares. However, when multicollinearity exists among the regressor variables, many qualities of this solution deteriorate: the variances, the length, the stability, and the prediction capabilities of the solution. Ridge regression was introduced to combat this deterioration (Hoerl and Kennard, 1970a); the method uses a solution biased by a parameter k. Many methods have been developed to determine an optimal value of k. This study investigated three little-used methods of determining k: the PRESS statistic, Mallows' Ck statistic, and the DF-trace. The study compared the prediction capabilities of the three methods, using a Monte Carlo experiment, on data that contained various levels of both collinearity and leverage.
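A hedged sketch of one of the three criteria: selecting k by minimizing the PRESS statistic, using the usual leave-one-out shortcut for linear smoothers (toy data, not the thesis's simulation design).

```python
# Choose the ridge parameter k by minimizing PRESS, computed from the ridge hat
# matrix H(k) = X (X'X + kI)^(-1) X' and the leave-one-out residuals e_i / (1 - h_ii).
import numpy as np

rng = np.random.default_rng(3)
n, p = 40, 3
X = rng.normal(size=(n, p))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=n)      # introduce collinearity
y = X @ np.array([2.0, 0.0, -2.0]) + rng.normal(size=n)

def press(k):
    H = X @ np.linalg.solve(X.T @ X + k * np.eye(p), X.T)   # ridge hat matrix
    resid = y - H @ y
    loo = resid / (1 - np.diag(H))                           # leave-one-out residuals
    return np.sum(loo ** 2)

grid = np.linspace(0.0, 5.0, 101)
k_best = grid[np.argmin([press(k) for k in grid])]
print("k minimizing PRESS:", k_best)
```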
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
7

Pascual, Francisco L. "Essays on the optimal selection of series functions." Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2007. http://wwwlib.umi.com/cr/ucsd/fullcit?p3274811.

Full text
Abstract:
Thesis (Ph. D.)--University of California, San Diego, 2007.
Title from first page of PDF file (viewed October 4, 2007). Available via ProQuest Digital Dissertations. Vita. Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles
8

Shah, Smit. "Comparison of Some Improved Estimators for Linear Regression Model under Different Conditions." FIU Digital Commons, 2015. http://digitalcommons.fiu.edu/etd/1853.

Full text
Abstract:
The multiple linear regression model plays a key role in statistical inference and has extensive applications in the business, environmental, physical and social sciences. Multicollinearity has been a considerable problem in multiple regression analysis: when the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. The statistical methods discussed in this thesis to address this problem are the ridge regression, Liu, two-parameter biased and LASSO estimators. Firstly, an analytical comparison on the basis of risk was made among the ridge, Liu and LASSO estimators under the orthonormal regression model. I found that LASSO dominates the least squares, ridge and Liu estimators over a significant portion of the parameter space for large dimensions. Secondly, a simulation study was conducted to compare the performance of the ridge, Liu and two-parameter biased estimators by the mean squared error criterion. I found that the two-parameter biased estimator performs better than its corresponding ridge regression estimator. Overall, the Liu estimator performs better than both the ridge and two-parameter biased estimators.
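The ridge and Liu estimators compared here have simple closed forms; the sketch below (toy data, arbitrary tuning constants k and d) writes them out next to OLS using their usual textbook expressions.

```python
# Ridge estimator (X'X + kI)^(-1) X'y and Liu estimator (X'X + I)^(-1)(X'y + d * beta_OLS)
# on strongly collinear toy data, with arbitrary tuning constants.
import numpy as np

rng = np.random.default_rng(4)
n, p = 60, 3
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)      # strong collinearity
y = X @ np.array([1.0, 1.0, -1.0]) + rng.normal(size=n)

XtX, Xty, I = X.T @ X, X.T @ y, np.eye(p)
beta_ols = np.linalg.solve(XtX, Xty)

k, d = 0.5, 0.7                                     # tuning constants chosen arbitrarily here
beta_ridge = np.linalg.solve(XtX + k * I, Xty)
beta_liu = np.linalg.solve(XtX + I, Xty + d * beta_ols)

print("OLS:  ", beta_ols)
print("Ridge:", beta_ridge)
print("Liu:  ", beta_liu)
```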
APA, Harvard, Vancouver, ISO, and other styles
9

Schwarz, Patrick. "Prediction with Penalized Logistic Regression : An Application on COVID-19 Patient Gender based on Case Series Data." Thesis, Karlstads universitet, Handelshögskolan (from 2013), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-85642.

Full text
Abstract:
The aim of the study was to evaluate different types of logistic regression to find the optimal model to predict the gender of hospitalized COVID-19 patients. The models were based on COVID-19 case series data from Pakistan using a set of 18 explanatory variables, of which patient age and BMI were numerical and the rest were categorical variables expressing symptoms and previous health issues. The compared models were a logistic regression using all variables, a logistic regression using stepwise variable selection with 4 explanatory variables, a logistic Ridge regression model, a logistic Lasso regression model and a logistic Elastic Net regression model. Based on several metrics assessing the goodness of fit of the models and an evaluation of predictive power using the area under the ROC curve, the Elastic Net that used only the Lasso penalty had the best result and was able to predict 82.5% of the test cases correctly.
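A rough sketch of fitting and scoring the penalized logistic models named above with scikit-learn, on synthetic stand-in data rather than the COVID-19 case series (the parameters C and l1_ratio are scikit-learn's names, not the thesis's notation):

```python
# Fit ridge-, lasso- and elastic-net-penalized logistic regressions and compare by ROC AUC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=18, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "ridge":       LogisticRegression(penalty="l2", C=1.0, max_iter=5000),
    "lasso":       LogisticRegression(penalty="l1", solver="saga", C=1.0, max_iter=5000),
    # l1_ratio mixes the two penalties; l1_ratio=1.0 would reproduce the pure-lasso
    # elastic net that the abstract above reports as best.
    "elastic net": LogisticRegression(penalty="elasticnet", solver="saga",
                                      l1_ratio=0.5, C=1.0, max_iter=5000),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name:12s} AUC = {auc:.3f}")
```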
APA, Harvard, Vancouver, ISO, and other styles
10

Moller, Jurgen Johann. "The implementation of noise addition partial least squares." Thesis, Stellenbosch : University of Stellenbosch, 2009. http://hdl.handle.net/10019.1/3362.

Full text
Abstract:
Thesis (MComm (Statistics and Actuarial Science))--University of Stellenbosch, 2009.
When determining the chemical composition of a specimen, traditional laboratory techniques are often both expensive and time consuming. It is therefore preferable to employ more cost effective spectroscopic techniques such as near infrared (NIR). Traditionally, the calibration problem has been solved by means of multiple linear regression to specify the model between X and Y. Traditional regression techniques, however, quickly fail when using spectroscopic data, as the number of wavelengths can easily be several hundred, often exceeding the number of chemical samples. This scenario, together with the high level of collinearity between wavelengths, will necessarily lead to singularity problems when calculating the regression coefficients. Ways of dealing with the collinearity problem include principal component regression (PCR), ridge regression (RR) and PLS regression. Both PCR and RR require a significant amount of computation when the number of variables is large. PLS overcomes the collinearity problem in a similar way as PCR, by modelling both the chemical and spectral data as functions of common latent variables. The quality of the employed reference method greatly impacts the coefficients of the regression model and therefore the quality of its predictions. With both X and Y subject to random error, the quality of the predictions of Y will be reduced with an increase in the level of noise. Previously conducted research focussed mainly on the effects of noise in X. This paper focuses on a method proposed by Dardenne and Fernández Pierna, called Noise Addition Partial Least Squares (NAPLS), that attempts to deal with the problem of poor reference values. Some aspects of the theory behind PCR, PLS and model selection are discussed. This is then followed by a discussion of the NAPLS algorithm. Both PLS and NAPLS are implemented on various datasets that arise in practice, in order to determine cases where NAPLS will be beneficial over conventional PLS. For each dataset, specific attention is given to the analysis of outliers, influential values and the linearity between X and Y, using graphical techniques. Lastly, the performance of the NAPLS algorithm is evaluated for various
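A hedged sketch of the noise-addition idea as read from this abstract: refit a PLS calibration while adding increasing noise to the reference values Y and watch the cross-validated fit degrade. This is an assumed reading for illustration, not the NAPLS algorithm of Dardenne and Fernández Pierna itself.

```python
# Probe the sensitivity of a PLS calibration to noisy reference values
# on synthetic "spectral" data with more wavelengths than samples.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n, p = 100, 300                      # more "wavelengths" than samples, as in NIR data
X = rng.normal(size=(n, p))
y = X[:, :10] @ rng.normal(size=10) + rng.normal(scale=0.5, size=n)   # true reference values

for noise_sd in [0.0, 0.5, 1.0, 2.0]:
    y_noisy = y + rng.normal(scale=noise_sd, size=n)   # degrade the reference method
    pls = PLSRegression(n_components=5)
    score = cross_val_score(pls, X, y_noisy, cv=5, scoring="r2").mean()
    print(f"reference noise sd = {noise_sd:.1f}   CV R^2 = {score:.3f}")
```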
APA, Harvard, Vancouver, ISO, and other styles
11

Jansson, Daniel, and Nils Niklasson. "En analys av statens samhällssatsningar och dess effektivitet för att reducera brottslighet." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-275665.

Full text
Abstract:
Through an analysis of the Swedish state budget, models have been developed to deepen the understanding of the effects that government expenditures have on reducing crime. This has been modeled by examining selected crime categories using the mathematical methods Ridge Regression, Lasso Regression and Principal Component Analysis. Combined with a qualitative study of previous research on the economic aspects of crime, an analysis has been conducted. The mathematical methods indicate that it may be more effective to invest in crime prevention measures, such as increased social protection and focus on vulnerable groups, rather than more direct efforts such as increased resources for the police force. However, the result contradicts some of the accepted economic conclusions on the subject, as these highlight the importance of increasing the number of police officers and harsher penalties. These do however also mention the importance of crime prevention measures such as reducing the gaps in society, which is in line with the results of this work. The conclusion should, however, be used with caution, as the models are based on a number of assumptions and could be improved by further analysis of these assumptions, together with more data points that would strengthen the validity of the analysis.
Genom en analys av Sveriges statsbudget har modeller tagits fram för att försöka förstå de effekter olika samhällssatsningar har på brottslighet i Sverige. Detta har modellerats genom att undersöka utvalda brottskategorier med hjälp av de matematiska metoderna Ridge Regression, Lasso Regression samt Principal Component Analysis. Tillsammans med en kvalitativ undersökning av tidigare forskning gällande nationalekonomiska aspekter kring brottslighet har en analys sedan genomförts. De matematiska metoderna tyder på att det kan vara mer effektivt att satsa på brottsförebyggande åtgärder, såsom ökat socialt skydd och fokus på utsatta grupper, istället för mer direkta satsningar på brottsförhindrande åtgärder som exempelvis ökade resurser till polisväsendet. Däremot motsäger resultatet en del av de vedertagna nationalekonomiska slutsatserna om ämnet, då dessa belyser vikten av ökade antalet poliser och hårdare straff. De lyfter även fram vikten av brottsförebyggande åtgärder såsom att minska klyftorna i samhället, vilket går i linje med resultatet av detta arbete. Slutsatsen ska dock användas med försiktighet då modellerna bygger på flertalet antaganden och skulle kunna förbättras vid ytterligare analys utav dessa, tillsammans med fler datapunkter som skulle stärka validiteten.
APA, Harvard, Vancouver, ISO, and other styles
12

Solomon, Mary Joanna. "Multivariate Analysis of Korean Pop Music Audio Features." Bowling Green State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1617105874719868.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Bécu, Jean-Michel. "Contrôle des fausses découvertes lors de la sélection de variables en grande dimension." Thesis, Compiègne, 2016. http://www.theses.fr/2016COMP2264/document.

Full text
Abstract:
Dans le cadre de la régression, de nombreuses études s’intéressent au problème dit de la grande dimension, où le nombre de variables explicatives mesurées sur chaque échantillon est beaucoup plus grand que le nombre d’échantillons. Si la sélection de variables est une question classique, les méthodes usuelles ne s’appliquent pas dans le cadre de la grande dimension. Ainsi, dans ce manuscrit, nous présentons la transposition de tests statistiques classiques à la grande dimension. Ces tests sont construits sur des estimateurs des coefficients de régression produits par des approches de régressions linéaires pénalisées, applicables dans le cadre de la grande dimension. L’objectif principal des tests que nous proposons consiste à contrôler le taux de fausses découvertes. La première contribution de ce manuscrit répond à un problème de quantification de l’incertitude sur les coefficients de régression réalisée sur la base de la régression Ridge, qui pénalise les coefficients de régression par leur norme l2, dans le cadre de la grande dimension. Nous y proposons un test statistique basé sur le rééchantillonage. La seconde contribution porte sur une approche de sélection en deux étapes : une première étape de criblage des variables, basée sur la régression parcimonieuse Lasso précède l’étape de sélection proprement dite, où la pertinence des variables pré-sélectionnées est testée. Les tests sont construits sur l’estimateur de la régression Ridge adaptive, dont la pénalité est construite à partir des coefficients de régression du Lasso. Une dernière contribution consiste à transposer cette approche à la sélection de groupes de variables
In the regression framework, many studies focus on the high-dimensional problem, where the number of measured explanatory variables is very large compared to the sample size. While variable selection is a classical question, the usual methods are not applicable in the high-dimensional case. In this manuscript, we therefore transpose classical statistical tests to the high-dimensional setting. These tests operate on estimates of regression coefficients obtained by penalized linear regression, which is applicable in high dimension. The main objective of these tests is to control the false discovery rate. The first contribution of this manuscript quantifies the uncertainty of regression coefficients estimated by ridge regression, which penalizes the coefficients through their l2 norm, in high dimension. To do this, we devise a statistical test based on permutations. The second contribution is based on a two-step selection approach. A first step is dedicated to the screening of variables, based on the sparse Lasso regression. The second step consists in cleaning the resulting set by testing the relevance of the pre-selected variables. These tests are made on adaptive ridge estimates, whose penalty is constructed from the Lasso estimates learned during the screening step. A final contribution transposes this approach to the selection of groups of variables.
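A hedged sketch of a screen-then-fit pipeline in the spirit of the second contribution: lasso screening followed by an adaptive ridge whose per-variable penalty weights are built from the lasso coefficients. The weighting 1/|beta_lasso| and the tuning values are assumptions for illustration, not the thesis's exact construction.

```python
# Two-step sketch: lasso screening, then an adaptive ridge fit with per-variable
# penalty weights from the lasso, solved in closed form as (X'X + lam*W)^(-1) X'y.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n, p = 80, 200                                   # high-dimensional: p >> n
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:5] = 2.0               # only 5 truly relevant variables
y = X @ beta + rng.normal(size=n)

# Step 1: lasso screening keeps variables with nonzero coefficients.
lasso = Lasso(alpha=0.2).fit(X, y)
kept = np.flatnonzero(lasso.coef_)

# Step 2: adaptive ridge on the screened set, penalty weight 1/|beta_lasso_j|.
Xs = X[:, kept]
w = 1.0 / np.abs(lasso.coef_[kept])
lam = 1.0
beta_ar = np.linalg.solve(Xs.T @ Xs + lam * np.diag(w), Xs.T @ y)
print("screened variables:", kept)
print("adaptive-ridge coefficients:", np.round(beta_ar, 2))
```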
APA, Harvard, Vancouver, ISO, and other styles
14

CROPPER, JOHN PHILIP. "TREE-RING RESPONSE FUNCTIONS. AN EVALUATION BY MEANS OF SIMULATIONS (DENDROCHRONOLOGY RIDGE REGRESSION, MULTICOLLINEARITY)." Diss., The University of Arizona, 1985. http://hdl.handle.net/10150/187946.

Full text
Abstract:
The problem of determining the response of tree-ring width growth to monthly climate is examined in this study. The objective is to document which of the available regression methods are best suited to deciphering the complex link between tree growth variation and climate. Tree-ring response function analysis is used to determine which instrumental climatic variables are best associated with tree-ring width variability. Ideally such a determination would be accomplished, or verified, through detailed physiological monitoring of trees in their natural environment. A statistical approach is required because such biological studies on mature trees are currently too time consuming to perform. The use of lagged climatic data to duplicate a biological, rather than a calendar, year has resulted in an increase in the degree of intercorrelation (multicollinearity) of the independent climate variables. The presence of multicollinearity can greatly affect the sign and magnitude of estimated regression coefficients. Using series with known responses, the effectiveness of five different regression methods was objectively assessed in this study. The results from each of the 2000 regressions were compared to the known regression weights and a measure of relative efficiency was computed. The results indicate that ridge regression analysis is, on average, four times more efficient (average relative efficiency of 4.57) than unbiased multiple linear regression at producing good coefficient estimates. The results from principal components regression are slight improvements over those from multiple linear regression, with an average relative efficiency of 1.45.
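An illustrative sketch (not the dissertation's simulation design) of the relative-efficiency comparison described here: coefficient mean squared error of OLS versus ridge over Monte Carlo replicates with known regression weights and collinear predictors.

```python
# Relative efficiency = MSE(OLS coefficients) / MSE(ridge coefficients),
# averaged over Monte Carlo replicates with a known coefficient vector.
import numpy as np

rng = np.random.default_rng(7)
n, p, rho, reps = 60, 12, 0.8, 500          # e.g. 12 "monthly" climate predictors
beta = rng.normal(size=p)                   # known response weights
cov = rho * np.ones((p, p)) + (1 - rho) * np.eye(p)

sse_ols, sse_ridge = 0.0, 0.0
for _ in range(reps):
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    y = X @ beta + rng.normal(size=n)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    b_ridge = np.linalg.solve(X.T @ X + 1.0 * np.eye(p), X.T @ y)
    sse_ols += np.sum((b_ols - beta) ** 2)
    sse_ridge += np.sum((b_ridge - beta) ** 2)

print("relative efficiency (OLS MSE / ridge MSE):", sse_ols / sse_ridge)
```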
APA, Harvard, Vancouver, ISO, and other styles
15

Rahman, Md Abdur. "Statistical and Machine Learning for assessment of Traumatic Brain Injury Severity and Patient Outcomes." Thesis, Högskolan Dalarna, Institutionen för information och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:du-37710.

Full text
Abstract:
Traumatic brain injury (TBI) is a leading cause of death in all age groups and a source of societal concern. However, TBI diagnostics and the prediction of patient outcomes are still lacking in medical science. In this thesis, I used a subset of TBIcare data from Turku University Hospital in Finland to classify injury severity, patient outcomes, and CT (computerized tomography) findings as positive/negative. The dataset was derived from the comprehensive metabolic profiling of serum samples from TBI patients. The study included 96 TBI patients who were diagnosed as 7 severe (sTBI=7), 10 moderate (moTBI=10), and 79 mild (mTBI=79). Among them, there were 85 good recoveries (Good_Recovery=85) and 11 bad recoveries (Bad_Recovery=11), as well as 49 CT positive (CT. Positive=49) and 47 CT negative (CT. Negative=47). There was a total of 455 metabolites (features), excluding three response variables. Feature selection techniques were applied to retain the most important features while discarding the rest. Subsequently, four classifiers were used: ridge regression, lasso regression, a neural network, and deep learning. Ridge regression yielded the best results for binary classifications such as patient outcomes and CT positive/negative. The accuracy of CT positive/negative was 74% (AUC of 0.74), while the accuracy of patient outcomes was 91% (AUC of 0.91). For severity classification (multi-class classification), neural networks performed well, with a total accuracy of 90%. Despite the limited number of data points, the overall result was satisfactory.
APA, Harvard, Vancouver, ISO, and other styles
16

Dall'Olio, Lorenzo. "Estimation of biological vascular ageing via photoplethysmography: a comparison between statistical learning and deep learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/21687/.

Full text
Abstract:
This work aims to exploit the biological ageing phenomenon that affects human blood vessels. The analysis is performed starting from a database of photoplethysmographic signals acquired through smartphones. The next step is a preprocessing phase, where the signals are detrended using a central moving average filter, demodulated using the envelope of the analytic signal obtained from the Hilbert transform, and denoised by applying the central moving average filter to the envelope. After the preprocessing we compared two different approaches. The first one regards Statistical Learning, which involves feature extraction and selection through the use of statistics and machine learning algorithms, in order to perform a supervised classification task on the chronological age of the individual, which is used as a proxy for healthy/non-healthy vascular ageing. The second one regards Deep Learning, which involves building a convolutional neural network to perform the same task, but avoiding the feature extraction/selection step and thus the possible bias introduced by those phases. In doing so we obtained comparable outcomes, in terms of the area under the curve metric, from a 12-layer ResNet convolutional network and from a support vector machine using just covariates together with a couple of extracted features, acquiring clues regarding the possible use of such features as biomarkers for the vascular ageing process. The two mentioned features can be related to increasing arterial stiffness and increasing signal randomness due to ageing.
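A hedged sketch of the preprocessing chain summarized above, using a synthetic signal and arbitrary window lengths (the thesis's actual filter settings are not reproduced here):

```python
# Moving-average detrending, Hilbert-envelope demodulation, and moving-average
# smoothing of the envelope, applied to a synthetic PPG-like signal.
import numpy as np
from scipy.signal import hilbert

def moving_average(x, w):
    return np.convolve(x, np.ones(w) / w, mode="same")    # central moving average

fs = 100.0                                                 # assumed sampling rate (Hz)
t = np.arange(0, 30, 1 / fs)
ppg = (np.sin(2 * np.pi * 1.2 * t) * (1 + 0.3 * np.sin(2 * np.pi * 0.1 * t))
       + 0.5 * t / t[-1]
       + 0.1 * np.random.default_rng(8).normal(size=t.size))

detrended = ppg - moving_average(ppg, int(2 * fs))         # remove slow baseline drift
envelope = np.abs(hilbert(detrended))                      # amplitude envelope via analytic signal
smooth_env = moving_average(envelope, int(2 * fs))         # denoise the envelope
demodulated = detrended / smooth_env                       # amplitude demodulation
print(demodulated[:5])
```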
APA, Harvard, Vancouver, ISO, and other styles
17

Dumora, Christophe. "Estimation de paramètres clés liés à la gestion d'un réseau de distribution d'eau potable : Méthode d'inférence sur les noeuds d'un graphe." Thesis, Bordeaux, 2020. http://www.theses.fr/2020BORD0325.

Full text
Abstract:
L'essor des données générées par les capteurs et par les outils opérationnels autour de la gestion des réseaux d'alimentation en eau potable (AEP) rendent ces systèmes de plus en plus complexes et de façon générale les événements plus difficiles à appréhender. L'historique de données lié à la qualité de l’eau distribuée croisé avec la connaissance du patrimoine réseau, des données contextuelles et des paramètres temporels amène à étudier un système complexe de par sa volumétrie et l'existence d'interactions entre ces différentes données de natures diverses pouvant varier dans le temps et l’espace. L'utilisation de graphes mathématiques permet de regrouper toute cette diversité et fournit une représentation complète des réseaux AEP ainsi que les évènements pouvant y survenir ou influer sur leur bon fonctionnement. La théorie des graphes associées à ces graphes mathématiques permet une analyse structurelle et spectrale des réseaux ainsi constitués afin de répondre à des problématiques métiers concrètes et d'améliorer des processus internes existants. Ces graphes sont ensuite utilisés pour répondre au problème d'inférence sur les noeuds d'un très grand graphe à partir de l'observation partielle de quelques données sur un faible nombre de noeuds. Une approche par algorithme d'optimisation sur les graphes est utilisée pour construire une variable numérique de débit en tout noeuds du graphe (et donc en tout point du réseau physique) à l'aide d'algorithme de flots et des données issues des débitmètres réseau. Ensuite une approche de prédiction par noyau reposant sur un estimateur pénalisé de type Ridge, qui soulève des problèmes d'analyse spectrale de grande matrice creuse, permet l'inférence d'un signal observé sur un certains nombre de noeuds en tout point d'un réseau AEP
The rise of data generated by sensors and operational tools around water distribution network (WDN) management makes these systems more and more complex and, in general, the events more difficult to predict. The history of data related to the quality of distributed water, crossed with knowledge of network assets, contextual data and temporal parameters, leads to the study of a complex system due to its volume and the existence of interactions between these various types of data, which may vary in time and space. This wide variety of data is brought together through the use of mathematical graphs, which represent the WDN as a whole along with all the events that may arise therein or influence its proper functioning. Graph theory then allows a structural and spectral analysis of the networks so constituted, in order to answer specific operational needs and enhance existing processes. These graphs are also used to address the problem of inference on the nodes of a very large graph from the partial observation of data on a small number of nodes. An optimization algorithm on graphs is used to construct a flow variable at every node of the graph (and therefore at any point of the physical network), using flow algorithms and data measured in real time by flowmeters. Then, a kernel prediction approach based on a Ridge-type penalized estimator, which raises spectral analysis problems for a large sparse matrix, allows the inference of a signal observed on a few nodes at any point of a WDN.
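One standard way to formalize ridge-type inference of a node signal from partial observations uses a graph-Laplacian penalty; the sketch below is written in that spirit on a toy line graph, not with the thesis's exact estimator or water-network data.

```python
# Laplacian-regularized ("ridge-type") recovery of a node signal observed on a few nodes:
# minimize ||M(f - y)||^2 + lam * f' L f, solved as f = (M + lam*L)^(-1) M y.
import numpy as np

n = 10
A = np.zeros((n, n))
for i in range(n - 1):                              # a small line graph 0-1-2-...-9
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A                      # graph Laplacian

true_signal = np.linspace(0.0, 5.0, n)              # smooth signal over the graph
observed_nodes = [0, 4, 9]                          # only a few nodes are measured
M = np.zeros((n, n)); M[observed_nodes, observed_nodes] = 1.0
y = M @ true_signal

lam = 0.1
f_hat = np.linalg.solve(M + lam * L, M @ y)
print(np.round(f_hat, 2))
```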
APA, Harvard, Vancouver, ISO, and other styles
18

"Supervised ridge regression in high dimensional linear regression." 2013. http://library.cuhk.edu.hk/record=b5549319.

Full text
Abstract:
在機器學習領域,我們通常有很多的特徵變量,以確定一些回應變量的行為。例如在基因測試問題,我們有數以萬計的基因用來作為特徵變量,而它們與某些疾病的關係需要被確定。沒有提供具體的知識,最簡單和基本的方法來模擬這種問題會是一個線性的模型。有很多現成的方法來解決線性回歸問題,像傳統的普通最小二乘回歸法,嶺回歸和套索回歸。設 N 為樣本數和,p 為特徵變量數,在普通的情況下,我們通常有足夠的樣本(N> P)。 在這種情況下,普通線性回歸的方法,例如嶺回歸通常會給予合理的對未來的回應變量測值的預測。隨著現代統計學的發展,我們經常會遇到高維問題(N << P),如 DNA 芯片數據的測試問題。在這些類型的高維問題中,確定特徵變量和回應變量之間的關係在沒有任何進一步的假設的情況下是相當困難的。在很多現實問題中,儘管有大量的特徵變量存在,但是完全有可能只有極少數的特徵變量和回應變量有直接關係,而大部分其他的特徵變量都是無效的。 套索和嶺回歸等傳統線性回歸在高維問題中有其局限性。套索回歸在應用於高維問題時,會因為測量噪聲的存在而表現得很糟糕,這將導致非常低的預測準確率。嶺回歸也有其明顯的局限性。它不能夠分開真正的特徵變量和無效的特徵變量。我提出的新方法的目的就是在高維線性回歸中克服以上兩種方法的局限性,從而導致更精確和穩定的預測。想法其實很簡單,與其做一個單一步驟的線性回歸,我們將回歸過程分成兩個步驟。第一步,我们棄那些預測有相關性很小或為零的特徵變量。第二步,我們應該得到一個消減過的特徵變量集,我們將用這個集和回應變量來進行嶺回歸從而得到我們需要的結果。
In the field of statistical learning, we usually have a lot of features to determine the behavior of some response. For example, in gene testing problems we have lots of genes as features and their relations with a certain disease need to be determined. Without specific knowledge available, the simplest and most fundamental way to model this kind of problem would be a linear model. There are many existing methods to solve linear regression, such as conventional ordinary least squares, ridge regression and the LASSO (least absolute shrinkage and selection operator). Let N denote the number of samples and p the number of predictors. In ordinary settings where we have enough samples (N > p), linear regression methods like ridge regression will usually give reasonable predictions for future values of the response. In the development of modern statistical learning, we quite often meet high-dimensional problems (N << p), such as document classification and microarray data testing problems. In high-dimensional problems it is generally quite difficult to identify the relationship between the predictors and the response without any further assumptions. Despite the fact that there are many predictors available for prediction, most of the predictors are actually spurious in a lot of real problems. A predictor being spurious means that it is not directly related to the response. For example, in microarray data testing problems, millions of genes may be available for prediction, but only a few hundred genes are actually related to the target disease. Conventional techniques in linear regression like the LASSO and ridge regression both have their limitations in high-dimensional problems. The LASSO is a state-of-the-art technique for sparsity recovery, but when applied to high-dimensional problems its performance degrades considerably in the presence of measurement noise, resulting in high-variance predictions and large prediction error. Ridge regression, on the other hand, is more robust to additive measurement noise, but has the obvious limitation of not being able to separate true predictors from spurious predictors. As mentioned previously, in many high-dimensional problems a large number of the predictors can be spurious; in these cases ridge's inability to separate spurious from true predictors results in poor interpretability of the model as well as poor prediction performance. The new technique proposed in this thesis aims to overcome the limitations of these two methods, resulting in more accurate and stable prediction performance in high-dimensional linear regression problems with significant measurement noise. The idea is simple: instead of doing a single-step regression, we divide the regression procedure into two steps. In the first step we identify the seemingly relevant predictors and those that are obviously spurious by calculating the univariate correlations between the predictors and the response, and we discard those predictors that have very small or zero correlation with the response. After the first step we have a reduced predictor set. In the second step we perform a ridge regression between the reduced predictor set and the response; the result of this ridge regression is then our desired output.
The thesis is organized as follows. First I start with a literature review of the linear regression problem, introduce the ridge and LASSO in detail, and explain more precisely their limitations in high-dimensional problems. Then I introduce my new method, called supervised ridge regression, show the reasons why it should dominate the ridge and LASSO in high-dimensional problems, and present simulation results to strengthen my argument. Finally I conclude with the possible limitations of my method and point out possible directions for further investigation.
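A minimal sketch of the two-step supervised ridge regression idea as described above, with an illustrative correlation threshold and synthetic data (the thesis's exact screening rule and tuning may differ):

```python
# Step 1: screen predictors by univariate correlation with the response.
# Step 2: ridge regression on the reduced predictor set.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(9)
n, p = 50, 1000                                   # high-dimensional: N << p
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:10] = 2.0               # only a few true predictors
y = X @ beta + rng.normal(size=n)

corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
kept = np.flatnonzero(np.abs(corr) > 0.25)        # illustrative screening threshold

ridge = Ridge(alpha=1.0).fit(X[:, kept], y)
print("kept", kept.size, "of", p, "predictors")
print("first few ridge coefficients:", np.round(ridge.coef_[:5], 2))
```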
Detailed summary in vernacular field only.
Zhu, Xiangchen.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves 68-69).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts also in Chinese.
Chapter 1. --- BASICS ABOUT LINEAR REGRESSION --- p.2
Chapter 1.1 --- Introduction --- p.2
Chapter 1.2 --- Linear Regression and Least Squares --- p.2
Chapter 1.2.1 --- Standard Notations --- p.2
Chapter 1.2.2 --- Least Squares and Its Geometric Meaning --- p.4
Chapter 2. --- PENALIZED LINEAR REGRESSION --- p.9
Chapter 2.1 --- Introduction --- p.9
Chapter 2.2 --- Deficiency of the Ordinary Least Squares Estimate --- p.9
Chapter 2.3 --- Ridge Regression --- p.12
Chapter 2.3.1 --- Introduction to Ridge Regression --- p.12
Chapter 2.3.2 --- Expected Prediction Error And Noise Variance Decomposition of Ridge Regression --- p.13
Chapter 2.3.3 --- Shrinkage effects on different principal components by ridge regression --- p.18
Chapter 2.4 --- The LASSO --- p.22
Chapter 2.4.1 --- Introduction to the LASSO --- p.22
Chapter 2.4.2 --- The Variable Selection Ability and Geometry of LASSO --- p.25
Chapter 2.4.3 --- Coordinate Descent Algorithm to solve for the LASSO --- p.28
Chapter 3. --- LINEAR REGRESSION IN HIGH-DIMENSIONAL PROBLEMS --- p.31
Chapter 3.1 --- Introduction --- p.31
Chapter 3.2 --- Spurious Predictors and Model Notations for High-dimensional Linear Regression --- p.32
Chapter 3.3 --- Ridge and LASSO in High-dimensional Linear Regression --- p.34
Chapter 4. --- THE SUPERVISED RIDGE REGRESSION --- p.39
Chapter 4.1 --- Introduction --- p.39
Chapter 4.2 --- Definition of Supervised Ridge Regression --- p.39
Chapter 4.3 --- An Underlying Latent Model --- p.43
Chapter 4.4 --- Ridge LASSO and Supervised Ridge Regression --- p.45
Chapter 4.4.1 --- LASSO vs SRR --- p.45
Chapter 4.4.2 --- Ridge regression vs SRR --- p.46
Chapter 5. --- TESTING AND SIMULATION --- p.49
Chapter 5.1 --- A Simulation Example --- p.49
Chapter 5.2 --- More Experiments --- p.54
Chapter 5.2.1 --- Correlated Spurious and True Predictors --- p.55
Chapter 5.2.2 --- Insufficient Amount of Data Samples --- p.59
Chapter 5.2.3 --- Low Dimensional Problem --- p.62
Chapter 6. --- CONCLUSIONS AND DISCUSSIONS --- p.66
Chapter 6.1 --- Conclusions --- p.66
Chapter 6.2 --- References and Related Works --- p.68
APA, Harvard, Vancouver, ISO, and other styles
19

Lee, Andy Ho-Won. "Ridge regression and diagnostics in generalized linear models." PhD thesis, 1987. http://hdl.handle.net/1885/138444.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

O'Donnell, Robert P. (Robert Paul). "Fisher and logistic discriminant function estimation in the presence of collinearity." Thesis, 1990. http://hdl.handle.net/1957/37471.

Full text
Abstract:
The relative merits of the Fisher linear discriminant function (Efron, 1975) and logistic regression procedure (Press and Wilson, 1978; McLachlan and Byth, 1979), applied to the two-group discrimination problem under conditions of multivariate normality and common covariance, have been debated. In related research, DiPillo (1976, 1977, 1979) has argued that a biased Fisher linear discriminant function is preferable when one or more collinearities exist among the classifying variables. This paper proposes a generalized ridge logistic regression (GRL) estimator as a logistic analog to DiPillo's biased alternative estimator. Ridge and Principal Component logistic estimators proposed by Schaefer et al. (1984) for conventional logistic regression are shown to be special cases of this generalized ridge logistic estimator. Two Fisher estimators (Linear Discriminant Function (LDF) and Biased Linear Discriminant Function (BLDF)) and three logistic estimators (Linear Logistic Regression (LLR), Ridge Logistic Regression (RLR) and Principal Component Logistic Regression (PCLR)) are compared in a Monte Carlo simulation under varying conditions of distance between populations, training set size and degree of collinearity. A new approach to the selection of the ridge parameter in the BLDF method is proposed and evaluated. The results of the simulation indicate that two of the biased estimators (BLDF, RLR) produce smaller MSE values and are more stable estimators (smaller standard deviations) than their unbiased counterparts. But the improved performance for MSE does not translate into equivalent improvement in error rates. The expected actual error rates are only marginally smaller for the biased estimators. The results suggest that small training set size, rather than strong collinearity, may produce the greatest classification advantage for the biased estimators. The unbiased estimators (LDF, LLR) produce smaller average apparent error rates. The relative advantage of the Fisher estimators over the logistic estimators is maintained. But, given that the comparison is made under conditions most favorable to the Fisher estimators, the absolute advantage of the Fisher estimators is small. The new ridge parameter selection method for the BLDF estimator performs as well as, but no better than, the method used by DiPillo. The PCLR estimator shows performance comparable to the other estimators when there is a high level of collinearity. However, the estimator gives up a significant degree of performance in conditions where collinearity is not a problem.
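A hedged sketch comparing two of the estimators discussed here, Fisher's linear discriminant and an l2-penalized ("ridge") logistic regression, by cross-validated error rate on collinear two-group data (a toy stand-in for the dissertation's Monte Carlo design):

```python
# Fisher LDF vs ridge-penalized logistic regression on collinear two-group data,
# compared by 5-fold cross-validated error rate.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(10)
n, p, rho = 100, 6, 0.95
cov = rho * np.ones((p, p)) + (1 - rho) * np.eye(p)
X0 = rng.multivariate_normal(np.zeros(p), cov, size=n)         # group 0
X1 = rng.multivariate_normal(0.5 * np.ones(p), cov, size=n)    # group 1, shifted mean
X = np.vstack([X0, X1]); y = np.repeat([0, 1], n)

lda = LinearDiscriminantAnalysis()
rlr = LogisticRegression(penalty="l2", C=1.0, max_iter=5000)
for name, clf in [("Fisher LDF", lda), ("ridge logistic", rlr)]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name:15s} CV error rate = {1 - acc:.3f}")
```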
Graduation date: 1991
APA, Harvard, Vancouver, ISO, and other styles