Dissertations / Theses on the topic 'Multicollinearity'
Consult the top 48 dissertations / theses for your research on the topic 'Multicollinearity.'
Clark, Patrick Carl Jr. "The Effects of Multicollinearity in Multilevel Models." Wright State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=wright1375956788.
Duxbury, Scott W. "Diagnosing Multicollinearity in Exponential Random Graph Models." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1491393848069144.
Gou, Zhenkun. "Canonical correlation analysis and artificial neural networks." Thesis, University of the West of Scotland, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.269409.
Månsson, Kristofer. "Issues of multicollinearity and conditional heteroscedasticy in time series econometrics." Doctoral thesis, Internationella Handelshögskolan, Högskolan i Jönköping, IHH, Statistik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-31977.
Moineddin, Rahim. "Comments on Mallow's Cp statistics and multicollinearity effects on predictions." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/MQ58663.pdf.
Bakshi, Girish. "Comparison of ridge regression and neural networks in modeling multicollinear data." Ohio : Ohio University, 1996. http://www.ohiolink.edu/etd/view.cgi?ohiou1178815205.
Albarracin, Orlando Yesid Esparza. "Generalized autoregressive and moving average models: control charts, multicollinearity, and a new modified model." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-21112017-184544/.
Recently, in the health field, control charts have been proposed to monitor morbidity or mortality due to diseases. This work consists of three articles. In the first two articles, CUSUM and EWMA control charts were proposed to monitor count time series with seasonal and trend effects using generalized autoregressive and moving average (GARMA) models instead of the generalized linear models (GLM) usually applied in practice. Different statistics based on transformations, for variables that follow a negative binomial distribution, were used in these control charts. In the second article, two new statistics based on the log-likelihood ratio were proposed. Different scenarios describing disease profiles were considered to evaluate the effect of omitting serial correlation in these control charts. This impact was measured in terms of the Average Run Length (ARL). Neglecting serial correlation was found to increase false alarms. In general, all monitored statistics showed smaller ARL_0 values for larger autocorrelation values. However, none of the statistics considered proved more robust, in the sense of producing the smallest increase in false alarms in the scenarios considered. In the last article, GARMA(p, q) models with p and q simultaneously different from zero were studied, since two features were observed in practice. The first is the presence of multicollinearity, which leads to non-convergence of the maximum likelihood method using iteratively reweighted least squares. The second is the inclusion of the same lagged terms in the autoregressive and moving average components. A modified model, GARMA-M, was presented to deal with multicollinearity and improve the interpretation of the parameters. In general, simulation studies showed that the modified model provides estimates closer to the parameters and confidence intervals with higher percentage coverage than those obtained with GARMA models. However, some restrictions on the parameter space are imposed to guarantee the stationarity of the process. Finally, an analysis of real data illustrates the fit of the GARMA-M model to the daily number of hospital admissions of elderly people due to respiratory diseases from October 2012 to April 2015 in the city of São Paulo, Brazil.
CROPPER, JOHN PHILIP. "TREE-RING RESPONSE FUNCTIONS. AN EVALUATION BY MEANS OF SIMULATIONS (DENDROCHRONOLOGY RIDGE REGRESSION, MULTICOLLINEARITY)." Diss., The University of Arizona, 1985. http://hdl.handle.net/10150/187946.
Kuroki, Quispe André Francisco, and Taza Gianella Milagros Soto. "Factores que determinan el comportamiento del volumen de exportación de café peruano con partida 090111 según los años 1980 - 2017." Bachelor's thesis, Universidad Peruana de Ciencias Aplicadas (UPC), 2019. http://hdl.handle.net/10757/628233.
This thesis focuses on the factors that explain the export volume of Peruvian coffee from 1980 to 2017: cultivated area, average price and coffee yield. The purpose of the research is to develop a statistical model that allows producers in the coffee sector to forecast their export volumes. The methodology is quantitative, with a conclusive non-experimental design and a correlational descriptive scope. The results showed that average price is not a significant determinant of export volume; cultivated area and yield are the main factors producers must manage to increase volume. Coffee yield is a very sensitive variable, and managing it well leads to a significant increase in a producer's volume.
Thesis
Gripencrantz, Sarah. "Evaluating the Use of Ridge Regression and Principal Components in Propensity Score Estimators under Multicollinearity." Thesis, Uppsala universitet, Statistiska institutionen, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-226924.
Lee, Wonwoo. "Fractional principal components regression: a general approach to biased estimators." Diss., Virginia Polytechnic Institute and State University, 1986. http://hdl.handle.net/10919/49819.
Gatz, Philip L. Jr. "A comparison of three prediction based methods of choosing the ridge regression parameter k." Thesis, Virginia Tech, 1985. http://hdl.handle.net/10919/45724.
Master of Science
Pingel, Ronnie. "Some Aspects of Propensity Score-based Estimators for Causal Inference." Doctoral thesis, Uppsala universitet, Statistiska institutionen, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-229341.
Hansen, John A. "A comparison of parametric and nonparametric techniques used to estimate school district production functions analysis of model response to change in sample size and multicollinearity /." [Bloomington, Ind.] : Indiana University, 2008. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3324516.
Title from PDF t.p. (viewed on May 12, 2009). Source: Dissertation Abstracts International, Volume: 69-08, Section: A, page: 3030. Adviser: Daniel Mueller.
Williams, Ulyana P. "On Some Ridge Regression Estimators for Logistic Regression Models." FIU Digital Commons, 2018. https://digitalcommons.fiu.edu/etd/3667.
Ndiritu, Gachiri Charles. "An Application of Multiple Regression in Exchange Rate Arrangements." Thesis, University of the Western Cape, 2008. http://etd.uwc.ac.za/index.php?module=etd&action=viewtitle&id=gen8Srv25Nme4_1863_1263418792.
This project, "An application of multiple regression in exchange rate arrangement", focused on the processes followed by different countries when choosing an exchange rate regime for currency stabilization. It analyses the consequences faced by emerging markets as a result of changes in the volatility of developed countries' currencies (American Dollar, Japanese Yen, Euro, British Pound and the Canadian Dollar).
Atems, Bebonchu. "Essays in nonlinear macroeconomic modeling and econometrics." Diss., Kansas State University, 2011. http://hdl.handle.net/2097/11985.
Department of Economics
Lance J. Bachmeier
This dissertation consists of three essays in nonlinear macroeconomic modeling and econometrics. In the first essay, we decompose oil price movements into oil demand (stock market) shocks and oil supply (oil-market) shocks, and examine the response of the stock market to these shocks. We find that when oil prices are “net-increasing”, a stock market shock that causes the S&P 500 to rise by one percentage point will cause the price of oil to rise approximately 0.2 percentage points, with a statistically significant positive effect one day after the stock market shock. On the other hand, the response of the stock market to an oil market shock is a decline of 6.8 percent when the price of oil doubles. For other days, the initial response of the oil market to a stock market shock is the same as in the net oil price increase case (by construction). We then analyze the response of monetary policy to the identified stock market and oil market shocks and find that short-term interest rates respond to the stock market shocks but not the oil market shocks. Finally, we evaluate the predictive power of the decomposed stock market and oil shocks relative to the change in the price of oil. We find statistically significant gains in both the in-sample fit and out-of-sample forecast accuracy when using the identified stock market and oil market shocks rather than the change in the price of oil. The second essay revisits the statistical specification of near-multicollinearity in the logistic regression model using the Probabilistic Reduction approach. We argue that the ceteris paribus clause invoked with near-multicollinearity is rather misleading. This assumption states that one can assess the impact of near-multicollinearity by holding the parameters of the logistic regression model constant, while examining the impact on their standard errors and t-ratios as the correlation (\rho) between the regressors increases. 
Using the Probabilistic Reduction approach, we derive the parameters (and related statistics) of the logistic regression model and show that they are functions of \rho, indicating that the ceteris paribus clause in the traditional account of near-multicollinearity is unattainable. Monte Carlo simulations in the paper confirm these findings. We also show that traditional near-multicollinearity diagnostics, such as the variance inflation factor and the condition number, can fail to detect near-multicollinearity. Overall, the paper finds that near-multicollinearity in the logistic model is highly variable and may not lead to the problems indicated by the traditional account. Therefore, unexpected, unreliable or unstable estimates and inferences should not be blamed on near-multicollinearity. Rather, the modeler should return to economic theory or statistical respecification of the model to address these problems. The third essay examines the correlations between income inequality and economic growth using a panel of income distribution data for 3,109 counties of the U.S. We examine the non-spatial dynamic correlations between county inequality and growth using a System GMM approach, and find significant negative relationships between changes in inequality in one period and growth in the subsequent period. We show that this finding is robust across different sample sizes. We further argue that because the space-specific time-invariant variables that affect economic growth and inequality can differ significantly across counties, failure to incorporate spatial effects into a model of growth and inequality may lead to biased results. We assume that dependence among counties arises only from the disturbance process, hence the estimation of a spatial error model. Our results indicate that the bias in the parameter for inequality amounts to about 2.66 percent, while that for initial income amounts to about 21.51 percent.
Nakamura, Karina Gernhardt. "Multicolinearidade em modelos de regressão logística." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-28052013-222241/.
This work proposes the use of some biased estimators to investigate whether it is possible to minimize the effects of multicollinearity in logistic regression models. Initially, the logistic regression model is presented, together with its fitting process (which yields the maximum likelihood estimator), tests to evaluate the significance of the parameters, and techniques to assess goodness of fit. The effects of multicollinearity on the fitting process and on parameter inference are then discussed, as are techniques for identifying its presence. To diminish the effect of this problem, two alternative estimators are presented: the ridge estimator and the principal component estimator. The performance of the three estimators was compared in a simulation study and applied to a real data set. The main conclusion was that, in the presence of multicollinearity, the alternative estimators performed better than the maximum likelihood estimator and reduced the effects of multicollinearity.
Evani, Bhanu M. "WEIGHTED QUANTILE SUM REGRESSION FOR ANALYZING CORRELATED PREDICTORS ACTING THROUGH A MEDIATION PATHWAY ON A BIOLOGICAL OUTCOME." VCU Scholars Compass, 2017. http://scholarscompass.vcu.edu/etd/4760.
Carbonera, Roberto. "Atributos físicos e fisiológicos de sementes de aveia preta." Universidade Federal de Santa Maria, 2016. http://repositorio.ufsm.br/handle/1/3273.
Forage plants play an important role in animal production in southern Brazil. Among these species, black oat stands out for having the largest winter cultivation area, occupying 3.8 million hectares in the state of Rio Grande do Sul. For proper sowing and establishment, seeds must meet high quality standards, which are assessed by analysis laboratories. This research therefore aimed to evaluate the physical and physiological attributes of black oat seeds and to relate seed quality to the production profile and to possible effects of meteorological factors. It also aimed to identify the variables that correlate with the percentage of pure seeds and with seedling emergence, to identify the presence of multicollinearity and the most important variables in relation to the main dependent variable, the percentage of normal seedlings, and to group the samples by their degree of similarity. A total of 2,910 samples were evaluated: 2,229 seed analyses from the seed production process, 357 analyses of seeds for own use and 324 tetrazolium analyses, all analyzed by the seed analysis laboratory of the Agronomy program at UNIJUÍ following the methodology described in the rules for seed analysis. The results were subjected to descriptive statistics and data dispersion analysis; Pearson linear correlation coefficients, the multicollinearity diagnosis, direct and indirect effects through path analysis, and clustering of the samples were also estimated. Seeds produced under the national seed and seedling system showed excellent levels of physical and physiological quality from 2006 to 2010. From 2011 to 2014, 14 and 14.5% of the seeds were compromised by the presence of other seeds of cultivated species and of tolerated noxious species, respectively.
Seeds for own use showed wide variability, with 18.1 and 31.7% of samples below the germination standard in 2006 to 2010 and 2011 to 2014, respectively, while samples analyzed by the tetrazolium test showed rejection rates of 19.4 and 12.5%, respectively. The physiological quality of the seeds is related to years with levels of rainfall and temperature suitable for vegetative development, physiological maturity and harvest. The variable normal seedlings showed its highest correlation, negative in sign, with dead seeds. The variables abnormal seedlings and dead seeds showed the largest direct effects, negative in sign, on germination percentage, and cluster analysis revealed three similarity groups in seeds produced under the national seed and seedling system and four groups in seeds for own use.
Casarotto, Gabriele. "Relações lineares entre caracteres fenológicos, morfológicos e produtivos em milho." Universidade Federal de Santa Maria, 2013. http://repositorio.ufsm.br/handle/1/5084.
This study aimed to verify the existence of linear relationships among phenological, morphological and productive characters of maize (Zea mays L.) cultivars of the early and very early cycles and of the transgenic class, and to identify which characters have high correlation with and direct effects on grain productivity. Six experiments were performed with early, very early and transgenic maize cultivars in the 2009-2010 and 2010-2011 growing seasons, in the experimental area of the Department of Plant Science of the Federal University of Santa Maria. In the 2009-2010 season, 36 early, 22 very early and 18 transgenic cultivars were evaluated; in the 2010-2011 season, 23 early, 9 very early and 27 transgenic. The experimental design was a randomized block design with three replications. Each experimental unit consisted of two rows five meters long, spaced 0.80 m apart. The seeding rate was adjusted to 62,500 plants ha⁻¹. In each experimental unit, three plants were tagged at random and 15 characters were evaluated on each one. The average of these three plants was the value of the replication. The characters evaluated were phenological (total number of leaves per plant (NFO), phyllochron estimated with the number of expanded leaves (FNFE) and phyllochron estimated with the total number of leaves (FNTF), in °C day leaf⁻¹, number of days from seeding until male flowering (FM) and number of days from seeding until female flowering (FF)), morphological (plant height (PH) and ear insertion height (AE), in cm) and productive (ear weight (PE), in g, number of kernel rows per ear (NFI), ear length (CE), in cm, ear diameter (DE), in mm, cob weight (PS), in g, cob diameter (DS), in mm, hundred kernel weight (MCG), in g, and grain productivity (PRO), in g ear⁻¹). Analysis of variance (ANOVA) was performed and the cultivar means were compared by the Scott-Knott test at 5% probability.
Pearson linear correlation coefficients among the 15 evaluated characters were estimated for each experiment. For the path analysis, PRO was considered the main character and the other characters were considered explanatory. A multicollinearity diagnosis was carried out on the correlation matrix among the explanatory characters, and the characters causing a high degree of multicollinearity were eliminated. The direct and indirect effects on PRO were estimated using path analysis, and the characters that influence PRO and their contribution to predicting PRO were identified by stepwise regression analysis. There are linear relationships among the phenological, morphological and productive characters of maize plants. The characters PE and DE showed very strong (r ≥ 0.97) and moderate to strong (0.55 ≤ r ≤ 0.78) Pearson linear correlation coefficients, respectively, with PRO. In general, the character DE has high correlation with and positive direct effects (0.6686 ≤ direct effect ≤ 1.1818) on PRO. Together with DE, CE makes a high positive contribution to predicting PRO. Therefore, they can be used for indirect selection in maize breeding programs.
Лисенко, О. В. "Моделювання причинно-наслідкових зв’язків між тіньовою економікою та соціально-економічними процесами." Master's thesis, Сумський державний університет, 2021. https://essuir.sumdu.edu.ua/handle/123456789/86994.
The paper examines the existence of causal relationships between shadow economy and socio-economic processes. The main purpose of this work is to build economic and mathematical models of the impact of economic indicators and the impact of social indicators on the level of the shadow economy. The key research methods are principal component analysis and multiple correlation and regression analysis, which were implemented using Statistica software.
Rey, Diana. "A Gasoline Demand Model for the United States Light Vehicle Fleet." Master's thesis, University of Central Florida, 2009. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2351.
M.S.
Department of Civil and Environmental Engineering
Engineering and Computer Science
Civil Engineering MS
Huschens, Stefan. "Einführung in die Ökonometrie." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-222629.
NÓBREGA, Jarley Palmeira. "Um método de aprendizagem seqüencial com filtro de Kalman e Extreme Learning Machine para problemas de regressão e previsão de séries temporais." Universidade Federal de Pernambuco, 2015. https://repositorio.ufpe.br/handle/123456789/15951.
In machine learning applications, there are situations where the input dataset is not fully available at the beginning of the training phase. A well-known solution for this class of problem is to perform the learning process through the sequential feed of training instances. Among the most recent approaches for sequential learning, we can highlight the methods based on the Single Layer Feedforward Network (SLFN) and the extensions of the Extreme Learning Machine (ELM) approach for sequential learning. The sequential version of the ELM algorithm, named Online Sequential Extreme Learning Machine (OS-ELM), uses a recursive least squares solution for updating the output weights through a covariance matrix. However, the implementation of OS-ELM and its extensions suffers from the problem of multicollinearity in the hidden layer output matrix. This thesis introduces a new method for sequential learning in which the effects of multicollinearity are handled. The proposed Kalman Learning Machine (KLM) sequentially updates the output weights of an OS-ELM based network by using the Kalman filter iterative procedure. In this work, in order to reduce the computational complexity of the training process, a new approach for estimating the filter parameters is presented. Moreover, an extension of the method, named Extended Kalman Learning Machine (EKLM), is presented for problems where the dynamics of the model are nonlinear. The proposed method was evaluated by comparison with related state-of-the-art methods for sequential learning based on the original OS-ELM. The results of the experiments show that the proposed method can achieve the lowest forecast error when compared with most of its counterparts. Moreover, the KLM algorithm achieved the lowest average training time when all experiments were considered, as evidence that the proposed method can reduce the computational complexity of the sequential learning process.
A case study was performed by applying the proposed method to a problem of financial time series forecasting. The results reported confirm that the KLM algorithm can decrease the forecast error and the average training time simultaneously, when compared with other sequential learning algorithms.
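The recursive least squares update that the abstract above attributes to OS-ELM can be sketched in a few lines of NumPy. This is a minimal illustration, not the thesis's KLM or EKLM method: the function names, the tanh hidden layer, the random data and the small ridge term `lam` (added so the initial matrix stays invertible when hidden outputs are nearly collinear) are all assumptions made for the example.

```python
import numpy as np

def rls_init(H0, T0, lam=1e-3):
    """Initial batch solve: P0 = (H0'H0 + lam*I)^-1 and beta0 = P0 H0' T0."""
    P = np.linalg.inv(H0.T @ H0 + lam * np.eye(H0.shape[1]))
    return P, P @ H0.T @ T0

def rls_update(P, beta, H, T):
    """Fold a new chunk (H, T) into the running solution without refitting."""
    gain = P @ H.T @ np.linalg.inv(np.eye(H.shape[0]) + H @ P @ H.T)
    P = P - gain @ H @ P
    beta = beta + P @ H.T @ (T - H @ beta)
    return P, beta

# Hypothetical data: 60 samples, 3 inputs, 5 random tanh hidden units.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
T = rng.normal(size=(60, 1))
W = rng.normal(size=(3, 5))
b = rng.normal(size=5)
H = np.tanh(X @ W + b)  # hidden layer output matrix

lam = 1e-3
P, beta = rls_init(H[:20], T[:20], lam)
for start in range(20, 60, 10):  # feed the remaining data in chunks of 10
    P, beta = rls_update(P, beta, H[start:start + 10], T[start:start + 10])

# Reference: the batch ridge solution on all 60 samples at once.
beta_batch = np.linalg.solve(H.T @ H + lam * np.eye(5), H.T @ T)
```

Under these assumptions the sequential weights `beta` should agree with `beta_batch` up to rounding error. When columns of `H` are nearly collinear, `P` becomes ill-conditioned, which is the kind of sensitivity the abstract describes.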
Zaldivar, Cynthia. "On the Performance of some Poisson Ridge Regression Estimators." FIU Digital Commons, 2018. https://digitalcommons.fiu.edu/etd/3669.
Toebe, Marcos. "Não-normalidade multivariada e multicolinearidade em análise de trilha na cultura de milho." Universidade Federal de Santa Maria, 2012. http://repositorio.ufsm.br/handle/1/5057.
Path analysis allows evaluation of the direct and indirect effects of explanatory variables on a variable of interest through the decomposition of correlation coefficients. For the results of a path analysis to be reliable, some assumptions must be met. Thus, the objectives of this study were to verify the interference of multivariate non-normality and multicollinearity in path analysis in corn and to compare alternative methods for estimating the path coefficients. Data were used from 44 corn cultivar trials carried out in the state of Rio Grande do Sul between the crop years 2002/03 and 2004/05. In each cultivar of each trial, seven explanatory variables (number of days until male flowering, plant height, ear insertion height, relative position of the ear, number of plants, number of ears, and prolificacy) and the main variable (grain yield) were measured. For each trial, descriptive statistics were calculated, and univariate and multivariate normality were diagnosed using the Shapiro-Wilk test and Royston's multivariate generalization of the Shapiro-Wilk test, respectively. Then, for the trials whose data did not follow a normal distribution, the data were transformed using the Box-Cox family of transformations. The correlation coefficients among the seven explanatory variables (correlation matrix X'X) and the correlation coefficients of each explanatory variable with grain yield (correlation matrix X'Y) were calculated for the original and the transformed data. Multicollinearity was then diagnosed in the correlation matrix X'X using four methods: variance inflation factor, tolerance, condition number, and matrix determinant. Finally, path analysis was performed using the system of normal equations $X'X\hat{\beta} = X'Y$ in three forms: traditional path analysis, path analysis under multicollinearity, and traditional path analysis with elimination of variables.
Transforming the data to obtain multivariate normality helps reduce the degree of multicollinearity and stabilize the estimates of the direct effects in path analyses with a high degree of multicollinearity. The adverse effects of a high degree of multicollinearity on the estimates of the direct effects in path analysis are greater than those of multivariate non-normality. Traditional path analysis with elimination of variables is more appropriate than path analysis under multicollinearity.
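The four diagnostics named in this abstract, and the normal-equations step of traditional path analysis, can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the thesis's procedure: the function names, the simulated variables, and the use of the eigenvalue ratio as the condition number are assumptions made here.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Diagnostics computed on the correlation matrix of the predictors."""
    R = np.corrcoef(X, rowvar=False)
    vif = np.diag(np.linalg.inv(R))            # VIF_j is the j-th diagonal entry of R^{-1}
    tolerance = 1.0 / vif                      # tolerance is the reciprocal of the VIF
    eigenvalues = np.linalg.eigvalsh(R)
    condition_number = eigenvalues.max() / eigenvalues.min()  # assumed definition
    determinant = np.linalg.det(R)             # near 0 signals strong collinearity
    return vif, tolerance, condition_number, determinant

def path_coefficients(X, y):
    """Direct effects from the normal equations R_xx b = r_xy (traditional form)."""
    C = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    R_xx, r_xy = C[:-1, :-1], C[:-1, -1]
    return np.linalg.solve(R_xx, r_xy)

# Hypothetical data: x2 is almost a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=200), rng.normal(size=200)])
y = X @ np.array([1.0, 1.0, 1.0]) + rng.normal(size=200)

vif, tolerance, condition_number, determinant = collinearity_diagnostics(X)
print(vif, tolerance, condition_number, determinant)
print(path_coefficients(X, y))
```

With nearly collinear predictors the first two variance inflation factors are large and the determinant is close to zero, which is the situation in which the estimated direct effects become unstable.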
Brunelli, Renata Trevisan. "Análise do impacto de perturbações sobre medidas de qualidade de ajuste para modelos de equações estruturais." Universidade de São Paulo, 2012. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-24032013-123415/.
Structural Equation Modeling (SEM) is a multivariate methodology that allows the simultaneous study of cause-and-effect relationships and correlations among a set of variables, which may be observed or latent. The technique has become more widespread in recent years across different fields of knowledge. One of its main applications is the confirmation of theoretical models proposed by the researcher (Confirmatory Factor Analysis). The literature suggests several measures for assessing the goodness of fit of an SEM model. However, few texts relate the values of these measures to problems that may occur in the sample or in the specification of the SEM model, such as which problems of this nature affect which measures (and which they do not), and how the impact occurs. This information is important because it helps explain why a model may be considered poorly fitted. The objective of this work is to investigate how different disturbances of the sample, the model specification, and the estimation of an SEM model affect the measures of goodness of fit and, additionally, whether the sample size influences this impact. It will also be investigated whether those disturbances affect the parameter estimates, given that for some disturbances certain measures indicate a poor fit while the parameters are unaffected, whereas on other occasions the measures indicate a good fit while the parameter estimates are disturbed. These investigations will be carried out by simulating samples of different sizes for each type of disturbance. Then, SEM models with different specifications will be fitted to each sample, and their parameters will be estimated by two different methods: Generalized Least Squares and Maximum Likelihood.
Given those answers, a researcher who wants to apply the SEM methodology will be able to be more careful and to choose, among the available measures of goodness of fit, those best suited to the characteristics of the study.
Shehzad, Muhammad Ahmed. "Pénalisation et réduction de la dimension des variables auxiliaires en théorie des sondages." Phd thesis, Université de Bourgogne, 2012. http://tel.archives-ouvertes.fr/tel-00812880.
Huang, Sheng-Yao, and 黃生耀. "The Comparison of Different Investment Models Under Multicollinearity." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/63728456035882588154.
National Taiwan Ocean University
Institute of Applied Economics
94
This thesis investigates investment-related time series using three econometric methods. We apply stepwise regression guided by four major theories of investment decision-making: the generalized accelerator, cash flow, neoclassical, and securities valuation theories. In total, we choose 38 variables from the investment function and analyze them with multiple regression. We then compare the results with those of principal component regression and ridge regression. All data are from the AREMOS data bank. The sample period runs from the first quarter of 1976 through the fourth quarter of 2004. As a first step, we use unit root tests to check the stationarity of the data and find that the variables are, in general, nonstationary. We therefore carry out the analysis both in differenced form and in level form. For the former, we follow the traditional procedure of differencing the data prior to the analysis. For the latter, we adopt the framework of cointegration regression. Judging from all the results, stepwise regression guided by economic theory performs best. The outcomes of principal component regression and ridge regression are not reliable if the data span a wide range and normalization is not applied. We hope this result will be useful to private agencies and government organizations. Key words: investment function, stepwise regression, principal component, ridge regression.
Pan, Li-Hsiang, and 潘立翔. "A Study on the effect of Multicollinearity in Polynomial model." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/17227311806609276791.
Tamkang University
Department of Mathematics, Master's Program
99
Linear regression analysis is prone to the problem of collinearity. In most of the past literature, collinearity problems have been handled in one of three ways: (1) when predictors are highly correlated with one another, taking only one important variable into the analysis; (2) ridge regression; and (3) principal component regression. Each of these three methods has some shortcomings, so this thesis investigates a new method for solving the collinearity problem and compares it with ridge regression and principal component regression.
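Of the three conventional remedies listed in this abstract, principal component regression is perhaps the least self-explanatory. The following is a minimal sketch, assuming standardized predictors and a hypothetical helper name `pcr_coefficients`; it is not the new method proposed in the thesis.

```python
import numpy as np

def pcr_coefficients(X, y, n_components):
    """Principal component regression on standardized predictors.

    Returns coefficients on the standardized scale, obtained by regressing
    the centered response on the leading principal components only.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yc = y - y.mean()
    # Rows of Vt are the principal directions; s holds the singular values.
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xs @ V                                    # component scores
    gamma = (scores.T @ yc) / s[:n_components] ** 2    # scores are orthogonal
    return V @ gamma                                   # map back to predictors

# Hypothetical collinear data: the second column nearly duplicates the first.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=100), rng.normal(size=100)])
y = X @ np.array([1.0, 1.0, 1.0]) + rng.normal(size=100)

print(pcr_coefficients(X, y, n_components=2))  # drops the near-degenerate direction
print(pcr_coefficients(X, y, n_components=3))  # keeps every direction
```

Keeping all components reproduces ordinary least squares on the standardized data, so the stabilizing effect comes entirely from discarding the low-variance directions that collinearity creates.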
Bhattacharya, Indranil. "Feature Selection under Multicollinearity & Causal Inference on Time Series." Thesis, 2017. http://etd.iisc.ernet.in/2005/3980.
Liu, Yu-Chen, and 劉育呈. "Analyzing the Factors Affecting Survival of Cancer Patients under Multicollinearity Problem." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/ar7a28.
National Taipei University of Technology
Department of Business Management, Master's Program
101
In Taiwan, cancer ranks first among the most lethal diseases. It not only affects the quality of life of patients and their families, but also causes huge medical expenses and many years of potential life lost. In order to reduce the incidence of cancer effectively, we try to find out the causes of low cancer rate by analyzing the pattern and availability of the factors that influence survival. The objective of this research is to investigate the factors that influence the survival time of cancer patients. Because survival time can be affected by factors that are highly correlated with each other, the problem of multicollinearity must be solved before appropriate medical treatment can be determined. Therefore, this research proposes a new solution, a Cox proportional hazards model combined with independent component analysis, to eliminate the multicollinearity among explanatory variables. To evaluate the performance of the proposed method relative to alternative approaches, we report an experimental study based on data from Monte Carlo simulation experiments. The results show that the proposed approach can solve the problem of multicollinearity and identify the significant effects of the factors on the survival time of cancer patients.
Riley, Fransell Rena Copeland. "Testing the equality of regression coefficients and a pooling methodology from multiple samples when the data is multicollinear." 2009. http://hdl.handle.net/10106/1737.
Lee, Shih-Yun, and 李詩芸. "Model selection in regression analysis on the data with multicollinearity and missing covariates." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/68017326545657148002.
National Yang-Ming University
Institute of Public Health
92
In public health research, the problems of multicollinearity and missing values are frequently encountered. Multicollinearity arises when covariates are correlated with one another. Serious multicollinearity in regression analysis makes the regression parameter estimates unstable, which inflates their standard errors and may even interfere with or mislead variable selection. When important variables have missing values, some information is lost and the results of the analysis are likely to be biased. Few previous studies have examined the effects of multicollinearity and missing values on variable selection at the same time. The purpose of this research is to compare the effects on variable selection of (1) the presence or absence of missing covariate values, (2) the presence or absence of multicollinearity among covariates, and (3) the simultaneous presence of missing values and multicollinearity. The records of Obstructive Sleep Apnea Syndrome (OSAS) patients from a hospital center are used as the population for various simulation studies. These studies examine how different sample sizes, data structures, proportions of missingness, magnitudes of multicollinearity, criteria for variables entering or leaving the model, and selection methods affect variable selection in regression models. Given the characteristics of the data, this research focuses mostly on linear regression and partly on logistic regression. It covers three subjects: multicollinearity, missing values, and variable selection. The linear regression results show that missing values have little effect on the proportion of times the correct model is chosen, with only a slight effect at small sample sizes.
Multicollinearity has a substantial effect on the proportion of times the correct model is chosen: collinearity among the candidate variables or within the true model decreases that proportion. When neither the true model nor the candidate variables exhibit collinearity, different levels of R², entry/removal criteria, and selection methods have little effect on choosing the correct model. When the true model contains quadratic terms or the candidate variables include interactions, the results are poor at small sample sizes if the entry/removal criteria are strict, but better at large sample sizes. The larger the R², the higher the proportion of correct choices, regardless of sample size. When the true model itself has collinearity and the candidate variables include interactions, that is, when serious multicollinearity exists, backward selection is better than forward stepwise selection at choosing the correct model. According to the P values, when the true model has collinearity and the sample is large enough, the covariates in the estimated models are not interfered with and become significant. When the true model has no collinearity, covariates in the estimated models that do not appear in the true model are not significant, whether or not collinearity interferes. The logistic regression results show little difference between forward stepwise and backward selection. With small sample sizes, an entry/removal criterion of 0.05 performs worse than 0.1, but with large sample sizes it performs better than 0.1. The existence of missing values slightly decreases the proportion of times the correct model is chosen. In conclusion, the sample size greatly affects the results.
Chen, Hsin-Fen, and 陳杏棻. "A Solution to Cox regression with Multicollinearity - An application of Independent Component Analysis." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/s72wmw.
National Taipei University of Technology
College of Management, PhD Program in Management
104
The purpose of this study is to solve the multicollinearity problem in the Cox regression model. The Cox regression model has been widely used to describe the relationship between survival information and covariates. Multicollinearity means that one or several approximate linear relations exist among the explanatory variables. Multicollinearity troubles many researchers because, when it is present, the collective explanatory power of the variables is considerably less than the sum of their individual powers. Moreover, the presence of multicollinearity invalidates ordinary least squares (OLS) estimation, which assumes that explanatory variables are uncorrelated with each other, and makes it impossible to estimate the unique effects of individual variables in the analysis. The problem of multicollinearity must therefore be addressed. This study proposes a new solution, Independent Component Analysis - Cox Regression (ICA-CR), to eliminate the multicollinearity among explanatory variables. To evaluate the performance of the proposed method relative to alternative approaches, such as Cox regression, ridge regression, and principal component regression, two Monte Carlo simulation experiments with various degrees of multicollinearity, censoring rates, and sample sizes were conducted. A dataset from one of the biggest mutual fund brokers in Taiwan was used to illustrate the proposed approach. The results show that the proposed ICA-CR approach successfully solves the multicollinearity problem in the data. Mutual fund holding time was affected by the economic environment in significantly different ways during and after the financial crisis. This result indicates that, after the financial crisis, mutual fund investors adjusted their risk tolerance and can respond to the financial environment more rationally.
Lin, Yu-Wei. "Gram-Schmidt Transformation Minimization Algorithm and Its Applications to Regression Analysis with Multicollinearity." 2008. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-2708200816471900.
Fang, Wei-Quan, and 方偉泉. "A Note on Parameters Estimation in Linear Regression Models Subject to Measurement Error and Multicollinearity." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/75369753916933474623.
Chung Yuan Christian University
Graduate Institute of Applied Mathematics
104
In recent years, big data applications and intelligence have become ubiquitous and continue to yield new insights in academia and industry alike. Data analyses in such research may sometimes require complex statistical methods for which, in general, no closed-form solutions can be used directly because of imprecise measurements and/or unexpected errors. In addition, there have been, to date, a number of reports in the literature that some statistical results may lead to undesirable conclusions due to collinearity in the data structure. Hence investigators should pay attention to such situations when mining the data and extracting information. In this dissertation, we study measurement-error and collinearity issues in linear models. As a tool for interpreting possible cause-effect relations, regression analysis has long played an important role. One of the main goals of this study is to remind readers that the classical estimation approach, the least-squares method, may need to be corrected for bias in the parameter estimates in certain applications. Accordingly, we propose two new methods to correct such biases and give an outlook on further extensions. It is also our hope that some of the estimation approaches proposed in this dissertation will contribute to the subsequent development of a more general theory of bias correction in regression analysis.
Jurczyk, Tomáš. "Robustifikace statistických a ekonometrických metod regrese." Doctoral thesis, 2016. http://www.nusl.cz/ntk/nusl-351516.
Liu, Xiaoming. "A class of generalized shrunken least squares estimators in linear model." 2010. http://hdl.handle.net/1993/4188.
Parandvash, G. Hossein. "On the incorporation of nonnumeric information into the estimation of economic relationships in the presence of multicollinearity." Thesis, 1987. http://hdl.handle.net/1957/26851.
Chen, Ai-Chun, and 陳愛群. "A class of Liu-type estimators based on ridge regression under multicollinearity with an application to mixture experiments." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/bquhze.
National Central University
Graduate Institute of Statistics
103
In linear regression, the least squares estimator does not perform well in terms of mean squared error when multicollinearity exists. The problem of multicollinearity occurs in industrial mixture experiments, where the regressors are constrained. Hoerl and Kennard (1970) proposed the ordinary ridge estimator to overcome the problem of the least squares estimator under multicollinearity. Recently, ridge regression has been successfully applied to mixture experiments. However, applying ridge regression becomes difficult if the linear model has an intercept term and the regressors are standardized, as occurs in mixture experiments. This paper considers a special class of Liu-type estimators (Liu, 2003) with an intercept. We derive the theoretical formula for the mean squared error of the proposed method. We perform simulations to compare the proposed estimator with the ridge estimator in terms of mean squared error. We demonstrate this special class using the Portland cement dataset from a mixture experiment (Woods et al., 1932).
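The mean-squared-error comparison that motivates ridge-type estimators can be sketched with a small Monte Carlo experiment. The code below covers only the ordinary ridge estimator $(X'X + kI)^{-1}X'y$ against least squares; it does not implement the Liu-type class studied in the thesis, and the design matrix, true coefficients, ridge constant k, and the omission of an intercept are all assumptions made for illustration.

```python
import numpy as np

def ridge_estimate(X, y, k):
    """Ordinary ridge estimator; k = 0 gives the least squares estimator."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def simulate_mse(k=1.0, n=50, n_reps=500, seed=0):
    """Average squared estimation error of OLS and ridge over repeated samples."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = x1 + 0.05 * rng.normal(size=n)      # nearly collinear with x1
    x3 = rng.normal(size=n)
    X = np.column_stack([x1, x2, x3])        # fixed design, no intercept (assumed)
    beta = np.array([1.0, 1.0, 1.0])         # hypothetical true coefficients
    ols_error = 0.0
    ridge_error = 0.0
    for _ in range(n_reps):
        y = X @ beta + rng.normal(size=n)
        ols_error += np.sum((ridge_estimate(X, y, 0.0) - beta) ** 2)
        ridge_error += np.sum((ridge_estimate(X, y, k) - beta) ** 2)
    return ols_error / n_reps, ridge_error / n_reps

ols_mse, ridge_mse = simulate_mse()
print(f"OLS MSE: {ols_mse:.3f}, ridge MSE: {ridge_mse:.3f}")
```

In this setup the near-duplicate columns give X'X one very small eigenvalue, which inflates the variance of least squares; the ridge constant trades a small bias for a large variance reduction.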
Γρηγοριάδου, Μαρία. "Παραβιάσεις των βασικών υποθέσεων του γραμμικού μοντέλου παλινδρόμησης." Thesis, 2014. http://hdl.handle.net/10889/8276.
The statistical model is a standardization of stochastic relationships between variables in the form of mathematical equations, in order to accurately describe a system, phenomena, or facts. Almost every system includes some variable quantities that change. The interesting question is what effects those variables have (or appear to have) on other variables. This kind of investigation is the object of regression analysis, a widely used statistical technique for detecting relations and dependences between variables. Linear regression models arise when the relations between variables are linear. In addition, statistical models rest on some significant assumptions, which we are obliged to validate before analyzing the model. However, these assumptions are often violated in practice. Especially when we have to face with <
Bartel, Joseph. "A study on the effects of multicollinearity, autocorrelation and four sampling designs on the predictive ability of the 1994 and 1995 variable-exponent taper functions." Thesis, 1999. http://hdl.handle.net/2429/9003.
Meňhartová, Ivana. "Metody dynamické analýzy složení portfolia." Master's thesis, 2012. http://www.nusl.cz/ntk/nusl-305048.
"A Spatial Statistical Framework for Evaluating Landscape Pattern and Its Impacts on the Urban Thermal Environment." Doctoral diss., 2016. http://hdl.handle.net/2286/R.I.39433.
Dissertation/Thesis
Doctoral Dissertation Geography 2016
Berger, Swetlana. "Scale effects on genomic modelling and prediction." Doctoral thesis, 2015. http://hdl.handle.net/11858/00-1735-0000-0022-6086-8.
In this thesis, a new method is proposed for the scale-independent comparison of LD structures in different genomic regions. Various aspects of problems caused by scale, from the precision of the estimated marker effects to the accuracy of prediction for new individuals, were investigated. In addition, based on performance comparisons of different statistical methods, recommendations were given for the use of the methods studied.
Onishi, Tamaki. "Institutional influence on the manifestation of entrepreneurial orientation: A case of social investment funders." Thesis, 2014. http://hdl.handle.net/1805/4656.
Linking the new institutionalism to entrepreneurial orientation (EO), my dissertation investigates institutional forces and entrepreneurial forces, two contradicting types of forces, as main effects and moderating effects upon the practices and performance of organizations embedded in institutional duality. The case chosen observes unique hybrid funders that this study collectively calls social investment funders (SIFs), which integrate philanthropy and venture capital investment to create and implement a venture philanthropy model in pursuit of their mission. A theoretical framework is developed to propose regulative and normative pressures from the two dominant institutions governing SIFs. Original data collected from 146 organizations are analyzed by moderated multiple regression in two empirical studies: Study 1 on effects on SIFs' venture philanthropy practices, and Study 2 on effects on SIFs' social and financial performance. Multiple imputation, diagnostic analyses, and several post hoc analyses are also conducted to check the robustness of the data and of the results of the multiple regression analyses. The results show that EO and venture capital institutional forces both enhance SIFs' venture philanthropy practices. A hypothesis postulating a negative relationship between nonprofit status and venture philanthropy practices is also supported. Results from moderated regression analyses, along with subgroup and EO subdimension analyses, confirm a moderating effect between EO and nonprofit status, i.e., a regulative institutional pressure. A positive relationship is found between EO and financial performance, but not between EO and social performance. While support is lent to the hypotheses relating social and financial performance to donors' and investors' demand for social outcomes and to the management team's training in business, the overall results remain mixed for Study 2.
Nonetheless, this dissertation appears to be the first study to theorize and test EO as a micro-level condition enabling organizations to strategically shape and resist institutional pressures, and it reinforces that organizations' behavior is not merely a product of their passive conformity to environmental forces, but also of agency. As such, this study aims to contribute to scholarly efforts by the "agency camp" of the new institutionalism and EO, answering a call from the leading scholars of both EO (Miller) and the new institutionalism (Oliver).