Dissertations / Theses on the topic 'Bayesian LASSO'
Consult the top 24 dissertations / theses for your research on the topic 'Bayesian LASSO.'
Han, Yuchen. "Bayesian Variable Selection Using Lasso." Case Western Reserve University School of Graduate Studies / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=case1491775118610981.
Xing, Guan. "LASSOING MIXTURES AND BAYESIAN ROBUST ESTIMATION." Case Western Reserve University School of Graduate Studies / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=case1164135815.
Gao, Di. "Bayesian Lasso Models – With Application to Sports Data." Diss., North Dakota State University, 2018. https://hdl.handle.net/10365/27949.
Joo, LiJin. "Bayesian lasso: An extension for genome-wide association study." Thesis, New York University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10243856.
In genome-wide association studies (GWAS), variable selection has been used to prioritize candidate single-nucleotide polymorphisms (SNPs). To relate densely located SNPs to a complex trait, we need a method that is robust under various genetic architectures yet sensitive enough to detect the marginal difference between null and non-null factors. For this problem, the ordinary Lasso produced too many false positives, and the Bayesian Lasso fitted by Gibbs samplers became too conservative when the selection criterion was posterior credible sets. My proposals to improve the Bayesian Lasso have two aspects: first, to use stochastic approximation, variational Bayes for increasing computational efficiency; second, to use a Dirichlet-Laplace prior to better separate small effects from nulls. Both the double exponential prior of the Bayesian Lasso and the Dirichlet-Laplace prior have a global-local mixture representation, and variational Bayes can effectively handle the hierarchies of a model because of this mixture representation. In the analysis of simulated and real sequencing data, the proposed methods showed meaningful improvements in both efficiency and accuracy.
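As a hedged illustration of the mixture representation mentioned in this abstract, the double exponential (Laplace) prior can be written as a scale mixture of normals. The notation below (design matrix $X$, local scales $\tau_j^2$, global parameter $\lambda$) follows the commonly used Bayesian Lasso hierarchy and is an assumption; it is not necessarily the notation used in the thesis.

```latex
\begin{align*}
y \mid \beta, \sigma^2 &\sim \mathcal{N}_n\!\left(X\beta,\ \sigma^2 I_n\right), \\
\beta_j \mid \sigma^2, \tau_j^2 &\sim \mathcal{N}\!\left(0,\ \sigma^2 \tau_j^2\right), \\
\tau_j^2 &\sim \mathrm{Exp}\!\left(\lambda^2 / 2\right), \qquad j = 1, \dots, p, \\[4pt]
\pi(\beta_j \mid \sigma^2)
  &= \int_0^\infty \mathcal{N}\!\left(\beta_j \mid 0,\ \sigma^2 \tau_j^2\right)
     \frac{\lambda^2}{2}\, e^{-\lambda^2 \tau_j^2 / 2}\, d\tau_j^2
   = \frac{\lambda}{2\sigma}\, e^{-\lambda |\beta_j| / \sigma}.
\end{align*}
```

Here $\lambda$ plays the global role and each $\tau_j^2$ the local role, which is what makes conditionally conjugate updates (Gibbs or variational) available.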
Zhou, Xiaofei. "Bayesian Lasso for Detecting Rare Genetic Variants Associated with Common Diseases." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1563455460578675.
Wang, Meng. "Family-Based Bayesian LASSO for Detecting Association of Rare Haplotypes with Common Diseases." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1398896091.
Zhang, Yiran. "Bayesian Variable Selection for High-Dimensional Data with an Ordinal Response." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1565283865507018.
Xia, Shuang. "Detecting Rare Haplotype-Environment Interaction and Dynamic Effects of Rare Haplotypes using Logistic Bayesian LASSO." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1406246686.
Fragoso, Tiago de Miranda. "Seleção bayesiana de variáveis em modelos multiníveis da teoria de resposta ao item com aplicações em genômica." Universidade de São Paulo, 2014. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-14112014-110028/.
Recent investigations into the genetic architecture of complex diseases use different sources of information. Different symptoms are measured to obtain a diagnosis, individuals may not be independent because of kinship or a common environment, and their genetic makeup may be measured through a large number of genetic markers. In the present work, a multilevel item response theory (IRT) model is proposed that unifies all these different sources of information through a latent variable. Furthermore, the large number of molecular markers induces a variable selection problem, for which procedures based on stochastic search variable selection and the Bayesian LASSO are considered. Parameter estimation and variable selection are conducted under a Bayesian framework in which a Markov chain Monte Carlo algorithm is derived and implemented to obtain posterior distribution samples. The estimation procedure is validated through a series of simulation studies in which parameter recovery, variable selection and estimation error are evaluated in scenarios similar to the real dataset. The estimation procedure showed adequate recovery of the structural parameters and the capability to correctly find a large number of the covariates even in high-dimensional settings, although it also produced biased estimates for the incidental latent variables. The proposed methods were then applied to the real dataset collected in the 'Corações de Baependi' family association study and were able to appropriately model the metabolic syndrome, a series of symptoms associated with elevated heart failure and diabetes risk. The multilevel model produced a latent trait that could be identified with the syndrome, and an associated molecular marker was found.
Zhang, Han. "Detecting Rare Haplotype-Environmental Interaction and Nonlinear Effects of Rare Haplotypes using Bayesian LASSO on Quantitative Traits." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu149969433115895.
Azevedo, Camila Ferreira. "Ridge, lasso and bayesian additive-dominance genomic models and new estimators for the experimental accuracy of genome selection." Universidade Federal de Viçosa, 2015. http://www.locus.ufv.br/handle/123456789/7176.
The main contribution of molecular genetics to breeding is the direct use of DNA information to identify genetically superior individuals. Genome-wide selection (GWS) was conceived for this purpose. GWS consists of analyzing a large number of SNP markers widely distributed in the genome. This simulation work presents a complete approach to genomic selection using adequate genetic models that include dominance effects, which are essential for selecting crosses and clones as well as for improving the estimation of additive effects for parent selection. To date, the approaches via Ridge, Lasso and Bayesian additive-dominance models have not been evaluated and compared in the literature. The performance of 10 additive-dominance prediction models (including current ones and proposed modifications) was evaluated. A new modified Bayesian/Lasso method (called BayesA*B* or t-BLASSO) performed best in predicting the genomic breeding values of individuals in all four scenarios (two heritabilities × two genetic architectures). The BayesA*B*-type methods showed a better ability to recover the ratio of dominance variance to additive variance. In addition, the role of the three quantitative genetics information sources (linkage disequilibrium, co-segregation and pedigree relationships) in genomic selection was elucidated by decomposing the heritability and accuracy into the three components and showing their relations with population structure and genetic improvement in the short and long run. Moreover, this simulation work also developed two new estimators for the prediction accuracy of genomic selection. The work proposes and evaluates the performance and efficiency of these new estimators, called the regularized estimator (RE) and the hybrid estimator (HE). The regularized estimator takes into consideration both the genomic and trait heritabilities, in addition to the predictive ability. The hybrid estimator (HE) combines the experimental and expected accuracies. RE and HE were compared with the traditional estimator (TE) under four validation procedures. In general, RE presented accuracies closer to the parametric ones, mainly when markers were selected. It was also less biased and more precise, with smaller standard deviations than the traditional estimator. The TE can be used only with independent validation, where it tends to perform better than RE, although it overestimates the accuracy. The hybrid estimator (HE) proved to be very effective in the absence of validation. Independent validation proved superior to the Jackknife procedures, tracking the parametric accuracy more closely with or without marker selection. The following inferences can be made according to the accuracy estimator and kind of validation: (i) most probable accuracy: HE without validation; (ii) highest possible accuracy (overestimated accuracy): TE with independent validation; (iii) lowest possible accuracy (underestimated accuracy): RE with independent validation.
No funding agency.
Ocloo, Isaac Xoese. "Energy Distance Correlation with Extended Bayesian Information Criteria for feature selection in high dimensional models." Bowling Green State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1625238661031258.
Marques, Matheus Augustus Pumputis. "Análise e comparação de alguns métodos alternativos de seleção de variáveis preditoras no modelo de regressão linear." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-23082018-210710/.
In this work, some new variable selection methods that have appeared in the last 15 years in the context of linear regression are studied, specifically Least Angle Regression (LARS), Noise Addition Model Selection (NAMS), the False Selection Rate (FSR), the Bayesian LASSO and the Spike-and-Slab LASSO. The studied methods are analyzed and compared. Applications to real data sets and a simulation study follow, in which all methods are shown to be promising, with the Bayesian methods showing the best results.
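To make the Bayesian LASSO named in this abstract concrete, here is a minimal Gibbs sampler sketch. It assumes the widely used hierarchy with a fixed penalty `lam`, a centered response and centered predictors; the function name `bayesian_lasso_gibbs` and all parameter names are hypothetical, and this is not the implementation used in the thesis.

```python
import numpy as np

def bayesian_lasso_gibbs(X, y, lam=1.0, n_iter=2000, burn_in=500, seed=0):
    """Sketch of a Gibbs sampler for the Bayesian lasso with fixed lam.

    Returns posterior draws of the coefficients, shape (n_iter - burn_in, p).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    X = X - X.mean(axis=0)      # center predictors
    y = y - y.mean()            # center response, so no intercept is sampled
    XtX, Xty = X.T @ X, X.T @ y

    sigma2 = 1.0
    inv_tau2 = np.ones(p)       # current values of 1 / tau_j^2
    draws = np.empty((n_iter - burn_in, p))

    for it in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma2 * A^{-1}), A = X'X + diag(1/tau^2)
        A = XtX + np.diag(inv_tau2)
        L = np.linalg.cholesky(A)                    # A = L L^T
        mean = np.linalg.solve(A, Xty)
        z = rng.standard_normal(p)
        beta = mean + np.sqrt(sigma2) * np.linalg.solve(L.T, z)

        # sigma2 | rest ~ Inverse-Gamma(shape, scale)
        resid = y - X @ beta
        shape = 0.5 * (n - 1 + p)
        scale = 0.5 * (resid @ resid + beta @ (inv_tau2 * beta))
        sigma2 = scale / rng.gamma(shape)            # scale / Gamma(shape, 1)

        # 1/tau_j^2 | rest ~ Inverse-Gaussian(mean=mu_j, shape=lam^2)
        mu = np.sqrt(lam**2 * sigma2 / np.maximum(beta**2, 1e-12))
        inv_tau2 = rng.wald(mu, lam**2)

        if it >= burn_in:
            draws[it - burn_in] = beta
    return draws
```

Posterior means (`draws.mean(axis=0)`) give shrunken coefficient estimates, and per-coefficient credible intervals from `np.percentile(draws, [2.5, 97.5], axis=0)` are one possible selection criterion. In practice `lam` is often given a hyperprior or chosen by empirical Bayes rather than fixed as it is here.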
Hidalgo, André Marubayashi. "Fine mapping and single nucleotide polymorphism effects estimation on pig chromosomes 1, 4, 7, 8, 17 and X." Universidade Federal de Viçosa, 2011. http://locus.ufv.br/handle/123456789/4753.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Quantitative trait loci (QTL) mapping efforts often result in the detection of genomic regions that explain part of the quantitative trait variation. However, these regions are very large and do not allow accurate gene identification, so the interval in which the QTL is located must be narrowed. With genome-wide selection (GWS), many statistical tools have been developed to estimate the effect of each marker. From the estimated marker effects it is possible to determine which markers have large effects. Hence, the objective of this investigation was to fine map pig chromosomes 1, 4, 7, 8, 17 and X, using microsatellite and SNP markers, in an F2 population produced by crossing naturalized Brazilian Piau boars with commercial females, associated with performance, carcass, internal organ, cut yield and meat quality traits. A further aim was to estimate the effects of single nucleotide polymorphism (SNP) markers on traits with detected QTL, identify the markers with the largest effects and verify whether those markers were indeed within the QTL confidence interval. QTL were identified by regression interval mapping using the GridQTL software. Individual marker effects were estimated by Bayesian LASSO regression using the R software. In total, 32 QTL for the studied traits were significant at the 5% chromosome-wide level, including 12 significant at the 1% chromosome-wide level and 7 significant at the 5% genome-wide level. Six of the seven QTL with genome-wide significance had markers of large effect within their confidence interval. These results confirmed some previously reported QTL and identified numerous novel QTL for the investigated traits. Our results show that using microsatellite and SNP markers together increases genome saturation and leads to QTL with smaller confidence intervals. The methods used were also valuable for estimating the marker effects and locating the markers with the largest effects within the QTL confidence interval, validating the QTL found by the regression method.
Hashem, Hussein Abdulahman. "Regularized and robust regression methods for high dimensional data." Thesis, Brunel University, 2014. http://bura.brunel.ac.uk/handle/2438/9197.
Bitto, Angela, and Sylvia Frühwirth-Schnatter. "Achieving shrinkage in a time-varying parameter model framework." Elsevier, 2019. http://dx.doi.org/10.1016/j.jeconom.2018.11.006.
Nicolazzi, Ezequiel Luis. "New trends in dairy cattle genetic evaluation." Doctoral thesis, Università Cattolica del Sacro Cuore, 2011. http://hdl.handle.net/10280/966.
Genetic evaluation systems are developing rapidly worldwide. In most countries, “traditional” breeding programs based on phenotypes and relationships between animals are currently being integrated with molecular information and might in the future be replaced by it. This thesis stands in this transition period and therefore covers research on both types of genetic evaluation: from the assessment of the accuracy of (traditional) international genetic evaluations to the study of statistical methods used to integrate genomic information into breeding (genomic selection). Three chapters investigate and evaluate approaches for estimating genetic values from genomic data while reducing the number of independent variables: Bonferroni correction and permutation tests combined with single-marker regression (Chapter III), principal component analysis combined with BLUP (Chapter IV), and Fst across breeds combined with BayesA (Chapter VI). In addition, Chapter V analyzes the accuracy of direct genomic values with BLUP, BayesA and Bayesian LASSO including all available variables. The results of this thesis indicate that the genetic gains expected from the analysis of simulated data can be obtained on real data. Still, further research is needed to optimize the use of genome-wide information and obtain the best possible estimates for all traits under selection.
Karlsson, Jonas, and Roger Karlsson. "Inkrementell responsanalys : Vilka kunder bör väljas vid riktad marknadsföring?" Thesis, Linköpings universitet, Statistik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-96593.
Kim, Byung-Jun. "Semiparametric and Nonparametric Methods for Complex Data." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/99155.
Doctor of Philosophy
With the development of science, technology, and study designs over the past few decades, a variety of complex data have emerged in many research fields such as epidemiology, genomics, and analytical chemistry. For example, in epidemiology, the matched case-crossover study design is used to investigate the association between clustered binary disease outcomes and a covariate measured with error within a certain period, stratifying on subjects' conditions. In genomics, highly correlated and high-dimensional (HCHD) data are needed to identify important genes and their interaction effects on diseases. In analytical chemistry, multiple time series data are generated to recognize complex patterns among multiple classes. Because of this diversity, we encounter three problems in analyzing three types of data: (1) matched case-crossover data, (2) HCHD data, and (3) time-series data. We contribute to the development of statistical methods to deal with such complex data. First, under the matched study design, we discuss hypothesis testing to effectively determine the association between observed factors and the risk of the disease of interest. Because in practice we do not know the specific form of the association, it can be challenging to set a specific alternative hypothesis. To reflect reality, we consider the possibility that some observations are measured with error. Taking these measurement errors into account, we develop a testing procedure under the matched case-crossover framework. This testing procedure has the flexibility to make inferences under various hypothesis settings. Second, we consider data in which the number of variables is very large compared to the sample size and the variables are correlated with each other. In this case, our goal is to identify the variables that are important for the outcome among a large number of variables and to build their network. For example, identifying a few genes associated with diabetes among the whole genome can be used to develop biomarkers. With the approach proposed in the second project, we can identify differentially expressed and important genes and their network structure while taking the outcome into account. Lastly, we consider the scenario of patterns of interest that change over time, with application to gas chromatography. We propose an efficient detection method to distinguish the patterns of multi-level subjects in time-trend analysis. Our proposed method can provide valuable information for efficiently searching for distinguishable patterns, reducing the burden of examining all observations in the data.
Fu, Haoda. "Sparsity and smoothness for disease rate mapping via spatial bayesian lasso." 2007. http://www.library.wisc.edu/databases/connect/dissertations.html.
"Bayesian model selection for semiparametric structural equation models with modified group Lasso." 2014. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1291541.
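In practical applications of structural equation models, choosing an appropriate model is a central problem. However, because of the complexity of the model, model selection for semiparametric structural equation models with functional structures is very difficult. In this thesis, we propose a new Bayesian adaptive group Lasso and apply it to perform parameter estimation and model selection simultaneously for semiparametric structural equation models. We introduce a partially linear structure into the nonparametric structural equation model and approximate the unknown functions in the structural equation through a new basis function expansion. This structure combines the advantages of linear and nonparametric models. The proposed method can automatically identify the nonlinear and linear structures in a semiparametric structural equation model and screen out unimportant variables. The group Lasso with adaptive penalties not only reduces the bias that the traditional Lasso produces in parameter estimation, but also resolves the model selection difficulties caused by the group effects and correlations that arise from the basis representation of the latent variables. Simulation results show that the proposed method is very effective. We also apply the proposed method to a data set on diabetic nephropathy and obtain some meaningful results.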
Feng, Xiangnan.
Thesis M.Phil. Chinese University of Hong Kong 2014.
Includes bibliographical references (leaves 51-56).
Abstracts also in Chinese.
Title from PDF title page (viewed on 18, October, 2016).
Detailed summary in vernacular field only.
Adjogou, Adjobo Folly Dzigbodi. "Analyse statistique de données fonctionnelles à structures complexes." Thèse, 2017. http://hdl.handle.net/1866/20581.
Paccapelo, María Valeria. "Modelos de selección genómica para caracteres cuantitativos basados en marcadores moleculares aplicados al mejoramiento de maíz." Master's thesis, 2015. http://hdl.handle.net/11086/2355.
Genomic selection (GS) models have recently gained great importance because they allow the genetic values of individuals to be predicted from molecular markers (MM). Incorporating numerous MM into regression models leads to problems of dimensionality and multicollinearity. This thesis aimed to evaluate six GS methods that address these difficulties (variable selection, penalized estimation, and a combination of both) from classical or Bayesian approaches, and to assess their predictive ability for three phenotypic traits observed in 20 maize (Zea mays L.) populations. The results indicate that predictive ability was associated with the heritability of the trait and was higher for the penalized methods, among which ridge regression via mixed models (RR-BLUP) is recommended. This work made it possible to analyze different statistical techniques applied to GS in the setting of a maize breeding program.
Affiliation: Paccapelo, María Valeria. Universidad Nacional de Córdoba; Argentina.
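The penalized estimation and predictive ability discussed in this abstract can be sketched as follows. This is a minimal illustration, assuming a plain ridge estimator on centered marker codes with a user-supplied penalty `lam`, and predictive ability measured as the correlation between predicted and observed phenotypes in held-out individuals. The function names are hypothetical and do not come from the thesis. In RR-BLUP the penalty corresponds to the ratio of residual variance to marker-effect variance, which is estimated by a mixed model rather than fixed as it is here.

```python
import numpy as np

def ridge_marker_effects(Z, y, lam):
    """Ridge estimates of marker effects: (Z'Z + lam * I)^{-1} Z'y on centered data."""
    Zc = Z - Z.mean(axis=0)
    yc = y - y.mean()
    p = Z.shape[1]
    return np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ yc)

def predictive_ability(Z_train, y_train, Z_test, y_test, lam):
    """Correlation between predicted and observed phenotypes in the test set."""
    effects = ridge_marker_effects(Z_train, y_train, lam)
    # Center test markers with the training means, then add back the training mean of y.
    pred = (Z_test - Z_train.mean(axis=0)) @ effects + y_train.mean()
    return np.corrcoef(pred, y_test)[0, 1]
```

Splitting the individuals into training and validation sets and calling `predictive_ability` on each split is one simple way to compare penalty values or methods.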