Dissertations / Theses on the topic 'Genomics. Bioinformatics'

Consult the top 50 dissertations / theses for your research on the topic 'Genomics. Bioinformatics.'

1

Meng, Da. "Bioinformatics tools for evaluating microbial relationships." Pullman, Wash. : Washington State University, 2009. http://www.dissertations.wsu.edu/Dissertations/Spring2009/d_meng_042209.pdf.

Abstract:
Thesis (Ph.D.)--Washington State University, May 2009. Title from PDF title page (viewed on June 8, 2009). "School of Electrical Engineering and Computer Science." Includes bibliographical references.
2

Hvidsten, Torgeir R. "Predicting Function of Genes and Proteins from Sequence, Structure and Expression Data." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis : Univ.-bibl. [distributör], 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-4490.

3

Åkerborg, Örjan. "Taking advantage of phylogenetic trees in comparative genomics." Doctoral thesis, KTH, Beräkningsbiologi, CB, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4757.

Abstract:
Phylogenomics can be regarded as evolution and genomics in co-operation. Various kinds of evolutionary studies, gene family analysis among them, demand access to genome-scale datasets. But it is also clear that many genomics studies, such as assignment of gene function, are much improved by evolutionary analysis. The work leading to this thesis is a contribution to the phylogenomics field. We have used phylogenetic relationships between species in genome-scale searches for two intriguing genomic features, namely pseudogenes and A-to-I RNA editing. In the first case we used pairwise species comparisons, specifically human-mouse and human-chimpanzee, to infer the existence of functional mammalian pseudogenes. In the second case we profited from the rapid growth in the number of sequenced genomes in recent years, and used 17-species multiple sequence alignments. In both these studies we used non-genomic data, gene expression data and synteny relations among these, to verify predictions. In the A-to-I editing project we used 454 sequencing for experimental verification. We have further contributed a maximum a posteriori (MAP) method for fast and accurate dating analysis of speciations and other evolutionary events. This work follows the recent trend of abandoning the strict molecular clock when performing phylogenetic inference. We discretised the time interval from the leaves to the root of the tree, and used a dynamic programming (DP) algorithm to optimally factorise branch lengths into substitution rates and divergence times. We analysed two biological datasets and compared our results with recent MCMC-based methodologies. The dating point estimates that our method delivers were found to be of high quality, while the gain in speed was dramatic. Finally, we applied the DP strategy in a new setting. This time we used a grid laid out on a species tree instead of on an interval. Together with the speciation times, the discretisation gives a common timeframe for a gene tree and the corresponding species tree. This is the key to integrating the sequence evolution process and the gene evolution process. Out of several potential application areas we chose gene tree reconstruction. We performed genome-wide analysis of yeast gene families and found that our methodology performs very well.
4

Andrade, Jorge. "Grid and High-Performance Computing for Applied Bioinformatics." Doctoral thesis, Stockholm : Bioteknologi, Kungliga Tekniska högskolan, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4573.

5

Hervás, Fernàndez Sergi. "Population genomics in Drosophila melanogaster: a bioinformatics approach." Doctoral thesis, Universitat Autònoma de Barcelona, 2018. http://hdl.handle.net/10803/665851.

Abstract:
High-throughput sequencing technologies are allowing the description of genome-wide variation patterns for an ever-growing number of organisms. However, we still lack a thorough comprehension of the relative amount of different types of genetic variation, their phenotypic effects, and the detection and quantification of distinct selection regimes acting on genomes. The recent compilation of more than one thousand wild-derived Drosophila melanogaster genome sequences from around the world, reassembled using a standardized pipeline (Drosophila Genome Nexus, DGN, Lack et al. 2015, 2016), provides a unique resource to test molecular population genetics hypotheses and ultimately to understand the evolutionary dynamics of genetic variation in populations. In addition, the increasing amount of genomic data available requires the continuous development and optimization of bioinformatics tools able to handle and analyze such information. Thus, the development and implementation of new biologically-oriented software addressing every step from data acquisition, filtering, processing, display and analysis to final reporting is a constantly growing need, especially in fields dealing with large data sets, such as population genomics. This thesis is conceived as a comprehensive bioinformatics and population genomics project. It is centered on the development and application of bioinformatics tools for the analysis and visualization of nucleotide variation patterns and the detection of selective events in the genome of D. melanogaster, using the DGN data. The main goal is accomplished in three sequential steps: (i) capture the evolutionary properties of the analyzed sequences (i.e., create a catalog of population genetics metrics) and implement a tool for the graphical display of such information; (ii) develop a statistical package for the computation of the diverse selection regimes acting on genomes (positive and purifying selection); and finally (iii) perform an initial population genomics analysis in D. melanogaster using the previously developed tools. The common approach applied to process the data, starting at the assembly of genome sequences and ending at the estimates of population genetics metrics, allows, for the first time, a comprehensive comparison and interpretation of results using samples from five continents. Overall, this work provides a global overview of the nucleotide variation and adaptation patterns along the genome, and a general assessment of the relative impact of the major genomic determinants of genetic variation, in Drosophila meta-populations of different geographical origins.
6

Kemmer, Danielle. "Genomics and bioinformatics approaches to functional gene annotation." Stockholm, 2006. http://diss.kib.ki.se/2006/91-7140-636-0/.

7

Johansson, Annelie. "Identifying gene regulatory interactions using functional genomics data." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-230285.

Abstract:
Previous studies used the correlation of DNase I hypersensitive sites sequencing (DNase-seq) experiments to predict interactions between enhancers and their target gene promoters. We investigate two correlation methods, Pearson's correlation and Mutual Information, using DNase-seq data for 100 cell types in regions on chromosome one. To assess their performance, we compared our correlation scores to Hi-C data from Jin et al. 2013. We show that performance is low when compared with the Hi-C data, and that there is a need for improved correlation metrics. We also demonstrate that the use of Hi-C data as a gold standard is limited because of its low resolution, and we suggest using another gold standard in further studies.
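The two scores named in this abstract can be illustrated with a minimal Python sketch. It is not the thesis code: the signal values, the equal-width binning and the function names (`pearson`, `discretise`, `mutual_information`) are assumptions made for illustration, with each list standing in for the DNase-seq signal of one region across cell types.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation between two equally long signal vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant signal carries no linear information
    return sxy / math.sqrt(sxx * syy)

def discretise(values, bins):
    """Assign each value to one of `bins` equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=4):
    """Histogram-based mutual information in bits."""
    bx, by = discretise(x, bins), discretise(y, bins)
    n = len(bx)
    joint, cx, cy = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log2(c * n / (cx[i] * cy[j]))
               for (i, j), c in joint.items())

# Hypothetical signals for one enhancer and one promoter across 8 cell types.
enhancer = [0.1, 0.4, 0.9, 1.5, 2.0, 2.2, 3.1, 3.8]
promoter = [2 * v + 1 for v in enhancer]
print(pearson(enhancer, promoter), mutual_information(enhancer, promoter))
```

Pearson's correlation only captures linear dependence, whereas the binned mutual information can also respond to non-linear relationships, at the cost of depending on the chosen number of bins.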
8

Novotny, Marian. "Applications of Structural Bioinformatics for the Structural Genomics Era." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-7593.

9

Janvid, Vincent. "Building a genomic variant based prediction model for lung cancer toxicity." Thesis, KTH, Tillämpad fysik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297411.

Abstract:
Since the completion of the Human Genome Project in 2003, the evident complexity of our genome and its regulation has only grown. The idea that having sequenced the human genome would solve this mystery was quickly discarded. With the decreasing costs of DNA sequencing, a plethora of new methods have evolved to further understand the role of the non-coding regions of our genome, which make up 98% of its length. Genetic variations in these regions are therefore abundant in the human population, but their effects are hard to characterize. Many non-coding variants have been linked to complex diseases such as cancer predisposition. This thesis aims to investigate the potential effects of non-coding variants on drug toxicity, that is, how severe the adverse effects of a drug are for the treated patients. More specifically, it studies the effects of two cancer drugs, Gemcitabine and Carboplatin, on a set of 96 patients with lung cancer. To do this we use spatial data acquired by the promoter-targeting method HiCap as well as expression data obtained from blood cell lines. Using the variants obtained through whole genome sequencing of the patients, a supervised learning approach was attempted to predict the final toxicity experienced by the patients. The large number of variants present among the comparatively few patients resulted in poor accuracy. The conclusion was drawn that the resolution of HiCap is too low compared to the density of variants in the non-coding regions. Additional data, such as transcription factor ChIP-seq data and transcription factor motifs, are needed to locate potentially contributing variants within the interactions.
Summary (translated from the Swedish): Since the first sequencing of the human genome in 2003, our picture of our genome and how it is regulated has only become more complex. The idea that having access to a complete genome would solve this mystery was quickly discarded. With the falling costs of sequencing, a wide range of new methods has been developed to better understand the role of the non-coding regions in our genome. Because these regions make up 98% of our DNA, they contain great variation across the human population, but predicting their effect is very difficult. Many non-coding variants have been linked to complex diseases, such as an increased risk of cancer. This thesis aims to investigate the potential effects of non-coding variants on how severe the adverse effects of a cancer treatment are for a patient. More specifically, it examines the effects of two drugs, Gemcitabine and Carboplatin, on 96 lung cancer patients. For this, spatial data and gene expression data from blood cell lines are used. Starting from the genetic variants in the patients' sequenced genomes, supervised learning was tested to predict the degree of adverse effects in the patients. The large number of variants carried by the comparatively few patients resulted in low accuracy of the predictor. The conclusion was drawn that the resolution of HiCap is too low compared with the high density of variants in non-coding regions. More data, such as ChIP-seq data from transcription factors and their specific binding sequences, are needed to locate variants within an interaction that could potentially influence the adverse effects.
10

Cleary, Alan Michael. "Computational Pan-Genomics: Algorithms and Applications." Thesis, Montana State University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10792396.

Abstract:
As the cost of sequencing DNA continues to drop, the number of sequenced genomes rapidly grows. In the recent past, the cost dropped so low that it is no longer prohibitively expensive to sequence multiple genomes for the same species. This has led to a shift from the single reference genome per species paradigm to the more comprehensive pan-genomics approach, where populations of genomes from one or more species are analyzed together. The total genomic content of a population is vast, requiring algorithms for analysis that are more sophisticated and scalable than existing methods. In this dissertation, we explore new algorithms and their applications to pan-genome analysis, both at the nucleotide and genic resolutions. Specifically, we present the Approximate Frequent Subpaths and Frequented Regions problems as a means of mining syntenic blocks from pan-genomic de Bruijn graphs and provide efficient algorithms for mining these structures. We then explore a variety of analyses that mining synteny blocks from pan-genomic data enables, including meaningful visualization, genome classification, and multidimensional scaling. We also present a novel interactive data mining tool for pan-genome analysis, the Genome Context Viewer, which allows users to explore pan-genomic data distributed across a heterogeneous set of data providers by using gene family annotations as a unit of search and comparison. Using this approach, the tool is able to perform traditionally cumbersome analyses on demand in a federated manner.
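For readers unfamiliar with the data structure this abstract builds on, the following Python sketch constructs a small pan-genomic de Bruijn graph and lists the edges shared by several genomes. It is only an illustration: it does not implement the dissertation's Approximate Frequent Subpaths or Frequented Regions algorithms, and the function names, toy sequences and the simple "shared by at least N genomes" criterion are assumptions.

```python
from collections import defaultdict

def build_pan_debruijn(genomes, k):
    """Map each k-mer edge (prefix -> suffix) to the genomes that contain it."""
    edges = defaultdict(set)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Nodes are (k-1)-mers; each k-mer is an edge between two nodes.
            edges[(kmer[:-1], kmer[1:])].add(name)
    return edges

def shared_edges(edges, min_genomes):
    """Edges traversed by at least `min_genomes` genomes (a crude conservation proxy)."""
    return {edge for edge, names in edges.items() if len(names) >= min_genomes}

# Two hypothetical toy genomes that share a common prefix.
genomes = {"g1": "ACGTAC", "g2": "ACGTTC"}
graph = build_pan_debruijn(genomes, k=3)
print(sorted(shared_edges(graph, min_genomes=2)))
```

Runs of consecutive shared edges are the kind of signal that more sophisticated methods generalise when mining syntenic blocks from such graphs.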
11

Podowski, Raf M. "Applied bioinformatics for gene characterization." Stockholm, 2006. http://diss.kib.ki.se/2006/91-7140-818-5/.

12

Li, Yang. "Statistical Methods for Large-Scale Integrative Genomics." Thesis, Harvard University, 2016. http://nrs.harvard.edu/urn-3:HUL.InstRepos:33493551.

Abstract:
In the past 20 years, we have witnessed significant advances in high-throughput genetic and genomic technologies. With the massive amounts of genomics data being generated, there is a pressing need for statistical methods that can use them to make quantitative inferences on substantive scientific questions. My research has focused on statistical methods for large-scale integrative genomics. The human genome encodes more than 20,000 genes, while the functions of about 50% (>10,000) of these genes remain unknown to date. The determination of the functions of the poorly characterized genes is crucial for understanding biological processes and human diseases. In the era of Big Data, the availability of massive genomic data provides an unprecedented opportunity to identify associations between genes and predict their biological functions. Genome sequencing data and mRNA expression data are the two most important classes of genomic data. This thesis presents three research projects in self-contained chapters: (1) a statistical framework for inferring the evolutionary history of human genes and identifying gene modules with shared evolutionary history from genome sequencing data, (2) a statistical method to predict frequent and specific gene co-expression by integrating a large number of mRNA expression datasets, and (3) robust variable and interaction selection for high-dimensional classification problems under the discriminant analysis and logistic regression models. Chapter 1. Humans have more than 20,000 genes, but so far most of their functions are uncharacterized. Determining the function of poorly characterized genes is crucial for understanding biological processes and studying human diseases. Functionally associated genes tend to be gained and lost simultaneously during evolution; therefore, identifying co-evolution of genes predicts gene-gene associations.
In this chapter, we propose a mixture of tree-structured hidden Markov models for the gene evolution process, and a Bayesian model-based clustering algorithm to detect gene modules with shared evolutionary history (termed evolutionarily conserved modules, ECMs). A Dirichlet process prior is adopted to estimate the number of gene clusters, and an efficient Gibbs sampler is developed for posterior computation. Through a simulation study and benchmarks on real data sets, we show that our algorithm outperforms traditional methods that use simple metrics (e.g., Hamming distance, Pearson correlation) to measure the similarity between gene presence/absence patterns. We applied our method to 1,025 canonical human pathway gene sets and found that a large portion of the detected gene associations are substantiated by other sources of evidence. The remaining genes have predicted functions that are high-priority candidates for verification by further biological experiments. Chapter 2. The availability of gene expression measurements across thousands of experimental conditions provides the opportunity to predict gene function based on shared mRNA expression. While many biological complexes and pathways are coordinately expressed, their genes may be organized into co-expression modules with distinct patterns in certain tissues or conditions, which can provide insight into pathway organization and function. We developed the algorithm CLIC (clustering by inferred co-expression, www.gene-clic.org) that clusters a set of functionally-related genes into co-expressed modules, highlights the most relevant datasets, and predicts additional co-expressed genes. Using a statistical Bayesian partition model, CLIC simultaneously partitions the input gene set into disjoint co-expression modules and weights the most relevant datasets for each module. CLIC then expands each module with additional members that co-express with the module’s genes more than the background model in the weighted datasets.
We applied CLIC to (i) model the background correlation in each of 3,662 mouse and human microarray datasets from the Gene Expression Omnibus (GEO), (ii) partition each of 900 annotated complexes/pathways into co-expression modules, and (iii) expand each co-expression module with additional genes showing frequent and specific co-expression over multiple GEO datasets. CLIC provided very strong functional predictions for many completely uncharacterized genes, including a link between protein C7orf55 and the mitochondrial ATP synthase complex that we experimentally validated via CRISPR knock-out. CLIC software is freely available and should become increasingly powerful with the growing wealth of transcriptomic datasets. Chapter 3. Discriminant analysis and logistic regression are fundamental tools for classification problems. Quadratic discriminant analysis has the ability to exploit interaction effects of predictors, but the selection of interaction terms is non-trivial and the Gaussian assumption is often too restrictive for many real problems. Under the logistic regression framework, we propose a forward-backward method, SODA, for variable selection with both main and quadratic interaction terms. In the forward stage, a stepwise procedure is conducted to screen for important predictors with both main and interaction effects; in the backward stage, SODA removes insignificant terms so as to optimize the extended BIC (EBIC) criterion. Compared with existing methods for variable selection in quadratic discriminant analysis (e.g., Murphy et al., 2010; Zhang and Wang, 2011; Maugis et al., 2011), SODA can deal with high-dimensional data in which the number of predictors is much larger than the sample size, and it does not require the joint normality assumption on predictors, leading to much enhanced robustness. Theoretical analysis establishes the consistency of SODA in the high-dimensional setting.
Empirical performance of SODA is assessed on both simulated and real data and is found to be superior to all existing methods we have tested. For all three real datasets we studied, SODA selected more parsimonious models that achieved higher classification accuracies than the other tested methods.
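The simple baseline that Chapter 1 of this abstract compares against, Hamming distance between gene presence/absence patterns, can be sketched in a few lines of Python. This is only the baseline metric, not the thesis's tree-structured hidden Markov model; the gene names, the five-species profiles and the helper `rank_by_similarity` are hypothetical.

```python
def hamming(a, b):
    """Number of species in which two presence/absence profiles disagree."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same species")
    return sum(x != y for x, y in zip(a, b))

def rank_by_similarity(profiles, query):
    """Other genes sorted from most to least similar to the query profile."""
    target = profiles[query]
    scored = [(hamming(target, p), name)
              for name, p in profiles.items() if name != query]
    return [name for _, name in sorted(scored)]

# Hypothetical profiles: 1 = gene present in that species, 0 = absent.
profiles = {
    "geneA": [1, 1, 0, 1, 0],
    "geneB": [1, 1, 0, 1, 1],
    "geneC": [0, 0, 1, 0, 1],
}
ranking = rank_by_similarity(profiles, "geneA")
print(ranking)
```

A metric like this treats every species as independent and ignores the phylogeny, which is the limitation that a tree-aware model of gene gain and loss is designed to address.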
13

Perkins, J. R. "Functional genomics and bioinformatics protocols for the elucidation of pain." Thesis, University College London (University of London), 2013. http://discovery.ucl.ac.uk/1384822/.

Abstract:
Microarray technologies enable us to profile the expression of thousands of gene transcripts within a given cell or tissue. Within pain research they have been used extensively to search for genes that change in expression as a result of the induction of a clinically-relevant pain state, often using an animal model of pain. Studying these genes has led to improvements in our understanding of the genes, pathways and other biological processes involved in pain. These themes are explored further in the first (introductory) chapter of this thesis. These experiments result in large numbers of genes declared differentially expressed between samples, many of which are not directly involved in pain. There is often little overlap of these genes between different pain models. The second chapter of this thesis is concerned with the use of systems biology methods to prioritise these genes based on their likelihood of being pain-related. In the third chapter a web-based software application is described. It allows a pain researcher to combine data from various pain-related microarray experiments with other data sources in order to build their own pain networks. Exemplary usage scenarios are presented. The fourth chapter describes a comparison between microarrays and a new technology, RNA-seq, which uses next generation sequencing technology to quantify the RNA present within a tissue. Samples obtained using a well characterised animal pain model, spinal nerve transection, are used for this purpose. In the fifth chapter the effects of RNA-seq sequencing depth on the detection of differentially expressed genes and the discovery of novel transcribed regions of the genome are investigated. In keeping with the theme of gene expression profiling using animal models of pain, the sixth chapter of this thesis reports a software package for the analysis of high-throughput RT-qPCR data and presents an experiment in which this package was used to analyse cytokine expression.
14

Walter, Klaudia. "Statistical methods for comparative genomics in the field of bioinformatics." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.611909.

15

Canzler, Sebastian. "Insights into the Evolution of small nucleolar RNAs." Doctoral thesis, Universitätsbibliothek Leipzig, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-217924.

Abstract:
Over the last decades, the formerly irrevocable belief that proteins are the only key factors in the complex regulatory machinery of a cell was crushed by a plethora of findings in all major eukaryotic lineages. These suggested a rugged landscape in the eukaryotic genome consisting of sequential, overlapping, or even bi-directional transcripts and myriads of regulatory elements. The vast part of the genome is indeed transcribed into an RNA intermediate, but solely a small fraction is finally translated into functional proteins. The sweeping majority, however, is either degraded or functions as a non-protein coding RNA (ncRNA). Due to continuous developments in experimental and computational research, the variety of ncRNA classes grew larger and larger, ranging from key processes in the cellular lifespan to regulatory processes that are driven and guided by ncRNAs. The bioinformatical part primarily concentrates on the prediction, annotation, and extraction of characteristic properties of novel ncRNAs. Due to conservation of sequence and/or structure, this task is often determined by a homology search that utilizes information about functional, and hence conserved, regions as an indicator. This thesis focuses mainly on a special class of ncRNAs, small nucleolar RNAs (snoRNAs). These abundant molecules are mainly responsible for the guidance of 2’-O-ribose-methylations and pseudouridylations in different types of RNAs, such as ribosomal and spliceosomal RNAs. Although the relevance of single modifications is still rather unclear, the elimination of a number of modifications is shown to cause severe effects, including lethality. Several de novo prediction programs have been published over the last years and a substantial number of publicly available snoRNA databases have emerged. Normally, these are restricted to a small number of species and a collection of experimentally extracted snoRNAs.
The detection of snoRNAs by means of wet lab experiments and/or de novo prediction tools is generally time consuming (wet lab) and a quite tedious task (identification of snoRNA-specific characteristics). The snoRNA annotation pipeline snoStrip was developed with the intention to circumvent these obstacles. It therefore utilizes a homology-based search procedure to reliably predict snoRNA genes in genomic sequences. In a subsequent step, all candidates are filtered with respect to specific sequence motifs and secondary structures. In a functional analysis, potential target sites are predicted in ribosomal and spliceosomal RNA sequences. In contrast to de novo prediction tools, snoStrip focuses on the extension of the known snoRNA world to uncharted organisms and the mapping and unification of the existing diversity of snoRNAs into functional, homologous families. The pipeline is properly suited to analyze a manifold set of organisms in search for their snoRNAome in short timescales. This offers the opportunity to generate large scale analyses over whole eukaryotic kingdoms to gain insights into the evolutionary history of these special ncRNA molecules. A set of experimentally validated snoRNA genes in Deuterostomia and Fungi were starting points for highly comprehensive surveys searching and analyzing the snoRNA repertoire in these two major eukaryotic clades. In both cases, the snoStrip pipeline proved itself as a fast and reliable tool and collected thousands of snoRNA genes in nearly 200 organisms. Additionally, the Interaction Conservation Index (ICI), which is amplified to additionally work on single lineages, provides a convenient measure to analyze and evaluate the conservation of snoRNA-targetRNA interactions across different species.
The massive amount of data and the possibility to score the conservation of predicted interactions constitute the main pillars to gain an extraordinary insight into the evolutionary history of snoRNAs on both the sequence and the functional level. A substantial part of the snoRNAome is traceable down to the root of both eukaryotic lineages and might indicate an even more ancient origin of these snoRNAs. However, a plenitude of lineage specific innovation and deletion events are also discernible. Due to its automated detection of homologous and functionally related snoRNA sequences, snoStrip identified extraordinary target switches in fungi. These unveiled a coupled evolutionary history of several snoRNA families that were previously thought to be independent. Although these findings are exceedingly interesting, the broad majority of snoRNA families is found to show remarkable conservation of the sequence and the predicted target interactions. On two occasions, this thesis will shift its focus from a genuine snoRNA inspection to an analysis of introns. Both investigations, however, are still conducted under an evolutionary viewpoint. In case of the ubiquitously present U3 snoRNA, functional genes in a notable amount of fungi are found to be disrupted by U2-dependent introns. The set of previously known U3 genes is considerably enlarged by an adapted snoStrip-search procedure. Intron-disrupted genes are found in several fungal lineages, while their precise insertion points within the snoRNA-precursor are located in a small and homologous region. A potential targetRNA of snoRNA genes, U6 snRNA, is also found to contain intronic sequences. Within this work, U6 genes are detected and annotated in nearly all fungal organisms. Although a few U6 intron-carrying genes have been known before, the widespread of these findings and the diversity regarding the particular insertion points are surprising.
Those U6 genes are commonly found to contain more than just one intron. In both cases of intron-disrupted non-coding RNA genes, the detected RNA molecules seem to be functional and the intronic sequences show remarkable sequence conservation for both their splice sites and the branch site. In summary, the snoStrip pipeline is shown to be a reliable and fast prediction tool that works on homology-based search principles. Large scale analyses on whole eukaryotic lineages become feasible on short notice. Furthermore, the automated detection of functionally related but not yet mapped snoRNA families adds a new layer of information. Based on surveys covering the evolutionary history of Fungi and Deuterostomia, profound insights into the evolutionary history of this ncRNA class are revealed suggesting ancient origin for a main part of the snoRNAome. Lineage specific innovation and deletion events are also found to occur at a large number of distinct timepoints.
16

Bradwell, Katie. "Genomic comparisons and genome architecture of divergent Trypanosoma species." VCU Scholars Compass, 2016. http://scholarscompass.vcu.edu/etd/4598.

Abstract:
Virulent Trypanosoma cruzi, and the non-pathogenic Trypanosoma conorhini and Trypanosoma rangeli are protozoan parasites with divergent lifestyles. T. cruzi and T. rangeli are endemic to Latin America, whereas T. conorhini is tropicopolitan. Reduviid bug vectors spread these parasites to mammalian hosts, within which T. rangeli and T. conorhini replicate extracellularly, while T. cruzi has intracellular stages. Firstly, this work compares the genomes of these parasites to understand their differing phenotypes. Secondly, genome architecture of T. cruzi is examined to address the effect of a complex hybridization history, polycistronic transcription, and genome plasticity on this organism, and study its highly repetitive nature and cryptic genome organization. Whole genome sequencing, assembly and comparison, as well as chromosome-scale genome mapping were employed. This study presents the first comprehensive whole-genome maps of Trypanosoma, and the first T. conorhini strain ever sequenced. Original contributions to knowledge include the ~21-25 Mbp assembled genomes of the less virulent T. cruzi G, T. rangeli AM80, and T. conorhini 025E, containing ~10,000 to 13,000 genes, and the ~36 Mbp genome assembly of highly virulent T. cruzi CL with ~24,000 genes. The T. cruzi strains exhibited ~74% identity to proteins of T. rangeli or T. conorhini. T. rangeli and T. conorhini displayed greater complex carbohydrate metabolic capabilities, and contained fewer retrotransposons and multigene family copies, e.g. mucins, DGF-1, and MASP, compared to T. cruzi. Although all four genomes appear highly syntenic, T. rangeli and T. conorhini exhibited greater karyotype conservation. T. cruzi genome architecture studies revealed 66 maps varying from 0.13 to 2.4 Mbp. At least 2.6% of the genome comprises highly repetitive repeat regions, and 7.4% exhibits repetitive regions barren of labels. The 66 putative chromosomes identified are likely diploid.
However, 20 of these maps contained regions of up to 1.25 Mbp of homology to at least one other map, suggestive of widespread segmental duplication or an ancient hybridization event that resulted in a genome with significant redundancy. Assembled genomes of these parasites closely reflect their phylogenetic relationships and give a greater context for understanding their divergent lifestyles. Genome mapping provides insight on the genomic evolution of these parasites.
17

Li, Yang. "Understanding lineage-specific biology through comparative genomics." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:23398cc7-8bbe-4f5a-8cd9-1104591400cc.

Abstract:
A major challenge in biology is to identify how different species arose and acquired distinct phenotypic traits. High-throughput sequencing is transforming our understanding of biology by allowing us to study genomes and cellular processes at genome-wide levels. Only a decade after the publication of the first human genome draft, genome assemblies of hundreds of organisms have been produced. Yet genome analysis remains challenging, and advances have lagged far behind our sequencing abilities and other technological advances. The next generation of comparative genomicists must therefore understand, invent and apply a wide range of computational tools in order to study biology in the most efficient manner and to pose the most interesting questions. This thesis spans evolutionary genomics, gene regulation, and computational methods development. A major aim was to understand how genetic variation contributes to variation in phenotypic traits. This was approached using a large variety of evolutionary and comparative genomics tools. In particular, high-throughput sequencing datasets were analysed to study single-cell transcriptomics, gene duplications, gene architecture evolution, and alternative splicing. Additionally, in cases where off-the-shelf analysis tools did not exist, novel pipelines and programs were designed and implemented to solve algorithmic problems such as scaffolding genome assemblies and short-read mapping onto small exons.
18

Sharpnack, Michael F. "Integrative Genomics Methods for Personalized Treatment of Non-Small-Cell Lung Cancer." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1523890139956055.

19

Mostowy, Serge. "Comparative genomics of the Mycobacterium tuberculosis complex." Thesis, McGill University, 2005. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=111834.

Abstract:
The study of microbial evolution has recently been accelerated by the advent of comparative genomics, an approach enabling investigation of organisms at the whole-genome level. Tools of comparative genomics, including the DNA microarray, have been applied to bacterial genomes to study heterogeneity in DNA content and to monitor global gene expression. When focused upon the study of microbial pathogens, genome analysis has provided unprecedented insight into their evolution, virulence, and host adaptation. Contributing towards this, I herein explore the evolutionary change affecting genomes of the Mycobacterium tuberculosis complex (MTC), a group of closely related bacterial organisms responsible for causing tuberculosis (TB) across a diverse range of mammals. Despite the introduction nearly a century ago of BCG, a family of live attenuated vaccines intended to prevent human TB, the uncertainty surrounding its usefulness is punctuated by the reality that TB continues to claim over 2 million lives per year. As pursued throughout this thesis, a precise understanding of the differences in genomic content among the MTC, and its impact on gene expression and biological function, promises to expose underlying mechanisms of TB pathogenesis, and suggest rational approaches towards the design of improved diagnostics and vaccines to prevent disease.
With the availability of whole-genome sequence data and tools of comparative genomics, our publications have advanced the recognition that large sequence polymorphisms (LSPs) deleted from Mycobacterium tuberculosis, the causative agent of TB in humans, serve as accurate markers for molecular epidemiologic assessment and phylogenetic analysis. These LSPs have proven informative both for the types of genes that vary between strains, and for the molecular signatures that characterize different MTC members. Genomic analysis of atypical MTC has revealed their diversity and adaptability, illuminating previously unexpected directions of MTC evolution. As demonstrated from parallel analysis of BCG vaccines, a phylogenetic stratification of genotypes offers a predictive framework upon which to base future genetic and phenotypic studies of the MTC. Overall, the work presented in this thesis has provided unique insights and lessons having direct clinical relevance towards understanding TB pathogenesis and BCG vaccination.
20

Ferrer, Samuel. "STAIRS : Data reduction strategy on genomics." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-383465.

Abstract:
Background. An enormous accumulation of genomic data has taken place over the last ten years. This makes visualization and manual inspection key steps in trying to understand large datasets containing DNA sequences with millions of letters. This situation has created a gap between data complexity and qualified personnel, because analysis requires a trade-off between visualization, reduction capacity and exploratory functions, a combination rarely achieved by existing tools such as the SRA toolkit (https://www.ncbi.nlm.nih.gov/sra/docs/toolkitsoft/). A novel approach to the problem of genomic analysis and visualization was pursued in this project, by means of STrAtified Interspersed Reduction Structures (STAIRS). Result. Ten weeks of intense work resulted in novel algorithms to compress data, transform it into stairs vectors and align them. The Smith–Waterman and Needleman–Wunsch algorithms were specially modified for this purpose, and the application produced statistical performance and behavioural charts.
21

Quinlan, Aaron Ryan. "Discovery and interpretation of genetic variation with next‐generation sequencing technologies." Thesis, Boston College, 2008. http://hdl.handle.net/2345/32.

Abstract:
Thesis advisor: Gabor T. Marth
Improvements in molecular and computational technologies have driven and will continue to drive advances in our understanding of genetic variation and its relationship to phenotypic diversity. Over the last three years, several new DNA sequencing technologies have been developed that greatly improve upon the cost and throughput of the capillary DNA sequencing technologies that were used to sequence the first human genome. The economy of these so-called "next-generation" technologies has enabled researchers to conduct genome-wide studies in genetic variation that were previously intractable or too expensive. However, because the new technologies employ novel molecular techniques, the resulting sequence data is quite different from the capillary sequences to which the genomics field is accustomed. Moreover, the vast amounts of sequence data that these technologies produce present novel statistical and computational challenges in making even the simplest observations. The focus of my dissertation has been the development of novel computational and analytical methods that facilitate genome-wide studies in genetic variation with traditional capillary sequencers and with new sequencing technologies. I present a novel method that produces more accurate error estimates for sequence data from one of these next-generation sequencing technologies. I also present two studies that illustrate the utility of two such technologies for genome-wide polymorphism discovery studies in Drosophila melanogaster and Caenorhabditis elegans. These studies accurately estimate the degree of genetic diversity in the fruitfly and nematode, respectively. I later describe how new sequencing approaches can be used to accelerate the mapping of causal genetic mutations in forward genetic screens. Lastly, I remark on where I believe these technologies will lead future studies in human genetic variation and describe their relevance to several of my future research interests.
Thesis (PhD), Boston College, 2008
Submitted to: Boston College. Graduate School of Arts and Sciences
Discipline: Biology
22

Suen, Garret. "Understanding prokaryotic diversity in the post-genomics era." Related electronic resource: Current Research at SU : database of SU dissertations, recent titles available, full text:, 2008. http://wwwlib.umi.com/cr/syr/main.

23

Mungall, Christopher. "Next-generation information systems for genomics." Thesis, University of Edinburgh, 2011. http://hdl.handle.net/1842/5020.

Abstract:
The advent of next-generation sequencing technologies is transforming biology by enabling individual researchers to sequence the genomes of individual organisms or cells on a massive scale. In order to realize the translational potential of this technology we will need advanced information systems to integrate and interpret this deluge of data. These systems must be capable of extracting the location and function of genes and biological features from genomic data, requiring the coordinated parallel execution of multiple bioinformatics analyses and intelligent synthesis of the results. The resulting databases must be structured to allow complex biological knowledge to be recorded in a computable way, which requires the development of logic-based knowledge structures called ontologies. To visualise and manipulate the results, new graphical interfaces and knowledge acquisition tools are required. Finally, to help understand complex disease processes, these information systems must be equipped with the capability to integrate and make inferences over multiple data sets derived from numerous sources. RESULTS: Here I describe research, design and implementation of some of the components of such a next-generation information system. I first describe the automated pipeline system used for the annotation of the Drosophila genome, and the application of this system in genomic research. This was succeeded by the development of a flexible graph-oriented database system called Chado, which relies on the use of ontologies for structuring data and knowledge. I also describe research to develop, restructure and enhance a number of biological ontologies, adding a layer of logical semantics that increases the computability of these key knowledge sources. The resulting database and ontology collection can be accessed through a suite of tools. Finally I describe how genome analysis, ontology-based database representation and powerful tools can be combined in order to make inferences about genotype-phenotype relationships within and across species. CONCLUSION: The large volumes of complex data generated by high-throughput genomic and systems biology technology threaten to overwhelm us, unless we can devise better computing tools to assist us with their analysis. Ontologies are key technologies, but many existing ontologies are not interoperable or lack features that make them computable. Here I have shown how concerted ontology, tool and database development can be applied to make inferences of value to translational research.
24

Ramakrishnan, Ranjani. "A data cleaning and annotation framework for genome-wide studies." Full text open access at:, 2007. http://content.ohsu.edu/u?/etd,263.

25

Hsu, Jeff. "It's Complicated: Analyzing the Role of Genetics and Genomics in Cardiovascular Disease." Case Western Reserve University School of Graduate Studies / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=case1373030141.

26

Tessmann, Jonathon. "Neuroendocrine genomics for tumor variant discovery." Thesis, University of Iowa, 2018. https://ir.uiowa.edu/etd/6305.

Abstract:
An exome sequencing analysis pipeline was constructed to analyze NET germline and somatic samples. SNPs and INDELs were called and annotated from germline and somatic tissue. CNVs were also called for the tumor samples. This was accomplished using open source bioinformatics software that has been developed by the research community. Broad Institute "best practices" were followed. Some of the tools that were used include BWA, SAMtools, GATK, Varscan, VT, VEP, and GEMINI. Computational resources were provided by The University of Iowa NEON computer cluster. 57 germline samples and 15 tumor samples across 23 families with a history of NETs produced 4,452 germline variants, 1,695 somatic variants, 5,853 LOH events, and 627 CNV calls. False positive and driver candidacy filtering was applied. One family with Currarino syndrome has an inherited germline missense variant in MNX1. This variant has a phred-scaled Combined Annotation Dependent Depletion score of 35, putting it in the top 0.031% of deleterious variants. CNV analysis demonstrates that 8 of the 15 tumor samples have large-scale deletions of chromosome 18, three of which have nearly the entire chromosome deleted. One affected tumor suppressor gene in this region is DCC, which appears in the results of all three variant discovery techniques. Variant prioritization techniques are effective, but need further development to increase the candidate variant/gene discovery rate.
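The percentile quoted for the CADD score follows from the phred-like scaling, which the abstract does not spell out. As a hedged worked example, assuming the usual convention that a scaled score S places a variant in the top 10^(-S/10) fraction of ranked variants (the symbols S and f are introduced here for illustration and do not come from the thesis):

```latex
f = 10^{-S/10}, \qquad S = 35 \;\Rightarrow\; f = 10^{-3.5} \approx 3.16 \times 10^{-4} \approx 0.032\%
```

This agrees with the abstract's figure of 0.031% up to rounding.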
27

Campbell, Kieran. "Probabilistic modelling of genomic trajectories." Thesis, University of Oxford, 2017. https://ora.ox.ac.uk/objects/uuid:24e6704c-8a7f-4967-9fcd-95d6034eab39.

Abstract:
The recent advancement of whole-transcriptome gene expression quantification technology - particularly at the single-cell level - has created a wealth of biological data. An increasingly popular unsupervised analysis is to find one dimensional manifolds or trajectories through such data that track the development of some biological process. Such methods may be necessary due to the lack of explicit time series measurements or due to asynchronicity of the biological process at a given time. This thesis aims to recast trajectory inference from high-dimensional "omics" data as a statistical latent variable problem. We begin by examining sources of uncertainty in current approaches and examine the consequences of propagating such uncertainty to downstream analyses. We also introduce a model of switch-like differentiation along trajectories. Next, we consider inferring such trajectories through parametric nonlinear factor analysis models and demonstrate that incorporating information about gene behaviour as informative Bayesian priors improves inference. We then consider the case of bifurcations in data and demonstrate the extent to which they may be modelled using a hierarchical mixture of factor analysers. Finally, we propose a novel type of latent variable model that performs inference of such trajectories in the presence of heterogeneous genetic and environmental backgrounds. We apply this to both single-cell and population-level cancer datasets and propose a nonparametric extension similar to Gaussian Process Latent Variable Models.
28

Croft, Larry. "Design of information systems in computational genomics /." [St. Lucia, Qld.], 2002. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe17545.pdf.

29

Petri, Eric D. C. "Bioinformatics Tools for Finding the Vocabularies of Genomes." Ohio University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1213730223.

30

Radhakrishnan, Radhika. "Genome data modeling and data compression." abstract and full text PDF (free order & download UNR users only), 2007. http://0-gateway.proquest.com.innopac.library.unr.edu/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:1447611.

31

Ertel, Adam M. Tözeren Aydin. "Annotation and function of switch-like genes in health and disease /." Philadelphia, Pa. : Drexel University, 2008. http://hdl.handle.net/1860/2813.

32

Breland, Adrienne E. "A supervised strain classifier." abstract and full text PDF (free order & download UNR users only), 2008. http://0-gateway.proquest.com.innopac.library.unr.edu/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:1453199.

33

Wallace, Jeffrey B. "An efficient method for searching compressed genomic databases /." abstract and full text PDF (UNR users only), 2008. http://0-gateway.proquest.com.innopac.library.unr.edu/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:1455652.

Abstract:
Thesis (M.S.)--University of Nevada, Reno, 2008. "May, 2008." Includes bibliographical references (leaves 38-40). Library also has microfilm. Ann Arbor, Mich. : ProQuest Information and Learning Company, [2008]. 1 microfilm reel ; 35 mm. Online version available on the World Wide Web.
34

LAMONTANARA, ANTONELLA. "Sviluppo ed applicazione di pipilines bioinformatiche per l'analisi di dati NGS." Doctoral thesis, Università Cattolica del Sacro Cuore, 2015. http://hdl.handle.net/10280/6068.

Abstract:
The development of sequencing technologies has given rise to instruments capable of producing gigabases of sequencing data in a single run. These technologies, commonly referred to as Next Generation Sequencing or NGS, produce large and complex datasets whose analysis poses several bioinformatics problems. Analyzing this type of data requires setting up computational pipelines, whose development involves the scripting work needed to chain together existing software. This thesis addresses the methodological aspects of analyzing NGS data obtained with Illumina technology. In particular, three bioinformatics pipelines were developed and applied to the following case studies: 1) a gene expression study using RNA-seq in "Olea europaea", aimed at investigating the molecular mechanisms underlying cold acclimation in this species; 2) an RNA-seq study aimed at identifying sequence polymorphisms in the transcriptome of two cattle breeds, in order to produce a broad catalogue of SNP markers; 3) the sequencing, assembly and annotation of the genome of a Lactobacillus plantarum strain that showed potential probiotic properties.
The advance in sequencing technologies has led to the birth of sequencing platforms able to produce gigabases of sequencing data in a single run. These technologies, commonly referred to as Next Generation Sequencing or NGS, produce millions of short sequences called "reads", generating large and complex datasets that pose several challenges for bioinformatics. The analysis of large omics datasets requires the development of bioinformatics pipelines, which organize bioinformatics tools into computational chains in which the output of one analysis is the input of the subsequent analysis. Scripting work is needed to chain together a group of existing software tools. This thesis deals with the methodological aspect of the data analysis in NGS sequencing performed with the Illumina technology. In this thesis three bioinformatics pipelines were developed and applied to the following case studies: 1) a global transcriptome profiling of "Olea europaea" during cold acclimation, aimed at unravelling the molecular mechanisms of cold acclimation in this species; 2) SNP profiling in the transcriptome of two cattle breeds, aimed at producing an extensive catalogue of SNPs; 3) the genome sequencing, assembly and annotation of the genome of a Lactobacillus plantarum strain showing probiotic properties.
35

Kusnierczyk, Waclaw. "Augmenting Bioinformatics Research with Biomedical Ontologies." Doctoral thesis, Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, 2008. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-2001.

Abstract:
The main objective of the reported study was to investigate how biomedical ontologies, logically structured representations of various aspects of the biomedical reality, can help researchers in analyzing experimental data. The dissertation reports two attempts to construct tools for the analysis of high-throughput experimental results using explicit domain knowledge representations. Furthermore, integrative efforts made by the community of Open Biomedical Ontologies (OBO), in which the author has participated, are reported, and a framework for consistently connecting the Gene Ontology (GO) with the Taxonomy of Species is proposed and discussed.
36

Alonso, Arnald. "Bioinformatics methods for the genomics and metabolomics analysis of immune-mediated inflammatory diseases." Doctoral thesis, Universitat Politècnica de Catalunya, 2015. http://hdl.handle.net/10803/320191.

Abstract:
During the last decade, genomics has been widely used to characterize the molecular basis of common diseases. Genome-wide association studies (GWAS) have been highly successful in characterizing the genetic variation that influences human traits, including the susceptibility to common diseases. In metabolomics, recent improvements of analytical technologies have enabled the analysis of complete metabolomic profiles. Using this approach, high-throughput metabolomics studies have already demonstrated a high potential for the discovery of disease biomarkers. The use of powerful high-throughput measurement technologies has resulted in the generation of large datasets of biological variation. In order to extract relevant biological information from this data, highly specialized bioinformatics methods are required. This thesis is focused on the development of new methodological tools to improve the processing of genomics and metabolomics high-throughput data. These new tools have been used in the analysis framework of the Immune-Mediated Inflammatory Diseases (IMIDs) Consortium. The IMID Consortium is a large Spanish network of biomedical researchers on autoimmune diseases, which holds one of the largest collections of biological samples from this group of diseases, as well as healthy controls. The first analysis tool that has been developed is a computationally efficient algorithm for simultaneous genotyping of single nucleotide polymorphisms (SNPs) and copy number variants (CNVs) using microarray data. This bioinformatics tool, called GStream, integrates the genotyping of both types of genomic variants into a single processing pipeline. We demonstrate that the developed algorithms provide a significant increase in genotyping accuracy and call rate when compared to previous algorithms. Using GStream, researchers performing large-scale GWASs will not only benefit from the combined and fast genotyping of SNPs and CNVs but, more importantly, will also improve the accuracy and therefore the statistical power of their studies. The second tool that was developed during this thesis was FOCUS, a bioinformatics framework that provides a complete data analysis workflow for high-throughput metabolomics studies based on one-dimensional nuclear magnetic resonance (NMR). The FOCUS workflow includes quality control, peak alignment, peak picking and metabolite identification. The algorithms included in FOCUS were designed to overcome several technical challenges that can dramatically affect the quality of the results. FOCUS allows users to easily obtain high-quality NMR feature matrices, which are ready for chemometric analysis, as well as metabolite identification scores for each peak that greatly simplify the biological interpretation of the results. When tested against previous NMR data processing methodologies, FOCUS clearly showed superior performance, even in datasets with high levels of spectral misalignment. The final research work included in this thesis is a GWAS of clinical phenotypes in Crohn's disease (CD). CD is the most prevalent chronic inflammatory disease of the bowel, and is characterized by segmental and transmural inflammation of the gastrointestinal tract. CD is a highly heterogeneous disease, with patients showing different degrees of severity. The identification of the genetic basis associated with disease severity is therefore a major objective in CD translational research. The present PhD thesis includes the first GWAS of clinically relevant phenotypes in CD. A total of 17 phenotypes associated with different clinical complications were analyzed. In this study, we identified new genetic regions significantly associated with complicated disease course, disease location, mild disease course, and erythema nodosum. 
These findings are of high relevance since they show the existence of a genetic component for disease heterogeneity that is independent of the genetic variation associated with susceptibility to CD.
During the last decade, genomics has played a key role in characterizing the molecular basis of complex diseases. Genome-wide association studies (GWAS) have made it possible to characterize the genetic regions that influence human phenotypes such as susceptibility to complex diseases. In metabolomics, improvements in analytical technologies have driven the acquisition of metabolomic profiles in large sample cohorts. The resulting studies have also shown great potential for identifying useful biomarkers in human diseases. High-throughput technologies generate large datasets of biological variation, and extracting the relevant information requires powerful bioinformatics tools. This thesis focuses on the development of new methods to improve and speed up the processing of high-throughput genomics and metabolomics data, and on their subsequent implementation as bioinformatics applications. These applications have been incorporated into the analysis workflow of the IMID (immune-mediated inflammatory diseases) Consortium. This consortium is a Spanish network of biomedical researchers with a shared interest in the study of autoimmune diseases, and it holds one of the most extensive sample collections from patients with these diseases. The first bioinformatics tool implemented is a set of algorithms that integrate the genotyping of single nucleotide polymorphisms and copy number variations from genotyping microarray data. This tool, called GStream, efficiently incorporates the entire analysis workflow required for genotyping in GWAS. The developed algorithms have been shown to significantly improve genotyping accuracy and to increase the number of genetic variants identified compared with previous methodologies. Using this tool therefore makes it possible to expand the number of genetic variants analyzed, significantly increasing the statistical power of GWAS genetic studies. The second tool developed is FOCUS, an integrated bioinformatics tool that includes all the processing steps for nuclear magnetic resonance spectra in metabolomics studies. The analysis workflow includes quality control, alignment and quantification of spectral peaks, and identification of the metabolites associated with the quantified peaks. All the algorithms were designed to correct the biases that considerably limit the quality of the results and that are among the technical challenges of current metabolomics. FOCUS produces a high-quality numerical matrix ready for chemometric analysis and generates identification scores that simplify the biological interpretation of the results. FOCUS achieved significantly better performance than previous methodologies. This thesis concludes with the first GWAS of clinical phenotypes in Crohn's disease. This IMID is the most prevalent inflammatory bowel disease and is highly heterogeneous, with patients showing very different degrees of severity. Identifying genetic variants associated with the phenotypes of this disease is therefore one of the most relevant objectives for translational research. A total of 17 phenotypes were analyzed using discovery and validation cohorts in order to identify and replicate risk loci associated with each of them. The results of the study identified, for the first time, genetic regions associated with disease course and disease location. These results are highly relevant because they not only identified new biological pathways associated with clinical phenotypes but also demonstrate, for the first time, the existence of a genetic component of heterogeneity in Crohn's disease that is independent of the genetic variation associated with the risk of developing the disease.
37

Diboun, I. "Bioinformatics protocols for analysis of functional genomics data applied to neuropathy microarray datasets." Thesis, University College London (University of London), 2010. http://discovery.ucl.ac.uk/19298/.

Abstract:
Microarray technology allows the simultaneous measurement of the abundance of thousands of transcripts in living cells. The high-throughput nature of microarray technology means that automatic analytical procedures are required to handle the sheer amount of data typically generated in a single microarray experiment. Along these lines, this work presents a contribution to the automatic analysis of microarray data by attempting to construct protocols for the validation of publicly available methods for microarray analysis. At the experimental level, an evaluation of amplification of RNA targets prior to hybridisation with the physical array was undertaken. This had the important consequence of revealing the extent to which the significance of intensity ratios between varying biological conditions may be compromised following amplification, as well as identifying the underlying cause of this effect. On the basis of these findings, recommendations regarding the usability of RNA amplification protocols with microarray screening were drawn in the context of varying microarray experimental conditions. On the data analysis side, this work has had the important outcome of developing an automatic framework for the validation of functional analysis methods for microarray data. This is based on using a GO semantic similarity scoring metric to assess the similarity between functional terms found enriched by functional analysis of a model dataset and those anticipated from prior knowledge of the biological phenomenon under study. Using this validation system, this work has shown, for the first time, that 'Catmap', an early functional analysis method, performs better than the more recent and most popular methods of its kind. Crucially, the effectiveness of this validation system implies that it may be reliably adopted for validation of newly developed functional analysis methods for microarray data.
38

Migeon, Pierre. "Comparative genomics of repetitive elements between maize inbred lines B73 and Mo17." Thesis, Kansas State University, 2017. http://hdl.handle.net/2097/35377.

Abstract:
Master of Science. Genetics Interdepartmental Program. Sanzhen Liu. The major component of complex genomes is repetitive elements, which remain recalcitrant to characterization. Using maize as a model system, we analyzed whole genome shotgun (WGS) sequences for the two maize inbred lines B73 and Mo17 using k-mer analysis to quantify the differences between the two genomes. Significant differences were identified in highly repetitive sequences, including centromere, 45S ribosomal DNA (rDNA), knob, and telomere repeats. Genotype-specific 45S rDNA sequences were discovered. The B73 and Mo17 polymorphic k-mers were used to examine allele-specific expression of 45S rDNA in the hybrids. Although Mo17 contains a higher copy number than B73, equivalent levels of overall 45S rDNA expression indicate that transcriptional or post-transcriptional regulation mechanisms operate for the 45S rDNA in the hybrids. Using WGS sequences of B73xMo17 doubled haploids, genomic locations showing differential repetitive contents were genetically mapped, revealing differences in organization of highly repetitive sequences between the two genomes. In an analysis of WGS sequences of HapMap2 lines, including the maize wild progenitor, landraces, and improved lines, decreases and increases in abundance of additional sets of k-mers associated with centromere, 45S rDNA, knob, and retrotransposons were found among groups, revealing global evolutionary trends of genomic repeats during maize domestication and improvement.
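The k-mer comparison described in this abstract can be illustrated with a minimal sketch. It is not the thesis's pipeline: the function names `count_kmers` and `kmer_differences`, the toy sequences, and the choice of k = 4 are all assumptions made for illustration, and real WGS data would need normalization for sequencing depth before counts could be compared.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer in a DNA string (case-insensitive)."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_differences(counts_a, counts_b):
    """Return {kmer: count_a - count_b} for k-mers whose counts differ."""
    kmers = set(counts_a) | set(counts_b)
    # Counter returns 0 for a missing k-mer, so absent k-mers compare cleanly.
    return {
        kmer: counts_a[kmer] - counts_b[kmer]
        for kmer in kmers
        if counts_a[kmer] != counts_b[kmer]
    }


# Hypothetical toy sequences standing in for reads from two inbred lines.
line_a = "ACGTACGTACGT"
line_b = "ACGTACGAAAAA"
differences = kmer_differences(count_kmers(line_a, 4), count_kmers(line_b, 4))
```

A positive value in `differences` marks a k-mer that is more abundant in the first sequence, a negative value one that is more abundant in the second.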
39

Marwaha, Shruti. "A Genomics and Mathematical Modeling Approach for the Study of Helicobacter Pylori associated Gastritis and Gastric Cancer." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439308645.

40

Hershman, Steven Gregory. "Personal Genomics and Mitochondrial Disease." Thesis, Harvard University, 2013. http://dissertations.umi.com/gsas.harvard:10863.

Abstract:
Mitochondrial diseases involving dysfunction of the respiratory chain are the most common inborn errors of metabolism. Mitochondria are found in all cell types besides red blood cells; consequently, patients can present with any symptom in any organ at any age. These diseases are genetically heterogeneous, and exhibit maternal, autosomal dominant, autosomal recessive and X-linked modes of inheritance. Historically, clinical genetic evaluation of mitochondrial disease has been limited to sequencing of the mitochondrial DNA (mtDNA) or several candidate genes. As human genome sequencing transformed from a research grade effort costing $250,000 to a clinical test orderable by doctors for under $10,000, it has become practical for researchers to sequence individual patients. This thesis describes our experiences in applying "MitoExome" sequencing of the mtDNA and exons of >1000 nuclear genes encoding mitochondrial proteins in ~200 patients with suspected mitochondrial disease. In 42 infants, we found that 55% harbored pathogenic mtDNA variants or compound heterozygous mutations in candidate genes. The pathogenicity of two nuclear genes not previously linked to disease, NDUFB3 and AGK, was supported by complementation studies and evidence from multiple patients, respectively. In an additional two unrelated children presenting with Leigh syndrome and combined OXPHOS deficiency, we identified compound heterozygous mutations in MTFMT. Patient fibroblasts exhibit severe defects in mitochondrial translation that can be rescued by exogenous expression of MTFMT. Furthermore, patient fibroblasts have dramatically reduced fMet-tRNA\(^{Met}\) levels and an abnormal formylation profile of mitochondrially translated COX1. These results demonstrate that MTFMT is critical for human mitochondrial translation. Lastly, to facilitate evaluation of copy number variants (CNVs), we developed a web-interface that integrates CNV calling with genetic and phenotypic information.
Additional diagnoses are suggested, and in a male with ataxia, neuropathy, azoospermia, and hearing loss we found a deletion compounded with a missense variant in D-bifunctional protein, HSD17B4, a peroxisomal enzyme that catalyzes beta-oxidation of very long chain fatty acids. Retrospective review of metabolic testing from this patient revealed alterations of long- and very-long chain fatty acid metabolism consistent with a peroxisomal disorder. This work expands the molecular basis of mitochondrial disease and has implications for clinical genomics.
41

Liu, Shaolin 1968. "Oligonucleotides applied in genomics, bioinformatics and development of molecular markers for rice and barley." Thesis, McGill University, 2004. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=85569.

Abstract:
A genome sequence can be conceptualized as a 'book' written with four nucleotide 'letters' in oligonucleotide (oligo) 'words'. These words can be used in genomics, bioinformatics and the development of molecular markers. The whole-genome sequence for rice (Oryza sativa L.) is almost finished and has been assembled into pseudomolecules. For barley (Hordeum vulgare L.), expressed sequence tags (ESTs) have been assembled into 21,981 tentative consensus sequences (TCs). The availability of such sequence information provides opportunities to investigate oligo usage within and between genomes. For the first of three studies reported in this thesis, a C++ program was written to automatically design oligos that are conserved between two sets of sequence information. In silico mapping between rice coding sequences (CDS) and barley TCs indicated that oligos between 18 and 24 bp provide good specificity and sensitivity (83% and 86%, respectively, for 20mers). Conserved oligos used as PCR primers had a high (91%) success rate on barley lines. Sequencing of PCR products revealed conservation in exon sequence, size and order between barley and rice. Introns were not conserved in sequence but were relatively stable in size. Map locations of eight new markers in barley revealed both genome colinearity and rearrangements between barley and rice. The second study reported in this thesis examined word frequency within the rice genome. A non-random landscape composed of high-frequency and low-frequency zones was observed. Interestingly, high-frequency words seemed to be rice specific while single-copy words were gene specific and conserved across species. As in the first study, oligos of 12 bp or less were not specific, and 18 bp seemed to be a critical length for the specificity of oligos. The third study reported in this thesis involved the development of molecular markers for known genes using public sequence information. Six new polymorphic markers were d
42

Jentzsch, Iris Miriam Vargas. "Comparative genomics of microsatellite abundance: a critical analysis of methods and definitions." Thesis, University of Canterbury. Biological Sciences, 2009. http://hdl.handle.net/10092/4282.

Abstract:
This PhD dissertation is focused on microsatellites: short, tandemly repeated nucleotide patterns that occur extremely often across DNA sequences. The main characteristic of microsatellites, and probably the reason why they are so abundant across genomes, is the extremely high frequency of specific replication errors occurring within their sequences, which usually cause addition or deletion of one or more complete tandem repeat units. Due to these errors, frequent fluctuations in the number of repetitive units can be observed among cellular and organismal generations. The molecular mechanisms as well as the consequences of these microsatellite mutations, both on a generational and on an evolutionary scale, have sparked debate and controversy among the scientific community. Furthermore, the bioinformatic approaches used to study microsatellites and the ways microsatellites are referred to in the general literature are often not rigorous, leading to misinterpretations and inconsistencies among studies. As an introduction to this complex topic, in Chapter I, I present a review of the knowledge accumulated on microsatellites during the past two decades. A major part of this chapter has been published in the Encyclopedia of Life Sciences in a chapter about microsatellite evolution (see Publication 1 in Appendix II). The ongoing controversy about the rates and patterns of microsatellite mutation was evident to me even before starting this PhD thesis. However, the subtler problems inherent to the computational analyses of microsatellites within genomes only became apparent when retrieving information on microsatellite distribution and abundance for the design of comparative genomic analyses.
There are numerous publications analyzing the microsatellite content of genomes but, in most cases, the results presented can neither be reliably compared nor reproduced, mainly due to the lack of details on the microsatellite search process (particularly the program’s algorithm and the search parameters used) and because the results are expressed in terms that are relative to the search process (i.e. measures based on the absolute number of microsatellites). Therefore, in Chapter II, I present a critical review of all available software tools designed to scan DNA sequences for microsatellites. My aim in undertaking this review was to assess the comparability of search results among microsatellite programs, and to identify the programs most suitable for the generation of microsatellite datasets for a thorough and reproducible comparative analysis of microsatellite content among genomic sequences. Using sequence data where the number and types of microsatellites were empirically known, I compared the ability of 19 programs to accurately identify and report microsatellites. I then chose the two programs which, based on the algorithm and its parameters as well as the output informativity, offered the information most suitable for biological interpretation, while also reflecting as closely as possible the microsatellite content of the test files. From the analysis of microsatellite search results generated by the various programs available, it became apparent that the program’s search parameters, which are specified by the user in order to define the microsatellite characteristics to the program, dramatically influence the resulting datasets. This is especially true for programs suited to allow imperfections within tandem repeats, because imperfect repetitions cannot be defined as accurately as perfect ones, and because several different algorithms have been proposed to address this problem.
The detection of approximate microsatellites is, however, essential for the study of microsatellite evolution and for comparative analyses based on microsatellites. It is now well accepted that small deviations from perfect tandem repeat structure are common within microsatellites and larger repeats, and a number of different algorithms have been developed to confront the challenge of finding and registering microsatellites with all expectable kinds of imperfection. However, biologists have yet to apply these tools to their full potential. In biological analyses, single tandem repeat hits are consistently interpreted as isolated and independent repeats. This interpretation also depends on the search strategy used to report the microsatellites in DNA sequences and, therefore, I was particularly interested in the capacity of repeat finding programs to report imperfect microsatellites allowing interpretations that are useful in a biological sense. After analyzing a series of tandem repeat finding programs, I optimized my microsatellite searches to yield the best possible datasets for assessing and comparing the degree of imperfection of microsatellites among different genomes (Chapter III). During the program comparisons performed in Chapter II, I show that the most critical search parameter influencing microsatellite search results is the minimum length threshold. Biologically speaking, there is no consensus with respect to the minimum length beyond which a short tandem repeat is expected to become prone to microsatellite-like mutations. Usually, a single absolute value of ~12 nucleotides is assigned irrespective of motif length. In other cases, thresholds are assigned in terms of number of repeat units (i.e. 3 to 5 repeats or more), which are better applied individually for each motif. The variation in these thresholds is considerable and not always justifiable.
In addition, any current minimum length measures are likely naïve because it is clear that different microsatellite motifs undergo replication slippage at different length thresholds. Therefore, in Chapter III, I apply two probabilistic models to predict the minimum length at which microsatellites of varying motif types become overrepresented in different genomes based on the individual oligonucleotide frequency data of these genomes. Finally, after a range of optimizations and critical analyses, I performed a preliminary analysis of microsatellite abundance among 24 high quality complete eukaryotic genomes, including also 8 prokaryotic and 5 archaeal genomes for contrast. The availability of the methodologies and the microsatellite datasets generated in this project will allow informed formulation of questions for more specific genome research, either about microsatellites, or about other genomic features microsatellites could influence. These datasets are what I would have needed at the beginning of my PhD to support my experimental design, and are essential for the adequate interpretation of microsatellite data in the context of the major evolutionary units: chromosomes and genomes.
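The abstract's point that the minimum length threshold shapes the result set can be seen in a small sketch. This is not one of the 19 programs reviewed in the thesis: `find_perfect_microsatellites`, `is_primitive`, and their parameters are hypothetical names. The sketch reports only perfect repeats, and each motif length is scanned independently.

```python
def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ACAC')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_perfect_microsatellites(sequence, max_motif_len=6, min_total_len=12):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    A hit needs at least two motif copies and a total length of at least
    min_total_len nucleotides (an absolute threshold, irrespective of motif length).
    """
    sequence = sequence.upper()
    hits = []
    for motif_len in range(1, max_motif_len + 1):
        i = 0
        while i + motif_len <= len(sequence):
            motif = sequence[i:i + motif_len]
            if not is_primitive(motif):
                i += 1
                continue
            # Extend the run one motif copy at a time.
            j = i + motif_len
            while sequence[j:j + motif_len] == motif:
                j += motif_len
            run_len = j - i
            if run_len >= 2 * motif_len and run_len >= min_total_len:
                hits.append((i, motif, run_len // motif_len))
                i = j  # continue scanning after the reported repeat
            else:
                i += 1
    return sorted(hits)


# The same toy sequence yields a hit or nothing depending on the threshold.
relaxed = find_perfect_microsatellites("GGCACACACATT", min_total_len=8)
strict = find_perfect_microsatellites("GGCACACACATT", min_total_len=12)
```

With the relaxed threshold the four-copy CA repeat is reported; with the ~12-nucleotide threshold it is not, so counts from searches run with different thresholds are not directly comparable.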
43

Akhurst, Timothy John. "The role of parallel computing in bioinformatics." Thesis, Rhodes University, 2005. http://eprints.ru.ac.za/162/.

44

Cury, Jean. "Evolutionary genomics of conjugative elements and integrons." Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCB062/document.

45

Murat, Katarzyna. "Bioinformatics analysis of epigenetic variants associated with melanoma." Thesis, University of Bradford, 2018. http://hdl.handle.net/10454/17220.

Abstract:
The field of cancer genomics is currently being enhanced by the power of epigenome-wide association studies (EWAS). Over the last couple of years, comprehensive sequence data sets have been generated, making analysis of genome-wide activity in cohorts of different individuals increasingly available. Finding associations between epigenetic variation and phenotype is one of the biggest challenges in biomedical research. Laboratories lacking dedicated resources and programming experience require bioinformatics expertise, which can be prohibitively costly and time-consuming. To address this, we have developed a collection of freely available Galaxy tools (Poterlowicz, 2018a), combining analytical methods into a range of convenient analysis pipelines with a user-friendly graphical interface. The tool suite includes methods for data preprocessing, quality assessment and differentially methylated region and position discovery. The aim of this project was to make EWAS analysis flexible and accessible to everyone and compatible with routine clinical and biological use. This is exemplified by my work integrating DNA methylation profiles of melanoma patients (at baseline and under mitogen-activated protein kinase inhibitor (MAPKi) treatment) to identify novel epigenetic switches responsible for tumour resistance to therapy (Hugo et al., 2015). Configuration files are published on our GitHub repository (Poterlowicz, 2018b), with scripts and dependency settings also available to download and install via the Galaxy test toolshed (Poterlowicz, 2018a). Results and experiences using this framework demonstrate the potential for Galaxy to be a bioinformatics solution for multi-omics cancer biomarker discovery.
46

Fei, Zhangjun. "Implementation of genomics and bioinformatics approaches for identification and characterization of tomato ripening-related genes." Texas A&M University, 2003. http://hdl.handle.net/1969.1/257.

Abstract:
Initial activities were focused on isolation and characterization of fruit ripening-related genes from tomato. Screening of four tomato cDNA libraries at low stringency with 10 fruit development and ripening-related genes yielded ~3000 positive clones. Microarray expression analysis of half of these positives in mature green and breaker stage fruits resulted in eight ripening-induced genes. RNA gel-blot analysis and previously published data confirmed expression for seven of the eight. One novel gene, designated LeEREBP1, was chosen for further characterization. LeEREBP1 encodes an AP2/ERF-domain transcription factor and is ethylene inducible. The expression profiles of LeEREBP1 parallel previously characterized ripening-related genes from tomato. Transgenic plants with increased and decreased expression of LeEREBP1 were generated and are currently being characterized to define the function of LeEREBP1. A large public tomato EST dataset was mined to gain insight into the tomato transcriptome. By clustering genes according to the respective expression profiles of individual tissues, tissue and developmental expression patterns were generated and genes with similar functions grouped together. Tissues effectively clustered for relatedness according to their profiles, confirming the integrity of the approach used to calculate gene expression. Statistical analysis of EST prevalence in fruit and pathogenesis-related libraries resulted in 333 genes being classified as fruit ripening-induced, 185 as fruit ripening-repressed, and 169 as pathogenesis-related. We performed a parallel analysis on public EST data for grape and compared the results for ripening-induced genes to tomato to identify similar and distinct ripening factors in addition to candidates for conserved regulators of fruit ripening. An online interactive database for tomato gene expression data, the Tomato Expression Database (TED), was implemented.
TED contains normalized expression data for approximately 12,000 ESTs over ten time points during fruit development. It also contains comprehensive annotation of each EST. Through TED, we provide multiple approaches to pursue analysis of specific genes of interest and/or access the larger microarray dataset to identify sets of genes that may behave in a pattern of interest. In addition, a set of useful data mining and data visualization tools were developed and are under continuing expansion.
47

Fernandez, Daniel. "Cell States and Cell Fate: Statistical and Computational Models in (Epi)Genomics." Thesis, Harvard University, 2015. http://nrs.harvard.edu/urn-3:HUL.InstRepos:14226043.

Abstract:
This dissertation develops and applies several statistical and computational methods to the analysis of Next Generation Sequencing (NGS) data in order to gain a better understanding of our biology. In the rest of the chapter we introduce key concepts in molecular biology, and recent technological developments that help us better understand this complex science, which, in turn, provide the foundation and motivation for the subsequent chapters. In the second chapter we present the problem of estimating gene/isoform expression at the allelic level, and different models to solve this problem. First, we describe the observed data and the computational workflow to process the data. Next, we propose frequentist and Bayesian models motivated by the central dogma of molecular biology and the data generating process (DGP) for RNA-Seq. We develop EM and Gibbs sampling approaches to estimate gene and transcript-specific expression from our proposed models. Finally, we present the performance of our models in simulations and we end with the analysis of experimental RNA-Seq data at the allelic level. In the third chapter we present our paired factorial experimental design to study parentally biased gene/isoform expression in the mouse cerebellum, and dynamic changes of this pattern between young and adult stages of cerebellar development. We present a Bayesian variable selection model to estimate the difference in expression between the paternal and maternal genes, while incorporating relevant factors and their interactions into the model. Next, we apply our model to our experimental data, and further on we validate our predictions using pyrosequencing follow-up experiments. We subsequently applied our model to the pyrosequencing data across multiple brain regions. Our method, combined with the validation experiments, allowed us to find novel imprinted genes, and investigate, for the first time, imprinting dynamics across brain regions and across development.
In the fourth chapter we move from the controlled experiments in mouse isogenic lines to the highly variant world of human genetics in observational studies. In this chapter we introduce a Bayesian Regression Allelic Imbalance Model, BRAIM, that estimates the imbalance coming from two major sources: cis-regulation and imprinting. We model the cis-effect as an additive effect for the heterozygous group and we model the parent-of-origin effect with a latent variable that indicates to which parent a given allele belongs. Next, we show the performance of the model under simulation scenarios, and finally we apply the model to several experiments across multiple tissues and multiple individuals. In the fifth chapter we characterize the transcriptional regulation and gene expression of in-vitro Embryonic Stem Cells (ESCs), and two related in-vivo cells: the Inner Cell Mass (ICM) tissue, and the embryonic tissue at day 6.5. Our objective is twofold. First, we would like to understand the differences in gene expression between the ESCs and their in-vivo counterpart from where these cells were derived (ICM). Second, we want to characterize the active transcriptional regulatory regions using several histone modifications and to connect such regulatory activity with gene expression. In this chapter we used several statistical and computational methods to analyze and visualize the data, and it provides a good showcase of how, by combining several methods of analysis, we can delve into interesting developmental biology.
48

Schobel, Seth Adam Micah. "The viral genomics revolution: Big data approaches to basic viral research, surveillance, and vaccine development." Thesis, University of Maryland, College Park, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10011480.

Abstract:
Since the decoding of the first RNA virus in 1976, the field of viral genomics has exploded, first through the use of Sanger sequencing technologies and later with the use of next-generation sequencing approaches. With the development of these sequencing technologies, viral genomics has entered an era of big data. New challenges for analyzing these data are now apparent. Here, we describe novel methods to extend the current capabilities of viral comparative genomics. Through the use of antigenic distancing techniques, we have examined the relationship between the antigenic phenotype and the genetic content of influenza virus to establish a more systematic approach to viral surveillance and vaccine selection. Distancing of Antigenicity by Sequence-based Hierarchical Clustering (DASH) was developed and used to perform a retrospective analysis of 22 influenza seasons. Our methods produced vaccine candidates identical to or with a high concordance of antigenic similarity with those selected by the WHO. In a second effort, we have developed VirComp and OrionPlot: two independent yet related tools. These tools first generate gene-based genome constellations, or genotypes, of viral genomes, and second create visualizations of the resultant genome constellations. VirComp utilizes sequence-clustering techniques to infer genome constellations and prepares genome constellation data matrices for visualization with OrionPlot. OrionPlot is a Java application for tailoring genome constellation figures for publication. OrionPlot allows for color selection of gene cluster assignments, customized box sizes to enable the visualization of gene comparisons based on sequence length, and label coloring. We have provided five analyses designed as vignettes to illustrate the utility of our tools for performing viral comparative genomic analyses. Study three focused on the analysis of respiratory syncytial virus (RSV) genomes circulating during the 2012-2013 RSV season.
We discovered a correlation between a recent tandem duplication within the G gene of RSV-A and a decrease in severity of infection. Our data suggest that this duplication is associated with a higher infection rate in female infants than is generally observed. Through these studies, we have extended the state of the art of genotype analysis and phenotype/genotype studies, and established correlations between clinical metadata and RSV sequence data.
49

Calder, Mark. "Accelerating the BLAST algorithm via parrallel computing /." [St. Lucia, Qld.], 2004. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe18388.pdf.

50

Jones, Steven John Mathias. "Computational analysis of the Caenorhabditis elegans genome sequence." Thesis, Open University, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301886.
