Academic literature on the topic 'Comparative bioinformatics'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Comparative bioinformatics.'

Dissertations / Theses on the topic "Comparative bioinformatics"

1

Chatzou, Maria. "Large-scale comparative bioinformatics analyses." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/587086.

Abstract:
One of the main and most recent challenges of modern biology is to keep up with the growing amount of biological data coming from next-generation sequencing technologies. Keeping up with the growing volume of experiments is the only way to make sense of the data and extract actionable biological insights. Large-scale comparative bioinformatics analyses are an integral part of this procedure. In comparative bioinformatics, multiple sequence alignments (MSAs) are by far the most widely used models, as they provide a unique insight into the accurate measure of sequence similarities and are therefore instrumental in revealing genetic and/or functional relationships among evolutionarily related species. Unfortunately, the well-established limitations of MSA methods when dealing with very large datasets potentially compromise all downstream analyses. In this thesis I set out the current relevance of multiple sequence aligners, and I show how scaling them up is leading to serious numerical stability issues and how these issues affect phylogenetic tree reconstruction. For this purpose, I have developed two new methods: MEGA-Coffee, a large-scale aligner, and Shootstrap, a novel bootstrapping measure that combines MSA instability with branch support estimates when computing trees. The large amount of computation required by these two projects was carried out using Nextflow, a new computational framework that I have developed to improve the computational efficiency and reproducibility of large-scale analyses like those carried out in these studies.
2

Åkerborg, Örjan. "Taking advantage of phylogenetic trees in comparative genomics." Doctoral thesis, KTH, Beräkningsbiologi, CB, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4757.

Abstract:
Phylogenomics can be regarded as evolution and genomics in co-operation. Various kinds of evolutionary studies, gene family analysis among them, demand access to genome-scale datasets. It is also clear that many genomics studies, such as the assignment of gene function, are much improved by evolutionary analysis. The work leading to this thesis is a contribution to the phylogenomics field. We have used phylogenetic relationships between species in genome-scale searches for two intriguing genomic features, namely pseudogenes and A-to-I RNA editing. In the first case we used pairwise species comparisons, specifically human-mouse and human-chimpanzee, to infer the existence of functional mammalian pseudogenes. In the second case we took advantage of the rapid growth in recent years of the number of sequenced genomes, and used 17-species multiple sequence alignments. In both these studies we have used non-genomic data, gene expression data and synteny relations among these, to verify predictions. In the A-to-I editing project we used 454 sequencing for experimental verification. We have further contributed a maximum a posteriori (MAP) method for fast and accurate dating analysis of speciations and other evolutionary events. This work follows the recent trend of abandoning the strict molecular clock when performing phylogenetic inference. We discretised the time interval from the leaves to the root of the tree, and used a dynamic programming (DP) algorithm to optimally factorise branch lengths into substitution rates and divergence times. We analysed two biological datasets and compared our results with recent MCMC-based methodologies. The dating point estimates that our method delivers were found to be of high quality, while the gain in speed was dramatic. Finally, we applied the DP strategy in a new setting, this time using a grid laid out on a species tree instead of on an interval. Together with speciation times, the discretisation gives a common timeframe for a gene tree and the corresponding species tree. This is the key to integrating the sequence evolution process and the gene evolution process. Out of several potential application areas we chose gene tree reconstruction. We performed genome-wide analysis of yeast gene families and found that our methodology performs very well.
3

Zheng, Chunfang. "Genome rearrangement algorithms applied to comparative maps." Thesis, University of Ottawa (Canada), 2006. http://hdl.handle.net/10393/27313.

Abstract:
The Hannenhalli-Pevzner algorithm for computing the evolutionary distance between two genomes is very efficient when the genomes are signed and totally ordered. But in real comparative maps, the data suffer from problems such as coarseness, missing data, no signs, paralogy, order conflicts and mapping noise. In this thesis we have developed a suite of algorithms for genome rearrangement analysis in the presence of noise and incomplete information. For coarseness and missing data, we represent each chromosome as a partial order, summarized by a directed acyclic graph (DAG). We augment each DAG to a directed graph (DG) in which all possible linearizations are embedded. The chromosomal DGs representing two genomes are combined to produce a single bicoloured graph. The major contribution of the thesis is an algorithm for extracting a maximal decomposition of some subgraph into alternating coloured cycles, determining an optimal sequence of rearrangements, and hence the genomic distance. Also based on this framework, we have proposed an algorithm to solve all the above problems of comparative maps simultaneously by adding heuristic preprocessing to the exact algorithm approach. We have applied this to the comparison of maize and sorghum genomic maps on the GRAMENE database. A further contribution treats the inflation of genome distance by high levels of noise due to incorrectly resolved paralogy and error at the mapping, sequencing and alignment levels. We have developed an algorithm to remove the noise by maximizing strips and tested its robustness as noise levels increase.
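As a rough illustration of the starting point this abstract mentions (signed, totally ordered genomes), the sketch below counts alternating cycles in the breakpoint graph of a signed permutation and derives the standard lower bound n + 1 - c on the reversal distance. It is not the thesis's algorithm for partial orders and noisy maps, and it does not compute the exact Hannenhalli-Pevzner distance, which also accounts for hurdles. The function names are hypothetical.

```python
def count_alternating_cycles(perm):
    """Count alternating cycles in the breakpoint graph of a signed permutation.

    `perm` is a list such as [2, -1, 3]: each gene 1..n appears once, with a sign.
    """
    n = len(perm)
    # Replace each signed gene x by its two extremities, framed by 0 and 2n+1.
    ext = [0]
    for x in perm:
        if x > 0:
            ext += [2 * x - 1, 2 * x]
        else:
            ext += [2 * -x, 2 * -x - 1]
    ext.append(2 * n + 1)

    # Black edges join extremities that are adjacent in the given genome.
    black = {}
    for i in range(0, 2 * n + 2, 2):
        a, b = ext[i], ext[i + 1]
        black[a] = b
        black[b] = a

    # Gray edges join extremities that are adjacent in the identity genome.
    def gray(v):
        return v + 1 if v % 2 == 0 else v - 1

    seen = set()
    cycles = 0
    for start in range(2 * n + 2):
        if start in seen:
            continue
        cycles += 1
        cur = start
        # Walk the cycle, alternating black and gray edges.
        while cur not in seen:
            seen.add(cur)
            nxt = black[cur]
            seen.add(nxt)
            cur = gray(nxt)
    return cycles


def reversal_distance_lower_bound(perm):
    """Lower bound n + 1 - c on the number of reversals needed to sort `perm`."""
    return len(perm) + 1 - count_alternating_cycles(perm)
```

For the identity permutation every black edge coincides with a gray edge, so there are n + 1 cycles and the bound is 0; a single inverted gene such as `[-1]` gives a bound of 1.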
4

Thelander, Tilia. "Optimisation of ForenSeq STR data analysis with FDSTools and comparative analysis with UAS." Thesis, Högskolan i Skövde, Institutionen för biovetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-20053.

Abstract:
DNA profiling with short tandem repeat data generated with massively parallel sequencing is associated with several challenges. FDSTools is open-source software that applies correction models based on a reference database to correct DNA profiles. The correction models aim to provide an accurate representation of the true DNA profile and associated artefacts. Low analytical thresholds in FDSTools are suggested to improve detection of minor profiles in complex mixtures. The objective was to optimise FDSTools analysis for ForenSeq data and to establish a Swedish reference database. The FDSTools analysis was subsequently compared to default analysis with the commercial Universal Analysis Software, and the likelihood ratio was evaluated. The FDSTools Library file was adapted for ForenSeq data. FASTQ files from single- and mixed-source samples were analysed with the software. The concordance between the two software packages was assessed, and analytical thresholds in FDSTools were optimised. Likelihood ratios were calculated for sequencing and capillary electrophoresis data to investigate the benefit of sequence-level information. A reference database and correction models could not be generated, meaning that uncorrected data was used. The two software packages showed 98.5% concordance. Discordance was caused by allele drop-out in heterozygous loci, which implied that certain markers may require individual interpretation. Lowering the analytical thresholds in FDSTools appeared to improve mixture deconvolution, but the lack of correction models obscured interpretation. Hence, without correction models, optimal analytical thresholds could not be defined. The likelihood ratio based on sequencing data was not consistently higher than that based on capillary electrophoresis, suggesting that sequence information is not always advantageous.
5

Walter, Klaudia. "Statistical methods for comparative genomics in the field of bioinformatics." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.611909.

6

Johnson, Sarah. "Comparative Resistomics of Ancient and Modern Human Microbiomes." Thesis, University of North Texas, 2020. https://digital.library.unt.edu/ark:/67531/metadc1707269/.

Abstract:
Increased exposure to antibiotics has led to the dissemination of genes conferring resistance to antimicrobial metabolites throughout human microbiomes globally via horizontal gene transfer (HGT). This has resulted in the emergence of new resistant strains, leading to a rising epidemic of deaths from previously treatable infections. Evidence suggests that before the age of anthropogenic antibiotic use, microbes living within a community produced antibiotic metabolites and, subsequently, maintained such genes for several useful functions and a balance of diversity in nature. The question of the origin of these resistance genes is difficult to answer, but with continued advancements in ancient genomic analysis, researchers have developed methods of acquiring a more accurate representation of the microbiome associated with our human ancestors by extracting fossilized microbial specimens from dental calculus and directly sequencing the metagenomes. This thesis outlines the production of taxonomic and functional profiles of 20 different human and non-human oral microbiome samples using metagenomics tools originally developed for living individuals, altered for use with ancient microbial specimens. Putative antimicrobial resistance (AMR) genes derived from these profiles were reconstructed, and conserved functional regions were identified. From the data available regarding the human microbiome at a range of time points throughout history, dating back to Neanderthal specimens, it is possible to elucidate relationships between these AMR genes and to better understand the evolutionary trajectory of antibiotic resistance.
7

Mthombeni, Jabulani S. "A comparative bioinformatic analysis of zinc binuclear cluster proteins." Thesis, Rhodes University, 2005. http://hdl.handle.net/10962/d1004064.

Abstract:
Members of the zinc binuclear cluster family are important fungal transcriptional regulators sharing a common DNA binding domain. Dal81p is a pleiotropic zinc binuclear cluster protein involved in the induction of the UGA genes required for the γ-aminobutyrate nitrogen catabolic pathway in Saccharomyces cerevisiae. The zinc binuclear cluster domain is dispensable for function in Dal81p, and little is known about other domains in this protein. The aim of the study was to explore the zinc binuclear cluster protein family using comparative bioinformatics as a complement to biochemical and structural approaches. A database of all zinc binuclear cluster proteins was compiled. A total of 118 zinc binuclear cluster proteins are reported in this work. Thirty-nine previously unidentified zinc binuclear cluster proteins were found. Four homologues of Dal81p were identified by homology searching. Important sequence motifs were identified in the aligned sequences of Dal81p and its homologues. The coiled-coil motif found in the Gal4p zinc binuclear cluster protein could not be identified in Dal81p and its homologues. This suggested that Dal81p does not dimerise through this structural motif as other zinc binuclear cluster proteins do. Solvent-accessible sites that could be phosphorylated by protein kinase C or casein kinase II, and the role of such sites in the possible regulation of Dal81p function, are discussed.
8

Li, Yang. "Understanding lineage-specific biology through comparative genomics." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:23398cc7-8bbe-4f5a-8cd9-1104591400cc.

Abstract:
A major challenge in biology is to identify how different species arose and acquired distinct phenotypic traits. High-throughput sequencing is transforming our understanding of biology by allowing us to study genomes and cellular processes at genome-wide levels. Only a decade after the publication of the first human genome draft, genome assemblies of hundreds of organisms have been produced. Yet genome analysis remains challenging, and advances have lagged far behind our sequencing abilities and other technological advances. The next generation of comparative genomicists must therefore understand, invent and apply a wide range of computational tools in order to study biology in the most efficient manner and to pose the most interesting questions. This thesis spans areas covering evolutionary genomics, gene regulation, and computational methods development. A major aim was to understand how genetic variation contributes to variation in phenotypic traits. This was approached using a large variety of evolutionary and comparative genomics tools. In particular, high-throughput sequencing datasets were analysed to study single-cell transcriptomics, gene duplications, gene architecture evolution, and alternative splicing. Additionally, in cases where off-the-shelf analysis tools did not exist, novel pipelines and programs were designed and implemented to solve algorithmic problems such as scaffolding genome assemblies and short-read mapping onto small exons.
9

Mostowy, Serge. "Comparative genomics of the Mycobacterium tuberculosis complex." Thesis, McGill University, 2005. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=111834.

Abstract:
The study of microbial evolution has recently been accelerated by the advent of comparative genomics, an approach enabling investigation of organisms at the whole-genome level. Tools of comparative genomics, including the DNA microarray, have been applied to bacterial genomes to study heterogeneity in DNA content and to monitor global gene expression. When focused upon the study of microbial pathogens, genome analysis has provided unprecedented insight into their evolution, virulence, and host adaptation. Contributing towards this, I herein explore the evolutionary change affecting genomes of the Mycobacterium tuberculosis complex (MTC), a group of closely related bacterial organisms responsible for causing tuberculosis (TB) across a diverse range of mammals. Despite the introduction nearly a century ago of BCG, a family of live attenuated vaccines intended to prevent human TB, the uncertainty surrounding its usefulness is punctuated by the reality that TB continues to claim over 2 million lives per year. As pursued throughout this thesis, a precise understanding of the differences in genomic content among the MTC, and its impact on gene expression and biological function, promises to expose underlying mechanisms of TB pathogenesis and to suggest rational approaches towards the design of improved diagnostics and vaccines to prevent disease.
With the availability of whole-genome sequence data and tools of comparative genomics, our publications have advanced the recognition that large sequence polymorphisms (LSPs) deleted from Mycobacterium tuberculosis, the causative agent of TB in humans, serve as accurate markers for molecular epidemiologic assessment and phylogenetic analysis. These LSPs have proven informative both for the types of genes that vary between strains and for the molecular signatures that characterize different MTC members. Genomic analysis of atypical MTC has revealed their diversity and adaptability, illuminating previously unexpected directions of MTC evolution. As demonstrated from parallel analysis of BCG vaccines, a phylogenetic stratification of genotypes offers a predictive framework upon which to base future genetic and phenotypic studies of the MTC. Overall, the work presented in this thesis has provided unique insights and lessons having direct clinical relevance towards understanding TB pathogenesis and BCG vaccination.
10

Page, Justin Thomas. "Bioinformatics for the Comparative Genomic Analysis of the Cotton (Gossypium) Polyploid Complex." BYU ScholarsArchive, 2015. https://scholarsarchive.byu.edu/etd/5557.

Abstract:
Understanding the composition, evolution, and function of the cotton (Gossypium) genome is complicated by the joint presence of two genomes in its nucleus (AT and DT genomes). Specifically, read-mapping (a fundamental part of next-generation sequence analysis) cannot adequately differentiate reads as belonging to one genome or the other. These two genomes were derived from progenitor A-genome and D-genome diploids involved in ancestral allopolyploidization. To better understand the allopolyploid genome, we developed PolyCat to categorize reads according to their genome of origin based on homoeo-SNPs that differentiate the two genomes. We re-sequenced the genomes of extant diploid relatives of tetraploid cotton that contain the A1 (Gossypium herbaceum), A2 (Gossypium arboreum), or D5 (Gossypium raimondii) genomes. We identified 24 million SNPs between the A-diploid and D-diploid genomes. These analyses facilitated the construction of a robust index of conserved SNPs between the A-genomes and D-genomes at all detected polymorphic loci. This index can be used by PolyCat to assign reads from an allotetraploid to their genome of origin. Continued characterization of the Gossypium genomes will further enhance our ability to manipulate fiber and agronomic production of cotton. We re-sequenced 34 allotetraploid cotton lines, representing all 7 tetraploid cotton species. The analysis of these genomes, using PolyCat and PolyDog, provides us with the beginnings of a HapMap-like resource for cotton species, including indices of both homoeo-SNPs and allele-SNPs. With this information, we explore the phylogenetic relationships among cotton species, including the newly characterized species G. ekmanianum and G. stephensii. We examine gene conversion both recent and ancient, discovering that recent gene conversion is extremely rare and that ancient gene conversion is far less extensive than previously believed, with many previously identified conversion events more probably being due to autapomorphic SNPs in the descent of diploid relatives. In order to carry out these experiments, many tools for next-generation sequence analysis were developed. These tools, along with PolyCat and PolyDog, comprise the tool suite BamBam.
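The idea of assigning a read to its genome of origin from an index of homoeo-SNPs can be sketched as follows. This is only a minimal illustration under stated assumptions, not PolyCat's actual implementation: the dictionary-based data structures, the function name, and the category labels ("A", "D", "X" for conflicting evidence, "N" for no informative site) are hypothetical choices made for the example.

```python
from collections import Counter


def categorize_read(read_bases, snp_index):
    """Assign a read to the A or D genome using homoeo-SNP evidence.

    read_bases: dict mapping reference position -> base observed in the read.
    snp_index:  dict mapping reference position -> (a_allele, d_allele).
    Both structures are assumptions made for this sketch.
    """
    votes = Counter()
    for pos, base in read_bases.items():
        alleles = snp_index.get(pos)
        if alleles is None:
            continue  # position is not a known homoeo-SNP
        a_allele, d_allele = alleles
        if base == a_allele:
            votes["A"] += 1
        elif base == d_allele:
            votes["D"] += 1
        # A base matching neither allele contributes no evidence.

    if not votes:
        return "N"  # no informative homoeo-SNP covered by the read
    if votes["A"] and votes["D"]:
        return "X"  # conflicting evidence within a single read
    return "A" if votes["A"] else "D"
```

A read covering two indexed positions and matching the A allele at both would be labelled "A" under this scheme, while a read that covers no indexed position stays uncategorized.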