Dissertations / Theses on the topic 'Evolutionary bioinformatics'

Consult the top 50 dissertations / theses for your research on the topic 'Evolutionary bioinformatics.'

1

Chen, Lei. "Construction of Evolutionary Tree Models for Oncogenesis of Endometrial Adenocarcinoma." Thesis, University of Skövde, School of Humanities and Informatics, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-25.

Abstract:
Endometrial adenocarcinoma (EAC) is the fourth leading cause of carcinoma in women worldwide, but not much is known about the genetic factors involved in this complex disease. During the EAC process, it is well known that losses and gains of chromosomal regions do not occur completely at random, but partly through some flow of causality. In this work, we used three different algorithms based on the frequency of genomic alterations to construct 27 tree models of oncogenesis. So far, no study applying pathway models to microsatellite marker data has been reported. Data from genome-wide scans with microsatellite markers were classified into 9 data sets, according to two biological approaches (solid tumor cell and corresponding tissue culture) and three different genetic backgrounds provided by intercrossing the susceptible rat BDII strain and two normal rat strains. As in a previous study, the tree models indicated that three main regions (I, II and III) and two subordinate regions (IV and V) are likely to be involved in EAC development. The tree models also provided further information about these regions, such as their likely order and relationships. A high consistency among tree models and the relationship among p19, Tp53 and Tp53 inducible protein genes provided supportive evidence for the reliability of the results.
2

Birkmeier, Bettina. "Integrating Prior Knowledge into the Fitness Function of an Evolutionary Algorithm for Deriving Gene Regulatory Networks." Thesis, University of Skövde, School of Humanities and Informatics, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-31.

Abstract:
Gene regulation is a major research area in the bioinformatics community. In this thesis, prior knowledge from Gene Ontology, in the form of templates, is integrated into the fitness function of an evolutionary algorithm to predict gene regulatory networks. The resulting multi-objective fitness functions are then tested with MAPK network data taken from KEGG to evaluate their respective performances. The results are presented and analyzed. Although no clear tendency can be observed, the results are promising and can provide motivation for further research in that direction. Different ideas and approaches are therefore suggested for future work.
3

Mongin, Emmanuel. "An evolutionary approach to long-range regulation." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=92333.

Abstract:
Long-range regulatory regions play important roles in the regulation of transcription and are particularly involved in the precise spatio-temporal expression of target genes. Such regions have specific characteristics, among which is their ability to regulate many target genes that can be located up to 1 Mb from the transcription start site. The prediction and functional characterization of such regions remains an open problem. Evolutionary approaches have been developed to detect regulatory regions that are under purifying selection. However, little has been done with regard to the impact of long-range regulation on genome evolution.
This thesis focuses on three different aspects of long-range regulation. (i) First, we develop a method that predicts regions particularly prone to the fixation of evolutionary breakpoints. We discuss the results obtained in the context of long-range regulation and show that this type of regulation is a major factor shaping vertebrate genomes in evolution. (ii) The second project aims at predicting functional interactions between regulatory regions and target genes based on the observation of evolutionary rearrangements in various vertebrate species. We show how this approach produces a biologically meaningful prediction dataset that will be useful to researchers working on regulation. (iii) Third, we focus on the in vivo characterization of regulatory regions. We present a powerful and reliable enhancer detection pipeline composed of an in silico approach to predict putative enhancers and an in vivo method to functionally characterize the expression specificity of predicted regions in the developing medaka fish.
The results presented in this thesis contribute to different areas of research, such as a better understanding of evolutionary dynamics related to evolutionary rearrangements and a better in silico and in vivo characterization of cis-regulatory regions.
4

de, Castro Pereira Vinicius Moll. "Evolutionary dynamics of mobile DNA : bioinformatics and molecular case studies." Thesis, Imperial College London, 2006. http://hdl.handle.net/10044/1/11993.

5

McGee, Kate. "Evolutionary Factors Shaping Haplotype and Nucleotide Diversity in Humans and Malaria." NCSU, 2008. http://www.lib.ncsu.edu/theses/available/etd-01102008-104027/.

Abstract:
Cheaper and more rapid DNA sequencing has led to the accumulation of large amounts of genetic data and has fueled the development of new methods to analyze these data. Using population genetics theory and computational methods, we can explore the evolutionary forces that shape genetic variation within and among populations of humans and malaria parasites. Demographic events such as population size change influence current patterns of genetic variation. Accounting for the demographic history of a population is critical in the interpretation of population genetic analyses, particularly in detecting regions under selection and in making inferences about linkage disequilibrium. Characterizing how recombination rates evolve is critical for the efficient design of association studies and, in turn, for understanding the genetics behind complex phenotypes. In malaria parasites, recombination is a key element in the creation of a wide array of antigens, which help invade host cells. We examine patterns of genetic variation in humans and malaria and explore how demographic history and recombination rates affect these patterns.
6

Nystedt, Björn. "Evolutionary Processes and Genome Dynamics in Host-Adapted Bacteria." Doctoral thesis, Uppsala universitet, Molekylär evolution, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-107720.

Abstract:
Many bacteria live in close association with other organisms such as plants and animals, with important implications for both health and disease. This thesis investigates bacteria that are well adapted to live inside an animal host, and describes the molecular evolutionary processes underlying host-adaptation, based on bacterial genome comparisons. Insect-transmitted bacteria of the genus Bartonella infect the red blood cells of mammals, and we investigate host adaptation and genome evolution in this genus. In Bartonella, many host-interaction systems are encoded in a highly variable chromosomal segment previously shown to be amplified and packaged into bacteriophage particles. Among all genes imported into the Bartonella ancestor, we identify the short gene cluster encoding these phage particles as the most evolutionarily conserved, indicating a strong selective advantage and a role in niche adaptation. We also provide an overview of the remarkable evolutionary dynamics of type IV and type V secretion systems, including a detailed analysis of the type IV secretion system trw. Our results highlight the importance of recombination and gene conversion in the evolution of host-adaptation systems, and reveal how these mutational mechanisms result in strikingly different outcomes depending on the selective constraints. In the insect endosymbionts Buchnera and Blochmannia, we show that genes frameshifted at poly(A) tracts can remain functional due to transcriptional slippage. Selection against poly(A) tracts is very inefficient in these genomes compared to other bacteria, and we discuss why this can lead to increased rates of gene loss. Using the human pathogen Helicobacter pylori as a model, we provide a deeper understanding of why highly expressed genes evolve slowly. This thesis emphasizes the power of using complete genome sequences to study evolutionary processes. In particular, we argue that knowledge about the complex evolution of duplicated gene segments is crucial to understand host adaptation in bacteria.
7

Loewe, Laurence. "Evolutionary bioinformatics: predicting genetic stability of asexual genomes by global computing." [S.l.] : [s.n.], 2003. http://deposit.ddb.de/cgi-bin/dokserv?idn=969894201.

8

Young, Adrian. "The Evolutionary Feedback between Genetic Conflict and Genome Architecture." Thesis, Harvard University, 2014. http://dissertations.umi.com/gsas.harvard:11482.

Abstract:
The advent of separate sexes set the stage for dramatic evolutionary innovation across a wide range of taxa. Much of this innovation is attributable to divergent evolutionary interests between now distinct sub-populations of males and females. Trade-offs inherent to these divergent life histories, coupled with a common genome, conspire to limit natural selection's ability to simultaneously maximize the fitness of both sexes. Such conflict between the sexes has therefore largely shaped the history of the genomes of sexual taxa. However, various aspects of the genomic environment (including genes' spatial distributions, abilities to regulate their expression, and rates of recombination) also feed back to influence future sex-specific evolutionary trajectories. Using various genomic resources and transcriptome sequences for the lab mouse, I test several theoretical predictions regarding this feedback between genetic conflict and features of genomic organization.
9

Jin, Bo. "Evolutionary Granular Kernel Machines." Digital Archive @ GSU, 2007. http://digitalarchive.gsu.edu/cs_diss/15.

Abstract:
Kernel machines such as Support Vector Machines (SVMs) have been widely used in various data mining applications with good generalization properties. The performance of SVMs on nonlinear problems is highly affected by the choice of kernel function. The complexity of SVM training is mainly related to the size of the training dataset. How to design a powerful kernel, how to speed up SVM training and how to train SVMs with millions of examples are still challenging problems in SVM research. To address these problems, powerful and flexible kernel trees called Evolutionary Granular Kernel Trees (EGKTs) are designed to incorporate prior domain knowledge. The Granular Kernel Tree Structure Evolving System (GKTSES) is developed to evolve the structures of Granular Kernel Trees (GKTs) without prior knowledge. A voting scheme is also proposed to reduce the prediction deviation of GKTSES. To speed up EGKT optimization, a master-slave parallel model is implemented. To help SVMs handle large-scale data mining, a Minimum Enclosing Ball (MEB) based data reduction method is presented, and a new MEB-SVM algorithm is designed. All these kernel methods are designed based on Granular Computing (GrC). In general, Evolutionary Granular Kernel Machines (EGKMs) are investigated to optimize kernels effectively, speed up training greatly and mine huge amounts of data efficiently.
10

Hudson, Corey M. "Informatic approaches to evolutionary systems biology." Thesis, University of Missouri - Columbia, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3577951.

Abstract:
The sheer complexity of evolutionary systems biology requires us to develop more sophisticated tools for analysis, as well as more probing and biologically relevant representations of the data. My research has focused on three aspects of evolutionary systems biology. First, I ask whether a gene's position in the human metabolic network affects the degree to which natural selection prunes variation in that gene. Using a novel orthology inference tool that uses both sequence similarity and gene synteny, I inferred orthologous groups of genes for the full genomes of 8 mammals. With these orthologs, I estimated the selective constraint (the ratio of non-synonymous to synonymous nucleotide substitutions) on 1190 (or 80.2%) of the genes in the metabolic network using a maximum likelihood model of codon evolution and compared this value to the betweenness centrality of each enzyme (a measure of that enzyme's relative global position in the network). Second, I have focused on the evolution of metabolic systems in the presence of gene and genome duplication. I show that increases in a particular gene's copy number are correlated with limiting metabolic flux in the reaction associated with that gene. Finally, I have investigated the proliferative cell programs present in 6 different cancers (breast, colorectal, gastrointestinal, lung, oral squamous and prostate cancers). I found an overabundance of genes that share expression between cancer and embryonic tissue, and that these genes form modular units within regulatory, protein-interaction, and metabolic networks. This is despite the fact that these genes, as well as the proteins they encode and the reactions they catalyze, show little overlap among cancers, suggesting parallel independent reversion to an embryonic pattern of gene expression.
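For readers unfamiliar with the two quantities compared in this abstract, their usual textbook definitions are sketched below. The symbols are conventional notation assumed here rather than taken from the thesis, which may normalise either quantity differently.

```latex
\omega = \frac{d_N}{d_S},
\qquad
C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}
```

Here $d_N$ and $d_S$ are the numbers of non-synonymous and synonymous substitutions per non-synonymous and synonymous site, $\sigma_{st}$ is the number of shortest paths between network nodes $s$ and $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through node $v$. Values of $\omega$ well below 1 are commonly read as a sign of purifying selection.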
11

Kandoth, Cyriac. "A quantitative study of gene identification techniques based on evolutionary rationales." Diss., Rolla, Mo. : University of Missouri-Rolla, 2007. http://scholarsmine.mst.edu/thesis/pdf/Kandoth_09007dcc804902b3.pdf.

Abstract:
Thesis (M.S.)--University of Missouri--Rolla, 2007.
Vita. The entire thesis text is included in file. Title from title screen of thesis/dissertation PDF file (viewed February 6, 2008). Includes bibliographical references (p. 36).
12

Thomas-Chollier, Morgane. "Evolutionary study of the Hox gene family with matrix-based bioinformatics approaches." Doctoral thesis, Universite Libre de Bruxelles, 2008. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210457.

Abstract:
Hox transcription factors are extensively investigated in diverse fields of molecular and evolutionary biology. Hox genes belong to the family of homeobox transcription factors characterised by a 60-amino-acid region called the homeodomain. These genes are evolutionarily conserved and play crucial roles in the development of animals. In particular, they are involved in the specification of segmental identity and in tetrapod limb differentiation. In vertebrates, this family of genes can be divided into 14 groups of homology. Common methods to classify Hox proteins focus on the homeodomain. Classification is, however, hampered by the high conservation of this short domain. Since phylogenetic tree reconstruction is time-consuming, it is not suitable for classifying the growing number of Hox sequences. The first goal of this thesis is therefore to design an automated approach to classify vertebrate Hox proteins into their groups of homology. This approach classifies Hox proteins on the basis of their scores for a combination of protein generalised profiles. The resulting program, HoxPred, combines predictive accuracy and time efficiency. We used this program to detect and classify Hox genes in several teleost fish genomes. In particular, it allowed us to clarify the evolutionary history of the HoxC1a genes in teleosts. Overall, HoxPred could efficiently contribute to the bioinformatics toolbox commonly used to annotate vertebrate Hox sequences. This program was then evaluated in non-vertebrate species. Although not intended for the classification of Hox proteins in distantly related species, HoxPred showed a high accuracy in bilaterians. It has also given insights into the evolutionary relationships between bilaterian posterior Hox genes, which are notoriously difficult to classify with phylogenetic trees.
As transcription factors, Hox proteins regulate target genes by specifically binding DNA on cis-regulatory elements. Only a few of these target genes have been identified so far. The second goal of this work was to evaluate whether it is possible to apply computational approaches to detect Hox cis-regulatory elements in genomic sequences. Regulatory Sequence Analysis Tools (RSAT) is a suite of bioinformatics tools dedicated to the detection of cis-regulatory elements in genomes. We participated in the development of matrix-based pattern matching approaches in RSAT. After having performed a statistical validation of the pattern-matching scores, we focused on a case study based on the vertebrate HoxB1 protein, which binds DNA with its cofactors Pbx and Meis. This study aimed at predicting combinations of cis-regulatory elements for these three transcription factors.
Doctorate in Sciences
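Matrix-based pattern matching, as mentioned in this abstract, generally means sliding a position-specific scoring matrix along a sequence and scoring each window. The following is a minimal sketch of that general idea, not RSAT's implementation: the uniform background, the pseudocount, the function names and the 4-bp count matrix are all assumptions made for illustration.

```python
import math

def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log-odds weights (uniform background assumed)."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix

def scan(sequence, matrix):
    """Score every window of the sequence; return (start position, score) pairs."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits

# Hypothetical counts for a 4-bp motif; this is not a real Hox, Pbx or Meis matrix.
counts = [{"T": 9, "A": 1}, {"G": 10}, {"A": 10}, {"T": 8, "C": 2}]
matrix = log_odds_matrix(counts)
best = max(scan("CCTGATCC", matrix), key=lambda hit: hit[1])
```

In practice a score threshold or a P-value derived from the score distribution decides which windows are reported as predicted sites; the abstract's "statistical validation of the pattern-matching scores" concerns that step, which this sketch leaves out.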
13

Kelly, Libusha. "Functional hotspots revealed by mutational, evolutionary, and structural characterization of ABC transporters." Diss., 2008. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3324617.

14

Nettelblad, Jessica. "Haploid Selection in Animals." Thesis, Uppsala universitet, Evolutionsbiologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-362821.

Abstract:
Haploid selection in animal sperm is a somewhat controversial topic, but recent evidence might shed experimental light on the matter. This thesis investigates the possibility to detect any genetic selection in an artificial setting for zebrafish sperm from a single individual. I analyse pooled data acquired from whole-genome sequencing for two distinct groups of short- and long-lived sperm, trying to identify shifts in allele frequencies. I augment this by designing an accurate computer simulation of selection that manipulates selection strength and takes biological aspects like linkage and sequence coverage into account. This allows large-scale testing and the generation of null distributions for any test metric. The main conclusion is that selection has to be extremely strong to be detectable unless one would explicitly account for genetic linkage, as opposed to the straightforward per-marker approaches that formed the initial basis for our analyses.
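The idea of generating a null distribution for a per-marker test metric can be illustrated with a small simulation. This sketch is not the simulator described in the thesis: it assumes a single biallelic marker, binomial sampling of reads, and no linkage, and every function name and parameter value is hypothetical.

```python
import random

def sample_reads(freq, coverage, rng):
    """Number of reads carrying the focal allele out of `coverage` reads (binomial sampling)."""
    return sum(rng.random() < freq for _ in range(coverage))

def freq_shift(freq_short, freq_long, coverage, rng):
    """Observed allele-frequency difference between the two sperm pools at one marker."""
    short = sample_reads(freq_short, coverage, rng) / coverage
    long_ = sample_reads(freq_long, coverage, rng) / coverage
    return long_ - short

def null_distribution(coverage, n_reps, rng, freq=0.5):
    """Shifts expected without selection: both pools share the same true frequency.

    freq=0.5 assumes a heterozygous site in a single diploid male.
    """
    return [freq_shift(freq, freq, coverage, rng) for _ in range(n_reps)]

def empirical_p(observed, null):
    """Two-sided empirical p-value with the usual +1 correction."""
    extreme = sum(abs(x) >= abs(observed) for x in null)
    return (extreme + 1) / (len(null) + 1)

rng = random.Random(1)
null = null_distribution(coverage=50, n_reps=2000, rng=rng)
```

Because the markers are treated independently here, the sketch corresponds to the per-marker approach that the abstract reports as having low power; accounting for linkage would require simulating haplotypes rather than single sites.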
15

Muhire, Brejnev Muhizi. "The evolutionary impacts of secondary structures within genomes of eukaryote-infecting single-stranded DNA viruses." Doctoral thesis, University of Cape Town, 2015. http://hdl.handle.net/11427/16933.

Abstract:
Includes bibliographical references.
Secondary structures forming through base-pairing in virus genomes have been proven to regulate several processes during viral replication cycles, including genome replication, transcription, post-transcriptional activities, protein synthesis, genome packaging, generation of viral sub-genomes and evasion of host-cell immune responses. Although computational DNA/RNA folding methods based on free energy minimisation approaches are capable of predicting structures that form within virus genomes, these methods are not entirely accurate. Notably, many of the structures that are accurately predicted will likely have no biological importance within the genomes in which they reside because even randomly generated single-stranded RNA/DNA sequences will form stable secondary structures. Nevertheless, with additional genome evolution analyses involving the detection of natural selection, sequence co-evolution, and genetic recombination, it is possible to both validate the existence of, and infer the biological importance of, computationally predicted structures. Here I implement and deploy free bioinformatics tools to (1) automate the classification of nucleotide and protein sequences into datasets useful for downstream molecular evolution analyses; (2) improve the accuracy of computational virus-genome-scale secondary structure prediction; (3) enable the identification of biologically relevant secondary structures using signals of purifying selection, coevolution and recombination within aligned sequence datasets; and (4) enable efficient visualisation of structural and selection data for better characterisation of individual secondary structural elements. Using these tools I carried out large-scale studies that predicted and characterised novel functional secondary structures that potentially regulate transcription, translation, gene splicing, and replication within the genomes of eukaryote-infecting ssDNA viruses (Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae). I show that purifying selection tends to be stronger at base-paired sites than it is at unpaired sites and, wherever mutations are tolerable within paired regions, I demonstrate that there exist strong associations between base-pairing and complementary coevolution. Finally, I show that the recombinant genomes of some, but not all, eukaryote-infecting ssDNA virus groups display weak evidence of both homologous and non-homologous recombination breakpoints preferentially occurring at genome sites that minimally disrupt secondary structures. Altogether, these results suggest that natural selection acting to maintain important biologically functional secondary structural elements has been a major process during the evolution of eukaryote-infecting ssDNA viruses.
16

Liu, Tsunglin. "Physics and bioinformatics of RNA." Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1141407392.

17

Pridgeon, Carey. "Diverse applications of evolutionary computation in bioinformatics : hypermotifs and gene regulatory network inference." Thesis, University of Exeter, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.479210.

18

Parmar, Victor. "Predicting transcription factor binding sites using phylogenetic footprinting and a probabilistic framework for evolutionary turnover." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=87000.

Abstract:
Identifying the genomic locations of transcription-factor binding sites (TFBSs), particularly in higher eukaryotic genomes, has been an enormous challenge. Computational methods that identify sequence conservation between related genomes have been the most successful, since sites found in such highly conserved regions are more likely to be functional, i.e. to be bound and to regulate protein production. In this thesis, we present such a probabilistic algorithm for predicting TFBSs, which also takes evolutionary turnover into account. Our algorithm is validated via simulations, and the results of its application to ChIP-chip data are presented.
19

Mota, Merlo Marina. "Evolutionary evidence of chromosomal rearrangements through SNAP : Selection during Niche AdaPtation." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-449171.

Abstract:
The Selection during Niche AdaPtation (SNAP) hypothesis aims to explain how the gene order in bacterial chromosomes can change as the result of bacteria adapting to a new environment. It starts with a duplication of a chromosomal segment that includes some genes providing a fitness advantage. The duplication of these genes is preserved by positive selection. However, the rest of the duplicated segment accumulates mutations, including deletions. This results in a rearranged gene order. In this work, we develop a method to identify SNAP in bacterial chromosomes. The method was tested in Salmonella and Bartonella genomes. First, each gene was assigned an orthologous group (OG). For each genus, single-copy panorthologs (SCPos), the OGs that were present in most of the genomes as one copy, were targeted. If these SCPos were present twice or more in a genome, they were used to build duplicated regions within said genome. The resulting regions were visualized and their possible compatibility with the SNAP hypothesis was discussed. Even though the method proved to be effective on Bartonella genomes, it was less efficient on Salmonella. In addition, no strong evidence of SNAP was detected in Salmonella genomes.
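The first steps of the method described in this abstract (finding single-copy panorthologs, then flagging the genomes in which they occur more than once) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's pipeline: the input format, the function names, the toy genomes and the 0.6 threshold standing in for "most of the genomes" are all hypothetical.

```python
from collections import Counter

def single_copy_panorthologs(genomes, min_fraction=0.9):
    """OGs present exactly once in at least `min_fraction` of the genomes.

    `genomes` maps a genome name to the ordered list of OG labels along its chromosome.
    """
    single_copy_hits = Counter()
    for og_list in genomes.values():
        for og, n in Counter(og_list).items():
            if n == 1:
                single_copy_hits[og] += 1
    threshold = min_fraction * len(genomes)
    return {og for og, hits in single_copy_hits.items() if hits >= threshold}

def duplicated_scpos(genomes, scpos):
    """For each genome, the SCPos occurring two or more times, with their gene positions."""
    result = {}
    for name, og_list in genomes.items():
        positions = {}
        for index, og in enumerate(og_list):
            if og in scpos:
                positions.setdefault(og, []).append(index)
        result[name] = {og: pos for og, pos in positions.items() if len(pos) >= 2}
    return result

# Toy input: genome_C carries a second copy of the og2-og3 segment.
genomes = {
    "genome_A": ["og1", "og2", "og3", "og4"],
    "genome_B": ["og1", "og2", "og3", "og4"],
    "genome_C": ["og1", "og2", "og3", "og2", "og3", "og4"],
}
scpos = single_copy_panorthologs(genomes, min_fraction=0.6)
candidates = duplicated_scpos(genomes, scpos)
```

Adjacent duplicated SCPos, such as og2 and og3 at positions 1-2 and 3-4 in the toy genome_C, would then be merged into candidate duplicated regions; that merging step and the visualisation are not shown.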
20

Arnold, Brian. "Evolutionary Dynamics of a Multiple-Ploidy System in Arabidopsis Arenosa." Thesis, Harvard University, 2015. http://nrs.harvard.edu/urn-3:HUL.InstRepos:17467222.

Abstract:
Whole-genome duplication (WGD), which leads to polyploidy, has been implicated in speciation and biological novelty. In plants, many species have experienced historical bouts of WGD or exhibit extant ploidy variation, which is likely representative of an early stage in the evolution of new polyploid lineages. To elucidate the evolutionary dynamics of autopolyploids and species with multiple ploidy levels, I develop population genetic theory in Chapter 2 that I use in Chapter 4 to extract information about the evolutionary history of Arabidopsis arenosa, a European wildflower that has diploid and autotetraploid populations. Chapter 3 involves a separate project exploring the ascertainment bias in restriction site associated DNA sequencing (RADseq). In Chapter 2, I develop coalescent models for autotetraploid species with tetrasomic inheritance and show that the ancestral genetic process in a large population without recombination may be approximated using Kingman's standard coalescent, with a coalescent effective population size 4N. Using this result, I was able to use existing coalescent simulation programs to show in Chapter 4 that, in A. arenosa, a widespread autotetraploid race arose from a single ancestral population. This autopolyploidization event was not accompanied by immediate reproductive isolation between diploids and tetraploids in this species, as I find evidence of extensive interploidy admixture between diploid and tetraploid populations that are geographically close. To draw these conclusions about population history in Chapter 4, I used a reduced representation genome-sequencing approach based on restriction digestion. However, I was bothered by the possibility that sampling chromosomes based on restriction digestion may introduce a bias in allele frequency estimation due to polymorphisms in restriction sites. To explore the effects of this nonrandom sampling and its sensitivity to different evolutionary parameters, we developed a coalescent-simulation framework in Chapter 3 to mimic the biased recovery of chromosomes in RADseq experiments. We show that loci with missing haplotypes have estimated diversity statistic values that can deviate dramatically from true values and are also enriched for particular genealogical histories. These results urge caution when applying this technique to make population genetic inferences and helped me tailor analyses in Chapter 4 to accommodate this particular method of DNA sequencing.
Biology, Organismic and Evolutionary
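The statement that the autotetraploid genealogy is approximated by Kingman's coalescent with effective size 4N can be written out as below. This is a sketch under one assumed reading, namely that 4N is the number of chromosome copies in N autotetraploid individuals and that time is rescaled by that number; the symbols $T_k$, $\mu$ and $\pi$ are standard notation introduced here, not taken from the thesis.

```latex
T_k \sim \mathrm{Exp}\!\left(\binom{k}{2}\right) \ \text{(time in units of } 4N \text{ generations)},
\qquad
\mathbb{E}[T_2] = 4N \ \text{generations},
\qquad
\mathbb{E}[\pi] = 2\mu\,\mathbb{E}[T_2] = 8N\mu
```

Here $T_k$ is the waiting time while $k$ lineages remain, $\mu$ is the per-site mutation rate per generation, and $\pi$ is the expected number of pairwise differences per site under an infinite-sites model. Under this reading, existing coalescent simulators can be reused by supplying 4N as the population-size parameter, which is the practical consequence the abstract describes.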
21

Blischak, Paul David. "Developing Computational Tools for Evolutionary Inferences in Polyploids." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1531400134548368.

22

Basile, Walter. "Orphan Genes Bioinformatics : Identification and properties of de novo created genes." Doctoral thesis, Stockholms universitet, Institutionen för biokemi och biofysik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-149168.

Abstract:
Even today, many genes have no known homolog. These "orphans" are found in all species, from viruses to prokaryotes and eukaryotes. For a portion of these genes, we might simply not yet have enough data to find homologs. Some of them are imported from taxonomically distant organisms via lateral transfer; others have homologs that have mutated beyond the point of recognition. However, a sizeable fraction of orphan genes is unambiguously created via "de novo" mechanisms. The study of such novel genes can contribute to our understanding of the emergence of functional novelty and the adaptation of species to new ecological niches. In this work, we first survey the field of orphan studies and illustrate some of the common issues. Next, we analyze some of the intrinsic properties of orphan proteins, including secondary structure elements and intrinsic structural disorder; specifically, we observe that in young proteins the relationship between these properties and the G+C content of their coding sequence is stronger than in older proteins. We then tackle some of the methodological problems often found in orphan studies. We find that using evolutionarily close species and sensitive, state-of-the-art homology recognition methods is instrumental to the identification of a set of orphans enriched in de novo created ones. Finally, we compare how intrinsic disorder is distributed in bacteria versus eukaryotes. Eukaryotic proteins are longer and more disordered; the difference is to be attributed primarily to eukaryotic-specific domains and linker regions. In these sections of the proteins, a higher frequency of the disorder-promoting amino acid serine can be observed in eukaryotes.
At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 3: Submitted. Paper 4: Manuscript.
23

Cury, Jean. "Evolutionary genomics of conjugative elements and integrons." Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCB062/document.

24

Chapman, Samuel D. "Applying Evolutionary Computation and Ensemble Approaches to Protein Contact Map and Protein Function Determination." Thesis, North Carolina Agricultural and Technical State University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10191042.

Abstract:
Proteins are important biological molecules that perform many different functions in an organism. They are composed of sequences of amino acids that play a large part in determining both their structure and function. In turn, the structures of proteins are related to their functions. Using computational methods for protein study is a popular approach, offering the possibility of being faster and cheaper than experimental methods. These software-based methods are able to take information such as the protein sequence and other empirical data and output predictions such as protein structure or function.
In this work, we have developed a set of computational methods for protein structure prediction and protein function prediction. For protein structure prediction, we use the evolution of logic circuits to produce logic circuit classifiers that predict the protein contact map of a protein based on high-dimensional feature data. The diversity of the evolved logic circuits allows for the creation of ensembles of classifiers, and the answers from these ensembles are combined to produce more accurate answers. We also apply a number of ensemble algorithms to our results.
Our protein function prediction work is based on the use of six existing computational protein function prediction methods, of which four were optimized for use on a benchmark dataset, along with two others developed by collaborators. We used a similar ensemble framework, combining the answers from the six methods into an ensemble using an algorithm, CONS, that we helped develop.
Our contact map prediction study demonstrated that it was possible to evolve logic circuits for this purpose, and that ensembles of the classifiers improved performance. The results fell short of state-of-the-art methods, and additional ensemble algorithms failed to improve the performance. However, the method was also able to work as a feature detector, discovering salient features from the high-dimensional input data, a computationally intractable problem. In our protein function prediction work, the combination of methods similarly led to a robust ensemble. The CONS ensemble, while not performing as well as the best individual classifier in absolute terms, was nevertheless very close in terms of performance. More intriguingly, there were many specific cases where it performed better than any single method, indicating that this ensemble provided valuable information not captured by any single method.
To our knowledge, this is the first time the evolution of logic circuits has been used in any bioinformatics problem, and it is expected that as the method becomes more developed, results will improve. It is also expected that the feature-detection aspect of this method can be used in other studies. The function prediction study also marks, to our knowledge, the most comprehensive ensemble classification for protein function prediction. Finally, we expect that the ensemble classification methods used and developed in our protein structure and function work here will pave the way towards stronger ensemble predictors in the future.
25

Peterson, Mark Erik. "Evolutionary constraints on the structural similarity of proteins and applications to comparative protein structure modeling." Diss., Search in ProQuest Dissertations & Theses. UC Only, 2008. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3339202.

26

Gamalielsson, Jonas. "Models for Protein Structure Prediction by Evolutionary Algorithms." Thesis, University of Skövde, Department of Computer Science, 2001. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-623.

Abstract:
Evolutionary algorithms (EAs) have been shown to be competent at solving complex, multimodal optimisation problems in applications where the search space is large and poorly understood. EAs are therefore among the most promising classes of algorithms for solving the Protein Structure Prediction Problem (PSPP). The PSPP is the problem of deriving the 3D structure of a protein given only its sequence of amino acids. This dissertation defines, evaluates and shows limitations of simplified models for solving the PSPP. These simplified models are off-lattice extensions to the lattice HP model, which has been proposed and is claimed to possess some of the properties of real protein folding, such as the formation of a hydrophobic core. Lattice models usually model a protein at the amino acid level of detail, use simple energy calculations and are used mainly for search algorithm development. Off-lattice models usually model the protein at the atomic level of detail, use more complex energy calculations and may be used for comparison with real proteins. The idea is to combine the fast energy calculations of lattice models with the increased spatial possibilities of an off-lattice environment, allowing for comparison with real protein structures. A hypothesis is presented which claims that a simplified off-lattice model which considers other amino acid properties apart from hydrophobicity will yield simulated structures with lower Root Mean Square Deviation (RMSD) to the native fold than a model only considering hydrophobicity. The hypothesis holds for four of five tested short proteins with a maximum of 46 residues. The best average RMSD for any model tested is above 6 Å, which is too high for useful structure prediction and excludes significant resemblance between native and simulated structure. Hence, the tested models do not contain the necessary biological information to capture the complex interactions of real protein folding. It is also shown that the EA itself is competent and can produce near-native structures if given a suitable evaluation function. Hence, EAs are useful for eventually solving the PSPP.
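For orientation, the lattice HP model that these off-lattice models extend can be sketched in a few lines. The example below is a minimal illustration, not the dissertation's code. It assumes the common 2D square-lattice formulation in which every pair of hydrophobic (H) residues that are lattice neighbours but not chain neighbours contributes -1 to the energy; the function name `hp_energy` and the coordinate representation are illustrative choices.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D lattice HP conformation.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    coords:   list of (x, y) lattice points, one per residue, in chain order.
    Returns minus the number of H-H contacts between non-consecutive residues.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {point: i for i, point in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


if __name__ == "__main__":
    folded = [(0, 0), (1, 0), (1, 1), (0, 1)]      # chain bent into a square
    extended = [(0, 0), (1, 0), (2, 0), (3, 0)]    # straight chain
    print(hp_energy("HPPH", folded), hp_energy("HPPH", extended))
```

An evolutionary algorithm in this setting would search over conformations and use a function like this as (part of) its evaluation function, which is why the energy calculation needs to be cheap.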
27

Chatzou, Maria 1985. "Large-scale comparative bioinformatics analyses." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/587086.

Abstract:
One of the main and most recent challenges of modern biology is to keep up with the growing amount of biological data coming from next generation sequencing technologies. Keeping up with the growing volumes of experiments will be the only way to make sense of the data and extract actionable biological insights. Large-scale comparative bioinformatics analyses are an integral part of this procedure. When doing comparative bioinformatics, multiple sequence alignments (MSAs) are by far the most widely used models, as they provide a unique insight into the accurate measure of sequence similarities and are therefore instrumental to revealing genetic and/or functional relationships among evolutionarily related species. Unfortunately, the well-established limitation of MSA methods when dealing with very large datasets potentially compromises all downstream analysis. In this thesis I set out the current relevance of multiple sequence aligners, and I show how scaling them up is leading to serious numerical stability issues and how these issues affect phylogenetic tree reconstruction. For this purpose, I have developed two new methods: MEGA-Coffee, a large-scale aligner, and Shootstrap, a novel bootstrapping measure incorporating MSA instability with branch support estimates when computing trees. The large amount of computation required by these two projects was carried out using Nextflow, a new computational framework that I have developed to improve computational efficiency and reproducibility of large-scale analyses like the one carried out in the context of these studies.
28

Al-Ouran, Rami. "Motif Selection: Identification of Gene Regulatory Elements using Sequence Coverage Based Models and Evolutionary Algorithms." Ohio University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1449003717.

29

Fiedler, Lindsey J. "Evolutionary Dynamics of Influenza Type B in the Presence of Vaccination: An Ecological Study." Scholar Commons, 2019. https://scholarcommons.usf.edu/etd/7786.

Abstract:
Understanding the evolutionary dynamics of influenza type B in human hosts is a public health concern as we strive to minimize the disease burden in seasonal epidemics. Vaccination is considered the best defense against contracting influenza, and everyone over the age of 6 months is advised to get vaccinated before each season. The effect that vaccine-acquired immunity has on the evolution of influenza B remains unclear. In the U.S., vaccine uptake is irregular across the states, and the differing coverages present an opportunity to study how vaccination influences viral evolution. This thesis analyzes the evolutionary patterns of influenza B in the presence of vaccine-induced selective pressure. Using an ecological study design, estimates of statewide vaccination coverage from the Centers for Disease Control and Prevention were related to influenza B sequence data. The phylogenies and the frequencies of single nucleotide polymorphisms for high and low coverage states across three influenza seasons were compared to evaluate whether there was evidence of vaccination influencing evolution. Overall, the results show that vaccination does not significantly impact the evolutionary dynamics of influenza B, with both high and low coverage states showing interspersed phylogenetic trees and similar antigenic diversities.
30

Rendel, Mark D. "The evolutionary dynamics of neutral networks : lessons from RNA." Thesis, University of Oxford, 2008. http://ora.ox.ac.uk/objects/uuid:85107ca7-fada-4582-95e7-17b5bbb038cd.

Abstract:
The evolutionary options of a population are strongly influenced by the availability of adaptive mutants. In this thesis, I use the concept of neutral networks to show that neutral drift can actually increase the accessibility of adaptive mutants, and therefore facilitate adaptive evolutionary change. Neutral networks are groups of unique genotypes which all code for the same phenotype, and are connected by simple point mutations. I calculate the size and shape of the networks in a small but exhaustively enumerated space of RNA genotypes by mapping the sequences to RNA secondary structure phenotypes. The qualitative results are similar to those seen in many other genotype–phenotype map models, despite some significant methodological differences. I show that the boundary of each network has single point-mutation connections to many more phenotypes than the average individual genotype within that network. This means that paths involving a series of neutral point-mutation steps across a network can allow evolution to adaptive phenotypes which would otherwise be extremely unlikely to arise spontaneously. This can be likened to walking along a flat ridge in an adaptive landscape, rather than traversing or jumping across a lower fitness valley. Within this model, when a genotype is made up of just 10 bases, the mean neutral path length is 1.88 point mutations. Furthermore, the map includes some networks that are so convoluted that the path through the network is longer than the direct route between two sequences. A minimum length adaptive walk across the genotype space usually takes as many neutral steps as adaptive ones on its way to the optimum phenotype. Finally, I show that the shape of a network can have a very important effect on the number of generations it takes a population to drift across it, and that the more routes between two sequences, the fewer generations required for a population to find an advantageous sequence. 
My conclusion is that, within the RNA map at least, the size, shape and connectivity of neutral networks all have a profound effect on the way that sequences change and populations evolve, and by not considering them, we risk missing an important evolutionary mechanism.
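The notion of a neutral network and its boundary can be made concrete with a small sketch. This is an illustration under stated assumptions, not the thesis's method: the thesis maps sequences to RNA secondary structures, whereas `toy_phenotype` below is a deliberately trivial, hypothetical stand-in (it only asks whether the first and last bases can form a Watson-Crick pair) so that the example runs without a folding program. The names `point_mutants` and `neutral_network` are likewise illustrative.

```python
from collections import deque

BASES = "ACGU"
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def point_mutants(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for other in BASES:
            if other != base:
                yield seq[:i] + other + seq[i + 1:]


def toy_phenotype(seq):
    """Hypothetical stand-in for RNA folding: can the two end bases pair?"""
    return (seq[0], seq[-1]) in WATSON_CRICK


def neutral_network(start, phenotype):
    """Breadth-first search over point mutations that keep the phenotype.

    Returns (network, boundary): the genotypes reachable from start through
    neutral single mutations, and the set of other phenotypes that are one
    mutation away from some member of the network.
    """
    target = phenotype(start)
    network = {start}
    boundary = set()
    queue = deque([start])
    while queue:
        genotype = queue.popleft()
        for mutant in point_mutants(genotype):
            mutant_phenotype = phenotype(mutant)
            if mutant_phenotype == target:
                if mutant not in network:
                    network.add(mutant)
                    queue.append(mutant)
            else:
                boundary.add(mutant_phenotype)
    return network, boundary


if __name__ == "__main__":
    network, boundary = neutral_network("AAU", toy_phenotype)
    print(sorted(network), boundary)
```

In this toy map, sequences that share a phenotype are not all mutually reachable: moving from an A-U end pair to a G-C end pair needs two simultaneous changes, so genotypes with the same phenotype fall into separate networks. Replacing `toy_phenotype` with a real folding routine would be needed to reproduce anything like the results reported in the abstract.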
31

Pajak, Maciej. "Evolutionary conservation and diversification of complex synaptic function in human proteome." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/31108.

Abstract:
The evolution of synapses from early proto-synaptic protein complexes in unicellular eukaryotes to sophisticated machines comprising thousands of proteins parallels the emergence of finely tuned synaptic plasticity, a molecular correlate for memory and learning. Phenotypic change in organisms is ultimately the result of evolution of their genotype at the molecular level. Selection pressure is a measure of how changes in genome sequence that arise through naturally occurring processes in populations are fixed or eliminated in subsequent generations. Inferring phylogenetic information about proteins, such as the variation of selection pressure across coding sequences, can provide valuable information not only about the origin of proteins, but also about the contribution of specific sites within proteins to their current roles within an organism. Recent evolutionary studies of synaptic proteins have generated attractive hypotheses about the emergence of finely tuned regulatory mechanisms in the post-synaptic proteome related to learning; however, these analyses are relatively superficial. In this thesis, I establish a scalable molecular phylogenetic modelling framework based on three new inference methodologies to investigate temporal and spatial aspects of selection pressure changes for the whole human proteome using protein orthologs from up to 68 taxa. Temporal modelling of evolutionary selection pressure reveals informative features and patterns for the entire human proteome and identifies groups of proteins that share distinct diversification timelines. Multi-ontology enrichment analysis of these gene cohorts was used to aid biological interpretation, but these approaches are statistically underpowered and do not capture a clear picture of the emergence of synaptic plasticity. 
Subsequent pathway-centric analysis of key synaptic pathways extends the interpretation of temporal data and allows for revision of previous hypotheses about the evolution of complex synaptic function. I proceed to integrate inferred selection pressure timeline information in the context of static protein-protein interaction data. A network analysis of the full human proteome reveals systematic patterns linking the temporal profile of proteins’ evolution and their topological role in the interaction graph. These graphs were used to test a mechanistic hypothesis that proposed a propagating diversification signal between interactors using the temporal modelling data and network analysis tools. Finally, I analyse the data of amino-acid level spatial modelling of selection pressure events in Arc, one of the master regulators of synaptic plasticity, and its interactors, for which detailed experimental data are available. I use the Arc interactome as an example to discuss episodic and localised diversifying selection pressure events in tightly coupled complexes of proteins and showcase the potential for a similar systematic analysis of larger complexes of proteins using a pathway-centric approach. Through my work I revised our understanding of temporal evolutionary patterns that shaped contemporary synaptic function through profiling of emergence and refinement of proteins in multiple pathways of the nervous system. I also uncovered systematic effects linking dependencies between proteins with their active diversification, and hypothesised about their extension to domain level selection pressure events.
32

Long, Hannah Katherine. "Evolutionary usage and developmental roles of vertebrate non-methylated DNA." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:78b14c1d-1fa3-46f1-815f-a8ba55579c43.

Abstract:
Vertebrate genomes exhibit global methylation of cytosine residues where they occur in a cytosine-guanine dinucleotide (CpG) context and this epigenetic mark is generally thought to be repressive to transcription. Punctuating this pervasive DNA methylation landscape are short, contiguous regions of non-methylated DNA which are found at two thirds of mammalian gene promoters. These non-methylated regions exhibit CpG content close to expected levels as they escape the depletion of CpGs observed across the methylated fraction of the genome. The unique nucleotide properties of these CpG island (CGI) regions enable their identification by computational prediction in mammalian genomes. Owing to a lack of high-resolution genome-wide DNA methylation profiles in non-mammalian species, these CGI predictions have often been used as a proxy for non-methylated DNA in these organisms. In contrast to mammals, CGI predictions in cold-blooded vertebrates rarely coincide with gene promoters, leading to the belief that CGIs are significantly divergent between vertebrate species, and that unique promoter-associated features may have been acquired during warm-blooded vertebrate evolution. This thesis is primarily concerned with the location, establishment and biological function of non-methylated islands of DNA in vertebrate genomes. To experimentally determine genome-wide profiles of non-methylated DNA, a novel biochemical technique was established called biotinylated ZF-CxxC affinity purification (Bio-CAP), and development of this method is discussed in Chapter 3. Experimental analysis of non-methylated DNA profiles in this thesis initially addresses two main questions: (1) 'How does the non-methylated DNA landscape compare genome-wide for seven vertebrates considering distinct tissue types and developmental stages?' (2) 'How are vertebrate non-methylated regions of DNA defined and interpreted in the nuclear environment?' 
To address the first question, non-methylated DNA was profiled by Bio-CAP sequencing across the genomes of seven diverse vertebrate species, representing all major branch points of vertebrate evolution, and the results are discussed in Chapters 4 and 5. Contrary to previously held dogma, experimentally determined non-methylated islands of DNA (NMIs) constitute an ancient epigenetic feature of vertebrate gene regulatory elements. However, despite having numerous high-resolution maps of vertebrate non-methylated DNA, the means by which NMIs are identified and maintained in the nuclear environment remains poorly understood. To address the second question and identify features which determine the methylation state of DNA, exogenous DNA sequences were introduced into mouse embryonic stem (ES) cells. Non-methylated DNA was profiled by Bio-CAP sequencing to investigate how different features, such as sequence-specific binding motifs, chromatin architecture and nucleotide composition of a given DNA sequence, impact local DNA methylation patterns. Interestingly, the majority of exogenous promoters were appropriately non-methylated in mouse ES cells, germline and somatic cells, suggesting that gene promoters have retained strong signals for the non-methylated state across millions of years of evolution (discussed in Chapter 6). During mouse embryogenesis, genome-scale DNA demethylation and remethylation events occur to remodel the epigenetic landscape, and loss of DNA methylation during this time leads to embryonic lethality. To investigate the biological function of non-methylated DNA, the third question addressed in this thesis is (3) 'What is the developmental importance of non-methylated islands of DNA during vertebrate embryogenesis?' 
To investigate this, members of the ZF-CxxC domain-containing family of chromatin modifiers were ablated in zebrafish embryos to perturb the chromatin landscape at NMIs, and therefore interfere with their function during early development (Chapter 7). Early embryonic development and patterning were disrupted in knockdown embryos, suggesting that interpretation of non-methylated DNA and placement of chromatin modifications at NMIs is essential for normal zebrafish embryogenesis. Together, this work sheds light on the evolutionary origins of NMIs and the mechanisms involved in the recognition and establishment of non-methylated loci, and provides an insight into the function of non-methylated DNA during early embryonic development.
33

Sjöstrand, Joel. "Reconciling gene family evolution and species evolution." Doctoral thesis, Stockholms universitet, Numerisk analys och datalogi (NADA), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-93346.

Abstract:
Species evolution can often be adequately described with a phylogenetic tree. Interestingly, this is the case also for the evolution of homologous genes; a gene in an ancestral species may – through gene duplication, gene loss, lateral gene transfer (LGT), and speciation events – give rise to a gene family distributed across contemporaneous species. However, molecular sequence evolution and genetic recombination make the history – the gene tree – non-trivial to reconstruct from present-day sequences. This history is of biological interest, e.g., for inferring potential functional equivalences of extant gene pairs. In this thesis, we present biologically sound probabilistic models for gene family evolution guided by species evolution – effectively yielding a gene-species tree reconciliation. Using Bayesian Markov-chain Monte Carlo (MCMC) inference techniques, we show that by taking advantage of the information provided by the species tree, our methods achieve more reliable gene tree estimates than traditional species tree-uninformed approaches. Specifically, we describe a comprehensive model that accounts for gene duplication, gene loss, a relaxed molecular clock, and sequence evolution, and we show that the method performs admirably on synthetic and biological data. Furthermore, we present two expansions of the inference procedure, enabling it to provide (i) refined gene tree estimates with timed duplications, and (ii) probabilistic orthology estimates – i.e., that the origin of a pair of extant genes is a speciation. Finally, we present a substantial development of the model to account also for LGT. A sophisticated algorithmic framework of dynamic programming and numerical methods for differential equations is used to resolve the computational hurdles that LGT brings about. We apply the method on two bacterial datasets where LGT is believed to be prominent, in order to estimate genome-wide LGT and duplication rates. 
We further show that traditional methods – in which gene trees are reconstructed and reconciled with the species tree in separate stages – are prone to yield inferior gene tree estimates that will overestimate the number of LGT events.
At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 3: Manuscript. Paper 5: Manuscript.
34

Dwivedi, Bhakti. "Impact of molecular evolutionary footprints on phylogenetic accuracy: a simulation study." Dayton, Ohio : University of Dayton, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1250807136.

35

Alachiotis, Nikolaos. "Algorithms and Computer Architectures for Evolutionary Bioinformatics." Advisor: Alexandros Stamatakis. Reviewers: Arndt Bode, Apostolos Dollas, and Alexandros Stamatakis. München : Universitätsbibliothek der TU München, 2012. http://d-nb.info/1031514384/34.

36

Giner, Delgado Carla. "Large-scale evolutionary analysis of polymorphic inversions in the human genome." Doctoral thesis, Universitat Autònoma de Barcelona, 2017. http://hdl.handle.net/10803/459114.

Abstract:
Chromosomal inversions are structural variants that invert a fragment of the genome, usually without modifying its content, and their subtle but powerful effects in natural populations have fascinated evolutionary biologists for a long time. Discovered a century ago in fruit flies, their association with different evolutionary processes, such as local adaptation and speciation, was soon evident in several species. However, in the current era of genomics and big data, inversions frequently escape the grasp of current technologies and remain largely overlooked in humans. During the last few years, the InvFEST Project has aimed to address the missing knowledge about human inversions by validating and genotyping a large fraction of predicted polymorphisms. In particular, it has generated one of the most useful data sets on human inversions, consisting of 45 common inversions (with sizes from 83 bp to 415 kbp) genotyped at high quality in 550 individuals from seven populations of diverse ancestry. This thesis takes advantage of the available population-scale information, combined with whole-genome sequences available from the 1000 Genomes Project, to carry out the first detailed analysis of the evolutionary properties of human polymorphic inversions. The methods used combine theoretical models, simulations and empirical comparisons with other mutation types. Besides the complete characterization of the data set, the results confirm fundamental differences between inversions created by different mechanisms. The frequency distribution of the 21 inversions originated by non-homologous mechanisms (NH) is similar to that expected for neutral variants when controlling for detection biases, which indicates that they are not subjected to strong negative selection.
Recombination is completely inhibited across the whole inversion length, with no clear genetic exchange found, and possibly over a few kbp beyond the breakpoints. As a result, NH inversions strongly affect local genome variation levels, as predicted by computer simulations, with older inversions increasing total nucleotide diversity, while younger ones at very high frequency could have the opposite effect. In contrast, most inversions created by non-allelic homologous recombination (NAHR) (19/24) have appeared independently in different haplotypes in the sample. These high recurrence levels are reflected in several measures: they are enriched in intermediate frequencies, share multiple nucleotide polymorphisms between orientations, and have little linkage disequilibrium with neighbouring variants, which limits their detection by tag SNP strategies. Finally, in order to find inversions that are functional candidates, different signatures of selection on inversions were explored based on their frequencies, population differentiation and sequence variation patterns. Ten candidates were revealed, with three of them found to be >1.5 million years old and maintained at intermediate frequencies, possibly by balancing selection. One of these was also found in archaic hominins. Other candidates seem to have reached high frequencies in a short period of time in some populations, consistent with positive selection. Notably, over half of the candidates are located within gene regions, which suggests that they may have functional effects. Thus, this work offers an overview of inversion dynamics and their role as genomic modifiers, opening interesting avenues of investigation.
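The abstract compares nucleotide diversity between inversion orientations. As a rough illustration of that quantity only, the sketch below computes the average number of pairwise differences per site for a set of aligned haplotypes. The function name and the toy sequences are assumptions made for this example; they are not data or code from the thesis.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise differences per site among aligned, equal-length sequences."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two sequences")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    pairs = list(combinations(haplotypes, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)


# Hypothetical toy haplotypes, grouped by inversion orientation.
standard = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
inverted = ["ACCTACGT", "ACCTACGT"]

print("standard:", nucleotide_diversity(standard))
print("inverted:", nucleotide_diversity(inverted))
print("pooled:  ", nucleotide_diversity(standard + inverted))
```

Comparing the per-orientation values with the pooled value gives a first impression of how much variation is partitioned between orientations; a real analysis would also need to handle missing data and sample-size differences.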
37

Laetsch, Dominik Robert. "On the evolution of effector gene families in potato cyst nematodes." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/31244.

Abstract:
Potato cyst nematodes (PCN) are economically relevant plant parasites that infect potato crops. The genomes of three PCN species are available and genome data have been generated for several populations of PCN, to address questions related to the molecular basis of plant parasitism. In this thesis, I employ approaches of comparative genomics to highlight differences and similarities between PCNs and other nematode species. I present two new software solutions to address challenges associated with the field of comparative genomics: BlobTools, a taxonomic interrogation toolkit for quality control of genome assemblies, and KinFin, a solution for the analysis of protein orthology data. I apply both software solutions to genomic datasets of nematodes, platyhelminths, and tardigrades. Based on KinFin analysis of plant parasitic nematodes, I identify protein families in PCNs likely to be involved in host-parasitic interaction, termed effectors, and discuss their functions. I highlight examples of horizontal gene transfer from bacteria to plant parasitic nematodes. Through genomic data of European and South American populations of PCNs, I address variation in populations, infer phylogenetic relationships, and try to estimate the effect of selection on effector genes identified through KinFin. Furthermore, I estimate the rate of variation across the reference genomes of two PCNs.
38

Wei, Yulong. "Microbes Carry Distinct Genomic Signatures in Adaptation to Their Translation Machinery and Host Environments." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42422.

Abstract:
How do bacteria grow and replicate rapidly? How do viruses and phages adapt to their host environments? Bacteria require efficient translation to grow and replicate rapidly, and translation is often rate-limited by initiation. A feature that is conserved across bacterial lineages is the Shine-Dalgarno (SD) sequence at the mRNA 5’ UTR, which pairs with the anti-SD sequence located at the 3’ end of mature 16S rRNA. Nonetheless, much about this interaction remains unclear. Chapter 2 reveals evolutionary differences between Cyanobacteria and chloroplast translation initiation using a new model (DtoStart) that better defines the optimal SD sequence and an RNA-Seq-based approach that reliably characterizes the 3’ end of mature 16S rRNAs. The efficacy of translation elongation depends strongly on tRNA-mediated codon adaptation. In Escherichia coli, selection favours major codons because they are rapidly decoded by abundantly available cognate tRNAs. Nonetheless, the degree to which codon bias correlates with tRNA availability is unclear in many bacterial species because tRNA abundance is often inadequately approximated by gene copy numbers. To better understand tRNA-mediated codon bias, Chapter 3 describes an RNA-Seq-based approach to robustly quantify tRNA abundance. Finally, Chapter 4 evaluates the degree to which optimal translation initiation and elongation signals affect ribosome dynamics. The emergence of the COVID-19 pandemic poses a serious global health emergency. To establish infection during cell entry, the coronavirus Spike protein binds to the host ACE2 receptor, and a high binding potential between these two players is key to infectivity. While SARS-CoV-2 transmits efficiently in humans, it is less clear which other mammals are at risk of being infected. Chapter 5 investigates the host range of SARS-CoV-2 through comparative sequence analyses of the ACE2 receptors and the Spike proteins.
As obligate parasites, coronaviruses regularly infect host tissues that express antiviral proteins (AVPs) in abundance and must evade or adapt to the host cellular environment post-entry. Two AVPs that shape viral genomes are ZAP, which binds to CpG dinucleotides to facilitate viral transcript degradation, and APOBEC3, which deaminates C into U, leading to dysfunctional transcripts. Chapter 6 shows that coronavirus genomes are CpG deficient to evade ZAP and are subjected to constant C to U deamination by APOBEC3. This thesis examines two key concepts of microbial genome evolution: 1) coevolution between gene features and the translation machinery in bacteria, and 2) adaptation of viruses to the hosts they infect. Chapters 2, 3, and 4 are aimed at improving our understanding of bacterial gene expression for applications in transgenic biosynthesis and phage therapy. Chapters 5 and 6 are aimed at improving our understanding of the origin and evolution of SARS-CoV-2 and our ability to control the spread of infection.
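CpG deficiency, as mentioned in this abstract, is commonly summarised as an observed/expected ratio of CpG dinucleotides. The sketch below shows one widely used form of that ratio; the function name and the example sequences are assumptions for illustration and are not taken from the thesis, which may use a different index.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: N_CpG * L / (N_C * N_G).

    Values well below 1 indicate CpG deficiency relative to base composition.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the dinucleotide count.
    cpg = seq.count("CG")
    return cpg * n / (c * g)


# Hypothetical toy sequences, not real viral genomes.
print(cpg_obs_exp("ATGCATGCAAGCTTGCCATG"))  # contains C and G but no CpG
print(cpg_obs_exp("ACGTCGACGTTCGA"))        # CpG-rich toy sequence
```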
39

Belmadani, Manuel. "MotifGP: DNA Motif Discovery Using Multiobjective Evolution." Thesis, Université d'Ottawa / University of Ottawa, 2016. http://hdl.handle.net/10393/34213.

Abstract:
The motif discovery problem is becoming increasingly important for molecular biologists as new sequencing technologies produce large amounts of data at unprecedented rates. The solution space for DNA motifs is too large to search with naive methods, so there is a need for fast and accurate motif detection tools. We propose MotifGP, a multiobjective motif discovery tool that evolves regular expressions characterizing overrepresented motifs in a given input dataset. This thesis describes and evaluates a multiobjective, strongly typed genetic programming algorithm for the discovery of network expressions in DNA sequences. Using 13 realistic data sets, we compare the results of our tool, MotifGP, to those of DREME, a state-of-the-art program. MotifGP outperforms DREME when the motifs sought are long and the specificity is distributed over the length of the motif. For shorter motifs, the performance of MotifGP compares favourably with the state-of-the-art method. Finally, we discuss the advantages of multiobjective optimization in the context of this specific motif discovery problem.
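To make the idea of scoring a regular-expression motif against an input dataset concrete, here is a minimal sketch that evaluates one candidate pattern on two objectives. The two scores (coverage of the input sequences and discrimination against control sequences), the function name, and the sequences are assumptions chosen for illustration; they are not necessarily the objectives or data used by MotifGP.

```python
import re


def motif_scores(pattern, positives, controls):
    """Return (coverage, discrimination) for a regex motif.

    coverage       = fraction of positive sequences containing the motif
    discrimination = fraction of all matching sequences that are positives
    """
    regex = re.compile(pattern)
    pos_hits = sum(1 for s in positives if regex.search(s))
    ctl_hits = sum(1 for s in controls if regex.search(s))
    coverage = pos_hits / len(positives) if positives else 0.0
    total = pos_hits + ctl_hits
    discrimination = pos_hits / total if total else 0.0
    return coverage, discrimination


# Hypothetical toy data: "AC[GT]T" is a small network expression.
positives = ["TTACGTAA", "GGACTTCC", "AAAAAAAA"]
controls = ["CCCCCCCC", "ACGTACGT"]
print(motif_scores("AC[GT]T", positives, controls))
```

A multiobjective search would keep candidates that are not dominated on both scores, rather than collapsing them into a single number.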
40

Morrison, Kevin S. "Topological Data Analysis and Applications to Influenza." Miami University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=miami1595864809447239.

41

Pensch, Raphaela. "Non-coding constraint mutations impact the gene regulatory system in osteosarcoma." Thesis, Uppsala universitet, Institutionen för medicinsk biokemi och mikrobiologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-448082.

Abstract:
The non-coding space makes up around 98 % of the genome, but cancer-driving mutations have so far mostly been discovered in protein-coding regions. The majority of somatic non-coding mutations are neutral passenger mutations and identifying non-coding mutations with driving roles in cancer poses a challenge. In this work, evolutionary constraint was used to explore the non-coding space in human osteosarcoma to improve our understanding of how evolutionary constraint can be applied to identify non-coding driver mutations in cancer and describe the unknown role of non-coding mutations in osteosarcoma. Evolutionary constraint scores derived from an alignment of 33 mammals were used to extract non-coding mutations in functional elements from somatic variants of 38 osteosarcoma samples and genes with an enrichment of non-coding constraint mutations in their regulatory regions were identified. The investigation of those genes revealed that non-coding constraint mutations are likely involved in key osteosarcoma pathways. Furthermore, novel osteosarcoma genes and mechanisms were proposed based on the non-coding constraint mutation enrichment analysis. The regulatory potential of individual non-coding constraint mutations was evaluated based on regulatory annotations, functional evidence, transcription factor affinity predictions and electrophoretic mobility shift assays. We concluded that the analysis of non-coding constraint mutations is an efficient way to discover non-coding mutations with functional impact in osteosarcoma which likely play an important role in the disease.
42

Cooper, Lizette. "Evolutionary investigation of group I introns in nuclear ribosomal internal transcribed spacers in Neoselachii." Bowling Green State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu154229759945368.

43

Coghill, Lyndon M. "Statistical and Comparative Phylogeography of Mexican Freshwater Taxa in Extreme Aquatic Environments." ScholarWorks@UNO, 2013. http://scholarworks.uno.edu/td/1724.

Abstract:
Phylogeography aims to understand the processes that underlie the distribution of genetic variation within and among closely related species. Although the means by which this goal might be achieved differ considerably from those that spawned the field some thirty years ago, the foundation and conceptual breakthroughs made by Avise are nonetheless the same and are as relevant today as they were two decades ago. Namely, patterns of neutral genetic variation among individuals carry the signature of a species’ demographic past, and the spatial and temporal environmental heterogeneity across a species’ geographic range can influence patterns of evolutionary change. Aquatic systems throughout Mexico provide unique opportunities to study phenotypic plasticity and evolution in relation to climatic and environmental selective forces. There are several unique, often isolated aquatic environments throughout Mexico that have a history of geographic isolation and reconnection. The first study presented herein shows significant mitochondrial sequence divergence between L. megalotis populations on either side of the Sierra de San Marcos, which bisects the valley of Cuatro Ciénegas, and shows that the populations in the valley are genetically distinct from those found outside the valley. The second study recovered signals of two divergence events in Cuatro Ciénegas for six codistributed taxa and reveals that both events occurred in the Pleistocene during periods of increased aridity, suggesting that climatic effects might have played a role in these species’ divergence. The final study presents an Illumina-based high-resolution species phylogeny for Astyanax mexicanus, providing added support for multiple origins of cave populations and further clarifying the uniqueness of the Sabinos and Rio Subterráneo caves.
44

Correa, Leonardo de Lima. "Uma proposta de algoritmo memético baseado em conhecimento para o problema de predição de estruturas 3-D de proteínas." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2017. http://hdl.handle.net/10183/156640.

Abstract:
Memetic algorithms are evolutionary metaheuristics intrinsically concerned with the exploitation and incorporation of all available knowledge about the problem under study. In this dissertation, we present a knowledge-based, multi-population memetic algorithm to tackle the three-dimensional protein structure prediction problem without the explicit use of experimentally determined template structures. The algorithm is divided into two main processing steps: (i) sampling and initialization of the algorithm's solutions; and (ii) optimization of the structural models from the previous stage. The first step generates and classifies several structural models for a given target protein using the Angle Probability List strategy, in order to define different structural groups and to create better structures with which to initialize the individuals of the memetic algorithm. The Angle Probability List takes advantage of structural knowledge stored in the Protein Data Bank in order to reduce the complexity of the conformational search space.
The second step of the method consists of optimizing the structures generated in the first stage by applying the proposed memetic algorithm, which uses a tree-structured population in which each node can be seen as an independent subpopulation that interacts with others through global search operations, aiming at information sharing, population diversity, and better exploration of the multimodal search space of the problem. The method also encompasses ad hoc global search operators, whose objective is to increase the exploration capacity of the method by exploiting the characteristics of the protein structure prediction problem, combined with the Artificial Bee Colony algorithm, which is used as a local search technique applied to each node of the tree. The proposed algorithm was tested on a set of 24 amino acid sequences and compared with two reference methods in the protein structure prediction area, Rosetta and QUARK. The results show the ability of the method to predict three-dimensional protein structures with folds similar to the experimentally determined protein structures, as measured by the structural metrics Root-Mean-Square Deviation and Global Distance Test Total Score. We also show that our method was able to reach results comparable to those of Rosetta and QUARK, and in some cases it outperformed them, corroborating the effectiveness of our proposal.
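Root-Mean-Square Deviation, one of the two evaluation metrics named above, can be illustrated with a short sketch. It assumes the predicted and experimental structures have already been superimposed and that atoms are listed in the same order; no optimal superposition (for example by the Kabsch algorithm) is performed here. The function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes both structures are already superimposed and atom-matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical C-alpha coordinates in angstroms.
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]
print(rmsd(predicted, reference))
```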
45

Nyrén, Karl. "Phylogenetic analysis of secretion systems in Francisellaceae and Legionellales : Investigating events of intracellularization." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-448062.

Abstract:
Host-adapted bacteria are pathogens that, through evolutionary time and host-adaptive events, acquired the ability to manipulate hosts into assisting their own reproduction and spread. Through these host-adaptive events, free-living pathogens may be rendered unable to reproduce without their host, which is an irreversible step in evolution. Francisellaceae and Legionellales, two orders of Gammaproteobacteria, are cases where host adaptation has led to an intracellular lifestyle. Both orders use secretion systems, in combination with effector proteins, to invade and control their hosts. A current view is that Francisellaceae and Legionellales went through host-adaptive events at two separate time points. However, F. hongkongensis, a member of Francisellaceae, shares the same secretion system as the order Legionellales. Additionally, two host-adapted Gammaproteobacteria, Piscirickettsia spp. and Berkiella spp., swap phylogenetic positions between Legionellales and Francisellaceae depending on the methods applied, indicating shared features of Francisellaceae and Legionellales. In this study, we set up a workflow to screen public metagenomic data for candidate host-adapted bacteria. Using these data, we attempted to establish the phylogenetic position and possibly resolve evolutionary events that occurred in Legionellales, F. hongkongensis, Francisellaceae, Piscirickettsia spp. and Berkiella spp. We successfully acquired 23 candidate host-adapted MAGs by (i) scanning for genes among reads before assembly, using PhyloMagnet, and (ii) screening for complete secretion systems with MacSyFinder. The phylogenetic results were inconclusive regarding the placement of Berkiella spp. and Piscirickettsia spp. However, the results of this study indicate that, contrary to previous beliefs, it is possible that a single intracellularization event in a common ancestor gave rise to the intracellular lifestyle of Francisellaceae and Legionellales.
46

Coimbra, Klein Cecilia. "Bioinformatic study of the metabolic dialog between a non-pathogenic trypanosomatid and its endosymbiont with evolutionary and functional goals." Phd thesis, Université Claude Bernard - Lyon I, 2013. http://tel.archives-ouvertes.fr/tel-01050338.

Abstract:
In this thesis, we presented three main types of analyses of metabolism, most of which involved symbiosis: the metabolic dialogue between a trypanosomatid and its symbiont, comparative analyses of metabolic networks, and the exploration of metabolomics data. All of them were essentially based on genomics data, where metabolic capabilities were predicted from the annotated genes of the target organism, and were further refined with other types of data depending on the aim and scope of each investigation. The metabolic dialogue between a trypanosomatid and its symbiont was explored with functional and evolutionary goals, which included analysing the classically defined pathways for the synthesis of essential amino acids and vitamins, exploring the genome-scale metabolic networks, and searching for potential horizontal gene transfers from bacteria to the trypanosomatids. The comparative analyses focused on the common metabolic capabilities of different lifestyle groups of bacteria, and we proposed a method to automatically establish the common and the group-specific activities. The application of our method for enumerating metabolic stories to the yeast response to cadmium exposure validated this approach on a well-studied biological response to stress. We showed that the method captured the underlying knowledge well, as it extracted stories allowing for further interpretation of the metabolomics data mapped onto the genome-scale metabolic model of yeast.
47

Svärd, Karl. "Developing new methods for estimating population divergence times from sequence data." Thesis, Uppsala universitet, Institutionen för medicinsk biokemi och mikrobiologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-450123.

Abstract:
Methods for estimating past demographic events of populations are powerful tools for gaining insight into otherwise hidden pasts. Human genetic data are a valuable resource for these purposes, as patterns of variation can inform us about the past evolutionary forces and historical events that generated them. There is, however, a lack of methods within the field that use this information to its full extent. This project therefore set out to develop a set of new alternatives for estimating demographic events. The work is based on modifying the purely sequence-based method TTo (Two-Two-outgroup) for estimating the divergence time of two populations. The modifications consisted of using beta distributions to model the polymorphic diversity of the ancestral population in order to increase the maximum possible sample size. The project resulted in two implemented methods: TT-beta and a partial variant of MM. TT-beta produced estimates in the same range as TTo and showed that the use of beta distributions has real potential. For MM, only a partial implementation was possible, but it also showed promise and the ability to use varying sample sizes to estimate demographic values.
48

Botha, Stephen Gordon. "The effect of evolutionary rate estimation methods on correlations observed between substitution rates in models of evolution." Thesis, Stellenbosch : Stellenbosch University, 2012. http://hdl.handle.net/10019.1/19938.

49

Rodrigues, Ernesto Luis Malta. "Inferência de gramática formais livres de contexto utilizando computação evolucionária com aplicação em bioinformática." Universidade Tecnológica Federal do Paraná, 2007. http://repositorio.utfpr.edu.br/jspui/handle/1/103.

Abstract:
Grammatical inference deals with the task of learning a classifier that can recognize a particular pattern in a set of examples. In this work, a new grammatical inference model based on a variant of Genetic Programming is proposed. In this approach, an individual is a linked list of trees representing the set of productions of the grammar. Ordinary genetic operators are modified so as to bias the search heuristically, and two new operators are proposed. The first one, called Incremental Learning, is able to recognize, based on examples, which productions are missing. The second, called Expansion, is able to provide the diversity necessary to achieve convergence. In a suite of experiments, the proposed model successfully inferred six regular grammars and two context-free grammars: parentheses and four-letter palindromes, both the common and the disjoint variants. The results achieved were better than those obtained by recently published algorithms. Grammatical inference models have recently been applied to problems of recognizing biological DNA sequences. In this work, two problems of this class were addressed: recognition of promoters and splice junction detection. In the former, the proposed model obtained better results than other published approaches. In the latter, the proposed model showed promising results. The model was extended to support fuzzy grammars, namely fuzzy fractional grammars. To this end, an appropriate method for estimating the values of the membership function of the grammar's productions is also proposed. The results obtained in the identification of splice junctions show the utility of the proposed fuzzy grammatical inference model.
50

Yu, Jinchao. "développement méthodologique et applications de la prédiction des interactions protéine-protéine." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS021.

Abstract:
Protein-protein interactions (PPIs) play essential roles in life. My PhD work aimed at developing advanced bioinformatics methods for PPI prediction and modelling at the structural scale. My goal was to improve the predictive power of methods that model the structures of macromolecular assemblies (docking) and to tackle real-life problems faced by biologists. First, to obtain better-quality models of isolated proteins, I developed the HHalign-Kbest server, which is based on algorithms that search for suboptimal alignments. Second, in the field of protein docking, I built the InterEvDock server, which can take co-evolutionary information into account. Blind validations show that it performs better than other state-of-the-art servers when evolutionary information is available. In order to further test our methods, we participated in CAPRI, an international challenge for the prediction of protein interactions. For the sessions covering 2013-2016, our group ranked first at the 6th CAPRI evaluation meeting. Finally, I developed PPI4DOCK, a realistic benchmark dataset for training and testing that contains more than 1000 target complexes and is the largest such dataset so far, in order to improve docking methods for the scientific community. In terms of applications, I was involved in a variety of collaborative projects with different labs. As representative examples, I searched for binding partners of the histone chaperone Asf1; I studied the CENP-F/Nup133 interaction in the context of mitosis and the Exo70/Abi interaction related to the regulation of cell mobility; and I simulated the binding modes of multiple peptide partners of the Ku complex, which is involved in DNA repair pathways.