Dissertations / Theses on the topic 'Bioinformatics analyses'

Consult the top 50 dissertations / theses for your research on the topic 'Bioinformatics analyses.'

1

Chatzou, Maria 1985. "Large-scale comparative bioinformatics analyses." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/587086.

Abstract:
One of the main and most recent challenges of modern biology is to keep up with the growing amount of biological data coming from next-generation sequencing technologies. Keeping up with the growing volume of experiments is the only way to make sense of the data and extract actionable biological insights. Large-scale comparative bioinformatics analyses are an integral part of this procedure. In comparative bioinformatics, multiple sequence alignments (MSAs) are by far the most widely used models, as they provide a unique insight into the accurate measure of sequence similarities and are therefore instrumental in revealing genetic and/or functional relationships among evolutionarily related species. Unfortunately, the well-established limitations of MSA methods when dealing with very large datasets potentially compromise all downstream analyses. In this thesis I set out the current relevance of multiple sequence aligners, and I show how scaling them up is leading to serious numerical stability issues and how these issues affect phylogenetic tree reconstruction. For this purpose, I have developed two new methods: MEGA-Coffee, a large-scale aligner, and Shootstrap, a novel bootstrapping measure that incorporates MSA instability into branch support estimates when computing trees. The large amount of computation required by these two projects was carried out using Nextflow, a new computational framework that I have developed to improve the computational efficiency and reproducibility of large-scale analyses such as those carried out in these studies.
2

Lindskog, Mats. "Computational analyses of biological sequences -applications to antibody-based proteomics and gene family characterization." Doctoral thesis, KTH, School of Biotechnology (BIO), 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-527.

Abstract:

Following the completion of the human genome sequence, post-genomic efforts have shifted the focus towards the analysis of the encoded proteome. Several different systematic proteomics approaches have emerged, for instance, antibody-based proteomics initiatives, where antibodies are used to functionally explore the human proteome. One such effort is HPR (the Swedish Human Proteome Resource), where affinity-purified polyclonal antibodies are generated and subsequently used for protein expression and localization studies in normal and diseased tissues. The antibodies are directed towards protein fragments, PrESTs (Protein Epitope Signature Tags), which are selected based on criteria favourable in subsequent laboratory procedures.

This thesis describes the development of novel software (Bishop) that facilitates the selection of suitable protein fragments and ensures high-throughput processing of the selected target proteins. The majority of proteins were successfully processed by this approach; however, the design strategy resulted in a number of fall-outs. These proteins comprised alternative splice variants, as well as proteins exhibiting high sequence similarity to other human proteins. Alternative strategies were developed for processing these proteins. The strategy for handling alternative splice variants included the development of additional software and was validated by comparing the immunohistochemical staining patterns obtained with antibodies generated towards the same target protein. Processing of high sequence similarity proteins was enabled by assembling human proteins into clusters according to their pairwise sequence identities. Each cluster was represented by a single PrEST located in the region of highest sequence similarity among all cluster members, thereby representing the entire cluster. This strategy was validated by using Western blot analysis to identify all proteins within a cluster with antibodies directed to such cluster-specific PrESTs. In addition, the PrEST design success rates for more than 4,000 genes were evaluated.
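
The clustering step mentioned above can be illustrated with a minimal sketch. The abstract does not state the linkage rule or the identity cut-off, so single-linkage grouping and the 0.8 threshold below are assumptions, and the function name `cluster_by_identity` is hypothetical rather than part of Bishop.

```python
from itertools import combinations


def cluster_by_identity(names, identity, threshold=0.8):
    """Group sequences by pairwise identity using single linkage.

    `identity` maps (name_a, name_b) pairs to a fraction in [0, 1].
    The threshold and the linkage rule are illustrative assumptions.
    """
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        # Pairs may be stored in either order; missing pairs count as 0.
        key = (a, b) if (a, b) in identity else (b, a)
        if identity.get(key, 0.0) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical proteins and identities, for illustration only.
example_identity = {
    ("P1", "P2"): 0.90,
    ("P2", "P3"): 0.85,
    ("P1", "P3"): 0.50,
    ("P3", "P4"): 0.20,
}
example_clusters = cluster_by_identity(["P1", "P2", "P3", "P4"], example_identity)
```

With single linkage, P1 and P3 end up in the same cluster through P2 even though their direct identity is below the threshold; a stricter rule such as complete linkage would split them.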

Several genomes other than the human genome have been completed; currently more than 300 genomes are fully sequenced. Following the release of the genome of the tree model organism black cottonwood (Populus trichocarpa), a bioinformatic analysis identified unknown cellulose synthases (CesAs) and revealed a total of 18 CesA family members. These genes are thought to have arisen from several rounds of genome duplication. This number is significantly higher than that reported in previous studies of other plant genomes, which comprise only ten CesA family members. Moreover, identification of corresponding orthologous ESTs belonging to the closely related hybrid aspen (P. tremula x tremuloides) for two pairs of CesAs suggests that they are actively transcribed. This indicates that a number of paralogs have preserved their functionalities following extensive genome duplication events in the tree’s evolutionary history.

3

Dafalla, Israa Yahia Al Hag Ibrahim. "Improving SARS-CoV-2 analyses from wastewater." Thesis, Högskolan i Skövde, Institutionen för biovetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-20237.

Abstract:
Wastewater-based epidemiology (WBE) analyzes wastewater for the presence of biological and chemical substances in order to draw public health conclusions. COVID-19 is caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), which infected individuals also shed in their feces, making WBE an alternative way to track SARS-CoV-2 in populations. The detection and quantification of SARS-CoV-2 in wastewater are subject to many limitations, such as sample quality, storage conditions and viral concentration. This thesis aims to determine the extent of these limitations and the factors that contribute to them. Other viruses can support the measurements: for example, Bovine coronavirus (BCoV) can be spiked in as a process surrogate, while Pepper mild mottle virus (PMMoV), a fecal biomarker, is used to estimate the prevalence of SARS-CoV-2 infection. This study involved two distinct wastewater samples. For method comparison, both samples were processed with two methods: virus concentration by electronegative (EN) filtration and direct RNA extraction. RT-qPCR assays were performed on the RNA extracts to identify and quantify SARS-CoV-2, BCoV, and PMMoV. Based on the obtained cycle threshold (Ct) values, viral gene copy numbers and the virus concentrations of the original wastewater samples were calculated. Statistical tests were conducted to assess the proposed hypotheses and the variation within the data. Results revealed differences in viral content due to differences in sample quality and as a result of freezing and thawing. Furthermore, different sample processing methods led to differences in quantification. In conclusion, analysing SARS-CoV-2 in wastewater with methodologies that have better detection efficiency leads to more reliable results.
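
The step from Ct values to gene copy numbers and wastewater concentrations is commonly done with a standard curve. The sketch below assumes a linear curve of the form Ct = slope * log10(copies) + intercept; the slope, intercept, volumes and function names are hypothetical and are not taken from the thesis.

```python
import math


def copies_per_reaction(ct, slope, intercept):
    """Invert a standard curve Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_litre(copies, elution_ul, template_ul, sample_l, recovery=1.0):
    """Scale copies in one reaction back to the original wastewater sample.

    `recovery` is the fraction of virus recovered during processing, for
    example as estimated from a spiked surrogate; 1.0 means no correction.
    """
    copies_in_extract = copies * (elution_ul / template_ul)
    return copies_in_extract / (sample_l * recovery)


# Hypothetical standard curve and volumes, for illustration only.
example_copies = copies_per_reaction(ct=31.36, slope=-3.32, intercept=38.0)
example_concentration = copies_per_litre(
    example_copies, elution_ul=50.0, template_ul=5.0, sample_l=0.05
)
```

A slope near -3.32 corresponds to roughly 100% amplification efficiency, which is why it is used as the placeholder here. Real assays would take the curve parameters from a dilution series run alongside the samples.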
4

Stenberg, Johan. "Software Tools for Design of Reagents for Multiplex Genetic Analyses." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6832.

5

Fleischmann, Susanne [Verfasser]. "Bioinformatics analyses of the Escherichia coli toxome / Susanne Fleischmann." Berlin : Freie Universität Berlin, 2019. http://d-nb.info/1188239902/34.

6

Guy, Colin Paul. "RadB from archaea : bioinformatics, biochemistry and yeast two-hybrid analyses." Thesis, University of Nottingham, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.446393.

7

Yu, Xiaoqing. "Statistical Methods and Analyses for Next-generation Sequencing Data." Case Western Reserve University School of Graduate Studies / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=case1403708200.

8

Linke, Burkhard [Verfasser]. "Conveyor : a workflow engine for bioinformatics analyses / Burkhard Linke. Technische Fakultät." Bielefeld : Universitätsbibliothek Bielefeld, Hochschulschriften, 2012. http://d-nb.info/1020344385/34.

9

Rajaonarifara, Elinambinina. "A bioinformatic study on the feasibility of a cross-species proteomics analyses of mycobacteria." Master's thesis, University of Cape Town, 2013. http://hdl.handle.net/11427/3073.

10

Shankar, Vijay. "Extension of Multivariate Analyses to the Field of Microbial Ecology." Wright State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1464358122.

11

Al-Maeni, Mohammad Abdul Rahmman Mohammad. "Bioinformatics analyses of genetic variation in genomes of Neisseria meningitidis (the meningococcus)." Thesis, University of Leicester, 2017. http://hdl.handle.net/2381/40692.

Abstract:
Genetic variation is one of the key concepts underlying the persistence of Neisseria meningitidis in its host and its ability to counteract both the innate and adaptive immune responses of the host. The mechanism of evolution involves the combined action of de novo mutation, recombination, and localised hypermutation. This study aimed to understand the contributions of these processes to within-host evolution during persistence of N. meningitidis in the host over a period of months. A study of 40 isolates from one carrier, representing six months of persistent carriage, showed that de novo mutation resulting in single nucleotide polymorphisms (SNPs) was the major factor in structuring the population. Allelic variants were subject to dynamic temporal fluctuations during the persistence of meningococcal isolates over several months. Conversely, recombination was found to be a powerful mechanism for generating SNPs and insertions/deletions within 25 paired isolates from 25 carriers, representing between 1 and 6 months of host persistence. The processes of de novo mutation and recombination were infrequent but exhibited trends toward surface antigens, especially pilin, porins, iron acquisition and capsule genes. Variation in intergenic regions was also examined in these isolates, and a high level of variation was observed in conserved functional patterns of Correia elements. Three carriers were examined for changes in the expression of three phase-variable genes (opc, hpuAb, and nalP); these were in the OFF state, indicating that there may have been selection for low expression. A further investigation of microevolution within a clonal complex found a low rate of recombination within the 25 CC-174 disease and carriage isolates, but it was many fold higher than the recombination rates of species forming clonal populations. 
Variable genes were distributed across several categories, including bacterial secretion systems, iron acquisition, capsule, surface antigens, bacterial mobility proteins, antimicrobial resistance, and toxin genes. In conclusion, all the processes of evolution (de novo mutation, recombination, and localised hypermutation) facilitate asymptomatic carriage of meningococci and microevolution of CC-174.
12

Xia, Jing. "Bioinformatics analyses of alternative splicing, est-based and machine learning-based prediction." Thesis, Manhattan, Kan. : Kansas State University, 2008. http://hdl.handle.net/2097/1113.

13

Amberkar, Sandeep [Verfasser], and Roland [Akademischer Betreuer] Eils. "Integrative bioinformatics analyses of genome-wide RNAi screens / Sandeep Amberkar ; Betreuer: Roland Eils." Heidelberg : Universitätsbibliothek Heidelberg, 2014. http://d-nb.info/1180032608/34.

14

Collison, Matthew Geoffrey. "Human-microbiota interactions in health and disease : bioinformatics analyses of gut microbiome datasets." Thesis, University of Newcastle upon Tyne, 2018. http://hdl.handle.net/10443/4154.

Abstract:
The human gut harbours a vast diversity of microbial cells, collectively known as the gut microbiota, that are crucial for human health and dysfunctional in many of the most prevalent chronic diseases. Until recently, culture-dependent methods limited our ability to study the microbiota in depth, including the collective genomes of the microbiota, the microbiome. Advances in culture-independent metagenomic sequencing technologies have since provided new insights into the microbiome and led to a rapid expansion of data-rich resources for microbiome research. These high-throughput sequencing methods and large datasets provide new opportunities for research with an emphasis on bioinformatics analyses and a novel field for drug discovery through data mining. In this thesis I explore a range of metagenomics analyses to extract insights from metagenomics data and inform drug discovery in the microbiota. Firstly, I survey the existing technologies and data sources available for data mining therapeutic targets. Then I analyse 16S metagenomics data combined with metabolite data from mice to investigate the treatment model of a proposed antibiotic treatment targeting the microbiota. Then I investigate the occurrence frequency and diversity of proteases in metagenomics data in order to inform understanding of host-microbiota-diet interactions through protein- and peptide-associated glycan degradation by the gut microbiota. Finally, I develop a system to facilitate the process of integrating metagenomics data for gene annotations. One of the main challenges in leveraging the scale of data availability in microbiome research is managing the data resources from microbiome studies. Through a series of analytical studies I used metagenomics data to identify community trends, to demonstrate therapeutic interventions and to carry out a wide-scale screen for proteases that are central to human-microbiota interactions. 
These studies articulated the requirement for a computational framework to integrate and access metagenomics data in a reproducible way using a scalable data store. The thesis concludes by explaining how data integration in microbiome research is needed to provide the insights into metagenomics data that are required for drug discovery.
15

Gloriam, David E. "G Protein-Coupled Receptors; Discovery of New Human Members and Analyses of the Entire Repertoires in Human, Mouse and Rat." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis : Universitetsbiblioteket [distributör], 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6745.

16

Kirk, Michael. "Bioinformatic analyses of microarray experiments on genetic control of gene expression level." University of New South Wales, School of Biotechnology and Biomolecular Science, 2006. http://handle.unsw.edu.au/1959.4/25986.

Abstract:
The advent of microarray technology, allowing measurement of gene expression levels for thousands of genes in parallel, has made possible experiments designed to investigate the genetic control of variation in gene expression level (described in the literature as "genetical genomics" or "eQTL" experiments). Published results from these studies, in yeast and in mice, show that genetic variation is an important factor in gene regulation, and furthermore that individual polymorphisms modify the expression level of many genes. The concern of this thesis is the bioinformatic analysis of the expression level and genotype data sets that are the raw material for these studies. In particular, this thesis addresses the two issues of detecting artefactual effects and maximizing the information that can be extracted from the data. It is shown that while a polymorphism affecting the expression of many genes may be readily detected, care must be taken to determine whether the detected effect is genuinely one of genetic control of expression level, rather than the effect of correlations in measured expression level not of genetic cause. A significance test is devised to distinguish between these cases. The detection of artefactual correlation is explored further in the reanalysis of the published data from a large yeast study. A critique is given of the permutation method used to ascribe genetic control as the cause of inter-gene expression level correlation. The presence of some degree of artefactual correlation is shown, and novel methods are presented for identifying such artefacts. To extend the analyses that may be applied to eQTL data, an algorithm is presented for determining secondary eQTLs for gene expression level (as opposed to a single primary QTL), along with a significance test for the putative QTL found. The technique is demonstrated on a large public data set. 
In addition to the use for which they are intended, the data sets generated for eQTL studies provide opportunities for additional analyses. In this thesis a method is developed for calculating a genome wide map of meiotic recombination frequency from the genotype data for multiple segregant strains. The method is demonstrated on the published genotype data generated for a large yeast eQTL study.
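
One simple way to picture a recombination map built from segregant genotypes is to count, for each pair of adjacent markers, the fraction of segregants whose parental allele switches between them. This is only a sketch under stated assumptions (haploid segregants, markers already in chromosomal order, alleles coded 0/1, no missing calls); it is not the method developed in the thesis, and the function name is hypothetical.

```python
def recombination_fractions(genotypes):
    """Return the fraction of segregants recombinant in each marker interval.

    `genotypes` is a list of rows, one per segregant; each row lists the
    parental allele (0 or 1) at consecutive markers along a chromosome.
    """
    n_segregants = len(genotypes)
    n_markers = len(genotypes[0])
    fractions = []
    for i in range(n_markers - 1):
        # A switch of parental allele between adjacent markers implies
        # an odd number of crossovers in that interval.
        switches = sum(1 for row in genotypes if row[i] != row[i + 1])
        fractions.append(switches / n_segregants)
    return fractions


# Hypothetical genotypes: four segregants, three ordered markers.
example_genotypes = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
]
example_fractions = recombination_fractions(example_genotypes)
```

Dividing each fraction by the physical distance between the two markers would give a per-base rate; handling missing genotype calls and genotyping errors is left out of this sketch.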
17

Dunning, Mark J. "Genome-wide analyses using bead-based microarrays." Thesis, University of Cambridge, 2008. https://www.repository.cam.ac.uk/handle/1810/218542.

Abstract:
Microarrays are now an established tool for biological research and have a wide range of applications. In this thesis I investigate the BeadArray microarray technology developed by Illumina. The design of this technology is unique and gives rise to many computational and statistical challenges. However, I show how knowledge from other microarray technologies can be used to our advantage. I describe the beadarray software package, which is now used by researchers around the world. The development of this software was motivated by the fact that Illumina's software (BeadStudio) gives a summarised view of Illumina data and does not give users any control over several processing steps that were found to be crucial for other microarray technologies. A main feature of beadarray is the ability to access raw data. The advantages of such data include the ability to perform more detailed quality assessment and greater control over the analysis at all stages. The analysis of a control experiment shows that the processing steps used in BeadStudio can be improved. In particular, utilising variances calculated from the raw data can increase the ability to detect genes that have different expression levels between samples, a common goal for microarray studies. The data from the control experiment are made available for other researchers to use and to validate their own analysis methods. One issue discovered during the analysis of the control experiment was that only half of the intended genes could be reliably measured, owing to problems in the design of the probes targeting particular genes. By considering a large set of publicly available Illumina arrays, I show how such unreliable measurements can affect the analysis of Illumina data. I also show how potential problems can be identified in advance of an experiment and incorporated into an analysis pipeline.
18

Francis, Ore. "Bioinformatics, phylogenetic and biochemical analyses of the proteins of the muskelin/RanBP9/CTLH complex." Thesis, University of Bristol, 2014. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.665153.

Abstract:
Ubiquitination is an essential post-translational modification that regulates signalling and protein turnover in eukaryotic cells. However, many ubiquitin E3 ligases remain poorly understood. The mammalian muskelin/RanBP9/CTLH complex contains eight proteins, five of which, RanBP9 (RanBPM), TWA1, Maea, Rmnd5a and muskelin, share striking similarities of domain organisation. In Saccharomyces cerevisiae, the related GID complex includes the Rmnd5a homologue GID2, which has E3 ubiquitin ligase activity and down-regulates gluconeogenesis. E3 ubiquitin ligase activity of mammalian Rmnd5a has not been reported. To better understand the large mammalian complex, a major goal of this thesis was to analyse its evolution as a multi-protein system. Bioinformatic studies identify that TWA1, Rmnd5 and Maea are conserved throughout five eukaryotic supergroups. RanBPM is absent from excavates and from some lineages within other supergroups, and muskelin is present only in opisthokonts. Phylogenetic analysis based on the shared sequence regions that correspond to the lissencephaly-1 homology (LisH) and C-terminal to LisH (CTLH) domains revealed closer relationships between Rmnd5 and MAEA, and TWA1 and RanBPM, respectively. In-depth sequence analyses confirmed the greater similarity of the LisH/CTLH domains of Rmnd5 and MAEA vs. TWA1 and RanBPM, respectively, and identified unique signatures of conserved residues within the LisH and CTLH domains of each protein. A further goal was to purify and express Rmnd5a and TWA1 for laboratory experiments. Bacterially expressed Rmnd5a exhibits E3 ubiquitin ligase activity in Escherichia coli BL21 lysates but not as a purified protein. Bacterial expression and purification of TWA1 enabled biophysical characterisation of TWA1 as an all α-helical, natively dimerised protein. TWA1 crystals were produced. When optimized, crystals diffracted to 3.5 Å, though a 3D structure was not resolved. 
Threaded structure predictions of Rmnd5a and TWA1 agreed with secondary structure prediction algorithms. These studies advance knowledge of structural/functional relationships of proteins in this poorly understood complex.
19

Rands, Chris M. D. "Analyses of functional sequence in mammalian and avian genomes." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:27e0ac20-eb27-423c-9493-a8a1c6cc57b8.

Abstract:
The first draft sequence of the human genome was published over a decade ago, yet interpreting the functional importance of nucleotides in genomes is still an ongoing challenge. I took a comparative genomic approach to identify functional sequence using signatures of natural selection in DNA sequences. Mutations that are purged or propagated by selection mark sequences of significance for biological fitness. I developed and refined methods for estimating the quantity of sequence constrained with respect to insertions and deletions (indels) between two genome sequences, a quantity I termed αselIndel. This sequence is evolving more slowly than surrounding neutral sequence due to the purging of deleterious indel variants, and thus this sequence is likely to be functional. I estimated αselIndel between diverse mammalian and avian species pairs, and found a strong negative correlation between αselIndel and the divergence between the species’ genome sequences. This implies that functional sequence turns over rapidly as it is lost and gained over time. I quantified the variable levels of sequence constraint, and rates of sequence turnover, for different types of biochemically annotated human elements. Furthermore, I found that similar rates of functional turnover have occurred across mammalian and avian evolution. Finally, I identified positively selected amino acid residues that may be important for Darwin’s finch beak development, and found evidence of adaptively evolving reproductive proteins in the ancestral songbird lineage. Collectively, these results demonstrate the widespread nature of lineage-specific functional sequence, with implications for understanding species traits and the use of model organisms to inform human biology.
20

Acar, Hande. "Bioinformatic Analyses In Microsatellite-based Genetic Diversity Of Turkish Sheep Breeds." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12612585/index.pdf.

Abstract:
In the present study, within- and among-breed genetic diversity in thirteen Turkish sheep breeds (Sakiz, Karagül, Hemsin, Çine Çapari, Norduz, Herik, Akkaraman, Dagliç, Gökçeada, Ivesi, Karayaka, Kivircik and Morkaraman, in total represented by 628 individuals) was analyzed based on 20 microsatellite loci. Loci were amplified by polymerase chain reaction, and the products were electronically recorded and converted into a [628 x 20] matrix representing the genotypes of individuals. The reliability of the genotyping was assessed and the genetic diversity analyses were carried out by means of various bioinformatics tools. For the analyses, various statistical methods (Fisher's Exact Test, Neighbor-Joining tree construction, Factorial Correspondence Analysis (FCA), Analysis of Molecular Variance, Structure Analysis and Delaunay Analysis) were used. Since the inputs of some software were not compatible with the outputs of other software, Java classes were written whenever necessary. Analyses revealed that, among the major breeds, Dagliç, Karayaka and Morkaraman are highly admixed, whereas Kivircik, Akkaraman and Ivesi are relatively distinct. Among the minor breeds, the distinctness of Hemsin, Sakiz, Çine Çapari, Gökçeada and Karagül is more pronounced than that of any other examined breed. Since highly admixed individuals can be identified by Structure and FCA tests, the results of the present study, which is part of a national project with the acronym TURKHAYGEN-I (www.turkhaygen.gov.tr), were found to be promising for establishing and managing relatively pure conservation flocks of the Turkish native sheep breeds, which are believed to be reservoirs of genetic variability.
21

Guo, Cheng. "GENOME WIDE ANALYSES OF ALTERNATIVE POLYADENYLATION IN ARABIDOPSIS." Miami University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=miami1479081485753738.

22

Hixson, Kim Kathleen. "Network and Multi-Omics Analyses of Arabidopsis Arogenate Dehydratase Knock-Out and Over-Expression Mutants." Thesis, Washington State University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10785490.

Abstract:

Arogenate dehydratases (ADTs) are enzymes found within the aromatic amino acid pathway. They are responsible for catalyzing the final step in phenylalanine (Phe) biosynthesis in vascular plants. While being essential for protein production in all living systems, Phe is additionally the starting precursor to a multitude of secondary metabolites produced in the phenylpropanoid pathway. Our group discovered that by knocking out ADT isoenzymes in Arabidopsis thaliana, measurable reductions in lignin levels can be achieved in stem tissue. This finding provides the opportunity to study potential mechanisms related to lignin biosynthesis and could have implications in bioengineering applications where alterations in lignin level might be desired.

Any alteration to a gene family as important as that of the ADTs imparts plant-wide biomolecular changes. It is therefore important to know not only that lignin is reduced but also that optimal plant function is maintained, or to understand how it has changed, in order to mitigate any undesirable effects. Here we applied a multitude of analytical platforms and data analysis techniques to both ADT knock-out (KO) and over-expression (OE) lines. By using both KO and OE lines we could validate our findings, as KO and OE mutants of the same enzyme(s) typically show converse changes in biomolecular abundance. As a systems-level understanding was desired, we used a multi-omics strategy (metabolomics, transcriptomics and proteomics).

The identified metabolites showed which metabolites and metabolite classes were most affected. Major KEGG-defined pathway changes were identified at the transcript and protein enzyme family level. Integration of all omics data revealed which enzymatic reactions were most correlated with the observed metabolite abundance changes. Network and clustering algorithms identified patterns of molecular change between metabolites, transcripts and proteins, and these patterns were further correlated to reveal possible post-transcriptional regulatory processes involved in lignin biosynthesis.

Taken altogether, these data informed us of how ADT alterations affect the entire biomolecular system of Arabidopsis and also revealed targets for future studies aimed at elucidating further how lignin biosynthesis is regulated at the post-transcriptional and translational levels.

23

Cicek, A. Ercument. "METABOLIC NETWORK-BASED ANALYSES OF OMICS DATA." Case Western Reserve University School of Graduate Studies / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=case1372866879.

24

Lambert, Caroline L. "Identification and Description of Burkholderia pseudomallei Proteins that Bind Host Complement-Regulatory Proteins via in silico and in vitro Analyses." University of Toledo Health Science Campus / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=mco1533315186098586.

25

Snøve, Jr Ola. "Hardware-accelerated analysis of non-protein-coding RNAs." Doctoral thesis, Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, 2005. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-713.

Abstract:

A tremendous amount of genomic sequence data of relatively high quality has become publicly available due to the human genome sequencing projects that were completed a few years ago. Despite considerable efforts, we do not yet know everything there is to know about the various parts of the genome, what all the regions code for, and how their gene products contribute to the myriad of biological processes that are performed within the cells. New high-performance methods are needed to extract knowledge from this vast amount of information.

Furthermore, the traditional view that DNA codes for RNA that codes for protein, which is known as the central dogma of molecular biology, seems to be only part of the story. The discovery of many non-protein-coding gene families with housekeeping and regulatory functions brings an entirely new perspective to molecular biology. Also, sequence analysis of the new gene families requires new methods, as there are significant differences between protein-coding and non-protein-coding genes.

This work describes a new search processor that can search for complex patterns in sequence data for which no efficient lookup-index is known. When several chips are mounted on search cards that are fitted into PCs in a small cluster configuration, the system’s performance is orders of magnitude higher than that of comparable solutions for selected applications. The applications treated in this work fall into two main categories, namely pattern screening and data mining, and both take advantage of the search capacity of the cluster to achieve adequate performance. Specifically, the thesis describes an interactive system for exploration of all types of genomic sequence data. Moreover, a genetic programming-based data mining system finds classifiers that consist of potentially complex patterns that are characteristic for groups of sequences. The screening and mining capacity has been used to develop an algorithm for identification of new non-protein-coding genes in bacteria; a system for rational design of effective and specific short interfering RNA for sequence-specific silencing of protein-coding genes; and an improved algorithmic step for identification of new regulatory targets for the microRNA family of non-protein-coding genes.


Papers V, VI, and VII are reprinted with kind permission of Elsevier, sciencedirect.com
26

Piatkowski, Bryan. "Axillary hair developmental ultrastructure and mucilage composition in the moss Physcomitrella patens: Microscopic and bioinformatic analyses." OpenSIUC, 2015. https://opensiuc.lib.siu.edu/theses/1841.

Abstract:
Physcomitrella patens, a haploid-dominant land plant, has increasingly become useful in molecular genetic studies and is a model for early land plant evolution. This thesis work explores the mucilage secretory hair ontology, development, and ultrastructure with microscopic methods. Axillary hair development parallels that of secretory tissues found in other mosses and ultrastructure shares important similarities with liverwort mucilage papillae. These mucilage secretory structures cover the developing apex and young leaves with mucilage for protection. Changes in the hair cell wall and mucilage secretion are mediated by pectin and wall modification. Using bioinformatic methods, this thesis also investigates protein-protein interactions in Physcomitrella to understand the molecular mechanisms governing pectin biosynthesis and modification.
27

Andersson, Robin. "Decoding the Structural Layer of Transcriptional Regulation : Computational Analyses of Chromatin and Chromosomal Aberrations." Doctoral thesis, Uppsala universitet, Centrum för bioinformatik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-130999.

Abstract:
Gene activity is regulated at two separate layers. Through structural and chemical properties of DNA – the primary layer of encoding – local signatures may enable, or disable, the binding to the DNA of proteins, or complexes of proteins, with regulatory potential. At a higher level – the structural layer of encoding – gene activity is regulated through the properties of higher order DNA structure, chromatin, and chromosome organization. Cells with abnormal chromosome compaction or organization, e.g. cancer cells, may thus have perturbed regulatory activities resulting in abnormal gene activity. Hence, there is a great need to decode the transcriptional regulation encoded in both layers to further our understanding of the factors that control the activity and life of a cell and, ultimately, an organism. Modern genome-wide studies with those aims rely on data-intensive experiments requiring sophisticated computational and statistical methods for data handling and analysis. This thesis describes recent advances in analyzing experimental data from quantitative biological studies to decipher the structural layer of encoding in human cells. Adopting an integrative approach when possible, combining multiple sources of data, allowed us to study the influences of chromatin (Papers I and II) and chromosomal aberrations (Paper IV) on transcription. Combining chromatin data with chromosomal aberration data allowed us to identify putative driver oncogenes and tumor-suppressor genes in cancer (Paper IV). Bayesian approaches, which enable the incorporation of background information in the models and the adaptation of such models to data, have been very useful. Their use yielded accurate and narrow detection of chromosomal breakpoints in cancer (Papers III and IV) and reliable positioning of nucleosomes and their dynamics during transcriptional regulation at functionally relevant regulatory elements (Paper II).
Using massively parallel sequencing data, we explored the chromatin landscapes of human cells (Papers I and II) and concluded that there is a preferential and evolutionarily conserved positioning at internal exons, nearly unaffected by the transcriptional level. We also observed a strong association between certain histone modifications and the inclusion or exclusion of an exon in the mature gene transcript, suggesting a functional role in splicing.
28

Soeria-Atmadja, Daniel. "Novel Computational Analyses of Allergens for Improved Allergenicity Risk Assessment and Characterization of IgE Reactivity Relationships." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis : Universitetsbiblioteket [distributör], 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-9313.

29

Ohniwa, Ryosuke L. "Comparative analyses of genome architectures among prokaryote, organelle and eukaryote by nano-scale imaging, molecular genetics and bioinformatics." 京都大学 (Kyoto University), 2007. http://hdl.handle.net/2433/136993.

30

Yang, Bo. "Analyses bioinformatiques et classements consensus pour les données biologiques à haut débit." Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112250/document.

Abstract:
In the post-genomic era, bioinformatics approaches are increasingly important for answering biological questions. This thesis addresses two problems related to the analysis and processing of high-throughput biological data: large-scale bioinformatic analysis of genomes, and the development of algorithms for finding a consensus ranking of several rankings.

RNA splicing is a cellular process that modifies the nascent pre-messenger RNA (pre-mRNA) transcript by removing introns and joining exons. The U2AF heterodimer has been well studied for its role in defining functional 3’ splice sites in pre-mRNA splicing, but several critical problems remain open, including the functional impact of cancer-associated mutations of these sites. Through genome-wide analysis of U2AF-RNA interactions, we report that U2AF has the capacity to define ~88% of functional 3’ splice sites in the human genome. Numerous U2AF binding events also occur at other genomic locations, and metagene and minigene analyses suggest that upstream intronic binding events interfere with the immediate downstream 3’ splice site, associated either with the alternative exon, causing exon skipping, or with the competing constitutive exon, inducing inclusion of the alternative exon. We further built a U2AF65 scoring scheme for predicting its target sites from high-throughput sequencing data using a Maximum Entropy machine learning method; the scores for the up- and down-regulated cases are consistent with our regulation model. These findings clarify the genomic function and regulatory mechanism of U2AF, which helps in understanding the associated diseases.

Ranking biological data is a crucial need. Rather than developing new ranking methods, Cohen-Boulakia and her colleagues proposed generating a consensus ranking that highlights the common points of a set of rankings while minimizing their disagreements, to counter the noise and errors in biological data. We studied the problem of computing a consensus of several rankings that may contain ties: the goal is to find a ranking whose sum of distances to the input rankings is minimal. The distance most frequently used for this problem is the generalized Kendall-tau distance. However, the consensus problem under this distance is NP-hard even for only four input rankings. We propose a heuristic named Consistent-Pivot, a new variant of pivot algorithms that uses a new strategy for selecting the pivot and assigning the other elements; it is both more accurate and faster than previously proposed pivot algorithms.
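The consensus objective described in this abstract can be illustrated with a minimal sketch. It assumes one common convention for the Kendall-tau distance with ties, in which a pair of elements costs 1 whenever the two rankings disagree on it (inverted order, or tied in one ranking but not in the other); the thesis may weight ties differently. The function names and the bucket-list representation are hypothetical, and the sketch only scores a candidate consensus. It does not implement Consistent-Pivot.

```python
from itertools import combinations

def positions(ranking):
    """Map each element to the index of its bucket; tied elements share an index."""
    return {item: idx for idx, bucket in enumerate(ranking) for item in bucket}

def generalized_kendall_tau(r1, r2):
    """Count element pairs on which two rankings with ties disagree.

    Assumption: an inversion and a tie/no-tie mismatch both cost 1.
    Both rankings must contain the same elements.
    """
    p1, p2 = positions(r1), positions(r2)
    dist = 0
    for a, b in combinations(sorted(p1), 2):
        # Sign of the relative order: -1 (a before b), 0 (tied), 1 (a after b).
        s1 = (p1[a] > p1[b]) - (p1[a] < p1[b])
        s2 = (p2[a] > p2[b]) - (p2[a] < p2[b])
        if s1 != s2:
            dist += 1
    return dist

def consensus_cost(candidate, rankings):
    """Sum of distances from a candidate ranking to all input rankings."""
    return sum(generalized_kendall_tau(candidate, r) for r in rankings)

# Each ranking is a list of buckets; elements in the same bucket are tied.
r1 = [["A"], ["B"], ["C"]]
r2 = [["B"], ["A"], ["C"]]
r3 = [["A", "B"], ["C"]]
cost = consensus_cost(r1, [r1, r2, r3])
```

A consensus method then searches for the candidate with the lowest such cost; exhaustive search is only feasible for very small inputs, which is why heuristics such as pivot algorithms are used.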
31

Wang, Hao. "THE POTENTIAL INDUCING PATTERN OF THE FLAX GENOME." Case Western Reserve University School of Graduate Studies / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=case1532609009820723.

32

Hu, Ke. "METHODS AND ANALYSES IN THE STUDY OF HUMAN DNA METHYLATION." Case Western Reserve University School of Graduate Studies / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=case1522760441838452.

33

Hsing, Michael. "Developing bioinformatics tools and analyses on protein indels and protein-protein interactions : novel applications for drug discovery in Staphylococcus aureus." Thesis, University of British Columbia, 2010. http://hdl.handle.net/2429/18714.

Abstract:
Infectious diseases caused by bacterial pathogens continue to be major public health concerns affecting millions of human lives annually, as conventional treatment via antibiotics has lost its effectiveness due to growing problems of drug resistance. Recent advancements in systems biology, high-throughput sequencing, protein interaction study and computer-aided drug development can offer possible solutions to antibiotic resistance through discovery of novel antimicrobials. The thesis describes several bioinformatics approaches that focus on protein interaction network (PIN) studies, analyses of targetable protein indels (insertions and deletions) and virtual compound screening for new antibacterial candidates – approaches integrated into an antibiotic discovery pipeline for methicillin-resistant Staphylococcus aureus (MRSA252). In the course of the described work we identified new drug targets corresponding to highly interacting proteins (hubs) through comprehensive PIN analysis in MRSA252. The advantage of using hub proteins as targets is established by their essentiality, non-replaceable PIN position and lower rate of mutation, all of which can help to counter bacterial resistance. To accelerate these studies, hub-predicting tools have been developed to assist proteomics experiments for PIN discovery and to facilitate drug target identification in pathogens. Because some bacterial proteins are conserved in humans, we applied the indel concept to locate unique compound-binding sites that enabled us to specifically target conserved and essential bacterial hubs. We demonstrated associations between the presence of sizable indels in proteins and their essentiality and network rewiring capability, which established indels as potential markers for drug targets.
To provide the research community with a fast and user-friendly web portal for identification and characterization of indel-bearing drug targets, the Indel PDB database has been developed to characterize the functional and structural features of 117,266 indel sites across numerous species. Finally, combining the above bioinformatics methodologies with a rapid and efficient procedure of virtual screening allowed discovery of compounds that effectively inhibited MRSA252 cell growth with no signs of human toxicity. We anticipate that the drug discovery pipeline, along with the established MRSA PIN resource, hub prediction tools and indel database, will provide a framework for the development of next-generation antibiotics in other existing or emerging pathogens.
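As a rough illustration of what "highly interacting proteins (hubs)" means in a protein interaction network, the sketch below counts each protein's distinct interaction partners and reports those at or above a cutoff. The protein identifiers, the `find_hubs` name and the degree cutoff are all hypothetical; the abstract does not state how hubs were defined or predicted in the thesis.

```python
from collections import Counter

def find_hubs(interactions, min_degree):
    """Return proteins with at least `min_degree` distinct interaction partners.

    `interactions` is an iterable of (protein_a, protein_b) pairs.
    Duplicate pairs, reversed pairs and self-interactions are ignored.
    """
    edges = {frozenset(pair) for pair in interactions if pair[0] != pair[1]}
    degree = Counter(protein for edge in edges for protein in edge)
    return sorted(p for p, d in degree.items() if d >= min_degree)

# Hypothetical toy network; ("P3", "P1") repeats the P1-P3 interaction.
toy_network = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P3", "P1"),
]
hubs = find_hubs(toy_network, min_degree=3)
```

Degree is only the simplest notion of a hub; the properties the abstract associates with good targets (essentiality, network position, mutation rate) would need additional data.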
34

Paytuví, Gallart Andreu. "Development and application of integrative tools for the functional and structural analyses of genomes." Doctoral thesis, Universitat Autònoma de Barcelona, 2019. http://hdl.handle.net/10803/667160.

Abstract:
Since the development of Sanger sequencing in 1977, technological advances have revolutionized the -omics field. Large-scale sequencing projects have generated an enormous amount of data, which has motivated the development of bioinformatics tools for its integration, organization and interpretation. Because the amount of sequencing data produced worldwide doubles every 7 months, data accessibility, processing and interpretation need to improve. The main aim of this work is therefore to develop bioinformatics tools for the analysis of the functional and structural characteristics of genomes.

On the one hand, the storage capacity and accessibility of -omics data have become a challenge, not only for raw data but also for post-processing results. This is the case for transcriptomics, currently one of the most funded -omics. To overcome the limitations of existing databases for plant lncRNAs, we developed Green Non-Coding (GreeNC), one of the most comprehensive online databases in the field, which includes 39 plant species and 6 algae and stores more than 200,000 lncRNAs. On the other hand, the availability of user-friendly tools for efficient large-scale data analysis and management would help to democratize bioinformatics. Several software tools have recently emerged to allow the analysis of RNA-seq data in an accessible way. However, none of them provides an end-to-end solution. In this context, we took advantage of cloud computing to develop an easy-to-use platform called Artificial Intelligence RNA-seq (AIR). AIR is the first end-to-end solution for the analysis of RNA-seq data that is not limited to model species and does not require previous bioinformatics skills.

Once developed, AIR was validated using RNA-seq samples derived from mouse spermatogenic germ cells produced in our research group. We observed an increase in the prevalence of non-coding genes during spermatogenesis and detected silencing of the X chromosome. We also identified differentially expressed genes that were consistent with the sequential development of spermatogenesis. Indeed, the genome is known to undergo large three-dimensional (3D) conformational changes during spermatogenesis. To characterize this 3D re-organization, we used AIR and additional tools for Hi-C data analysis to generate an integrative atlas of the chromatin interactions and functional genomic characteristics of the mouse male germ line. Our results revealed previously undescribed patterns: (i) the sub-chromosomal organization scale is lost during prophase I; (ii) the sub-megabase organization scale becomes diffuse during spermatogenesis, especially in sperm; (iii) specific events such as the telomere bouquet and X chromosome inactivation were observed; and (iv) cell-specific open conformations correlated with the expression of genes with relevant functional roles. Overall, we have developed new bioinformatics solutions to enhance the accessibility, processing and interpretation of -omics data, which permitted the analysis of functional and structural features of genomes.
35

Shtarkman, Yury M. "Metagenomic And Metatranscriptomic Analyses Of Lake Vostok Accretion Ice." Bowling Green State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1438867879.

36

Vikova, Veronika. "Analyses génomiques et épigénomiques pour le développement d’une médecine de précision dans le myélome multiple." Thesis, Montpellier, 2019. http://www.theses.fr/2019MONTT031.

Abstract:
Multiple myeloma (MM) is the second most common hematological malignancy after lymphoma. Advances in treatment over the last 20 years have led to an overall survival of 6-7 years for intensively treated patients. However, current treatments do not prevent repeated relapses: patients invariably relapse after multiple lines of treatment, with shortened intervals between relapses, and finally become resistant to all treatments, resulting in loss of clinical control over the disease. This resistance is explained in particular by the strong heterogeneity of the disease, which makes it necessary to develop care adapted to patients' molecular profiles. High-throughput sequencing technologies give access to increasingly detailed levels of tumor molecular heterogeneity, and treatment improvements will come from a better comprehension of tumorigenesis and from detailed molecular analyses that support individualized therapies taking into account molecular heterogeneity and subclonal evolution. To this end, we analyzed the exome, transcriptome and epigenome of primary MM cells from patients and of human MM cell lines. Our results have highlighted new mechanisms involved in the pathophysiology of MM as well as potential new therapeutic targets, prognostic signatures and theranostic biomarkers. The data and results of our studies represent an important resource for understanding the mechanisms of tumor progression and drug resistance and for developing new ways to diagnose and treat patients with MM.
37

Taraboletti, Alexandra Anna. "Chemical and Metabolomic Analyses of Cuprizone-Induced Demyelination and Remyelination." University of Akron / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=akron1498535047689141.

38

Johnson, Travis Steele. "Integrative approaches to single cell RNA sequencing analysis." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1586960661272666.

39

Huque, Enamul. "Shape Analysis and Measurement for the HeLa cell classification of cultured cells in high throughput screening." Thesis, University of Skövde, School of Humanities and Informatics, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-27.

Abstract:

Feature extraction by digital image analysis and cell classification is an important task for cell culture automation. In High Throughput Screening (HTS), where thousands of data points are generated and processed at once, features are extracted and cells are classified to decide whether the cell culture is proceeding normally. The culture is restarted if a problem is detected. In this thesis project HeLa cells, which are human epithelial cancer cells, are selected for the experiment. The purpose is to classify two types of HeLa cells in culture: cells in cleavage, which are round floating cells (stressed or dead cells are also round and floating), and normally growing cells, which are attached to the substrate. As the number of cells in cleavage will always be smaller than the number of normally growing, attached cells, the count of attached cells should be higher than the count of round cells. Five different HeLa cell images are used. For each image, every single cell is obtained by image segmentation and isolation, and different mathematical features are computed for each cell. The feature set for this experiment is chosen so that the features are robust, discriminative and generalise well for classification. Almost all the features presented in this thesis are rotation, translation and scale invariant, so they are expected to perform well in discriminating objects or cells with any classification algorithm. Some new features are added which are believed to improve the classification result. The feature set is considerably broader than the restricted sets used in previous work. The features are implemented against a common interface so that the library can be extended and integrated into other applications. They are fed into a machine learning algorithm called Linear Discriminant Analysis (LDA) for classification.
Cells are then classified as ‘Cells attached to the substrate’ (Cell Class A) or ‘Cells in cleavage’ (Cell Class B). LDA is run while leaving out and adding shape features to increase performance. On average, the classification accuracy is higher than ninety-five percent, as validated against visual classification.
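The abstract does not list the individual features, so the following is only a hedged illustration of what a rotation-, translation- and scale-invariant shape feature can look like. It computes circularity, 4πA/P², for a cell outline given as a polygon: the value approaches 1 for round outlines (as expected for floating cells in cleavage) and is lower for irregular ones. The `circularity` name and the polygon input format are assumptions, not part of the thesis's feature library.

```python
import math

def circularity(contour):
    """Return 4*pi*area / perimeter**2 for a closed polygon.

    `contour` is a list of (x, y) vertices in order; the last vertex is
    joined back to the first. Area uses the shoelace formula.
    """
    n = len(contour)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    if perimeter == 0.0:
        raise ValueError("contour has zero perimeter")
    return 4.0 * math.pi * (abs(twice_area) / 2.0) / perimeter ** 2

# The same square shape: unit-sized, then scaled and shifted, then rotated by 45 degrees.
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved_square = [(10, 10), (13, 10), (13, 13), (10, 13)]
rotated_square = [(1, 0), (0, 1), (-1, 0), (0, -1)]

# A 64-sided regular polygon approximates a round cell outline.
round_outline = [
    (math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
    for k in range(64)
]
```

Because area scales with the square of size and perimeter scales linearly, the ratio is unchanged by scaling; translation and rotation leave both area and perimeter unchanged. A vector of such features per cell is the kind of input a classifier such as LDA would receive.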

40

Wakadkar, Sachin. "Analysis of transmembrane and globular protein depending on their solvent energy." Thesis, University of Skövde, School of Life Sciences, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-2971.

Abstract:

The number of experimentally determined protein structures in the Protein Data Bank (PDB) is continuously increasing. Common features such as cellular location, function, topology, primary structure, secondary structure, tertiary structure, domains or fold are used to classify them, so various methods are available for protein classification. In this work we explore an additional criterion for classification: solvent energy. Solvation is one of the most important properties of macromolecules and biological membranes, by which they remain stabilized in different environments. The energy required for solvation can be measured in terms of solvent energy. Proteins from similar environments are investigated for similar solvent energy; that is, solvent energy can be used as a measure to analyze and classify proteins. In this project the solvent energy of proteins present in the PDB was calculated using Jones’ algorithm. The proteins were classified into two classes: transmembrane and globular. Statistical analysis showed that the solvent energy values obtained for the two main classes (globular and transmembrane) came from different populations. Thus, classification based on solvent energy should help in predicting cellular location.

 

41

Chawade, Aakash. "Inferring Gene Regulatory Networks in Cold-Acclimated Plants by Combinatorial Analysis of mRNA Expression Levels and Promoter Regions." Thesis, University of Skövde, School of Humanities and Informatics, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-20.

Abstract:

Understanding the cold acclimation process in plants may help us develop genetically engineered plants that are resistant to cold. The key to understanding this process is to study the genes, and thus the gene regulatory network, involved in cold acclimation. Most of the existing approaches [1-8] to deriving regulatory networks rely only on gene expression data. Since expression data are usually noisy and sparse, the networks generated by these approaches are usually incoherent and incomplete. Hence a new approach is proposed here that analyzes the promoter regions along with the expression data when inferring the regulatory networks. In this approach genes are grouped into sets if they contain similar over-represented motifs or motif pairs in their promoter regions and if their expression pattern follows the expression pattern of the regulating gene. The network thus derived is evaluated using known literature evidence, functional annotations and statistical tests.

42

Steinberg, Julia. "Functional genomics analyses of neuropsychiatric and neurodevelopmental disorders." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:e47d1ac2-de92-47d8-864b-dac0bf6669e8.

Abstract:
Recent large-scale genome-wide studies for many human disorders have identified associations with numerous genetic variants. The biological interpretation of these variants presents a major challenge. In particular, the identification of biological pathways underlying the association could provide crucial insights into the disease aetiologies. In this thesis, I used functional genomics approaches to increase our understanding of neuropsychiatric and neurodevelopmental disorders. Firstly, in an integrative analysis of autism spectrum disorder (ASD), I looked into the role of genes targeted by Fragile-X Mental Retardation Protein ("FMRP targets"). I found evidence that FMRP targets contribute to ASD via two distinct aetiologies: (1) ultra-rare and highly penetrant single disruptions of embryonically upregulated FMRP targets ("single-hit aetiology") or (2) the combination of multiple less penetrant disruptions of synaptic FMRP targets ("multiple-hit aetiology"). In particular, I developed a pathway-association test sensitive to multiple-hit aetiologies. Secondly, I carried out an integrative analysis of bipolar disorder, following up a previously identified association with long-term potentiation. The association was not consistent across independent SNP and CNV datasets. Thirdly, I addressed the difficulty in identifying functional relationships between genes by integrating different datasets into a gene functional-linkage network tuned to the nervous system ("NsNet"). NsNet identified functional links between the genes disrupted by de novo loss-of-function mutations in ASD and, separately, in schizophrenia probands more sensitively than a general functional-linkage network. Fourthly, I considered the challenge of interpreting the phenotypic impact of gene disruptions, focusing on the identification of haploinsufficient genes. I constructed a gene haploinsufficiency score based on genome-wide datasets. 
Compared to existing approaches, the new score performed better in identifying less-studied haploinsufficient genes. This work both extends the methodology to detect the contribution of genetic variation to neuropsychiatric disorders and also yields insights into the variant genes and the pathways that underlie them.
43

Vukcevic, Damjan. "Bayesian and frequentist methods and analyses of genome-wide association studies." Thesis, University of Oxford, 2009. http://ora.ox.ac.uk/objects/uuid:8f89593e-a4ab-4df0-b297-74194be7891c.

Abstract:
Recent technological advances and remarkable successes have led to genome-wide association studies (GWAS) becoming a tool of choice for investigating the genetic basis of common complex human diseases. These studies typically involve samples from thousands of individuals, scanning their DNA at up to a million loci along the genome to discover genetic variants that affect disease risk. Hundreds of such variants are now known for common diseases, nearly all discovered by GWAS over the last three years. As a result, many new studies are planned for the future or are already underway. In this thesis, I present analysis results from actual studies and some developments in theory and methodology. The Wellcome Trust Case Control Consortium (WTCCC) published one of the first large-scale GWAS in 2007. I describe my contribution to this study and present the results from some of my follow-up analyses. I also present results from a GWAS of a bipolar disorder sub-phenotype, and a recent and on-going fine mapping experiment. Building on methods developed as part of the WTCCC, I describe a Bayesian approach to GWAS analysis and compare it to widely used frequentist approaches. I do so both theoretically, by interpreting each approach from the perspective of the other, and empirically, by comparing their performance in the context of replicated GWAS findings. I discuss the implications of these comparisons on the interpretation and analysis of GWAS generally, highlighting the advantages of the Bayesian approach. Finally, I examine the effect of linkage disequilibrium on the detection and estimation of various types of genetic effects, particularly non-additive effects. I derive a theoretical result showing how the power to detect a departure from an additive model at a marker locus decays faster than the power to detect an association.
44

Bresell, Anders. "Characterization of protein families, sequence patterns, and functional annotations in large data sets." Doctoral thesis, Linköping : Department of Physics, Chemistry and Biology, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-10565.

45

Dwivedi, Ankit. "Functional analysis of genomic variations associated with emerging artemisinin resistant P. falciparum parasite populations and human infecting piroplasmida B. microti." Thesis, Montpellier, 2016. http://www.theses.fr/2016MONTT073/document.

Abstract:
The ongoing WHO malaria elimination program is threatened by the emergence and potential spread of artemisinin-resistant Plasmodium falciparum parasites. Recent reports have shown (a) SNPs in a region of chromosome 13 to be under strong recent positive selection in Cambodia, (b) the presence of artemisinin-resistant and artemisinin-sensitive P. falciparum subpopulations in Cambodia, (c) evidence that mutations in the Kelch propeller domain of the k13 gene are major determinants of artemisinin resistance in the Cambodian parasite population, and (d) parasite subpopulations in northern Cambodia near Thailand and Laos that are resistant to mefloquine and carry the R539T allele of the k13 gene. Identifying the genetic basis of resistance is important to monitor and control the transmission of resistant parasites to the rest of the world, and to understand parasite metabolism for the development of new drugs. This thesis focuses on the analysis of P. falciparum population structure in Cambodia, the description of the metabolic properties of these subpopulations, and the gene flow among them. This could help in identifying the genetic evidence associated with the transmission and acquisition of artemisinin resistance across the country. First, a barcode approach was used to identify parasite subpopulations using a small number of loci. A mid-throughput PCR-LDR-FMA approach based on LUMINEX technology was used to screen for SNPs in 537 blood samples (2010 - 2011) from 16 health centres in Cambodia. Based on successful typing of 282 samples, subpopulations were characterized along the borders of the country. Gene flow was described based on the gradient of alleles at the 11 loci in the barcode. The barcode successfully identifies recently emerged parasite subpopulations associated with artemisinin and mefloquine resistance. In the second approach, the parasite population structure was defined based on 167 parasite NGS genomes (2008 - 2011) originating from four locations in Cambodia, recovered from the ENA database. Based on calling of 21257 SNPs, eight parasite subpopulations were described. The presence of admixed parasite subpopulations could be supporting the transmission of artemisinin resistance. Functional analysis based on significant genes validated a similar genetic background for resistant isolates and revealed the PI3K pathway in resistant populations, supporting acquisition of resistance by helping the parasite remain in the ring stage. Our findings question the origin and persistence of the P. falciparum subpopulations in Cambodia, provide evidence of gene flow among subpopulations, and describe a model of artemisinin resistance acquisition. The variant calling approach was also applied to the Babesia microti genome. This parasite is responsible for human babesiosis, a malaria-like syndrome, and is endemic in the north-eastern USA. The objective was to validate the taxonomic position of B. microti as an out-group among piroplasmida and to improve the functional genome annotation based on genetic variation, gene expression, and protein antigenicity. We identified new proteins involved in parasite-host interactions.
46

Odelgard, Anna. "Coverage Analysis in Clinical Next-Generation Sequencing." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-379228.

Abstract:
With the arrival of NGS, new tools had to be developed to work with new data formats, to handle larger data sizes than previous techniques produced, and to check the accuracy of the data. Coverage analysis is one important quality control for NGS data: the coverage indicates how many times each base pair has been sequenced and thus how trustworthy each base call is. For clinical purposes every base of interest must be quality controlled, as one wrong base call could affect the patient negatively. Software that performs coverage analysis with enough accuracy and detail for clinical applications is sparse. Several programs, such as Samtools, can calculate coverage values but do not process this information further to produce a QC report for each base pair of interest. My master's thesis has therefore been to create a new coverage analysis report tool, named CAR tool, that extracts the coverage values from Samtools and uses these data to produce a report consisting of tables, lists and figures. CAR tool was created to replace the currently used tool, ExCID, at the Clinical Genomics facility at SciLifeLab in Uppsala and was developed to meet the needs of the bioinformaticians and clinicians. CAR tool is written in Python and launched from a terminal window. The main function of the tool is to display coverage breadth values for each region of interest and to extract all subregions below a chosen coverage depth threshold. The low coverage regions are then reported together with region name, start and stop positions, length and mean coverage value. To make the tool useful to as many users as possible, several settings can be selected by passing different flags when calling the tool. Such settings include generating pie charts of each region’s coverage values, filtering reads and bases by quality, or supplying your own entry to be used for the coverage calculation by Samtools. The tool has been shown to find these low coverage regions very well. Most low coverage regions found are also found by ExCID; some differences did occur, however, and every such region was verified in IGV. The coverage values shown in IGV coincided with those found by CAR tool. CAR tool is written to find all low coverage regions even if they are only one base pair long, while ExCID instead seems to generate larger low coverage regions, not taking very short ones into account. To read more about the functions and how to use CAR tool, I refer to the user instructions in the appendix and on GitHub at the repository anod6351
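The core step described in this abstract, reporting coverage breadth and extracting every subregion below a depth threshold, can be illustrated with a minimal Python sketch. This is not CAR tool's actual code: the names `LowRegion`, `coverage_breadth` and `low_coverage_regions` are hypothetical, and the sketch assumes that per-base depths for a region of interest have already been obtained (for example from `samtools depth` output) as a plain list of integers.

```python
from typing import List, NamedTuple


class LowRegion(NamedTuple):
    """One contiguous run of positions below the depth threshold (hypothetical record)."""
    name: str          # name of the enclosing region of interest
    start: int         # first low-coverage position, inclusive
    stop: int          # last low-coverage position, inclusive
    length: int        # number of positions in the run
    mean_depth: float  # mean coverage over the run


def coverage_breadth(depths: List[int], threshold: int) -> float:
    """Fraction of positions whose depth is at least `threshold`."""
    if not depths:
        return 0.0
    return sum(d >= threshold for d in depths) / len(depths)


def _make_region(name: str, region_start: int, depths: List[int],
                 lo: int, hi: int) -> LowRegion:
    """Build a record for the half-open offset window [lo, hi)."""
    window = depths[lo:hi]
    return LowRegion(name, region_start + lo, region_start + hi - 1,
                     len(window), sum(window) / len(window))


def low_coverage_regions(name: str, region_start: int, depths: List[int],
                         threshold: int) -> List[LowRegion]:
    """Return every maximal run of positions with depth below `threshold`.

    Runs of a single base pair are reported too, as the abstract describes.
    `region_start` is the genomic coordinate of depths[0].
    """
    regions = []
    run_start = None  # offset where the current low run began, if any
    for offset, depth in enumerate(depths):
        if depth < threshold:
            if run_start is None:
                run_start = offset
        elif run_start is not None:
            regions.append(_make_region(name, region_start, depths, run_start, offset))
            run_start = None
    if run_start is not None:  # a run that extends to the end of the region
        regions.append(_make_region(name, region_start, depths, run_start, len(depths)))
    return regions


if __name__ == "__main__":
    example = [30, 25, 5, 8, 40, 2, 50]  # made-up depths for illustration only
    print(coverage_breadth(example, 20))
    for region in low_coverage_regions("exon1", 100, example, 20):
        print(region)
```

Scanning each region once keeps the sketch linear in region length, and reporting maximal runs rather than fixed windows is what allows single-base low-coverage regions to appear in the output.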
47

Stenerlöw, Oskar. "Artefact detection in microstructures using image analysis." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-417342.

Abstract:
Gyros Protein Technologies AB produces instruments designed to perform automated immunoassays on plastic CDs with microstructures. While the process is generally very robust, the company had noticed that some runs on the instruments encountered problems. They hypothesised that the cause was the chamber on the CD to which the sample is added: it was believed that the chamber was not being filled properly, leaving it completely empty or containing a small amount of air rather than liquid. This project aimed to investigate this hypothesis and to develop an image analysis solution that could reliably detect these occurrences. An image analysis script was developed which mainly utilised template matching and Canny edge detection to assess the presence of air. The analysis had great success in detecting empty chambers and large bubbles of air, while it had some trouble discerning small bubbles from dirt on top of the CD. Evaluated on a test set of 1305 images annotated by two people, the analysis achieved accuracies of 96.8 % and 99.5 %, respectively.
48

Tubeuf, Helene. "Développement de stratégies de criblage de mutations d'épissage dans des gènes de prédisposition au cancer. Demystifying the splicing code: new bioinformatics insights for the interpretation of genetic variants A staggering number of genetic variations affect the splicing pattern of BRCA2 exon 7: validation of the predictive power of splicing-dedicated silico analyses MLH1 exon 7, an emblematic exon sensitive to intronic mutations but not to alterations of exonic splicing regulators, sheds light into the performance of SRE-dedicated bioinformatics approaches Calibration of pathogenicity of partial splicing defects: The model of BRCA2 Exon 3." Thesis, Normandie, 2019. http://www.theses.fr/2019NORMR009.

Abstract:
The development of high-throughput DNA sequencing has greatly facilitated the screening of genetic variations within patient genomes. One of the main challenges in medical genetics is therefore no longer the detection of variations but their functional and clinical interpretation. Recently, we showed by using splicing reporter minigene assays that, although splicing mutations, and in particular those affecting splicing regulation, are more prevalent than initially estimated, their effect can now be predicted by using dedicated bioinformatics tools. We thus extended the evaluation of the predictive power of these four newly developed computational approaches through a comparative study of the scores they produce against experimental data for a total of about 1200 exonic variations. Our findings demonstrated the reliability of these approaches, used alone or in combination, and allowed us to offer recommendations for their use as filtering tools to prioritize the variations to be analysed in splicing-dedicated functional assays. Nevertheless, an exhaustive mutational analysis targeting MLH1 exon 7 highlighted the apparent failure of these approaches, even though they had been validated by studies focused on BRCA2 exon 7, MAPT exon 10 and MSH2 exon 5, suggesting that these methods might not be equally applicable to all exons and/or genes. Indeed, we showed that this exon has particular characteristics, i.e. remarkably strong splice sites, that confer total resistance to splicing regulation mutations and defeat prediction tools. These findings help to better determine the limitations of these bioinformatics tools while contributing to their improvement. In spite of these advances, assessing the pathogenicity of splicing mutations remains complicated, especially for those leading to in-frame and/or partial splicing anomalies. By using variant-induced partial BRCA2 exon 3 skipping as a model system, we showed that BRCA2 tumor suppressor function tolerates a substantial reduction in expression level, as a BRCA2 allele producing as much as 70% of transcripts encoding a deficient protein may not necessarily confer a high risk of developing cancer. Altogether, these data have important implications for the molecular diagnosis and clinical management of patients and their relatives, with a direct benefit for families with suspected hereditary cancer predisposition, and should contribute to the interpretation of variants of unknown significance (VSI) identified by high-throughput sequencing in any other genetic disease.
49

Batut, Bérénice. "Étude de l'évolution réductive des génomes bactériens par expériences d'évolution in silico et analyses bioinformatiques." Thesis, Lyon, INSA, 2014. http://www.theses.fr/2014ISAL0108/document.

Abstract:
According to a popular view, evolution is a process of progress accompanied by an increase in the molecular complexity of organisms. However, genome sequencing has revealed species whose lineages have instead undergone massive genome reduction, such as the endosymbionts. In that case the reduction can be explained by Muller’s ratchet, owing to the endosymbiont lifestyle with small populations and lack of recombination: natural selection is not efficient enough to eliminate deleterious mutations such as gene losses, notably because these bacteria undergo a drastic reduction in population size at each reproduction of their host. However, in some marine bacteria, such as Prochlorococcus and Pelagibacter, lineages have undergone up to 30% genome reduction. Their free-living lifestyle, as some of the most abundant bacteria in the oceans, is almost the opposite of that of the endosymbionts, and their reductive genome evolution cannot easily be explained by Muller’s ratchet. Other hypotheses have been proposed, such as adaptation to a stable, nutrient-poor environment or high mutation rates, but none can explain all the observed genomic characteristics. In this thesis, I am interested in the reductive evolution of Prochlorococcus, for which many sequences and data are available. I used two approaches: a theoretical one using simulation, in which different evolutionary scenarios are tested, and an analysis of Prochlorococcus genomes in a phylogenetic framework to determine the causes and characteristics of genome reduction. The combination of these two approaches makes it possible to propose a plausible evolutionary scenario for the reductive genome evolution of Prochlorococcus.
50

Mersch, Marjorie. "Analyse de la méthylation de l'ADN par séquençage haut-débit chez la Poule." Thesis, Toulouse, INPT, 2018. http://www.theses.fr/2018INPT0107/document.

Abstract:
Anticipating the impact of environmental changes (in climate and feed) is a crucial issue for livestock production systems, particularly poultry. The influence of the environment on phenotypes is partly mediated by epigenetic phenomena, including DNA methylation, which may be involved in the regulation of gene expression. These mechanisms do not affect the DNA sequence but can be transmitted through mitosis or meiosis. The interactions between epigenomes and gene expression are increasingly being studied in animal models and in plants. However, the mechanisms by which DNA methylation regulates genome expression are relatively unknown in birds. This thesis is based on two experimental designs carried out in chicken, aiming to characterize the methylome by high-throughput sequencing. The methylation patterns across the genome, and their link with expression, were established first by whole-genome bisulfite sequencing (WGBS) in whole embryos, and then by reduced representation bisulfite sequencing (RRBS) in the hypothalamus of adults. To date, no RRBS methylome study in chicken has been published. These two analyses were carried out by developing an optimized bioinformatics pipeline, available to the scientific community. Overall, the pattern of methylation in chicken is similar to that known in mammals: CpG islands (regions rich in CG dinucleotides, often weakly methylated, which are found mainly in the promoter regions of genes) are generally weakly methylated in promoters in both the WGBS and RRBS data. Embryo methylome analyses confirmed the absence of a dosage-compensation phenomenon on sex chromosomes and the presence of a hypermethylated region on the Z chromosome. The analyses of RRBS data revealed an overall hypermethylation of CGs across the genome, suggesting a methylation response to environmental stress. From the analysis of WGBS data, we found that the level of methylation in promoters was negatively correlated with the expression of the associated gene. For the first time in chicken, allele-specific methylation was also detected between lines, at a frequency comparable to that observed in humans. On the RRBS data, preliminary results on the methylome response to environmental stresses showed the complex nature of this relationship. The use of a low-energy diet would lead to greater mobilization of body fat, while individuals subjected to heat stress had a lighter body weight. Integrating these data with phenotypic measurements would make it possible to link methylation and environment. Beyond the fundamental aspect of this thesis, the method developed in this work could be applied to livestock systems to breed animals better adapted to a changing environment, by improving production traits.