Dissertations / Theses on the topic 'Functional bioinformatics'

Author: Grafiati

Published: 4 June 2021

Last updated: 12 February 2022

Consult the top 50 dissertations / theses for your research on the topic 'Functional bioinformatics.'

1

Bresell, Anders. "Characterization of protein families, sequence patterns, and functional annotations in large data sets." Doctoral thesis, Linköping : Department of Physics, Chemistry and Biology, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-10565.

2

Kemmer, Danielle. "Genomics and bioinformatics approaches to functional gene annotation." Stockholm, 2006. http://diss.kib.ki.se/2006/91-7140-636-0/.

3

Johansson, Annelie. "Identifying gene regulatory interactions using functional genomics data." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-230285.

Abstract:

Previous studies have used correlation across DNase I hypersensitive sites sequencing (DNase-seq) experiments to predict interactions between enhancers and the promoters of their target genes. We investigate two correlation methods, Pearson’s correlation and mutual information, using DNase-seq data for 100 cell types in regions on chromosome 1. To assess performance, we compared our correlation scores to Hi-C data from Jin et al. (2013). We show that performance is low when compared with the Hi-C data, and that improved correlation metrics are needed. We also demonstrate that the use of Hi-C data as a gold standard is limited because of its low resolution, and we suggest using another gold standard in further studies.
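As a rough illustration of the two measures named in this abstract, the sketch below computes Pearson's correlation and a binned mutual information estimate for two signal vectors. It is not the thesis's code: the function names, the equal-width binning, and the toy vectors (imagined as DNase-seq signal for one enhancer and one promoter across four cell types) are all assumptions made for the example.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # convention chosen here for constant input
    return cov / (sd_x * sd_y)


def _discretize(values, bins):
    """Assign each value to one of `bins` equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=2):
    """Plug-in mutual information estimate, in bits, after binning."""
    dx, dy = _discretize(x, bins), _discretize(y, bins)
    n = len(x)
    joint, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum(
        (c / n) * math.log2((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in joint.items()
    )


# Hypothetical toy signals across four cell types.
enhancer_signal = [0.0, 1.0, 2.0, 3.0]
promoter_signal = [0.0, 2.0, 4.0, 6.0]
r = pearson(enhancer_signal, promoter_signal)
mi = mutual_information(enhancer_signal, promoter_signal)
```

Binned mutual information depends strongly on the number of bins and the sample size, so a real analysis would likely need a more careful estimator than this one.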

4

Perkins, J. R. "Functional genomics and bioinformatics protocols for the elucidation of pain." Thesis, University College London (University of London), 2013. http://discovery.ucl.ac.uk/1384822/.

Abstract:

Microarray technologies enable us to profile the expression of thousands of gene transcripts within a given cell or tissue. Within pain research they have been used extensively to search for genes that change in expression as a result of the induction of a clinically-relevant pain state, often using an animal model of pain. Studying these genes has led to improvements in our understanding of the genes, pathways and other biological processes involved in pain. These themes are explored further in the first (introductory) chapter of this thesis. These experiments result in large numbers of genes declared differentially expressed between samples, many of which are not directly involved in pain. There is often little overlap of these genes between different pain models. The second chapter of this thesis is concerned with the use of systems biology methods to prioritise these genes based on their likelihood of being pain-related. In the third chapter a web-based software application is described. It allows a pain researcher to combine data from various pain-related microarray experiments with other data sources in order to build their own pain networks. Exemplary usage scenarios are presented. The fourth chapter describes a comparison between microarrays and a new technology, RNA-seq, which uses next generation sequencing technology to quantify the RNA present within a tissue. Samples obtained using a well characterised animal pain model, spinal nerve transection, are used for this purpose. In the fifth chapter the effects of RNA-seq sequencing depth on the detection of differentially expressed genes and the discovery of novel transcribed regions of the genome are investigated. In keeping with the theme of gene expression profiling using animal models of pain, the sixth chapter of this thesis reports a software package for the analysis of high-throughput RT-qPCR data and presents an experiment in which this package was used to analyse cytokine expression.

5

Martin, Paul. "Post-GWAS bioinformatics and functional analysis of disease susceptibility loci." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/postgwas-bioinformatics-and-functional-analysis-of-disease-susceptibility-loci(cc0e6cee-5c32-4b75-b3d3-f7c18b6f126d).html.

Abstract:

Genome-wide association studies (GWAS) have been tremendously successful in identifying genetic variants associated with complex diseases, such as rheumatoid arthritis (RA). However, the majority of these associations lie outside traditional protein coding regions and do not necessarily represent the causal effect. Therefore, the challenges post-GWAS are to identify causal variants, link them to target genes and explore the functional mechanisms involved in disease. The aim of the work presented here is to use high level bioinformatics to help address these challenges. There is now an increasing amount of experimental data generated by several large consortia with the aim of characterising the non-coding regions of the human genome, which has the ability to refine and prioritise genetic associations. However, whilst being publicly available, manually mining and utilising it to full effect can be prohibitive. I developed an automated tool, ASSIMILATOR, which quickly and effectively facilitated the mining and rapid interpretation of this data, inferring the likely functional consequence of variants and informing further investigation. This was used in a large extended GWAS in RA which assessed the functional impact of associated variants at the 22q12 locus, showing evidence that they could affect gene regulation. Environmental factors, such as vitamin D, can also affect gene regulation, increasing the risk of disease but are generally not incorporated into most GWAS. Vitamin D deficiency is common in RA and can regulate genes through vitamin D response elements (VDREs). I interrogated a large, publicly available VDRE ChIP-Seq dataset using a permutation testing approach to test for VDRE enrichment in RA loci. 
This study was the first comprehensive analysis of VDREs and RA associated variants and showed that they are enriched for VDREs, suggesting an involvement of vitamin D in RA. Indeed, evidence suggests that disease associated variants affect gene regulation through enhancer elements. These can act over large distances through physical interactions. A newly developed technique, Capture Hi-C, was used to identify regions of the genome which physically interact with associated variants for four autoimmune diseases. This study showed the complex physical interactions between genetic elements, which could be mediated by regions associated with disease. This work is pivotal in fully characterising genetic associations and determining their effect on disease. Further work has re-defined the 6q23 locus, a region associated with multiple diseases, resulting in a major re-evaluation of the likely causal gene in RA from TNFAIP3 to IL20RA, a druggable target, illustrating the huge potential of this research. Furthermore, it has been used to study the genetic associations unique to multiple sclerosis in the same region, showing chromatin interactions which support previously implicated genes and identify novel candidates. This could help improve our understanding and treatment of the disease. Bioinformatics is fundamental to fully exploit new and existing datasets and has made many positive impacts on our understanding of complex disease. This empowers researchers to fully explore disease aetiology and to further the discovery of new therapies.
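The permutation testing idea mentioned in this abstract can be sketched as follows. This is a deliberately simplified illustration, not the procedure used in the thesis: the interval representation, the single linear "genome", the uniform re-placement of loci, and every name and number in the example are assumptions.

```python
import random


def count_overlaps(loci, sites):
    """Number of loci (start, end) that contain at least one site position."""
    return sum(any(start <= s < end for s in sites) for start, end in loci)


def permutation_test(loci, sites, genome_length, n_perm=1000, seed=0):
    """Empirical p-value for enrichment of `sites` within `loci`.

    Each permutation re-places every locus uniformly at random on a
    linear genome of `genome_length`, keeping its length (an assumed,
    simplistic null model).
    """
    rng = random.Random(seed)
    observed = count_overlaps(loci, sites)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in loci:
            length = end - start
            new_start = rng.randrange(0, genome_length - length + 1)
            shuffled.append((new_start, new_start + length))
        if count_overlaps(shuffled, sites) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy data: three disease loci and four binding-site positions.
toy_loci = [(100, 200), (500, 600), (900, 1000)]
toy_sites = [150, 550, 950, 5000]
observed, p_value = permutation_test(toy_loci, toy_sites, genome_length=10_000)
```

A realistic null model would usually also match properties such as chromosome, gene density, or linkage disequilibrium structure, none of which is attempted here.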

6

Reddy, Joseph. "Identification and Analysis of Important Proteins in Protein Interaction Networks Using Functional and Topological Information." Thesis, University of Skövde, School of Life Sciences, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-2395.

Abstract:

Studying protein interaction networks using functional and topological information is important for understanding cellular organization and functionality. This study deals with identifying important proteins in protein interaction networks using SWEMODE (Lubovac et al., 2006) and analyzing topological and functional properties of these proteins with the help of information derived from modular organization in protein interaction networks as well as information available in public resources, in this case, annotation sources describing the functionality of proteins. Multi-modular proteins are short-listed from the modules generated by SWEMODE. Properties of these short-listed proteins are then analyzed using functional information from SGD Gene Ontology (GO) (Dwight et al., 2002) and MIPS functional categories (Ruepp et al., 2004). Topological features such as lethality and centrality of these proteins are also investigated, using graph theoretic properties and information on lethal genes from Yeast Hub (Kei-Hoi et al., 2005). The findings of the study based on GO terms reveal that these important proteins are mostly involved in the biological process of “organelle organization and biogenesis” and a majority of these proteins belong to MIPS “cellular organization” and “transcription” functional categories. A study of lethality reveals that multi-modular proteins are more likely to be lethal than proteins present only in a single module. An examination of centrality (degree of connectivity of proteins) in the network reveals that the ratio of number of important proteins to number of hubs at different hub sizes increases with the hub size (degree).
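The last comparison in this abstract can be illustrated with a small sketch. It assumes one possible reading of the text: a protein is "important" if it belongs to more than one module, a "hub" is any protein whose degree is at least a chosen threshold, and the ratio is the fraction of hubs that are important. The module lists here are hand-written stand-ins, not SWEMODE output, and all names are hypothetical.

```python
from collections import defaultdict


def degrees(edges):
    """Degree of each protein in an undirected interaction list."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def multi_modular(modules):
    """Proteins that appear in more than one module."""
    counts = defaultdict(int)
    for members in modules:
        for protein in set(members):
            counts[protein] += 1
    return {protein for protein, c in counts.items() if c > 1}


def important_to_hub_ratio(edges, modules, min_degree):
    """Fraction of hubs (degree >= min_degree) that are multi-modular."""
    deg = degrees(edges)
    hubs = {protein for protein, d in deg.items() if d >= min_degree}
    if not hubs:
        return None  # no hubs at this threshold
    return len(multi_modular(modules) & hubs) / len(hubs)


# Hypothetical toy network and modules.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
toy_modules = [["A", "B", "C"], ["A", "D", "E"]]
ratios = {k: important_to_hub_ratio(toy_edges, toy_modules, k) for k in (2, 3)}
```

In this toy network the ratio rises as the degree threshold increases, which is the kind of trend the abstract reports for the real data.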

7

Kusnierczyk, Waclaw. "Augmenting Bioinformatics Research with Biomedical Ontologies." Doctoral thesis, Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, 2008. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-2001.

Abstract:

The main objective of the reported study was to investigate how biomedical ontologies, logically structured representations of various aspects of the biomedical reality, can help researchers in analyzing experimental data. The dissertation reports two attempts to construct tools for the analysis of high-throughput experimental results using explicit domain knowledge representations. Furthermore, integrative efforts made by the community of Open Biomedical Ontologies (OBO), in which the author has participated, are reported, and a framework for consistently connecting the Gene Ontology (GO) with the Taxonomy of Species is proposed and discussed.

8

Jadhav, Trishul. "Knowledge Based Gene Set analysis (KB-GSA) : A novel method for gene expression analysis." Thesis, University of Skövde, School of Life Sciences, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-4352.

Abstract:

Microarray technology allows measurement of the expression levels of thousands of genes simultaneously. Several gene set analysis (GSA) methods are widely used for extracting useful information from microarrays, for example identifying differentially expressed pathways associated with a particular biological process or disease phenotype. Though GSA methods like Gene Set Enrichment Analysis (GSEA) are widely used for pathway analysis, these methods are solely based on statistics. Such methods can be awkward to use if knowledge of specific pathways involved in particular biological processes is the aim of the study. Here we present a novel method (Knowledge Based Gene Set Analysis: KB-GSA) which integrates knowledge about user-selected pathways that are known to be involved in specific biological processes. The method generates an easy-to-understand graphical visualization of the changes in expression of the genes, complemented with some common statistics about the pathway of particular interest.

9

Petri, Eric D. C. "Bioinformatics Tools for Finding the Vocabularies of Genomes." Ohio University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1213730223.

10

Bruskiewich, Richard Michael Maurice. "Genomic mapping, functional analysis and bioinformatics of the Werner syndrome locus (WRN)." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0013/NQ38860.pdf.

11

Fuente, Lorente Lorena de la. "Development of a bioinformatics approach for the functional analysis of alternative splicing." Doctoral thesis, Universitat Politècnica de València, 2019. http://hdl.handle.net/10251/124974.

Abstract:

[ES] One of the most exciting aspects of transcription is the transcriptomic and proteomic plasticity mediated by post-transcriptional regulation (PTR) processes. PTR mechanisms such as alternative splicing (AS) and alternative polyadenylation (APA) have emerged as tightly regulated processes that play a key role in generating transcriptomic complexity and are associated with the coordination of cell differentiation or tissue development. However, our knowledge of how these mechanisms regulate the properties of the resulting products to define the phenotype is still very limited. The number of existing variants and the wide range of possible functional consequences make their functional validation an impractical task if done case by case. Moreover, the lack of tools for isoform-oriented functional evaluation has meant that much of the computational work has used ad hoc pipelines applied to specific biological systems or has simply relied on GO enrichment analyses, which are not informative about the impact on isoform properties that underlies PTR. In fact, despite more than sixty thousand publications on AS, very few isoforms have been associated with specific properties, while the number of new AS/APA variants of unknown function is growing exponentially owing to second-generation sequencing (NGS) techniques. In addition, because of the technical limitations of NGS in reconstructing transcript structure, third-generation sequencing (TGS) technologies are defining a new era in which, for the first time, it is possible to determine the sequence of structural and functional elements in mRNAs. This thesis addresses three main aims in order to advance the functional study of isoforms. 
First, with TGS increasingly in use, assessing the quality of de novo transcriptomes is essential to ensure the reliability of the transcriptomic diversity found. The lack of quality analyses oriented to long sequences motivated the development of SQANTI, an automated pipeline for the exhaustive evaluation of TGS transcriptomes. Second, the gene-level information in most functional databases remains the main obstacle to the study of variability between isoforms, especially for novel isoforms, which static databases cannot characterize. We therefore designed IsoAnnot, which builds a database of functional annotations at isoform resolution by integrating information scattered across multiple databases and prediction methods. Finally, the unavailability of methods for studying the functional impact of isoform regulation motivated us to develop tappAS, a dynamic, flexible tool designed to make this type of study easier to approach. During this thesis we have therefore developed an infrastructure that addresses the main challenges of isoform functional analysis, providing a set of new methods and tools that offer a unique opportunity to explore how the phenotype is specified post-transcriptionally through alteration of the functional properties of the expressed isoforms. Applying our analysis to a dual mouse neural differentiation system defined the effect of isoform regulation between motor neuron and oligodendrocyte differentiation for multiple functional elements. 
Among them, we discovered transmembrane regions that are differentially included in the isoforms expressed in the two cell types and whose regulation could be contributing to the control of

[CAT] One of the most exciting aspects of transcriptome biology is the contextual adaptability of eukaryotic transcriptomes and proteomes through post-transcriptional regulation (PTR). PTR mechanisms such as alternative splicing (AS) and alternative polyadenylation (APA) have emerged as tightly regulated processes that play a key role in generating transcriptome complexity and in coordinating cell differentiation or tissue development. However, our knowledge of how these mechanisms imprint distinct functional characteristics on the resulting set of isoforms to define the observed phenotype is still scarce. The number of PTR variants and their potentially functional consequences make functional validation an impractical task if done case by case. Moreover, the lack of isoform-oriented functional approaches has meant that much of the computational work addressing functional questions at the transcriptome level has consisted of ad hoc computational strategies applied to specific biological systems or has been based on simple GO enrichment analysis, which provides no information about the impact of PTR on isoform properties. Thus, despite the more than 60,000 existing publications on AS, few existing isoforms have been associated with specific properties, while the number of new AS/APA variants with unknown and even unexplored functions is increasing exponentially thanks to next-generation sequencing (NGS). 
Because of the technical limitations of NGS in reconstructing transcript structure, high-throughput sequencing of full-length transcripts using third-generation technologies (TGS) opens a new era in transcriptomics, as it improves the definition of gene models and, for the first time, makes it possible to precisely associate functional events within the RNA molecule. This thesis addresses three major challenges to progress in the study of isoform function. First, with the emergence and growing popularity of TGS, the precise definition and complete characterization of de novo transcriptomes are essential to guarantee the quality of any conclusion about transcriptome diversity. The lack of quality analyses oriented to long reads motivated the development of SQANTI (https://bitbucket.org/ConesaLab/sqanti), an automated computational strategy for the structural characterization and quality assessment of full-length transcriptomes. Second, existing gene-centric functional resources are a major limitation for the extensive study of the functional variability of isoforms, especially novel isoforms, which cannot be characterized by static databases. We therefore designed IsoAnnot, which dynamically builds a database of isoform-level functional annotations, using transcript sequences as input and integrating information from several databases and prediction methods. Finally, as there was no method for interrogating the functional impact of PTR, we developed new approaches and easy-to-use tools such as tappAS (http://tappas.org/), designed to help researchers carry out transcriptome-wide functional studies and studies of isoform regulation in specific contexts. 
This thesis therefore describes the development of an analysis framework that addresses the fundamental challenges of isoform functional analysis. Applied to a murine neural differentiation system, we discovered isoform-specific transmembrane regions whose modulation by PTR could contribute to controlling cell type-specific mitochondrial dynamics during neural fate determination.

[EN] One of the most exciting aspects of transcriptome biology is the contextual adaptability of eukaryotic transcriptomes and proteomes by post-transcriptional regulation (PTR). PTR mechanisms such as alternative splicing (AS) and alternative polyadenylation (APA) have emerged as tightly regulated processes playing a key role in generating transcriptome complexity and coordinating cell differentiation or tissue development. However, how these mechanisms imprint distinct functional characteristics on the resulting set of isoforms to define the observed phenotype remains poorly understood. The number of PTR variants and their resulting range of potentially functional consequences make their functional validation an impractical task if done on a case-by-case basis. Besides, the lack of isoform-oriented functional profiling approaches has meant that much of the computational work done to elucidate transcriptome-wide functional questions has either involved ad hoc computational pipelines applied to specific biological systems or has relied on simple GO-enrichment analyses that are not informative about the impact of PTR on isoform properties. Thus, despite more than 60,000 publications on AS, few existing isoforms have been associated with specific properties, while the number of novel AS/APA variants with unknown and even unexplored functions is increasing exponentially thanks to the use of next-generation sequencing (NGS). 
Due to the technical limitations of NGS in reconstructing transcript structure, high-throughput sequencing of full-length transcripts using third-generation technologies (TGS) is opening up a new transcriptomics era that enhances the definition of gene models and, for the first time, makes it possible to precisely associate functional events within the RNA molecule. This thesis addresses three major challenges to the progression of the study of isoform function. First, with the emergence and increasing popularity of TGS, the accurate definition and comprehensive characterisation of de novo transcriptomes are essential to ensure the quality of any conclusions on transcriptome diversity drawn from these data. The lack of long-read-oriented, quality-aware analysis motivated the development of SQANTI (https://bitbucket.org/ConesaLab/sqanti), an automated pipeline for the structural characterization and quality assessment of full-length transcriptomes. Secondly, the gene-centric nature of functional resources remained the major limitation to the extended study of functional isoform variability, especially for novel isoforms, which cannot be characterised by static databases. Thus, we designed IsoAnnot, which dynamically constructs a rich, isoform-resolved database of functional annotations by using transcript sequences as input and integrating information disseminated across several databases and prediction methods. Finally, because no methods to interrogate the functional impact of PTR were available, we developed novel approaches and user-friendly tools such as tappAS (http://tappas.org/), designed to facilitate the transcriptome-wide functional study of context-specific isoform regulation for researchers. 
Thus, this thesis describes the development of an analysis framework that tackles the fundamental challenges of isoform functional analysis by providing a set of novel methods and tools that offer a unique opportunity to explore how the phenotype is specified by altering the functional characteristics of expressed isoforms. Applied to a murine neural differentiation system, our pipeline profiled the effect of isoform regulation on the inclusion of several functional elements within transcripts between motor-neuron and oligodendrocyte differentiation systems. Specifically, we discovered isoform-specific transmembrane regions whose modulation by PTR might contribute to the control of cell type-specific mitochondrial dynamics during neural fate determination.

This work was funded by the following grants: FPU: Training programme for Academic Staff, Spanish Ministry of Education, FPU2013/02348 (2014 to 2018); NOVELSEQ: Novel methods for new challenges in the analysis of high-throughput sequencing data, MINECO, BIO2015-1658-R (2016 to 2019); DEANN: Developing a European American NGS Network, EU Marie Curie IRSES, GA-612583 (2014 to 2017).

Fuente Lorente, LDL. (2019). Development of a bioinformatics approach for the functional analysis of alternative splicing [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/124974

12

Kelly, Libusha. "Functional hotspots revealed by mutational, evolutionary, and structural characterization of ABC transporters." Diss., 2008. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3324617.

13

Caufield, J. Harry. "Interactomics-Based Functional Analysis: Using Interaction Conservation To Probe Bacterial Protein Functions." VCU Scholars Compass, 2016. http://scholarscompass.vcu.edu/etd/4580.

Abstract:

The emergence of genomics as a discrete field of biology has changed humanity’s understanding of our relationship with bacteria. Sequencing the genome of each newly-discovered bacterial species can reveal novel gene sequences, though the genome may contain genes coding for hundreds or thousands of proteins of unknown function (PUFs). In some cases, these coding sequences appear to be conserved across nearly all bacteria. Exploring the functional roles of these cases ideally requires an integrative, cross-species approach involving not only gene sequences but knowledge of interactions among their products. Protein interactions, studied at genome scale, extend genomics into the field of interactomics. I have employed novel computational methods to provide context for bacterial PUFs and to leverage the rich genomic, proteomic, and interactomic data available for hundreds of bacterial species. The methods employed in this study began with sets of protein complexes. I initially hypothesized that, if protein interactions reveal protein functions and interactions are frequently conserved through protein complexes, then conserved protein functions should be revealed through the extent of conservation of protein complexes and their components. The subsequent analyses revealed how partial protein complex conservation may, unexpectedly, be the rule rather than the exception. Next, I expanded the analysis by combining sets of thousands of experimental protein-protein interactions. Progressing beyond the scope of protein complexes into interactions across full proteomes revealed novel evolutionary consistencies across bacteria but also exposed deficiencies among interactomics-based approaches. I have concluded this study with an expansion beyond bacterial protein interactions and into those involving bacteriophage-encoded proteins. This work concerns emergent evolutionary properties among bacterial proteins. 
It is primarily intended to serve as a resource for microbiologists but is relevant to any research into evolutionary biology. As microbiomes and their occupants become increasingly critical to human health, similar approaches may become increasingly necessary.

14

Shateri, Najafabadi Hamed. "A systems approach towards a functional annotation of the genome of Trypanosoma brucei." Thesis, McGill University, 2012. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=106493.

Abstract:

The pathogenic species of trypanosomatids, including Trypanosoma brucei, T. cruzi, and Leishmania spp., cause serious human as well as animal diseases, with a very high incidence and mortality rate if untreated. Although the genome sequences of several trypanosomatids have been known for several years, many aspects of gene function and gene regulation are still unclear in these organisms. Most importantly, the lack of similarity between the majority of their genes and characterized genes of other organisms has limited our understanding of gene function in trypanosomatids. Not only are the functions of many genes unknown, but the factors involved in their regulation are also mostly uncharacterized. Trypanosomatids primarily rely on post-transcriptional programs for regulation of gene expression, and transcriptional regulation is of little importance. The genomes of these organisms harbour a large number of RNA-binding proteins with potential roles in regulating mRNA stability and translation; however, the sequence specificity of these RNA-binding proteins and their functions are mostly unknown. The focus of this thesis is on the development of new methods for homology-independent functional characterization of genes in trypanosomatids, and on deciphering the programs that are involved in their regulation. First, I describe a novel universal relationship between codon usage and gene function, and show the utility of this relationship for functional characterization of genes in various organisms, including trypanosomatids. This relationship most probably points to the role of codon usage in dynamic regulation of protein expression in different conditions, and helps the cell to adapt to new environments and conditions by synchronously regulating proteins with required functions. 
Then, I introduce a computational approach for identification of function-specific cis-acting regulatory elements, and demonstrate the utility of this approach for identification of potential regulatory elements in trypanosomatids, as well as for prediction of gene function based on the flanking regulatory sequences. I also show that combination of cis-regulatory elements and codon usage is a strong predictor of gene function in trypanosomatids. In addition to these methods, which can identify biological processes and pathways, a new method for identification of protein molecular functions based on short sequence signatures is introduced in this thesis. I show that this new method is able to identify function-specific protein short motifs that present functional sites on proteins, and demonstrate the utility of these motifs in predicting protein molecular function in trypanosomatids. In addition to these sequence-based approaches, I also explore the possibility of predicting trypanosomatid gene functions based on co-expression. I present the first co-expression network of T. brucei, which is constructed by combining several microarray datasets from different studies, and use it for predicting new components of several essential pathways in this organism. This analysis suggested the presence of a conserved post-transcriptional regulatory network in trypanosomatids, which encouraged us to develop a novel framework for identification of regulatory programs with high network-level conservation across multiple species. This framework revealed an extensive set of conserved regulatory programs in trypanosomatids, many of which could be validated using available expression datasets as well as our microarray profiles of chemical perturbations. The studies described here contribute significantly to functional annotation of genes in trypanosomatids, and identify the regulatory mechanisms that govern gene expression in these organisms. 
Furthermore, the introduced methods can be used for functional annotation of many uncharacterized genes and identification of gene regulatory programs in virtually all organisms with available genome sequences.

15

Sontheimer, Jana. "Functional characterization of proteins involved in cell cycle by structure-based computational methods." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2012. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-86778.

Abstract:

In recent years, a rapidly increasing amount of experimental data has been generated by high-throughput technologies. Despite these large quantities of protein-related data and the development of computational prediction methods, the function of many proteins is still unknown. In the human proteome, at least 20% of the annotated proteins are not characterized. Thus, the question of how to predict a protein's function from its amino acid sequence remains unanswered for many proteins. Classical bioinformatics approaches for function prediction are based on inferring function from well-characterized homologs, which are identified based on sequence similarity. However, these methods fail to identify distant homologs with low sequence similarity. As protein structure is more conserved than sequence in protein families, structure-based methods (e.g. fold recognition) may recognize possible structural similarities even at low sequence similarity and therefore provide information for function inference. These fold recognition methods have already proven successful for individual proteins, but their automation for high-throughput application is difficult due to intrinsic challenges of these techniques, mainly caused by a high false positive rate. Automated identification of remote homologs based on fold recognition methods would allow a significant improvement in functional annotation of proteins. My approach was to combine structure-based computational prediction methods with experimental data from genome-wide RNAi screens to support the establishment of functional hypotheses by improving the analysis of protein structure prediction results. In the first part of my thesis, I characterized proteins from the Ska complex by computational methods.
I showed the benefit of including experimental information to identify remote homologs: integration of functional data helped to reduce the number of false positives in fold recognition results and made it possible to establish interesting functional hypotheses based on high confidence structural predictions. Based on the structural hypothesis of a GLEBS motif in c13orf3 (Ska3), I could derive a potential molecular mechanism that could explain the observed phenotype. In the second part of my thesis, my goal was to develop computational tools and automated analysis techniques to be able to perform structure-based functional annotation in a high-throughput way. I designed and implemented key tools that were successfully integrated into a computational platform, called StrAnno, which I set up together with my colleagues. These novel computational modules include a domain prediction algorithm and a graphical overview that facilitates and accelerates the analysis of results. StrAnno can be seen as a first step towards automatic functional annotation of proteins by structure-based methods. First, the analysis of long hit lists to identify promising candidates for further analysis is substantially facilitated by integration and combination of various sequence-based computational tools and data from functional databases. Second, the developed post-processing tools accelerate the evaluation of structural and functional hypotheses. False positives from the threading result lists are removed by various filters, and analysis of the possible true positives is greatly enhanced by the graphical overview. With these two essential benefits, fold recognition techniques are applicable to large-scale approaches. By applying this developed methodology to hits from a genome-wide cell cycle RNAi screen and evaluating structural hypotheses by molecular modeling techniques, I aimed to associate biological functions to human proteins and link the RNAi phenotype to a molecular function.
For two selected human proteins, c20orf43 and HJURP, I could establish interesting structural and functional hypotheses. These predictions were based on templates with low sequence identity (10-20%). The uncharacterized human protein c20orf43 might be an E3 SUMO-ligase that could be involved either in DNA repair or rRNA regulatory processes. Based on the structural hypotheses of two domains of HJURP, I predicted a potential link to ubiquitylation processes and direct DNA binding. In addition, I substantiated the cell cycle arrest phenotype of these two genes upon RNAi knockdown. Fold recognition methods are a promising alternative for functional annotation of proteins that escape sequence-based annotation due to their low sequence identity to well-characterized protein families. The structural and functional hypotheses I established in my thesis open the door to investigate the molecular mechanisms of previously uncharacterized proteins, which may provide new insights into cellular mechanisms.

16

Kumar, Chanchal. "Bioinformatics methods and applications for functional analysis of mass spectrometry based proteomics data." Diss., lmu, 2008. http://nbn-resolving.de/urn:nbn:de:bvb:19-124512.

17

Diboun, I. "Bioinformatics protocols for analysis of functional genomics data applied to neuropathy microarray datasets." Thesis, University College London (University of London), 2010. http://discovery.ucl.ac.uk/19298/.

Abstract:

Microarray technology allows the simultaneous measurement of the abundance of thousands of transcripts in living cells. The high-throughput nature of microarray technology means that automatic analytical procedures are required to handle the sheer amount of data typically generated in a single microarray experiment. Along these lines, this work presents a contribution to the automatic analysis of microarray data by attempting to construct protocols for the validation of publicly available methods for microarray analysis. At the experimental level, an evaluation of amplification of RNA targets prior to hybridisation with the physical array was undertaken. This had the important consequence of revealing the extent to which the significance of intensity ratios between varying biological conditions may be compromised following amplification, as well as identifying the underlying cause of this effect. On the basis of these findings, recommendations regarding the usability of RNA amplification protocols with microarray screening were drawn in the context of varying microarray experimental conditions. On the data analysis side, this work has had the important outcome of developing an automatic framework for the validation of functional analysis methods for microarray data. This is based on using a GO semantic similarity scoring metric to assess the similarity between functional terms found enriched by functional analysis of a model dataset and those anticipated from prior knowledge of the biological phenomenon under study. Using this validation system, this work has shown, for the first time, that ‘Catmap’, an early functional analysis method, performs better than the more recent and most popular methods of its kind. Crucially, the effectiveness of this validation system implies that such a system may be reliably adopted for validation of newly developed functional analysis methods for microarray data.

18

Wobser, Madison June. "Determination of the Structure of Human Testis Protein Maelstrom and Examination of Functional Differences among Human Genome Variants." Walsh University Honors Theses / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=walshhonors1555605939425746.

19

Bebek, Gurkan. "Functional Characteristics of Cancer Driver Genes in Colorectal Cancer." Case Western Reserve University School of Graduate Studies / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=case1495012693440067.

20

Arvidson, Ryan Scott. "Venomics and Functional Analysis of Venom From the Emerald Jewel Wasp, Ampulex compressa." Thesis, University of California, Riverside, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10153650.

Abstract:

My research involves biochemical analysis of venom from a fascinating parasitoid jewel wasp, Ampulex compressa. Most parasitoid wasps envenomate the host by stinging into the body cavity to cause paralysis and developmental arrest, prior to deposition of eggs externally or within the body cavity. A. compressa instead uses a different subjugation strategy, injecting venom directly into the central nervous system and eliciting a behavioral sequence culminating in hypokinesia, a 7–10 day lethargy advantageous to wasp reproduction. Hypokinesia is a specific, venom-induced behavioral state characterized by suppression of the escape response and reduced spontaneous walking, leaving other motor functions unaffected. This specificity of action is unique among venoms; interestingly, the effects of the venom on the escape response are reversible, as the cockroach may recover after 7–10 days if not consumed by the wasp larvae. Venom-induced hypokinesia raises an interesting biological question: How can such a potent biochemical cocktail cause such long-lasting, specific, yet reversible effects on behavior? I approached this question in two ways: objective one, bioinformatic analysis of the venom and venom gland tissue to determine what the venom is made of; and objective two, functional analysis of key venom components to determine how the venom works. To address objective one, I used advanced bioinformatics techniques to generate transcriptomes of the venom tissue and proteomes of the venom and venom tissue. Next generation sequencing of venom gland RNA has yielded full-length coding sequences and quantification of venom transcript levels, while mass spectrometry-based protein analysis has validated the presence of venom proteins. These analyses will allow construction of a comprehensive A. compressa “venome” that will help inform functional analyses of the venom and its role in hypokinesia induction.
For objective two, I focused on the characterization of the most abundant peptide in the venom, tentatively named Ampulexin 1, and pharmacological analysis of an interesting venom peptide neurotransmitter, called tachykinin. Analysis of venom tachykinin action on cockroach brain receptors may reveal an interesting case of the evolution of a neurotransmitter from one animal to target the nervous system of another.

21

Costello, James Christopher. "Data integration and applications of functional gene networks in Drosophila melanogaster." [Bloomington, Ind.] : Indiana University, 2009. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3380070.

Abstract:

Thesis (Ph.D.)--Indiana University, Dept. of Informatics, 2009. Title from PDF t.p. (viewed on Jul 19, 2010). Source: Dissertation Abstracts International, Volume: 70-12, Section: B, page: 7296. Advisers: Mehmet M. Dalkilic; Justen R. Andrews.

22

Mohan, Amrita. "A study of intrinsic disorder and its role in functional proteomics." [Bloomington, Ind.] : Indiana University, 2009. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3386707.

Abstract:

Thesis (Ph.D.)--Indiana University, Dept. of Informatics, 2009. Title from PDF t.p. (viewed on Jul 22, 2010). Source: Dissertation Abstracts International, Volume: 70-12, Section: B, page: 7298. Adviser: Predrag Radivojac.

23

Thomas, Sterling. "A Novel Method to Detect Functional Subgraphs in Biomolecular Networks." VCU Scholars Compass, 2010. http://scholarscompass.vcu.edu/etd/154.

Abstract:

Several biomolecular pathways governing the control of cellular processes have been discovered over the last several years. Additionally, advances resulting from combining these pathways into networks have produced new insights into the complex behaviors observed in cell function assays. Unfortunately, identification of important subnetworks, or “motifs”, in these networks has been slower to develop. This study focused on identifying important network motifs and their rate of occurrence in two different biomolecular networks. The two networks evaluated for this study represented both ends of the spectrum of interaction knowledge by comparing a well-defined network (apoptosis) with a poorly studied network that was early in development (autism). This study identified several motifs that could be important in governing and controlling cellular processes in healthy and diseased cells. Additionally, this study revealed an inverse relationship when comparing the occurrence rate of these motifs in apoptosis and autism.

24

Gadekar, Veerendra Parsappa. "Functional exploration of antisense long non-coding RNAs containing transposable elements : a bioinformatics approach." Thesis, Open University, 2016. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.701364.

Abstract:

Long non-coding RNAs (lncRNAs) show a wide range of regulatory functions at the transcriptional and post-transcriptional levels, both in the nucleus and cytoplasm. Recently, antisense lncRNAs (ASlncRNAs) were reported to up-regulate protein synthesis post-transcriptionally through a mechanism depending on an embedded inverted SINE B2 and a 5' overlap to the target mRNAs. Such ASlncRNAs are also referred to as SINEUPs. Synthetic SINEUPs with identical modular organization were also demonstrated to exert the same activity, suggesting a functional relationship between SINE repetitive elements and ASlncRNAs. In order to gain a broader insight into the contribution of transposable elements (TEs) to the sequence composition of ASlncRNAs, I have developed a bioinformatic pipeline that can identify and characterize transcripts containing TEs and analyze TE coverage for different classes of coding/non-coding sense/antisense (S/AS) pairs. I aimed at identifying whether the functional activity of SINEUPs could be a widespread phenomenon across multiple similar natural ASlncRNAs in the transcriptomes of the extensively studied model organisms that have a well annotated catalog of lncRNAs. From my initial analysis I identified human and mouse as the two species that showed a significant coverage enrichment of SINE repeats among ASlncRNAs. I further performed several functional enrichment analyses for the sense coding genes overlapping ASlncRNAs, taking into consideration different characteristics of the 5' binding domain and the 3' embedded SINE repetitive elements. This permitted me to identify the effect of these modular features on the functional associations of sense coding genes. The results of the analysis showed that the products of coding genes associated with ASlncRNAs containing SINEs are significantly enriched for mitochondrial localization.
Further, to determine if these ASlncRNAs could exert SINEUP-like activity during stress, I analyzed the data from a published custom microarray experiment on RNAs associated with the polysome fractions of MRC5 cell lysates in control and oxidative stress conditions. The results revealed that the ASlncRNAs carrying inverted or direct SINE repeats and their corresponding sense coding genes do not show any significant differential polysome loading in stress with respect to normal conditions, which is not a desired characteristic of a potential SINEUP. However, ASlncRNAs with inverted and direct SINE repeats corresponding to high translating polysome fractions showed a significantly higher ratio of means for RNA levels in stress over control, in contrast to noASlncRNA. This suggests that the ASlncRNAs containing SINE elements are the key RNA molecules that are active during stress, although further experimentation and exploration are needed to determine if they are also involved in the increased polysome loading of their respective sense coding mRNAs. Altogether, the work presented in this thesis provides a novel bioinformatics approach to study

25

Gowrisankar, Sivakumar. "Predicting Functional Impact of Coding and Non-Coding Single Nucleotide Polymorphisms." University of Cincinnati / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1225422057.

26

Rands, Chris M. D. "Analyses of functional sequence in mammalian and avian genomes." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:27e0ac20-eb27-423c-9493-a8a1c6cc57b8.

Abstract:

The first draft sequence of the human genome was published over a decade ago, yet interpreting the functional importance of nucleotides in genomes is still an ongoing challenge. I took a comparative genomic approach to identify functional sequence using signatures of natural selection in DNA sequences. Mutations that are purged or propagated by selection mark sequences of significance for biological fitness. I developed and refined methods for estimating the quantity of sequence constrained with respect to insertions and deletions (indels) between two genome sequences, a quantity I termed α_selIndel. This sequence is evolving more slowly than surrounding neutral sequence due to the purging of deleterious indel variants, and thus this sequence is likely to be functional. I estimated α_selIndel between diverse mammalian and avian species pairs, and found a strong negative correlation between α_selIndel and the divergence between the species’ genome sequences. This implies that functional sequence turns over rapidly as it is lost and gained over time. I quantified the variable levels of sequence constraint, and rates of sequence turnover, for different types of human biochemically annotated element. Furthermore, I found that similar rates of functional turnover have occurred across mammalian and avian evolution. Finally, I identified positively selected amino acid residues that may be important for Darwin’s finch beak development, and found evidence of adaptively evolving reproductive proteins in the ancestral songbird lineage. Collectively these results demonstrate the widespread nature of lineage-specific functional sequence, with implications for understanding species traits and the use of model organisms to inform human biology.

27

Lomelin, David. "Using human genetic variation to predict functional elements in non-coding genomic regions." Diss., Search in ProQuest Dissertations & Theses. UC Only, 2010. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3390057.

28

Mitchell, Carter Alexander. "Structural, functional, and computational insights into the ANL superfamily of enzymes." Thesis, State University of New York at Buffalo, 2013. http://pqdtopen.proquest.com/#viewpdf?dispub=3598714.

Abstract:

Members of the ANL superfamily of enzymes are involved in primary and secondary metabolism throughout all domains of life and identify key pathways that contribute to essential physiological reactions as well as defense mechanisms to evade competition. Specifically, acetyl-CoA synthetases are directly involved in energy metabolism, while NonRibosomal Peptide Synthetases and some Aryl-CoA Ligases produce secondary natural products that confer virulence for the producing organism. Due to the ANL superfamily's ubiquitous involvement in primary and secondary metabolism, gaining an understanding of how these enzymes work and identifying ways to regulate them could provide an alternative route for antibiotic targets. It is well documented that domain alternation is paramount for the ANL superfamily of enzymes, including the adenylation and thioester-forming reactions of NRPS adenylation domains. This thesis utilizes structural and functional analysis in conjunction with computational methods to further our understanding of these unique enzymes.
In chapter 2 we present the structure of an adenylation:Peptidyl Carrier Protein di-domain NRPS from the cryptic PA1221 biosynthetic operon from Pseudomonas aeruginosa. The PA1221 structure is the second example of an adenylation:PCP in the PDB and validates the chimeric fusion interactions of EntE-B. The similar interacting regions are between the second PCP helix and a helix in the N-terminal subdomain of the adenylation domain, as well as the loop connecting the longest β-strands of the C-terminal subdomains interacting with loop 1 of the PCP.
Chapter 3 presents the structure of an acetoacetyl-CoA synthetase that is a confirmed substrate for a protein acetyltransferase, PatA, for inactivation through acetylation of the catalytic A10 lysine. This Streptomyces lividans acetoacetyl-CoA synthetase is the first structure to fully resolve the loop connecting the C-terminal extension helix to the C-terminal subdomain. The C-terminal extension is only present in ACS proteins, revealing an interaction where the C-terminal extension stabilizes the dynamic P-loop in the adenylate-forming conformation.
In chapter 4 we further explore the PA1221 operon by functionally identifying the substrate preference of PA1215, the hypothetical fatty-acyl-CoA ligase that is proposed to acylate the charged PCP of PA1221. We computationally validate the substrate preference with a homology model and AutoDock to gain insight into the protein's slow kinetics. We also provide further insight into the biochemistry of a subset of ANL superfamily members, the phenylacetic acid CoA ligases, involved in the utilization of aryl-carboxylic acids as a carbon source as well as the derivatization of penicillin. We analyze their unique dimeric structures, identifying structural motifs that are contributed through the dimeric interface but are otherwise located on different sides of the enzyme in a monomeric form.
Finally, to help identify how the protein moves between the two productive conformations, we subject members of the superfamily to computational dynamic simulations, including Anisotropic Network Modeling, Interpolative Elastic Network Modeling, and all-atom molecular dynamics, and analyze the output from these methods with Principal Component and Normal Mode Analysis. We developed a method to visualize a dynamic reaction coordinate through measuring the Conformation Determining Angle (defined by structural motifs that are present in superfamily members) and use this metric to interrogate all ANL superfamily member PDB entries for domain organization. Finally, we test our hypothesis that domain alternation proceeds through an extended, open conformation with structural comparisons and MD. Here we report functional and structural analysis of ANL superfamily members that are related through bacterial cell metabolism and natural product biosynthesis.

29

Soares, Dinesh Christopher. "Bioinformatics studies on sequence, structure and functional relationships of proteins involved in the complement system." Thesis, University of Edinburgh, 2007. http://hdl.handle.net/1842/11424.

Abstract:

Regulators of complement activation (RCA) ensure that a complement-mediated immune response is proportionate and targeted against infection. RCA proteins are characterised by numerous occurrences of a single module type, the complement control protein (CCP) module. In this work, comprehensive bioinformatics analyses of the sequence and structure of CCP modules were undertaken. Through extensive database and literature searches, CCP module sequences and structures were assembled and large-scale all-against-all sequence and structure comparisons performed, along with analysis of intermodular orientations for pairs of modules and larger fragments. Based upon optimal use of experimentally determined CCP module structures as templates, an automated large-scale protein structure comparative modelling procedure was implemented for a large set of CCP-module sequences. The models are publicly available online at “The CCP module model database”, which also serves as a comprehensive resource for information on CCP modules. The models are shown to serve as a rich vein of information for design of mutants, interpretation of phenotypic consequences of polymorphisms, and prediction of function. For example, the models proved useful for inferring the consequences of several disease-associated sequence variations of the complement proteins CR1, factor H, and MCP, and of another CCP-containing protein, SRPX2. Finally, homology models of C5 and C5b were created on the basis of the recent landmark publication of C3 and C3b structures. This exercise revealed the existence of a novel putative disulfide bond specific to C5. Additionally, it helped revisit previous peptide and mutant-based studies and provided insight into the latter stages of complement assembly.

30

Atkinson, Holly J. "The structural and functional landscape of protein superfamilies: From the thioredoxin fold to parasite peptidases." Diss., Search in ProQuest Dissertations & Theses. UC Only, 2009. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3359576.

31

Liu, Yichuan Tozeren Aydin. "Functional signatures in protein-protein interactions and their impact on signaling pathways /." Philadelphia, Pa. : Drexel University, 2010. http://hdl.handle.net/1860/3257.

32

Sharman, Joanna Louise. "Visualising Plasmodium falciparum functional genomic data in MaGnET : malaria genome exploration tool." Thesis, University of Edinburgh, 2009. http://hdl.handle.net/1842/5936.

Abstract:

Malaria affects the lives of 500 million people around the world each year. The disease is caused by protozoan parasites of the genus Plasmodium, whose ability to evade the immune system and quickly evolve resistance to drugs poses a major challenge for disease control. The results of several Plasmodium genome sequencing projects have revealed how little is known about the function of their genes (over half of the approximately 5400 genes in Plasmodium falciparum, the most deadly human parasite, are annotated as ‘hypothetical’). Recently, several large-scale studies have attempted to shed light on the processes in which genes are involved; for example, the use of DNA microarrays to profile the parasite’s gene expression. With the emergence of varied types of functional genomic data comes a need for effective tools that allow biologists (and bioinformaticians) to explore these data. The goal of exploration/browsing-style analyses will typically be to derive clues towards the function of thus far uncharacterised gene products, and to formulate experimentally testable hypotheses. Graphic interfaces to individual data sets are obviously beneficial in this endeavour. However, effective visual data exploration also requires that interfaces to different functional genomic data are integrated and that the user can carry forward a selected group of genes (not merely one at a time) across a variety of data sets. Non-expert users especially benefit from workbench-like tools offering access to the data in this way. Still, only very few contemporary publicly available software tools have implemented such functionality. This work introduces a novel software tool for the integrated visualisation of functional genomic data relating to P. falciparum: the Malaria Genome Exploration Tool (MaGnET). MaGnET consists of a light-weight Java program for effective visualisation linked to a MySQL database for data storage.
In order to maximise accessibility, the program is publicly available over the World Wide Web (http://www.malariagenomeexplorer.org/). MaGnET incorporates a Genome Viewer for visualising the location of genomic features, a Protein-Protein Interaction Viewer for visualising networks of experimentally determined interactions and an Expression Data Viewer for displaying mRNA and protein expression data. Complex database queries can easily be constructed in the Data Analysis Viewer. An advantage over most other tools is that all sections are fully integrated, allowing users to carry selected groups of genes across different datasets. Furthermore, MaGnET provides useful advanced visualisation features, including mapping of expression data onto genomic location or protein-protein interaction network. The inclusion of available third-party Java software has expanded the visualisation capability of MaGnET; for example, the Jmol viewer has been incorporated for viewing 3-D protein structures. An effort has been made to only include data in MaGnET that is at least of reasonable quality. The MaGnET database collates experimental data from various public Plasmodium resources (e.g. PlasmoDB) and from published functional genomic studies, such as DNA microarrays. In addition, through careful filtering and labelling we have been able to include some predicted annotation that has not been experimentally confirmed, such as Gene Ontology and InterPro functional assignments and modelled protein structures. The application of MaGnET to malaria biology is demonstrated through a series of small studies. Initial examples show how MaGnET can be used to effectively demonstrate results from previously published analyses. This is followed up by using MaGnET to make a set of predictions about the possible functions of selected uncharacterised genes and suggesting follow-up experiments.

APA, Harvard, Vancouver, ISO, and other styles

33

Phatak, Mukta. "Lipid Accessibility Prediction and Identification of Functional Hotspots in Transmembrane Proteins." University of Cincinnati / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1267631564.

Full text

APA, Harvard, Vancouver, ISO, and other styles

34

Miller, Shannon Dawn. "Thesis: Functional and Phylogenetic Analysis of EXO-1,3-Beta Glucanase Gene (PinsEXO1) from the Pathogenic Oomycete Pythium Insidiosum." Bowling Green State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1430494422.

Full text

APA, Harvard, Vancouver, ISO, and other styles

35

Oelofse, Andries Johannes. "Development of a MIAME-compliant microarray data management system for functional genomics data integration." Pretoria : [s.n.], 2006. http://upetd.up.ac.za/thesis/available/etd-08222007-135249.

Full text

APA, Harvard, Vancouver, ISO, and other styles

36

Maragkakis, Emmanouil [Verfasser], Ivo [Akademischer Betreuer] Grosse, Artemis-Geōrgia [Akademischer Betreuer] Chatzēgeōrgiu, and Wojciech [Akademischer Betreuer] Makalowski. "Bioinformatics approach for microRNA target prediction and functional analysis / Emmanouil Maragkakis. Betreuer: Ivo Grosse ; Artemis Hatzigeorgiou ; Wojciech Makalowski." Halle, Saale : Universitäts- und Landesbibliothek Sachsen-Anhalt, 2011. http://d-nb.info/1025202783/34.

Full text

APA, Harvard, Vancouver, ISO, and other styles

37

Hedberg, Lilia. "Identification of obesity-associated SNPs in the human genome : Method development and implementation for SOLiD sequencing data analysis." Thesis, Linköpings universitet, Institutionen för klinisk och experimentell medicin, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-57932.

Full text

Abstract:

Over the last few years, genome-wide association studies (GWAS) have been used to identify numerous obesity-associated SNPs in the human genome. By using linkage studies, candidate obesity genes have been identified. When SNPs in the first intron of FTO were found to be associated with BMI, it became the first gene to be linked to common obesity. In order to look for causative explanations behind the associated SNPs, a re-sequencing of FTO had been performed on the SOLiD sequencing platform. An in-house candidate gene, SLCX, was also sequenced in order to evaluate a potential obesity association. The purpose of this project was to analyse the sequences and also to evaluate the quality of the SOLiD sequencing. Part of the project consisted of performing PCRs and selecting genomic regions for future sequencing projects. I developed and implemented a sequence analysis strategy to identify obesity-associated SNPs. I found 39 obesity-linked SNPs in FTO, a majority of which were located in introns 1 and 8. I also identified 3 associated intronic SNPs in SLCX. I found that the SOLiD sequencing coverage varies between non-repetitive and repetitive genomic regions, and that it is highest near amplicon ends. Interestingly, coverage varies significantly between different amplicons even after repetitive sequences have been removed, which indicates that it is affected by features inherent to the sequence. Still, the observed allele frequencies for known SNPs were highly correlated with the SNP frequencies documented in HapMap. In conclusion, I verify that SNPs in FTO are associated with obesity and also identify a previously unassociated gene, SLCX, as a potential obesity gene. Re-sequencing of genomic regions on the SOLiD platform proved successful for SNP identification, although the difference in sequencing coverage might be problematic.

APA, Harvard, Vancouver, ISO, and other styles

38

Jüttemann, Thomas. "Adding 3D-structural context to protein-protein interaction data from high-throughput experiments." Thesis, University of Edinburgh, 2011. http://hdl.handle.net/1842/5666.

Full text

Abstract:

In the past decade, automatisation has led to an immense increase of data in biology. Next generation sequencing techniques will produce a vast amount of sequences across all species in the coming years. In many cases, identifying the function and biological role of a protein from its sequence can be a complicated and time-intensive task. The identification of a protein's interaction partners is a tremendous help for understanding the biological context in which it is involved. In order to fully characterise a protein-protein interaction (PPI), it is necessary to know the three-dimensional structure of the interacting partners. Despite optimisation efforts from projects such as the Protein Structure Initiative, determining the structure of a protein through crystallography remains a time- and cost-intensive procedure. The primary aim of the research described in this dissertation was to produce a World Wide Web resource that facilitates visual exploration and validation (or questioning) of data derived from functional genomics experiments, by building upon existing structural information about direct physical PPIs. Secondary aims were (i) to demonstrate the utility of the new resource, and (ii) its application in biological research. We created a database that emphasises specifically the intersection between the PPI results emerging from the structural biology and functional genomics communities. The BISC database holds BInary SubComplexes and Modellable Interactions in current functional genomics databases (BISC-MI). It is publicly available at http://bisc.cse.ucsc.edu. BISC is divided into three sections that deliver three types of information of interest to users seeking to investigate or browse PPIs. The template section (BISCHom and BISCHet) is devoted to those PPIs that are characterised in structural detail, i.e. binary SCs extracted from experimentally determined three-dimensional structures.
BISCHom and BISCHet contain the homodimeric (13,583 records) and heterodimeric (5612 records) portions of these, respectively. Besides interactive, embedded Jmol displays emphasising the interface, standard information and links are provided, e.g. sequence information and SCOP classification for both partners, interface size and energy scores (PISA). An automated launch of the MolSurfer program enables the user to investigate electrostatic and hydrophobic correlation between the partners at the inter-molecular interface. The modellable interactions section (BISC-MI) identifies potentially modellable interactions in three major functional genomics interaction databases (BioGRID, IntAct, HPRD). To create BISC-MI, all PPIs that are amenable to automated homology modelling based on conservative similarity cut-offs, and whose partner protein sequences have records in the UniProt database, have been extracted. The modellable interaction services (BISC-MI Services) section offers, upon user request, modelled SC-structures for any PPIs in BISC-MI. This is enabled through an automated template-based (homology) modelling protocol using the popular MODELLER program. First, a multiple sequence alignment (MSA) is generated using MUSCLE, between the target and homologous proteins collected from UniProt (only reviewed proteins from organisms whose genome has been completely sequenced are included to find putative orthologs). Then a sequence-to-profile alignment is generated to integrate the template structure in the MSA. All models are produced upon user request to ensure that the most recent sequence data for the MSAs are used. Models generated through this protocol are expected to be generally more accurate than models offered by other automated resources that rely on pairwise alignments, e.g. ModBase. Two small studies were carried out to demonstrate the usability and utility of BISC in biological research.
(1) Interaction data in functional genomics databases often suffers from insufficient experimental and reporting standards. For example, multiple protein complexes are typically recorded as an inferred set of binary interactions. Using the 20S core particle of the yeast proteasome as an example, we demonstrate how the BISC Web resource can be used as a starting point for further investigation of such inferred interactions. (2) Malaria, a mosquito-borne disease, affects 350-500 million people worldwide. Still very little is known about the malarial parasites' genes and their protein functions. For Plasmodium falciparum, the most lethal among the malaria parasites, only one experimentally derived medium-scale PPI set is available. The validity of this set has been doubted in the malarial biologist community. We modelled and investigated eleven binary interactions from this set using the BISC modelling pipeline. Alongside, we compared the BISC models of the individual partners to those obtained from ModBase.

APA, Harvard, Vancouver, ISO, and other styles

39

Buchser, William James. "Functional Genomics: Phenotypic Screening of Regeneration Associated Genes in Central Nervous System Neurons." Scholarly Repository, 2009. http://scholarlyrepository.miami.edu/oa_dissertations/278.

Full text

Abstract:

Adult mammalian central nervous system (CNS) neurons are unable to extend axons after injury, partially owing to the inhibitory myelin and chondroitin sulfate proteoglycans (CSPGs) present in the environment. A neuron's intrinsic state is also important for determining its regenerative potential. Peripheral nervous system (PNS) neurons, unlike their CNS counterparts, have increased ability to regrow their axons after injury, even in the presence of inhibitory molecules. With the goal of discovering novel regeneration associated genes, we have isolated the genes differentially expressed by PNS neurons. We then developed a high throughput neuronal transfection method to test whether these genes were sufficient to modify neurite growth in vitro. Using high content screening, we measured the ability of cerebellar neurons to initiate neurite outgrowth on inhibitory and permissive substrates. This combination of technologies (subtractive hybridization, microarray, high throughput electroporation and high content screening) allowed phenotypic examination of neurons after the overexpression of over a thousand genes. Additionally, kinases and phosphatases were assayed for their ability to modify neurite outgrowth in hippocampal neurons. Results from both of these large unbiased screens confirmed many of the existing candidates for neurite growth during development and regeneration. We also discovered many novel genes which promoted neurite outgrowth such as GPX3, EIF2B5, RBMX, CHKA, IRF6, and PKN2. To accurately interpret the large volume of data, new methods of analysis were performed. Finally, we developed novel techniques that took advantage of public databases to cluster genes and determine whether those clusters produced robust changes in neurite growth. In summary, we have provided a vast repository of functional data to study axon development and regeneration after injury as well as developing the tools needed to interpret that data.

APA, Harvard, Vancouver, ISO, and other styles

40

Paszkowski-Rogacz, Maciej. "Integration and analysis of phenotypic data from functional screens." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2011. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-63063.

Full text

Abstract:

Motivation: Although various high-throughput technologies provide a lot of valuable information, each of them gives an insight into different aspects of cellular activity and each has its own limitations. Thus, a complete and systematic understanding of the cellular machinery can be achieved only by a combined analysis of results coming from different approaches. However, methods and tools for integration and analysis of heterogeneous biological data still have to be developed. Results: This work presents a systemic analysis of basic cellular processes, i.e. cell viability and cell cycle, as well as embryonic stem cell pluripotency and differentiation. These phenomena were studied using several high-throughput technologies, whose combined results were analysed with existing and novel clustering and hit selection algorithms. This thesis also introduces two novel data management and data analysis tools. The first, called DSViewer, is a database application designed for integrating and querying results coming from various genome-wide experiments. The second, named PhenoFam, is an application performing gene set enrichment analysis by employing structural and functional information on families of protein domains as annotation terms. Both programs are accessible through a web interface. Conclusions: The investigations presented in this work provide the research community with a novel and markedly improved repertoire of computational tools and methods that facilitate the systematic translation of accumulated information obtained from high-throughput studies into novel biological insights.
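
As a rough illustration of the kind of gene set enrichment analysis mentioned above, the sketch below computes an over-representation p-value with a one-sided hypergeometric test. The abstract does not say which statistic PhenoFam uses, so the choice of test, the function name `enrichment_p_value` and all parameter names are assumptions made purely for illustration.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """Return P(X >= overlap) for a hypergeometric X (hypothetical helper).

    population: number of genes in the background set
    annotated:  background genes carrying the annotation term
                (for example, a protein domain family)
    selected:   size of the hit list
    overlap:    hits that carry the annotation term
    """
    total = comb(population, selected)
    upper = min(annotated, selected)
    # Sum the probabilities of seeing `overlap` or more annotated hits.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small p-value suggests that the term occurs in the hit list more often than expected by chance; in practice such values would also need correction for multiple testing across terms.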

APA, Harvard, Vancouver, ISO, and other styles

41

Narayanan, Kanchana. "MAVEN: a tool for Visualization and Functional Analysis of Genome-Wide Association Studies." Cleveland, Ohio : Case Western Reserve University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=case1269455528.

Full text

Abstract:

Thesis (Master of Sciences)--Case Western Reserve University, 2010. Department of EECS - Computer and Information Sciences. Title from PDF (viewed on 2010-05-25). Includes abstract. Includes bibliographical references and appendices. Available online via the OhioLINK ETD Center.

APA, Harvard, Vancouver, ISO, and other styles

42

Abdullah, Gadija. "Functional analysis of miRNA regulated genes in prostate cancer as potential diagnostic molecules." University of the Western Cape, 2016. http://hdl.handle.net/11394/5648.

Full text

Abstract:

Magister Scientiae - MSc. Prostate cancer is the leading cause of cancer-related death in males in the Western world. It is a common disease originating in the prostate gland of the male reproductive system, usually in older patients (over the age of 50) and those with a family history of the disease. The disease shows clinical aggressiveness due to genetic alterations of gene expression in prostate epithelial cells. Prostate cancer is currently diagnosed by biopsy and by screening via the Prostate-Specific Antigen (PSA) blood test. Early detection is critical, and although PSA was discovered to aid in the diagnosis of this cancer at its early stages, its low specificity is a disadvantage, causing unnecessary biopsies of healthy individuals and overtreatment of patients. Although various studies and efforts have been made to identify the ideal biomarker for prostate cancer, and many candidates have even been applied to clinical use, the task remains challenging and none has replaced the best-known biomarker, PSA. The PSA test is minimally invasive and relatively low in cost, with high sensitivity but low specificity. Biomarker discovery is a challenging process: a good biomarker has to be sensitive and specific, its test has to be highly standardized and reproducible, and it should identify risk for or diagnose a disease, assess disease severity or progression, predict prognosis or guide treatment. Computational biology plays a significant role in the discovery of new biomarkers, the analysis of disease states and the validation of potential biomarkers. Bioinformatic approaches are effective for the detection of potential micro ribonucleic acid (miRNA) in cancer. Altered miRNA expression may serve as a biomarker for cancer diagnosis and treatment. miRNAs are small non-protein-coding regulatory RNA molecules that modulate the expression of their target genes.
miRNAs influence numerous cancer-relevant processes such as proliferation, cell cycle control, apoptosis, differentiation, migration and metabolism. The discovery of extracellular miRNAs that circulate in the blood of cancer patients has raised the possibility that miRNAs may serve as novel diagnostic markers. Since a single miRNA is said to be able to target several mRNAs, aberrant miRNA expression is capable of disrupting the expression of several mRNAs and proteins. Biomarkers based on mRNA and miRNA expression are strongly needed to enable more accurate detection of prostate cancer, improve prediction of tumour aggressiveness and facilitate diagnosis. The aim of this project was to focus on functional analyses of genes and their protein products regulated by previously identified miRNAs in prostate cancer, using bioinformatics as a tool. Most proteins function in collaboration with other proteins, and therefore this study further aims to identify these protein-protein interactions and their biological relevance as it relates to prostate cancer. Various computational databases, such as STRING, DAVID and GeneHub-GEPIS, were used for functional analyses of these miRNA-regulated genes. The main focus was on the 21 genes regulated by several miRNAs identified in a previous study. Results from this study identified six genes, ERP44, GP1BA, IFNG, SEPT2, TNFRSF13C and TNFSF4, as possible diagnostic biomarkers for prostate cancer. These results are promising, since the targeted biomarkers would be easily detectable in bodily fluids, with the Gene Ontology (GO) analysis of these gene products showing enrichment for cell surface expression. The six genes identified in silico were associated with transcription factors (TFs) to confirm regulatory control of these TFs in cancer-promoting processes, and more specifically prostate cancer.
The CREB, E2F, Nkx3-1 and p53 TFs were discovered to be linked to the genes IFNG, GP1BA, SEPT2 and TNFRSF13C respectively. The expression of these TFs shows strong association with cancer and cancer-related pathways, specifically prostate cancer, and thus demonstrates that these genes can be assessed as possible biomarkers for prostate cancer. The prognostic and predictive values of the candidate genes were evaluated by means of several in silico prognostic databases to assess their relationship to the prognosis of this disease. The results revealed that expression differences for the majority of the candidate genes were not significant enough for them to be distinguished as strong prognostic biomarkers in several prostate cancer populations. However, one marker, GP1BA, was supported as having prognostic value for prostate cancer based on its statistical p-value in one of the prostate cancer patient datasets used. Another candidate gene, SEPT2, showed promise, as it has some prognostic value in the early stages of the disease. Although the in silico analysis did not yield an ideal diagnostic marker based on the criteria set in this study, further analysis using a molecular approach, qRT-PCR, can be considered for a detailed follow-up study on selected candidate genes to evaluate their roles in disease initiation and progression of prostate cancer using cell lines as well as patient samples. CSIR

APA, Harvard, Vancouver, ISO, and other styles

43

Chen, Jing. "Computational Selection and Prioritization of Disease Candidate Genes." University of Cincinnati / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1211228557.

Full text

APA, Harvard, Vancouver, ISO, and other styles

44

Paytuví, Gallart Andreu. "Development and application of integrative tools for the functional and structural analyses of genomes." Doctoral thesis, Universitat Autònoma de Barcelona, 2019. http://hdl.handle.net/10803/667160.

Full text

Abstract:

Since the development of Sanger sequencing in 1977, technological advances have revolutionized the -omics field. Large-scale sequencing projects have resulted in the generation of an enormous amount of data that have motivated the development of bioinformatics tools for their integration, organization and interpretation.
Because the amount of sequencing data produced worldwide doubles every 7 months, there is a need to improve data accessibility, processing and interpretation. In this sense, the main aim of this work is to develop bioinformatics tools for the analysis of the functional and structural characteristics of genomes. On the one hand, storage capacity and accessibility of -omics data have become a challenge, not only for raw data but also for post-processing results. This is the case for transcriptomics, one of the most funded -omics. In order to overcome current limitations of the existing databases for plant lncRNAs, we developed Green Non-Coding (GreeNC), one of the most comprehensive online databases in the field, which included 39 plant species and 6 algae, representing more than 200,000 lncRNAs. On the other hand, the availability of user-friendly tools to ensure feasible large-scale data analysis and management would help to democratize bioinformatics. Several software tools have recently emerged to allow the analysis of RNA-seq data in an accessible way. However, none of them provides an end-to-end solution. In this context, we took advantage of cloud computing to develop a cloud-based, easy-to-use platform called Artificial Intelligence RNA-seq (AIR). AIR is the first end-to-end solution for the analysis of RNA-seq data that is not limited to model species and does not require previous bioinformatics skills. Once developed, we validated AIR using RNA-seq samples derived from mouse spermatogenic germ cells produced in our research group. We observed an increase in the prevalence of non-coding genes during spermatogenesis and detected silencing of the X chromosome. We also identified differentially expressed genes that were consistent with the sequential development of spermatogenesis. Indeed, the genome is known to undergo large three-dimensional (3D) conformational changes during spermatogenesis.
To characterize such 3D re-organization, we made use of AIR and additional tools for Hi-C data analysis to generate an integrative atlas of the chromatin interactions and functional genomic characteristics of the mouse male germ line. Our results revealed previously undescribed patterns: (i) the sub-chromosomal organization scale is lost during prophase I, (ii) the sub-megabase organization scale becomes diffuse during spermatogenesis, especially in sperm, (iii) specific events such as the telomere bouquet and X chromosome inactivation were observed, and (iv) cell-specific open conformations correlated with the expression of genes with relevant functional roles. Overall, we have developed new bioinformatics solutions to enhance accessibility, processing and interpretation of -omics data that permitted the analysis of functional and structural features of genomes.

APA, Harvard, Vancouver, ISO, and other styles

45

Han, Nam Shik. "Systematic approaches for modelling and visualising responses to perturbation of transcriptional regulatory networks." Thesis, University of Manchester, 2013. https://www.research.manchester.ac.uk/portal/en/theses/systematic-approaches-for-modelling-and-visualising-responses-to-perturbation-of-transcriptional-regulatory-networks(3f4cf115-3b68-457f-8fd6-0f7609d5b9bc).html.

Full text

Abstract:

One of the greatest challenges in modern biology is to understand quantitatively the mechanisms underlying messenger ribonucleic acid (mRNA) transcription within the cell. To this end, integrated functional genomics attempts to use the vast wealth of data produced by modern large-scale genomic projects to understand how the genome is deployed to create a diversity of tissues and species. The expression levels of tens or hundreds of thousands of genes are profiled at multiple time points or under different experimental conditions in the genomic projects. The profiling results are deposited in large-scale quantitative data files that cannot be analysed without systematic computational methods. In particular, it is much more difficult to experimentally measure the concentration level of transcription factor proteins and their affinity for the promoter region of genes, while it is relatively easy to measure the result of transcription using experimental techniques such as microarrays. In the absence of such biological experiments, it becomes necessary to use in silico techniques to determine the transcription factor regulatory activities given existing gene expression profile data. This therefore presents significant challenges and opportunities to the computer science community. This PhD project made use of one such in silico technique to determine the differences (if any) in transcription factor regulatory activities between different experimental conditions and time points. The research aim of the project was to understand the transcriptional regulatory mechanism that controls the sophisticated process of gene expression in cells. In particular, differences in the downstream signalling from transcription factors can play a role in predisposition to diseases such as parasitic disease, cancer, and neuroendocrine disease.
To address this question I have had access to large integrated genomics datasets generated in studies on parasitic disease, lung cancer, and endocrine (hormone) disease. The current state-of-the-art takes existing knowledge and asks "How do these data relate to what we already know?" By applying machine learning approaches the project explored the role that such data can play in uncovering new biological knowledge.

APA, Harvard, Vancouver, ISO, and other styles

46

Bieliková, Michaela. "Bioinformatický nástroj pro odhad abundance bakteriálních funkčních molekul v biologických vzorcích na základě metagenomických dat 16S rRNA." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403164.

Full text

Abstract:

The human body is home to an incredible number of microbes. Some of them can cause various diseases, but others, such as the gut microbiome, are indispensable for human life and health. Unfortunately, the gut microbiome has not been studied in detail, because it contains thousands of different bacterial species, most of which cannot be cultured under laboratory conditions. The solution to this problem lies in new rapid sequencing methods combined with bioinformatics tools that compute the functional profile of the bacteria in a sample. In this thesis, we present existing tools for predicting the functional profile and then propose a new tool, which may either implement a consensus over the results of the existing tools or be an entirely new tool.

47

Hvidsten, Torgeir R. "Predicting Function of Genes and Proteins from Sequence, Structure and Expression Data." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis : Univ.-bibl. [distributör], 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-4490.

48

Ferrer, Samuel. "STAIRS : Data reduction strategy on genomics." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-383465.

Abstract:

Background. An enormous accumulation of genomic data has taken place over the last ten years. This makes visualization and manual inspection key steps in trying to understand large datasets containing DNA sequences with millions of letters. This situation has created a gap between data complexity and qualified personnel, owing to the need to trade off visualization, reduction capacity and exploratory functions, features rarely achieved together by existing tools such as the SRA Toolkit (https://www.ncbi.nlm.nih.gov/sra/docs/toolkitsoft/). This project pursued a novel approach to the problem of genomic analysis and visualization by means of STrAtified Interspersed Reduction Structures (STAIRS). Result. Ten weeks of intense work resulted in novel algorithms to compress data, transform it into stairs vectors and align them. The Smith–Waterman and Needleman–Wunsch algorithms were specially modified for this purpose, and the application produced statistical performance and behavioural charts.
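
For readers unfamiliar with the alignment methods named in this abstract, the following is a minimal sketch of the textbook Needleman–Wunsch global alignment on plain strings. It is not the modified variant developed in the thesis, whose details the abstract does not give; the function name and the scoring parameters (`match`, `mismatch`, `gap`) are illustrative assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Textbook global alignment; returns (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from the thesis.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Smith–Waterman, the local-alignment counterpart, differs mainly in clamping each cell at zero and starting the traceback from the highest-scoring cell rather than the bottom-right corner.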

49

Cubuk, Cankut. "Modeling Functional Modules Using Statistical and Machine Learning Methods." Doctoral thesis, Universitat Politècnica de València, 2020. http://hdl.handle.net/10251/156175.

Abstract:

Understanding the aspects of cell functionality that account for disease or drug action mechanisms is the main challenge for precision medicine. In spite of the increasing availability of genomic and transcriptomic data, there is still a gap between the detection of perturbations in gene expression and the understanding of their contribution to the molecular mechanisms that ultimately account for the phenotype studied. Over the last decade, different computational and mathematical models have been proposed for pathway analysis. However, they do not take into account the dynamic mechanisms contained in pathways, as represented in their layout and in the interactions between genes and proteins. In this thesis, I present two slightly different mathematical models that integrate human transcriptomic data with prior knowledge of signalling and metabolic pathways to estimate Mechanistic Pathway Activities (MPAs). MPAs are continuous, individual-level values that can be used with machine learning and statistical methods to determine biomarkers for the early diagnosis and subtype classification of diseases, and also to suggest potential therapeutic targets for individualized therapeutic interventions. The overall objective is to develop new and advanced systems biology approaches that propose functional hypotheses to help us understand and interpret the complex mechanisms of diseases. Understanding these mechanisms is crucial for robust personalized drug treatments and for predicting clinical outcomes. First, I contributed to the development of a method designed to extract elementary sub-pathways from a signalling pathway and to estimate their activity. Second, this algorithm was adapted to metabolic modules and implemented as a web tool. Third, the method was used to reveal a pan-cancer metabolic landscape. In this study, I analyzed the metabolic module profile of 25 different cancer types, and the method was also validated using different computational and experimental approaches. Each method developed in this thesis was benchmarked against existing similar methods, evaluated for its sensitivity and specificity, experimentally validated where possible, and used to predict clinical outcomes of different cancer types. The research described in this thesis and the results obtained were published in different systems biology and cancer-related peer-reviewed journals and also in national newspapers.

Cubuk, C. (2020). Modeling Functional Modules Using Statistical and Machine Learning Methods [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/156175

50

López, Ferrando Víctor. "Functional characterization of single amino acid variants." Doctoral thesis, Universitat de Barcelona, 2019. http://hdl.handle.net/10803/668545.

Abstract:

Single amino acid variants (SAVs) are one of the main causes of Mendelian disorders and play an important role in the development of many complex diseases. At the same time, they are the most common kind of variation affecting coding DNA, generally without any damaging effect. With the advent of next-generation sequencing technologies, the detection of these variants in patients and the general population is easier than ever, but the characterization of the functional effects of each variant remains an open challenge. Our objective in this work is to tackle this problem by developing machine-learning-based in silico predictors of SAV pathology. Taking the classic PMut predictor as a starting point, we rethought the entire supervised learning pipeline, elaborating new training sets, features and classifiers. PMut2017 is the first result of these efforts: a new general-purpose predictor based on SwissVar and trained on 12 different conservation scores. Its performance, evaluated both by cross-validation and by different blind tests, was in line with the best predictors published to date. Continuing our search for more accurate predictors, especially for those cases where general predictors tend to fail, we developed PMut-S, a suite of 215 protein-specific predictors. Similar to PMut in nature, PMut-S introduced the use of co-evolution conservation features and balanced training sets, and showed improved performance (0.1 points of the Matthews correlation coefficient over PMut2017), especially for those proteins that were more commonly misclassified by PMut. Comparing PMut-S to other specific predictors, we showed that it is possible to train specific predictors using a single automated pipeline and match the results of most gene-specific predictors released to date. The implementation of the machine learning pipeline of both PMut and PMut-S was released as an open-source Python module, PyMut, which bundles functions for feature computation and selection, classifier training and evaluation, and plotting, among others. Their predictions were also made available in a rich web portal, which includes a precomputed repository with analyses of more than 700 million variants on over 100,000 human proteins, together with relevant contextual information such as 3D visualizations of protein structures, links to databases, functional annotations, and more.
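
The abstract reports predictor performance in terms of the Matthews correlation coefficient (MCC). As a hedged illustration of how that metric is computed from a binary confusion matrix, the sketch below uses the standard formula; the function name and the example counts are hypothetical and are not results from the thesis or part of the PyMut API.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC from confusion-matrix counts (illustrative helper)."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts for a pathology predictor, chosen for illustration only.
    print(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over accuracy when pathological and neutral variants are unevenly represented.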
